Table of Contents
- 1. Directory traversal with os.walk
- 1.1. os.walk
- 1.2. Examples
- 1.2.1. Testing topdown
- 1.2.2. Modifying the directories to traverse at run time
- 2. References
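Directory traversal with os.walk
Every month there are a few days when I feel like slacking off, and today is one of them. So this post covers a function I just used while working on some directory-traversal code.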
os.walk
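os.walk takes the following parameters:
os.walk(top, topdown=True, onerror=None, followlinks=False), where:
- top is the directory to traverse.
- topdown controls whether the tree is traversed from the top down (True: a directory is yielded before its subdirectories) or from the bottom up (False: a directory is yielded after its subdirectories).
- onerror is an optional function that is called when an error occurs during traversal; it receives the OSError instance as its only argument. If it is left as None, errors are ignored.
- followlinks controls whether symbolic links that point to directories are followed. Note that os.walk does not keep track of the directories it has already visited, so following links can recurse forever if a link points to one of its own parent directories.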
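To make the onerror parameter concrete, here is a minimal sketch. The helper name walk_collecting_errors and the nonexistent path are made up for illustration; the only assumption about os.walk itself is the documented behavior that onerror is called with the OSError instance.

```python
import os
import tempfile

def walk_collecting_errors(top):
    """Walk `top` and return (visited_roots, errors) instead of silently skipping failures."""
    errors = []
    visited = []
    # onerror is called with the OSError raised while listing a directory;
    # here we simply collect those exceptions in a list.
    for root, dirs, files in os.walk(top, onerror=errors.append):
        visited.append(root)
    return visited, errors

# A path that does not exist (hypothetical name inside a fresh temp directory).
# With onerror=None, os.walk would just yield nothing and report no problem.
missing = os.path.join(tempfile.mkdtemp(), "no_such_dir")
visited, errors = walk_collecting_errors(missing)
print(visited)      # []
print(len(errors))  # 1
```

Collecting the errors rather than raising keeps the walk going past unreadable directories while still letting the caller see what was skipped.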
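os.walk returns a generator that yields one 3-tuple
(root, dirs, files) for each directory it visits: the path of the directory being visited, the list of subdirectory names in it, and the list of file names in it. Note that the entries in dirs and files are bare names, not paths. If you need a full path (one that begins with root), use
os.path.join(root, d) for a name d taken from dirs, and
os.path.join(root, f) for a name f taken from files.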
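As a small illustration of joining root with the bare names, the sketch below builds a throwaway tree in a temporary directory and collects full paths. The helper list_full_paths and the file names are hypothetical, chosen only for this example.

```python
import os
import tempfile

def list_full_paths(top):
    """Return (directories, files) found under `top`, each as a sorted list of full paths."""
    all_dirs, all_files = [], []
    for root, dirs, files in os.walk(top):
        # dirs and files hold bare names, so prefix each one with root
        all_dirs.extend(os.path.join(root, d) for d in dirs)
        all_files.extend(os.path.join(root, f) for f in files)
    return sorted(all_dirs), sorted(all_files)

# Build a tiny tree: <tmp>/a.py and <tmp>/sub/b.py
top = tempfile.mkdtemp()
os.mkdir(os.path.join(top, "sub"))
for rel in ("a.py", os.path.join("sub", "b.py")):
    open(os.path.join(top, rel), "w").close()

found_dirs, found_files = list_full_paths(top)
print(found_dirs)
print(found_files)
```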
Examples
Suppose the following files and directories exist:

    $ tree
    .
    ├── a.py
    ├── b.py
    ├── c.py
    ├── dir1
    │   ├── dir4
    │   │   ├── g.py
    │   │   └── h.py
    │   ├── dirx
    │   │   ├── diry
    │   │   │   └── k.py
    │   │   └── z.py
    │   ├── e.py
    │   ├── f.py
    │   └── g.py
    ├── dir2
    │   ├── dira
    │   │   └── dirb
    │   │       └── dirc
    │   │           └── aha.py
    │   ├── k.py
    │   ├── l.py
    │   └── m.py
    └── dir3
        ├── dir5
        │   └── z.py
        ├── x.py
        └── y.py

    10 directories, 17 files

Testing topdown
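When I traverse this directory with os.walk, the program and its output look like this:

    import os

    path = "/Users/nisen/Projects/python_advanced_class/test/test_os_walk"

    # The second positional argument is topdown; True visits a directory before its children.
    for root, dirs, files in os.walk(path, True):
        print("root: %s" % root)
        print("dirs: %s" % dirs)
        print("files: %s" % files)
        print("")

The results are below. The root paths show that the traversal is top-down: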

    $ python test11.py
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk
    dirs: ['dir1', 'dir2', 'dir3']
    files: ['a.py', 'b.py', 'c.py']

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1
    dirs: ['dir4', 'dirx']
    files: ['e.py', 'f.py', 'g.py']

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dir4
    dirs: []
    files: ['g.py', 'h.py']

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx
    dirs: ['diry']
    files: ['z.py']

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx/diry
    dirs: []
    files: ['k.py']

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2
    dirs: ['dira']
    files: ['k.py', 'l.py', 'm.py']

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira
    dirs: ['dirb']
    files: []

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb
    dirs: ['dirc']
    files: []

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb/dirc
    dirs: []
    files: ['aha.py']

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3
    dirs: ['dir5']
    files: ['x.py', 'y.py']

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3/dir5
    dirs: []
    files: ['z.py']

With topdown set to False, the results are as follows. The root paths show that the traversal is now bottom-up, with each directory listed after its subdirectories:

    $ python test11.py
    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dir4
    dirs: []
    files: ['g.py', 'h.py']

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx/diry
    dirs: []
    files: ['k.py']

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx
    dirs: ['diry']
    files: ['z.py']

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1
    dirs: ['dir4', 'dirx']
    files: ['e.py', 'f.py', 'g.py']

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb/dirc
    dirs: []
    files: ['aha.py']

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb
    dirs: ['dirc']
    files: []

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira
    dirs: ['dirb']
    files: []

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2
    dirs: ['dira']
    files: ['k.py', 'l.py', 'm.py']

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3/dir5
    dirs: []
    files: ['z.py']

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3
    dirs: ['dir5']
    files: ['x.py', 'y.py']

    root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk
    dirs: ['dir1', 'dir2', 'dir3']
    files: ['a.py', 'b.py', 'c.py']

Modifying the directories to traverse at run time
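The idea of this section in miniature: a rough sketch of pruning several directory names at once with slice assignment. The helper walk_pruned, the SKIP set, and the temporary tree are all made up for illustration; this is one way to do it, not the only one.

```python
import os
import tempfile

SKIP = {"CVS", ".git"}  # hypothetical set of directory names to skip

def walk_pruned(top, skip=SKIP):
    """Yield each visited root under `top`, never descending into names in `skip`."""
    for root, dirs, files in os.walk(top, topdown=True):
        # Slice assignment mutates the very list os.walk is holding.
        # Writing `dirs = [...]` would only rebind the local name and prune nothing.
        dirs[:] = [d for d in dirs if d not in skip]
        yield root

# Throwaway tree: <tmp>/src/CVS and <tmp>/.git/objects
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, "src", "CVS"))
os.makedirs(os.path.join(top, ".git", "objects"))

visited = sorted(os.path.relpath(r, top) for r in walk_pruned(top))
print(visited)  # ['.', 'src']
```

Pruning only has an effect with topdown=True, because with topdown=False the subdirectories have already been visited by the time their parent is yielded.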
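When topdown is True, you can modify the returned
dirs list in place while processing it, and os.walk will then only descend into the subdirectories whose names remain in
dirs. The example below, for instance, skips any "CVS" directory during traversal:

    import os
    from os.path import join, getsize

    for root, dirs, files in os.walk("python/Lib/email"):
        # Report the total size of the regular files directly inside root
        print(root, "consumes", end=" ")
        print(sum(getsize(join(root, name)) for name in files), end=" ")
        print("bytes in", len(files), "non-directory files")
        if "CVS" in dirs:
            dirs.remove("CVS")  # don't visit CVS directories

References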
- https://docs.python.org/2/library/os.html#os.walk
Permanent link to this article: http://www.linuxidc.com/Linux/2016-12/137763.htm