##language:en
''' This article is from the "Python Cookbook". The translation was made solely for personal study and has no connection to any commercial copyright dispute. ''' -- 218.25.65.133

= Description =
Computing Directory Sizes in a Cross-Platform Way

Credit: Frank Fejes

== Problem ==
You need to compute the total size of a directory (or set of directories) in a way that works under both Windows and Unix-like platforms.

== Solution ==
There are easier platform-dependent solutions, such as Unix's du, but Python also makes it quite feasible to have a cross-platform solution:

{{{#!python
import os
from os.path import join, isdir, islink

class DirSizeError(Exception):
    pass

def dir_size(start, follow_links=0, start_depth=0, max_depth=0, skip_errs=0,
             units='b'):
    # Get a list of all names of files and subdirectories in directory start
    try:
        dir_list = os.listdir(start)
    except OSError:
        # If start is a directory, we probably have permission problems
        # (translator's note: no read permission)
        if isdir(start):
            raise DirSizeError('Cannot list directory %s' % start)
        else:
            # otherwise, just re-raise the error so that it propagates
            raise
    total = 0
    for item in dir_list:
        # Get statistics on each item--file and subdirectory--of start
        path = join(start, item)
        try:
            stats = os.stat(path)
        except OSError:
            # (translator's note: no read permission)
            if not skip_errs:
                raise DirSizeError('Cannot stat %s' % path)
            # skip_errs is set: ignore this item rather than reuse stale stats
            continue
        # The size in bytes is the seventh item of the stats tuple (st_size)
        total += stats.st_size
        # recursive descent if warranted
        # (translator's note: walk into subdirectories and add their sizes)
        if isdir(path) and (follow_links or not islink(path)):
            size = dir_size(path, follow_links, start_depth + 1, max_depth,
                            skip_errs, units)
            total += size
            if max_depth and (start_depth < max_depth):
                print_path(path, size, units)
    return total

def print_path(path, size, units='b'):
    if units == 'k':
        print('%-8d%s' % (size // 1024, path))
    elif units == 'm':
        print('%-5d%s' % (size // 1024 // 1024, path))
    else:
        print('%-11d%s' % (size, path))

def usage(name):
    print("usage: %s [-bkLm] [-d depth] directory [directory...]" % name)
    print('\t-b\t\tDisplay in Bytes (default)')
    print('\t-k\t\tDisplay in Kilobytes')
    print('\t-m\t\tDisplay in Megabytes')
    print('\t-L\t\tFollow symbolic links (meaningful on Unix only)')
    print('\t-d, --depth\t# of directories down to print (default = 0)')

if __name__ == '__main__':
    # When used as a script:
    import sys, getopt
    units = 'b'
    follow_links = 0
    depth = 0
    try:
        # (translator's note: parse the command-line arguments)
        opts, args = getopt.getopt(sys.argv[1:], "bkLmd:", ["depth="])
    except getopt.GetoptError:
        usage(sys.argv[0])
        sys.exit(1)
    for o, a in opts:
        if o == '-b': units = 'b'
        elif o == '-k': units = 'k'
        elif o == '-L': follow_links = 1
        elif o == '-m': units = 'm'
        elif o in ('-d', '--depth'):
            try:
                depth = int(a)
            except ValueError:
                print("Not a valid integer: (%s)" % a)
                usage(sys.argv[0])
                sys.exit(1)
    if len(args) < 1:
        print("No directories specified")
        usage(sys.argv[0])
        sys.exit(1)
    else:
        paths = args
    for path in paths:
        try:
            size = dir_size(path, follow_links, 0, depth, units=units)
        except DirSizeError as x:
            print("Error: %s" % x)
        else:
            print_path(path, size, units)
}}}

== Discussion ==
Unix-like platforms have the du command, but that doesn't help when you need to get information about disk-space usage in a cross-platform way. This recipe has been tested under both Windows and Unix, although it is most useful under Windows, where the normal way of getting this information requires using a GUI. In any case, the recipe's code can be used either as a module (in which case you'll normally call only the dir_size function) or as a command-line script. Typical use as a script is:

{{{
C:\> python dir_size.py "c:\Program Files"
}}}

This will give you some idea of where all your disk space has gone.
To help you narrow the search, you can, for example, display each subdirectory:

{{{
C:\> python dir_size.py --depth=1 "c:\Program Files"
}}}

The recipe's operation is based on recursive descent. os.listdir provides a list of names of all the files and subdirectories of a given directory. If dir_size finds a subdirectory, it calls itself recursively. An alternative architecture might be based on os.path.walk, which handles the recursion on our behalf and just does callbacks to a function we specify, for each subdirectory it visits. However, here we need to be able to control the depth of descent (e.g., to allow the useful --depth command-line option, which turns into the max_depth argument of the dir_size function). This control is easier to attain when we administer the recursion directly, rather than letting os.path.walk handle it on our behalf.

== See Also ==
Documentation for the os.path and getopt modules in the Library Reference.
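
As a follow-up to the Discussion, the walk-based alternative it mentions can be sketched roughly as below. This is only an illustration under stated assumptions, not part of the recipe: it uses os.walk, the generator-based successor to os.path.walk (which no longer exists in Python 3), and the name walk_dir_size is hypothetical. It totals file sizes only, so its result will generally differ from dir_size, which also adds the size reported for each directory entry.

```python
import os
import tempfile

def walk_dir_size(start, follow_links=False):
    """Total size in bytes of the files under start (hypothetical helper)."""
    total = 0
    # os.walk handles the recursion for us; followlinks decides whether
    # symbolic links to directories are descended into. Directories that
    # cannot be listed are silently skipped (os.walk's default onerror=None).
    for dirpath, dirnames, filenames in os.walk(start, followlinks=follow_links):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.stat(path).st_size
            except OSError:
                # Unreadable or vanished entries are skipped in this sketch
                continue
    return total

if __name__ == '__main__':
    # Build a small throwaway tree so the example is self-contained
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, 'sub'))
        with open(os.path.join(root, 'a.bin'), 'wb') as f:
            f.write(b'x' * 100)
        with open(os.path.join(root, 'sub', 'b.bin'), 'wb') as f:
            f.write(b'y' * 250)
        print(walk_dir_size(root))  # prints 350 (100 + 250 bytes)
```

Because os.walk visits directories top-down and reports only their contents, printing per-subdirectory subtotals down to a given depth would need extra bookkeeping here, which is consistent with the recipe's choice to administer the recursion directly.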