This article is taken from the Python Cookbook. The translation was made for personal study only and has nothing to do with any commercial copyright dispute.
-- 61.182.251.99 [2004-09-21T18:36:08Z]
Description
Counting Lines in a File
Problem
You need to count the number of lines in a file.
Solution
The simplest approach, for reasonably sized files, is to read the file as a list of lines, so that the count of lines is the length of the list. If the file's path is in a string bound to the variable thefilepath, that's just:
    count = len(open(thefilepath).readlines())
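In current Python the same idea is usually wrapped in a `with` block so that the file is closed promptly. The following is a minimal sketch rather than part of the original recipe; the helper name `count_lines_readlines` is hypothetical.

```python
def count_lines_readlines(thefilepath):
    """Count lines by loading the whole file as a list of lines."""
    # Binary mode sidesteps text decoding; lines are still split on b'\n'.
    with open(thefilepath, 'rb') as thefile:
        return len(thefile.readlines())
```

Because every line is held in memory at once, this form suits only files of modest size.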
For a truly huge file, this may be very slow or even fail to work, because readlines loads every line into memory at once. If you have to worry about humongous files, a loop using the xreadlines method always works:
    count = 0
    for line in open(thefilepath).xreadlines():
        count += 1
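The xreadlines method belongs to Python 2 and is not available on Python 3 file objects, where iterating over the file object itself reads one line at a time. A hedged sketch of the equivalent loop follows; the function name `count_lines_iter` is an assumption made for illustration.

```python
def count_lines_iter(thefilepath):
    """Count lines lazily, holding only one line in memory at a time."""
    count = 0
    with open(thefilepath, 'rb') as thefile:
        for _line in thefile:  # a file object is its own line iterator
            count += 1
    return count
```

Like readlines, this counts a final line even when it lacks a trailing line terminator.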
Here's a slightly tricky alternative, if the line terminator is '\n' (or has '\n' as a substring, as happens on Windows, where lines end in '\r\n'):
    count = 0
    thefile = open(thefilepath, 'rb')
    while 1:
        buffer = thefile.read(8192*1024)
        if not buffer: break
        count += buffer.count('\n')  # count is a string method
    thefile.close()
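Under Python 3, a file opened with 'rb' yields bytes, so the separator passed to count must be the bytes literal b'\n'. The sketch below adapts the chunked loop on that assumption; the function name `count_newlines` and the `chunk_size` parameter are hypothetical, with the default taken from the buffer size used above.

```python
def count_newlines(thefilepath, chunk_size=8192 * 1024):
    """Count b'\\n' bytes by reading the file in fixed-size chunks."""
    count = 0
    with open(thefilepath, 'rb') as thefile:
        while True:
            buffer = thefile.read(chunk_size)
            if not buffer:  # an empty bytes object signals end of file
                break
            count += buffer.count(b'\n')
    return count
```

This counts line terminators rather than lines, so a file whose last line has no trailing '\n' reports one fewer than the readlines approach.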
Without the 'rb' argument to open, this will work anywhere, but performance may suffer greatly on Windows or Macintosh platforms.
Discussion
If you have an external program that counts a file's lines, such as wc -l on Unix-like platforms, you can of course choose to use that (e.g., via os.popen()). However, it's generally simpler, faster, and more portable to do the line counting in your program. You can rely on almost all text files having a reasonable size, so that reading the whole file into memory at once is feasible. For all such normal files, the len of the result of readlines gives you the count of lines in the simplest way.
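For comparison, delegating to wc -l might look like the sketch below. It uses the subprocess module instead of the os.popen call mentioned above, and it assumes Python 3.7 or later and a wc executable on the PATH; the function name `count_lines_wc` is hypothetical.

```python
import shutil
import subprocess


def count_lines_wc(thefilepath):
    """Return the count reported by `wc -l`, or None if wc is not installed."""
    wc = shutil.which('wc')
    if wc is None:
        return None  # e.g. a platform without the Unix text utilities
    result = subprocess.run(
        [wc, '-l', '--', thefilepath],
        capture_output=True, text=True, check=True,
    )
    # The output has the form "<count> <path>"; the count is the first field.
    return int(result.stdout.split()[0])
```

The dependence on an external executable is what makes this route less portable than counting inside the program.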
...