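This article is taken from the Python Cookbook.

The translation was made solely for personal study; it has nothing to do with any commercial copyright dispute.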

-- 61.182.251.99 [DateTime(2004-09-21T18:36:08Z)]

Description

Counting the lines in a file

Problem

You need to compute the number of lines in a file.

Solution

The simplest approach, for reasonably sized files, is to read the file as a list of lines, so that the count of lines is the length of the list.

If the file's path is in a string bound to the thefilepath variable, that's just:

count = len(open(thefilepath).readlines())
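As a minimal sketch of the same idea in Python 3, the one-liner can be wrapped in a function that closes the file deterministically. The function name count_lines_readlines is hypothetical, and opening in binary mode is an assumption made here so that the count does not depend on the file's text encoding:

```python
def count_lines_readlines(path):
    """Count lines by loading the whole file into a list (small files only)."""
    # Assumption: binary mode, so no decoding step can fail on unusual bytes.
    # The with-statement closes the file even if reading raises.
    with open(path, "rb") as f:
        return len(f.readlines())
```

Like the one-liner above, this holds every line in memory at once, so it only suits files that comfortably fit in memory.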

For a truly huge file, this may be very slow or even fail to work, because readlines holds every line in memory at once. If you have to worry about humongous files, a loop using the xreadlines method reads one line at a time, so memory use stays bounded:

count = 0
for line in open(thefilepath).xreadlines():
    count += 1
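The xreadlines method belongs to Python 2 and does not exist in Python 3, where iterating over the file object itself gives the same line-at-a-time behaviour. The following is a sketch under that assumption; the name count_lines_streaming is hypothetical, and binary mode is again a choice made here to sidestep decoding:

```python
def count_lines_streaming(path):
    """Count lines one at a time, without building a list of them."""
    count = 0
    with open(path, "rb") as f:
        # Iterating a file object yields one line per step, lazily.
        for _ in f:
            count += 1
    return count
```

A final line that lacks a trailing newline is still yielded by the iteration, so it is included in the count.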

Here's a slightly tricky alternative, if the line terminator is '\n' (or has '\n' as a substring, as happens on Windows):

count = 0
thefile = open(thefilepath, 'rb')
while 1:
    buffer = thefile.read(8192*1024)
    if not buffer: break
    count += buffer.count('\n')
thefile.close()

Without the 'rb' argument to open, this will work anywhere, but performance may suffer greatly on Windows or Macintosh platforms.
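In Python 3, a file opened with 'rb' returns bytes, so the pattern passed to count must be the bytes literal b'\n' instead of the string '\n'. The sketch below adapts the loop on that basis; the name count_newlines_chunked and the chunk_size parameter are hypothetical, with the default taken from the 8192*1024 buffer size used above:

```python
def count_newlines_chunked(path, chunk_size=8192 * 1024):
    """Count newline bytes by reading the file in fixed-size binary chunks."""
    count = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # an empty bytes object signals end of file
                break
            # '\n' is a single byte, so it can never straddle two chunks.
            count += chunk.count(b"\n")
    return count
```

This counts newline characters, not lines, so a final line without a trailing newline is not counted, and the result can be one less than the line-by-line loop gives for the same file.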


Discussion

...

See Also

PyCkBk-4-7 (last edited 2009-12-25 07:16:21 by localhost)