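This article is from the Python Cookbook.

This translation was made purely for personal study and is not connected with any commercial copyright dispute.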

-- 61.182.251.99 [DateTime(2004-09-21T18:36:08Z)]

Description

Counting the lines in a file

Problem

You need to count the number of lines in a file.

Solution

The simplest approach, for reasonably sized files, is to read the file as a list of lines, so that the count of lines is the length of the list. If the file's path is in a string bound to the thefilepath variable, that's just:

count = len(open(thefilepath).readlines())

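For a truly huge file, this may be very slow or even fail to work, because readlines builds the whole list of lines in memory. If you have to worry about humongous files, a loop using the xreadlines method always works, since it reads one line at a time. Note that xreadlines exists only in Python 2; from Python 2.3 on, and in Python 3, iterating over the file object directly has the same effect.

count = 0
# Python 2 only: xreadlines yields lines lazily instead of building a list.
for line in open(thefilepath).xreadlines(): count += 1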
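
On Python 3, where xreadlines is not available, the same lazy loop can be written by iterating over the file object itself. The following is a minimal sketch under that assumption; the function name count_lines_lazily and the sample data are hypothetical and not part of the original recipe:

```python
import os
import tempfile

def count_lines_lazily(path):
    """Count lines by iterating over the file object, one line at a time."""
    count = 0
    # The with statement closes the file even if reading raises an error.
    with open(path, 'rb') as f:
        for _ in f:
            count += 1
    return count

# Small demonstration on a temporary file (hypothetical sample data).
fd, sample_path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(b'first\nsecond\nthird\n')
print(count_lines_lazily(sample_path))  # prints 3
os.remove(sample_path)
```

Only one line is held in memory at a time, so memory use depends on the longest line rather than on the size of the file.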

Here's a slightly tricky alternative, if the line terminator is '\n' (or has '\n' as a substring, as happens on Windows, where it is '\r\n'):

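count = 0
thefile = open(thefilepath, 'rb')
while True:
    # Read in 8 MB chunks so the whole file is never held in memory.
    buffer = thefile.read(8192*1024)
    if not buffer: break
    # The file is open in binary mode, so count the byte b'\n'.
    count += buffer.count(b'\n')
thefile.close()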
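
Counting terminators is not quite the same as counting lines: if the last line of the file has no trailing '\n', the chunked loop reports one line fewer than the readlines approach. The sketch below wraps the chunked loop in a function and adds an adjustment for that case. The adjustment, the function name count_lines_in_chunks, and the chunk_size parameter are assumptions added for illustration, not part of the original recipe:

```python
import os
import tempfile

def count_lines_in_chunks(path, chunk_size=8192 * 1024):
    """Count lines by counting newline bytes, one chunk at a time."""
    count = 0
    last_byte = b''
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            count += chunk.count(b'\n')
            last_byte = chunk[-1:]  # remember how the data read so far ends
    # A non-empty file whose final line lacks a terminator has one more
    # line than it has newline bytes.
    if last_byte and last_byte != b'\n':
        count += 1
    return count

# Small demonstration: the last line has no trailing newline.
fd, sample_path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(b'one\ntwo\nthree')
print(count_lines_in_chunks(sample_path))  # prints 3
os.remove(sample_path)
```

If you want the raw number of terminators instead, drop the final adjustment.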


Discussion

...

See Also

PyCkBk-4-7 (last edited 2009-12-25 07:16:21 by localhost)