Differences between revisions 1 and 2

Description

Problem
Solution
# def

Discussion
Reference: the article comes from the Python Cookbook.
This translation is for personal study only and has no connection with any commercial copyright dispute!
-- 61.182.251.99 [DateTime(2004-09-22T19:44:16Z)]
Reading a Text File by Paragraphs
Credit: Alex Martelli, Magnus Lie Hetland
Problem

You need to read a file paragraph by paragraph, where a paragraph is defined as a sequence of nonempty lines (in other words, paragraphs are separated by empty lines).
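
To make the definition concrete, here is a minimal sketch (not part of the original recipe, and assuming Python 3) that splits an in-memory list of lines into paragraphs with `itertools.groupby`. The names `lines` and `paras` are illustrative only.

```python
from itertools import groupby

# Two paragraphs separated by two empty lines.
lines = ["alpha\n", "beta\n", "\n", "\n", "gamma\n"]

# Group consecutive lines by whether they are nonempty, then keep
# only the nonempty runs; each run is one paragraph.
paras = ["".join(group)
         for nonempty, group in groupby(lines, key=lambda s: s != "\n")
         if nonempty]

print(paras)  # ['alpha\nbeta\n', 'gamma\n']
```

Note that a run of several empty lines still counts as a single paragraph boundary.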

Solution

A wrapper class is, as usual, the right Pythonic architecture for this (in Python 2.1 and earlier):

    class Paragraphs:

        def __init__(self, fileobj, separator='\n'):

            # Ensure that we get a line-reading sequence in the best way possible:
            # (Translator's note: puzzling; xreadlines was already deprecated in 2.3?)
            import xreadlines
            try:
                # Check if the file-like object has an xreadlines method
                self.seq = fileobj.xreadlines()
            except AttributeError:
                # No, so fall back to the xreadlines module's implementation
                self.seq = xreadlines.xreadlines(fileobj)

            self.line_num = 0    # current index into self.seq (line number)
            self.para_num = 0    # current index into self (paragraph number)

            # Ensure that separator string includes a line-end character at the end
            if separator[-1:] != '\n': separator += '\n'
            self.separator = separator    # the separator line, newline-terminated

        def __getitem__(self, index):
            if index != self.para_num:
                raise TypeError, "Only sequential access supported"
            self.para_num += 1
            # Start where we left off and skip 0+ separator lines
            while 1:
                # Propagate IndexError, if any, since we're finished if it occurs
                line = self.seq[self.line_num]
                self.line_num += 1
                if line != self.separator: break
            # Accumulate 1+ nonempty lines into result
            result = [line]
            while 1:
                # Intercept IndexError, since we have one last paragraph to return
                try:
                    # Let's check if there's at least one more line in self.seq
                    line = self.seq[self.line_num]
                except IndexError:
                    # self.seq is finished, so we exit the loop
                    break
                # Increment index into self.seq for next time
                self.line_num += 1
                if line == self.separator: break
                result.append(line)
            return ''.join(result)

Here's an example function, showing how to use class Paragraphs:

    def show_paragraphs(filename, numpars=5):
        pp = Paragraphs(open(filename))
        for p in pp:
            print "Par#%d, line# %d: %s" % (
                pp.para_num, pp.line_num, repr(p))
            if pp.para_num>numpars: break
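
The recipe above targets Python 2.1, and the translator's note already questions its reliance on `xreadlines`. As a rough modern counterpart, the following is a minimal sketch assuming Python 3, where file objects can be iterated line by line directly. The generator name `paragraphs` and the sample text are hypothetical and do not come from the original recipe.

```python
import io


def paragraphs(fileobj, separator="\n"):
    """Yield each paragraph of fileobj as a single string."""
    # Mirror the recipe: the separator line always ends with a newline.
    if not separator.endswith("\n"):
        separator += "\n"
    current = []
    for line in fileobj:        # file-like objects yield one line at a time
        if line == separator:
            if current:         # consecutive separator lines yield nothing
                yield "".join(current)
                current = []
        else:
            current.append(line)
    if current:                 # the last paragraph may lack a trailing separator
        yield "".join(current)


# Usage with an in-memory file; a real file from open(...) works the same way.
sample = io.StringIO("one\ntwo\n\n\nthree\n")
for number, para in enumerate(paragraphs(sample), 1):
    print(number, repr(para))
```

Unlike the `Paragraphs` class, this sketch does not expose `line_num` or `para_num`; `enumerate` supplies the paragraph number instead.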

Discussion

...

- Revision 1 as of 2004-09-22 19:44:16
  Size: 3183
  Editor: 61
  Comment:
+ Revision 2 as of 2004-09-22 19:54:34
  Size: 3699
  Editor: 61
  Comment:
-Deletions are marked like this.
+Additions are marked like this.
 Line 25:
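-In keeping with Python style (in Python 2.1 and earlier), the usual architecture for solving this is a '''wrapper''' class
+In keeping with Python style (in Python 2.1 and earlier), the usual architecture for solving this is a '''wrapper''' class: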
-Line 32:
+Line 34:
-        # Ensure
+        # Ensure the line sequence is read in the best way possible                (# Translator's note: puzzling; xreadlines was already deprecated in 2.3?)
-Line 36:
+Line 38:
+            # Check whether the (possibly file-like) argument has an '''xreadlines''' method
-Line 39:
+Line 42:
+            # If the argument object has no xreadlines method, use the xreadlines module's implementation
-Line 42:
+Line 46:
+                             # instance variable: line number index
-Line 43:
+Line 48:
+                             # instance variable: paragraph number index
-Line 45:
+Line 50:
+        # Check that the '''separator string''' argument ends with '\n'
-Line 46:
+Line 52:
-        self.separator = separator
+        self.separator = separator         # instance variable: paragraph number index

Diff for "PyCkBk-4-9"

Description