10. Brief Tour of the Standard Library

10.1. Operating System Interface

The os module provides dozens of functions for interacting with the operating system.

>>> import os
>>> os.getcwd()      # Return the current working directory
'C:\\Python31'
>>> os.chdir('/server/accesslogs')   # Change current working directory
>>> os.system('mkdir today')   # Run the command mkdir in the system shell
0

Be sure to use the import os style instead of from os import *. This keeps os.open() from shadowing the built-in open() function, which operates quite differently.

The built-in dir() and help() functions are useful as interactive aids for working with large modules like os.

>>> import os
>>> dir(os)
<returns a list of all module functions>
>>> help(os)
<returns an extensive manual page created from the module's docstrings>

For daily file and directory management tasks, the shutil module provides a higher-level interface that is easier to use.

>>> import shutil
>>> shutil.copyfile('data.db', 'archive.db')
'archive.db'
>>> shutil.move('/build/executables', 'installdir')
'installdir'
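
The same calls can be tried safely on throwaway files. The following is a minimal sketch, assuming the tempfile module (not covered above) is used to create a scratch directory; the file and directory names are illustrative only.

```python
import os
import shutil
import tempfile

# Work inside a temporary directory so no real files are touched.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, 'data.db')          # illustrative file name
with open(source, 'w') as f:
    f.write('example contents')

# copyfile() copies the contents of one file to another path.
copy_path = os.path.join(workdir, 'archive.db')
shutil.copyfile(source, copy_path)

# move() relocates a file (or a whole directory tree) to a new path.
install_dir = os.path.join(workdir, 'installdir')  # illustrative directory name
os.mkdir(install_dir)
shutil.move(copy_path, os.path.join(install_dir, 'archive.db'))

remaining = sorted(os.listdir(workdir))
installed = os.listdir(install_dir)
shutil.rmtree(workdir)                             # remove the scratch directory
```

After the move, the scratch directory holds only data.db and installdir, and the copy lives inside installdir.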

10.2. File Wildcards

The glob module provides a function for making file lists from directory wildcard searches.

>>> import glob
>>> glob.glob('*.py')
['primes.py', 'random.py', 'quote.py']
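
Because glob.glob() returns names in arbitrary order and accepts a directory prefix in the pattern, scripts often sort the result and strip the directory part. A small sketch, assuming a scratch directory created with tempfile and made-up file names:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    # Create a few empty files; the names are illustrative only.
    for name in ('primes.py', 'quote.py', 'notes.txt'):
        open(os.path.join(workdir, name), 'w').close()

    # The pattern may include a directory; only '*.py' files match.
    pattern = os.path.join(workdir, '*.py')
    matches = sorted(os.path.basename(path) for path in glob.glob(pattern))

print(matches)
```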

10.3. Command Line Arguments

Common utility scripts often need to process command line arguments. These arguments are stored in the sys module's argv attribute as a list. For instance, the following output results from running python demo.py one two three at the command line:

>>> import sys
>>> print(sys.argv)
['demo.py', 'one', 'two', 'three']

The getopt module processes sys.argv using the conventions of the Unix getopt() function. More powerful and flexible command line processing is provided by the optparse module and, since Python 3.2, by the argparse module.
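
As a rough sketch of the getopt style, the call below parses an explicit argument list instead of sys.argv[1:] so that it runs anywhere. The option names (-l/--lines, --verbose) and file names are hypothetical.

```python
import getopt

# Stand-in for sys.argv[1:]; a real script would pass that slice instead.
arguments = ['-l', '5', '--verbose', 'alpha.txt', 'beta.txt']

# 'l:' means -l takes a value; 'lines=' is its long form; 'verbose' is a flag.
options, filenames = getopt.getopt(arguments, 'l:', ['lines=', 'verbose'])

lines = 10          # default when -l/--lines is not given
verbose = False
for name, value in options:
    if name in ('-l', '--lines'):
        lines = int(value)
    elif name == '--verbose':
        verbose = True

print(lines, verbose, filenames)
```

getopt.getopt() returns the recognized options as (name, value) pairs and leaves the remaining positional arguments in a separate list.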

10.4. Error Output Redirection and Program Termination

The sys module also has attributes for stdin, stdout, and stderr. The latter is useful for emitting warnings and error messages to make them visible even when stdout has been redirected.

>>> sys.stderr.write('Warning, log file not found starting a new one\n')
Warning, log file not found starting a new one

The most direct way to terminate a script is to use sys.exit().
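
One common way to combine the two ideas is to report problems on stderr and pass a status code to sys.exit(). The sketch below is only an illustration; the main() function and its usage message are hypothetical, and SystemExit is caught solely to show the status without ending the interpreter.

```python
import sys

def main(argv):
    """Return an exit status: 0 for success, nonzero for failure."""
    if len(argv) < 2:
        # Diagnostics go to stderr so they stay visible if stdout is redirected.
        print('usage: demo.py FILE', file=sys.stderr)
        return 2
    print('processing', argv[1])
    return 0

try:
    sys.exit(main(['demo.py']))      # a real script would pass sys.argv
except SystemExit as exc:
    # sys.exit() works by raising SystemExit; its code attribute is the status.
    status = exc.code
```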

10.5. String Pattern Matching

The re module provides regular expression tools for advanced string processing. For complex matching and manipulation, regular expressions offer succinct, optimized solutions:

>>> import re
>>> re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest')
['foot', 'fell', 'fastest']
>>> re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat')
'cat in the hat'

When only simple capabilities are needed, string methods are preferred because they are easier to read and debug.

>>> 'tea for too'.replace('too', 'two')
'tea for two'
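
For a case where a regular expression earns its keep, named groups can pull several fields out of a line in one step. This is a sketch only; the log line format and the group names are made up for illustration.

```python
import re

log_line = '2003-12-02 ERROR disk full'      # hypothetical log format

# (?P<name>...) gives each captured group a name.
pattern = re.compile(
    r'(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]+) (?P<message>.*)'
)

match = pattern.match(log_line)
fields = match.groupdict() if match else {}
print(fields)
```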

10.6. Mathematics

The math module gives access to the underlying C library functions for floating point math.

>>> import math
>>> math.cos(math.pi / 4)
0.7071067811865476
>>> math.log(1024, 2)
10.0

The random module provides tools for making random selections.

>>> import random
>>> random.choice(['apple', 'pear', 'banana'])
'apple'
>>> random.sample(range(100), 10)   # sampling without replacement
[30, 83, 16, 4, 8, 81, 41, 50, 18, 33]
>>> random.random()    # random float
0.17970987693706186
>>> random.randrange(6)    # random integer chosen from range(6)
4
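
The values shown above differ on every run. When repeatable results are needed, for example in tests, a generator can be seeded. A brief sketch, with an arbitrary seed value chosen for illustration:

```python
import random

# A private generator with a fixed seed; 12345 is an arbitrary choice.
rng = random.Random(12345)
first_rolls = [rng.randrange(6) for _ in range(5)]

# Re-seeding with the same value replays the same sequence.
rng.seed(12345)
second_rolls = [rng.randrange(6) for _ in range(5)]

# sample() draws without replacement, so the ten values are distinct.
sample = rng.sample(range(100), 10)
```

Using a separate random.Random instance leaves the module-level generator untouched.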

The SciPy project <http://scipy.org> has many other modules for numerical computations.

10.7. Internet Access

There are a number of modules for accessing the internet and processing internet protocols. Two of the simplest are urllib.request for retrieving data from URLs and smtplib for sending mail.

>>> from urllib.request import urlopen
>>> for line in urlopen('http://tycho.usno.navy.mil/cgi-bin/timer.pl'):
...     line = line.decode('utf-8')  # urlopen yields bytes; decode to text
...     if 'EST' in line or 'EDT' in line:  # look for Eastern Time
...         print(line)

<BR>Nov. 25, 09:43:32 PM EST

>>> import smtplib
>>> server = smtplib.SMTP('localhost')
>>> server.sendmail('soothsayer@example.org', 'jcaesar@example.org',
... """To: jcaesar@example.org
... From: soothsayer@example.org
...
... Beware the Ides of March.
... """)
>>> server.quit()

(Note that the second example needs a mail server running on localhost.)

10.8. Dates and Times

The datetime module supplies classes for manipulating dates and times in both simple and complex ways. While date and time arithmetic is supported, the focus of the implementation is on efficient member extraction for output formatting and manipulation. The module also supports objects that are timezone aware.

# dates are easily constructed and formatted
>>> from datetime import date
>>> now = date.today()
>>> now
datetime.date(2003, 12, 2)
>>> now.strftime("%m-%d-%y. %d %b %Y is a %A on the %d day of %B.")
'12-02-03. 02 Dec 2003 is a Tuesday on the 02 day of December.'

# dates support calendar arithmetic
>>> birthday = date(1964, 7, 31)
>>> age = now - birthday
>>> age.days
14368
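
Timezone-aware objects can be sketched with the datetime.timezone class (available since Python 3.2). The moment in time and the fixed UTC-5 offset labelled 'EST' below are illustrative choices, not taken from the text above.

```python
from datetime import datetime, timedelta, timezone

# An aware datetime carries its UTC offset in tzinfo.
utc_moment = datetime(2003, 12, 2, 14, 30, tzinfo=timezone.utc)

# A fixed-offset zone five hours behind UTC (no daylight saving rules).
eastern = timezone(timedelta(hours=-5), 'EST')

# astimezone() expresses the same instant in another zone.
local_moment = utc_moment.astimezone(eastern)
print(local_moment.isoformat())      # 2003-12-02T09:30:00-05:00
```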

10.9. Data Compression

Common data archiving and compression formats are directly supported by modules including zlib, gzip, bz2, zipfile, and tarfile.

>>> import zlib
>>> s = b'witch which has which witches wrist watch'   # zlib works on bytes
>>> len(s)
41
>>> t = zlib.compress(s)
>>> len(t)
37
>>> zlib.decompress(t)
b'witch which has which witches wrist watch'
>>> zlib.crc32(s)
226805979
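
The other compression modules follow a similar pattern. As a small sketch, gzip.compress() and gzip.decompress() (available since Python 3.2) round-trip a bytes object; the repeated sample data is only there to make the size reduction visible.

```python
import gzip

# Highly repetitive input compresses well.
data = b'witch which has which witches wrist watch' * 100

packed = gzip.compress(data)
restored = gzip.decompress(packed)

print(len(data), len(packed))
```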

10.10. Performance Measurement

Some Python users develop a deep interest in knowing the relative performance of different approaches to the same problem. Python provides a measurement tool that answers those questions immediately.

For example, it may be tempting to use the tuple packing and unpacking feature instead of the traditional approach to swapping arguments. The timeit module quickly demonstrates a modest performance advantage:

>>> from timeit import Timer
>>> Timer('t=a; a=b; b=t', 'a=1; b=2').timeit()
0.57535828626024577
>>> Timer('a,b = b,a', 'a=1; b=2').timeit()
0.54962537085770791

In contrast to timeit's fine level of granularity, the profile and pstats modules provide tools for identifying time-critical sections in larger blocks of code.
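
A profiling session might look roughly like the sketch below. It assumes cProfile, the C implementation that offers the same interface as profile, and a made-up function named slow_sum as the code under test.

```python
import cProfile
import io
import pstats

def slow_sum(n):
    """Hypothetical workload: sum of squares below n."""
    return sum(i * i for i in range(n))

# Collect timing data while the function runs.
profiler = cProfile.Profile()
total = profiler.runcall(slow_sum, 100000)

# pstats formats the collected data; write the report to a string buffer.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats(5)   # five most expensive entries
report = stream.getvalue()
```

Printing report shows one row per function with call counts and cumulative times.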

10.11. Quality Control

One approach for developing high quality software is to write tests for each function as it is developed and to run those tests frequently during the development process.

The doctest module provides a tool for scanning a module and validating tests embedded in a program's docstrings. Test construction is as simple as cutting-and-pasting a typical call along with its results into the docstring. This improves the documentation by providing the user with an example, and it allows the doctest module to make sure the code remains true to the documentation.

def average(values):
    """Computes the arithmetic mean of a list of numbers.

    >>> print(average([20, 30, 70]))
    40.0
    """
    return sum(values) / len(values)

import doctest
doctest.testmod()   # automatically validate the embedded tests

The unittest module is not as effortless as the doctest module, but it allows a more comprehensive set of tests to be maintained in a separate file.

import unittest

class TestStatisticalFunctions(unittest.TestCase):

    def test_average(self):
        self.assertEqual(average([20, 30, 70]), 40.0)
        self.assertEqual(round(average([1, 5, 7]), 1), 4.3)
        self.assertRaises(ZeroDivisionError, average, [])
        self.assertRaises(TypeError, average, 20, 30, 70)

unittest.main() # Calling from the command line invokes all tests
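
Since unittest.main() exits the interpreter when it finishes, tests can also be run programmatically with a loader and a runner. The following is a self-contained sketch that repeats the average() function from above; the class name TestAverage is illustrative.

```python
import io
import unittest

def average(values):
    """Compute the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

class TestAverage(unittest.TestCase):

    def test_average(self):
        self.assertEqual(average([20, 30, 70]), 40.0)
        # assertRaises also works as a context manager.
        with self.assertRaises(ZeroDivisionError):
            average([])

# Build a suite from the test case and run it, capturing the text report.
suite = unittest.TestLoader().loadTestsFromTestCase(TestAverage)
report = io.StringIO()
result = unittest.TextTestRunner(stream=report).run(suite)
```

The returned result object records how many tests ran and whether any failed.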

10.12. Batteries Included

Python has a "batteries included" philosophy. This is best seen through the sophisticated and robust capabilities of its larger packages. For example:

  • The xmlrpc.client and xmlrpc.server modules make implementing remote procedure calls into an almost trivial task. Despite the modules' names, no direct knowledge or handling of XML is needed.

  • The email package is a library for managing email messages, including MIME and other RFC 2822-based message documents. Unlike smtplib and poplib, which actually send and receive messages, the email package has a complete toolset for building or decoding complex message structures (including attachments) and for implementing internet encoding and header protocols.

  • The xml.dom and xml.sax packages provide robust support for parsing XML, a popular data interchange format. Likewise, the csv module supports direct reads and writes in a common database format. Together, these modules and packages greatly simplify data interchange between Python applications and other tools.

  • Internationalization is supported by a number of modules including gettext, locale, and the codecs package.
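
To give a flavor of two of the packages listed above, the sketch below round-trips rows through csv using an in-memory buffer and builds (but does not send) a message with the email package. It assumes the EmailMessage class from email.message (Python 3.6 and later); the column names and the subject line are made up.

```python
import csv
import io
from email.message import EmailMessage

# csv: write rows to an in-memory buffer, then read them back.
buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
writer.writerow(['name', 'quantity'])
writer.writerow(['apple', 3])
buffer.seek(0)
rows = list(csv.reader(buffer))      # every field comes back as a string

# email: assemble headers and a body; sending would be smtplib's job.
msg = EmailMessage()
msg['To'] = 'jcaesar@example.org'
msg['From'] = 'soothsayer@example.org'
msg['Subject'] = 'Warning'
msg.set_content('Beware the Ides of March.')
text = msg.as_string()
```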