'''Python Standard Library'''

''翻译: Python 江湖群''

2008-03-28 13:11:52

<<TableOfContents>>

----

[index.html 返回首页]

----

= 1. 线程和进程 =
  "Well, since you last asked us to stop, this thread has moved from discussing languages suitable for professional programmers via accidental users to computer-phobic users. A few more iterations can make this thread really interesting..."

  - eff-bot, June 1996

----

== 1.1. 概览 ==
本章将介绍标准 Python 解释器中所提供的线程支持模块. 
注意线程支持模块是可选的, 有可能在一些 Python 解释器中不可用. 

本章还涵盖了一些 Unix 和 Windows 下用于执行外部进程的模块.

=== 1.1.1. 线程 ===
执行 Python 程序的时候, 是按照从主模块顶端向下执行的. 循环用于重复执行部分代码, 
函数和方法会将控制临时移交到程序的另一部分. 

通过线程, 你的程序可以在同时处理多个任务. 每个线程都有它自己的控制流. 
所以你可以在一个线程里从文件读取数据, 另个向屏幕输出内容. 

为了保证两个线程可以同时访问相同的内部数据, Python 使用了 
''global interpreter lock (全局解释器锁)''. 在同一时间只可能有一个线程执行 Python 代码; 
Python 实际上是自动地在一段很短的时间后切换到下个线程执行, 
或者等待 一个线程执行一项需要时间的操作(例如等待通过 socket 传输的数据, 
或是从文件中读取数据). 

全局锁事实上并不能避免你程序中的问题. 多个线程尝试访问相同的数据会导致异常 状态. 例如以下的代码: 

{{{
def getitem(key):
    item = cache.get(key)
    if item is None:
        # not in cache; create a new one
        item = create_new_item(key)
	cache[key] = item
    return item
}}}

如果不同的线程先后使用相同的 key 调用这里的 {{{getitem}}} 方法, 
那么它们很可能会导致相同的参数调用两次 {{{create_new_item}}} . 
大多时候这样做没有问题, 但在某些时候会导致严重错误. 

不过你可以使用 ''lock objects'' 来同步线程. 一个线程只能拥有一个 ''lock object'' , 这样就可以确保某个时刻 只有一个线程执行 ''getitem'' 函数.

=== 1.1.2. 进程 ===
在大多现代操作系统中, 每个程序在它自身的''进程( process )''内执行. 
我们通过在 shell 中键入命令或直接在菜单中选择来执行一个程序/进程. 
Python 允许你在一个脚本内执行一个新的程序. 

大多进程相关函数通过 {{{os}}} 模块定义. 相关内容请参阅 [[#toc25|第 1.4.4 小节]] .

----

== 1.2. threading 模块 ==
(可选) {{{threading}}} 模块为线程提供了一个高级接口, 如 [[#eg-3-1|Example 3-1]] 所示. 
它源自 Java 的线程实现. 和低级的 {{{thread}}} 模块相同, 只有你在编译解释器时打开了线程支持才可以使用它 .

你只需要继承 ''Thread'' 类, 定义好 {{{run}}} 方法, 就可以创建一 个新的线程. 
使用时首先创建该类的一个或多个实例, 然后调用 {{{start}}} 方法. 这样每个实例的 
{{{run}}} 方法都会运行在它自己的线程里.

==== 1.2.0.1. Example 3-1. 使用 threading 模块 ====
{{{
File: threading-example-1.py

import threading
import time, random

class Counter:
    def _ _init_ _(self):
        self.lock = threading.Lock()
        self.value = 0

    def increment(self):
        self.lock.acquire() # critical section
        self.value = value = self.value + 1
        self.lock.release()
        return value

counter = Counter()

class Worker(threading.Thread):

    def run(self):
        for i in range(10):
            # pretend we're doing something that takes 10�00 ms
            value = counter.increment() # increment global counter
            time.sleep(random.randint(10, 100) / 1000.0)
            print self.getName(), "-- task", i, "finished", value

#
# try it

for i in range(10):
    Worker().start() # start a worker

*B*Thread-1 -- task 0 finished 1
Thread-3 -- task 0 finished 3
Thread-7 -- task 0 finished 8
Thread-1 -- task 1 finished 7
Thread-4 -- task 0 Thread-5 -- task 0 finished 4
finished 5
Thread-8 -- task 0 Thread-6 -- task 0 finished 9
finished 6
...
Thread-6 -- task 9 finished 98
Thread-4 -- task 9 finished 99
Thread-9 -- task 9 finished 100*b*
}}}

[[#eg-3-1|Example 3-1]] 使用了 ''Lock'' 对象来在全局 ''Counter'' 对象里创建临界区 
(critical section). 如果删除了 {{{acquire}}} 和 {{{release}}} 语句, 那么 {{{Counter}}} 很可能不会到达 100.

----

== 1.3. Queue 模块 ==
{{{Queue}}} 模块提供了一个线程安全的队列 (queue) 实现, 如 [[#eg-3-2|Example 3-2]] 所示. 
你可以通过它在多个线程里安全访问同个对象.

==== 1.3.0.1. Example 3-2. 使用 Queue 模块 ====
{{{
File: queue-example-1.py

import threading
import Queue
import time, random

WORKERS = 2

class Worker(threading.Thread):

    def _ _init_ _(self, queue):
        self._ _queue = queue
        threading.Thread._ _init_ _(self)

    def run(self):
        while 1:
            item = self._ _queue.get()
            if item is None:
                break # reached end of queue

            # pretend we're doing something that takes 10�00 ms
            time.sleep(random.randint(10, 100) / 1000.0)

            print "task", item, "finished"

#
# try it

queue = Queue.Queue(0)

for i in range(WORKERS):
    Worker(queue).start() # start a worker

for i in range(10):
    queue.put(i)

for i in range(WORKERS):
    queue.put(None) # add end-of-queue markers

*B*task 1 finished
task 0 finished
task 3 finished
task 2 finished
task 4 finished
task 5 finished
task 7 finished
task 6 finished
task 9 finished
task 8 finished*b*
}}}

[[#eg-3-3|Example 3-3]] 展示了如何限制队列的大小. 如果队列满了, 
那么控制主线程 (producer threads) 被阻塞, 等待项目被弹出 (pop off). 

==== 1.3.0.2. Example 3-3. 使用限制大小的 Queue 模块 ====
{{{
File: queue-example-2.py

import threading
import Queue

import time, random

WORKERS = 2

class Worker(threading.Thread):

    def _ _init_ _(self, queue):
        self._ _queue = queue
        threading.Thread._ _init_ _(self)

    def run(self):
        while 1:
            item = self._ _queue.get()
            if item is None:
                break # reached end of queue

            # pretend we're doing something that takes 10�00 ms
            time.sleep(random.randint(10, 100) / 1000.0)

            print "task", item, "finished"

#
# run with limited queue

queue = Queue.Queue(3)

for i in range(WORKERS):
    Worker(queue).start() # start a worker

for item in range(10):
    print "push", item
    queue.put(item)

for i in range(WORKERS):
    queue.put(None) # add end-of-queue markers

*B*push 0
push 1
push 2
push 3
push 4
push 5
task 0 finished
push 6
task 1 finished
push 7
task 2 finished
push 8
task 3 finished
push 9
task 4 finished
task 6 finished
task 5 finished
task 7 finished
task 9 finished
task 8 finished*b*
}}}

你可以通过继承 ''Queue'' 类来修改它的行为. [[#eg-3-4|Example 3-4]] 
为我们展示了一个简单的具有优先级的队列. 它接受一个元组作为参数, 
元组的第一个成员表示优先级(数值越小优先级越高). 

==== 1.3.0.3. Example 3-4. 使用 Queue 模块实现优先级队列 ====
{{{
File: queue-example-3.py

import Queue
import bisect

Empty = Queue.Empty

class PriorityQueue(Queue.Queue):
    "Thread-safe priority queue"

    def _put(self, item):
        # insert in order
        bisect.insort(self.queue, item)

#
# try it

queue = PriorityQueue(0)

# add items out of order
queue.put((20, "second"))
queue.put((10, "first"))
queue.put((30, "third"))

# print queue contents
try:
    while 1:
        print queue.get_nowait()
except Empty:
    pass

*B*third
second
first*b*
}}}

[[#eg-3-5|Example 3-5]] 展示了一个简单的堆栈 (stack) 实现 
(末尾添加, 头部弹出, 而非头部添加, 头部弹出).

==== 1.3.0.4. Example 3-5. 使用 Queue 模块实现一个堆栈 ====
{{{
File: queue-example-4.py

import Queue

Empty = Queue.Empty

class Stack(Queue.Queue):
    "Thread-safe stack"
    
    def _put(self, item):
        # insert at the beginning of queue, not at the end
        self.queue.insert(0, item)

    # method aliases
    push = Queue.Queue.put
    pop = Queue.Queue.get
    pop_nowait = Queue.Queue.get_nowait

#
# try it

stack = Stack(0)

# push items on stack
stack.push("first")
stack.push("second")
stack.push("third")

# print stack contents
try:
    while 1:
        print stack.pop_nowait()
except Empty:
    pass

*B*third
second
first*b*
}}}

----

== 1.4. thread 模块 ==
(可选) {{{thread}}} 模块提为线程提供了一个低级 (low_level) 的接口, 如 [[#eg-3-6|Example 3-6]] 所示. 
只有你在编译解释器时打开了线程支持才可以使用它. 如果没有特殊需要, 最好使用高级接口 {{{threading}}} 模块替代. 

==== 1.4.0.1. Example 3-6. 使用 thread 模块 ====
{{{
File: thread-example-1.py

import thread
import time, random

def worker():
    for i in range(50):
        # pretend we're doing something that takes 10�00 ms
        time.sleep(random.randint(10, 100) / 1000.0)
        print thread.get_ident(), "-- task", i, "finished"

#
# try it out!

for i in range(2):
    thread.start_new_thread(worker, ())

time.sleep(1)

print "goodbye!"

*B*311 -- task 0 finished
265 -- task 0 finished
265 -- task 1 finished
311 -- task 1 finished
...
265 -- task 17 finished
311 -- task 13 finished
265 -- task 18 finished
goodbye!*b*
}}}

注意当主程序退出的时候, 所有的线程也随着退出. 而 {{{threading}}} 模块不存在这个问题 . (该行为可改变)

----

== 1.5. commands 模块 ==
(只用于 Unix) {{{commands}}} 模块包含一些用于执行外部命令的函数. 
[[#eg-3-7|Example 3-7]] 展示了这个模块.

==== 1.5.0.1. Example 3-7. 使用 commands 模块 ====
{{{
File: commands-example-1.py

import commands

stat, output = commands.getstatusoutput("ls -lR")

print "status", "=>", stat
print "output", "=>", len(output), "bytes"

*B*status => 0
output => 171046 bytes*b*
}}}

----

== 1.6. pipes 模块 ==
(只用于 Unix) {{{pipes}}} 模块提供了 
"转换管道 (conversion pipelines)" 的支持. 你可以创建包含许多外部工具调用的管道来处理多个文件. 
如 [[#eg-3-8|Example 3-8]] 所示.

==== 1.6.0.1. Example 3-8. 使用 pipes 模块 ====
{{{
File: pipes-example-1.py

import pipes

t = pipes.Template()

# create a pipeline
# 这里 " - " 代表从标准输入读入内容
t.append("sort", "--")
t.append("uniq", "--")

# filter some text
# 这里空字符串代表标准输出
t.copy("samples/sample.txt", "")

*B*Alan Jones (sensible party)
Kevin Phillips-Bong (slightly silly)
Tarquin Fin-tim-lin-bin-whin-bim-lin-bus-stop-F'tang-F'tang-Olé-Biscuitbarrel*b*
}}}

----

== 1.7. popen2 模块 ==
{{{popen2}}} 模块允许你执行外部命令, 并通过流来分别访问它的 {{{stdin}}} 和 {{{stdout}}} ( 可能还有 {{{stderr}}} ). 

在 python 1.5.2 以及之前版本, 该模块只存在于 Unix 平台上. 2.0 后, Windows 下也实现了该函数. 
[[#eg-3-9|Example 3-9]] 展示了如何使用该模块来给字符串排序. 

==== 1.7.0.1. Example 3-9. 使用 popen2 模块对字符串排序Module to Sort Strings ====
{{{
File: popen2-example-1.py

import popen2, string

fin, fout = popen2.popen2("sort")

fout.write("foo\n")
fout.write("bar\n")
fout.close()

print fin.readline(),
print fin.readline(),
fin.close()

*B*bar
foo*b*
}}}

[[#eg-3-10|Example 3-10]] 展示了如何使用该模块控制应用程序 .

==== 1.7.0.2. Example 3-10. 使用 popen2 模块控制 gnuchess ====
{{{
File: popen2-example-2.py

import popen2
import string

class Chess:
    "Interface class for chesstool-compatible programs"

    def _ _init_ _(self, engine = "gnuchessc"):
        self.fin, self.fout = popen2.popen2(engine)
        s = self.fin.readline()
        if s != "Chess\n":
            raise IOError, "incompatible chess program"

    def move(self, move):
        self.fout.write(move + "\n")
        self.fout.flush()
        my = self.fin.readline()
        if my == "Illegal move":
            raise ValueError, "illegal move"
        his = self.fin.readline()
        return string.split(his)[2]

    def quit(self):
        self.fout.write("quit\n")
        self.fout.flush()

#
# play a few moves

g = Chess()

print g.move("a2a4")
print g.move("b2b3")

g.quit()

*B*b8c6
e7e5*b*
}}}

----

== 1.8. signal 模块 ==
你可以使用 {{{signal}}} 模块配置你自己的信号处理器 (signal handler), 
如 [[#eg-3-11|Example 3-11]] 所示. 当解释器收到某个信号时, 信号处理器会立即执行. 

==== 1.8.0.1. Example 3-11. 使用 signal 模块 ====
{{{
File: signal-example-1.py

import signal
import time

def handler(signo, frame):
    print "got signal", signo

signal.signal(signal.SIGALRM, handler)

# wake me up in two seconds
signal.alarm(2)

now = time.time()

time.sleep(200)

print "slept for", time.time() - now, "seconds"

*B*got signal 14
slept for 1.99262607098 seconds*b*
}}}

----


## moin code generated by txt2tags 2.4 (http://txt2tags.sf.net)
## cmdline: txt2tags -t moin -o moin/chapter3.moin chapter3.t2t