Using Deferreds

-- Last edited by Jerry Marx on 2004-08-23T04:35:55Z

Introduction

Twisted is a framework that allows programmers to develop asynchronous networked programs. [http://twistedmatrix.com/documents/TwistedDocs/TwistedDocs-1.3.0/api/twisted.internet.defer.Deferred.html "twisted.internet.defer.Deferred"] objects are one of the key concepts that you should understand in order to develop asynchronous code that uses the Twisted framework: they are a signal to the calling function that a result is pending.

This HOWTO first describes the problem that Deferreds solve: that of managing tasks that are waiting for data without blocking. It then illustrates the difference between blocking Python code and non-blocking code which returns a Deferred; describes Deferreds in more detail; describes the details of the class interfaces; and finally describes DeferredList.

The Problem that Deferreds Solve

Deferreds are designed to enable Twisted programs to wait for data without hanging until that data arrives.

Many computing tasks take some time to complete, and there are two reasons why a task might take some time:

  1. it is computationally intensive (for example factorising large numbers) and requires a certain amount of CPU time to calculate the answer; or
  2. it is not computationally intensive but has to wait for data to be available to produce a result.

It is the second class of problem (tasks that are not computationally intensive but involve an appreciable delay) that Deferreds are designed to help solve. Functions that wait on hard drive access, database access, and network access all fall into this class, although the time delay varies.

The basic idea behind Deferreds, and other solutions to this problem, is to keep the CPU as active as possible. If one task is waiting on data, rather than have the CPU (and the program!) sit idle waiting for that data (a condition normally called "blocking"), the program performs other operations in the meantime, and waits for some signal that the data is ready to be processed before returning to that task.

In Twisted, a function signals to the calling function that it is waiting by returning a Deferred. When the data is available, the program activates the callbacks on that Deferred to process the data.

The Context

Dealing with Blocking Code

When coding I/O based programs (networking code, databases, file access) there are many APIs that are blocking, and many methods where the common idiom is to block until a result is obtained.

    import time

    class Getter:
        def getData(self, x):
            # imagine I/O blocking code here
            print("blocking")
            time.sleep(4)
            return x * 3

    g = Getter()
    print(g.getData(3))

Don't Call Us, We'll Call You

Twisted cannot support blocking calls in most of its code, since it is single-threaded and event-based. The solution for this issue is to refactor the code, so that instead of blocking until data is available, we return immediately, and use a callback to notify the requester once the data eventually arrives.

    from twisted.internet import reactor

    class Getter:
        def getData(self, x, callback):
            # this won't block
            reactor.callLater(2, callback, x * 3)

    def printData(d):
        print(d)

    g = Getter()
    g.getData(3, printData)

    # start the event loop, exiting after 4 seconds
    reactor.callLater(4, reactor.stop)
    reactor.run()

There are several things missing in this simple example. There is no way to know if the data never comes back, and no mechanism for handling errors. The example does not handle multiple callback functions, nor does it give a method to merge arguments before and after execution. Further, there is no way to distinguish between different calls to getData from different producer objects. Deferred solves these problems by creating a single, unified way to handle callbacks and errors from deferred execution.

Deferreds

A [http://twistedmatrix.com/documents/TwistedDocs/TwistedDocs-1.3.0/api/twisted.internet.defer.Deferred.html "twisted.internet.defer.Deferred"] is a promise that a function will at some point have a result. We can attach callback functions to a Deferred, and once it gets a result these callbacks will be called. In addition, Deferreds allow the developer to register a callback for an error, with the default behavior of logging the error. The Deferred mechanism standardizes the application programmer's interface with all sorts of blocking or delayed operations.

    from twisted.internet import reactor, defer

    class Getter:
        def getData(self, x):
            # this won't block
            d = defer.Deferred()
            reactor.callLater(2, d.callback, x * 3)
            return d

    def printData(d):
        print(d)

    g = Getter()
    d = g.getData(3)
    d.addCallback(printData)

    # start the event loop, exiting after 4 seconds
    reactor.callLater(4, reactor.stop)
    reactor.run()

Deferreds do not make the code magically not block. Once you have rewritten your code to not block, Deferreds give you a nice way to build an interface to that code.

As we said, multiple callbacks can be added to a Deferred. The first callback in the Deferred's callback chain will be called with the result, the second with the result of the first callback, and so on. Why do we need this? Well, consider a Deferred returned by twisted.enterprise.adbapi, carrying the result of a SQL query. A web widget might add a callback that converts this result into HTML, and pass the Deferred onwards, where the callback will be used by Twisted to return the result to the HTTP client. The callback chain will be bypassed in case of errors or exceptions.

    import sys

    from twisted.internet import reactor, defer

    class Getter:
        def gotResults(self, x):
            """The Deferred mechanism provides a mechanism to signal error
               conditions.  In this case, odd numbers are bad.
            """
            if x % 2 == 0:
                self.d.callback(x * 3)
            else:
                self.d.errback(ValueError("You used an odd number!"))

        def _toHTML(self, r):
            return "Result: %s" % r

        def getData(self, x):
            """The Deferred mechanism allows for chained callbacks.
               In this example, the output of gotResults is first
               passed through _toHTML on its way to printData.
            """
            self.d = defer.Deferred()
            reactor.callLater(2, self.gotResults, x)
            self.d.addCallback(self._toHTML)
            return self.d

    def printData(d):
        print(d)

    def printError(failure):
        sys.stderr.write(str(failure))

    # this will print an error message
    g = Getter()
    d = g.getData(3)
    d.addCallback(printData)
    d.addErrback(printError)

    # this will print "Result: 12"
    g = Getter()
    d = g.getData(4)
    d.addCallback(printData)
    d.addErrback(printError)

    # start the event loop, exiting after 4 seconds
    reactor.callLater(4, reactor.stop)
    reactor.run()

Visual Explanation

(Figure 1)

  1. Requesting method (data sink) requests data, gets Deferred object.
  2. Requesting method attaches callbacks to Deferred object.

(Figure 2)

  1. When the result is ready, give it to the Deferred object: .callback(result) if the operation succeeded, .errback(failure) if it failed. Note that failure is typically an instance of [http://twistedmatrix.com/documents/TwistedDocs/TwistedDocs-1.3.0/api/twisted.python.failure.Failure.html "twisted.python.failure.Failure"].
  2. The Deferred object triggers the previously added callback or errback with the result or failure.

Execution then follows these rules, going down the chain of callbacks to be processed:

  1. The result of a callback is always passed as the first argument to the next callback, creating a chain of processors.
  2. If a callback raises an exception, switch to errback.
  3. An unhandled failure gets passed down the line of errbacks, thus creating an asynchronous analog to a series of except: statements.
  4. If an errback doesn't raise an exception or return a twisted.python.failure.Failure instance, switch to callback.

More about callbacks

You can add multiple callbacks to a Deferred:

    g = Getter()
    d = g.getResult(3)
    d.addCallback(processResult)
    d.addCallback(printResult)

Each callback feeds its return value into the next callback (callbacks will be called in the order you add them). Thus in the previous example, processResult's return value will be passed to printResult, instead of the value initially passed into the callback. This gives you a flexible way to chain results together, possibly modifying values along the way (for example, you may wish to pre-process database query results).

More about errbacks

Deferred's error handling is modeled after Python's exception handling. In the case that no errors occur, all the callbacks run, one after the other, as described above.

If the errback is called instead of the callback (e.g. because a DB query raised an error), then a [http://twistedmatrix.com/documents/TwistedDocs/TwistedDocs-1.3.0/api/twisted.python.failure.Failure.html twisted.python.failure.Failure] is passed into the first errback (you can add multiple errbacks, just like with callbacks). You can think of your errbacks as being like the except blocks of ordinary Python code.

Unless you explicitly raise an error in an except block, the exception is caught and stops propagating, and normal execution continues. The same thing happens with errbacks: unless you explicitly return a Failure or (re-)raise an exception, the error stops propagating, and normal callbacks continue executing from that point (using the value returned from the errback). If the errback does return a Failure or raise an exception, then that is passed to the next errback, and so on.

Note: If an errback doesn't return anything, then it effectively returns None, meaning that callbacks will continue to be executed after this errback. This may not be what you expect to happen, so be careful. Make sure your errbacks return a Failure (probably the one that was passed to it), or a meaningful return value for the next callback.

Also, [http://twistedmatrix.com/documents/TwistedDocs/TwistedDocs-1.3.0/api/twisted.python.failure.Failure.html twisted.python.failure.Failure] instances have a useful method called trap, allowing you to effectively do the equivalent of:

    try:
        # code that may throw an exception
        cookSpamAndEggs()
    except (SpamException, EggException):
        # Handle SpamExceptions and EggExceptions
        ...

You do this by:

    def errorHandler(failure):
        failure.trap(SpamException, EggException)
        # Handle SpamExceptions and EggExceptions

    d.addCallback(cookSpamAndEggs)
    d.addErrback(errorHandler)

If none of the arguments passed to failure.trap match the error encapsulated in that Failure, then it re-raises the error.

There's another potential gotcha here. There's a method [http://twistedmatrix.com/documents/TwistedDocs/TwistedDocs-1.3.0/api/twisted.internet.defer.Deferred.html#addCallbacks twisted.internet.defer.Deferred.addCallbacks] which is similar to, but not exactly the same as, addCallback followed by addErrback. In particular, consider these two cases:

    # Case 1
    d = getDeferredFromSomewhere()
    d.addCallback(callback1)       # A
    d.addErrback(errback1)         # B
    d.addCallback(callback2)
    d.addErrback(errback2)

    # Case 2
    d = getDeferredFromSomewhere()
    d.addCallbacks(callback1, errback1)  # C
    d.addCallbacks(callback2, errback2)

If an error occurs in callback1, then for Case 1 errback1 will be called with the failure. For Case 2, errback2 will be called. Be careful with your callbacks and errbacks.

What this means in a practical sense is that in Case 1, "A" will handle a success condition from getDeferredFromSomewhere, and "B" will handle any errors that come from the upstream source or that occur in "A". In Case 2, "C"'s errback1 will only handle an error condition raised by getDeferredFromSomewhere; it will not handle errors raised in callback1.

Unhandled Errors

Class Overview

Basic Callback Functions

Chaining Deferreds

Automatic Error Conditions

A Brief Interlude: Technical Details

Advanced Processing Chain Control

Handling either synchronous or asynchronous results

Handling possible Deferreds in the library code

Returning a Deferred from synchronous functions

DeferredList

Other behaviours

[http://wiki.woodpecker.org.cn/moin.cgi/PyTwisted_2fLowLevelNetworkingEventLoop Index]

Version: 1.3.0