<<TableOfContents>>
= 2005-06-29 Reading a child process's output without blocking =
'''Prompted by [[PyCNmail:2005-June/011936.html|Reading a child process's output without blocking]]'''
{{{
From: Hong Yuan <hongyuan@homemaster.cn>
Reply-To: python-chinese@lists.python.cn
To: python-chinese@lists.python.cn
Date: 2005-06-29 1:51 PM
}}}
== Question ==
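{{{My program needs to start an external command and read its output. The problem is that the
command pauses for a long time after producing the first part of its output. I don't want to
wait for the command to finish before reading; I want to get the first part of the output first.

For example, suppose the external program is the following Python code: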

test.py:
import sys, time
print 'hello'*500
sys.stdout.flush()
time.sleep(100)
print 'world'*500

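I want to set a timeout in my own program and read everything the external process has written
before that timeout expires, i.e. the first 500 'hello's. This is what I wrote: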

import os, select
cmd = 'python test.py'
pin, pout = os.popen2(cmd)
select.select([pout], [], [], timeout)[0]
pout.read()

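According to the documentation, select should return as soon as pout has something to read
(the first 500 'hello's), but in practice no readable content comes back until test.py has
finished running.

Does anyone know how to write this correctly? The code runs on Linux.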
}}}
== Discussion ==
{{{From: Neo Chan (netkiller) <neo.chen@achievo.com>
Reply-To: python-chinese@lists.python.cn
To: python-chinese@lists.python.cn
Date: 2005-06-29 2:08 PM
}}}
{{{#!python
import select
import socket
import time

PORT = 8037

TIME1970 = 2208988800L

service = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
service.bind(("", PORT))
service.listen(1)

print "listening on port", PORT

while 1:
   is_readable = [service]
   is_writable = []
   is_error = []
   r, w, e = select.select(is_readable, is_writable, is_error, 1.0)
   if r:
       channel, info = service.accept()
       print "connection from", info
       t = int(time.time()) + TIME1970
       t = chr(t>>24&255) + chr(t>>16&255) + chr(t>>8&255) + chr(t&255)
       channel.send(t) # send timestamp
       channel.close() # disconnect
   else:
       print "still waiting"
}}}


Neo Chan (netkiller)
-----
{{{
From: shhgs <shhgs.efhilt@gmail.com>
Reply-To: shhgs <shhgs.efhilt@gmail.com>, python-chinese@lists.python.cn
To: python-chinese@lists.python.cn
Date: 2005-06-29 7:19 PM
}}}
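{{{> I want to set a timeout in my own program and read everything the external process has
> written before that timeout expires, i.e. the first 500 'hello's. This is what I wrote: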
>
> import os, select
> cmd = 'python test.py'
> pin, pout = os.popen2(cmd)
> select.select([pout], [], [], timeout)[0]

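select examines three sets of streams and reports which of them are readable, writable, or have an error pending.

There are two problems here. First, you don't keep the sets that select returns, so the result is garbage-collected. Second, select is non-blocking, so once it returns the program goes straight on to the next statement, pout.read(), which of course cannot complete until the child program has finished. So I think it should be written like this: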

while 1:
    i, o, e = select.select([pout], [], [], timeout)
    if i:
        print i[0].read()

> pout.read()
>
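> According to the documentation, select should return as soon as pout has something to read
> (the first 500 'hello's), but in practice no readable content comes back until test.py has
> finished running.
>
> Does anyone know how to write this correctly? The code runs on Linux.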

By the way, this works on Linux but not on Windows: select on Windows only supports sockets.
}}}
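The exchange above turns on how select interacts with the read call that follows it. The following is a minimal Python 3 sketch, not code from the thread: the helper name `read_available` is hypothetical, and an `os.pipe` pair stands in for the child process's stdout. It illustrates that select reports a pipe as readable as soon as any data has arrived, and that `os.read` returns what is available at that moment, whereas a file object's `read()` with no size argument keeps reading until end-of-file.

```python
import os
import select


def read_available(fd, timeout, chunk_size=4096):
    """Return bytes readable on fd now, b"" at end-of-file, or None on timeout.

    Hypothetical helper for illustration; POSIX only, because select on
    Windows accepts sockets but not pipes.
    """
    readable, _, _ = select.select([fd], [], [], timeout)
    if not readable:
        return None  # nothing arrived within timeout seconds
    # os.read returns as soon as some data is available (at most chunk_size
    # bytes); it does not wait for the writer to close the pipe.
    return os.read(fd, chunk_size)


if __name__ == "__main__":
    r, w = os.pipe()
    os.write(w, b"hello")            # the write end stays open: no EOF yet
    print(read_available(r, 1.0))    # the data written so far
    print(read_available(r, 0.1))    # no new data, so the call times out
    os.close(w)
    print(read_available(r, 1.0))    # write end closed: end-of-file
    os.close(r)
```

Distinguishing a timeout (`None`) from end-of-file (`b""`) is a design choice of this sketch; it lets the caller decide whether to keep waiting or stop.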
== Result ==
{{{#!python
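import os, fcntl, select

cmd = 'python test.py'

timeout = 2  # seconds to wait in each select call
pin, pout = os.popen2(cmd)  # pin: child's stdin, pout: child's stdout

# Use the raw file descriptor so os.read can be called on it below.
# A separate name keeps the pout file object referenced, so the pipe stays open.
fd = pout.fileno()
# Put the descriptor into non-blocking mode.
flags = fcntl.fcntl(fd, fcntl.F_GETFL)
fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

while 1:
    # Wait up to timeout seconds for the child to produce output.
    i, o, e = select.select([fd], [], [], timeout)
    if i:
        # Read whatever is available now, at most 1024 bytes.
        buf = os.read(fd, 1024)
        if buf:
            print buf
        else:
            # An empty read means end-of-file: the child closed its stdout.
            break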
}}}
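 * fcntl puts the file descriptor into non-blocking mode, select checks whether data is ready to read, and os.read then reads whatever is currently available (up to 1024 bytes per call).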
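The result above targets Python 2 (`os.popen2` and the `print` statement). As a hedged sketch of the same idea in Python 3, the code below uses `subprocess.Popen` in place of `os.popen2` and `os.set_blocking` in place of the fcntl calls. The function name `read_until_deadline` and the overall deadline are assumptions added here to match the original question ("read everything written before the timeout"); they are not taken from the thread. POSIX only.

```python
import os
import select
import subprocess
import sys
import time


def read_until_deadline(cmd, timeout):
    """Start cmd; return (bytes written to stdout within timeout seconds, process).

    Hypothetical helper for illustration. The child keeps running after the
    deadline; the caller decides whether to wait for it or kill it.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    fd = proc.stdout.fileno()
    os.set_blocking(fd, False)  # same effect as setting O_NONBLOCK via fcntl
    chunks = []
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # overall deadline reached
        readable, _, _ = select.select([fd], [], [], remaining)
        if not readable:
            break  # select timed out: no more output before the deadline
        buf = os.read(fd, 4096)  # whatever is available, up to 4096 bytes
        if not buf:
            break  # end-of-file: the child closed its stdout
        chunks.append(buf)
    return b"".join(chunks), proc


if __name__ == "__main__":
    # Stand-in for test.py: print, flush, pause, then print again.
    child = [sys.executable, "-c",
             "import sys, time; print('hello' * 3); sys.stdout.flush(); "
             "time.sleep(5); print('world' * 3)"]
    out, proc = read_until_deadline(child, 1.0)
    print(out)   # only the part written before the pause
    proc.kill()
    proc.wait()
    proc.stdout.close()
```

Unlike the loop in the result above, which keeps polling until the child exits, this sketch stops at the deadline and hands the still-running process back to the caller.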