::-- ZoomQuiet [DateTime(2007-11-10T07:34:49Z)] TableOfContents
1. fast UserDict
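{{{
Jiahua Huang <[email protected]>
reply-to [email protected],
to "python. cn" <[email protected]>,
date Nov 10, 2007 3:28 PM
subject [CPyUG:34791] One line of code makes UserDict.UserDict-based classes 4 times faster
}}}

I noticed that quite a few dictionary classes in the Python standard library derive from UserDict.UserDict rather than from dict, because in old versions of Python the built-in types could not be subclassed.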
So does this affect speed?

To find out, start with two classes, URdict and Rdict, which inherit from UserDict.UserDict and dict respectively:
{{{
>>> import UserDict
>>> class URdict(UserDict.UserDict):
...     '''dict can search key by value
...     '''
...     def indexkey4value(self, value):
...         '''search key by value
...         >>> rd = Rdict(a='One', b='Other', c='What', d='Why', e='Other')
...         >>> rd.indexkey4value('Other')
...         'b'
...         '''
...         try:
...             ind = self.values().index(value)
...             return self.keys()[ind]
...         except ValueError:
...             return None
...     def key4value(self, svalue):
...         '''search key by value
...         >>> rd = Rdict(a='One', b='Other', c='What', d='Why', e='Other')
...         >>> rd.key4value('Other')
...         'b'
...         '''
...         for key, value in self.iteritems():
...             if value == svalue:
...                 return key
...     def keys4value(self, svalue):
...         '''search keys by value
...         >>> rd = Rdict(a='One', b='Other', c='What', d='Why', e='Other')
...         >>> rd.keys4value('Other')
...         ['b', 'e']
...         '''
...         keys = []
...         for key, value in self.iteritems():
...             if value == svalue:
...                 keys.append(key)
...         return keys
...
>>>
>>> class Rdict(dict):
...     '''dict can search key by value
...     '''
...     def indexkey4value(self, value):
...         '''search key by value
...         >>> rd = Rdict(a='One', b='Other', c='What', d='Why', e='Other')
...         >>> rd.indexkey4value('Other')
...         'b'
...         '''
...         try:
...             ind = self.values().index(value)
...             return self.keys()[ind]
...         except ValueError:
...             return None
...     def key4value(self, svalue):
...         '''search key by value
...         >>> rd = Rdict(a='One', b='Other', c='What', d='Why', e='Other')
...         >>> rd.key4value('Other')
...         'b'
...         '''
...         for key, value in self.iteritems():
...             if value == svalue:
...                 return key
...     def keys4value(self, svalue):
...         '''search keys by value
...         >>> rd = Rdict(a='One', b='Other', c='What', d='Why', e='Other')
...         >>> rd.keys4value('Other')
...         ['b', 'e']
...         '''
...         keys = []
...         for key, value in self.iteritems():
...             if value == svalue:
...                 keys.append(key)
...         return keys
...
>>>
}}}
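The two classes above differ only in their base class, so the lookup helpers could be written once and shared. A minimal sketch, assuming Python 3 (where `UserDict` lives in `collections` and `iteritems()` is spelled `items()`); the mixin name `ValueLookupMixin` is hypothetical and does not appear in the original post:

```python
from collections import UserDict


class ValueLookupMixin:
    """Hypothetical mixin: reverse lookups (value -> key) for any mapping."""

    def key4value(self, svalue):
        # Return the first key whose value equals svalue, or None.
        for key, value in self.items():
            if value == svalue:
                return key
        return None

    def keys4value(self, svalue):
        # Return every key whose value equals svalue.
        return [key for key, value in self.items() if value == svalue]


class URdict(ValueLookupMixin, UserDict):
    """Reverse-lookup dictionary built on collections.UserDict."""


class Rdict(ValueLookupMixin, dict):
    """Reverse-lookup dictionary built directly on dict."""


rd = Rdict(a='One', b='Other', c='What', d='Why', e='Other')
urd = URdict(a='One', b='Other', c='What', d='Why', e='Other')
print(rd.key4value('Other'), urd.keys4value('Other'))
```

On Python 3.7 and later, dictionaries keep insertion order, so the first match for 'Other' is 'b'. On the Python 2 interpreter used in the post, dictionary order is arbitrary and the first match is not guaranteed.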
{{{
>>> import time
>>> def _timeit(_src):
...     exec('''
... _t0 = time.time()
... %s
... _t1 = time.time()
... _t3 = _t1 - _t0
... '''%_src)
...     return _t3
...
>>> ran = range(100000)
}}}
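The `_timeit` helper above depends on the way Python 2's `exec` statement writes into a function's local variables, which does not carry over to Python 3. A minimal alternative sketch, assuming Python 3.3 or later for `time.perf_counter`; the names `time_call` and `fill` are made up for this example:

```python
import time


def time_call(func, *args):
    """Return the wall-clock seconds taken by one call of func(*args)."""
    t0 = time.perf_counter()
    func(*args)
    return time.perf_counter() - t0


def fill(mapping, keys):
    # The workload from the post: insert every key with itself as the value.
    for i in keys:
        mapping[i] = i


ran = range(100000)
d = {}
elapsed = time_call(fill, d, ran)
print("plain dict: %.4f s" % elapsed)
```

Passing a callable avoids building source strings. For repeated runs with less noise, the standard `timeit` module is the usual tool.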
Next, create one instance of each:

{{{
>>> u = URdict()
>>> r = Rdict()
}}}
Compare the insertion speed:

{{{
>>> _timeit("for i in ran: u[i]=i")
0.1777961254119873
>>> _timeit("for i in ran: r[i]=i")
0.048948049545288086
}}}
And the speed of a plain dict:

{{{
>>> d = {}
>>> _timeit("for i in ran: d[i]=i")
0.041368961334228516
}}}
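As the numbers show, UserDict.UserDict really does hurt speed badly: in this run, inserting into URdict took about 3.6 times as long as inserting into Rdict, and about 4.3 times as long as inserting into a plain dict.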
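A likely reason for the gap is that `UserDict.UserDict` is written in Python and keeps its items in an inner dictionary, `self.data`, so every `u[i] = i` runs a Python-level `__setitem__` before it reaches the real dict. A `dict` subclass that does not override `__setitem__` stays on the built-in path. The sketch below illustrates the delegation pattern; `TinyUserDict` is a simplified stand-in written for this example, not the standard library source:

```python
class TinyUserDict:
    """Simplified stand-in for a UserDict-style wrapper (illustration only)."""

    def __init__(self):
        self.data = {}          # the real storage is an ordinary dict

    def __setitem__(self, key, item):
        # Executed as Python bytecode on every assignment, then delegates.
        self.data[key] = item

    def __getitem__(self, key):
        return self.data[key]

    def __len__(self):
        return len(self.data)


class PlainSubclass(dict):
    """Inherits dict.__setitem__ unchanged, so no extra Python call is made."""


t = TinyUserDict()
p = PlainSubclass()
for i in range(5):
    t[i] = i
    p[i] = i
print(t.data == p)
```

Both objects end up holding the same items; the difference lies only in how many Python-level calls each assignment costs.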
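Many of the classes in the Python standard library that are built on UserDict should be switched to dict to improve performance.

However, patching the standard library module by module hardly seems appropriate.

So, once again, reach for the same old trick and simply knock UserDict out of the way.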
Before using or importing those modules, run this one line first:

{{{
>>> import UserDict; UserDict.UserDict = dict
}}}
After that, import the modules and try again (here URdict is defined again after the patch):

{{{
>>> u = URdict()
>>> _timeit("for i in ran: u[i]=i")
0.042366981506347656
}}}
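The replacement only affects classes created after the assignment, because a class statement looks up its base class at the moment it runs. A small sketch of that ordering, using a throwaway module object (`fake_userdict`, invented here) rather than patching the real standard library:

```python
import types

# Hypothetical stand-in for a module that exposes a pure-Python UserDict.
fake_userdict = types.ModuleType("fake_userdict")


class _SlowDict(object):
    def __init__(self):
        self.data = {}

    def __setitem__(self, key, item):
        self.data[key] = item


fake_userdict.UserDict = _SlowDict


class Before(fake_userdict.UserDict):   # base resolved now: _SlowDict
    pass


fake_userdict.UserDict = dict           # the one-line replacement


class After(fake_userdict.UserDict):    # base resolved now: dict
    pass


print(issubclass(Before, dict), issubclass(After, dict))
print(hasattr(Before(), "data"), hasattr(After(), "data"))
```

`Before` is still the wrapper class, while `After` is a real `dict` subclass. `After` instances also have no `data` attribute, so any code that reaches into `.data` would need checking before relying on this trick.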
One line of code, and the class is roughly 4 times faster.