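##language:en
#pragma section-numbers on
''' Python Performance Tuning Notes '''

::-- Roka

= Overview =
TODO

== String Concatenation ==
(1) Ordinary code:
{{{#!python
s = ""
for substring in somelist:
    s += substring
}}}
High-performance code:
{{{#!python
s = "".join(somelist)
}}}
(2) Ordinary code:
{{{#!python
s = ""
for x in somelist:
    s += someFunction(x)
}}}
High-performance code:
{{{#!python
slist = [someFunction(x) for x in somelist]
s = "".join(slist)
}}}
(3) Ordinary code:
{{{#!python
out = "" + head + prologue + query + tail + ""
}}}
High-performance code:
{{{#!python
# head, prologue, query and tail are looked up by name in locals()
out = "%(head)s%(prologue)s%(query)s%(tail)s" % locals()
}}}

== Loops ==
(1) A routine that converts words to upper case.

Ordinary code:
{{{#!python
newlist = []
for word in oldlist:
    newlist.append(word.upper())
}}}
High-performance code:
{{{#!python
# map() is implemented in C and is fast. In Python 3 it is still
# available, but it returns an iterator instead of a list.
newlist = map(str.upper, oldlist)

# Or a list comprehension (Python >= 2.0)
newlist = [s.upper() for s in oldlist]

# Or a generator expression (Python >= 2.4); this yields items lazily
# instead of building a list
newlist = (s.upper() for s in oldlist)
}}}

== Object-Oriented Code ==
(1) If you cannot use map() or a list comprehension and have to write a plain loop, avoid "dotted" lookups inside the loop body. Bind the methods to names once, before the loop:
{{{#!python
upper = str.upper
newlist = []
append = newlist.append
# loop without dots
for word in oldlist:
    append(upper(word))
}}}

== Local Variables ==
(1) The ultimate trick: use local variables instead of global ones. Local names are resolved faster than globals, so wrapping the loop in a function helps:
{{{#!python
def func():
    upper = str.upper
    newlist = []
    append = newlist.append
    for word in words:
        append(upper(word))
    return newlist
}}}

== Dictionaries ==
(1) Avoid an if test on every iteration.

Ordinary code:
{{{#!python
wdict = {}
for word in words:
    if word not in wdict:
        wdict[word] = 0
    wdict[word] += 1
}}}
High-performance code:
{{{#!python
# Older interpreters (Py < 2.x): try/except
wdict = {}
for word in words:
    try:
        wdict[word] += 1
    except KeyError:
        wdict[word] = 1

# Newer interpreters (Py 2.x and later): dict.get()
wdict = {}
get = wdict.get
for word in words:
    wdict[word] = get(word, 0) + 1
}}}
If the dictionary values are objects or lists, you can also use the dict.setdefault method:
{{{#!python
wdict.setdefault(key, []).append(newElement)
}}}

== Import ==
(1) An import inside a function can be cheaper than a module-level import: its cost is deferred until the function is actually called, which reduces startup time.

(2) Make sure the module is imported only once.
{{{#!python
# check
pack = None

def parse_pack():
    global pack
    if pack is None:
        import pack
    ...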
}}}

== Data Aggregation ==
(1) Avoid function calls inside loops.

Ordinary code:
{{{#!python
import time

x = 0

def doit(i):
    global x
    x = x + 1

data = range(100000)
t = time.time()
for i in data:
    doit(i)

print "%.3f" % (time.time() - t)
}}}
High-performance code:
{{{#!python
import time

x = 0

def doit(data):
    # The loop runs inside the function: one call instead of 100000
    global x
    for i in data:
        x = x + 1

data = range(100000)
t = time.time()
doit(data)

print "%.3f" % (time.time() - t)
}}}
(Surprisingly, this turned out to be more than 4 times faster!)

== Use xrange() Instead of range() ==
{{{#!python
# Measuring the performance using the profile module
def myFunc():
    b = []
    a = [[1,2,3],[4,5,6]]
    for x in range(len(a)):
        for y in range(len(a[x])):
            b.append(a[x][y])

import profile
profile.run("myFunc()", "myFunc.profile")

import pstats
pstats.Stats("myFunc.profile").sort_stats("time").print_stats()
}}}
Result:
{{{
Wed May 23 12:05:07 2007    myFunc.profile

         16 function calls in 0.001 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    0.001    0.001 :0(setprofile)
        1    0.000    0.000    0.001    0.001 profile:0(myFunc())
        1    0.000    0.000    0.000    0.000 D:/Python25/measuringPerf.py:7(myFunc)
        6    0.000    0.000    0.000    0.000 :0(append)
        3    0.000    0.000    0.000    0.000 :0(range)
        1    0.000    0.000    0.000    0.000 :1()
        3    0.000    0.000    0.000    0.000 :0(len)
        0    0.000             0.000          profile:0(profiler)
}}}
Now replace range() with xrange():
{{{#!python
# Measuring the performance using the profile module
def myFunc():
    b = []
    a = [[1,2,3],[4,5,6]]
    for x in xrange(len(a)):
        for y in xrange(len(a[x])):
            b.append(a[x][y])

import profile
profile.run("myFunc()", "myFunc.profile")

import pstats
pstats.Stats("myFunc.profile").sort_stats("time").print_stats()
}}}
Result:
{{{
Wed May 23 12:05:59 2007    myFunc.profile

         13 function calls in 0.001 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    0.001    0.001 :0(setprofile)
        1    0.000    0.000    0.001    0.001 profile:0(myFunc())
        1    0.000    0.000    0.000    0.000 D:/Python25/measuringPerf.py:7(myFunc)
        6    0.000    0.000    0.000    0.000 :0(append)
        1    0.000    0.000    0.000    0.000 :1()
        3    0.000    0.000    0.000    0.000 :0(len)
        0    0.000             0.000          profile:0(profiler)
}}}
Note that the number of function calls dropped from 16 to 13. The reported CPU time is the same in both runs, but each figure comes from a single execution, so the timings are too small to compare.
{{{
Notes:
(ncalls):  number of calls
(tottime): total time spent in the function itself (excluding sub-functions)
(cumtime): cumulative time spent in the function (including sub-functions)
(percall): average time per call
}}}
Both range() and xrange() are implemented in C. The advantage of xrange() is that it produces values on demand instead of building the whole list in memory first, as range() does in Python 2.

= Discussion =
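
The xrange() profile runs report identical CPU time because each was executed only once. A minimal sketch of a repeated measurement with the standard timeit module may make such comparisons more meaningful; it uses the string-concatenation tip as its subject. The names concat_loop, concat_join and best_time, and the repeat and number values, are illustrative assumptions and not part of the original notes. The sketch is written for Python 3.

```python
import timeit


def concat_loop(parts):
    """Build a string with repeated += (the 'ordinary' version)."""
    s = ""
    for part in parts:
        s += part
    return s


def concat_join(parts):
    """Build the same string with a single join() call."""
    return "".join(parts)


def best_time(func, arg, repeat=5, number=100):
    """Return the best total time in seconds for `number` calls,
    taken over `repeat` independent rounds."""
    timer = timeit.Timer(lambda: func(arg))
    return min(timer.repeat(repeat=repeat, number=number))


# Hypothetical test data: 1000 short strings
parts = [str(i) for i in range(1000)]

loop_seconds = best_time(concat_loop, parts)
join_seconds = best_time(concat_join, parts)
print("loop: %.6f s, join: %.6f s" % (loop_seconds, join_seconds))
```

Taking the minimum over several rounds reduces noise from other processes. The absolute numbers depend on the machine and interpreter, so no expected output is shown.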