Differences between revisions 2 and 7 (spanning 5 versions)

Python Performance Tuning Notes

::-- Roka [2007-04-26 14:46:55]

Contents

Overview
Discussion

1. Overview

TODO

1.1. String Concatenation

(1)

Normal code:

    s = ""
    for substring in somelist:
        s += substring

Optimized code:

    s = "".join(somelist)
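The two versions in (1) can be compared with a rough timing sketch. This is only an illustration: the sample data and the function names `concat_with_plus` and `concat_with_join` are made up here, and the numbers will vary with the interpreter and machine.

```python
import timeit

# Hypothetical sample data: 10,000 short strings (not from the original notes).
parts = ["chunk%d" % i for i in range(10000)]


def concat_with_plus(strings):
    """Build the result by repeated += (the 'normal' version)."""
    s = ""
    for substring in strings:
        s += substring
    return s


def concat_with_join(strings):
    """Build the result with a single str.join call (the 'optimized' version)."""
    return "".join(strings)


# Both versions must produce the same string.
assert concat_with_plus(parts) == concat_with_join(parts)

# Time 100 runs of each; absolute numbers depend on machine and interpreter.
plus_time = timeit.timeit(lambda: concat_with_plus(parts), number=100)
join_time = timeit.timeit(lambda: concat_with_join(parts), number=100)
print("+=   : %.4f s" % plus_time)
print("join : %.4f s" % join_time)
```

Measuring on your own data is the safer habit, since some interpreters optimize the `+=` case.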

(2)

Normal code:

    s = ""
    for x in somelist:
        s += someFunction(x)

Optimized code:

    slist = [someFunction(x) for x in somelist]
    s = "".join(slist)

(3)

Normal code:

    out = "<html>" + head + prologue + query + tail + "</html>"

Optimized code:

    out = "<html>%(head)s%(prologue)s%(query)s%(tail)s</html>" % locals()

1.2. Loops

(1) Converting to upper case:

Normal code:

    newlist = []
    for word in oldlist:
        newlist.append(word.upper())

Optimized code:

    # map() is implemented in C and is fast; note that in Python 3 it
    # returns an iterator instead of a list
    newlist = map(str.upper, oldlist)
    # Or (list comprehensions, Py >= 2.0)
    newlist = [s.upper() for s in oldlist]
    # Or (generator expressions, Py >= 2.4); this yields items lazily
    # instead of building a list
    newlist = (s.upper() for s in oldlist)
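The three optimized forms do not all return the same kind of object, which matters when the result is used more than once. A small sketch, with made-up sample data and variable names, written to run on both Python 2 and Python 3:

```python
oldlist = ["spam", "eggs", "ham"]  # hypothetical sample data

# List comprehension: builds the whole list immediately.
as_list = [s.upper() for s in oldlist]

# Generator expression: produces items one at a time, on demand.
as_gen = (s.upper() for s in oldlist)

# map(): a list on Python 2, a lazy iterator on Python 3;
# wrapping it in list() gives a list on both.
as_map = list(map(str.upper, oldlist))

print(as_list)            # ['SPAM', 'EGGS', 'HAM']
print(as_map == as_list)  # True
print(list(as_gen))       # ['SPAM', 'EGGS', 'HAM']
print(list(as_gen))       # [] because the generator is now exhausted
```

If the result needs to be indexed or iterated twice, the list comprehension is the simpler choice.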

1.3. OOP

(1) Suppose you cannot use map() or a list comprehension and have to write a loop. In that case, avoid dotted lookups inside the loop:

    upper = str.upper
    newlist = []
    append = newlist.append
    # loop without dots
    for word in oldlist:
        append(upper(word))

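1.4. Local Variables

(1) The final speedup: use local variables instead of global variables.

    def func():
        # names bound inside the function are locals, which are
        # looked up faster than globals
        upper = str.upper
        newlist = []
        append = newlist.append
        for word in words:
            append(upper(word))
        return newlist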
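To check whether removing the dots and using local names pays off on a given interpreter, the two loop styles can be timed side by side. This is a rough sketch under assumed sample data; `with_dots` and `without_dots` are names invented for the comparison.

```python
import timeit

words = ["spam", "eggs", "ham"] * 1000  # hypothetical sample data


def with_dots(items):
    """Look up newlist.append and word.upper on every iteration."""
    newlist = []
    for word in items:
        newlist.append(word.upper())
    return newlist


def without_dots(items):
    """Bind the methods to local names once, before the loop."""
    upper = str.upper
    newlist = []
    append = newlist.append
    for word in items:
        append(upper(word))
    return newlist


# Both loops must build the same list.
assert with_dots(words) == without_dots(words)

dots_time = timeit.timeit(lambda: with_dots(words), number=200)
local_time = timeit.timeit(lambda: without_dots(words), number=200)
print("with dots    : %.4f s" % dots_time)
print("without dots : %.4f s" % local_time)
```

The gap differs between Python versions, so it is worth measuring before trading readability for this trick.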

1.5. Dictionary

(1) Avoid an if inside the for loop:

Normal code:

    wdict = {}
    for word in words:
        if word not in wdict:
            wdict[word] = 0
        wdict[word] += 1

Optimized code:

    # try/except variant: the exception is raised only the first
    # time a word is seen
    wdict = {}
    for word in words:
        try:
            wdict[word] += 1
        except KeyError:
            wdict[word] = 1

    # dict.get() variant (Py 2.x)
    wdict = {}
    get = wdict.get
    for word in words:
        wdict[word] = get(word, 0) + 1

If the values stored in the dictionary are objects or lists, you can also use the dict.setdefault method:

    wdict.setdefault(key, []).append(newElement)
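The counting variants above are interchangeable in what they compute, which the following sketch checks on a small made-up word list. The function names and the `group_by_length` helper are illustrative assumptions, not part of the original notes.

```python
def count_with_if(items):
    """Counting with an explicit membership test."""
    wdict = {}
    for word in items:
        if word not in wdict:
            wdict[word] = 0
        wdict[word] += 1
    return wdict


def count_with_try(items):
    """Counting with try/except KeyError."""
    wdict = {}
    for word in items:
        try:
            wdict[word] += 1
        except KeyError:
            wdict[word] = 1
    return wdict


def count_with_get(items):
    """Counting with dict.get and a default of 0."""
    wdict = {}
    get = wdict.get
    for word in items:
        wdict[word] = get(word, 0) + 1
    return wdict


def group_by_length(items):
    """Collect items into lists keyed by length, using dict.setdefault."""
    groups = {}
    for item in items:
        groups.setdefault(len(item), []).append(item)
    return groups


words = ["a", "b", "a", "c", "b", "a"]  # hypothetical sample data
expected = {"a": 3, "b": 2, "c": 1}
print(count_with_if(words) == expected)   # True
print(count_with_try(words) == expected)  # True
print(count_with_get(words) == expected)  # True
print(group_by_length(["to", "be", "or", "not"]) == {2: ["to", "be", "or"], 3: ["not"]})  # True
```

Which variant is fastest depends on how often a key is new: try/except pays for every first occurrence, while the if test pays on every iteration.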

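1.6. Import

(1) Importing inside a function can be more efficient than importing at module level, because the module is loaded only when the function is actually called.

(2) Make sure the import happens only once.

    # Module-level placeholder; replaced by the module on first use
    pack = None

    def parse_pack():
        global pack
        if pack is None:
            import pack    # binds the global name, so this runs only once
        ...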
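A function-level import can be sketched with a standard-library module standing in for `pack`. The use of `json` and the name `parse_json` are assumptions made for the sake of a runnable example. After the first call, the import statement only fetches the already-loaded module from `sys.modules`.

```python
import sys


def parse_json(text):
    """Parse JSON text, importing the json module on first use."""
    # The import statement runs on every call, but after the first call
    # it only fetches the already-loaded module from sys.modules.
    import json  # stand-in module; the notes use a module named "pack"
    return json.loads(text)


print(parse_json("[1, 2, 3]"))  # [1, 2, 3]
print("json" in sys.modules)    # True

first = sys.modules["json"]
parse_json("{}")
print(sys.modules["json"] is first)  # True: the module was not loaded again
```

The explicit `pack is None` guard from the notes additionally skips the import statement itself on later calls.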

1.7. Data Aggregation

(1) Avoid function calls inside a loop

Normal code:

    import time
    x = 0
    def doit(i):
        global x
        x = x + 1

    items = range(100000)
    t = time.time()
    for i in items:
        doit(i)          # one function call per element

    print "%.3f" % (time.time() - t)

Optimized code:

    import time
    x = 0
    def doit(items):
        global x
        for i in items:  # the loop now lives inside the function
            x = x + 1

    items = range(100000)
    t = time.time()
    doit(items)          # a single function call

    print "%.3f" % (time.time() - t)

(What?? More than 4 times faster!!)
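The per-call overhead can also be measured with `timeit` instead of `time.time()`. The sketch below is a variant of the idea that avoids the global counter; `increment`, `call_per_item` and `loop_inside` are hypothetical names, and the ratio you observe may differ from the one quoted above.

```python
import timeit


def increment(total):
    """A trivial function, so the call overhead dominates its cost."""
    return total + 1


def call_per_item(items):
    """Call increment() once for every element."""
    total = 0
    for _ in items:
        total = increment(total)
    return total


def loop_inside(items):
    """Do the same work inline, with no function call per element."""
    total = 0
    for _ in items:
        total = total + 1
    return total


numbers = list(range(100000))

# Both versions count the same number of elements.
assert call_per_item(numbers) == loop_inside(numbers) == 100000

call_time = timeit.timeit(lambda: call_per_item(numbers), number=10)
inline_time = timeit.timeit(lambda: loop_inside(numbers), number=10)
print("call per item : %.4f s" % call_time)
print("loop inside   : %.4f s" % inline_time)
```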

1.8. Use xrange() Instead of range()

    # Measuring the performance using the profile module

    def myFunc():
        b = []
        a = [[1,2,3],[4,5,6]]
        for x in range(len(a)):
            for y in range(len(a[x])):
                b.append(a[x][y])

    import profile
    profile.run("myFunc()","myFunc.profile")
    import pstats
    pstats.Stats("myFunc.profile").sort_stats("time").print_stats()

Result:

Wed May 23 12:05:07 2007    myFunc.profile

         16 function calls in 0.001 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    0.001    0.001 :0(setprofile)
        1    0.000    0.000    0.001    0.001 profile:0(myFunc())
        1    0.000    0.000    0.000    0.000 D:/Python25/measuringPerf.py:7(myFunc)
        6    0.000    0.000    0.000    0.000 :0(append)
        3    0.000    0.000    0.000    0.000 :0(range)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        3    0.000    0.000    0.000    0.000 :0(len)
        0    0.000             0.000          profile:0(profiler)

Now replace range() with xrange():

    # Measuring the performance using the profile module

    def myFunc():
        b = []
        a = [[1,2,3],[4,5,6]]
        for x in xrange(len(a)):
            for y in xrange(len(a[x])):
                b.append(a[x][y])

    import profile
    profile.run("myFunc()","myFunc.profile")
    import pstats
    pstats.Stats("myFunc.profile").sort_stats("time").print_stats()

Result:

Wed May 23 12:05:59 2007    myFunc.profile

         13 function calls in 0.001 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    0.001    0.001 :0(setprofile)
        1    0.000    0.000    0.001    0.001 profile:0(myFunc())
        1    0.000    0.000    0.000    0.000 D:/Python25/measuringPerf.py:7(myFunc)
        6    0.000    0.000    0.000    0.000 :0(append)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        3    0.000    0.000    0.000    0.000 :0(len)
        0    0.000             0.000          profile:0(profiler)

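Note that the number of function calls dropped from 16 to 13: the three range() calls no longer appear in the profile.

The reported CPU time is the same, but this is the result of only a single run.

Notes:
(ncalls): number of calls.
(tottime): total time spent in the function itself (excluding sub-functions)
(cumtime): cumulative time spent in the function (including sub-functions)
(percall): average time per call

xrange() is implemented entirely in C, and it produces values on demand instead of building the whole list in memory first.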
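The practical difference between a lazy range and a fully built list shows up in memory rather than in a tiny profile like the one above. A minimal sketch, assuming a made-up size `n` and a helper name `lazy_range`; it picks `xrange` on Python 2 and `range` on Python 3, where `range` is already lazy:

```python
import sys

n = 1000000  # hypothetical size, chosen only to make the difference visible

# On Python 3, range() is lazy (like Python 2's xrange()); on Python 2,
# fall back to xrange() so `lazy` is a lazy sequence either way.
try:
    lazy_range = xrange  # Python 2
except NameError:
    lazy_range = range   # Python 3

lazy = lazy_range(n)
eager = list(lazy_range(n))  # what Python 2's range(n) would build

# getsizeof reports the container itself, not the integers it refers to.
print("lazy  : %d bytes" % sys.getsizeof(lazy))
print("eager : %d bytes" % sys.getsizeof(eager))
print(sum(lazy) == sum(eager))  # True: both yield the same values
```

For very short loops, such as the 2x3 example in the profile, the difference is negligible; it grows with the size of the range.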

Revision 2 as of 2007-05-22 07:14:37
  Size: 3237
  Editor: Roka
  Comment:
Revision 7 as of 2009-12-25 07:16:00
  Size: 5761
  Editor: localhost
  Comment: converted to 1.6 markup

Diff for "PyPerformanceTuning"

1. Overview

1.1. String Concatenation

1.2. Loops

1.3. OOP

1.4. Local Variables

1.5. Dictionary

1.6. Import

1.7. Data Aggregation

1.8. Use xrange() Instead of range()

2. Discussion