A deeper question about the re module

Question

Zoom.Quiet <[email protected]>
sender-time     Sent at 09:48 (GMT+08:00).
to      "Python.cn@google" <[email protected]>
date    Thu, May 13, 2010 at 09:48
subject A deeper question about the re module

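I have a question about a regular expression technique:

a="balalalalallalala"
b=re.compile("some regex")
- The regular expression in b contains several patterns joined by alternation (OR);
- a contains either none of the patterns in b or exactly one of them.
  • The problem: what I need to return is not the match itself but a different value that corresponds to the matched pattern.
  • For example, if b contains three patterns, the expected return values are: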

"AB" -> 123423
"CD" -> 654623
"EF" -> 675647

Of course, I could take the number of the group that matched and look it up in a separate dictionary, but that would be slow...
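
The fallback described above (find out which group matched, then look the value up) can be sketched as follows. This is only an illustration: the patterns and values are the three from the example, `PATTERN`, `VALUES` and `lookup` are hypothetical names, and it assumes each alternative is wrapped in its own unnested capturing group so that `lastindex` identifies it.

```python
import re

# One capturing group per alternative, in the same order as VALUES.
PATTERN = re.compile(r"(AB)|(CD)|(EF)")
VALUES = (123423, 654623, 675647)


def lookup(text):
    """Return the value for the pattern found in text, or None if none is found."""
    m = PATTERN.search(text)
    if m is None:
        return None
    # lastindex is the 1-based number of the capturing group that matched.
    return VALUES[m.lastindex - 1]
```

Indexing a tuple by `lastindex` avoids hashing the matched text; whether either lookup is actually slow enough to matter would need to be measured.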

re.sub()

小明同学 <[email protected]>
sender-time     Sent at 14:42 (GMT-07:00).
reply-to        [email protected]
to      [email protected]
date    Thu, May 13, 2010 at 14:42

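    import re

    # 'pat1'..'pat3' and the values are placeholders for the real patterns and results.
    pattern = re.compile('(pat1|pat2|pat3)?')

    d = {None: 'val0',     # group(1) is None when none of the patterns matched
         'pat1': 'val1',
         'pat2': 'val2',
         'pat3': 'val3',
         # ..........
         }

    def lookup(a):
        # match() only looks at the start of a; the trailing ? lets it succeed
        # with an empty match when no pattern is present.
        m = pattern.match(a)
        return d[m.group(1)]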

小明: example

In the following example, the replacement function converts decimal numbers to hexadecimal:

    >>> import re
    >>> def hexrepl( match ):
    ...     "Return the hex string for a decimal number"
    ...     value = int( match.group() )
    ...     return hex(value)
    ...
    >>> p = re.compile(r'\d+')
    >>> p.sub(hexrepl, 'Call 65490 for printing, 49152 for user code.')
    'Call 0xffd2 for printing, 0xc000 for user code.'
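
The same mechanism could be applied to the original question: the replacement function looks the matched text up in a dictionary. A minimal sketch, assuming the three patterns and values from the question; `VALUES`, `PATTERN` and `to_value` are illustrative names that do not appear in the thread.

```python
import re

# Values are strings because a replacement function must return a string.
VALUES = {"AB": "123423", "CD": "654623", "EF": "675647"}

# Build the alternation from the dictionary keys; re.escape guards against
# keys that contain regex metacharacters.
PATTERN = re.compile("|".join(re.escape(key) for key in VALUES))


def to_value(match):
    """Return the replacement string for whichever pattern matched."""
    return VALUES[match.group()]


result = PATTERN.sub(to_value, "first AB, then EF")
```

Here `result` is `'first 123423, then 675647'`. This still performs one dictionary lookup per match, so it addresses convenience rather than the speed concern raised in the question.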

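When using the module-level re.sub() function, the pattern is passed as the first argument.

  • The pattern may be a string or a RegexObject.

  • If you need to specify regular expression flags, you must either pass a RegexObject as the first argument or use an embedded modifier in the pattern; for example, sub("(?i)b+", "x", "bbbb BBBB") returns 'x x'.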
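
The two options can be compared side by side in a short sketch. The variable names are illustrative only. The third call uses the `flags` keyword, which `re.sub()` accepts in Python 2.7 / 3.1 and later; it was not yet available in the releases current when this thread was written.

```python
import re

text = "bbbb BBBB"

# Option 1: compile the pattern with the flag and call the object's sub().
compiled = re.compile("b+", re.IGNORECASE)
with_object = compiled.sub("x", text)

# Option 2: embed the flag in the pattern itself.
with_inline = re.sub("(?i)b+", "x", text)

# Newer Python versions only: pass the flag as a keyword argument.
with_keyword = re.sub("b+", "x", text, flags=re.IGNORECASE)
```

All three calls produce `'x x'`.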

阿暖

阿暖 <[email protected]>
sender-time     Sent at 16:40 (GMT+08:00).
reply-to        [email protected]
to      [email protected]
date    Thu, May 13, 2010 at 16:40

I have seen this kind of idiom in ply:

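    import re

    token = {'123423': "AB",
             '654623': "CD",
             '675647': "EF",
             }

    s = '.**123423.**'
    # search() rather than match(): the number is not at the start of s,
    # so match() would return None here.
    m = re.search('''(123423)|(654623)|(675647)''', s)
    handle = token[m.group()]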

It really does work ;-)

Tested with Kodos: zoomq-2010-05-13-181627_365x663_scrot.png

Automatically generated sample code:

{{{#!python

import re

# common variables
rawstr = r"""(?P<AB>123423)|(?P<CD>654623)|(?P<EF>675647)"""
embedded_rawstr = r"""(?P<AB>123423)|(?P<CD>654623)|(?P<EF>675647)"""
matchstr = """675645 123523 675647 675648 675649 99999999"""

# method 1: using a compile object
compile_obj = re.compile(rawstr)
match_obj = compile_obj.search(matchstr)

# method 2: using search function (w/ external flags)
match_obj = re.search(rawstr, matchstr)

# method 3: using search function (w/ embedded flags)
match_obj = re.search(embedded_rawstr, matchstr)

# Retrieve group(s) from match_obj
all_groups = match_obj.groups()

# Retrieve group(s) by index
group_1 = match_obj.group(1)
group_2 = match_obj.group(2)
group_3 = match_obj.group(3)

# Retrieve group(s) by name
AB = match_obj.group('AB')
CD = match_obj.group('CD')
EF = match_obj.group('EF')
}}}
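
With named groups like those in the generated sample, the match object can report directly which alternative matched, without testing each group in turn. A small sketch reusing the pattern and test string from the sample; `matched_name`, `matched_text` and `groups_by_name` are names introduced here for illustration.

```python
import re

rawstr = r"(?P<AB>123423)|(?P<CD>654623)|(?P<EF>675647)"
matchstr = "675645 123523 675647 675648 675649 99999999"

match_obj = re.search(rawstr, matchstr)

# lastgroup is the name of the group that matched (None if no named group did).
matched_name = match_obj.lastgroup
matched_text = match_obj.group(matched_name)

# groupdict() maps every group name to its text; unmatched groups map to None.
groups_by_name = match_obj.groupdict()
```

For this input only the third alternative occurs in the string, so `matched_name` is `'EF'` and `matched_text` is `'675647'`.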


Feedback

Created by -- ZoomQuiet [2010-05-13 07:37:16]

MiscItems/2010-05-13 (last edited 2010-05-13 10:36:14 by ZoomQuiet)