##language:en
#pragma section-numbers off
##General template for ZPyUG articles with section index navigation
<<TableOfContents>>
## Default navigation, please keep
<<Include(ZPyUGnav)>>


= A deeper question about the re module's features =


##startInc
== Question ==
{{{
Zoom.Quiet <zoom.quiet@gmail.com>
sender-time	Sent at 09:48 (GMT+08:00).
to	"Python.cn@google" <python-cn@googlegroups.com>
date	Thu, May 13, 2010 at 09:48
subject	A deeper question about the re module's features
}}}
Here is a question about a regular-expression technique:

{{{
a="balalalalallalala"
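b=re.compile("some regex")
- the regular expression in b contains several patterns, joined by alternation (OR);
- a contains at most one of the patterns in b: either none, or exactly one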
}}}
 * The catch is that I now need to return a different, corresponding value!
 * For example, if b contains 3 patterns, the expected return values are:
{{{
"AB" -> 123423
"CD" -> 654623
"EF" -> 675647
}}}
Of course, I could take the number of the group that matched and look the value up in a separate dictionary, but that would be slow...
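
For reference, the baseline described above (group number plus a separate lookup) can be sketched as follows. This is only an illustration under assumptions: the names `VALUES`, `PATTERN` and `lookup` are hypothetical, and the three alternatives and values are the ones listed in the question. The lookup is a single tuple index, so its cost is probably small next to the search itself, though only timing it (for example with `timeit`) would settle that.

```python
import re

# Values from the question, ordered to line up with groups 1, 2 and 3.
VALUES = (123423, 654623, 675647)
PATTERN = re.compile(r"(AB)|(CD)|(EF)")

def lookup(text):
    """Return the value for whichever alternative matched, or None."""
    m = PATTERN.search(text)
    if m is None:
        return None
    # lastindex is the 1-based number of the group that matched.
    return VALUES[m.lastindex - 1]

print(lookup("xxCDxx"))
print(lookup("nothing here"))
```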

=== re.sub() ===
{{{
小明同学 <wjm251@gmail.com>
sender-time	Sent at 14:42 (GMT-07:00).
reply-to	python-cn@googlegroups.com
to	python-cn@googlegroups.com
date	Thu, May 13, 2010 at 14:42
}}}

{{{#!python
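import re

# One alternative per pattern; the values are the ones from the question.
b = re.compile('(AB|CD|EF)')

d = {'AB': 123423,
     'CD': 654623,
     'EF': 675647,
     #..........
    }

def lookup(text, default=None):
    """Return the value for the first alternative found in text."""
    a = b.search(text)
    return d[a.group()] if a else default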
}}}

=== 小明: example ===
In the example below, the replacement function converts decimal numbers to hexadecimal:

{{{
#!python
>>> import re
>>> def hexrepl( match ):
...     "Return the hex string for a decimal number"
...     value = int( match.group() )
...     return hex(value)
...
>>> p = re.compile(r'\d+')
>>> p.sub(hexrepl, 'Call 65490 for printing, 49152 for user code.')
'Call 0xffd2 for printing, 0xc000 for user code.'
}}}
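
The same callable-replacement idea can be pointed at the mapping from the question. The sketch below assumes the three alternatives are the literal strings `AB`, `CD` and `EF`; `VALUES`, `PATTERN` and `replace` are hypothetical names chosen for illustration.

```python
import re

# Mapping taken from the question; replace() is a hypothetical helper name.
VALUES = {"AB": 123423, "CD": 654623, "EF": 675647}
PATTERN = re.compile(r"AB|CD|EF")

def replace(match):
    """Return the mapped value, as text, for the matched alternative."""
    return str(VALUES[match.group()])

result = PATTERN.sub(replace, "xx AB yy EF")
print(result)
```

Because `sub()` must return a string, the numeric values are converted with `str()`; if only the value is needed and not a rewritten string, a plain `search()` followed by the dictionary lookup is enough.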

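When you call the module-level re.sub() function, the pattern is passed as the first argument.
 * The pattern may be a string or a `RegexObject`.
 * If you need to specify regular expression flags, either pass a `RegexObject` as the first argument or use an embedded modifier in the pattern; for example, `sub("(?i)b+", "x", "bbbb BBBB")` returns `'x x'`. (Python 2.7 and 3.1 later added an optional `flags` argument to re.sub().)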
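
The two ways of supplying a flag described above can be compared side by side. This is a minimal sketch; the variable names are illustrative only.

```python
import re

text = "bbbb BBBB"

# Option 1: embed the flag in the pattern itself.
inline = re.sub("(?i)b+", "x", text)

# Option 2: compile with the flag, then call sub() on the compiled object.
compiled = re.compile("b+", re.IGNORECASE).sub("x", text)

# Python 2.7+ and 3.1+ also accept a flags keyword on the module-level function.
keyword = re.sub("b+", "x", text, flags=re.IGNORECASE)

print(inline, compiled, keyword)
```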

=== 阿暖 ===
{{{
阿暖 <anuan2008@gmail.com>
sender-time	Sent at 16:40 (GMT+08:00).
reply-to	python-cn@googlegroups.com
to	python-cn@googlegroups.com
date	Thu, May 13, 2010 at 16:40
}}}
	
I have seen this kind of approach used in ply:

{{{#!python
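import re

# Map each matched text to its token name.
token={'123423':"AB",
    '654623':"CD",
    '675647':"EF",
    }

s='.**123423.**'
# search() scans the whole string; match() would only try position 0 and fail here.
m=re.search(r'(123423)|(654623)|(675647)',s)
handle=token[m.group()] if m else None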
}}}
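
One way to keep the table and the pattern from drifting apart is to build the alternation from the dictionary keys. The sketch below is an assumption about how that could look, not something taken from ply; `handle_for` is a hypothetical helper, and the table is the same as above with every value written as a string.

```python
import re

# Same table as above, with every value written as a string.
token = {"123423": "AB", "654623": "CD", "675647": "EF"}

# Build the alternation from the keys; re.escape guards any special characters.
pattern = re.compile("|".join(re.escape(key) for key in token))

def handle_for(text):
    """Return the token name for the first key found in text, else None."""
    m = pattern.search(text)
    return token[m.group()] if m else None

print(handle_for(".**123423.**"))
```

If some keys were prefixes of others, sorting them longest first before joining would be needed, since alternation takes the first alternative that matches.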

== It really works ;-) ==
Tested with Kodos:
{{attachment:zoomq-2010-05-13-181627_365x663_scrot.png}}

The automatically generated sample code:
{{{
#!python

import re

# common variables

rawstr = r"""(?P<AB>123423)|(?P<CD>654623)|(?P<EF>675647)"""
embedded_rawstr = r"""(?P<AB>123423)|(?P<CD>654623)|(?P<EF>675647)"""
matchstr = """675645 123523 675647 675648 675649 99999999"""

# method 1: using a compile object
compile_obj = re.compile(rawstr)
match_obj = compile_obj.search(matchstr)

# method 2: using search function (w/ external flags)
match_obj = re.search(rawstr, matchstr)

# method 3: using search function (w/ embedded flags)
match_obj = re.search(embedded_rawstr, matchstr)

# Retrieve group(s) from match_obj
all_groups = match_obj.groups()

# Retrieve group(s) by index
group_1 = match_obj.group(1)
group_2 = match_obj.group(2)
group_3 = match_obj.group(3)

# Retrieve group(s) by name
AB = match_obj.group('AB')
CD = match_obj.group('CD')
EF = match_obj.group('EF')
}}}

=== Without token ===
{{{
#!python
import re

# common variables
rawstr = r"""(?P<AB>123423)|(?P<CD>654623)|(?P<EF>675647)"""
matchstr = """675645 123523 675647 675648 675649 99999999"""

# method 1: using a compile object
compile_obj = re.compile(rawstr)
match_obj = compile_obj.search(matchstr)
# groupdict() maps every group name to its text, or None if it did not match
print(match_obj.groupdict())
# keep only the names of the groups that actually matched
print([k for (k, v) in match_obj.groupdict().items() if v is not None])

}}}
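
When exactly one named alternative can match, the match object's `lastgroup` attribute already carries that name, so the filtering pass over `groupdict()` may not be needed. A minimal sketch, reusing the pattern and test string from above (`name` is an illustrative variable):

```python
import re

rawstr = r"(?P<AB>123423)|(?P<CD>654623)|(?P<EF>675647)"
matchstr = "675645 123523 675647 675648 675649 99999999"

match_obj = re.search(rawstr, matchstr)
# lastgroup is the name of the last matched group, or None if it has no name.
name = match_obj.lastgroup if match_obj else None
print(name)
```

This relies on the groups being top-level alternatives; with nested groups, `lastgroup` reports the outermost group that closed last, which may not be the one wanted.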


##endInc

----
'''Feedback'''

Created by -- ZoomQuiet [<<DateTime(2010-05-13T15:37:16+0800)>>]