##language:en
'''This article is from the "Python Cookbook". The translation was made purely for personal study and has no connection to any commercial copyright dispute.'''

= Converting Between Different Naming Conventions =
Credit: Sami Hangaslammi

== Problem ==
You have a body of code whose identifiers use one of the common naming conventions to represent multiple words in a single identifier (CapitalizedWords, mixedCase, or under_scores), and you need to convert the code to another naming convention in order to merge it smoothly with other code.

== Solution ==
re.sub covers the two hard cases, converting underscore to and from the others:

{{{
#!python
import re

def cw2us(x):  # capwords to underscore notation
    # Put "_" before a capital that follows a lowercase letter, or before a
    # non-initial capital that is followed by a lowercase letter; then lowercase.
    return re.sub(r'(?<=[a-z])[A-Z]|(?<!^)[A-Z](?=[a-z])', r"_\g<0>", x).lower()

def us2mc(x):  # underscore to mixed-case notation
    # Drop each "_" and uppercase the lowercase letter that follows it.
    return re.sub(r'_([a-z])', lambda m: m.group(1).upper(), x)
}}}

Mixed-case to underscore is just like capwords to underscore (the case-lowering of the first character becomes redundant, but it does no harm):

{{{
#!python
def mc2us(x):  # mixed-case to underscore notation
    return cw2us(x)
}}}

Underscore to capwords can similarly exploit the underscore to mixed-case conversion, but it needs an extra twist to uppercase the start:

{{{
#!python
def us2cw(x):  # underscore to capwords notation
    s = us2mc(x)
    return s[0].upper() + s[1:]
}}}

Conversion between mixed-case and capwords is, of course, just an issue of lowercasing or uppercasing the first character, as appropriate:

{{{
#!python
def mc2cw(x):  # mixed-case to capwords: uppercase the first character
    return x[0].upper() + x[1:]

def cw2mc(x):  # capwords to mixed-case: lowercase the first character
    return x[0].lower() + x[1:]
}}}

== Discussion ==
Here are some usage examples:

{{{
>>> cw2us("PrintHTML")
'print_html'
>>> cw2us("IOError")
'io_error'
>>> cw2us("SetXYPosition")
'set_xy_position'
>>> cw2us("GetX")
'get_x'
}}}

The set of functions in this recipe is useful, and very practical, if you need to homogenize naming styles in a bunch of code, but the approach may be a bit obscure. In the interest of clarity, you might want to adopt a conceptual stance that is general and fruitful. In other words, to convert a bunch of formats into each other, find a neutral format and write conversions from each of the N formats into the neutral one and back again. This means having 2N conversion functions rather than N x (N-1), a big win for large N, but the point here (in which N is only three) is really one of clarity.

Clearly, the underlying neutral format that each identifier style is encoding is a list of words. Let's say, for definiteness and without loss of generality, that they are lowercase words:

{{{
#!python
import re

def anytolw(x):  # any format of identifier to list of lowercased words
    # First, see if there are underscores:
    lw = x.split('_')
    if len(lw) > 1:
        return [w.lower() for w in lw]
    # No. Then uppercase letters are the splitters:
    pieces = re.split('([A-Z])', x)
    # Ensure first word follows the same rules as the others:
    if pieces[0]:
        pieces = [''] + pieces
    else:
        pieces = pieces[1:]
    # Join two by two, lowercasing the splitters as you go
    return [pieces[i].lower() + pieces[i+1] for i in range(0, len(pieces), 2)]
}}}

There's no need to specify the format, since it's self-describing. Conversely, when translating from our internal form to an output format, we do need to specify the format we want, but on the other hand, the functions are very simple:

{{{
#!python
def lwtous(x): return '_'.join(x)
def lwtocw(x): return ''.join(map(str.capitalize, x))
def lwtomc(x): return x[0] + ''.join(map(str.capitalize, x[1:]))
}}}

Any other combination is a simple issue of functional composition:

{{{
#!python
def anytous(x): return lwtous(anytolw(x))
cwtous = mctous = anytous

def anytocw(x): return lwtocw(anytolw(x))
ustocw = mctocw = anytocw

def anytomc(x): return lwtomc(anytolw(x))
cwtomc = ustomc = anytomc
}}}

The specialized approach is slimmer and faster, but this generalized stance may ease understanding as well as offering wider application.

== See Also ==
The Library Reference sections on the re and string modules.
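
As a supplementary sketch of the neutral-format idea from the Discussion, the three output styles can be put behind a single entry point. The names `to_words`, `JOINERS`, and `convert` are hypothetical and not part of the recipe. The sketch assumes Python 3 and plain ASCII identifiers, and unlike the recipe's splitter it keeps runs of capitals such as XY together as one word.

```python
import re

def to_words(identifier):
    """Split an identifier in any of the three styles into lowercase words."""
    if '_' in identifier:
        # under_scores style: the underscores are the word boundaries.
        return [w.lower() for w in identifier.split('_') if w]
    # Otherwise mark a boundary before each capital that starts a new word,
    # keeping runs of capitals (acronyms) together, then split on the marks.
    marked = re.sub(r'(?<=[a-z])[A-Z]|(?<!^)[A-Z](?=[a-z])', r'_\g<0>', identifier)
    return marked.lower().split('_')

# One joiner per output style, keyed by a sample of that style (assumed names).
JOINERS = {
    'under_scores': lambda ws: '_'.join(ws),
    'CapitalizedWords': lambda ws: ''.join(w.capitalize() for w in ws),
    'mixedCase': lambda ws: ws[0] + ''.join(w.capitalize() for w in ws[1:]),
}

def convert(identifier, style):
    """Convert an identifier to the requested style via the word list."""
    return JOINERS[style](to_words(identifier))

if __name__ == '__main__':
    print(convert('SetXYPosition', 'under_scores'))      # set_xy_position
    print(convert('set_xy_position', 'CapitalizedWords'))  # SetXyPosition
    print(convert('io_error', 'mixedCase'))              # ioError
```

Because the neutral form is all lowercase, acronym capitalization is not preserved on a round trip: SetXYPosition comes back as SetXyPosition. Supporting a further style in this sketch would mean adding one entry to `JOINERS` and, if the style is not self-describing, one more branch in `to_words`.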