Problem
You need to check for the occurrence of any of a set of characters in a string.
Solution
The solution generalizes to any sequence (not just a string) and to any set (any object in which membership can be tested with the in operator, not just a set of characters):
{{{
#!python
def containsAny(str, set):
    """ Check whether sequence str contains ANY of the items in set. """
    return 1 in [c in str for c in set]

def containsAll(str, set):
    """ Check whether sequence str contains ALL of the items in set. """
    return 0 not in [c in str for c in set]
}}}
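On a current Python 3 interpreter, the same two checks can be sketched with the built-in any and all functions, which stop at the first decisive item instead of building a whole list. This is a hedged modernization rather than part of the original recipe; the names contains_any, contains_all, seq, and items are chosen here for illustration (they also avoid shadowing the built-in str and set).

```python
def contains_any(seq, items):
    """Return True if seq contains at least one of the items."""
    # any() stops consuming the generator at the first true test.
    return any(item in seq for item in items)


def contains_all(seq, items):
    """Return True if seq contains every one of the items."""
    # all() stops consuming the generator at the first false test.
    return all(item in seq for item in items)


# Same checks as the recipe's unit tests, written for Python 3.
assert contains_any('*.py', '*?[]')
assert not contains_any('file.txt', '*?[]')
assert contains_all('43221', '123')
assert not contains_all('134', '123')
```

Note that with an empty items argument, contains_any returns False and contains_all returns True, which matches the behavior of the list-based versions.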
Discussion
While the find and count string methods can check for substring occurrences, there is no ready-made function to check for the occurrence in a string of any of a set of characters.
While working on a condition to check whether a string contained the special characters used in the glob.glob standard library function, I came up with the above code (with help from the OpenProjects IRC channel #python). Written this way, it really is compatible with human thinking, even though you might not come up with such code intuitively. That is often the case with list comprehensions.
The following code creates a list of 1/0 values, one for each item in the set:
{{{
[c in str for c in set]
}}}
Then this code checks whether there is at least one true value in that list:
{{{
1 in [c in str for c in set]
}}}
Similarly, this checks that no false values are in the list:
{{{
0 not in [c in str for c in set]
}}}
Usage examples are best cast in the form of unit tests to be appended to the .py source file of this module, with the usual idiom to ensure that the tests execute if the module runs as a main script:
{{{
#!python
if __name__ == "__main__":
    # unit tests, must print "OK!" when run
    assert containsAny('*.py', '*?[]')
    assert not containsAny('file.txt', '*?[]')
    assert containsAll('43221', '123')
    assert not containsAll('134', '123')
    print "OK!"
}}}
Of course, while the previous idioms are neat, there are alternatives (aren't there always?). Here are the most elementary (and thus, in a sense, the most Pythonic) alternatives:
{{{
#!python
def containsAny(str, set):
    for c in set:
        if c in str: return 1
    return 0

def containsAll(str, set):
    for c in set:
        if c not in str: return 0
    return 1
}}}
These explicit loops also ensure minimal looping, since each returns as early as possible. The following alternatives are the most concise and thus, in a sense, the most powerful, although map and reduce always process every item of set:
{{{
#!python
from operator import and_, or_, contains

def containsAny(str, set):
    return reduce(or_, map(contains, len(set)*[str], set))

def containsAll(str, set):
    return reduce(and_, map(contains, len(set)*[str], set))
}}}
Here are some even slimmer variants of the latter that rely on a special method that string objects supply only in Python 2.2 and later:
{{{
#!python
from operator import and_, or_

def containsAny(str, set):
    return reduce(or_, map(str.__contains__, set))

def containsAll(str, set):
    return reduce(and_, map(str.__contains__, set))
}}}
And here is a tricky variant that relies on functionality also available in 2.0:
{{{
#!python
def containsAll(str, set):
    # str.index raises ValueError for any item missing from str;
    # this relies on Python 2's map, which calls it eagerly for every item.
    try: map(str.index, set)
    except ValueError: return 0
    else: return 1
}}}
Fortunately, this rather tricky approach lacks an immediately obvious variant applicable to implement containsAny. However, one last tricky scheme, based on string.translate's ability to delete all characters in a set, does apply to both functions:
{{{
#!python
import string
notrans = string.maketrans('', '')  # identity "translation"

def containsAny(str, set):
    return len(set) != len(set.translate(notrans, str))

def containsAll(str, set):
    return 0 == len(set.translate(notrans, str))
}}}
This trick at least has some depth: it relies on set.translate(notrans, str) being the subsequence of set that is made of characters not in str. If that subsequence has the same length as set, no characters have been removed by set.translate, so no characters of set are in str. Conversely, if that subsequence has length 0, all characters have been removed, so all characters of set are in str. The translate method of string objects keeps coming up naturally when one wants to treat strings as sets of characters, partly because it's so speedy and partly because it's so handy and flexible. See Recipe 3.8 for another similar application.
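In Python 3, string.maketrans no longer exists and str.translate accepts only a table argument, so the deletion trick has to be spelled differently. The three-argument form of str.maketrans builds a table that maps every character of its third argument to None, and translate then drops those characters. A minimal sketch under that assumption follows; the function and parameter names (contains_any, contains_all, text, chars) are illustrative and not taken from the recipe.

```python
def contains_any(text, chars):
    """True if at least one character of chars occurs in text."""
    # Delete from chars every character that also occurs in text.
    remaining = chars.translate(str.maketrans('', '', text))
    # Something was deleted, so some character of chars is in text.
    return len(remaining) != len(chars)


def contains_all(text, chars):
    """True if every character of chars occurs in text."""
    remaining = chars.translate(str.maketrans('', '', text))
    # Nothing survived the deletion, so all of chars is in text.
    return len(remaining) == 0


assert contains_any('*.py', '*?[]')
assert not contains_any('file.txt', '*?[]')
assert contains_all('43221', '123')
assert not contains_all('134', '123')
```

As with the original, this variant works only when both arguments are strings.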
One last observation is that these different ways to approach the task have very different levels of generality. At one extreme, the earliest approaches, relying only on in (for looping on set and for membership in str), are the most general; they are not at all limited to string processing, and they make truly minimal demands on the representations of str and set. At the other extreme, the last approach, relying on the translate method, works only when both str and set are strings or closely mimic string objects' functionality.
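To illustrate the general end of that spectrum, the in-based approach can be exercised on containers that are not strings at all. The sketch below repeats the explicit loops under hypothetical names (contains_any, contains_all) in Python 3 syntax and applies them to a list and a dict; the sample data is made up purely for illustration.

```python
def contains_any(seq, items):
    """Return True as soon as one of the items is found in seq."""
    for item in items:
        if item in seq:
            return True
    return False


def contains_all(seq, items):
    """Return False as soon as one of the items is missing from seq."""
    for item in items:
        if item not in seq:
            return False
    return True


# A list of integers checked against a tuple of integers.
ports = [22, 80, 443]
assert contains_any(ports, (80, 8080))
assert not contains_all(ports, (80, 8080))

# For a dict, the in operator tests the keys.
config = {'host': 'localhost', 'port': 5432}
assert contains_all(config, ['host', 'port'])
assert not contains_any(config, ['user', 'password'])
```

The only requirements are that items can be iterated over and that seq supports the in operator.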
See Also
Recipe 3.8; documentation for the translate and maketrans functions in the string module in the Library Reference.
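This article is from the Python Cookbook. The translation is solely for personal study and has nothing to do with any commercial copyright disputes!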
-- 0.706 [DateTime(2004-08-15T17:02:47Z)]