Regex Patterns
This module contains a modest but growing collection of regular expressions for extracting things like URLs, monetary values, and more.
Data:
URL_REGEX | A compiled regular expression for extracting (probably) valid URLs.
DOMAIN_REGEX | A compiled regular expression for extracting domains from URLs.
HTTP_REGEX | A compiled regular expression for finding HTTP/S prefixes.
US_DOLLAR_REGEX | A compiled regular expression for finding USD monetary amounts.
TITLEWORD_REGEX | A compiled regular expression for finding basic title-cased words.
NUMBER_REGEX | A compiled regular expression for finding raw numbers.
NONALPHA_REGEX | A compiled regular expression for finding non-alphanumeric characters.
- URL_REGEX = re.compile('((?:https?:\\/\\/(?:www\\.)?)?[-a-zA-Z0-9@:%._\\+~#=]{1,4096}\\.[a-z]{2,6}\\b(?:[-a-zA-Z0-9@:%_\\+.~#?&//=]*))')
A compiled regular expression for extracting (probably) valid URLs.
- DOMAIN_REGEX = re.compile('(?:http[s]?\\:\\/\\/)?(?:www(?:s?)\\.)?([\\w\\.\\-]+)(?:[\\\\\\/](?:.+))?')
A compiled regular expression for extracting domains from URLs. It can be useful in a pinch, but we recommend using pewtils.http.extract_domain_from_url() instead.
- HTTP_REGEX = re.compile('^http(?:s)?\\:\\/\\/')
A compiled regular expression for finding HTTP/S prefixes.
- US_DOLLAR_REGEX = re.compile('(\\$(?:[1-9][0-9]{0,2}(?:(?:\\,[0-9]{3})+)?(?:\\.[0-9]{1,2})?))\\b')
A compiled regular expression for finding USD monetary amounts.
- TITLEWORD_REGEX = re.compile('\\b([A-Z][a-z]+)\\b')
A compiled regular expression for finding basic title-cased words.
- NUMBER_REGEX = re.compile('\\b([0-9]+)\\b')
A compiled regular expression for finding raw numbers.
- NONALPHA_REGEX = re.compile('[^\\w]')
A compiled regular expression for finding non-alphanumeric characters, that is, anything not matched by \w (letters, digits, and underscores).
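
The snippet below is a minimal sketch of how these patterns behave, using only the standard `re` module. The patterns are copied from the entries above so that the example runs on its own; in practice you would import them from this module instead. The sample strings and the result variable names are invented for illustration.

```python
import re

# Patterns copied from the entries above (written as raw strings).
# In real code, import them from this module rather than redefining them.
URL_REGEX = re.compile(
    r"((?:https?:\/\/(?:www\.)?)?[-a-zA-Z0-9@:%._\+~#=]{1,4096}\.[a-z]{2,6}\b"
    r"(?:[-a-zA-Z0-9@:%_\+.~#?&//=]*))"
)
DOMAIN_REGEX = re.compile(
    r"(?:http[s]?\:\/\/)?(?:www(?:s?)\.)?([\w\.\-]+)(?:[\\\/](?:.+))?"
)
HTTP_REGEX = re.compile(r"^http(?:s)?\:\/\/")
US_DOLLAR_REGEX = re.compile(
    r"(\$(?:[1-9][0-9]{0,2}(?:(?:\,[0-9]{3})+)?(?:\.[0-9]{1,2})?))\b"
)
TITLEWORD_REGEX = re.compile(r"\b([A-Z][a-z]+)\b")
NUMBER_REGEX = re.compile(r"\b([0-9]+)\b")
NONALPHA_REGEX = re.compile(r"[^\w]")

# Hypothetical sample inputs.
url_text = "See https://www.pewresearch.org/about for details"
url = "https://www.pewresearch.org/about"

# Pull URL-like substrings out of free text.
urls = URL_REGEX.findall(url_text)

# Capture the domain portion of a single URL (group 1).
domain = DOMAIN_REGEX.match(url).group(1)

# Strip a leading http:// or https:// prefix.
without_scheme = HTTP_REGEX.sub("", url)

# Find dollar amounts, with optional thousands separators and cents.
amounts = US_DOLLAR_REGEX.findall("Tickets cost $1,250.50 or $20 at the door")

# Find words with one capital letter followed by lowercase letters.
title_words = TITLEWORD_REGEX.findall("Pew Research Center is based in Washington")

# Find standalone runs of digits.
numbers = NUMBER_REGEX.findall("Surveyed 1500 adults in 2019")

# Remove everything that is not a letter, digit, or underscore.
cleaned = NONALPHA_REGEX.sub("", "snake_case-name!")

print(urls)            # ['https://www.pewresearch.org/about']
print(domain)          # pewresearch.org
print(without_scheme)  # www.pewresearch.org/about
print(amounts)         # ['$1,250.50', '$20']
print(title_words)     # ['Pew', 'Research', 'Center', 'Washington']
print(numbers)         # ['1500', '2019']
print(cleaned)         # snake_casename
```

Because every pattern is precompiled, each one exposes the usual compiled-pattern methods (`findall`, `match`, `search`, `sub`) directly. The last call also shows that NONALPHA_REGEX leaves underscores in place, since `\w` treats them as word characters.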