What is Regex?
Regular expressions (regex, regexp) are used for pattern matching, text manipulation and parsing data. They are available in many modern programming languages such as Perl, PHP, Python, Ruby and JavaScript, as well as in countless utilities such as grep and other software programs. Regex comes in slightly different ‘flavours’ depending on the programming language or application. We will be learning the PCRE flavour, which stands for Perl Compatible Regular Expressions. The Perl programming language is credited with popularizing regex, and PHP, for example, uses PCRE.
Regex Software:
These tools are referred to as regex visualizers. They help you craft a regex by showing visually what each part of the pattern matches. Some great software/web apps to start learning with are listed below:
- RegExRX ~ http://bit.ly/RegExRX ~ Mac OS X (App Store).
- Regex101 ~ http://regex101.com ~ Great online utility.
Terminology:
This tutorial series will cover the regex terminology below.
Atoms
An atom is the smallest unit of a pattern that can match on its own, such as a literal character, a character class or a group.
Literal Matches
Matches the characters exactly as written, ‘literally’. For example, the pattern cat matches only the text cat.
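A literal match can be sketched in a few lines. This example assumes Python's built-in re module rather than PCRE (for a plain literal like this the two should behave the same), and the sample text is made up for illustration.

```python
import re

# A literal pattern matches the same characters in the same order.
text = "The cat sat on the mat."   # made-up sample text
match = re.search(r"cat", text)

literal_found = match.group(0)     # the text that matched
literal_start = match.start()      # index where the match begins
print(literal_found, literal_start)
```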
Character Classes
Matches a single character from the set listed inside square brackets. For example, [abc] matches a, b or c.
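As a rough illustration of a character class, the sketch below uses Python's re module (an assumption, since this series targets PCRE; simple classes like this work alike in both). The sample words are invented.

```python
import re

# [bcr] matches exactly one character: b, c or r.
text = "bat cat rat hat"           # made-up sample text
class_matches = re.findall(r"[bcr]at", text)
print(class_matches)               # "hat" is skipped: h is not in the class
```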
Quantifiers
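So far we have only seen literal matches and individual character matches. Regex can also repeat atoms: a quantifier placed after an atom sets how many times that atom may match, for example * (zero or more), + (one or more) and ? (zero or one).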
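The three most common quantifiers, *, + and ?, can be compared side by side. This is a minimal sketch using Python's re module as a stand-in for PCRE; the variable names and sample text are illustrative only.

```python
import re

text = "ac abc abbc abbbc"         # made-up sample text

# Each quantifier applies to the atom directly before it (here: b).
star = re.findall(r"ab*c", text)       # * = zero or more b
plus = re.findall(r"ab+c", text)       # + = one or more b
optional = re.findall(r"ab?c", text)   # ? = zero or one b

print(star)
print(plus)
print(optional)
```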
Anchors
Anchors match a position rather than a character: ^ matches the start of the string and $ matches the end.
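To see anchors in action, here is a small sketch in Python's re module (assumed here in place of PCRE; ^ and $ behave the same way for a single-line string like this one).

```python
import re

text = "Hello world"               # made-up sample text

starts_with_hello = bool(re.search(r"^Hello", text))  # Hello is at the start
starts_with_world = bool(re.search(r"^world", text))  # world is not at the start
ends_with_world = bool(re.search(r"world$", text))    # world is at the end

print(starts_with_hello, starts_with_world, ends_with_world)
```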
Boundaries
\b is a zero-width assertion, meaning it doesn’t appear in the resulting match but affects the outcome. \b matches the position where a ‘word’ character (a letter, digit or underscore) meets a non-word character (for example whitespace) or the start or end of the string.
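The effect of \b is easiest to see by running the same pattern with and without it. The sketch below assumes Python's re module instead of PCRE, and the sample text is invented.

```python
import re

text = "cat concatenate cat's bobcat"   # made-up sample text

# Without boundaries, cat is found inside longer words too.
anywhere = [m.start() for m in re.finditer(r"cat", text)]

# With \b on both sides, only cat as a whole word matches.
# The apostrophe in cat's is a non-word character, so it counts as a boundary.
whole_word = [m.start() for m in re.finditer(r"\bcat\b", text)]

print(len(anywhere), whole_word)
```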
Alternation
Tries to match the alternative to the left of |. If that fails, it tries the next alternative to the right.
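Because alternatives are tried from left to right, their order can change the result. The following sketch uses Python's re module (an assumption; PCRE tries alternatives in the same order) with made-up sample text.

```python
import re

text = "I have a cat, a dog and a bird."   # made-up sample text
pets = re.findall(r"cat|dog|fish", text)

# The first alternative that succeeds wins, even if a later one is longer.
short_first = re.search(r"Jan|January", "January").group(0)
long_first = re.search(r"January|Jan", "January").group(0)

print(pets, short_first, long_first)
```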
Iteration
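Like a quantifier, but it matches a specific number of times, written in curly braces: {3} means exactly three times, {2,4} between two and four times, and {2,} two or more times.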
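Curly-brace repetition counts can be tried out as follows. This sketch assumes Python's re module in place of PCRE, and it borrows \d (digit) and \b (word boundary) from the other terms on this list so that only whole numbers are matched.

```python
import re

text = "7 42 123 2024 55555"       # made-up sample text

exactly_four = re.findall(r"\b\d{4}\b", text)     # {4}   = exactly 4 digits
two_to_three = re.findall(r"\b\d{2,3}\b", text)   # {2,3} = 2 to 3 digits
three_or_more = re.findall(r"\b\d{3,}\b", text)   # {3,}  = 3 or more digits

print(exactly_four, two_to_three, three_or_more)
```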
Metacharacters
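Common character classes have shortcut equivalents, called metacharacters here (you may also see them called shorthand character classes): \d for a digit, \w for a word character and \s for whitespace.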
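The shortcuts \d, \w and \s can be tested like any other atom. This is a sketch in Python's re module rather than PCRE. Note that Python's \d also accepts non-ASCII digits by default, which makes no difference for the ASCII sample text invented here.

```python
import re

text = "Order 66 shipped_on 2024-01-05"   # made-up sample text

digits = re.findall(r"\d+", text)    # \d = a digit
words = re.findall(r"\w+", text)     # \w = letter, digit or underscore
spaces = re.findall(r"\s", text)     # \s = whitespace

# For this ASCII text, \d gives the same result as the class [0-9].
same_as_class = re.findall(r"[0-9]+", text) == digits

print(digits, words, len(spaces), same_as_class)
```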
Capture Groups vs non capture groups
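Capture groups save the text matched inside parentheses to memory so it can be referred to later. Groups are numbered from 1 in the order their opening parentheses appear, and many tools expose at least the first nine as \1 to \9 or $1 to $9. Non-capture groups, written (?:...), group atoms together without saving anything.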
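The difference between capturing and non-capturing groups shows up in what the match object remembers. The sketch below assumes Python's re module (the (...) and (?:...) syntax is shared with PCRE), and the sample string is invented.

```python
import re

text = "Released 2024-01"          # made-up sample text

# Two capture groups: numbered 1 and 2, left to right.
both = re.search(r"(\d{4})-(\d{2})", text)
year = both.group(1)
month = both.group(2)

# (?:...) still groups, but is not saved, so the month becomes group 1.
month_only = re.search(r"(?:\d{4})-(\d{2})", text)

print(year, month, both.groups(), month_only.groups())
```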
Lookarounds
Positive and negative lookarounds are zero-width assertions. This means they affect the outcome of the match but are not included in the results (like word boundaries and anchors).
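A few lookarounds side by side may make the ‘zero width’ idea more concrete. This sketch uses Python's re module as a stand-in for PCRE; the lookahead (?=...), lookbehind (?<=...) and negative lookbehind (?<!...) syntax shown is common to both. The sample text is made up.

```python
import re

text = "price: $100, cost: 200, fee: $30"   # made-up sample text

# Positive lookahead: a word that is followed by a colon (colon not included).
labels = re.findall(r"\w+(?=:)", text)

# Positive lookbehind: digits that are preceded by a dollar sign.
dollar_amounts = re.findall(r"(?<=\$)\d+", text)

# Negative lookbehind: whole numbers that are NOT preceded by a dollar sign.
# The \b stops the pattern from matching the tail end of 100 or 30.
plain_amounts = re.findall(r"\b(?<!\$)\d+", text)

print(labels, dollar_amounts, plain_amounts)
```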
Modifiers
Modifiers change the behavior of the regex pattern. There are two ways to invoke them: inline within the pattern, for example (?i), or at the end of the pattern, for example /pattern/i (this varies by programming language/application).
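As a rough example, the case-insensitive modifier can be applied in two ways in Python's re module, which is assumed here instead of PCRE. Python has no /pattern/i delimiter form, so the non-inline version passes a flags argument instead. The sample text is invented.

```python
import re

text = "Regex REGEX regex"         # made-up sample text

case_sensitive = re.findall(r"regex", text)                    # no modifier
inline = re.findall(r"(?i)regex", text)                        # inline modifier
as_flag = re.findall(r"regex", text, flags=re.IGNORECASE)      # flag argument

print(case_sensitive, inline, as_flag)
```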
If you have any questions or need help with anything related to regex, I would love to assist! Post over at https://geekalicious.club/forum/viewforum.php?f=7 for all your regex needs.