Alternation is the regular-expression term for a simple “OR”.
In a regular expression it is denoted with the vertical line character |.
The corresponding regexp:
A usage example:
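As a minimal sketch in Python (the pattern and the sample text are illustrative assumptions, not taken from this chapter), alternation between whole words can be tried like this:

```python
import re

# Hypothetical pattern and sample text, chosen only for illustration.
pattern = re.compile(r"cat|dog|bird")
text = "A dog chased a cat while a bird watched."

# findall returns every non-overlapping match, scanning left to right.
matches = pattern.findall(text)
print(matches)  # ['dog', 'cat', 'bird']
```

Each alternative is tried at every position, so the matches come back in the order they occur in the text, not in the order the alternatives are written.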
We have already seen a similar thing: square brackets. They allow choosing between multiple characters, for instance gr[ae]y matches gray or grey.
Square brackets allow only characters or character sets. Alternation allows any expressions. A regexp A|B|C means one of the expressions A, B or C. For instance, gr(a|e)y means exactly the same as gr[ae]y.
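The equivalence of gr(a|e)y and the square-bracket form gr[ae]y can be checked with a short Python sketch. The word list is an assumption made up for the test:

```python
import re

# Hypothetical test words: two valid spellings and three near misses.
words = ["gray", "grey", "griy", "gry", "graey"]

# fullmatch succeeds only if the whole word fits the pattern.
with_brackets = [w for w in words if re.fullmatch(r"gr[ae]y", w)]
with_alternation = [w for w in words if re.fullmatch(r"gr(a|e)y", w)]

print(with_brackets)     # ['gray', 'grey']
print(with_alternation)  # ['gray', 'grey']
```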
To limit alternation to a part of the pattern, we usually enclose that part in parentheses, as in gr(a|e)y.
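Why the parentheses matter can be seen by dropping them. In the following Python sketch (the sample text is an illustrative assumption), the ungrouped pattern gra|ey is read as "gra" OR "ey", while gr(a|e)y restricts the choice to the single letter in the middle:

```python
import re

# Hypothetical sample text containing whole words and fragments.
text = "gray grey gra ey"

# Without parentheses the alternation splits the entire pattern in two.
ungrouped = [m.group() for m in re.finditer(r"gra|ey", text)]

# With parentheses the alternation covers only "a" or "e".
grouped = [m.group() for m in re.finditer(r"gr(a|e)y", text)]

print(ungrouped)  # ['gra', 'ey', 'gra', 'ey']
print(grouped)    # ['gray', 'grey']
```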
In previous chapters there was a task to build a regexp for searching time in the form hh:mm, for instance 12:00. But a simple \d\d:\d\d is too vague. It accepts 25:99 as the time (25 hours and 99 minutes both fit the pattern, but that time is invalid).
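A quick Python check shows how permissive the simple pattern is. This is only a sketch, and the sample strings are assumptions:

```python
import re

naive = re.compile(r"\d\d:\d\d")

# Hypothetical samples: one valid time, one invalid time, one wrong format.
samples = ["12:00", "25:99", "7:30"]

# Any two digits, a colon and two more digits are accepted.
accepted = [s for s in samples if naive.fullmatch(s)]
print(accepted)  # ['12:00', '25:99']
```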
How can we make a better one?
We can apply more careful matching. First, the hours:
- If the first digit is 0 or 1, then the next digit can be anything.
- Or, if the first digit is 2, then the next must be [0-3].
As a regexp, the hours are: [01]\d|2[0-3]
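The two rules for the hours can be verified exhaustively. The Python sketch below assumes the hours pattern [01]\d|2[0-3] and tries every two-digit string from 00 to 99:

```python
import re

# First digit 0 or 1 followed by any digit, OR 2 followed by 0-3.
hours = re.compile(r"[01]\d|2[0-3]")

# Every two-digit string from "00" to "99".
candidates = [f"{n:02d}" for n in range(100)]
valid_hours = [c for c in candidates if hours.fullmatch(c)]

print(valid_hours[0], valid_hours[-1], len(valid_hours))  # 00 23 24
```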
Next, the minutes must be from 00 to 59. In the regexp language that means [0-5]\d: the first digit is 0-5, and then any digit.
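The minutes pattern can be spot-checked at its boundaries. In this Python sketch the sample values are assumptions:

```python
import re

minutes = re.compile(r"[0-5]\d")

# Hypothetical boundary values around the valid range 00-59.
samples = ["00", "09", "59", "60", "99", "5"]
valid_minutes = [s for s in samples if minutes.fullmatch(s)]

print(valid_minutes)  # ['00', '09', '59']
```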
Let’s glue them together into the pattern: [01]\d|2[0-3]:[0-5]\d
We’re almost done, but there’s a problem. The alternation | now happens to be between [01]\d and 2[0-3]:[0-5]\d. That’s wrong, as it should be applied only to the hours: [01]\d OR 2[0-3]. That’s a common mistake when starting to work with regular expressions.
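The effect of the misplaced alternation is easy to reproduce. In this Python sketch (sample strings are assumptions), the naively glued pattern [01]\d|2[0-3]:[0-5]\d accepts a bare "12" and rejects "12:30", because its left branch contains no minutes at all:

```python
import re

# Alternation splits the whole pattern: [01]\d  OR  2[0-3]:[0-5]\d
wrong = re.compile(r"[01]\d|2[0-3]:[0-5]\d")

accepts_bare_hours = wrong.fullmatch("12") is not None      # left branch matches
accepts_early_time = wrong.fullmatch("12:30") is not None   # neither branch fits
accepts_late_time = wrong.fullmatch("23:59") is not None    # right branch matches

print(accepts_bare_hours, accepts_early_time, accepts_late_time)  # True False True
```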
The correct variant encloses the hours in parentheses: ([01]\d|2[0-3]):[0-5]\d
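To check the pattern with the hours alternation wrapped in parentheses, here is a minimal Python sketch. The sample strings are illustrative assumptions:

```python
import re

# Parentheses confine the alternation to the hours part.
time_re = re.compile(r"([01]\d|2[0-3]):[0-5]\d")

# Hypothetical samples: three valid times followed by four invalid strings.
samples = ["00:00", "12:30", "23:59", "24:00", "25:99", "12:60", "12"]
valid_times = [s for s in samples if time_re.fullmatch(s)]

print(valid_times)  # ['00:00', '12:30', '23:59']
```

Here fullmatch is used so that each sample must be a time in its entirety. To find times inside a longer text, re.finditer with the same pattern would be the natural choice.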