June 10, 2019

Sometimes we need to match a pattern only if followed by another pattern. For instance, we’d like to get the price from a string like `1 turkey costs 30€`.

We need a number (let’s assume the price has no decimal point) followed by the `€` sign.

The syntax is `x(?=y)`. It means "look for `x`, but match only if followed by `y`".

For an integer amount followed by `€`, the regexp will be `\d+(?=€)`:

``````let str = "1 turkey costs 30€";

alert( str.match(/\d+(?=€)/) ); // 30 (correctly skipped the sole number 1)``````

Let’s say we want the quantity instead, that is, a number NOT followed by `€`.

Here a negative lookahead can be applied.

The syntax is `x(?!y)`. It means "search for `x`, but only if not followed by `y`".

``````let str = "2 turkeys cost 60€";

alert( str.match(/\d+(?!€)/) ); // 2 (correctly skipped the price)``````
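One caveat shows up once the search is global: `\d+(?!€)` can also match part of the price, because the engine may backtrack to `6`, which is followed by `0` rather than `€`. The sketch below illustrates this and one possible fix with a word boundary `\b`. The variable names are illustrative, and `console.log` is used instead of `alert` so the snippet also runs outside a browser.

```javascript
const text = "2 turkeys cost 60€";

// With the g flag the regexp backtracks inside "60":
// "6" is followed by "0", not "€", so it passes the lookahead
const naive = text.match(/\d+(?!€)/g);

// \b makes the number end at a word boundary before the lookahead is checked
const whole = text.match(/\d+\b(?!€)/g);

console.log(naive); // [ '2', '6' ]
console.log(whole); // [ '2' ]
```

Without the `g` flag only the first match `2` is returned, so the simpler pattern is enough for the single-match example.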

## Lookbehind

Lookbehind is similar, but it looks behind. That is, it allows matching a pattern only if something specific comes before it.

The syntax is:

• Positive lookbehind: `(?<=y)x`, matches `x`, but only if it comes right after `y`.
• Negative lookbehind: `(?<!y)x`, matches `x`, but only if there’s no `y` right before it.

For example, let’s change the price to US dollars. The dollar sign is usually before the number, so to look for `$30` we’ll use `(?<=\$)\d+` – an amount preceded by `$`:

``````let str = "1 turkey costs $30";

// the dollar sign is escaped as \$, because a bare $ is the end-of-string anchor
alert( str.match(/(?<=\$)\d+/) ); // 30 (skipped the sole number)``````

And, to find the quantity – a number not preceded by `$` – we can use a negative lookbehind `(?<!\$)\d+`:

``````let str = "2 turkeys cost $60";

alert( str.match(/(?<!\$)\d+/) ); // 2 (skipped the price)``````

## Capture groups

Generally, what’s inside the lookaround (a common name for both lookahead and lookbehind) parentheses does not become a part of the match.

E.g. in the pattern `\d+(?=€)`, the `€` sign doesn’t get captured as a part of the match. That’s natural: we look for a number `\d+`, while `(?=€)` is just a test that it should be followed by `€`.

But in some situations we might want to capture the lookaround expression as well, or a part of it. That’s possible: just wrap that part in additional parentheses.

For instance, here the currency `(€|kr)` is captured, along with the amount:

``````let str = "1 turkey costs 30€";
let reg = /\d+(?=(€|kr))/; // extra parentheses around €|kr

alert( str.match(reg) ); // 30, €``````

And here’s the same for lookbehind:

``````let str = "1 turkey costs $30";
let reg = /(?<=(\$|£))\d+/; // extra parentheses around \$|£

alert( str.match(reg) ); // 30, $``````

Please note that for lookbehind the order in the result stays the same, even though the lookbehind parentheses come before the main pattern.

The first element of the result is always the whole match, and the captured groups follow it. So the match for `\d+` goes in the result first, and then the one for `(\$|£)`.
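To see where each part lands in the result array, here is a small sketch that reads the elements by index. The variable names `ahead` and `behind` are illustrative, and `console.log` stands in for `alert`.

```javascript
// Lookahead: the capturing group sits after the main pattern
const ahead = "1 turkey costs 30€".match(/\d+(?=(€|kr))/);

// Lookbehind: the capturing group sits before the main pattern
const behind = "1 turkey costs $30".match(/(?<=(\$|£))\d+/);

// Index 0 holds the whole match, index 1 the first capturing group
console.log(ahead[0], ahead[1]);   // 30 €
console.log(behind[0], behind[1]); // 30 $
```

In both cases the currency sign is available as a group while the match itself stays just the number.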

## Summary

Lookahead and lookbehind (commonly referred to as “lookaround”) are useful when we’d like to match something only in a certain context, that is, depending on what comes before or after it.

For simple regexps we can do a similar thing manually: match everything, in any context, and then filter the matches by context in a loop.

Remember, `str.matchAll` and `reg.exec` return matches with an `.index` property, so we know exactly where in the text each match is and can check its context.
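The manual approach might look like the sketch below, which finds integers that are not followed by `€` without any lookahead. The helper name `quantities` is hypothetical, chosen only for this illustration.

```javascript
// Collect integers that are NOT immediately followed by "€",
// using a plain \d+ search and a manual context check
function quantities(text) {
  const result = [];
  for (const match of text.matchAll(/\d+/g)) {
    const end = match.index + match[0].length; // position right after the number
    if (text[end] !== "€") {
      result.push(match[0]);
    }
  }
  return result;
}

console.log(quantities("2 turkeys cost 60€")); // [ '2' ]
```

Because `\d+` is matched greedily first and the context is checked afterwards, the whole number `60` is rejected at once.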

But generally regular expressions are more convenient.

Lookaround types:

| Pattern | Type | Matches |
|---|---|---|
| `x(?=y)` | Positive lookahead | `x` if followed by `y` |
| `x(?!y)` | Negative lookahead | `x` if not followed by `y` |
| `(?<=y)x` | Positive lookbehind | `x` if after `y` |
| `(?<!y)x` | Negative lookbehind | `x` if not after `y` |

Lookahead can also be used to disable backtracking. The next chapter explains why that may be needed, along with other details.
