Other syntaxes
Conditional
You can do something similar to an if/then/else in a regular expression; the syntax is (?if then|else). The else part is optional and can be omitted together with the vertical bar. The then and else parts are subexpressions of any kind, and one of them is evaluated depending on the result of the if condition. The if part is a condition and, depending on the implementation, it can be:
Back-references: In /(<B>)?This text may be bold(?(1)</B>)/, the (?(1)…) checks whether (<B>)? has matched something and, if so, tries to match </B>.
Look-arounds: /(?(?<=a)b|c)/ means match b if preceded by a, else match c.
Host language code: Perl allows arbitrary Perl code to be used as a condition. For example, the following regular expression correctly matches nested double angle quotes:
/« (?{local $nest=0}) (?> (?: [^«»]+ | « (?{$nest++}) | » (?(?{$nest != 0}) (?{$nest--}) | (?!) ) )* ) (?(?{$nest!=0}) (?!)) »/x
For your convenience, I have used the x mode for a clearer presentation. This is also an occasion to introduce the special syntax (?!), which literally says "fail now". Finally, Perl offers another way to match nested constructs, called dynamic regex.
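The back-reference form of the conditional is easy to try outside Perl. Here is a minimal sketch in Python, whose re module supports only the group-existence condition (?(1)…) and neither look-around nor host-code conditions. The helper name matches_whole is my own, chosen for illustration.

```python
import re

# (<B>)? optionally captures an opening tag; (?(1)</B>) demands the
# closing tag only when group 1 took part in the match.
pattern = re.compile(r"(<B>)?This text may be bold(?(1)</B>)")

def matches_whole(text):
    """Hypothetical helper: True if the entire string matches the pattern."""
    return pattern.fullmatch(text) is not None

print(matches_whole("This text may be bold"))         # no tags at all
print(matches_whole("<B>This text may be bold</B>"))  # both tags
print(matches_whole("<B>This text may be bold"))      # closing tag missing
print(matches_whole("This text may be bold</B>"))     # opening tag missing
```

The first two calls succeed and the last two fail. I used fullmatch on purpose: an unanchored search would still find "This text may be bold" inside the third string by starting after the <B>, with group 1 left unset.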
Mode modifiers
Some implementations also allow a mode to be changed in the middle of a regex or applied to only part of it. (?modifier) and (?-modifier) may be used inside a regex to switch a mode on and off respectively (e.g. /SenSiTiVe Text(?i)insensitive text(?-i)Back To Sensitive Text/). You can also apply a modifier to a subexpression using (?modifier:…), as in /SenSiTiVe Text(?i:insensitive text)Back To Sensitive Text/.
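The scoped form (?modifier:…) can be checked with a short Python sketch. Python's re module accepts (?i:…) and (?-i:…) from version 3.6 on, but, as far as I know, it does not accept an (?i) switch in the middle of a pattern (since 3.11 it must sit at the very start), so only the scoped variant is shown. The names scoped, inverse and accepts are mine, for illustration only.

```python
import re

# Only the middle part is case-insensitive; the rest stays case-sensitive.
scoped = re.compile(r"SenSiTiVe Text(?i:insensitive text)Back To Sensitive Text")

# The opposite: the whole pattern ignores case except the scoped part.
inverse = re.compile(r"(?i)loose (?-i:Strict)")

def accepts(regex, text):
    """Hypothetical helper: True if the entire string matches the regex."""
    return regex.fullmatch(text) is not None

print(accepts(scoped, "SenSiTiVe TextINSENSITIVE textBack To Sensitive Text"))
print(accepts(scoped, "sensitive textinsensitive textBack To Sensitive Text"))
print(accepts(inverse, "LOOSE Strict"))
print(accepts(inverse, "loose strict"))
```

The first and third calls succeed. The second fails because the case-sensitive prefix differs, and the fourth fails because "Strict" sits inside the (?-i:…) group.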
See also...
- Regular expressions - A technical white paper about this powerful language
- Regular Expression Engines - NFA and DFA engines explained
- Syntaxes - A short reference to usual syntaxes
- Regular expressions backtracking and quantifiers greediness - Backtracking of NFA engines explained
- Backtracking in regards to correctness and efficiency - Examples related to backtracking
- Unrolling the loop - Optimize repeated alternation