Reference – What does this regex mean?

What is this?

This is a collection of common Q&A. This is also a Community Wiki, so everyone is invited to participate in maintaining it.

Why is this?

The regex tag suffers from "give me ze code" questions and poor answers with no explanation. This reference is meant to provide links to quality Q&A.

What’s the scope?

This reference is meant for the following languages: php, perl, javascript, python, ruby, java, .net.

This might be too broad, but these languages share much of the same syntax. Flavor-specific features are marked with the tag of the language they apply to, for example:

  • What are regular expression Balancing Groups? .net


The Stack Overflow Regular Expressions FAQ

See also a lot of general hints and useful links at the regex tag details page.


Online tutorials

  • RegexOne ↪
  • Regular Expressions Info ↪

Quantifiers

  • Zero-or-more: *: greedy, *?: reluctant, *+: possessive
  • One-or-more: +: greedy, +?: reluctant, ++: possessive
  • Zero-or-one: ?: greedy, ??: reluctant, ?+: possessive
  • Min/max ranges (all inclusive): {n,m}: between n & m, {n,}: n-or-more, {n}: exactly n
  • Differences between greedy, reluctant (a.k.a. “lazy”, “ungreedy”) and possessive quantifiers:
    • Greedy vs. Reluctant vs. Possessive Quantifiers
    • In-depth discussion on the differences between greedy versus non-greedy
    • What’s the difference between {n} and {n}? (that is, {n} followed by the lazy ?)
    • Can someone explain Possessive Quantifiers to me? php, perl, java, ruby
    • Emulating possessive quantifiers .net
    • Non-Stack Overflow references: From Oracle, regular-expressions.info
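
To make the greedy/reluctant distinction concrete, here is a minimal sketch using Python's re module. The sample strings are made up for illustration. Possessive quantifiers are left out because their availability depends on the flavor and version.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .* grabs as much as it can, then backtracks to the LAST '>'.
greedy = re.findall(r"<.*>", text)

# Reluctant: .*? grabs as little as it can, stopping at the FIRST '>'.
reluctant = re.findall(r"<.*?>", text)

# {n,m} is inclusive on both ends: runs of exactly 2 or 3 digits here.
ranged = re.findall(r"\b\d{2,3}\b", "7 42 123 4567")

print(greedy)     # ['<b>bold</b> and <i>italic</i>']
print(reluctant)  # ['<b>', '</b>', '<i>', '</i>']
print(ranged)     # ['42', '123']
```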

Character Classes

  • What is the difference between square brackets and parentheses?
  • [...]: any one character, [^...]: negated/any character but
  • [^] matches any one character including newlines javascript
  • [\w-[\d]] / [a-z-[qz]]: set subtraction .net, xml-schema, xpath, JGSoft
  • [\w&&[^\d]]: set intersection java, ruby 1.9+, javascript (with v flag)
  • [[:alpha:]]: POSIX character classes
  • [[:<:]] and [[:>:]]: word boundaries
  • Why do [^\D2], [^[^0-9]2], [^2[^0-9]] get different results in Java? java
  • Shorthand:
    • Digit: \d: digit, \D: non-digit
    • Word character (letter, digit, underscore): \w: word character, \W: non-word character
    • Whitespace: \s: whitespace, \S: non-whitespace
  • Unicode categories (\p{L}, \P{L}, etc.)
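
A short sketch of the basic class syntax in Python. The sample string is arbitrary. Python's re module has no set subtraction or && intersection, so the last line uses a lookahead as one possible workaround rather than the native syntax listed above.

```python
import re

sample = "a1_b2 c3!"

# [...] matches any ONE of the listed characters; \d keeps its meaning inside a class.
vowel_or_digit = re.findall(r"[aeiou\d]", sample)

# [^...] negates: anything that is neither a word character nor whitespace.
punctuation = re.findall(r"[^\w\s]", sample)

# \w covers letters, digits and underscore; \S is any non-whitespace character.
words = re.findall(r"\w+", sample)
chunks = re.findall(r"\S+", sample)

# Workaround for "word character that is not a digit" without && support.
non_digit_word_chars = re.findall(r"(?!\d)\w", sample)

print(vowel_or_digit)        # ['a', '1', '2', '3']
print(punctuation)           # ['!']
print(words)                 # ['a1_b2', 'c3']
print(chunks)                # ['a1_b2', 'c3!']
print(non_digit_word_chars)  # ['a', '_', 'b', 'c']
```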

Escape Sequences

  • Horizontal whitespace: \h: space-or-tab, \t: tab
  • Newlines:
    • \r, \n: carriage return and line feed
    • \R: generic newline php java-8
  • Negated whitespace sequences: \H: non-horizontal whitespace character, \V: non-vertical whitespace character, \N: non-line-feed character pcre php5 java-8
  • Other: \v: vertical tab, \e: the escape character
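
Python's built-in re module does not provide \h or \R, so the sketch below spells out rough equivalents by hand. These are approximations chosen for illustration: the newline alternation only covers the CR/LF cases, and the horizontal class only covers space and tab.

```python
import re

# Rough stand-in for \R: list \r\n first so a Windows line ending counts once.
lines = re.split(r"\r\n|\r|\n", "a\r\nb\nc\rd")

# Rough stand-in for \h+: one or more spaces or tabs.
fields = re.split(r"[ \t]+", "x \t y\tz")

print(lines)   # ['a', 'b', 'c', 'd']
print(fields)  # ['x', 'y', 'z']
```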

Anchors

Anchor | Matches               | Flavors
-------|-----------------------|------------------------------------
^      | Start of string       | Common*
^      | Start of line         | Common[m]
$      | End of line           | Common[m]
$      | End of text           | Common* except javascript
$      | Very end of string    | javascript*, php[D]
\A     | Start of string       | Common except javascript
\Z     | End of text           | Common except javascript, python
\Z     | Very end of string    | python
\z     | Very end of string    | Common except javascript, python
\b     | Word boundary         | Common
\B     | Not a word boundary   | Common
\G     | End of previous match | Common except javascript, python

Term                  | Definition
----------------------|------------------------------------------------------------
Start of string       | At the very start of the string.
Start of line         | At the very start of the string, and after a non-terminal line terminator.
Very end of string    | At the very end of the string.
End of text           | At the very end of the string, and at a terminal line terminator.
End of line           | At the very end of the string, and at a line terminator.
Word boundary         | At a word character not preceded by a word character, and at a non-word character not preceded by a non-word character.
End of previous match | At a previously set position, usually where a previous match ended. At the very start of the string if no position was set.

“Common” refers to the following: icu java javascript .net objective-c pcre perl php python swift ruby

*   Default
[m] Multi-line mode
[D] Dollar end only mode
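
The Python rows of the anchor table can be checked with a small sketch like the one below. The input string is made up, and other flavors behave differently as the table shows.

```python
import re

text = "first\nsecond\n"

# Default mode: ^ only matches at the very start of the string.
default_starts = re.findall(r"^\w+", text)

# Multi-line mode: ^ also matches after each non-terminal line terminator.
multiline_starts = re.findall(r"^\w+", text, re.MULTILINE)

# $ in default mode is "end of text": it also matches before a terminal newline.
dollar_matches = re.search(r"second$", text) is not None

# \Z in Python is "very end of string": the trailing newline prevents a match.
slash_z_matches = re.search(r"second\Z", text) is not None

# \b sits between a word character and a non-word character (or string edge).
whole_words = re.findall(r"\bcat\b", "cat concat cats")

print(default_starts)    # ['first']
print(multiline_starts)  # ['first', 'second']
print(dollar_matches)    # True
print(slash_z_matches)   # False
print(whole_words)       # ['cat']
```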

Groups

  • (...):capture group, (?:):non-capture group
    • How to capture multiple repeated groups?
    • Why is my repeating capturing group only capturing the last match?
  • \1: backreference and capture-group reference, $1: capture group reference
    • What’s the meaning of a number after a backslash in a regular expression?
    • \g<1>123: How to follow a numbered capture group, such as \1, with a number? python
  • What does a subpattern (?i:regex) mean?
  • What does the ‘P’ in (?P<group_name>regexp) mean?
  • (?>):atomic group or independent group, (?|):branch reset
    • Equivalent of branch reset in .NET/C# .net
  • Named capture groups:
    • General named capturing group reference at regular-expressions.info
    • java: (?<groupname>regex): Overview and naming rules (Non-Stack Overflow links)
    • Other languages: (?P<groupname>regex) python, (?<groupname>regex) .net, (?<groupname>regex) perl, (?P<groupname>regex) and (?<groupname>regex) php
  • (?<-foo>): balancing groups .net
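
A minimal Python sketch of several group features from this list: backreferences, named groups in the (?P<name>...) form, the \g<1> replacement syntax, and the "only the last repetition is captured" behavior. All sample inputs are invented for illustration.

```python
import re

# \1 refers back to whatever group 1 captured, so this finds a doubled word.
doubled = re.search(r"\b(\w+) \1\b", "it was was late").group(1)

# Named groups, Python syntax.
m = re.match(r"(?P<year>\d{4})-(?P<month>\d{2})", "2024-05-17")
date = (m.group("year"), m.group("month"))

# In a replacement, \g<1>0 means "group 1, then a literal 0";
# writing \10 would be read as group 10.
padded = re.sub(r"(\d)", r"\g<1>0", "a1b2")

# A repeated capture group keeps only its final iteration.
last_only = re.match(r"(?:(\d),?)+", "1,2,3").group(1)

print(doubled)    # was
print(date)       # ('2024', '05')
print(padded)     # a10b20
print(last_only)  # 3
```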

Lookarounds

  • Lookaheads: (?=...):positive, (?!...):negative
  • Lookbehinds: (?<=...):positive, (?<!...):negative
  • Lookbehind limits in:
    • Lookbehinds need to be constant-length php, perl, python, ruby
    • Lookarounds of limited length {0,n} java
    • Variable length lookbehinds are allowed .net
  • Lookbehind alternatives:
    • Using \K php, perl (Flavors that support \K)
    • Alternative regex module for Python python
    • The hacky way
    • JavaScript negative lookbehind equivalents (external link)
  • Tempered greedy token
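
The sketch below illustrates the four lookaround forms, the fixed-width lookbehind limit in Python's built-in re module, and a tempered greedy token. The price string and the HTML-like snippet are illustrative only.

```python
import re

prices = "cost: $40, weight: 40kg, $15"

# Positive lookbehind: digits that directly follow a dollar sign.
after_dollar = re.findall(r"(?<=\$)\d+", prices)

# Positive lookahead: digits that are directly followed by "kg".
before_kg = re.findall(r"\d+(?=kg)", prices)

# Negative lookbehind: whole numbers NOT preceded by a dollar sign.
not_after_dollar = re.findall(r"(?<!\$)\b\d+", prices)

# Python's re rejects variable-width lookbehinds at compile time.
try:
    re.compile(r"(?<=a+)b")
    variable_width_ok = True
except re.error:
    variable_width_ok = False

# Tempered greedy token: consume any character as long as a tag does not start here.
bold = re.findall(r"<b>((?:(?!</?b>).)*)</b>", "<b>one</b> and <b>two</b>")

print(after_dollar)       # ['40', '15']
print(before_kg)          # ['40']
print(not_after_dollar)   # ['40']
print(variable_width_ok)  # False
print(bold)               # ['one', 'two']
```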

Modifiers

Flag | Modifier            | Flavors
-----|---------------------|---------------------------------------
a    | ASCII               | python
c    | current position    | perl
e    | expression          | php perl
g    | global              | most
i    | case-insensitive    | most
m    | multiline           | php perl python javascript .net java
m    | (non)multiline      | ruby
o    | once                | perl ruby
r    | non-destructive     | perl
S    | study               | php
s    | single line         | ruby
U    | ungreedy            | php r
u    | unicode             | most
x    | whitespace-extended | most
y    | sticky ↪            | javascript

  • How to convert preg_replace e to preg_replace_callback?
  • What are inline modifiers?
  • What is ‘?-mix’ in a Ruby Regular Expression
  • What does the caret ^ in (?^:…) mean in the string form of a Perl qr// Regex?
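
In Python the flags are passed as constants or written inline, rather than appended after a closing delimiter. A small sketch with invented inputs:

```python
import re

# i: case-insensitive, as a flag argument and as an inline modifier.
case_insensitive = re.findall(r"hello", "Hello HELLO", re.IGNORECASE)
inline = re.findall(r"(?i)hello", "Hello HELLO")

# By default . does not match a newline; DOTALL (the "s" flag) changes that.
dot_default = re.search(r"a.b", "a\nb") is not None
dot_single_line = re.search(r"a.b", "a\nb", re.DOTALL) is not None

# x: whitespace and # comments inside the pattern are ignored.
year_month = re.compile(
    r"""
    (\d{4})   # year
    -         # literal separator
    (\d{2})   # month
    """,
    re.VERBOSE,
)
parts = year_month.match("2024-05").groups()

print(case_insensitive)  # ['Hello', 'HELLO']
print(inline)            # ['Hello', 'HELLO']
print(dot_default)       # False
print(dot_single_line)   # True
print(parts)             # ('2024', '05')
```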

Other:

  • |:alternation (OR) operator, .:any character, [.]:literal dot character
  • What special characters must be escaped?
  • Control verbs (php and perl): (*PRUNE), (*SKIP), (*FAIL) and (*F)
    • php only: (*BSR_ANYCRLF)
    • The (*SKIP)(*FAIL) idiom
  • Recursion (php and perl): (?R), (?0) and (?1), (?-1), (?&groupname)

Common Tasks

  • Get a string between two curly braces: {...}
  • Match (or replace) a pattern except in situations s1, s2, s3…
  • Find all YouTube video ids in a string
  • Ignore middle part of a match
  • Validation:
    • Internet: email addresses, URLs (host/port: regex and non-regex alternatives), passwords
    • Numeric: a number, min-max ranges (such as 1-31), phone numbers, date
    • Parsing HTML with regex: See “General Information > When not to use Regex”

Advanced Regex-Fu

  • Strings and numbers:
    • Regular expression to match a line that doesn’t contain a word
    • How does this PCRE pattern detect palindromes?
    • Match strings whose length is a fourth power
    • How does this regex find triangular numbers?
    • How to determine if a number is a prime with regex?
    • How to match the middle character in a string with regex?
  • Other:
    • How can we match a^n b^n?
    • Match nested brackets
      • Using a recursive pattern php, perl
      • Using balancing groups .net
    • “Vertical” regex matching in an ASCII “image”
    • List of highly up-voted regex questions on Code Golf
    • How to make two quantifiers repeat the same number of times?
    • An impossible-to-match regular expression: (?!a)a
    • Match/delete/replace this except in contexts A, B and C
    • Match nested brackets with regex without using recursion or balancing groups?
    • Can a Regex Return the Number of the Line where the Match is Found?
    • Matching overlapping strings
      • Manually modify the .lastIndex property javascript
      • With the third-party library regex python
    • Match a sequence of two distinct characters where the disparity in count is not greater than 3
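
Three of the tricks listed above can be sketched in a few lines of Python. The patterns are common formulations of these ideas and are not necessarily the exact ones used in the linked answers. The helper name is_prime is hypothetical.

```python
import re

# Lines that do not contain a word: a negative lookahead at the start of each line.
log = "ok line\nerror here\nfine"
clean_lines = re.findall(r"^(?!.*error).*$", log, re.MULTILINE)

# Primality on a unary string: 0 and 1 are rejected by 1?, composites by
# (11+?)\1+, which reads "some block of two or more 1s, repeated at least twice".
def is_prime(n: int) -> bool:
    return re.fullmatch(r"1?|(11+?)\1+", "1" * n) is None

primes = [n for n in range(20) if is_prime(n)]

# Overlapping matches: capture inside a lookahead so nothing is consumed.
overlapping = re.findall(r"(?=(aba))", "ababa")

print(clean_lines)  # ['ok line', 'fine']
print(primes)       # [2, 3, 5, 7, 11, 13, 17, 19]
print(overlapping)  # ['aba', 'aba']
```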

Flavor-Specific Information

(Except for those marked with *, this section contains non-Stack Overflow links.)

  • Java
    • Official documentation: Pattern Javadoc ↪, Oracle’s regular expressions tutorial ↪
    • The differences between functions in java.util.regex.Matcher:
      • matches(): The match must be anchored to both input-start and -end
      • find(): A match may be anywhere in the input string (substrings)
      • lookingAt(): The match must be anchored to input-start only
      • (For anchors in general, see the section “Anchors”)
    • The only java.lang.String functions that accept regular expressions: matches(s), replaceAll(s,s), replaceFirst(s,s), split(s), split(s,i)
    • *An (opinionated and) detailed discussion of the disadvantages of and missing features in java.util.regex
  • .NET
    • How to read a .NET regex with look-ahead, look-behind, capturing groups and back-references mixed together?
  • Official documentation:
    • Boost regex engine: General syntax, Perl syntax (used by TextPad, Sublime Text, UltraEdit, …???)
    • JavaScript general info and RegExp object
    • .NET, MySQL, Oracle, Perl5 version 18.2
    • PHP: pattern syntax, preg_match
    • Python: Regular expression operations, search vs match, how-to
    • Rust: crate regex, struct regex::Regex
    • Splunk: regex terminology and syntax and regex command
    • Tcl: regex syntax, manpage, regexp command
    • Visual Studio Find and Replace
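
The anchoring distinction described for java.util.regex.Matcher under Java has a close analogue in Python, which is also what the "search vs match" link is about. A sketch of the rough correspondence, with a made-up pattern and inputs: fullmatch behaves like matches(), match like lookingAt(), and search like find().

```python
import re

pattern = re.compile(r"\d+")
text = "abc 123"

# Anchored at both ends: the whole string must be digits.
whole = pattern.fullmatch(text)

# Anchored at the start only: the string must BEGIN with digits.
at_start = pattern.match(text)
prefix = pattern.match("123 abc")

# Unanchored: the first run of digits anywhere in the string.
anywhere = pattern.search(text)

print(whole)             # None
print(at_start)          # None
print(prefix.group())    # 123
print(anywhere.group())  # 123
```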

General information

(Links marked with * are non-Stack Overflow links.)

  • Other general documentation resources: Learning Regular Expressions, *Regular-expressions.info, *Wikipedia entry, *RexEgg, Open-Directory Project
  • DFA versus NFA
  • Generating Strings matching regex
  • Books: Jeffrey Friedl’s Mastering Regular Expressions
  • When to not use regular expressions:
    • Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems. (blog post written by Stack Overflow’s founder)*
    • Do not use regex to parse HTML:
      • Don’t. Please, just don’t
      • Well, maybe…if you’re really determined (other answers in this question are also good)

Examples of regex that can cause regex engine to fail

  • Why does this regular expression kill the Java regex engine?
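
One commonly cited way to overwhelm a backtracking engine is nesting quantifiers, as in (a+)+b. The sketch below is an illustration of that general problem in Python, not a reproduction of the pattern in the linked Java question. The helper seconds_to_fail is hypothetical, and the input sizes are kept small on purpose because the nested pattern's running time grows very quickly with input length.

```python
import re
import time

def seconds_to_fail(pattern: str, n: int) -> float:
    """Time a guaranteed-failing match of `pattern` against n 'a's plus a 'c'."""
    subject = "a" * n + "c"  # no 'b' anywhere, so every attempt must fail
    start = time.perf_counter()
    result = re.fullmatch(pattern, subject)
    elapsed = time.perf_counter() - start
    assert result is None
    return elapsed

for n in (12, 14, 16, 18):
    nested = seconds_to_fail(r"(a+)+b", n)  # many ways to split the run of a's
    flat = seconds_to_fail(r"a+b", n)       # same language, one way to split
    print(f"n={n} nested={nested:.4f}s flat={flat:.6f}s")
```

Both patterns accept exactly the same strings. Only the nested form gives the engine a large number of equivalent ways to fail.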

Tools: Testers and Explainers

(This section contains non-Stack Overflow links.)

  • Online (* includes replacement tester, + includes split tester):

    • Debuggex (Also has a repository of useful regexes) javascript, python, pcre
    • *Regular Expressions 101 php, pcre, python, javascript, java, go, c#, rust
    • Regex Pal, regular-expressions.info javascript
    • Rubular ruby
    • RegExr
    • Regex Hero dotnet
    • *+ regexstorm.net .net
    • *RegexPlanet: Java java, Go go, Haskell haskell, JavaScript javascript, .NET dotnet, Perl perl, PHP PCRE php, Python python, Ruby ruby, XRegExp xregexp
    • freeformatter.com xregexp
    • *+regex.larsolavtorvik.com php PCRE and POSIX, javascript
  • Offline:

    • Microsoft Windows: RegexBuddy (analysis), RegexMagic (creation), Expresso (analysis, creation, free)
  • MySQL 8.0: Various syntax changes were made. Note especially the doubling of backslashes in some contexts. (This answer needs further editing to reflect the differences.)
