knowledge/technology/tools/Regex.md
2023-12-04 11:02:23 +01:00


obj: concept
aliases: Regular Expression
wiki: https://en.wikipedia.org/wiki/Regular_expression

Regex

A regular expression (shortened as regex or regexp) is a sequence of characters that specifies a match pattern in text. Such patterns are usually used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation.

Anchors

^

Matches the beginning of the string, or of each line in multiline mode.
Example: ^word

$

Matches the end of the string, or of each line in multiline mode.
Example: \.txt$
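A minimal sketch of both anchors, assuming Python's `re` module (the note itself does not name a regex engine). The sample strings are made up for illustration.

```python
import re

text = "word one\nfile.txt\nword two"

# Without re.MULTILINE, ^ matches only at the start of the whole string.
starts_default = re.findall(r"^word", text)                        # ['word']

# With re.MULTILINE, ^ also matches right after each line break.
starts_multiline = re.findall(r"^word", text, flags=re.MULTILINE)  # ['word', 'word']

# $ anchors the match to the end of the string; \. is a literal dot.
is_txt = re.search(r"\.txt$", "notes.txt") is not None       # True
is_bak = re.search(r"\.txt$", "notes.txt.bak") is not None   # False
```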

Flags

  • i: Makes the expression case-insensitive
  • g: Global search: finds all matches instead of stopping after the first one
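The single-letter flags above follow the JavaScript-style notation. As a rough sketch of the same behaviour in Python (an assumption, since the note names no engine): `re.IGNORECASE` plays the role of `i`, and there is no `g` flag; instead `re.search` returns the first match while `re.findall` returns all of them.

```python
import re

text = "Regex, regex, REGEX"

# i flag: re.IGNORECASE makes the match case-insensitive.
first = re.search(r"regex", text, flags=re.IGNORECASE).group()    # 'Regex'

# No g flag in Python: re.search stops at the first match,
# re.findall collects every non-overlapping match.
case_sensitive = re.findall(r"regex", text)                       # ['regex']
all_matches = re.findall(r"regex", text, flags=re.IGNORECASE)     # ['Regex', 'regex', 'REGEX']
```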

Groups & References

()

Groups an expression and captures the text it matches.
Example: (ha)+

\1

Backreference: matches the same text that a captured group matched. \1 references the first group, \2 the second, and so on.
Example: (ha)\s\1

(?:)

Non-capturing group: groups an expression without capturing it, so it cannot be referenced.
Example: (?:ha)+
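The three examples in this section can be tried out as follows. This is a sketch using Python's `re` module, with illustrative input strings.

```python
import re

# (ha)+ repeats the group; group(1) holds the last repetition that was captured.
m = re.search(r"(ha)+", "hahaha!")
whole, last = m.group(0), m.group(1)                      # 'hahaha', 'ha'

# \1 must match exactly the text that group 1 captured.
repeated = re.search(r"(ha)\s\1", "ha ha") is not None    # True
mismatch = re.search(r"(ha)\s\1", "ha he") is not None    # False

# (?:ha)+ groups without capturing, so there are no numbered groups to reference.
nc = re.search(r"(?:ha)+", "hahaha!")
nc_whole, nc_groups = nc.group(0), nc.groups()            # 'hahaha', ()
```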

Character Classes

[abc]

Matches any single character in the set.
Example: b[eo]r

[^abc]

Matches any single character not in the set.
Example: b[^eo]r

[a-z]

Matches any single character in the range between the two given characters, inclusive.
Example: [e-i]
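A short sketch of sets, negated sets, and ranges, assuming Python's `re` module and made-up sample words:

```python
import re

words = "bar ber bir bor bur"

# [eo] accepts exactly one character that is either e or o.
in_set = re.findall(r"b[eo]r", words)         # ['ber', 'bor']

# [^eo] accepts exactly one character that is neither e nor o.
not_in_set = re.findall(r"b[^eo]r", words)    # ['bar', 'bir', 'bur']

# [e-i] accepts one character from e through i, both ends included.
in_range = re.findall(r"[e-i]", "defghij")    # ['e', 'f', 'g', 'h', 'i']
```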

.

Matches any character except line breaks.

\w

Matches any alphanumeric character, including the underscore.

\W

Matches any character that is neither alphanumeric nor an underscore.

\d

Matches any digit.

\D

Matches any character that is not a digit.

\s

Matches any whitespace character.

\S

Matches any non-whitespace character.
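The shorthand classes can be compared side by side. The sketch below assumes Python's `re` module; note that for `str` patterns Python treats `\w`, `\d`, and `\s` as Unicode-aware by default, and `re.ASCII` restricts them to ASCII. The sample strings are illustrative.

```python
import re

sample = "id_42 = 3.5\n"

word_runs = re.findall(r"\w+", sample)      # ['id_42', '3', '5']  (underscore counts as \w)
digits = re.findall(r"\d", sample)          # ['4', '2', '3', '5']
spaces = re.findall(r"\s", sample)          # [' ', ' ', '\n']

# The uppercase forms are the complements of the lowercase ones.
non_word = re.findall(r"\W", "a-b_c")       # ['-']
non_digits = re.findall(r"\D+", "a1b22c")   # ['a', 'b', 'c']
non_space = re.findall(r"\S+", "a b\tc")    # ['a', 'b', 'c']

# . matches any character except a line break.
dots = re.findall(r".", "a\nb")             # ['a', 'b']
```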

Lookarounds

(?=)

Positive lookahead: matches only if the given expression follows, without including it in the match.
Example: \d(?=after)

(?!)

Negative lookahead: matches only if the given expression does not follow.
Example: \d(?!after)

(?<=)

Positive lookbehind: matches only if the given expression precedes, without including it in the match.
Example: (?<=behind)\d

(?<!)

Negative lookbehind: matches only if the given expression does not precede.
Example: (?<!behind)\d
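The four lookaround examples can be checked against one sample string. This sketch assumes Python's `re` module, which requires lookbehind patterns to have a fixed width (the literal `behind` qualifies). The sample text is made up.

```python
import re

text = "1after 2before behind3 ahead4"

# Digit that is followed by "after"; only the digit is part of the match.
pos_ahead = re.findall(r"\d(?=after)", text)      # ['1']

# Digit that is not followed by "after".
neg_ahead = re.findall(r"\d(?!after)", text)      # ['2', '3', '4']

# Digit that is preceded by "behind".
pos_behind = re.findall(r"(?<=behind)\d", text)   # ['3']

# Digit that is not preceded by "behind".
neg_behind = re.findall(r"(?<!behind)\d", text)   # ['1', '2', '4']
```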

Quantifiers And Alternation

+

Matches the preceding expression one or more times.
Example: be+r

*

Matches the preceding expression zero or more times.
Example: be*r

{}

Matches the preceding expression a specified number of times:

  • Match exactly: {4}
  • Match minimum: {4,}
  • Match between: {4,9}

Example: be{1,2}r
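To see how +, * and the {} forms differ, the sketch below runs each against the same made-up word list, assuming Python's `re` module:

```python
import re

words = "br ber beer beeer"

one_or_more = re.findall(r"be+r", words)      # ['ber', 'beer', 'beeer']
zero_or_more = re.findall(r"be*r", words)     # ['br', 'ber', 'beer', 'beeer']
one_to_two = re.findall(r"be{1,2}r", words)   # ['ber', 'beer']
exactly_two = re.findall(r"be{2}r", words)    # ['beer']
two_or_more = re.findall(r"be{2,}r", words)   # ['beer', 'beeer']
```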

?

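Makes the preceding expression optional (zero or one occurrence). Placed after another quantifier, it makes that quantifier lazy, so it matches as few characters as possible.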
Example: colou?r
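The two roles of ? can be illustrated separately. This is a sketch with Python's `re` module; the HTML-like string is only a convenient sample for showing greedy versus lazy matching.

```python
import re

# Optional: the u may appear zero or one time.
optional = re.findall(r"colou?r", "color colour")   # ['color', 'colour']

html = "<b>bold</b>"

# Greedy: .+ grabs as much as possible before the final >.
greedy = re.findall(r"<.+>", html)                  # ['<b>bold</b>']

# Lazy: .+? stops at the first > it can reach.
lazy = re.findall(r"<.+?>", html)                   # ['<b>', '</b>']
```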

|

Works like OR: matches either the expression on its left or the one on its right.
Example: (c|r)at
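A brief sketch of alternation with Python's `re` module. One Python-specific detail worth knowing: when the pattern contains a capturing group, `re.findall` returns the group contents rather than the whole match, so the non-capturing form is used to list full matches.

```python
import re

text = "cat rat bat"

# With a capturing group, findall returns what the group captured.
captured = re.findall(r"(c|r)at", text)                  # ['c', 'r']

# With a non-capturing group, findall returns the whole match.
whole = re.findall(r"(?:c|r)at", text)                   # ['cat', 'rat']

# Without a group, | splits the entire pattern into alternatives.
either = re.findall(r"cat|dog", "cat dog bird")          # ['cat', 'dog']
```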

Common Regular Expressions

  • IPv4-Address (checks the format only, not that each number is at most 255): \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
  • MAC-Address: (?:[0-9a-fA-F]{2}\:){5}[0-9a-fA-F]{2}
  • Hex Color Codes: ^#?([a-fA-F0-9]{6})$
  • Mail Address: ^([a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6})*$
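The patterns above can be exercised as follows, assuming Python's `re` module. Two things here are assumptions rather than part of the list: the mail pattern is written with the digit range 0-9 in its character classes, and the helper `is_ipv4` is a hypothetical addition that layers a numeric range check on top of the IPv4 pattern, which by itself only checks the shape.

```python
import re

IPV4 = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")
MAC = re.compile(r"(?:[0-9a-fA-F]{2}\:){5}[0-9a-fA-F]{2}")
HEX_COLOR = re.compile(r"^#?([a-fA-F0-9]{6})$")
# Assumption: digit range written as 0-9.
MAIL = re.compile(r"^([a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6})*$")


def is_ipv4(s: str) -> bool:
    """Hypothetical helper: pattern match plus a 0-255 check per number."""
    if IPV4.fullmatch(s) is None:
        return False
    return all(int(part) <= 255 for part in s.split("."))


shape_only = IPV4.fullmatch("999.1.1.1") is not None          # True: the pattern checks shape only
ranged_ok = is_ipv4("192.168.0.1")                            # True
ranged_bad = is_ipv4("999.1.1.1")                             # False

mac_ok = MAC.fullmatch("00:1A:2b:3C:4d:5E") is not None       # True

color = HEX_COLOR.search("#1a2B3c").group(1)                  # '1a2B3c'
short_color = HEX_COLOR.search("#12345")                      # None: only 5 hex digits

mail_ok = MAIL.search("user@example.com") is not None         # True
mail_bad = MAIL.search("user@@example.com") is not None       # False
# The trailing * makes the whole group optional, so the empty string also matches.
mail_empty = MAIL.search("") is not None                      # True
```

Because the mail pattern accepts an empty string, callers that need a non-empty address would have to check for that separately.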