knowledge/technology/tools/Regex.md

2023-12-04 10:02:23 +00:00
---
obj: concept
aliases: ["Regular Expression"]
wiki: https://en.wikipedia.org/wiki/Regular_expression
---
# Regex
A regular expression (shortened as regex or regexp) is a sequence of characters that specifies a match pattern in text. Such patterns are usually used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation.
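The three uses named above (find, find and replace, input validation) can be sketched with Python's standard `re` module. This is only an illustration; the sample text and the five-digit rule are made up for the example.

```python
import re

text = "Order 66 shipped, order 67 pending"  # hypothetical sample text

# Find: locate the first run of digits.
first = re.search(r"\d+", text)
first_number = first.group() if first else None

# Find and replace: mask every run of digits.
masked = re.sub(r"\d+", "#", text)

# Input validation: the whole string must be exactly five digits
# (an assumed rule, chosen only for the example).
is_valid = re.fullmatch(r"\d{5}", "12345") is not None

print(first_number, masked, is_valid)
```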
## Anchors
### `^`
Matches the beginning of the string, or the beginning of each line when multiline mode is enabled.
Example: `^word`
### `$`
Matches the end of the string, or the end of each line when multiline mode is enabled.
Example: `\.txt$`
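A small sketch of both anchors in Python, assuming the standard `re` module. In Python, `^` and `$` apply to the whole string unless `re.MULTILINE` is passed; other engines may differ in their defaults.

```python
import re

lines = "word one\nanother word\nnotes.txt"  # hypothetical sample text

# Default: ^ anchors only at the very start of the string.
start_default = re.findall(r"^\w+", lines)

# MULTILINE: ^ anchors at the start of every line.
start_multiline = re.findall(r"^\w+", lines, re.MULTILINE)

# $ anchors at the end, so only names that really end in ".txt" pass.
names = ["a.txt", "a.txt.bak"]
ends_with_txt = [re.search(r"\.txt$", n) is not None for n in names]

print(start_default, start_multiline, ends_with_txt)
```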
## Flags
- `i`: Makes the expression case-insensitive
- `g`: Global search; finds all matches instead of stopping at the first one
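The flag letters above follow the `/pattern/flags` notation used by JavaScript and many regex tools. As a rough Python equivalent (an assumption about how to map them, not part of the notation itself), `re.IGNORECASE` plays the role of `i`, and the choice between `re.search` and `re.findall` plays the role of `g`:

```python
import re

text = "Cat, cat, CAT"  # hypothetical sample text

# Without a flag the match is case-sensitive.
case_sensitive = re.findall(r"cat", text)

# re.IGNORECASE behaves like the `i` flag.
case_insensitive = re.findall(r"cat", text, re.IGNORECASE)

# re.search stops at the first match (like omitting `g`);
# re.findall above returns every match (like setting `g`).
first_only = re.search(r"cat", text, re.IGNORECASE).group()

print(case_sensitive, case_insensitive, first_only)
```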
## Groups & References
### `()`
Groups an expression and captures the text it matched, so it can be quantified as a unit or referenced later.
Example: `(ha)+`
### `\1`
References a grouped expression. `\1` references the first group, `\2` the second and so on.
Example: `(ha)\s\1`
### `(?:)`
Creates a non-capturing group: the expression is grouped, but the match is not stored and cannot be referenced.
Example: `(?:ha)+`
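The difference between capturing groups, backreferences, and non-capturing groups can be checked with a short Python sketch (standard `re` module; the sample strings are invented for illustration):

```python
import re

# (ha)+ repeats the group; group(1) holds the last repetition.
laugh = re.fullmatch(r"(ha)+", "hahaha")
whole, last_group = laugh.group(0), laugh.group(1)

# \1 must match the same text that group 1 captured.
repeated = re.search(r"(ha)\s\1", "ha ha!")
mismatch = re.search(r"(ha)\s\1", "ha ho")

# A non-capturing group matches the same text but stores nothing.
capturing_count = re.compile(r"(ha)+").groups
non_capturing_count = re.compile(r"(?:ha)+").groups

print(whole, last_group, repeated.group(), mismatch)
print(capturing_count, non_capturing_count)
```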
## Character Classes
### `[abc]`
Matches any character in the set.
Example: `b[eo]r`
### `[^abc]`
Matches any character not in the set.
Example: `b[^eo]r`
### `[a-z]`
Matches any single character in the range between the two given characters, inclusive.
Example: `[e-i]`
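The three set forms above, tried against small invented inputs in Python:

```python
import re

words = "bar ber bir bor bur"  # hypothetical sample text

# [eo] accepts only "e" or "o" in the middle position.
in_set = re.findall(r"b[eo]r", words)

# [^eo] accepts any character except "e" or "o".
not_in_set = re.findall(r"b[^eo]r", words)

# [e-i] accepts every character from "e" through "i", inclusive.
in_range = re.findall(r"[e-i]", "defghij")

print(in_set, not_in_set, in_range)
```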
### `.`
Matches any character except line breaks.
### `\w`
Matches any alphanumeric character, including the underscore.
### `\W`
Matches any character that is not alphanumeric or an underscore.
### `\d`
Matches any digit.
### `\D`
Matches any non-digit character.
### `\s`
Matches any whitespace character.
### `\S`
Matches any non-whitespace character.
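A quick way to see what each shorthand class covers is to run it over a short string. The sketch below uses Python, where these classes are Unicode-aware by default for `str` patterns; other engines may restrict them to ASCII.

```python
import re

sample = "id_7 = 42\n"  # hypothetical sample text

word_chars = re.findall(r"\w", sample)      # letters, digits, underscore
non_word_chars = re.findall(r"\W", sample)  # everything else
digits = re.findall(r"\d", sample)
whitespace = re.findall(r"\s", sample)

# "." skips the line break between "a" and "b".
dot = re.findall(r".", "a\nb")

print(word_chars, non_word_chars, digits, whitespace, dot)
```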
## Lookarounds
### `(?=)`
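Positive lookahead: matches only if the given expression follows at this position. The lookahead text is not part of the match.
Example: `\d(?=after)`
### `(?!)`
Negative lookahead: matches only if the given expression does not follow at this position.
Example: `\d(?!after)`
### `(?<=)`
Positive lookbehind: matches only if the given expression precedes this position. The lookbehind text is not part of the match.
Example: `(?<=behind)\d`
### `(?<!)`
Negative lookbehind: matches only if the given expression does not precede this position.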
Example: `(?<!behind)\d`
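The four example patterns can be compared side by side on one invented input. In each case only the digit is returned, because lookarounds check the surrounding text without consuming it. This is a Python sketch; note that Python only supports fixed-width lookbehind.

```python
import re

text = "1after 2before behind3 front4"  # hypothetical sample text

followed_by_after = re.findall(r"\d(?=after)", text)
not_followed_by_after = re.findall(r"\d(?!after)", text)
preceded_by_behind = re.findall(r"(?<=behind)\d", text)
not_preceded_by_behind = re.findall(r"(?<!behind)\d", text)

print(followed_by_after, not_followed_by_after)
print(preceded_by_behind, not_preceded_by_behind)
```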
## Quantifiers And Alternation
### `+`
Matches one or more occurrences of the preceding expression.
Example: `be+r`
### `*`
Matches zero or more occurrences of the preceding expression.
Example: `be*r`
### `{}`
Matches the preceding expression a specified number of times:
- Exactly 4 times: `{4}`
- At least 4 times: `{4,}`
- Between 4 and 9 times: `{4,9}`

Example: `be{1,2}r`
### `?`
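Makes the preceding expression optional (zero or one occurrence). Placed after another quantifier, as in `+?` or `*?`, it makes that quantifier lazy, so it matches as few characters as possible.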
Example: `colou?r`
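The quantifiers above can be compared on a few candidate words, along with the two roles of `?`. This is a Python sketch; the word lists and the helper `matching` are made up for the example.

```python
import re

candidates = ["br", "ber", "beer", "beeer"]  # hypothetical inputs

def matching(pattern):
    """Return the candidates that the pattern matches in full."""
    return [w for w in candidates if re.fullmatch(pattern, w)]

one_or_more = matching(r"be+r")
zero_or_more = matching(r"be*r")
one_to_two = matching(r"be{1,2}r")

# "?" as optional: the "u" may appear zero or one time.
spellings = [w for w in ["color", "colour", "colouur"]
             if re.fullmatch(r"colou?r", w)]

# "?" as lazy modifier: ".+?" stops at the first possible ">".
greedy = re.search(r"<.+>", "<a><b>").group()
lazy = re.search(r"<.+?>", "<a><b>").group()

print(one_or_more, zero_or_more, one_to_two, spellings, greedy, lazy)
```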
### `|`
Works like OR: matches either the expression before it or the expression after it. Use a group to limit how far the alternation extends.
Example: `(c|r)at`
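A short Python sketch of why the group matters in the example. Without it, the alternation splits the whole pattern into `c` and `rat`. The sample text is invented; `re.finditer` is used because `re.findall` returns only the captured group when a pattern contains one.

```python
import re

text = "cat rat bat"  # hypothetical sample text

# Grouped: "c" or "r", followed by "at".
grouped = [m.group() for m in re.finditer(r"(c|r)at", text)]

# With a capturing group, findall returns the group, not the whole match.
captured = re.findall(r"(c|r)at", text)

# Ungrouped: the alternatives are "c" and "rat".
ungrouped = re.findall(r"c|rat", text)

print(grouped, captured, ungrouped)
```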
## Common Regular Expressions
- IPv4-Address: `\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}`
- MAC-Address: `(?:[0-9a-fA-F]{2}\:){5}[0-9a-fA-F]{2}`
- Hex Color Codes: `^#?([a-fA-F0-9]{6})$`
- Mail Address: `^([a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6})*$`
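The patterns in this list can be exercised with Python's `re` module, as sketched below. The mail pattern is written with the digit range as `0-9`. The helper `is_ipv4` and its range check are assumptions added for illustration: the IPv4 pattern only checks the shape of the address, so values such as `999` still pass it.

```python
import re

IPV4 = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")
MAC = re.compile(r"(?:[0-9a-fA-F]{2}\:){5}[0-9a-fA-F]{2}")
HEX_COLOR = re.compile(r"^#?([a-fA-F0-9]{6})$")
MAIL = re.compile(r"^([a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6})*$")

def is_ipv4(value):
    """Hypothetical helper: shape check plus a 0-255 range check."""
    if IPV4.fullmatch(value) is None:
        return False
    return all(int(part) <= 255 for part in value.split("."))

shape_only = IPV4.fullmatch("999.999.999.999") is not None
ip_ok = is_ipv4("192.168.0.1")
ip_out_of_range = is_ipv4("999.999.999.999")

mac_ok = MAC.fullmatch("00:1A:2b:3C:4d:5E") is not None
color = HEX_COLOR.match("#ff8800").group(1)
short_color = HEX_COLOR.match("#ff880")

mail_ok = MAIL.match("user@example.com") is not None
mail_no_tld = MAIL.match("user@example") is not None
# The trailing "*" lets the group repeat zero times, so "" also matches.
mail_empty = MAIL.match("") is not None
```

Because of the trailing `*`, the mail pattern also accepts an empty string, which may or may not be wanted depending on whether the field is optional.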