knowledge/technology/tools/Regex.md
2023-12-04 11:02:23 +01:00

---
obj: concept
aliases: ["Regular Expression"]
wiki: https://en.wikipedia.org/wiki/Regular_expression
---
# Regex
A regular expression (shortened as regex or regexp) is a sequence of characters that specifies a match pattern in text. Such patterns are usually used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation.
## Anchors
### `^`
Matches the beginning of the string or line.
Example: `^word`
### `$`
Matches the end of the string or line.
Example: `\.txt$`
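
As a minimal sketch of how the two anchors behave, the snippet below assumes Python's built-in `re` module (the note itself is engine-agnostic). The sample text is made up for illustration. Whether `^` and `$` apply per line or per string depends on the multiline setting, which in Python is `re.MULTILINE`.

```python
import re

# Hypothetical sample text with three lines.
text = "word one\nword two\nnotes.txt"

# Without re.MULTILINE, ^ matches only at the very start of the string.
starts = re.findall(r"^word", text)

# With re.MULTILINE, ^ also matches right after each line break.
line_starts = re.findall(r"^word", text, flags=re.MULTILINE)

# $ anchors the match to the end of the string (or of each line with re.MULTILINE).
is_txt = re.search(r"\.txt$", text) is not None

print(starts, line_starts, is_txt)
```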
## Flags
- `i`: Makes the expression case-insensitive
- `g`: Global search; finds all matches instead of stopping at the first one
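
How flags are written differs between engines. The sketch below assumes Python's `re` module, where `i` corresponds to `re.IGNORECASE` and there is no `g` flag; instead, the choice between `re.search` (first match) and `re.findall` (all matches) plays that role. The sample string is hypothetical.

```python
import re

text = "Ha ha HA"

# Case-insensitive search; re.search stops at the first match.
first = re.search(r"ha", text, flags=re.IGNORECASE)

# re.findall returns every non-overlapping match, similar to a global flag.
all_matches = re.findall(r"ha", text, flags=re.IGNORECASE)

# Without the flag, only the lowercase occurrence matches.
case_sensitive = re.findall(r"ha", text)

print(first.group(), all_matches, case_sensitive)
```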
## Groups & References
### `()`
Groups an expression.
Example: `(ha)+`
### `\1`
References a grouped expression and matches the same text that the group captured. `\1` references the first group, `\2` the second, and so on.
Example: `(ha)\s\1`
### `(?:)`
Creates a non-capturing group: it groups an expression, but the group cannot be referenced.
Example: `(?:ha)+`
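
The difference between capturing groups, backreferences, and non-capturing groups can be sketched as follows, assuming Python's `re` module. The `(ha|ho)` pattern is an extra illustration, not one of the examples above.

```python
import re

# (ha)+ repeats the whole group, not just the last character.
laugh = re.fullmatch(r"(ha)+", "hahaha")

# \1 must match the same text that group 1 captured.
repeated = re.search(r"(ha)\s\1", "ha ha")

# Group 1 captured "ha", so "ho" after the space does not satisfy \1.
mismatch = re.search(r"(ha|ho)\s\1", "ha ho")

# (?:...) groups without capturing, so the pattern has no numbered groups.
capturing_count = re.compile(r"(ha)+").groups
non_capturing_count = re.compile(r"(?:ha)+").groups

print(laugh.group(1), repeated.group(), mismatch, capturing_count, non_capturing_count)
```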
## Character Classes
### `[abc]`
Matches any character in the set.
Example: `b[eo]r`
### `[^abc]`
Matches any character not in the set.
Example: `b[^eo]r`
### `[a-z]`
Matches any single character in the range between the two given characters, inclusive.
Example: `[e-i]`
### `.`
Matches any character except line breaks.
### `\w`
Matches any alphanumeric character, including the underscore.
### `\W`
Matches any character that is not alphanumeric or an underscore.
### `\d`
Matches any digit.
### `\D`
Matches any character that is not a digit.
### `\s`
Matches any whitespace character.
### `\S`
Matches any non-whitespace character.
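
The character classes above can be tried out with a short sketch, assuming Python's `re` module. The word list and the sample string are made up for illustration.

```python
import re

words = ["bar", "ber", "bir", "bor", "bur"]

# [eo] accepts only "e" or "o" in the middle position.
in_set = [w for w in words if re.fullmatch(r"b[eo]r", w)]

# [^eo] accepts any character except "e" or "o".
not_in_set = [w for w in words if re.fullmatch(r"b[^eo]r", w)]

# [e-i] is a range that includes both end points.
ranged = re.findall(r"[e-i]", "abcdefghijk")

sample = "id_42 ok!"
word_chars = re.findall(r"\w", sample)  # letters, digits, underscore
non_word = re.findall(r"\W", sample)    # everything else
digits = re.findall(r"\d", sample)
spaces = re.findall(r"\s", sample)

print(in_set, not_in_set, ranged, word_chars, non_word, digits, spaces)
```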
## Lookarounds
### `(?=)`
Positive lookahead: matches only if the given pattern follows, without including it in the match.
Example: `\d(?=after)`
### `(?!)`
Negative lookahead: matches only if the given pattern does not follow.
Example: `\d(?!after)`
### `(?<=)`
Positive lookbehind: matches only if the given pattern precedes, without including it in the match.
Example: `(?<=behind)\d`
### `(?<!)`
Negative lookbehind: matches only if the given pattern does not precede.
Example: `(?<!behind)\d`
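
The four lookarounds can be compared side by side with a small sketch, assuming Python's `re` module and a hypothetical sample string. In each case only the digit is part of the match; the surrounding word is checked but not consumed.

```python
import re

text = "1after 2before behind3 ahead4"

# Digit that is followed by "after".
pos_ahead = re.findall(r"\d(?=after)", text)

# Digit that is NOT followed by "after".
neg_ahead = re.findall(r"\d(?!after)", text)

# Digit that is preceded by "behind".
pos_behind = re.findall(r"(?<=behind)\d", text)

# Digit that is NOT preceded by "behind".
neg_behind = re.findall(r"(?<!behind)\d", text)

print(pos_ahead, neg_ahead, pos_behind, neg_behind)
```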
## Quantifiers And Alternation
### `+`
Matches the preceding expression one or more times.
Example: `be+r`
### `*`
Matches the preceding expression zero or more times.
Example: `be*r`
### `{}`
Matches the preceding expression a specified number of times:
- Match exactly: `{4}`
- Match minimum: `{4,}`
- Match between: `{4,9}`
Example: `be{1,2}r`
### `?`
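Makes the preceding expression optional. When placed after another quantifier (for example `+?` or `*?`), it makes that quantifier lazy, so it matches as few characters as possible.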
Example: `colou?r`
### `|`
Works like OR: matches either the expression before it or the expression after it.
Example: `(c|r)at`
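
A minimal sketch of the quantifiers and alternation, assuming Python's `re` module. The candidate words and the `<.+>` patterns used to contrast greedy and lazy matching are illustrative additions.

```python
import re

candidates = ["br", "ber", "beer", "beeer"]

plus = [w for w in candidates if re.fullmatch(r"be+r", w)]         # one or more "e"
star = [w for w in candidates if re.fullmatch(r"be*r", w)]         # zero or more "e"
bounded = [w for w in candidates if re.fullmatch(r"be{1,2}r", w)]  # one or two "e"

# ? makes the preceding "u" optional.
spellings = ["color", "colour", "colouur"]
optional = [w for w in spellings if re.fullmatch(r"colou?r", w)]

# After a quantifier, ? switches from greedy to lazy matching.
greedy = re.search(r"<.+>", "<a><b>").group()
lazy = re.search(r"<.+?>", "<a><b>").group()

# | picks either alternative; finditer returns whole matches, not just the group.
alternation = [m.group() for m in re.finditer(r"(c|r)at", "cat rat bat")]

print(plus, star, bounded, optional, greedy, lazy, alternation)
```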
## Common Regular Expressions
- IPv4-Address: `\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}` (checks the format only; it does not limit each octet to 0-255)
- MAC-Address: `(?:[0-9a-fA-F]{2}\:){5}[0-9a-fA-F]{2}`
- Hex Color Codes: `^#?([a-fA-F0-9]{6})$`
- Mail Address: `^([a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6})*$`
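
The IPv4, MAC and hex color patterns from the list can be exercised with a short sketch, assuming Python's `re` module. The helper `is_valid_ipv4` is a hypothetical addition: since the IPv4 pattern only checks the shape of the address, the helper adds a separate range check per octet.

```python
import re

IPV4 = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")
MAC = re.compile(r"(?:[0-9a-fA-F]{2}\:){5}[0-9a-fA-F]{2}")
HEX_COLOR = re.compile(r"^#?([a-fA-F0-9]{6})$")


def is_valid_ipv4(candidate: str) -> bool:
    """Hypothetical helper: shape check via regex, then a 0-255 check per octet."""
    if IPV4.fullmatch(candidate) is None:
        return False
    return all(int(octet) <= 255 for octet in candidate.split("."))


# The pattern alone accepts out-of-range octets; the helper rejects them.
shape_only = IPV4.fullmatch("999.999.999.999") is not None
strict = is_valid_ipv4("999.999.999.999")

mac_ok = MAC.fullmatch("00:1A:2b:3C:4d:5E") is not None

# Group 1 holds the six hex digits without the optional "#".
color = HEX_COLOR.match("#ff8800").group(1)
short_color = HEX_COLOR.match("#ff88")

print(shape_only, strict, is_valid_ipv4("192.168.0.1"), mac_ok, color, short_color)
```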