---
obj: application
repo: https://github.com/mgdm/htmlq
rev: 2024-04-25
---

# htmlq
Like [jq](jq.md), but for [HTML](../../internet/HTML.md). Uses [CSS](../../internet/CSS.md) selectors to extract bits of content from [HTML](../../internet/HTML.md) files.

## Usage
Usage: `htmlq [FLAGS] [OPTIONS] [--] [selector]...`

### Options
| Option | Description |
| ---------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `-B, --detect-base` | Try to detect the base [URL](../../internet/URL.md) from the `<base>` tag in the document. If not found, default to the value of `--base`, if supplied |
| `-w, --ignore-whitespace` | When printing text nodes, ignore those that consist entirely of whitespace |
| `-p, --pretty` | Pretty-print the serialised output |
| `-t, --text` | Output only the contents of text nodes inside selected elements |
| `-a, --attribute <attribute>` | Only return this attribute (if present) from selected elements |
| `-b, --base <base>` | Use this [URL](../../internet/URL.md) as the base for links |
| `-f, --filename <FILE>` | The input file. Defaults to stdin |
| `-o, --output <FILE>` | The output file. Defaults to stdout |
| `-r, --remove-nodes <SELECTOR>...` | Remove nodes matching this expression before output. May be specified multiple times |
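
The synopsis and options above might be combined as in the following sketch. It assumes `htmlq` is installed and on `PATH`. The sample document, its contents, and the temporary file are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical sample document, written to a temporary file for the examples.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
<html>
  <head><title>Sample</title></head>
  <body>
    <h1 id="title">Hello</h1>
    <nav><a href="/about">About</a> <a href="/blog">Blog</a></nav>
    <p class="intro">Some <b>intro</b> text.</p>
  </body>
</html>
EOF

# Only run the examples when htmlq is actually available.
if command -v htmlq >/dev/null 2>&1; then
  # Select elements with a CSS selector; input defaults to stdin.
  htmlq '#title' < "$sample"

  # Read from a file (-f) and print only the text nodes (-t).
  htmlq -f "$sample" -t 'p.intro'

  # Return only the href attribute (-a) of the selected links,
  # using the given URL as the base for links (-b).
  htmlq -f "$sample" -a href -b 'https://example.com' 'nav a'

  # Pretty-print (-p) the body after removing nodes matching "nav" (-r).
  # The selector comes first because -r accepts more than one value.
  htmlq -f "$sample" -p body -r nav
fi
```

Because `--remove-nodes` takes one or more selectors, a positional selector placed directly after it would be read as another node to remove. Putting the selector first, or separating it with `--` as the synopsis allows, avoids that ambiguity.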