add htmlq

JMARyA 2024-04-26 08:11:28 +02:00
parent 5f3e434132
commit c9c4cda5ba
Signed by: jmarya
GPG key ID: 901B2ADDF27C2263

@@ -0,0 +1,24 @@
---
obj: application
repo: https://github.com/mgdm/htmlq
rev: 2024-04-25
---
# htmlq
`htmlq` is like [jq](jq.md), but for [HTML](../../internet/HTML.md). It uses [CSS](../../internet/CSS.md) selectors to extract bits of content from [HTML](../../internet/HTML.md) files.
## Usage
Usage: `htmlq [FLAGS] [OPTIONS] [--] [selector]...`
### Options
| Option | Description |
| ---------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `-B, --detect-base` | Try to detect the base [URL](../../internet/URL.md) from the `<base>` tag in the document. If not found, default to the value of `--base`, if supplied |
| `-w, --ignore-whitespace` | When printing text nodes, ignore those that consist entirely of whitespace |
| `-p, --pretty` | Pretty-print the serialised output |
| `-t, --text` | Output only the contents of text nodes inside selected elements |
| `-a, --attribute <attribute>` | Only return this attribute (if present) from selected elements |
| `-b, --base <base>` | Use this [URL](../../internet/URL.md) as the base for links |
| `-f, --filename <FILE>` | The input file. Defaults to stdin |
| `-o, --output <FILE>` | The output file. Defaults to stdout |
| `-r, --remove-nodes <SELECTOR>...` | Remove nodes matching this expression before output. May be specified multiple times |
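
### Examples
A minimal sketch of how the options above might be combined. The file `sample.html` and its markup are invented for illustration, and the `htmlq` calls are skipped when the tool is not installed.

```shell
# Hypothetical sample input; the file name and markup are invented for illustration.
cat > sample.html <<'EOF'
<html><body>
<div id="posts">
<a href="/a">First</a>
<a href="/b">Second</a>
<svg></svg>
</div>
</body></html>
EOF

# Skip the htmlq calls if the tool is not installed.
if command -v htmlq >/dev/null 2>&1; then
  # Read from stdin (the default) and print every element matching the selector.
  htmlq '#posts a' < sample.html

  # Read from a file instead and print only the href attribute of each match.
  htmlq -f sample.html -a href '#posts a'

  # Print only the text inside the matched element, skipping whitespace-only nodes.
  htmlq -f sample.html -t -w '#posts'

  # Pretty-print the match with <svg> nodes removed. The selector is placed first
  # because -r accepts several values and may otherwise consume it as one of them.
  htmlq -f sample.html -p '#posts' -r svg
fi
```

When a selector has to follow `-r`, the `--` separator from the usage line should serve the same purpose, for example `htmlq -r svg -- '#posts'`.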