obj | repo | rev |
---|---|---|
application | https://github.com/mgdm/htmlq | 2024-04-25 |
htmlq
Like jq, but for HTML. Uses CSS selectors to extract bits of content from HTML files.
Usage
Usage: htmlq [FLAGS] [OPTIONS] [--] [selector]...
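As a rough illustration of the synopsis above, the following Python sketch pipes a document to htmlq on stdin and collects what it prints. The helper names `htmlq_command` and `run_htmlq` and the sample HTML are hypothetical, not part of htmlq; the sketch assumes an `htmlq` binary is on `PATH` and falls back to printing the command when it is not.

```python
import shutil
import subprocess


def htmlq_command(selectors):
    """Build an argv list following `htmlq [FLAGS] [OPTIONS] [--] [selector]...`."""
    # "--" marks the end of flags and options, so every later item is a selector.
    return ["htmlq", "--", *selectors]


def run_htmlq(html, selectors):
    """Pipe `html` to htmlq on stdin and return whatever it writes to stdout."""
    result = subprocess.run(
        htmlq_command(selectors),
        input=html,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


if __name__ == "__main__":
    # Made-up sample document, used only for illustration.
    sample = "<html><body><h1 id='title'>Hello</h1><p>World</p></body></html>"
    if shutil.which("htmlq") is None:
        # Without the binary, just show the command that would have been run.
        print("htmlq is not on PATH; command would be:", htmlq_command(["#title"]))
    else:
        print(run_htmlq(sample, ["#title"]))
```

Passing the arguments as a list avoids shell quoting problems with selectors such as `#title`, where `#` would otherwise start a comment in many shells.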
Options
Option | Description |
---|---|
-B, --detect-base | Try to detect the base URL from the <base> tag in the document. If not found, default to the value of --base, if supplied |
-w, --ignore-whitespace | When printing text nodes, ignore those that consist entirely of whitespace |
-p, --pretty | Pretty-print the serialised output |
-t, --text | Output only the contents of text nodes inside selected elements |
-a, --attribute <attribute> | Only return this attribute (if present) from selected elements |
-b, --base <base> | Use this URL as the base for links |
-f, --filename <FILE> | The input file. Defaults to stdin |
-o, --output <FILE> | The output file. Defaults to stdout |
-r, --remove-nodes <SELECTOR>... | Remove nodes matching this expression before output. May be specified multiple times |
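The options in the table can be combined on a single command line. The sketch below maps each one to a keyword argument and returns the resulting argument list; the function name `htmlq_args` and its parameter names are hypothetical, chosen here to mirror the long option names. It only builds the command and does not run htmlq.

```python
import shlex


def htmlq_args(selectors, *, attribute=None, text=False, pretty=False,
               ignore_whitespace=False, detect_base=False, base=None,
               filename=None, output=None, remove_nodes=()):
    """Return an argv list for htmlq built from the options listed above."""
    args = ["htmlq"]

    # Boolean switches: emitted only when enabled.
    switches = [
        (detect_base, "--detect-base"),
        (ignore_whitespace, "--ignore-whitespace"),
        (pretty, "--pretty"),
        (text, "--text"),
    ]
    args += [flag for enabled, flag in switches if enabled]

    # Options that take exactly one value.
    valued = [
        (attribute, "--attribute"),
        (base, "--base"),
        (filename, "--filename"),
        (output, "--output"),
    ]
    for value, option in valued:
        if value is not None:
            args += [option, value]

    # --remove-nodes may be specified multiple times, once per selector here.
    for selector in remove_nodes:
        args += ["--remove-nodes", selector]

    # "--" ends the options so the selectors that follow stay positional.
    args.append("--")
    args.extend(selectors)
    return args


if __name__ == "__main__":
    # Print the href attribute of every <a> element.
    print(shlex.join(htmlq_args(["a"], attribute="href")))
```

Since `--remove-nodes` accepts more than one selector, placing `--` before the positional selectors should keep them from being read as further values of that option; this is an assumption about the argument parser, in line with the `[--]` shown in the usage synopsis.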