JMARyA c9c4cda5ba

add htmlq

2024-04-26 08:11:28 +02:00

obj	repo	rev
application	https://github.com/mgdm/htmlq	2024-04-25

htmlq

Like jq, but for HTML. Uses CSS selectors to extract bits of content from HTML files.

Usage

Usage: htmlq [FLAGS] [OPTIONS] [--] [selector]...
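
A minimal sketch of the basic invocation, assuming `htmlq` is installed and on `PATH`. The HTML snippet and the `html` variable are made up for illustration; input is piped on stdin and the selector is passed as the argument.

```shell
# Sample document (hypothetical, for illustration only).
html='<html><body><div id="main"><h1>Title</h1><a href="/docs">Docs</a></div><p class="note">Hi</p></body></html>'

# Select by id: prints the matching element together with its markup.
printf '%s' "$html" | htmlq '#main'

# Select by element name: prints every <a> element in the document.
printf '%s' "$html" | htmlq 'a'
```

In practice the input usually comes from a file or from a tool such as `curl` rather than from a shell variable.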

Options

Option	Description
`-B, --detect-base`	Try to detect the base URL from the `<base>` tag in the document. If none is found, fall back to the value of `--base`, if supplied
`-w, --ignore-whitespace`	When printing text nodes, ignore those that consist entirely of whitespace
`-p, --pretty`	Pretty-print the serialised output
`-t, --text`	Output only the contents of text nodes inside selected elements
`-a, --attribute <attribute>`	Only return this attribute (if present) from selected elements
`-b, --base <base>`	Use this URL as the base for links
`-f, --filename <FILE>`	The input file. Defaults to stdin
`-o, --output <FILE>`	The output file. Defaults to stdout
`-r, --remove-nodes <SELECTOR>...`	Remove nodes matching this expression before output. May be specified multiple times
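
A few of the options above can be combined as in the following sketch. It assumes `htmlq` is installed; the sample page and the `page` and `result` temporary files are hypothetical and exist only for this example.

```shell
# Write a small sample page to a temporary file (made-up content).
page=$(mktemp)
cat > "$page" <<'EOF'
<html><body>
<div id="main"><h1>Title</h1><script>var x = 1;</script><a href="/docs">Docs</a></div>
</body></html>
EOF

# --attribute: print only the href attribute of each matching <a>.
htmlq --filename "$page" --attribute href a

# --text: print only the text nodes inside the selected element.
htmlq --filename "$page" --text h1

# --remove-nodes: drop <script> elements from the selection before output.
# The main selector comes first because --remove-nodes accepts several values.
htmlq --filename "$page" '#main' --remove-nodes script

# --output: write the result to a file instead of stdout.
result=$(mktemp)
htmlq --filename "$page" --output "$result" --pretty '#main'
```

Each command reads the same file via `--filename`; omitting it makes `htmlq` read from stdin instead, so the same flags work at the end of a pipeline.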