knowledge/technology/applications/cli/hck.md
2024-03-19 09:25:42 +01:00

obj repo
application https://github.com/sstadick/hck

hck

hck is a shortening of hack, a rougher form of cut.

A close to drop-in replacement for cut that can use a regex delimiter instead of a fixed string. Additionally, this tool lets you specify the order of the output columns using the same column selection syntax as cut (see below for examples).

No single feature of hck on its own makes it stand out over awk, cut, xsv or other such tools. Where hck excels is making common things easy, such as reordering output fields, or splitting records on a weird delimiter. It is meant to be simple and easy to use while exploring datasets. Think of this as filling a gap between cut and awk.
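
To make the two headline features concrete, here is a minimal Python sketch of splitting a record on a regex delimiter and emitting fields in a caller-chosen order. It only illustrates the idea and is not hck's implementation. The function name split_and_reorder and its parameters are hypothetical, and skipping out-of-range field numbers is an assumption of this sketch.

```python
import re


def split_and_reorder(line, delimiter=r"\s+", fields=(1,), output_delimiter="\t"):
    """Split `line` on a regex delimiter and join the 1-based `fields` in the order given.

    Hypothetical helper for illustration only; it does not reproduce hck's internals.
    """
    columns = re.split(delimiter, line.rstrip("\n"))
    # Assumption of this sketch: field numbers past the end of the record are skipped.
    picked = [columns[i - 1] for i in fields if 1 <= i <= len(columns)]
    return output_delimiter.join(picked)


# Whitespace runs of any length act as a single delimiter; field 3 is emitted before field 1.
print(split_and_reorder("alice   30    berlin", fields=(3, 1)))

# A "weird" delimiter: one or more dashes, with a comma as the output delimiter.
print(split_and_reorder("a--b---c", delimiter=r"-+", fields=(2, 3, 1), output_delimiter=","))
```

The first call prints berlin and alice separated by a tab; the second prints b,c,a. A fixed-string splitter such as cut cannot treat runs of varying length as one separator, and cut always emits fields in input order.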

Usage

Usage: hck [options]
Options:

Option Description
-o, --output <OUTPUT> Output file to write to, defaults to stdout
-d, --delimiter <DELIMITER> Delimiter to use on input files. This is a regex by default. To treat it as a literal, add the -L flag [default: \s+]
-L, --delim-is-literal Treat the delimiter as a string literal. This can significantly improve performance, especially for single byte delimiters
-I, --use-input-delim Use the input delimiter as the output delimiter if the input is literal and no other output delimiter has been set
-D, --output-delimiter <OUTPUT_DELIMITER> Delimiter string to use on outputs [default: "\t"]
-f, --fields <FIELDS> Fields to keep in the output, ex: 1,2-,-5,2-5. Fields are 1-based and inclusive
-e, --exclude <EXCLUDE> Fields to exclude from the output, ex: 3,9-11,15-. Exclude fields are 1-based and inclusive. Exclude fields take precedence over fields
-E, --exclude-header <EXCLUDE_HEADER> Headers to exclude from the output, ex: '^badfield.*$'. This is a string literal by default. Add the -r flag to treat it as a regex
-F, --header-field <HEADER_FIELD> A string literal or regex to select headers, ex: '^is_.*$'. This is a string literal by default. Add the -r flag to treat it as a regex
-r, --header-is-regex Treat the header_fields as regexes instead of string literals
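
The field syntax used by -f and -e (1-based, inclusive, open-ended ranges such as 2- and -5) and the rule that exclusions take precedence can be sketched in Python as follows. This is an illustration under stated assumptions, not hck's actual code: parse_ranges and select_fields are hypothetical names, ranges are clamped to the number of columns, and hck's handling of overlapping or repeated ranges is not modeled.

```python
def parse_ranges(spec, n_columns):
    """Expand a cut-style spec such as "1,2-,-5,2-5" into 1-based column numbers.

    Hypothetical helper: numbers are returned in the order written, clamped to n_columns.
    """
    selected = []
    for part in spec.split(","):
        if "-" in part:
            start, end = part.split("-", 1)
            lo = int(start) if start else 1        # "-5" means columns 1 through 5
            hi = int(end) if end else n_columns    # "2-" means column 2 through the last
        else:
            lo = hi = int(part)
        selected.extend(range(lo, min(hi, n_columns) + 1))
    return selected


def select_fields(columns, fields=None, exclude=None):
    """Apply an include spec and an exclude spec to one record (a list of strings)."""
    n = len(columns)
    keep = parse_ranges(fields, n) if fields else list(range(1, n + 1))
    drop = set(parse_ranges(exclude, n)) if exclude else set()
    # Exclusions win over inclusions, matching the description of -e.
    return [columns[i - 1] for i in keep if i not in drop]


row = ["a", "b", "c", "d", "e", "f"]
print(select_fields(row, fields="4-,1"))                # ['d', 'e', 'f', 'a']
print(select_fields(row, fields="1-4", exclude="2-3"))  # ['a', 'd']
print(select_fields(row, exclude="3,5-"))               # ['a', 'b', 'd']
```

Keeping the parsed numbers in the order written is what makes reordering possible in this sketch, and applying the exclude set last is one simple way to give it precedence.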