---
obj: application
website: https://jqlang.github.io/jq/
repo: https://github.com/jqlang/jq
---
# jq
jq is a lightweight and flexible command-line [JSON](../../files/JSON.md) processor, akin to `sed`, `awk`, `grep`, and friends, but for [JSON](../../files/JSON.md) data. It is written in portable C and has zero runtime dependencies, and it lets you easily slice, filter, map, and transform structured data.
## Usage
```shell
cat data.json | jq [FILTER]
# Raw output: strings are printed without JSON quotes
cat data.json | jq -r [FILTER]
```
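
To make the difference between the two invocations concrete, here is a minimal sketch. The inline JSON document is made up for illustration, and the example assumes `jq` is installed and on the `PATH`.

```shell
# Default output is JSON-encoded, so strings keep their quotes.
quoted=$(echo '{"name":"jq"}' | jq '.name')
# With -r, a string result is printed without the surrounding quotes.
raw=$(echo '{"name":"jq"}' | jq -r '.name')
echo "$quoted"   # "jq"
echo "$raw"      # jq
```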
## Filters
### Identity
The absolute simplest filter is `.`. This filter takes its input and produces the same value as output; that is, it is the identity operator.
### Object Identifier
The simplest _useful_ filter has the form `.foo`. When given a [JSON](../../files/JSON.md) object (aka dictionary or hash) as input, `.foo` produces the value at the key "foo" if the key is present, or null otherwise.

The `.foo` syntax only works for simple, identifier-like keys, that is, keys made up entirely of alphanumeric characters and underscores that do not start with a digit.

If the key contains special characters or starts with a digit, you need to surround it with double quotes, as in `."foo$"`, or use the bracket form `.["foo$"]`.
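
The object identifier forms described above can be sketched as follows. The sample object and the variable names are invented for illustration.

```shell
# Hypothetical sample object with one plain key and one key containing "$".
input='{"foo": 42, "foo$": 7}'
plain=$(echo "$input" | jq '.foo')          # 42
quoted=$(echo "$input" | jq '."foo$"')      # 7
bracket=$(echo "$input" | jq '.["foo$"]')   # 7
missing=$(echo "$input" | jq '.bar')        # null (key not present)
echo "$plain $quoted $bracket $missing"
```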
### Array Index
When the index value is an integer, `.[<number>]` can index arrays. Arrays are zero-based, so `.[2]` returns the third element.

Negative indices are allowed, with -1 referring to the last element, -2 referring to the next to last element, and so on.
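
A short sketch of positive and negative indexing, using a made-up four-element array:

```shell
# Hypothetical sample array.
input='["a","b","c","d"]'
third=$(echo "$input" | jq '.[2]')    # "c" (zero-based, so index 2 is the third element)
last=$(echo "$input" | jq '.[-1]')    # "d" (-1 counts from the end)
echo "$third $last"
```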
### Array/String Slice
The `.[<number>:<number>]` syntax can be used to return a subarray of an array or substring of a string. The array returned by `.[10:15]` will be of length 5, containing the elements from index 10 (inclusive) to index 15 (exclusive). Either index may be negative (in which case it counts backwards from the end of the array), or omitted (in which case it refers to the start or end of the array). Indices are zero-based.
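
The slice rules above can be tried on small, made-up inputs. The `-c` flag only makes jq print each result compactly on one line.

```shell
# Hypothetical sample array.
arr='[0,1,2,3,4,5]'
middle=$(echo "$arr" | jq -c '.[2:4]')    # [2,3]  (index 2 inclusive, 4 exclusive)
head=$(echo "$arr" | jq -c '.[:2]')       # [0,1]  (start omitted)
tail=$(echo "$arr" | jq -c '.[-2:]')      # [4,5]  (negative start, end omitted)
sub=$(echo '"abcdef"' | jq '.[1:3]')      # "bc"   (slices also work on strings)
echo "$middle $head $tail $sub"
```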
### Array/Object Value Iterator
If you use the `.[index]` syntax but omit the index entirely, it will return _all_ of the elements of an array. Running `.[]` with the input `[1,2,3]` will produce the numbers as three separate results, rather than as a single array. A filter of the form `.foo[]` is equivalent to `.foo | .[]`.

You can also use this on an object, and it will return all the values of the object.

Note that the iterator operator is a generator of values.
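
As a minimal sketch with made-up inputs, the iterator emits each element or value as its own result, one per output line:

```shell
# Iterating an array: three separate results, not one array.
items=$(echo '[1,2,3]' | jq '.[]')
# Iterating an object: the values only, without the keys.
values=$(echo '{"a":1,"b":2}' | jq '.[]')
echo "$items"
echo "$values"
```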
### Comma
If two filters are separated by a comma, then the same input will be fed into both and the two filters' output value streams will be concatenated in order: first, all of the outputs produced by the left expression, and then all of the outputs produced by the right. For instance, the filter `.foo, .bar` produces both the "foo" fields and "bar" fields as separate outputs.

The `,` operator is one way to construct generators.
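
A small sketch of the comma operator on an invented object: both filters see the same input, and their outputs are emitted one after the other.

```shell
# Hypothetical sample object; .baz is deliberately not selected.
both=$(echo '{"foo":1,"bar":2,"baz":3}' | jq '.foo, .bar')
echo "$both"   # prints 1 and then 2, each on its own line
```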
### Pipe
The `|` operator combines two filters by feeding the output(s) of the one on the left into the input of the one on the right. It's similar to the Unix [shell](Shell.md)'s pipe, if you're used to that.

If the one on the left produces multiple results, the one on the right will be run for each of those results. So, the expression `.[] | .foo` retrieves the "foo" field of each element of the input array. This is a cartesian product, which can be surprising.

Note that `.a.b.c` is the same as `.a | .b | .c`.

Note too that `.` is the input value at the particular stage in a "pipeline", specifically: where the `.` expression appears. Thus `.a | . | .b` is the same as `.a.b`, as the `.` in the middle refers to whatever value `.a` produced.
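
The pipe behaviour described above can be sketched with two made-up inputs: one array of objects, and one nested object to compare the chained and piped spellings.

```shell
# The right-hand filter runs once per result of the left-hand filter.
foos=$(echo '[{"foo":1},{"foo":2}]' | jq '.[] | .foo')
# Chained path and explicit pipes give the same result.
nested='{"a":{"b":{"c":5}}}'
chained=$(echo "$nested" | jq '.a.b.c')
piped=$(echo "$nested" | jq '.a | .b | .c')
echo "$foos"
echo "$chained $piped"
```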
### Array Construction: `[]`
As in [JSON](../../files/JSON.md), `[]` is used to construct arrays, as in `[1,2,3]`. The elements of the arrays can be any jq expression, including a pipeline. All of the results produced by all of the expressions are collected into one big array. You can use it to construct an array out of a known quantity of values (as in `[.foo, .bar, .baz]`) or to "collect" all the results of a filter into an array (as in `[.items[].name]`).

Once you understand the `,` operator, you can look at jq's array syntax in a different light: the expression `[1,2,3]` is not using a built-in syntax for comma-separated arrays, but is instead applying the `[]` operator (collect results) to the expression `1,2,3` (which produces three different results).

If you have a filter `X` that produces four results, then the expression `[X]` will produce a single result, an array of four elements.
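
Both uses of `[]` can be sketched on a made-up document. The field names here are illustrative only.

```shell
# Hypothetical sample document.
doc='{"foo":1,"bar":2,"items":[{"name":"x"},{"name":"y"}]}'
# A known quantity of values gathered into one array.
fixed=$(echo "$doc" | jq -c '[.foo, .bar]')           # [1,2]
# All results of a generator collected into one array.
collected=$(echo "$doc" | jq -c '[.items[].name]')    # ["x","y"]
echo "$fixed $collected"
```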
### Object Construction: `{}`
Like [JSON](../../files/JSON.md), `{}` is for constructing objects (aka dictionaries or hashes), as in: `{"a": 42, "b": 17}`.

If the keys are "identifier-like", then the quotes can be left off, as in `{a:42, b:17}`. Variable references as key expressions use the value of the variable as the key. Key expressions other than constant literals, identifiers, or variable references need to be parenthesized, e.g., `{("a"+"b"):59}`.

The value can be any expression (although you may need to wrap it in parentheses if, for example, it contains colons), which gets applied to the `{}` expression's input (remember, all filters have an input and an output). For example:
```
{foo: .bar}
```
This filter produces the [JSON](../../files/JSON.md) object `{"foo": 42}` if given the [JSON](../../files/JSON.md) object `{"bar":42, "baz":43}` as its input. You can use this to select particular fields of an object: if the input is an object with "user", "title", "id", and "content" fields and you just want "user" and "title", you can write:
```
{user: .user, title: .title}
```
Because that is so common, there's a shortcut syntax for it: `{user, title}`.

If one of the expressions produces multiple results, multiple dictionaries will be produced. If the input is
```
{"user":"stedolan","titles":["JQ Primer", "More JQ"]}
```
then the expression
```
{user, title: .titles[]}
```
will produce two outputs:
```
{"user":"stedolan", "title": "JQ Primer"}
{"user":"stedolan", "title": "More JQ"}
```
Putting parentheses around the key means it will be evaluated as an expression. With the same input as above,
```
{(.user): .titles}
```
produces
```
{"stedolan": ["JQ Primer", "More JQ"]}
```
## Functions
### `has(key)`
The builtin function `has` returns whether the input object has the given key, or the input array has an element at the given index.
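
A minimal sketch of `has` on a made-up object and a made-up array:

```shell
present=$(echo '{"foo":1}' | jq 'has("foo")')   # true
absent=$(echo '{"foo":1}' | jq 'has("bar")')    # false
in_range=$(echo '[10,20]' | jq 'has(1)')        # true  (index 1 exists)
out_range=$(echo '[10,20]' | jq 'has(2)')       # false (only indices 0 and 1 exist)
echo "$present $absent $in_range $out_range"
```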
### `map(f)`, `map_values(f)`
For any filter `f`, `map(f)` and `map_values(f)` apply `f` to each of the values in the input array or object, that is, to the values of `.[]`.

In the absence of errors, `map(f)` always outputs an array, whereas `map_values(f)` outputs an array if given an array, or an object if given an object.

When the input to `map_values(f)` is an object, the output object has the same keys as the input object, except for those keys whose values produce no values at all when piped to `f`.

`map(f)` is equivalent to `[.[] | f]`, and `map_values(f)` is equivalent to `.[] |= f`.
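
The difference in output shape can be sketched with made-up inputs and the filter `. + 1`:

```shell
obj='{"a":1,"b":2}'
arr_map=$(echo '[1,2,3]' | jq -c 'map(. + 1)')       # [2,3,4]
obj_map=$(echo "$obj" | jq -c 'map(. + 1)')          # [2,3]  (map always yields an array)
obj_vals=$(echo "$obj" | jq -c 'map_values(. + 1)')  # {"a":2,"b":3}  (keys are kept)
echo "$arr_map $obj_map $obj_vals"
```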
### `del(path)`
The builtin function `del` removes a key and its corresponding value from an object. The argument is a path expression such as `.foo`; paths into arrays, such as `.[0]`, remove the element at that index.
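
A short sketch of `del` with made-up inputs, once with an object key and once with an array index:

```shell
no_foo=$(echo '{"foo":1,"bar":2}' | jq -c 'del(.foo)')   # {"bar":2}
no_first=$(echo '[1,2,3]' | jq -c 'del(.[0])')           # [2,3]
echo "$no_foo $no_first"
```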
### `reverse`
This function reverses an array.
### `contains(element)`
The filter `contains(b)` will produce true if `b` is completely contained within the input. A string B is contained in a string A if B is a substring of A. An array B is contained in an array A if all elements in B are contained in any element in A. An object B is contained in object A if all of the values in B are contained in the value in A with the same key. All other types are assumed to be contained in each other if they are equal.
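
The containment rules above can be sketched as follows, with invented inputs. Note that array containment compares elements with the same rules, so a string element matches by substring.

```shell
str=$(echo '"foobar"' | jq 'contains("bar")')               # true: substring
arr=$(echo '["foobar","baz"]' | jq 'contains(["bar"])')     # true: "bar" is contained in "foobar"
obj=$(echo '{"a":1,"b":2}' | jq 'contains({"a":1})')        # true: same key, equal value
obj_no=$(echo '{"a":1,"b":2}' | jq 'contains({"a":2})')     # false: value differs
echo "$str $arr $obj $obj_no"
```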
### `startswith(str)`
Outputs `true` if the input string (`.`) starts with the given string argument.
### `endswith(str)`
Outputs `true` if the input string (`.`) ends with the given string argument.
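
A minimal sketch of both prefix and suffix checks on a made-up string:

```shell
starts=$(echo '"foobar"' | jq 'startswith("foo")')   # true
ends=$(echo '"foobar"' | jq 'endswith("bar")')       # true
not_end=$(echo '"foobar"' | jq 'endswith("foo")')    # false
echo "$starts $ends $not_end"
```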
### `split(str)`
Splits an input string on the separator argument.
### `join(str)`
Joins the array of elements given as input, using the argument as separator. It is the inverse of `split`: that is, running `split("foo") | join("foo")` over any input string returns said input string.
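
The round trip between `split` and `join` can be sketched with a made-up comma-separated string:

```shell
parts=$(echo '"a,b,c"' | jq -c 'split(",")')              # ["a","b","c"]
joined=$(echo '["a","b","c"]' | jq 'join("-")')           # "a-b-c"
round=$(echo '"a,b,c"' | jq 'split(",") | join(",")')     # "a,b,c" (the original string)
echo "$parts $joined $round"
```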
## Conditionals
### if-then-else-end
`if A then B else C end` will act the same as `B` if `A` produces a value other than false or null, but act the same as `C` otherwise.

`if A then B end` is the same as `if A then B else . end`. That is, the `else` branch is optional, and if absent is the same as `.`. This also applies to `elif` with an absent ending `else` branch.

Checking for false or null is a simpler notion of "truthiness" than is found in JavaScript or [Python](../../dev/programming/languages/Python.md), but it means that you'll sometimes have to be more explicit about the condition you want. You can't test whether, e.g., a string is empty using `if .name then A else B end`; you'll need something like `if .name == "" then A else B end` instead.

If the condition `A` produces multiple results, then `B` is evaluated once for each result that is not false or null, and `C` is evaluated once for each false or null.

More cases can be added to an `if` using the `elif A then B` syntax.

Example: `jq 'if . == 0 then "zero" elif . == 1 then "one" else "many" end'`
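
As a sketch, the example filter can be applied to each element of a made-up array, and the truthiness rule can be checked with an empty string. Both inputs are invented for illustration.

```shell
# Apply the if/elif/else filter to every element of the array.
words=$(echo '[0,1,2]' | jq -c 'map(if . == 0 then "zero" elif . == 1 then "one" else "many" end)')
# An empty string is neither false nor null, so the "then" branch is taken.
empty_name=$(echo '{"name":""}' | jq -r 'if .name then "truthy" else "falsy" end')
echo "$words"        # ["zero","one","many"]
echo "$empty_name"   # truthy
```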
### Alternative Operator `//`
The `//` operator produces all the values of its left-hand side that are neither `false` nor `null`, or, if the left-hand side produces no values other than `false` or `null`, then `//` produces all the values of its right-hand side.

A filter of the form `a // b` produces all the results of `a` that are not `false` or `null`. If `a` produces no results, or no results other than `false` or `null`, then `a // b` produces the results of `b`.

This is useful for providing defaults: `.foo // 1` will evaluate to `1` if there's no `.foo` element in the input, or if `.foo` is `null` or `false`.
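
A minimal sketch of the default-value pattern on made-up inputs. Note that `0` is kept, because only `false` and `null` trigger the right-hand side.

```shell
missing=$(echo '{}' | jq '.foo // 1')               # 1 (.foo is null)
present=$(echo '{"foo":5}' | jq '.foo // 1')        # 5
is_false=$(echo '{"foo":false}' | jq '.foo // 1')   # 1 (false falls through)
is_zero=$(echo '{"foo":0}' | jq '.foo // 1')        # 0 (zero is not false or null)
echo "$missing $present $is_false $is_zero"
```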