jq simple json object merge/combination - json

a.json
{"a": 1}
b.json
{"b": 1}
Desired outcome
{"a": 1, "b": 1}
jq -s "." a.json b.json
[
{
"a": 1
},
{
"b": 1
}
]
It's wrapped in an array
jq "." a.json b.json
{
"a": 1
}
{
"b": 1
}
That's not even valid json
Is jq the wrong tool here? What is more appropriate?

Try:
jq -s 'add' a.json b.json
Result:
{
"a": 1,
"b": 1
}

In some cases it may be desirable to avoid “slurping” the objects, as that requires more memory than necessary.
In any case, to accomplish the task economically, use -n in conjunction with inputs as follows:
reduce inputs as $i ({}; . + $i)
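Spelled out as a full command, the non-slurping variant might look like the sketch below. The scratch directory and the variable names are only there for illustration; a.json and b.json hold the sample objects from the question.

```shell
# Recreate the two sample files in a scratch directory (illustrative setup).
tmp=$(mktemp -d)
printf '{"a": 1}\n' > "$tmp/a.json"
printf '{"b": 1}\n' > "$tmp/b.json"

# -n stops jq from consuming the first input implicitly, so `inputs`
# sees every object; reduce folds them into one object with `+`.
merged=$(jq -c -n 'reduce inputs as $i ({}; . + $i)' "$tmp/a.json" "$tmp/b.json")
echo "$merged"   # {"a":1,"b":1}

rm -rf "$tmp"
```

Only one input object is held in memory at a time next to the accumulated result, which is the point of avoiding -s.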
sigma/1
If you don't mind that sigma(empty) evaluates to null, you could define a polymorphic sigma as follows:
def sigma(s): reduce s as $x (null; . + $x);
This works on streams of numbers, streams of objects, streams of arrays, and streams of strings, and so would be suitable for your standard library.
In any case, with this def, for the task at hand, you could write: sigma(inputs).
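As a rough illustration of the polymorphism, the same definition can be tried on objects, numbers and strings. The shell variable sigma_def is just a convenience for this sketch, not something jq provides.

```shell
# The definition from above, kept in a shell variable so it can be reused.
sigma_def='def sigma(s): reduce s as $x (null; . + $x);'

# Objects read from the input stream are merged ...
objs=$(printf '{"a": 1}\n{"b": 1}\n' | jq -c -n "$sigma_def sigma(inputs)")
# ... while numbers are summed and strings are concatenated.
nums=$(jq -c -n "$sigma_def sigma(1, 2, 3)")
strs=$(jq -c -n "$sigma_def sigma(\"a\", \"b\")")

echo "$objs"   # {"a":1,"b":1}
echo "$nums"   # 6
echo "$strs"   # "ab"
```

This works because null acts as the identity for + in jq, whatever the type of the other operand.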

Related

2D array formatting in JQ

I'm using jq to reformat my files to be "nice and pretty". I'm using basic '.', but this is not working as I expect.
Let's say I have structure like this:
{ "foo": { "bar": { "baz": 123 },
"bislot":
[[1,2,3,4,5],
[6,7,8,9,10],
[11,12,13,14,15]]
}}
and after reformatting with jq . I'm getting output like this:
{
"foo": {
"bar": {
"baz": 123
},
"bislot": [
[
1,
2,
3,
4,
5
],
[
6,
7,
8,
9,
10
],
[
11,
12,
13,
14,
15
]
]
}
}
What I want to achieve is something like this:
{
"foo": {
"bar": {
"baz": 123
},
"bislot":
[[1,2,3,4,5],
[6,7,8,9,10],
[11,12,13,14,15]]
}
}
So for 2D arrays every item(array) should be in one line. Any ideas how can I do this?
As it has been pointed out already, from a JSON processor's perspective the formatting doesn't really matter. As you have noticed, jq by default pretty-prints its output. It also offers a --compact-output (or -c) option to output each JSON document into one line, which in your case would result in
jq -c . file.json
{"foo":{"bar":{"baz":123},"bislot":[[1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15]]}}
If you care more about aesthetics than whether it is logically the same document or not, an easy way could be to encode all arrays within arrays as JSON, thus becoming strings, which when pretty-printing the whole document with jq won't be broken down anymore:
jq 'walk(arrays[] |= (arrays |= @json))' file.json
{
"foo": {
"bar": {
"baz": 123
},
"bislot": [
"[1,2,3,4,5]",
"[6,7,8,9,10]",
"[11,12,13,14,15]"
]
}
}
Going further, you could take the output from the previous call, read it line by line back into jq, decode lines with JSON-encoded arrays using fromjson, and output the modified lines (together masquerading as JSON document). This approach uses regex matching with look-ahead and look-behind to identify (the positions of) JSON-encoded arrays.
jq 'walk(arrays[] |= (arrays |= @json))' file.json \
| jq -Rr '(
match("^\\s+(?=\"\\[)").length as $a
| match("(?<=\\]\"),?$").offset as $b
| .[:$a] + (.[$a:$b] | fromjson) + .[$b:]
) // .'
{
"foo": {
"bar": {
"baz": 123
},
"bislot": [
[1,2,3,4,5],
[6,7,8,9,10],
[11,12,13,14,15]
]
}
}
Disclaimer: Although this outputs a valid JSON document for your sample input, this should still be regarded as an aesthetic printout. If the validity of your JSON document is a first-class concern, the actual formatting shouldn't really matter. jq's default pretty-printer, and its compact equivalent generally should fit most needs.
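If validity is a concern, one way to check a hand-formatted or regex-formatted file is to compare it with the original after normalising both. A minimal sketch, assuming a hypothetical helper named same_json and two scratch files with shortened sample data:

```shell
# same_json is a hypothetical helper: it succeeds when both files parse
# to the same JSON value, regardless of whitespace or key order.
same_json() {
  a=$(jq -cS . "$1") || return 1   # fails if the first file is not valid JSON
  b=$(jq -cS . "$2") || return 1   # fails if the second file is not valid JSON
  [ "$a" = "$b" ]
}

tmp=$(mktemp -d)
printf '{"foo":{"bar":{"baz":123},"bislot":[[1,2,3],[4,5,6]]}}\n' > "$tmp/orig.json"
cat > "$tmp/pretty.json" <<'EOF'
{
  "foo": {
    "bar": {
      "baz": 123
    },
    "bislot": [
      [1,2,3],
      [4,5,6]
    ]
  }
}
EOF

if same_json "$tmp/orig.json" "$tmp/pretty.json"; then
  check=same
else
  check=different
fi
echo "$check"
rm -rf "$tmp"
```

The -S flag sorts keys and -c removes the layout, so only the parsed content is compared.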
You can consider combining jq with another tool (here perl):
jq . input.json | perl -0 -pe 's/(\[\s*)
(((\[[^\[\]]+\])[^\[\]]*)+)
/$v1=$1; $v2 = $2;
$v2 =~ s{\[[^\[\]]+\]}
{$v=$&,($v =~ s[\s][]g),$v}egsx;
$v1 . $v2
/egsx'

Get value of JSON object using jq --stream

I'm trying to extract the value of a JSON object using jq --stream, because the real data can be multiple gigabytes in size.
This is the JSON I'm using for my tests, where I want to extract the value of item:
{
"other": "content here",
"item": {
"A": {
"B": "C"
}
},
"test": "test"
}
The jq options I'm using:
jq --stream --null-input 'fromstream(inputs | select(.[0][0] == "item"))[]' example.json
However, I don't get any output with this command.
A strange thing I found is that when removing the object after the item the above command seems to work:
{
"other": "content here",
"item": {
"A": {
"B": "C"
}
}
}
The result looks as expected:
❯ jq --stream --null-input 'fromstream(inputs | select(.[0][0] == "item"))[]' example.json
{
"A": {
"B": "C"
}
}
But as I cannot control the input JSON this is not the solution.
I'm using jq version 1.6 on macOS.
You didn't truncate the stream, therefore after filtering it to only include the parts below .item, fromstream is missing the final back-tracking item [["item"]]. Either add it manually at the end (not recommended, this would also include the top-level object in the result), or, much simpler, use 1 | truncate_stream to strip the first level altogether:
jq --stream --null-input '
fromstream(1 | truncate_stream(inputs | select(.[0][0] == "item")))
' example.json
{
"A": {
"B": "C"
}
}
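To see why the closing event is missing, it can help to print the filtered events on their own. The sketch below recreates the sample file in a scratch directory (an illustrative setup) and shows what select lets through:

```shell
tmp=$(mktemp -d)
cat > "$tmp/example.json" <<'EOF'
{
  "other": "content here",
  "item": { "A": { "B": "C" } },
  "test": "test"
}
EOF

# Print only the streaming events whose path starts with "item".
events=$(jq -c --stream 'select(.[0][0] == "item")' "$tmp/example.json")
echo "$events"
# [["item","A","B"],"C"]
# [["item","A","B"]]
# [["item","A"]]

rm -rf "$tmp"
```

No [["item"]] event follows, because the closing event of the top-level object carries its last key, which is "test" here. Without a closing event of length one, fromstream never emits anything; stripping the first path level turns [["item","A"]] into [["A"]], which serves that purpose.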
Alternatively, you can use reduce and setpath to build up the result object yourself:
jq --stream --null-input '
reduce inputs as $in (null;
if $in | .[0][0] == "item" and has(1) then setpath($in[0];$in[1]) else . end
)
' example.json
{
"item": {
"A": {
"B": "C"
}
}
}
To remove the top level object, either filter for .item at the end, or, similarly to truncate_stream, remove the path's first item using [1:] to strip the first level:
jq --stream --null-input '
reduce inputs as $in (null;
if $in | .[0][0] == "item" and has(1) then setpath($in[0][1:];$in[1]) else . end
)
' example.json
{
"A": {
"B": "C"
}
}

How would you collect the first few entries of a list from a large json file using jq?

I am trying to process a large json file for testing purposes that has a few thousand entries. The json contains a long list of data that is too large for me to process in one go. Using jq, is there an easy way to get a valid snippet of the json that only contains the first few entries from the data list? For example, is there a query that would look at the whole json file and return a valid json that only contains the first 4 entries from data? Thank you!
{
"info":{
"name":"some-name"
},
"data":[
{...},
{...},
{...},
{...}
]
}
Based on your snippet, the relevant jq would be:
.data |= .[:4]
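As a complete command this could look like the following sketch. The file contents are made-up sample data and the scratch directory is only for illustration.

```shell
tmp=$(mktemp -d)
cat > "$tmp/input.json" <<'EOF'
{"info": {"name": "some-name"},
 "data": [{"a":1},{"b":2},{"c":3},{"d":4},{"e":5},{"f":6}]}
EOF

# Keep everything else as it is and cut .data down to its first four entries.
first4=$(jq -c '.data |= .[:4]' "$tmp/input.json")
echo "$first4"
# {"info":{"name":"some-name"},"data":[{"a":1},{"b":2},{"c":3},{"d":4}]}

rm -rf "$tmp"
```

Note that this form parses the whole file into memory before slicing.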
Here's an example using the --stream option:
$ cat input.json
{
"info": {"name": "some-name"},
"data": [
{"a":1},
{"b":2},
{"c":3},
{"d":4},
{"e":5},
{"f":6},
{"g":7}
]
}
jq --stream -n '
reduce (
inputs | select(has(1) and (.[0] | .[0] == "data" and .[1] < 4))
) as $in (
{}; .[$in[0][-1]] = $in[1]
)
' input.json
{
"a": 1,
"b": 2,
"c": 3,
"d": 4
}
Note: Using limit would have been more efficient in this case, but I tried to be more generic for the purpose of scalability.
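A limit-based version might look roughly like the sketch below. It assumes jq 1.6 or later (where 2 | truncate_stream strips two path levels) and reuses the sample data from above in a scratch file.

```shell
tmp=$(mktemp -d)
cat > "$tmp/input.json" <<'EOF'
{
  "info": {"name": "some-name"},
  "data": [{"a":1},{"b":2},{"c":3},{"d":4},{"e":5},{"f":6},{"g":7}]
}
EOF

# Strip the ["data", index] prefix so each array element becomes its own
# top-level value, rebuild the elements with fromstream, stop after four.
first4=$(jq -c -n --stream '
  [limit(4; fromstream(2 | truncate_stream(inputs | select(.[0][0] == "data"))))]
' "$tmp/input.json")
echo "$first4"   # [{"a":1},{"b":2},{"c":3},{"d":4}]

rm -rf "$tmp"
```

Unlike the reduce above, this keeps the four entries as separate objects instead of merging their keys into one.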

jq: How can a nested object be sliced to include the parent-key hierarchy?

Using Bash and jq, if I have a Bash variable filter F of the form .<key1>.<key2>...<keyN> and I want to slice a Bash variable JSON object O so that the result is just that slice of the object including all keys in F, how can this be done with jq?
For example, suppose:
O='
{
"a":
{
"b":
{
"c": { "p":1 },
"x": 1
},
"x": 2
},
"x": 3
}'
Then, doing:
F='.a.b.c'; jq -r "$F" <<<"$O"
results in:
{
"p": 1
}
But, I want the slice to include parent key hierarchy.
Inelegant Solution
I have come up with a solution, but it involves 2 calls to jq:
F='.a.b.c'; S="$(jq -r "$F" <<<"$O")"; jq --null-input -r "$F |= $S"
that results in:
{
"a": {
"b": {
"c": {
"p": 1
}
}
}
}
The solution must work for any valid O and F Bash variable where O stores a JSON object and F is a simple filter of key names only as described above. For example:
F='.a.b'; S="$(jq -r "$F" <<<"$O")"; jq --null-input -r "$F |= $S"
results in:
{
"a": {
"b": {
"c": {
"p": 1
},
"x": 1
}
}
}
Can slicing an object with a key-hierarchy filter be done more simply in jq?
Provided $F is a valid jq path expression (i.e., so that jq -n "$F" works):
jq "$F as \$v | null | $F |= \$v" <<< "$O"
(I included the |= from your solution to show the similarity, but here you could drop the |.)
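Put together with the sample data, a single-call version using plain = could be sketched as follows. The second variant is an alternative of my own, not part of the answer above: it passes F as data with --arg and assumes that no key contains a dot.

```shell
O='{"a": {"b": {"c": {"p": 1}, "x": 1}, "x": 2}, "x": 3}'
F='.a.b.c'

# $F is spliced into the program text, so it must be a trusted path expression.
slice=$(printf '%s\n' "$O" | jq -c "$F as \$v | null | $F = \$v")
echo "$slice"    # {"a":{"b":{"c":{"p":1}}}}

# Alternative sketch: turn ".a.b.c" into ["a","b","c"] and use getpath/setpath,
# which avoids interpolating shell variables into the jq program.
slice2=$(printf '%s\n' "$O" | jq -c --arg f "$F" '
  ($f | ltrimstr(".") | split(".")) as $p
  | getpath($p) as $v
  | null | setpath($p; $v)')
echo "$slice2"   # {"a":{"b":{"c":{"p":1}}}}
```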

Json parsing on cli using jq

Let's say I have the below json object:
{
"d": {
"e": {
"bar": 2
}
},
"a": {
"b": {
"c": {
"foo": 1
}
}
}
}
I want to get the value foo without typing '.a.b.c.foo'
I realize I can do...
echo '{ "a":{"b":{"c":{ "foo":1}}},"d":{"e":{"bar":2}}}' | jq '.[][][].foo' but is there a recursive wildcard in jq, like **? I know for sure jq doesn't support *. Is there a way to have jq support JSONPath?
Or maybe even just another cli tool that does support JSONPath?
In jq 1.4 you could do this:
$ jq '..|.foo?' file.json
If you're stuck with 1.3 you could use
$ jq 'recurse(if type == "array" or type == "object" then .[] else empty end) | if type == "object" then . else empty end | .foo' file.json
which is a bit of a mouthful... That's why 1.4 has .., which recurses down through all iterables in ., and the ? operator, which doesn't bother indexing that which can't be.
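One thing to be aware of: ..|.foo? also emits null for every object that has no foo key. The sketch below shows this on the sample from the question and one possible way to keep only real matches; the objects and has filters assume a reasonably recent jq.

```shell
json='{ "a":{"b":{"c":{ "foo":1}}},"d":{"e":{"bar":2}}}'

# Every object visited by .. answers .foo, most of them with null.
all=$(printf '%s\n' "$json" | jq -c '[.. | .foo?]')
echo "$all"    # [null,null,null,1,null,null]

# Restricting to objects that actually have the key keeps only real values.
only=$(printf '%s\n' "$json" | jq -c '[.. | objects | select(has("foo")) | .foo]')
echo "$only"   # [1]
```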