How to sort/unique output using jq - json

I have json like below:
% cat example.json
{
"values" : [
{
"title": "B",
"url": "https://B"
},
{
"title": "A",
"url": "https://A"
}
]
}
I want to sort the values by title, i.e. the expected output is:
{
"title": "A",
"url": "https://A"
}
{
"title": "B",
"url": "https://B"
}
I tried the below. It does not work:
% jq '.values[] | sort' example.json
jq: error (at example.json:12): object ({"title":"B...) cannot be sorted, as it is not an array
% jq '.values[] | sort_by(.title)' example.json
jq: error (at example.json:12): Cannot index string with string "title"

If you want to preserve the overall structure, you would use the jq filter:
.values |= sort_by(.title)
If you want to extract .values and sort the array, leave out the "=":
.values | sort_by(.title)
To produce the output as shown in the Q:
.values | sort_by(.title)[]
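A minimal sketch comparing the three filters on the question's data. The shell variable doc is only a stand-in for example.json, and -c asks jq for compact output:

```shell
# The question's example.json, held in a variable for convenience.
doc='{"values":[{"title":"B","url":"https://B"},{"title":"A","url":"https://A"}]}'

# |= updates .values in place, so the enclosing object is kept.
kept=$(printf '%s' "$doc" | jq -c '.values |= sort_by(.title)')

# | extracts .values first, so only the sorted array is printed.
array=$(printf '%s' "$doc" | jq -c '.values | sort_by(.title)')

# A trailing [] emits the array elements one by one, as in the question.
stream=$(printf '%s' "$doc" | jq -c '.values | sort_by(.title)[]')

printf '%s\n' "$kept" "$array" "$stream"
```

The first line printed is the whole document with "A" before "B", the second is the bare sorted array, and the last two lines are the individual objects.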
Uniqueness
There are several ways in which "uniqueness" can be defined, and also several ways in which uniqueness can be achieved.
One option would simply be to use unique_by instead of sort_by; another (with different semantics) would be to use (sort_by(.title)|unique) instead of sort_by(.title).
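To illustrate the difference in semantics, here is a small sketch. The input is hypothetical: it extends the question's data with a second entry titled "A" that has a different url.

```shell
# Hypothetical input: two entries share the title "A" but differ in url.
doc='{"values":[{"title":"B","url":"https://B"},{"title":"A","url":"https://A"},{"title":"A","url":"https://A2"}]}'

# unique_by(.title) keeps one object per distinct title.
by_title=$(printf '%s' "$doc" | jq -c '.values | unique_by(.title) | map(.title)')

# sort_by(.title) | unique drops only objects that are identical in full.
whole=$(printf '%s' "$doc" | jq -c '.values | sort_by(.title) | unique | map(.title)')

printf '%s\n' "$by_title" "$whole"
```

With this input, unique_by(.title) leaves two objects, while sort_by(.title) | unique leaves all three, because no two objects are equal as a whole.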

Related

Why is adding parentheses to a filter in 'jq' producing valid JSON and without parentheses, multiple outputs of objects?

With jq, I would like to set a property within JSON data and let jq output the original JSON with the updated value. I found, more or less due to trial and error, a solution, and want to understand why and how it works.
I have the following JSON data:
{
"notifications": [
{
"source": "observer01",
"channel": "error",
"time": "2021-01-01 01:01:01"
},
{
"source": "observer01",
"channel": "info",
"time": "2021-02-02 02:02:02"
}
]
}
My goal is to update the time property of an object with a specific source and channel (the original JSON is way longer with lots of objects in the notifications array of the same format).
(In the following example, I want to update the time property of observer01 with channel info, so the second object in the example data above.)
My first try, not producing the desired output, was the following jq command:
jq '.notifications[] | select(.source == "observer01" and .channel == "info").time = "NEWTIME"' data.json
That produces the following output:
{
"source": "observer01",
"channel": "error",
"time": "2021-01-01 01:01:01"
}
{
"source": "observer01",
"channel": "info",
"time": "NEWTIME"
}
Which is just a list of the JSON objects within the notifications array. I understand that this can be useful, for example piping the objects to other command line tools.
Now let's try the following jq command, which is the same as above plus one pair of parentheses:
jq '(.notifications[] | select(.source == "observer01" and .channel == "info").time) = "NEWTIME"' data.json
This produces the desired output, the original valid JSON with the updated time property:
{
"notifications": [
{
"source": "observer01",
"channel": "error",
"time": "2021-01-01 01:01:01"
},
{
"source": "observer01",
"channel": "info",
"time": "NEWTIME"
}
]
}
Why is adding the parentheses to the jq filter in the case above producing a different output?
The parentheses just change the precedence. It's documented in man jq:
Parenthesis work as a grouping operator just as in any typical programming language.
jq '(. + 2) * 5'
1
=> 15
Let's have a simpler example:
echo '[{"a":1}, {"a":2}]' | jq '.[] | .a |= .+1'
It outputs
{
"a": 2
}
{
"a": 3
}
because it's interpreted as
                                      ↓         ↓
echo '[{"a":1}, {"a":2}]' | jq '.[] | (.a |= .+1)'
The first filter .[] outputs the elements as separate objects, which are then modified by the second filter.
Placing the parentheses after the first two elements changes the precedence:
                                ↓        ↓
echo '[{"a":1}, {"a":2}]' | jq '(.[] | .a) |= .+1'
and produces a different output:
[
{
"a": 2
},
{
"a": 3
}
]
BTW, this is the same output as from
echo '[{"a":1}, {"a":2}]' | jq '.[].a |= .+1'
It changes the value associated with the "a" key in each element of the array.
Let's compare the two.
.notifications[] | select(...).time = "NEWTIME"
(.notifications[] | select(...).time) = "NEWTIME"
In the first one, the top-level filter is defined by |. The input is an object, and the output is the result of applying select(...).time = "NEWTIME" to each value produced by .notifications[]. In essence, the original object is "lost".
In the second one, the top-level filter is defined by =. x = y outputs its input with one modification, produced in three steps:
1. Determining what the path expression x refers to in the input.
2. Evaluating the filter y on the input. (Even an expression like "NEWTIME" is just a filter: one that ignores its input and returns the string "NEWTIME".)
3. Assigning the result of y to the thing addressed by x.
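The steps listed above can be made visible with jq's path(...) builtin, which prints the paths a path expression refers to. A sketch on the question's data (the variable name doc is only for illustration):

```shell
doc='{"notifications":[{"source":"observer01","channel":"error","time":"2021-01-01 01:01:01"},{"source":"observer01","channel":"info","time":"2021-02-02 02:02:02"}]}'

# Step 1: the path(s) that the left-hand side of = refers to.
paths=$(printf '%s' "$doc" | jq -c 'path(.notifications[] | select(.source == "observer01" and .channel == "info").time)')

# Steps 2 and 3: assign "NEWTIME" at that path; the whole document is output.
updated=$(printf '%s' "$doc" | jq -c '(.notifications[] | select(.source == "observer01" and .channel == "info").time) = "NEWTIME"')

printf '%s\n' "$paths" "$updated"
```

The first line printed is ["notifications",1,"time"], i.e. the time key of the second array element, and the second line is the full document with only that value replaced.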

Merge and Sort JSON using JQ

I have a file containing the following structure and unknown number of results:
{
"results": [
[
{
"field": "AccountID",
"value": "5177497"
},
{
"field": "Requests",
"value": "50900"
}
],
[
{
"field": "AccountID",
"value": "pro"
},
{
"field": "Requests",
"value": "251"
}
]
],
"statistics": {
"Matched": 51498,
"Scanned": 8673577,
"ScannedByte": 2.72400814E10
},
"status": "HOLD"
}
{
"results": [
[
{
"field": "AccountID",
"value": "5577497"
},
{
"field": "Requests",
"value": "51900"
}
],
"statistics": {
"Matched": 51498,
"Scanned": 8673577,
"ScannedByte": 2.72400814E10
},
"status": "HOLD"
}
There are multiple such results, each stored as an array under the results key. The objects are not separated by a comma.
I am trying to print just the "AccountID" sorted by "Requests" in ZSH using jq. I have tried flattening them and using:
jq -r '.results[][0] |.value ' filename
jq -r '.results[][1] |.value ' filename
to get the Account ID and Requests separately and sort them. I don't think bash has a dictionary that can be used. The problem lies in the file, as the field and the value are not stored as a key-value pair but as two separate pairs. Therefore extracting them with the above two lines into separate arrays and sorting by the second array seems a bit too long. I was wondering if there is a way to combine both operations.
The other way is to combine it all into a string and sort it in ascending order. Python would probably have the best solution, but the code needs to be a zsh or bash script.
Solutions that use sed, jq or any other tools available from ZSH are welcome. If there is a way to create a dictionary in bash, please do let me know.
The projected output requirement is just the Account ID vs Request Number.
5577497 has 51900 requests
5177497 has 50900 requests
pro has 251 requests
If you don't mind learning a little jq, it will probably be best to write a small jq program to do what you want.
To get you started, consider the following jq program, which assumes your input is a stream of valid JSON objects with a "results" key similar to your sample:
[inputs | .results[] | map( { (.field) : .value} ) | add]
After making minor changes to your input so that it consists of valid JSON objects, an invocation of jq with the -n option produces an array of AccountID/Requests objects:
[
{
"AccountID": "5177497",
"Requests": "50900"
},
{
"AccountID": "pro",
"Requests": "251"
},
{
"AccountID": "5577497",
"Requests": "51900"
}
]
You could (for example) now use jq's group_by to group these objects by AccountID, and thereby produce the result you want.
jq -S '.results[] | map( { (.field) : .value} ) | add' query-results-aggregate \
| jq -s -c 'group_by(.number_of_requests) | .[]'
This does the trick. Thanks to peak for the guidance.
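For reference, here is a sketch that prints the lines requested in the question in a single jq invocation. It rests on assumptions: the input has been repaired into a stream of valid JSON objects, the statistics and status keys are left out for brevity, and the sort is done numerically with sort_by and reverse rather than with group_by.

```shell
# Assumed input: the question's data, repaired into two valid JSON objects.
data='{"results":[[{"field":"AccountID","value":"5177497"},{"field":"Requests","value":"50900"}],[{"field":"AccountID","value":"pro"},{"field":"Requests","value":"251"}]]}
{"results":[[{"field":"AccountID","value":"5577497"},{"field":"Requests","value":"51900"}]]}'

# -n plus inputs reads every object in the stream; each field/value pair
# list is merged into one {AccountID, Requests} object, then sorted by the
# numeric request count, largest first.
report=$(printf '%s\n' "$data" | jq -n -r '
  [inputs | .results[] | map({(.field): .value}) | add]
  | sort_by(.Requests | tonumber) | reverse
  | .[] | "\(.AccountID) has \(.Requests) requests"')

printf '%s\n' "$report"
```

The tonumber call matters because the Requests values are strings; sorting them as strings would compare them character by character instead of by magnitude.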

Get parent value from json using jq

My json file looks like this;
{
"RQBTYFE86MFC3oL": {
"name": "Nightmode",
"lights": [
"1",
"2",
"3",
"4",
"5",
"7",
"8",
"9",
"10",
"11"
],
"owner": "kvovodUUfn2vlby9h9okdDhv8SrTzkBFjk6kPz2v",
"recycle": false,
"locked": false,
"appdata": {
"version": 1,
"data": "QSXCj_r01_d99"
},
"picture": "",
"lastupdated": "2018-08-08T03:21:39",
"version": 2
}
}
I want to get the 'RQBTYFE86MFC3oL' value by doing a query for 'Nightmode'. So far I came up with this;
jq '.[] | select(.name == "Nightmode")'
This will return me the correct part of the Json but the 'RQBTYFE86MFC3oL' part is stripped. How do I get this part as well?
A simple way to determine the key name(s) corresponding to values satisfying a certain condition is to use to_entries, as explained in the jq manual.
Using this approach, the appropriate jq filter would be:
to_entries[] | select(.value.name == "Nightmode") | .key
with the result:
"RQBTYFE86MFC3oL"
If you want to get the key-value pair, you'd use with_entries as follows:
with_entries( select(.value.name == "Nightmode") )
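A short sketch of what the two functions do, on a trimmed-down input. The second key "AbCdEf" and its "Daymode" entry are made up for illustration:

```shell
# Hypothetical trimmed-down input with two scenes; the key "AbCdEf" is made up.
doc='{"RQBTYFE86MFC3oL":{"name":"Nightmode"},"AbCdEf":{"name":"Daymode"}}'

# to_entries turns the object into an array of {key, value} pairs.
entries=$(printf '%s' "$doc" | jq -c 'to_entries')

# with_entries(f) is to_entries | map(f) | from_entries, so the key survives.
pair=$(printf '%s' "$doc" | jq -c 'with_entries(select(.value.name == "Nightmode"))')

printf '%s\n' "$entries" "$pair"
```

The last line printed is {"RQBTYFE86MFC3oL":{"name":"Nightmode"}}: the matching key together with its value.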
If the input JSON is too large to fit comfortably in memory, then it would make sense to use jq's streaming parser (invoked with the --stream command-line option):
jq --stream '
select(.[1] == "Nightmode" and (first|length) == 2 and first[1] == "name")
| first | first'
This would produce the key name.
The key idea is that the streaming parser produces arrays including pairs of the form: [ARRAYPATH, VALUE] where VALUE is the value at ARRAYPATH.
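To see the events the streaming parser produces, and how the filter above picks the key out of them, consider this sketch on a trimmed-down version of the input (only the name and version keys are kept):

```shell
doc='{"RQBTYFE86MFC3oL":{"name":"Nightmode","version":2}}'

# Each leaf value becomes a [path, value] event; closing events carry only a path.
events=$(printf '%s' "$doc" | jq -c --stream '.')

# The filter from above keeps the event for KEY.name == "Nightmode" and prints KEY.
key=$(printf '%s' "$doc" | jq -r --stream '
  select(.[1] == "Nightmode" and (first|length) == 2 and first[1] == "name")
  | first | first')

printf '%s\n' "$events" "$key"
```

The first event printed is [["RQBTYFE86MFC3oL","name"],"Nightmode"]; it is the only one whose value is "Nightmode" and whose path has length 2 and ends in "name", so the first element of its path is the key being looked for.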
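You want to get the key.
So use the keys function, which returns the top-level keys as an array; 'RQBTYFE86MFC3oL' is the key, and the rest is the value of that key. Note that keys returns every top-level key and does not filter on the name.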
jq 'keys'
Here is a snippet: https://jqplay.org/s/YvpCb2PH42
Reference: https://stedolan.github.io/jq/manual/

jq: How to match one of array and get sibling value

I have some JSON like this:
{
"x": [
{
"name": "Hello",
"id": "211"
},
{
"name": "Goodbye",
"id": "221"
},
{
"name": "Christmas",
"id": "171"
}
],
"y": "value"
}
Using jq, given a name value (e.g. Christmas), how can I get its associated id (i.e. 171)?
I've got as far as being able to check for the presence of the name in one of the array's objects, but I can't work out how to filter it down:
jq -r 'select(.x[].name == "Christmas")'
jq approach:
jq -r '.x[] | select(.name == "Christmas").id' file
171
The function select(boolean_expression) produces its input unchanged if boolean_expression returns true for that input, and produces no output otherwise.
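The behaviour of select can be observed step by step in the following sketch on the question's data. The name "Easter" is an arbitrary value that does not occur in the input:

```shell
doc='{"x":[{"name":"Hello","id":"211"},{"name":"Goodbye","id":"221"},{"name":"Christmas","id":"171"}],"y":"value"}'

# The condition evaluated for each element of .x: false, false, true.
flags=$(printf '%s' "$doc" | jq -c '[.x[] | .name == "Christmas"]')

# select passes through only the element for which the condition is true.
id=$(printf '%s' "$doc" | jq -r '.x[] | select(.name == "Christmas").id')

# No element matches here, so select produces no output at all.
none=$(printf '%s' "$doc" | jq -r '.x[] | select(.name == "Easter").id')

printf 'flags=%s id=%s none=[%s]\n' "$flags" "$id" "$none"
```

This also explains the attempt in the question: select(.x[].name == "Christmas") is applied to the whole document, so when the condition is true for one element, it is the whole document that is passed through.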
It can also be done like this:
jq '.x[] | select(.name == "Christmas").id'
You can also try this online at jq play.

Perform string manipulation on a value and return the original JSON document with jq

In my JSON document I have a string that I need manipulated and then have the entire document returned with the 'fixed' values.
The input document is:
{
"records" : [
{
"time": "123456789000"
},
{
"time": "123456789000"
}
]
}
I want to find the "time" key and replace the string by dropping off the last 3 chars. The resulting document would be:
{
"records" : [
{
"time": "123456789"
},
{
"time": "123456789"
}
]
}
I've been trying to understand the jq query syntax but I'm not coming right. I'm still struggling to return the whole document when filtering on a specific value. All I have so far is:
.records[] | select(.time | contains("123456789000"))
Here is a solution using |= and string slicing:
.records[].time |= .[:-3]
Sample Run (assuming data in data.json)
$ jq -M '.records[].time |= .[:-3]' data.json
{
"records": [
{
"time": "123456789"
},
{
"time": "123456789"
}
]
}
Try it online at jqplay.org
With jq's sub() function:
jq '.records[].time |= sub("[0-9]{3}$";"")' file
The output:
{
"records": [
{
"time": "123456789"
},
{
"time": "123456789"
}
]
}
Or even simpler, by dividing the time value by 1000:
jq '.records[].time |= (tonumber / 1000 | tostring)' file
The following works with jq version 1.4 or later:
jq '.records[].time |= .[:-3]' file.json
(The expression .[:-3] is short for .[0:-3]; the negative integer here counts from the right.)
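A quick way to check the slicing behaviour without any input file is the following sketch, which applies the slices to a string literal:

```shell
# -n runs the filter without reading input; -r prints raw strings.
short=$(jq -n -r '"123456789000" | .[:-3]')   # drop the last three characters
full=$(jq -n -r '"123456789000" | .[0:-3]')   # same slice, written out in full
tail3=$(jq -n -r '"123456789000" | .[-3:]')   # only the last three characters

printf '%s %s %s\n' "$short" "$full" "$tail3"
```

The first two results are identical (123456789), and the third is the part that was cut off (000).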
With jq 1.3, the following filter would work in your particular case:
.records[].time |= (tonumber | ./1000 | tostring)