How to "number" array items?

Considering input:
[
{
"a": 1
},
{
"a": 2
},
{
"a": 7
}
]
How do I add a new field to each object whose value is the object's index in the array? The desired output:
[
{
"a": 1,
"index": 0
},
{
"a": 2,
"index": 1
},
{
"a": 7,
"index": 2
}
]

Using reduce, without disassembling/reassembling the input:
reduce range(length) as $index (.; .[$index] += {$index})
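As a quick check of the reduce filter above, here is a minimal sketch that feeds it the sample input from the question. The shell variable names (input, result) are only for illustration, and -c is used so the output fits on one line.

```shell
# Sample input from the question, written compactly.
input='[{"a":1},{"a":2},{"a":7}]'

# range(length) yields 0, 1, 2; {$index} is shorthand for {"index": $index}.
result=$(printf '%s' "$input" \
  | jq -c 'reduce range(length) as $index (.; .[$index] += {$index})')

echo "$result"
# [{"a":1,"index":0},{"a":2,"index":1},{"a":7,"index":2}]
```

The array is updated in place one element at a time, which is why nothing has to be taken apart and rebuilt.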

Store the structure into a variable, then use keys to get the indices, retrieve the corresponding object from the variable using the index and add the index to it:
jq '[ . as $d | keys[] | $d[.] + {index:.} ]' file.json
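To see why this works, it may help to look at what keys returns for an array. The sketch below runs both the intermediate step and the full filter on the question's sample data; the variable names are illustrative.

```shell
input='[{"a":1},{"a":2},{"a":7}]'

# For an array, keys returns the list of valid indices.
indices=$(printf '%s' "$input" | jq -c 'keys')

# Each index is used twice: to look up the object in $d and as the new field value.
numbered=$(printf '%s' "$input" \
  | jq -c '[ . as $d | keys[] | $d[.] + {index:.} ]')

echo "$indices"    # [0,1,2]
echo "$numbered"   # [{"a":1,"index":0},{"a":2,"index":1},{"a":7,"index":2}]
```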

to_entries converts its input into an array of key/value pairs; for an array, the keys are the element indices.
That makes it a direct and readable way to attach the index to each item.
input file
// file.json
[
{
"a": 1
},
{
"a": 2
},
{
"a": 7
}
]
commands
jq 'to_entries | map(.value+{index:.key})' file.json
results
[
{
"a": 1,
"index": 0
},
{
"a": 2,
"index": 1
},
{
"a": 7,
"index": 2
}
]

Here's also an alternative (non-jq) solution, using jtc:
bash $ <input.json jtc -w'[:]<I>k' -i'{"index":{I}}'
[
{
"a": 1,
"index": 0
},
{
"a": 2,
"index": 1
},
{
"a": 7,
"index": 2
}
]
bash $
PS. I'm a developer of jtc - unix JSON processor
PPS. The disclaimer is required by SO.

Try: Array[index_of_object].property = value
For example: array[0].a = 1

Related

How to bring outer values inside an array iteration

I have a JSON in the shape
[
{
a:1,
b: [2,3]
},
{
a:4,
b: [5,6]
}
]
That I want to transform in the shape
[
[
{
a: 1,
b: 2,
},
{
a: 1,
b: 3,
},
],
[
{
a: 4,
b: 5,
},
{
a: 4,
b: 6,
},
],
]
That is, I want to bring the value of the field a inside the array.
How can I do this with jq?
Try this:
jq 'map([{a,b:.b[]}])'
As @pmf pointed out, you can also update the object:
jq 'map([.b=.b[]])'
You could iterate over the items using variable binding with as.
Then either update .b to have the value of its items using the update operator |=:
jq 'map([.b[] as $b | .b |= $b])'
Or create completely new objects from data collected:
jq 'map(.a as $a | [.b[] as $b | {$a,$b}])'
[
[
{
"a": 1,
"b": 2
},
{
"a": 1,
"b": 3
}
],
[
{
"a": 4,
"b": 5
},
{
"a": 4,
"b": 6
}
]
]
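The four filters given in the answers above can be cross-checked against each other. The sketch below assumes the question's data written as valid JSON (with quoted keys) and simply counts how many filters deviate from the expected result; the variable names are illustrative.

```shell
input='[{"a":1,"b":[2,3]},{"a":4,"b":[5,6]}]'
expected='[[{"a":1,"b":2},{"a":1,"b":3}],[{"a":4,"b":5},{"a":4,"b":6}]]'

mismatches=0
for filter in \
  'map([{a,b:.b[]}])' \
  'map([.b=.b[]])' \
  'map([.b[] as $b | .b |= $b])' \
  'map(.a as $a | [.b[] as $b | {$a,$b}])'
do
  # Each filter emits one object per item of .b and collects them per input object.
  actual=$(printf '%s' "$input" | jq -c "$filter")
  [ "$actual" = "$expected" ] || mismatches=$((mismatches + 1))
  echo "$filter => $actual"
done

echo "mismatches: $mismatches"
```

All four rely on the same idea: .b[] produces several outputs, and the surrounding expression is evaluated once for each of them.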

jq parse json with stream flag into different json file

I have a JSON file called data.json, shown below. I want to parse it with jq in streaming mode (that is, without loading the whole file into memory), because the real data is 20 GB.
As I understand it, streaming mode in jq is enabled with the --stream flag, which makes jq parse the JSON file incrementally.
{
"id": {
"bioguide": "E000295",
"thomas": "02283",
"govtrack": 412667,
"opensecrets": "N00035483",
"lis": "S376"
},
"bio": {
"gender": "F",
"birthday": "1970-07-01"
},
"tooldatareports": [
{
"name": "A",
"tooldata": [
{
"toolid": 12345,
"data": [
{
"time": "2021-01-01",
"value": 1
},
{
"time": "2021-01-02",
"value": 10
},
{
"time": "2021-01-03",
"value": 5
}
]
},
{
"toolid": 12346,
"data": [
{
"time": "2021-01-01",
"value": 10
},
{
"time": "2021-01-02",
"value": 100
},
{
"time": "2021-01-03",
"value": 50
}
]
}
]
}
]
}
I would like the final result to look like the following:
a list of two dicts, each with a data list whose entries have two keys (time and value).
[
{
"data": [
{
"time": "2021-01-01",
"value": 1
},
{
"time": "2021-01-02",
"value": 10
},
{
"time": "2021-01-03",
"value": 5
}
]
},
{
"data": [
{
"time": "2021-01-01",
"value": 10
},
{
"time": "2021-01-02",
"value": 100
},
{
"time": "2021-01-03",
"value": 50
}
]
}
]
So far I have used the command below, but its result still differs from what I want.
cat data.json | jq --stream 'select(.[0][0]=="tooldatareports" and .[0][2]=="tooldata" and .[1]!=null) | .'
The result is not a list of dicts:
each time and value is emitted separately, in its own list.
Does anyone have any idea how to fix this?
Here's a solution that does not use truncate_stream:
jq -n --stream '
[fromstream(
inputs
| (.[0] | index("data")) as $ix
| select($ix)
| .[0] |= .[$ix:] )]
' input.json
The following produces the required output:
jq -n --stream '
[{data: fromstream(5|truncate_stream(inputs))}]
' input.json
Needless to say, there are other variations ...
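Both variants can be tried on a much smaller document. The sketch below assumes a made-up input in which the "data" arrays sit three path components deep ("r", an array index, "data"), so the truncation depth is 3 rather than the 5 needed for the question's layout ("tooldatareports", index, "tooldata", index, "data"). The key names r and v are hypothetical.

```shell
input='{"r":[{"data":[{"v":1}]},{"data":[{"v":2},{"v":3}]}]}'

# Variant 1: keep only events whose path contains "data", then cut the path there.
by_index=$(printf '%s' "$input" | jq -cn --stream '
  [fromstream(
     inputs
     | (.[0] | index("data")) as $ix
     | select($ix)
     | .[0] |= .[$ix:] )]')

# Variant 2: drop the first 3 path components and re-wrap each rebuilt array.
by_truncate=$(printf '%s' "$input" | jq -cn --stream '
  [{data: fromstream(3|truncate_stream(inputs))}]')

echo "$by_index"
echo "$by_truncate"
# both print: [{"data":[{"v":1}]},{"data":[{"v":2},{"v":3}]}]
```

In the second variant the "data" key is cut away together with the prefix, which is why it has to be added back with {data: ...}.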
Here's a step-by-step explanation of peak's answers.
First let's convert the json to stream.
https://jqplay.org/s/VEunTmDSkf
[["id","bioguide"],"E000295"]
[["id","thomas"],"02283"]
[["id","govtrack"],412667]
[["id","opensecrets"],"N00035483"]
[["id","lis"],"S376"]
[["id","lis"]]
[["bio","gender"],"F"]
[["bio","birthday"],"1970-07-01"]
[["bio","birthday"]]
[["tooldatareports",0,"name"],"A"]
[["tooldatareports",0,"tooldata",0,"toolid"],12345]
[["tooldatareports",0,"tooldata",0,"data",0,"time"],"2021-01-01"]
[["tooldatareports",0,"tooldata",0,"data",0,"value"],1]
[["tooldatareports",0,"tooldata",0,"data",0,"value"]]
[["tooldatareports",0,"tooldata",0,"data",1,"time"],"2021-01-02"]
[["tooldatareports",0,"tooldata",0,"data",1,"value"],10]
[["tooldatareports",0,"tooldata",0,"data",1,"value"]]
[["tooldatareports",0,"tooldata",0,"data",2,"time"],"2021-01-03"]
[["tooldatareports",0,"tooldata",0,"data",2,"value"],5]
[["tooldatareports",0,"tooldata",0,"data",2,"value"]]
[["tooldatareports",0,"tooldata",0,"data",2]]
[["tooldatareports",0,"tooldata",0,"data"]]
[["tooldatareports",0,"tooldata",1,"toolid"],12346]
[["tooldatareports",0,"tooldata",1,"data",0,"time"],"2021-01-01"]
[["tooldatareports",0,"tooldata",1,"data",0,"value"],10]
[["tooldatareports",0,"tooldata",1,"data",0,"value"]]
[["tooldatareports",0,"tooldata",1,"data",1,"time"],"2021-01-02"]
[["tooldatareports",0,"tooldata",1,"data",1,"value"],100]
[["tooldatareports",0,"tooldata",1,"data",1,"value"]]
[["tooldatareports",0,"tooldata",1,"data",2,"time"],"2021-01-03"]
[["tooldatareports",0,"tooldata",1,"data",2,"value"],50]
[["tooldatareports",0,"tooldata",1,"data",2,"value"]]
[["tooldatareports",0,"tooldata",1,"data",2]]
[["tooldatareports",0,"tooldata",1,"data"]]
[["tooldatareports",0,"tooldata",1]]
[["tooldatareports",0,"tooldata"]]
[["tooldatareports",0]]
[["tooldatareports"]]
Now do .[0] to extract the path portion of stream.
https://jqplay.org/s/XdPrp8RuEj
["id","bioguide"]
["id","thomas"]
["id","govtrack"]
["id","opensecrets"]
["id","lis"]
["id","lis"]
["bio","gender"]
["bio","birthday"]
["bio","birthday"]
["tooldatareports",0,"name"]
["tooldatareports",0,"tooldata",0,"toolid"]
["tooldatareports",0,"tooldata",0,"data",0,"time"]
["tooldatareports",0,"tooldata",0,"data",0,"value"]
["tooldatareports",0,"tooldata",0,"data",0,"value"]
["tooldatareports",0,"tooldata",0,"data",1,"time"]
["tooldatareports",0,"tooldata",0,"data",1,"value"]
["tooldatareports",0,"tooldata",0,"data",1,"value"]
["tooldatareports",0,"tooldata",0,"data",2,"time"]
["tooldatareports",0,"tooldata",0,"data",2,"value"]
["tooldatareports",0,"tooldata",0,"data",2,"value"]
["tooldatareports",0,"tooldata",0,"data",2]
["tooldatareports",0,"tooldata",0,"data"]
["tooldatareports",0,"tooldata",1,"toolid"]
["tooldatareports",0,"tooldata",1,"data",0,"time"]
["tooldatareports",0,"tooldata",1,"data",0,"value"]
["tooldatareports",0,"tooldata",1,"data",0,"value"]
["tooldatareports",0,"tooldata",1,"data",1,"time"]
["tooldatareports",0,"tooldata",1,"data",1,"value"]
["tooldatareports",0,"tooldata",1,"data",1,"value"]
["tooldatareports",0,"tooldata",1,"data",2,"time"]
["tooldatareports",0,"tooldata",1,"data",2,"value"]
["tooldatareports",0,"tooldata",1,"data",2,"value"]
["tooldatareports",0,"tooldata",1,"data",2]
["tooldatareports",0,"tooldata",1,"data"]
["tooldatareports",0,"tooldata",1]
["tooldatareports",0,"tooldata"]
["tooldatareports",0]
["tooldatareports"]
Let me first quickly explain index/1.
Take the event [["tooldatareports",0,"tooldata",0,"data",0,"time"],"2021-01-01"]. Applied to its path ["tooldatareports",0,"tooldata",0,"data",0,"time"], index("data") is 4, since that is the index of the first occurrence of "data".
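A minimal sketch of index/1 on a single path, including the case where the element is absent ("missing" is just a made-up value that does not occur in the path):

```shell
path='["tooldatareports",0,"tooldata",0,"data",0,"time"]'

# Position of the first "data" element in the path array.
hit=$(printf '%s' "$path" | jq 'index("data")')

# An element that does not occur gives null.
miss=$(printf '%s' "$path" | jq 'index("missing")')

echo "$hit"    # 4
echo "$miss"   # null
```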
Knowing that let's now do .[0] | index("data").
https://jqplay.org/s/ny0bV1xEED
null
null
null
null
null
null
null
null
null
null
null
4
4
4
4
4
4
4
4
4
4
4
null
4
4
4
4
4
4
4
4
4
4
4
null
null
null
null
As you can see, in our case the indexes are either 4 or null. We want to keep only the inputs whose index is not null, that is, the inputs that have "data" as part of their path.
(.[0] | index("data")) as $ix | select($ix) does just that. Remember that $ix is computed separately for each input, so only the inputs whose $ix is not null are displayed.
For example, see https://jqplay.org/s/NwcD7_USZE. Here inputs | select(null) gives no output, but inputs | select(true) outputs every input.
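The filtering step can be tried on a handful of events in isolation. The sketch below takes four events copied from the stream above, two of which have "data" in their path, and passes them through the same select. Note that in jq only null and false count as false, so any numeric index (even 0) would pass.

```shell
# Four stream events, one per line; only the middle two contain "data".
events='[["id","lis"],"S376"]
[["tooldatareports",0,"tooldata",0,"data",0,"value"],1]
[["tooldatareports",0,"tooldata",0,"data",0,"value"]]
[["tooldatareports",0]]'

kept=$(printf '%s\n' "$events" \
  | jq -c '(.[0] | index("data")) as $ix | select($ix)')

echo "$kept"
# [["tooldatareports",0,"tooldata",0,"data",0,"value"],1]
# [["tooldatareports",0,"tooldata",0,"data",0,"value"]]
```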
This is the filtered stream:
https://jqplay.org/s/SgexvhtaGe
[["tooldatareports",0,"tooldata",0,"data",0,"time"],"2021-01-01"]
[["tooldatareports",0,"tooldata",0,"data",0,"value"],1]
[["tooldatareports",0,"tooldata",0,"data",0,"value"]]
[["tooldatareports",0,"tooldata",0,"data",1,"time"],"2021-01-02"]
[["tooldatareports",0,"tooldata",0,"data",1,"value"],10]
[["tooldatareports",0,"tooldata",0,"data",1,"value"]]
[["tooldatareports",0,"tooldata",0,"data",2,"time"],"2021-01-03"]
[["tooldatareports",0,"tooldata",0,"data",2,"value"],5]
[["tooldatareports",0,"tooldata",0,"data",2,"value"]]
[["tooldatareports",0,"tooldata",0,"data",2]]
[["tooldatareports",0,"tooldata",0,"data"]]
[["tooldatareports",0,"tooldata",1,"data",0,"time"],"2021-01-01"]
[["tooldatareports",0,"tooldata",1,"data",0,"value"],10]
[["tooldatareports",0,"tooldata",1,"data",0,"value"]]
[["tooldatareports",0,"tooldata",1,"data",1,"time"],"2021-01-02"]
[["tooldatareports",0,"tooldata",1,"data",1,"value"],100]
[["tooldatareports",0,"tooldata",1,"data",1,"value"]]
[["tooldatareports",0,"tooldata",1,"data",2,"time"],"2021-01-03"]
[["tooldatareports",0,"tooldata",1,"data",2,"value"],50]
[["tooldatareports",0,"tooldata",1,"data",2,"value"]]
[["tooldatareports",0,"tooldata",1,"data",2]]
[["tooldatareports",0,"tooldata",1,"data"]]
Before we go further let's review update assignment.
Have a look at https://jqplay.org/s/g4P6j8f9FG
Let's say we have input [["tooldatareports",0,"tooldata",0,"data",0,"time"],"2021-01-01"].
Then the filter .[0] |= .[4:] produces [["data",0,"time"],"2021-01-01"].
Why?
Remember that the right-hand side (.[4:]) is evaluated in the context of the left-hand side (.[0]). So in this case it has the effect of updating the path ["tooldatareports",0,"tooldata",0,"data",0,"time"] to ["data",0,"time"].
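To make the update assignment concrete, the sketch below runs it on that one event and compares it with a longhand version that rebuilds the two-element array explicitly. The longhand filter is only an illustration and only covers events that carry a value.

```shell
event='[["tooldatareports",0,"tooldata",0,"data",0,"time"],"2021-01-01"]'

# Update only the path (.[0]); the value (.[1]) is left untouched.
trimmed=$(printf '%s' "$event" | jq -c '.[0] |= .[4:]')

# The same result, spelled out by hand for a path/value event.
longhand=$(printf '%s' "$event" | jq -c '[.[0][4:], .[1]]')

echo "$trimmed"    # [["data",0,"time"],"2021-01-01"]
echo "$longhand"   # [["data",0,"time"],"2021-01-01"]
```

The update form has the advantage that it also works unchanged on closing events, which have a path but no value.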
Let's move on then.
So (.[0] | index("data")) as $ix | select($ix) | .[0] |= .[$ix:] has the output:
https://jqplay.org/s/AwcQpVyHO2
[["data",0,"time"],"2021-01-01"]
[["data",0,"value"],1]
[["data",0,"value"]]
[["data",1,"time"],"2021-01-02"]
[["data",1,"value"],10]
[["data",1,"value"]]
[["data",2,"time"],"2021-01-03"]
[["data",2,"value"],5]
[["data",2,"value"]]
[["data",2]]
[["data"]]
[["data",0,"time"],"2021-01-01"]
[["data",0,"value"],10]
[["data",0,"value"]]
[["data",1,"time"],"2021-01-02"]
[["data",1,"value"],100]
[["data",1,"value"]]
[["data",2,"time"],"2021-01-03"]
[["data",2,"value"],50]
[["data",2,"value"]]
[["data",2]]
[["data"]]
Now all we need to do is convert this stream back to json.
https://jqplay.org/s/j2uyzEU_Rc
[fromstream(inputs)] gives:
[
{
"data": [
{
"time": "2021-01-01",
"value": 1
},
{
"time": "2021-01-02",
"value": 10
},
{
"time": "2021-01-03",
"value": 5
}
]
},
{
"data": [
{
"time": "2021-01-01",
"value": 10
},
{
"time": "2021-01-02",
"value": 100
},
{
"time": "2021-01-03",
"value": 50
}
]
}
]
This is the output we wanted.

Using jq to merge two arrays into one array of objects

I am a complete beginner with jq and I want to reshape my JSON file.
I have JSON structured like this:
{
"a": [1, 2, 3, 4 ...],
"b": [
{
"x": 1000,
"value": 1
},
{
"x": 1000,
"value": 2
},
{
"x": 1000,
"value": 3
}
...
]
}
I am wondering how I can achieve a result like this with jq:
[
{
"value": 1,
"from": "a",
},
{
"value": 2,
"from": "a"
},
...
{
"value": 1,
"from": "b"
},
{
"value": 2,
"from": "b"
}
...
]
Here is a very slightly generic, but hardly robust, solution:
map_values( if type == "array"
then map(if type == "object" then .value else . end)
else . end)
| [ keys_unsorted[] as $k
| .[$k][] as $v
| { value: $v, from: $k } ]
Create two lists, one from .a and one from .b, and merge them with +.
In the first list, create objects with value: set to the original content and add from: "a"; in the second list, remove .x from elements of .b and add the from again.
jq '[.a[] | {value:(.), from: "a"}] + [.b[] | del(.x) + {from: "b"}]'
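Both answers can be compared on a concrete input. The sketch below assumes a shortened version of the question's data (two items per array instead of the elided lists) and runs the generic filter and the explicit one side by side.

```shell
input='{"a":[1,2],"b":[{"x":1000,"value":1},{"x":1000,"value":2}]}'

# Generic: normalise every array to plain values, then tag each value with its key.
generic=$(printf '%s' "$input" | jq -c '
  map_values( if type == "array"
              then map(if type == "object" then .value else . end)
              else . end)
  | [ keys_unsorted[] as $k
      | .[$k][] as $v
      | { value: $v, from: $k } ]')

# Explicit: build one list per key and concatenate them with +.
explicit=$(printf '%s' "$input" | jq -c '
  [.a[] | {value:(.), from: "a"}] + [.b[] | del(.x) + {from: "b"}]')

echo "$generic"
echo "$explicit"
# both print:
# [{"value":1,"from":"a"},{"value":2,"from":"a"},{"value":1,"from":"b"},{"value":2,"from":"b"}]
```

The explicit version keeps any extra fields that objects in .b may have besides x and value, while the generic one keeps only .value.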

Group and count JSON using jq

I am trying to convert the following JSON into a csv which has each unique "name" and the total count (i.e: number of times that name appears).
Current data:
[
{
"name": "test"
},
{
"name": "hello"
},
{
"name": "hello"
}
]
Ideal output:
[
{
"name": "hello",
"count": 2
},
{
"name": "test",
"count": 1
}
]
I've tried [.[] | group_by (.name)[] ] but get the following error:
jq: error (at :11): Cannot index string with string "name"
JQ play link: https://jqplay.org/s/fWqNUii1b2
Note, I am already using jq to format the initial raw data into the format above. Please see the JQ play link here: https://jqplay.org/s/PwwRYscmBK
group_by(.name)
| map({name: .[0].name, count: length})
[
{
"name": "hello",
"count": 2
},
{
"name": "test",
"count": 1
}
]
Based on OP's comment, use the following jq filter to count each name across multiple objects, where the .name is nested.
map(.labels)
| map({name: .[0].name, count: length})
echo '[{"name": "test"}, {"name": "hello"}, {"name": "hello"}]' | jq 'group_by (.name)[] | {name: .[0].name, count: length}' | jq -s
[
{
"name": "hello",
"count": 2
},
{
"name": "test",
"count": 1
}
]
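Since the question asks for CSV in the end, the grouped counts can be passed on to @csv. This is a minimal sketch under the assumption that one "name,count" row per line is the desired layout; -r is needed so that the rows are printed as raw text rather than as JSON strings.

```shell
input='[{"name": "test"}, {"name": "hello"}, {"name": "hello"}]'

# Group by name, count each group, then format every {name, count} as a CSV row.
csv=$(printf '%s' "$input" | jq -r '
  group_by(.name)
  | map({name: .[0].name, count: length})
  | .[]
  | [.name, .count]
  | @csv')

echo "$csv"
# "hello",2
# "test",1
```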

How can you sort a JSON object efficiently using JQ

I've got JSON in the format
{
"a": {
"size":3
},
"b": {
"size":2
},
"c": {
"size":1
}
}
I need to sort it by size, e.g.:
{
"c": {
"size": 1
},
"b": {
"size": 2
},
"a": {
"size": 3
}
}
I have found a way to do it, e.g.:
. as $in | keys_unsorted | map ({"key": ., "size" : $in[.].size}) | sort_by(.size) | map(.key | {(.) : $in[.]}) | add
but this seems quite complex so I'm hoping there's a simpler way that I've overlooked?
You can use to_entries / from_entries, like this:
jq 'to_entries|sort_by(.value.size)|from_entries' file.json
to_entries will transform your input object into a list of key/value pair objects:
[
{
"key": "a",
"value": {
"size": 3
}
},
...
{
"key": "c",
"value": {
"size": 1
}
}
]
That lets you apply sort_by(.value.size) to the list and then convert it back to an object using from_entries.
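Run on the question's input, the pipeline looks like the sketch below. It relies on jq printing object keys in insertion order (which it does unless --sort-keys is given), so the order produced by sort_by survives from_entries.

```shell
input='{"a":{"size":3},"b":{"size":2},"c":{"size":1}}'

# Object -> list of {key, value} -> sorted list -> object again.
sorted=$(printf '%s' "$input" \
  | jq -c 'to_entries|sort_by(.value.size)|from_entries')

echo "$sorted"
# {"c":{"size":1},"b":{"size":2},"a":{"size":3}}
```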