One JSON object per line with jq from a large JSON file

Here is my JSON file:
[
{
"name": "1"
},
{
"name": "2"
},
{
"name": "3"
},
{
"name": "4"
}
]
I would like to write every object to a file, one per line:
{"name":"1"}
{"name":"2"}
{"name":"3"}
{"name":"4"}
My file is very big, so I am using the --stream option.
Here is my attempt so far:
jq --stream -c '.[]' car.json > result.json
but it gives me:
[0,"name"]
"1"
[1,"name"]
"2"

This topic is covered in the jq FAQ. For the situation you describe you might be able to use the simpler of the two possibilities given there:
jq -cn --stream 'fromstream(1|truncate_stream(inputs))' car.json > result.json
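As a rough check of what that filter does, the sketch below runs it on a small stand-in for car.json and compares the result with the non-streaming `.[]`. The sample contents are assumed from the question; the file name expected.json is made up for the comparison.

```shell
# Small stand-in for the (much larger) input file from the question.
cat > car.json <<'EOF'
[{"name":"1"},{"name":"2"},{"name":"3"},{"name":"4"}]
EOF

# Streaming form: drop the first path component (the array index) from every
# event, then rebuild each element and print it on its own line.
jq -cn --stream 'fromstream(1|truncate_stream(inputs))' car.json > result.json

# Non-streaming equivalent, only practical when the file fits in memory.
jq -c '.[]' car.json > expected.json

# On this sample the two outputs should be identical.
cmp result.json expected.json && echo "outputs match"
```

The streaming form only ever holds one array element in memory at a time, which is the point of using it on a very large file.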

jq process json where an element can be an array or object

The tool I am using produces JSON in which an element is an object when there is only one item but an array when there is more than one.
How do I parse this with jq to return the full list of names from within content only?
{
"data": [
{
"name": "data block1",
"content": {
"name": "1 bit of data"
}
},
{
"name": "data block2",
"content": [
{
"name": "first bit"
},
{
"name": "another bit"
},
{
"name": "last bit"
}
]
}
]
}
What I can't work out is how to switch depending on the type of content.
# jq '.data[].content.name' test.json
"1 bit of data"
jq: error (at test.json:22): Cannot index array with string "name"
# jq '.data[].content[].name' test.json
jq: error (at test.json:22): Cannot index string with string "name"
I am sure I should be able to use type but my jq-fu is not strong enough!
# jq '.data[].content | type=="array"' test.json
false
true
jq version 1.5
jq '.data[].content | if type == "array" then .[] else . end | .name?'
(The trailing ? is there just in case: it suppresses the error if an item turns out not to be an object.)
More succinctly:
jq '.data[].content | .name? // .[].name?'
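To see the type switch in action, here is a minimal sketch that runs the first filter on a compact copy of the question's test.json. The `-r` flag (raw output, no quotes) and the variable name `names` are additions for illustration.

```shell
# Compact copy of the input from the question.
cat > test.json <<'EOF'
{"data":[
  {"name":"data block1","content":{"name":"1 bit of data"}},
  {"name":"data block2","content":[{"name":"first bit"},{"name":"another bit"},{"name":"last bit"}]}
]}
EOF

# Iterate when content is an array, pass it through unchanged when it is an
# object, then take .name from whatever comes out.
names=$(jq -r '.data[].content | if type == "array" then .[] else . end | .name' test.json)
printf '%s\n' "$names"
# Prints:
# 1 bit of data
# first bit
# another bit
# last bit
```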

I am trying to retrieve the value of a key based on another key of a json in bash scripting without the use of jq

I have a json file and need to retrieve the name from the json based on the value of "Id" fields in the same json. I need to do this with bash scripting. I can use sed, awk, grep but not jq.
[
{
"attributes":{
"type":"chocolate","url":"/services/chocolate"
},
"Id":"1",
"Name":"Chocolaty chocloate"
},
{
"attributes":{
"type":"Fruit","url":"/services/fruit"
},
"Id":"2",
"Name":"Fruity Apple"
},
{
"attributes":{
"type":"drink","url":"/services/drink"
},
"Id":"3",
"Name":"Milk Shake"
},
{
"attributes":{
"type":"food","url":"/services/food"
},
"Id":"4",
"Name":"Noodles"
}
]
In the above example, if I pass the value "3" to the script, I expect to get the name of the object whose "Id" is "3", which is Milk Shake.
The best solution is with jq. But if it has to be sed, you could do it like this:
#!/bin/bash
sed -n "/\"Id\":\"$2\"/,/\"Name\"/s/.*\"Name\":\"\([^\"]*\).*/\1/p" "$1"
Usage:
script.sh file.json 3
Output:
Milk Shake
Assuming the "Id" and "Name" lines are always next to each other, a GNU grep pipeline (wrapped in a function) works:
foo() { grep -A 1 '"Id":"'"$2"'",$' "$1" | grep -o '"[^"]*"$'; }
foo file.json 3
Output:
"Milk Shake"
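Since awk was also allowed, the same idea can be sketched with it. This assumes the layout shown in the question (one key per line, "Id" appearing before "Name" within each object); the function name `name_for_id` is made up for illustration, and like the sed and grep versions it is a text match, not a JSON parser.

```shell
# Two objects from the question, in the same one-key-per-line layout.
cat > file.json <<'EOF'
[
{
"attributes":{"type":"drink","url":"/services/drink"},
"Id":"3",
"Name":"Milk Shake"
},
{
"attributes":{"type":"food","url":"/services/food"},
"Id":"4",
"Name":"Noodles"
}
]
EOF

# Hypothetical helper: name_for_id FILE ID
name_for_id() {
  awk -v id="$2" '
    # Remember that the wanted Id line was seen, then move on.
    index($0, "\"Id\":\"" id "\"") { found = 1; next }
    # First Name line after it: drop the 8-character "Name":" prefix
    # and the closing quote, print the value, and stop.
    found && match($0, /"Name":"[^"]*"/) {
      print substr($0, RSTART + 8, RLENGTH - 9)
      exit
    }
  ' "$1"
}

name_for_id file.json 3
# Prints: Milk Shake
```

Matching on the closing quote of the Id keeps an Id of "3" from also matching "30", and the "Name" line does not have to be directly adjacent to the "Id" line.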

How to access nested json keys in jq --stream

I have a huge JSON file (15 GB) which looks as follows:
{
"userActivities": {
"-L3ATRosRd-bDgSmX75Z": {
"deviceId": "60ee32c2fae8dcf0",
"dow": "Friday"
}
},
"users": {
"0GTDyAepIjcKMB1XulHCYLXylFS2": {
"ageRangeMin": 21,
"age_range": {
"min": 21
},
"gender": "male"
},
"0GTDyAepIjcKMB1S2": {
"ageRangeMin": 22,
"age_range": {
"min": 20
},
"gender": "male"
}
}
}
I want to extract the objects as if by .users[], but using the streaming parser (jq --stream). That is, I want my output to be as follows:
{"ageRangeMin":21,"age_range":{"min":21},"gender":"male"}
{"ageRangeMin":22,"age_range":{"min":20},"gender":"male"}
Any guidance/help is greatly appreciated. I'm unable to understand how jq --stream works.
If the goal is to just get objects at a certain depth of the json object tree, you can just truncate the stream.
$ jq --stream -nc 'fromstream(2|truncate_stream(inputs | select(.[0][:1] == ["users"])))'
Just make sure you're running the latest available jq. There's a bug in jq 1.5 where truncate_stream/1 gives wrong results for any depth greater than 1.
With your input in input.json, the following invocation:
$ jq -nc --stream '
fromstream(inputs|select(.[0][0] == "users"))|.[][]' input.json
yields:
{"ageRangeMin":21,"age_range":{"min":21},"gender":"male"}
{"ageRangeMin":22,"age_range":{"min":20},"gender":"male"}
The idea is to extract the "users" key-value pair first as a single-key object.
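Note that the -n option must be used here; without it, jq would consume the first streamed event as the filter's input and inputs would miss it.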
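For anyone unsure what --stream actually emits, the sketch below prints the raw events for a tiny stand-in input and then extracts the user objects one at a time. The sample data and the variable name `users` are made up. The extraction trims the two leading path components by hand instead of calling truncate_stream, which is one way to sidestep the jq 1.5 issue mentioned above.

```shell
# Tiny stand-in for the 15 GB file: same shape, made-up keys.
cat > input.json <<'EOF'
{"userActivities":{"a1":{"dow":"Friday"}},
 "users":{"u1":{"gender":"male"},"u2":{"gender":"female"}}}
EOF

# Each event is either [path, leaf] or a closing [path].
# For example, [["users","u1","gender"],"male"] is a leaf event with a path of length 3.
jq -c --stream '.' input.json

# Keep only events under "users" that are deeper than the user id,
# strip the first two path components, and rebuild one object per user.
users=$(jq -cn --stream '
  fromstream(
    inputs
    | select(.[0][0] == "users" and (.[0] | length) > 2)
    | .[0] |= .[2:]
  )' input.json)
printf '%s\n' "$users"
# Prints:
# {"gender":"male"}
# {"gender":"female"}
```

Unlike the `fromstream(...)|.[][]` form, this never assembles the whole "users" object in memory; each user is emitted as soon as its closing event arrives.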

replace json node with another json file with command line jq

I have a result.json:
{
"Msg": "This is output",
"output": {}
}
and a output.json:
{
"type": "string",
"value": "result is here"
}
I want to replace the output field in result.json with the whole contents of output.json, like this:
{
"Msg": "This is output",
"output": {
"type": "string",
"value": "result is here"
}
}
Any idea how to do this with the jq command line? Thank you in advance.
You can use --argfile to process multiple files:
jq --argfile f1 result.json --argfile f2 output.json -n '$f1 | .output = $f2'
This is basically the same as the --argfile answer above (Bertrand Martel's), but it uses a different (and shorter) way of reading the two files:
jq -n 'input | .output = input' result.json output.json
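A third variant, sketched here as an assumption about what fits best on current jq versions: the jq 1.5 manual already recommends --slurpfile over --argfile. --slurpfile binds the variable to an array of all JSON values in the file, so the single object is `$out[0]`. The variable names `out` and `merged` are made up.

```shell
# The two files from the question, in compact form.
cat > result.json <<'EOF'
{"Msg":"This is output","output":{}}
EOF
cat > output.json <<'EOF'
{"type":"string","value":"result is here"}
EOF

# result.json is the normal input; output.json is slurped into $out,
# an array holding its one JSON value.
merged=$(jq -c --slurpfile out output.json '.output = $out[0]' result.json)
printf '%s\n' "$merged"
```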

Update inner attribute of JSON with jq

Could somebody help me use the jq command-line utility to update an inner value of a JSON object?
I want to alter the object interpreterSettings.2B188AQ5T.properties by adding several key-value pairs, like "spark.executor.instances": "16".
So far I have only managed to replace this object entirely, not add new properties to it, with this command:
cat test.json | jq ".interpreterSettings.\"2B188AQ5T\".properties |= { \"spark.executor.instances\": \"16\" }"
This is input JSON:
{
"interpreterSettings": {
"2B263G4Z1": {
"id": "2B263G4Z1",
"name": "sh",
"group": "sh",
"properties": {}
},
"2B188AQ5T": {
"id": "2B188AQ5T",
"name": "spark",
"group": "spark",
"properties": {
"spark.cores.max": "",
"spark.yarn.jar": "",
"master": "yarn-client",
"zeppelin.spark.maxResult": "1000",
"zeppelin.dep.localrepo": "local-repo",
"spark.app.name": "Zeppelin",
"spark.executor.memory": "2560M",
"zeppelin.spark.useHiveContext": "true",
"spark.home": "/usr/lib/spark",
"zeppelin.spark.concurrentSQL": "false",
"args": "",
"zeppelin.pyspark.python": "python"
}
}
},
"interpreterBindings": {
"2AXUMXYK4": [
"2B188AQ5T",
"2AY8SDMRU"
]
}
}
I also tried the following, but this only prints the contents of interpreterSettings.2B188AQ5T.properties, not the full object.
cat test.json | jq ".interpreterSettings.\"2B188AQ5T\".properties + { \"spark.executor.instances\": \"16\" }"
The following works using jq 1.4 or jq 1.5 with a Mac/Linux shell:
jq '.interpreterSettings."2B188AQ5T".properties."spark.executor.instances" = "16" ' test.json
If you have trouble adapting the above for Windows, I'd suggest putting the jq program in a file, say my.jq, and invoking it like so:
jq -f my.jq test.json
Notice that there is no need to use "cat" in this case.
P.S. You were on the right track: try replacing |= with += in your first command.
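A minimal sketch of the += form, adding several keys at once to a trimmed-down copy of the input. The second key, "spark.executor.cores", and the variable name `updated` are made up for illustration.

```shell
# Trimmed-down copy of the question's input.
cat > test.json <<'EOF'
{"interpreterSettings":{"2B188AQ5T":{"id":"2B188AQ5T","properties":{"master":"yarn-client"}}}}
EOF

# += merges the right-hand object into the existing properties
# and returns the whole document, not just the updated part.
updated=$(jq -c '.interpreterSettings."2B188AQ5T".properties += {
  "spark.executor.instances": "16",
  "spark.executor.cores": "4"
}' test.json)
printf '%s\n' "$updated"
```

Existing keys such as "master" are kept; only keys named on the right-hand side are added or overwritten.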