jq update json from another json

I have a defaults.json and a current.json.
defaults.json gets copied to current.json, and current.json is used as the main configuration file.
defaults.json would look something like this:
{
  "AttributeName": "setting1",
  "Value": [
    {
      "ValueName": "Disabled",
      "ValueDisplayName": "Disabled"
    },
    {
      "ValueName": "Enabled",
      "ValueDisplayName": "Enabled"
    }
  ],
  "DefaultValue": "Enabled"
}
and current.json would look like this:
{
  "AttributeName": "setting1",
  "Value": [
    {
      "ValueName": "Disabled",
      "ValueDisplayName": "Disabled"
    },
    {
      "ValueName": "Enabled",
      "ValueDisplayName": "Enabled"
    }
  ],
  "DefaultValue": "Enabled",
  "CurrentValue": "Enabled"
}
Now when I add a new "setting2" (which has the same keys, but values can be different) to defaults.json, I would like to update current.json with that setting, without overwriting the "CurrentValue" field. How can I do this using jq?
I tried things like jq -rs 'add' defaults.json current.json but this just prints current.json.
I've tried looking at some of the other questions regarding jq, but they all cater to very specific situations that aren't comparable to mine.

The following jq program first finds the "new" paths in defaults.json by subtracting the paths in current.json from the paths in defaults.json, and then updates the "current" JSON by adding all the "new" paths and their associated values:
jq --argfile default defaults.json '
  . as $current
  | ([$default|paths] - [$current|paths]) as $new
  | reduce $new[] as $p ($current;
      setpath($p; $default|getpath($p)))
' current.json
Caveat
As of this writing, the --argfile option is officially "deprecated", so you might prefer one of the many other ways to pass in the contents of defaults.json, such as --slurpfile.
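For example, a minimal sketch of the same program using --slurpfile, which binds the variable to an array of all the JSON values in the file (hence the [0]):
jq --slurpfile defaults defaults.json '
  $defaults[0] as $default
  | . as $current
  | ([$default|paths] - [$current|paths]) as $new
  | reduce $new[] as $p ($current;
      setpath($p; $default|getpath($p)))
' current.json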


Merging multiple JSON Lines files into a single JSON object

I'm trying to merge / reduce many JSON objects and somehow I'm not getting the expected result.
I'm only interested in getting all keys; the values and the number of items inside arrays are irrelevant.
file1.json:
{
  "customerId": "xx",
  "emails": [
    {
      "address": "james@zz.com",
      "customType": "",
      "type": "custom"
    },
    {
      "address": "sales@x.com",
      "primary": true
    },
    {
      "address": "info@x.com"
    }
  ]
}
{
  "id": "654",
  "emails": [
    {
      "address": "peter@x.com",
      "primary": true
    }
  ]
}
The desired output is a JSON object with all possible keys from all input objects. The values are irrelevant, any value from any input object is OK. But all keys from input objects must be present in output object:
{
  "emails": [
    {
      "address": "james@zz.com",  <--- any existing value works
      "customType": "",           <--- any existing value works
      "type": "custom",           <--- any existing value works
      "primary": true             <--- any existing value works
    }
  ],
  "customerId": "xx",             <--- any existing value works
  "id": "654"                     <--- any existing value works
}
I tried reducing it, but it misses many of the keys in the array:
$ jq -s 'reduce .[] as $item ({}; . + $item)' file1.json
{
  "customerId": "xx",
  "emails": [
    {
      "address": "peter@x.com",
      "primary": true
    }
  ],
  "id": "654"
}
The structure of the objects contained in file1.json is unknown, so the solution must be agnostic of any keys/values and the solution must not assume any structure or depth.
Is it possible to fix this somehow considering how jq works? Or is it possible to solve this issue using another tool?
PS: For those of you that are curious, this is useful to infer a schema that can be created in a database. Given an arbitrary number of JSON objects with an arbitrary structure, it's easy to create a single JSON squished/merged/fused structure that will "accommodate" all JSON objects.
BigQuery is able to autodetect a schema, but only 500 lines are analyzed to come up with it. This presents problems if objects have different structures past that 500 line mark.
With this approach I can squish a JSON Lines file with millions of objects into one line that can then be imported into BigQuery with the autodetect schema flag, and it will work every time, since BigQuery only has one line to analyze and this line is the "super-schema" of all the objects. After extracting the autodetected schema I can manually fine-tune it to make sure types are correct and then recreate the table specifying my tuned schema:
$ ls -1 users*.json | wc --lines
3672
$ cat users*.json > users-all.json
$ cat users-all.json | wc --lines
146482633
$ jq 'squish' users-all.json > users-all-squished.json
$ cat users-all-squished.json | wc --lines
1
$ bq load --autodetect users users-all-squished.json
$ bq show schema --format=prettyjson users > users-schema.json
$ vi users-schema.json
$ bq rm --table users
$ bq mk --table users --schema=users-schema.json
$ bq load users users-all.json
[Some options are missing or changed for readability]
Here is a solution that produces the expected result for the sample input and seems to meet all the stated requirements. It is similar to one proposed by @pmf on this page.
jq -n --stream '
def squish: map(if type == "number" then 0 else . end);
reduce (inputs | select(length==2)) as [$p, $v] ({}; setpath($p|squish; $v))
'
Output
For the example given in the Q, the output is:
{
  "customerId": "xx",
  "emails": [
    {
      "address": "peter@x.com",
      "customType": "",
      "type": "custom",
      "primary": true
    }
  ],
  "id": "654"
}
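To see why this works, consider what --stream emits for a small document (a made-up two-element array shown below): each leaf becomes a [path, value] event, and each closing of a container becomes a length-1 event, which is why the program filters with select(length==2). Since squish rewrites every numeric path component to 0, the leaves of all array elements land on the same paths and are merged by setpath, with later values winning.
$ echo '{"a":[{"b":1},{"c":2}]}' | jq -c --stream .
[["a",0,"b"],1]
[["a",0,"b"]]
[["a",1,"c"],2]
[["a",1,"c"]]
[["a",1]]
[["a"]]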
As @peak has pointed out, some aspects are underspecified. For instance, what should happen with .customerId and .id? Are they always the same across all files (as suggested by the sample files provided)? Do you want the items of the .emails array just thrown into one large array, or do you want to have them "merged" by some criteria (e.g. by a common value in their .address field)? Here are some stubs to start from:
Simply concatenate the .emails arrays and take all other parts from the first file:
jq 'reduce inputs as $in (.; .emails += $in.emails)' file*.json
# or simpler
jq '.emails += [inputs.emails[]]' file*.json
Output (the demo runs on an additional sample file alongside file1.json, hence the extra addresses):
{
  "emails": [
    {
      "address": "cc@xx.com"
    },
    {
      "address": "james@zz.com",
      "customType": "",
      "type": "custom"
    },
    {
      "address": "james@x.com"
    },
    {
      "address": "sales@x.com",
      "primary": true
    },
    {
      "address": "info@x.com"
    },
    {
      "address": "james@x.com"
    },
    {
      "address": "sales@x.com",
      "primary": true
    },
    {
      "address": "info@x.com"
    }
  ],
  "customerId": "xx",
  "id": "654"
}
Merge the objects in the .emails array by a common value in their .address field, with later values overwriting earlier ones for other fields with colliding names, and discard all other parts of the files:
jq -n 'reduce inputs.emails[] as $e ({}; .[$e.address] += $e) | map(.)' file*.json
Output:
[
  {
    "address": "cc@xx.com"
  },
  {
    "address": "james@zz.com",
    "customType": "",
    "type": "custom"
  },
  {
    "address": "james@x.com"
  },
  {
    "address": "sales@x.com",
    "primary": true
  },
  {
    "address": "info@x.com"
  }
]
If you are only interested in a list of unique field names for a given address, regardless of the counts and values used, you can also go with:
jq -n '
reduce inputs.emails[] as $e ({}; .[$e.address][$e | keys_unsorted[]] = 1)
| map_values(keys)
'
Output:
{
  "cc@xx.com": [
    "address"
  ],
  "james@zz.com": [
    "address",
    "customType",
    "type"
  ],
  "james@x.com": [
    "address"
  ],
  "sales@x.com": [
    "address",
    "primary"
  ],
  "info@x.com": [
    "address"
  ]
}
The structure of the objects contained in file1.json is unknown, so the solution must be agnostic of any keys/values and the solution must not assume any structure or depth.
You can use the --stream flag to break down the structure into an array of paths and values, discard the values part and make the paths unique:
jq --stream -nc '[inputs[0]] | unique[]' file*.json
["customerId"]
["emails"]
["emails",0,"address"]
["emails",0,"customType"]
["emails",0,"primary"]
["emails",0,"type"]
["emails",1,"address"]
["emails",2]
["emails",2,"address"]
["emails",2,"primary"]
["emails",3]
["emails",3,"address"]
["id"]
Trying to build a representation of this, similar to any of the input files, comes with a lot of caveats. For instance, how would you represent, in a single structure, one file that has .emails as an array of objects and another that has .emails as just an atomic value, say, a string? You would not be able to represent this plurality without introducing new, possibly ambiguous structures (e.g. putting all possibilities into an array).
Therefore, having a list of paths could be a fair compromise. Judging by your desired output, you want to focus more on the object structure, so you could further reduce complexity by discarding the array indices. Depending on your use case, you could replace them with a single value to retain the information of the presence of an array, or discard them entirely:
jq --stream -nc '[inputs[0] | map(numbers = 0)] | unique[]' file*.json
["customerId"]
["emails"]
["emails",0]
["emails",0,"address"]
["emails",0,"customType"]
["emails",0,"primary"]
["emails",0,"type"]
["id"]
jq --stream -nc '[inputs[0] | map(strings)] | unique[]' file*.json
["customerId"]
["emails"]
["emails","address"]
["emails","customType"]
["emails","primary"]
["emails","type"]
["id"]
The following program meets these two key requirements:
"all keys from input objects must be present in output object";
"the solution must be agnostic of any keys/values and the solution must not assume any structure or depth."
The approach is the same as one suggested by @pmf, and for the example given in the Q, it produces results very similar to the one shown:
jq -n --stream '
def squish: map(select(type == "string"));
reduce (inputs | select(length==2)) as [$p, $v] ({};
setpath($p|squish; $v))
'
With the given input, this produces the following. Note that .emails is now an object rather than an array: this squish discards the numeric path components entirely, whereas the earlier variant mapped them all to 0.
{
  "customerId": "xx",
  "emails": {
    "address": "peter@x.com",
    "customType": "",
    "type": "custom",
    "primary": true
  },
  "id": "654"
}

jq select, but preserve parent objects

This works to search for tactics that equal "impact". However, it will only pull the objects themselves.
jq '.techniques[] | select(.tactic == "impact")'
Is there no way to use select while walking through JSON with something like jq '. | select(.techniques[].tactic == "impact")'? I'm guessing the issue is that something like that, even if it worked, still does not explicitly say to keep the surrounding items as well.
It is not viable to manually rebuild the parent.
input
{
  "viewMode": 0,
  "hideDisabled": false,
  "techniques": [
    {
      "name": "john",
      "tactic": "reconnaissance"
    },
    {
      "name": "jane",
      "tactic": "impact"
    },
    {
      "name": "jill",
      "tactic": "execution"
    }
  ],
  "karma": "yes"
}
desired output
{
  "viewMode": 0,
  "hideDisabled": false,
  "techniques": [
    {
      "name": "jane",
      "tactic": "impact"
    }
  ],
  "karma": "yes"
}
If this is so remedial that it warrants no response, I'll figure it out and update. It seems like the most basic thing. I'll also be doing a !=, which also works fine normally, but doesn't capture the entire body.
I have tried using variables to do it, which gets me close:
jq '{techniques: [.techniques[] | select(.tactic == "impact")]} as $a| $a' test.json
However, trying to add a key "techniques" to that array ruins my ability to use it:
jq '{techniques: [.techniques[] | select(.tactic == "impact")]} as $a| $a + [.]' test.json
jq: error (at test.json:19): object ({"technique...) and array ([{"viewMode...) cannot be added
|= is your friend, e.g.
.techniques |= map(select(.tactic == "impact"))
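Applied to the sample input, this preserves everything around the filtered array:
$ jq '.techniques |= map(select(.tactic == "impact"))' test.json
{
  "viewMode": 0,
  "hideDisabled": false,
  "techniques": [
    {
      "name": "jane",
      "tactic": "impact"
    }
  ],
  "karma": "yes"
}
The != case mentioned in the question works the same way: .techniques |= map(select(.tactic != "impact")).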

in place edit, search for nested value and then replace another value

I have an input JSON document with roughly the following form (actual data has additional keys, which should be passed through unmodified; the whitespace is adjusted for human readability, and there's no expectation that it be maintained):
{
  "Rules": [
    {"Filter": { "Prefix": "to_me/" },   "Status": "Enabled" },
    {"Filter": { "Prefix": "from_me/" }, "Status": "Enabled" },
    {"Filter": { "Prefix": "__bg/" },    "Status": "Enabled" }
  ]
}
I need to match .Rules[].Filter.Prefix == "to_me/" and then change the associated "Status": "Enabled" to "Disabled". Since only the first rule above has a prefix of to_me/, only that rule's status would be changed to Disabled, making the correct output look like the following:
{
  "Rules": [
    {"Filter": { "Prefix": "to_me/" },   "Status": "Disabled" },
    {"Filter": { "Prefix": "from_me/" }, "Status": "Enabled" },
    {"Filter": { "Prefix": "__bg/" },    "Status": "Enabled" }
  ]
}
I've tried several different combinations but can't seem to get it right.
Anyone have ideas?
I prefer the idiom ARRAY |= map(...) over ARRAY[] |= ..., mainly because the former can be used reliably whether or not any of the substitutions evaluate to empty:
jq '.Rules |= map(if .Filter.Prefix == "to_me/"
then .Status="Disabled" else . end)'
To overwrite the input file, you might like to consider sponge from moreutils.
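For example (assuming the input file is named rules.json):
jq '.Rules |= map(if .Filter.Prefix == "to_me/"
                  then .Status="Disabled" else . end)' rules.json | sponge rules.json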
Doing in-place updates can be done with |=, and deciding whether to modify content in-place can be done with if/then/else. Thus:
jq '.Rules[] |= (if .Filter.Prefix == "to_me/" then .Status="Disabled" else . end)'

Replace subkey without exact path in jq

Example JSON file:
{
  "u": "stuff",
  "x": [1,2,3],
  "y": {
    "field": "value"
  },
  "z": {
    "zz": {
      "name": "change me",
      "more": "stuff"
    },
    "randomKey": {
      "name": "change me",
      "random": "more stuff"
    }
  }
}
How can I update all the name fields to "something" while keeping the rest of the JSON file the same?
{
  "u": "stuff",
  "x": [1,2,3],
  "y": {
    "field": "value"
  },
  "z": {
    "zz": {
      "name": "something",
      "more": "stuff"
    },
    "randomKey": {
      "name": "something",
      "random": "more stuff"
    }
  }
}
With a direct path this would be easy, but the parent keys (z and randomKey in this case) vary.
I tried something like:
jq '.z | .. | .name? |= "something"' file.json
It updates the names, but it also outputs all of the intermediate results produced by the recursive descent.
If it is acceptable to change the "name" field wherever it occurs, you could use walk/1:
walk(if type == "object" and has("name") then .name = "something" else . end)
Please note that walk/1 was only included with jq after jq 1.5 was released. If your jq does not have it, then you can find its definition on the jq FAQ, for example.
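For reference, here is a definition along the lines of the one that later became a builtin (this sketch relies on keys_unsorted, which jq 1.5 already has):
def walk(f):
  . as $in
  | if type == "object" then
      reduce keys_unsorted[] as $key
        ( {}; . + { ($key): ($in[$key] | walk(f)) } ) | f
    elif type == "array" then map( walk(f) ) | f
    else f
    end;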
If you only want to modify the "name" field in the "z" context, then consider:
.z |= with_entries(if .value.name?
then .value.name = "something"
else . end)
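As a one-liner against the sample file:
jq '.z |= with_entries(if .value.name? then .value.name = "something" else . end)' file.json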
Assuming every value within z has a name property, you could do this:
$ jq --arg newname 'something' '.z[].name = $newname' input.json
Using [] on an object yields all the values contained in that object; for each of those values, we simply set the name to the new name.
If you need to be more selective about what gets updated, you'll have to add more conditions on which objects to update. In general, I'd use peak's approach, but here's another way it could be achieved using a structure similar to the first approach, assuming we only want to update objects that already have a name property:
$ jq --arg newname 'something' '(.z[] | select(has("name")).name) = $newname' input.json
It's important to wrap the LHS of the assignment in parentheses: we don't want to change the context prior to the assignment, otherwise we won't see the rest of the results.
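For instance, without the parentheses the pipe changes the context before the assignment, so only the modified inner objects are emitted rather than the whole document:
$ jq --arg newname 'something' '.z[] | select(has("name")).name = $newname' input.json
{
  "name": "something",
  "more": "stuff"
}
{
  "name": "something",
  "random": "more stuff"
}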

Using jq to list keys in a JSON object

I have a hierarchically deep JSON object created by a scientific instrument, so the file is somewhat large (1.3MB) and not readily readable by people. I would like to get a list of keys, up to a certain depth, for the JSON object. For example, given an input object like this
{
  "acquisition_parameters": {
    "laser": {
      "wavelength": {
        "value": 632,
        "units": "nm"
      }
    },
    "date": "02/03/2525",
    "camera": {}
  },
  "software": {
    "repo": "github.com/username/repo",
    "commit": "a7642f",
    "branch": "develop"
  },
  "data": [{},{},{}]
}
I would like output like the following.
{
  "acquisition_parameters": [
    "laser",
    "date",
    "camera"
  ],
  "software": [
    "repo",
    "commit",
    "branch"
  ]
}
This is mainly for the purpose of being able to enumerate what is in a JSON object. After processing the JSON objects from the instrument begin to diverge: for example, some may have a field like .frame.cross_section.stats.fwhm, while others may have .sample.species, so it would be convenient to be able to interrogate the JSON object on the command line.
The following should do exactly what you want:
jq '[(keys - ["data"])[] as $key | { ($key): .[$key] | keys }] | add'
This will give the following output, using the input you described above:
{
  "acquisition_parameters": [
    "camera",
    "date",
    "laser"
  ],
  "software": [
    "branch",
    "commit",
    "repo"
  ]
}
Given your purpose you might have an easier time using the paths builtin to list all the paths in the input and then truncate at the desired depth:
$ echo '{"a":{"b":{"c":{"d":true}}}}' | jq -c '[paths|.[0:2]]|unique'
[["a"],["a","b"]]
Here is another variation using reduce and setpath which assumes you have a specific set of top-level keys you want to examine:
. as $v
| reduce ("acquisition_parameters", "software") as $k (
    {}; setpath([$k]; $v[$k] | keys)
  )
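Run against the sample input, this produces the same result as the first solution (keys lists keys alphabetically):
{
  "acquisition_parameters": [
    "camera",
    "date",
    "laser"
  ],
  "software": [
    "branch",
    "commit",
    "repo"
  ]
}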