How to merge multiple JSON files in a folder in a certain order

I have several JSON files in a folder and I want to combine them into a single one, following a certain order given in order.json.
I've tested the code below; it merges all the files, but in alphabetical order based on file name.
jq -s . *.json > Merged.json
This is the file Shoes.json
{ "document": { "product": "Shoes", "info": [ { "day": "1", "month": "1", "entry": "Some text about Shoes for day 1 and month 1", "code": "AJKD" }, { "day": "2", "month": "1", "entry": "Some text about Shoes for day 2 and month 1", "code": "KKGIR" } ] } }
This is the file Watches.json
{ "document": { "product": "Watches", "info": [ { "day": "2", "month": "3", "entry": "Some text about Watches for day 2 and month 3", "code": "PEWQ" } ] } }
This is the file Accesories.json
{ "document": { "product": "Accesories", "info": [ { "day": "7", "month": "2", "entry": "Some text about Accesories for day 7 and month 2", "code": "UYAAC" } ] } }
This is the file that gives the order I want to get in output order.json
{
  "order": {
    "product 1": "Watches",
    "product 2": "Accesories",
    "product 3": "Shoes"
  }
}
And the output file I'd like to get, Merged.json, would look like this:
{
  "document": [
    {
      "product": "Watches",
      "info": [
        {
          "day": "2",
          "month": "3",
          "entry": "Some text about Watches for day 2 and month 3",
          "code": "PEWQ"
        }
      ]
    },
    {
      "product": "Accesories",
      "info": [
        {
          "day": "7",
          "month": "2",
          "entry": "Some text about Accesories for day 7 and month 2",
          "code": "UYAAC"
        }
      ]
    },
    {
      "product": "Shoes",
      "info": [
        {
          "day": "1",
          "month": "1",
          "entry": "Some text about Shoes for day 1 and month 1",
          "code": "AJKD"
        },
        {
          "day": "2",
          "month": "1",
          "entry": "Some text about Shoes for day 2 and month 1",
          "code": "KKGIR"
        }
      ]
    }
  ]
}
Maybe someone could help me with this case.
Any help would be much appreciated.

Here's a solution that uses order.json to determine which files are to be read by jq:
jq -n '{document: [inputs.document]}' $(jq -r '.order[] + ".json"' order.json)
The approach exemplified immediately above has many advantages, but the above line makes various assumptions that might not be warranted. For example, it assumes there are no spaces in any of the file names.
Robust handling of file names
The following assumes bash-ful functionality:
mapfile -t args < <(jq -r '.order[] + ".json"' order.json)
jq -n '{document: [inputs.document]}' "${args[@]}"
If your bash does not have mapfile, you could set the bash variable as follows:
args=()
while read -r f; do
  args+=("$f")
done < <(jq -r '.order[] + ".json"' order.json)

The following solution has the advantage of not requiring any shell-specific functionality and only requiring a single invocation of jq. It assumes you want to be able to handle arbitrarily many files, as determined by order.json. Amongst other assumptions is that in the pwd, we can use the pattern [A-Z]*.json to select the relevant "documents".
jq -n --argfile order order.json '
  INDEX(inputs.document; .product) as $dict
  | reduce $order.order[] as $product ([]; . + [$dict[$product]])
  | {document: .}
' [A-Z]*.json
def INDEX
If your jq does not have INDEX/2, then it might be a good time to upgrade; alternatively, you could simply add (prepend) its def:
def INDEX(stream; idx_expr):
  reduce stream as $row ({}; .[$row|idx_expr|tostring] = $row);
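Putting it together, the prepended version might look like this (same assumptions as above):

jq -n --argfile order order.json '
  def INDEX(stream; idx_expr):
    reduce stream as $row ({}; .[$row|idx_expr|tostring] = $row);
  INDEX(inputs.document; .product) as $dict
  | reduce $order.order[] as $product ([]; . + [$dict[$product]])
  | {document: .}
' [A-Z]*.json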


jq - return array value if its length is not null

I have a report.json generated by a gitlab pipeline.
It looks like:
{"version":"14.0.4","vulnerabilities":[{"id":"64e69d1185ecc48a1943141dcb6dbd628548e725f7cef70d57403c412321aaa0","category":"secret_detection"....and so on
If no vulnerabilities found, then "vulnerabilities":[]. I'm trying to come up with a bash script that would check if vulnerabilities length is null or not. If not, print the value of the vulnerabilities key. Sadly, I'm very far from scripting genius, so it's been a struggle.
While searching web for a solution to this, I've come across jq. It seems like select() should do the job.
I've tried:
jq "select(.vulnerabilities!= null)" report.json
but it returned {"version":"14.0.4","vulnerabilities":[{"id":"64e69d1185ecc48a194314... instead of expected "vulnerabilities":[{"id":"64e69d1185ecc48a194314...
and
map(select(.vulnerabilities != null)) report.json
returns "No matches found"
Would you mind pointing out what's wrong apart from my 0 experience with bash and JSON parsing? :)
Thanks in advance
Just use the .vulnerabilities path filter to extract the value.
Here are some cases below:
$ jq '.vulnerabilities' <<END
heredoc> {"version":"14.0.4","vulnerabilities":[{"id":"64e69d1185ecc48a1943141dcb6dbd628548e725f7cef70d57403c412321aaa0","category":"secret_detection"}]}
heredoc> END
[
  {
    "id": "64e69d1185ecc48a1943141dcb6dbd628548e725f7cef70d57403c412321aaa0",
    "category": "secret_detection"
  }
]
If vulnerabilities is null, then jq will return null:
$ jq '.vulnerabilities' <<END
{"version":"14.0.4","vulnerabilities":null}
END
null
Then, with the pipe |, you can change it to any output you want.
To change null to []: .vulnerabilities | if . == null then [] else . end
To filter out an empty array: .vulnerabilities | select(length > 0)
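For instance, a minimal sketch combining the two (report.json as in the question); it prints the vulnerabilities array only when it is non-empty, and nothing when it is null or []:

jq '.vulnerabilities | if . == null then [] else . end | select(length > 0)' report.json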
For further information about jq filters, you can read the jq manual.
Assuming that by "print the value of the vulnerabilities key" you mean the value of an item's id field, you can retrieve it using .id and have it extracted to bash with the -r option.
If, in case the array is not empty, you want all of the "keys", iterate over the array using .[]. If you just want a specific key, say the first, address it using a 0-based index: .[0].
To check the length of an array there is a dedicated length builtin. However, as your final goal is to extract, you can also just attempt the extraction right away, suppress a potential error using the ? operator, and have your bash script evaluate the exit status set by the -e option.
Your bash script could then include the following snippet:
if key=$(jq -re '.vulnerabilities[0].id?' report.json)
then
  # If the array was not empty, $key contains the first key
  echo "There is a vulnerability in key $key."
fi
# or
if keys=$(jq -re '.vulnerabilities[].id?' report.json)
then
  # If the array was not empty, $keys contains all the keys
  for k in $keys
  do echo "There is a vulnerability in key $k."
  done
fi
Firstly, please note that in the JSON world, it is important to distinguish
between [] (the empty array), the values 0 and null, and the absence of a value (e.g. as the result of the absence of a key in an object).
In the following, I'll assume that the output should be the value of .vulnerabilities
if it is not [], or nothing otherwise:
< sample.json jq '
  select(.vulnerabilities != []).vulnerabilities
'
If the goal were to differentiate between two cases based on the return code from jq, you could use the -e command-line option.
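For example (a sketch, using sample.json as above):

if vulns=$(jq -e 'select(.vulnerabilities != []).vulnerabilities' sample.json)
then
  # jq exited with 0, i.e. a non-empty array was selected
  echo "Vulnerabilities found:"
  echo "$vulns"
fi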
You can use if-then-else.
Filter
if (.vulnerabilities | length) > 0 then {vulnerabilities} else empty end
Input
{
  "version": "1.1.1",
  "vulnerabilities": [
    {
      "id": "111",
      "category": "secret_detection"
    },
    {
      "id": "112",
      "category": "secret_detection"
    }
  ]
}
{
  "version": "1.2.1",
  "vulnerabilities": [
    {
      "id": "121",
      "category": "secret_detection 2"
    }
  ]
}
{
  "version": "3.1.1",
  "vulnerabilities": []
}
{
  "version": "4.1.1",
  "vulnerabilities": [
    {
      "id": "411",
      "category": "secret_detection 4"
    },
    {
      "id": "412",
      "category": "secret_detection"
    },
    {
      "id": "413",
      "category": "secret_detection"
    }
  ]
}
Output
{
  "vulnerabilities": [
    {
      "id": "111",
      "category": "secret_detection"
    },
    {
      "id": "112",
      "category": "secret_detection"
    }
  ]
}
{
  "vulnerabilities": [
    {
      "id": "121",
      "category": "secret_detection 2"
    }
  ]
}
{
  "vulnerabilities": [
    {
      "id": "411",
      "category": "secret_detection 4"
    },
    {
      "id": "412",
      "category": "secret_detection"
    },
    {
      "id": "413",
      "category": "secret_detection"
    }
  ]
}
Demo
https://jqplay.org/s/wicmr4uVRm

iterating through JSON files adding properties to each with jq

I am attempting to iterate through all my JSON files and add properties, but I am relatively new to jq.
Here is what I am attempting:
find hashlips_art_engine/build -type f -name '*.json' | jq '. + {
  "creators": [
    {
      "address": "4iUFmB3H3RZGRrtuWhCMtkXBT51iCUnX8UV7R8rChJsU",
      "share": 10
    },
    {
      "address": "2JApg1AXvo1Xvrk3vs4vp3AwamxQ1DHmqwKwWZTikS9w",
      "share": 45
    },
    {
      "address": "Zdda4JtApaPs47Lxs1TBKTjh1ZH2cptjxXMwrbx1CWW",
      "share": 45
    }
  ]
}'
However this is returning an error:
parse error: Invalid numeric literal at line 2, column 0
I have around 10,000 JSON files that I need to iterate over and add
{
  "creators": [
    {
      "address": "4iUFmB3H3RZGRrtuWhCMtkXBT51iCUnX8UV7R8rChJsU",
      "share": 10
    },
    {
      "address": "2JApg1AXvo1Xvrk3vs4vp3AwamxQ1DHmqwKwWZTikS9w",
      "share": 45
    },
    {
      "address": "Zdda4JtApaPs47Lxs1TBKTjh1ZH2cptjxXMwrbx1CWW",
      "share": 45
    }
  ]
}
to each of them. Is this possible, or am I barking up the wrong tree on this?
Thanks for your assistance with this. I have been searching the web for several hours now, but either my terminology is incorrect or there isn't much out there regarding this issue.
The problem is that you are piping the filenames to jq rather than making the contents available to jq.
Most likely you could use the following approach, e.g. if you want the augmented contents of each file to be handled separately:
find ... | while read -r f ; do jq ... "$f" ; done
An alternative that might be relevant would be:
jq ... $(find ...)
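For the use case in the question (adding the creators block to every file, in place), a minimal sketch along those lines might be as follows; note that jq has no in-place editing mode, so the temporary-file step is an assumption:

find hashlips_art_engine/build -type f -name '*.json' | while read -r f; do
  # Augment each file and write the result back via a temporary file
  jq '. + {creators: [
        {address: "4iUFmB3H3RZGRrtuWhCMtkXBT51iCUnX8UV7R8rChJsU", share: 10},
        {address: "2JApg1AXvo1Xvrk3vs4vp3AwamxQ1DHmqwKwWZTikS9w", share: 45},
        {address: "Zdda4JtApaPs47Lxs1TBKTjh1ZH2cptjxXMwrbx1CWW", share: 45}
      ]}' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

This assumes file names without embedded newlines; for full robustness, use find's -exec instead of the pipe.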
If you have 2 files:
file01.json :
{"a":"1","b":"2"}
file02.json :
{"x":"10","y":"12","z":"15"}
you can:
for f in file*.json; do jq '. + {creators: [{address: "xxx", share: 1}]}' "$f"; done
result:
{
  "a": "1",
  "b": "2",
  "creators": [
    {
      "address": "xxx",
      "share": 1
    }
  ]
}
{
  "x": "10",
  "y": "12",
  "z": "15",
  "creators": [
    {
      "address": "xxx",
      "share": 1
    }
  ]
}

how to denormalise this json structure

I have a JSON-formatted overview of backups, generated using pgbackrest. For simplicity I removed a lot of clutter so only the main structures remain. The list can contain multiple backup structures; I reduced it here to just one for simplicity.
[
  {
    "backup": [
      {
        "archive": {
          "start": "000000090000000200000075",
          "stop": "000000090000000200000075"
        },
        "info": {
          "size": 1200934840
        },
        "label": "20220103-122051F",
        "type": "full"
      },
      {
        "archive": {
          "start": "00000009000000020000007D",
          "stop": "00000009000000020000007D"
        },
        "info": {
          "size": 1168586300
        },
        "label": "20220103-153304F_20220104-081304I",
        "type": "incr"
      }
    ],
    "name": "dbname1"
  }
]
Using jq I tried to generate a simpler format out of this, so far without any luck.
What I would like to see is the backup.archive, backup.info, backup.label, backup.type, name combined in one simple structure, without getting into a cartesian product. I would be very happy to get the following output:
[
  {
    "backup": [
      {
        "archive": {
          "start": "000000090000000200000075",
          "stop": "000000090000000200000075"
        },
        "name": "dbname1",
        "info": {
          "size": 1200934840
        },
        "label": "20220103-122051F",
        "type": "full"
      },
      {
        "archive": {
          "start": "00000009000000020000007D",
          "stop": "00000009000000020000007D"
        },
        "name": "dbname1",
        "info": {
          "size": 1168586300
        },
        "label": "20220103-153304F_20220104-081304I",
        "type": "incr"
      }
    ]
  }
]
where name is redundantly added to the list. How can I use jq to convert the shown input to the requested output? In the end I just want to generate a simple csv from the data. Even with the simplified structure using
'.[].backup[].name + ":" + .[].backup[].type'
I get a cartesian product:
"dbname1:full"
"dbname1:full"
"dbname1:incr"
"dbname1:incr"
how to solve that?
So, for each object in the top-level array you want to pull in .name into each of its .backup array's elements, right? Then try
jq 'map(.backup[] += {name} | del(.name))'
Then, generating a CSV output using jq is easy: there is a builtin called @csv which transforms an array into a string of its values, quoted (if they are strings) and separated by commas. So, all you need to do is iteratively compose your desired values into arrays. At this point, removing .name is not necessary anymore, as we are piecing together the array for the CSV output anyway. And we're giving the -r flag to jq in order to make the output raw text rather than JSON.
jq -r '.[]
  | .backup[] + {name}
  | [(.archive | .start, .stop), .name, .info.size, .label, .type]
  | @csv
'
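For the sample input above, this produces:

"000000090000000200000075","000000090000000200000075","dbname1",1200934840,"20220103-122051F","full"
"00000009000000020000007D","00000009000000020000007D","dbname1",1168586300,"20220103-153304F_20220104-081304I","incr"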
First navigate to each array element, and only then "print" the stuff you're interested in; note that .name lives on the element itself, not inside .backup:
.[] | .name + ":" + .backup[].type
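which, for the sample input, yields:

"dbname1:full"
"dbname1:incr"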

Remove matching/non-matching elements of a nested array using jq

I need to split the results of a sonarqube analysis history into individual files. Assuming a starting input below,
{
  "paging": {
    "pageIndex": 1,
    "pageSize": 100,
    "total": 3
  },
  "measures": [
    {
      "metric": "coverage",
      "history": [
        {
          "date": "2018-11-18T12:37:08+0000",
          "value": "100.0"
        },
        {
          "date": "2018-11-21T12:22:39+0000",
          "value": "100.0"
        },
        {
          "date": "2018-11-21T13:09:02+0000",
          "value": "100.0"
        }
      ]
    },
    {
      "metric": "bugs",
      "history": [
        {
          "date": "2018-11-18T12:37:08+0000",
          "value": "0"
        },
        {
          "date": "2018-11-21T12:22:39+0000",
          "value": "0"
        },
        {
          "date": "2018-11-21T13:09:02+0000",
          "value": "0"
        }
      ]
    },
    {
      "metric": "vulnerabilities",
      "history": [
        {
          "date": "2018-11-18T12:37:08+0000",
          "value": "0"
        },
        {
          "date": "2018-11-21T12:22:39+0000",
          "value": "0"
        },
        {
          "date": "2018-11-21T13:09:02+0000",
          "value": "0"
        }
      ]
    }
  ]
}
How do I use jq to clean the results so it only retains the history array entries for each element? The desired output is something like this (output-20181118123808.json for analysis done on "2018-11-18T12:37:08+0000"):
{
  "paging": {
    "pageIndex": 1,
    "pageSize": 100,
    "total": 3
  },
  "measures": [
    {
      "metric": "coverage",
      "history": [
        {
          "date": "2018-11-18T12:37:08+0000",
          "value": "100.0"
        }
      ]
    },
    {
      "metric": "bugs",
      "history": [
        {
          "date": "2018-11-18T12:37:08+0000",
          "value": "0"
        }
      ]
    },
    {
      "metric": "vulnerabilities",
      "history": [
        {
          "date": "2018-11-18T12:37:08+0000",
          "value": "0"
        }
      ]
    }
  ]
}
I am lost on how to operate only on the sub-elements while leaving the parent structure intact. The naming of the JSON file is going to be handled externally from the jq utility. The sample data provided will be split into 3 files. Some other input can have a variable number of entries, some may be up to 10000. Thanks.
Here is a solution which uses awk to write the distinct files. The solution assumes that the dates for each measure are the same and in the same order, but imposes no limit on the number of distinct dates, or the number of distinct measures.
jq -cr 'range(0; .measures[0].history|length) as $i
  | (.measures[0].history[$i].date | gsub("[^0-9]";"")),   # basis of filename
    reduce range(0; .measures|length) as $j (.;
      .measures[$j].history |= [.[$i]])' input.json |
awk -F\\t 'fn {print >> fn; fn=""; next} {fn="output-" $1 ".json"}'
Comments
The choice of awk here is just for convenience.
The disadvantage of this approach is that if each file is to be neatly formatted, an additional run of a pretty-printer (such as jq) would be required for each file. Thus, if the output in each file is required to be neat, a case could be made for running jq once for each date, thus obviating the need for the post-processing (awk) step.
If the dates of the measures are not in lock-step, then the same approach as above could still be used, but of course the gathering of the dates and the corresponding measures would have to be done differently.
Output
The first two lines produced by the invocation of jq above are as follows:
"201811181237080000"
{"paging":{"pageIndex":1,"pageSize":100,"total":3},"measures":[{"metric":"coverage","history":[{"date":"2018-11-18T12:37:08+0000","value":"100.0"}]},{"metric":"bugs","history":[{"date":"2018-11-18T12:37:08+0000","value":"0"}]},{"metric":"vulnerabilities","history":[{"date":"2018-11-18T12:37:08+0000","value":"0"}]}]}
In the comments, the following addendum to the original question appeared:
is there a variation wherein the filtering is based on the date value and not the position? It is not guaranteed that the order will be the same or the number of elements in each metric is going to be the same (i.e. some dates may be missing "bugs", some might have additional metric such as "complexity").
The following will produce a stream of JSON objects, one per date. This stream can be annotated with the date as per my previous answer, which shows how to use these annotations to create the various files. For ease of understanding, we use two helper functions:
def dates:
  INDEX(.measures[].history[].date; .)
  | keys;

def gather($date): map(select(.date==$date));

dates[] as $date
| .measures |= map( .history |= gather($date) )
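To also emit the filename basis in front of each object, as in the first solution, the stream can be annotated along the same lines (a sketch; it reuses the awk post-processing step shown earlier):

jq -cr '
  def dates: INDEX(.measures[].history[].date; .) | keys;
  def gather($date): map(select(.date==$date));
  dates[] as $date
  | ($date | gsub("[^0-9]";"")),                # basis of filename
    (.measures |= map(.history |= gather($date)))
' input.json |
awk -F\\t 'fn {print >> fn; fn=""; next} {fn="output-" $1 ".json"}'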
INDEX/2
If your jq does not have INDEX/2, now would be an excellent time to upgrade, but in case that's not feasible, here is its def:
def INDEX(stream; idx_expr):
  reduce stream as $row ({};
    .[$row|idx_expr|
      if type != "string" then tojson
      else .
      end] |= $row);

jq update contents of one file to another as key value

I am trying to update the branch2 values in branches.json from the Employes values in branch2.json.
Using jq, how can I merge the content of one file into another file?
Below are the files.
I have tried this, but it did not work; it just prints the original data without updating the details:
#!/bin/sh
#call file with branch name for example ./update.sh branch2
set -xe
branchName=$1
fullPath=`pwd`/$1".json"
list=$(cat ${fullPath})
branchDetails=$(echo ${list} | /usr/local/bin/jq -r '.Employes')
newJson=$(cat branches.json |
  jq --arg updateKey "$1" --arg updateValue "$branchDetails" 'to_entries |
    map(if .key == "$updateKey"
        then . + {"value":"$updateValue"}
        else .
        end) |
    from_entries')
echo $newJson &> results.json
branch1.json
{
  "Employes": [
    {
      "Name": "Ikon",
      "age": "30"
    },
    {
      "Name": "Lenon",
      "age": "35"
    }
  ]
}
branch2.json
{
  "Employes": [
    {
      "Name": "Ken",
      "age": "40"
    },
    {
      "Name": "Frank",
      "age": "23"
    }
  ]
}
branches.json / results.json format
{
  "branch1": [
    {
      "Name": "Ikon",
      "age": "30"
    },
    {
      "Name": "Lenon",
      "age": "35"
    }
  ],
  "branch2": [
    {
      "Name": "Ken",
      "age": "40"
    },
    {
      "Name": "Frank",
      "age": "23"
    }
  ]
}
Note: I don't have the list of all the branch files at any given point, so the script is responsible only for updating that branch's details.
If the file name is the name of the property you want to update, you could utilize input_filename to select the files. No testing needed, just pass in the files you want to update. Just be aware of the order you pass in the input files.
Merge the contents of the file as you see fit. To simply replace, just do a plain assignment.
$ jq 'reduce inputs as $i (.;
        .[input_filename | rtrimstr(".json")] = $i.Employes
      )' branches.json branch{1,2}.json
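To capture the result in a file, redirect the output. One caveat: input_filename returns the path exactly as given on the command line, so this sketch assumes bare file names in the current directory (a path like ./branch2.json would produce the key "./branch2"):

$ jq 'reduce inputs as $i (.;
        .[input_filename | rtrimstr(".json")] = $i.Employes
      )' branches.json branch{1,2}.json > results.json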
Your script would just need to be:
#!/bin/sh
#call file with branch name for example ./update.sh branch2
set -xe
branchName=$1
newJson=$(jq 'reduce inputs as $i (.; .[input_filename | rtrimstr(".json")] = $i.Employes)' branches.json "$branchName.json")
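followed by writing the result out to results.json, as in the original script:

printf '%s\n' "$newJson" > results.json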