How to merge objects with a unique InstanceId in a bash shell? - json

I have two JSON files, as below:
I want to merge the objects in tmp1.json and tmp2.json on the unique InstanceId value in a bash shell.
I have tried jq with the --argjson option, but my jq 1.4 does not support it. Sorry, I am unable to update jq to version 1.5.
# cat tmp1.json
{
  "VolumeId": "vol-046e0be08ac95095a",
  "Instances": [
    {
      "InstanceId": "i-020ce1b2ad08fa6bd"
    }
  ]
}
{
  "VolumeId": "vol-007253a7d24c1c668",
  "Instances": [
    {
      "InstanceId": "i-0c0650c15b099b993"
    }
  ]
}
# cat tmp2.json
{
  "InstanceId": "i-0c0650c15b099b993",
  "InstanceName": "Test1"
}
{
  "InstanceId": "i-020ce1b2ad08fa6bd",
  "InstanceName": "Test"
}
My desired output is:
{
  "VolumeId": "vol-046e0be08ac95095a",
  "Instances": [
    {
      "InstanceId": "i-020ce1b2ad08fa6bd",
      "InstanceName": "Test"
    }
  ]
}
{
  "VolumeId": "vol-007253a7d24c1c668",
  "Instances": [
    {
      "InstanceId": "i-0c0650c15b099b993",
      "InstanceName": "Test1"
    }
  ]
}

#!/bin/bash
JQ=jq-1.4
# For ease of understanding, the following is a bit more verbose than
# necessary.
# One way to get around the constraints of using jq 1.4 is
# to use the "slurp" option so that the contents of the two files can
# be kept separately.
# Note that jq 1.6 includes the following def of INDEX, but we can use it with jq 1.4.
($JQ -s . tmp1.json ; $JQ -s . tmp2.json) | $JQ -s '
  def INDEX(stream; idx_expr):
    reduce stream as $row ({};
      .[$row | idx_expr |
         if type != "string" then tojson
         else .
         end] |= $row);
  .[0] as $tmp1
  | .[1] as $tmp2
  | INDEX($tmp2[]; .InstanceId) as $dict
  | $tmp1
  | map( .Instances |= map(.InstanceName = $dict[.InstanceId].InstanceName) )
  | .[]
'
Streamlined
INDEX(.[1][]; .InstanceId) as $dict
| .[0][]
| .Instances |= map(.InstanceName = $dict[.InstanceId].InstanceName)
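For reference, here is how the streamlined filter might be invoked end to end (a sketch; it still relies on the INDEX def above, and on the same two-slurp pipeline since jq 1.4 lacks the newer command-line options such as --argjson):
($JQ -s . tmp1.json ; $JQ -s . tmp2.json) | $JQ -s '
  def INDEX(stream; idx_expr):
    reduce stream as $row ({};
      .[$row | idx_expr |
         if type != "string" then tojson
         else .
         end] |= $row);
  INDEX(.[1][]; .InstanceId) as $dict
  | .[0][]
  | .Instances |= map(.InstanceName = $dict[.InstanceId].InstanceName)
'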

Minify the two JSON files, then try the following command:
cat tmp2.json \
  | jq -r '"\(.InstanceId) \(.InstanceName)"' \
  | xargs -n2 sh -c 'cat tmp1.json | jq "if .Instances[0].InstanceId==\"$0\" then .Instances[0].InstanceName=\"$1\" else empty end"'
Here is the output:
{
  "VolumeId": "vol-007253a7d24c1c668",
  "Instances": [
    {
      "InstanceId": "i-0c0650c15b099b993",
      "InstanceName": "Test1"
    }
  ]
}
{
  "VolumeId": "vol-046e0be08ac95095a",
  "Instances": [
    {
      "InstanceId": "i-020ce1b2ad08fa6bd",
      "InstanceName": "Test"
    }
  ]
}

Related

Convert JSON to CSV with a jq filter

I have a JSON file with this content and want to convert it to CSV as shown below:
{
  "fields": [
    {
      "id": 17,
      "name": "Business Division",
      "values": [
        {
          "id": 131,
          "name": "Accounting",
          "industry": [
            "Accounting"
          ]
        }
      ]
    },
    {
      "id": 16,
      "name": "Cancellation Reason",
      "values": [
        {
          "id": 114,
          "name": "Forgot"
        }
      ]
    }
  ]
}
CSV File format:
17,Business Division,131,Accounting,Accounting
16,Cancellation Reason,114,Forgot
I ran this command on the terminal:
jq -M -r -f industry.jq source.json |tr -d '"' >source.csv
This is the content of the industry.jq file that is used as the filter:
.fields[]
| .values[] as $e
| $e.industry[]? as $s
| [.id, .name, $e.id, $e.name, $s?]
| @csv
As a result, the second line of the CSV file did not print.
I think this is because the .industry[] object is not available in the second object of my JSON.
How can I print the above JSON in the needed format?
.fields[] | [ .id, .name, .values[].id, .values[].name, .values[].industry[]? ] | @csv
Will produce
17,"Business Division",131,"Accounting","Accounting"
16,"Cancellation Reason",114,"Forgot"
When invoked like:
jq --raw-output '.fields[] | [ .id, .name, .values[].id, .values[].name, .values[].industry[]? ] | @csv'
Multiple industries will be appended at the end of the row.
Try it online
I'd use try ... catch ... to make the value of the default explicit, e.g.
(try $e.industry[] catch null) as $s
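Dropped into the original industry.jq, that might look like this (a sketch; when industry is absent the catch yields null, which @csv renders as an empty trailing field):
.fields[]
| .values[] as $e
| (try $e.industry[] catch null) as $s
| [.id, .name, $e.id, $e.name, $s]
| @csv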

Create JSON with jq

I am trying to create a JSON file with jq from the output of the command lsb_release.
What I have done:
if [ -x "$(command -v lsb_release)" ]; then
  lsb_release -a | jq --raw-input 'split("\t") | { (.[0]) : .[1] }' > ubuntu_release.json
fi
The result is:
{
  "Distributor ID:": "Ubuntu"
}
{
  "Description:": "Ubuntu 20.04.3 LTS"
}
{
  "Release:": "20.04"
}
{
  "Codename:": "focal"
}
but I want this result:
[
  {
    "Distributor ID:": "Ubuntu"
  },
  {
    "Description:": "Ubuntu 20.04.3 LTS"
  },
  {
    "Release:": "20.04"
  },
  {
    "Codename:": "focal"
  }
]
Can anybody help me? :)
Usually, when we want to create an array from a stream of inputs, we can use --slurp/-s. But when combined with --raw-input/-R, this causes the entire input to be provided as a single string (that contains line feeds).
Slurping can also be achieved using --null-input/-n and [ inputs | ... ]. And this works as desired for text files.
jq -nR '[ inputs | split("\t") | { (.[0]) : .[1] } ]'
Demo on jqplay
That said, I suspect you will find the following output format more useful:
{
  "Distributor ID": "Ubuntu",
  "Description": "Ubuntu 20.04.3 LTS",
  "Release": "20.04",
  "Codename": "focal"
}
This can be achieved by simply adding | add. (Note that the command below also splits on ":\t" rather than "\t", so the trailing colons are dropped from the keys.)
jq -nR '[ inputs | split(":\t") | { (.[0]) : .[1] } ] | add'
Demo on jqplay
One can also use reduce.
jq -nR 'reduce ( inputs | split(":\t") ) as [ $k, $v ] ( {}; . + { ($k): $v } )'
Demo on jqplay
Filter
reduce (inputs / ":\t") as [$key, $value] ({}; .+{($key): $value})
Input
Distributor ID: Ubuntu
Description: Ubuntu 20.04.3 LTS
Release: 20.04
Codename: focal
Output
{
  "Distributor ID": "Ubuntu",
  "Description": "Ubuntu 20.04.3 LTS",
  "Release": "20.04",
  "Codename": "focal"
}
Note that reduce reads each line from inputs, destructures it into $key and $value, and merges the resulting pairs into a single object.
Demo
https://jqplay.org/s/ZBvKf6vQ0F
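Putting one of these filters together with the guard from the question gives a complete script (a sketch; it assumes lsb_release writes tab-separated "Key:<TAB>Value" lines to stdout, and discards any diagnostics printed to stderr):
#!/bin/bash
# Write the release info to ubuntu_release.json as a single object,
# assuming lsb_release emits "Key:<TAB>Value" lines on stdout.
if [ -x "$(command -v lsb_release)" ]; then
  lsb_release -a 2>/dev/null |
    jq -nR 'reduce (inputs / ":\t") as [$key, $value] ({}; . + {($key): $value})' \
    > ubuntu_release.json
fi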

Update one JSON file values with values from another JSON using JQ (on all levels)

I have two JSON files:
source.json:
{
  "general": {
    "level1": {
      "key1": "x-x-x-x-x-x-x-x",
      "key3": "z-z-z-z-z-z-z-z",
      "key4": "w-w-w-w-w-w-w-w"
    },
    "another": {
      "key": "123456",
      "comments": {
        "one": "111",
        "other": "222"
      }
    }
  },
  "title": "The best"
}
and the
target.json:
{
  "general": {
    "level1": {
      "key1": "xxxxxxxx",
      "key2": "yyyyyyyy",
      "key3": "zzzzzzzz"
    },
    "onemore": {
      "kkeeyy": "0000000"
    }
  },
  "specific": {
    "stuff": "test"
  },
  "title": {
    "one": "one title",
    "other": "other title"
  }
}
I need all the values for keys which exist in both files, copied from source.json to target.json, considering all the levels.
I've seen and tested the solution from this post.
It only copies the first level of keys, and I couldn't get it to do what I need.
The result from solution in this post, looks like this:
{
  "general": {
    "level1": {
      "key1": "x-x-x-x-x-x-x-x",
      "key3": "z-z-z-z-z-z-z-z",
      "key4": "w-w-w-w-w-w-w-w"
    },
    "another": {
      "key": "123456",
      "comments": {
        "one": "111",
        "other": "222"
      }
    }
  },
  "specific": {
    "stuff": "test"
  },
  "title": "The best"
}
Everything under the "general" key was copied as is.
What I need, is this:
{
  "general": {
    "level1": {
      "key1": "x-x-x-x-x-x-x-x",
      "key2": "yyyyyyyy",
      "key3": "z-z-z-z-z-z-z-z"
    },
    "onemore": {
      "kkeeyy": "0000000"
    }
  },
  "specific": {
    "stuff": "test"
  },
  "title": {
    "one": "one title",
    "other": "other title"
  }
}
Only "key1" and "key3" should be copied.
Keys in target JSON must not be deleted and new keys should not be created.
Can anyone help?
One approach you could take is get all the paths to all scalar values for each input and take the set intersections. Then copy values from source to target from those paths.
First we'll need an intersect function (which was surprisingly difficult to craft):
def set_intersect($other):
  (map({ ($other[] | tojson): true }) | add) as $o
  | reduce (.[] | tojson) as $v ({}; if $o[$v] then .[$v] = true else . end)
  | keys_unsorted
  | map(fromjson);
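As a quick sanity check of set_intersect (hypothetical values; note that elements are compared via their tojson serialization, so order within each path matters):
jq -nc '
  def set_intersect($other):
    (map({ ($other[] | tojson): true }) | add) as $o
    | reduce (.[] | tojson) as $v ({}; if $o[$v] then .[$v] = true else . end)
    | keys_unsorted
    | map(fromjson);
  [["a","b"], ["c"], ["d"]] | set_intersect([["c"], ["d"], ["e"]])
'
# => [["c"],["d"]]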
Then to do the update:
$ jq --argfile s source.json '
  reduce ([paths(scalars)] | set_intersect([$s | paths(scalars)])[]) as $p (.;
    setpath($p; $s | getpath($p))
  )
' target.json
[Note: this response answers the original question, with respect to the original data. The OP may have had paths in mind rather than keys.]
There is no need to compute the intersection to achieve a reasonably efficient solution.
First, let's hypothesize the following invocation of jq:
jq -n --argfile source source.json --argfile target target.json -f copy.jq
In the file copy.jq, we can begin by defining a helper function:
# emit an array of the distinct terminal keys in the input entity
def keys: [paths | .[-1] | select(type=="string")] | unique;
In order to inspect all the paths to leaf elements of $source, we can use tostream:
($target | keys) as $t
| reduce ($source|tostream|select(length==2)) as [$p,$v]
    ($target;
     if $t|index($p[-1]) then setpath($p; $v) else . end)
Alternatives
Since $t is sorted, it would (at least in theory) make sense to use bsearch instead of index:
bsearch($p[-1]) > -1
Also, instead of tostream we could use paths(scalars).
Putting these alternatives together:
($target | keys) as $t
| reduce ($source|paths(scalars)) as $p
    ($target;
     if $t|bsearch($p[-1]) > -1
     then setpath($p; $source|getpath($p))
     else . end)
Output
{
  "general": {
    "level1": {
      "key1": "x-x-x-x-x-x-x-x",
      "key2": "yyyyyyyy",
      "key3": "z-z-z-z-z-z-z-z"
    },
    "onemore": {
      "kkeeyy": "0000000"
    }
  },
  "specific": {
    "stuff": "test"
  }
}
The following provides a solution to the revised question, which is actually about "paths" rather than "keys".
([$target|paths(scalars)] | unique) as $paths
| reduce ($source|paths(scalars)) as $p
    ($target;
     if $paths | bsearch($p) > -1
     then setpath($p; $source|getpath($p))
     else . end)
unique is called so that binary search can be used subsequently.
Invocation:
jq -n --argfile source source.json --argfile target target.json -f program.jq
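If your jq build no longer accepts --argfile (it has long been marked deprecated in favor of --slurpfile), the same program can be adapted by dereferencing the slurped one-element arrays (a sketch):
jq -n --slurpfile source source.json --slurpfile target target.json '
  $source[0] as $source | $target[0] as $target
  | ([$target | paths(scalars)] | unique) as $paths
  | reduce ($source | paths(scalars)) as $p
      ($target;
       if $paths | bsearch($p) > -1
       then setpath($p; $source | getpath($p))
       else . end)
'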

jq: translate array of objects to object

I have a response from curl in a format like this:
[
  {
    "list": [
      {
        "value": 1,
        "id": 12
      },
      {
        "value": 15,
        "id": 13
      },
      {
        "value": -4,
        "id": 14
      }
    ]
  },
  ...
]
Given a mapping between ids like this:
{
  "12": "newId1",
  "13": "newId2",
  "14": "newId3"
}
I want to make this:
[
  {
    "list": {
      "newId1": 1,
      "newId2": 15,
      "newId3": -4
    }
  },
  ...
]
Such that I get a mapping from ids to values (and along the way I'd like to remap the ids).
I've been working at this for a while, and every time I hit a dead end.
Note: I can use Shell or the like to preform loops if necessary.
Edit: Here's one version of what I've developed so far:
jq '[].list.id = ($mapping.[] | select(.id == key)) | del(.id)' -M --argjson "mapping" "$mapping"
I don't think it's the best one, but I'm looking to see if I can find an old version that was closer to what I need.
[EDIT: The following response was in answer to the question when it described (a) the mapping as shown below, and (b) the input data as having the form:
[
  {
    "list": [
      {
        "value": 1,
        "id1": 12
      },
      {
        "value": 15,
        "id2": 13
      },
      {
        "value": -4,
        "id3": 14
      }
    ]
  }
]
END OF EDIT]
In the following I'll assume that the mapping is available via the following function, but that is an inessential assumption:
def mapping: {
  "id1": "newId1",
  "id2": "newId2",
  "id3": "newId3"
};
The following jq filter will then produce the desired output:
map( .list
     |= (map( to_entries[]
              | (mapping[.key]) as $mapped
              | select($mapped)
              | {($mapped|tostring): .value} )
         | add) )
There's plenty of ways to skin a cat. I'd do it like this:
.[].list |= reduce .[] as $i ({};
  ($i.id|tostring) as $k
  | (select($mapping | has($k))[$mapping[$k]] = $i.value) // .
)
You would just provide the mapping through a separate file or argument.
$ cat program.jq
.[].list |= reduce .[] as $i ({};
  ($i.id|tostring) as $k
  | (select($mapping | has($k))[$mapping[$k]] = $i.value) // .
)
$ cat mapping.json
{
  "12": "newId1",
  "13": "newId2",
  "14": "newId3"
}
$ jq --argfile mapping mapping.json -f program.jq input.json
[
  {
    "list": {
      "newId1": 1,
      "newId2": 15,
      "newId3": -4
    }
  }
]
Here is a reduce-free solution to the revised problem.
In the following I'll assume that the mapping is available via the following function, but that is an inessential assumption:
def mapping:
  {
    "12": "newId1",
    "13": "newId2",
    "14": "newId3"
  };

map( .list
     |= (map( mapping[.id|tostring] as $mapped
              | select($mapped)
              | {($mapped): .value} )
         | add) )
The "select" is for safety (i.e., it checks that the .id under consideration is indeed mapped). It might also be appropriate to ensure that $mapped is a string by writing {($mapped|tostring): .value}.

CSV to JSON using BASH

I am trying to convert the below CSV into JSON format.
Africa,Kenya,NAI,281
Africa,Kenya,NAI,281
Asia,India,NSI,100
Asia,India,BSE,160
Asia,Pakistan,ISE,100
Asia,Pakistan,ANO,100
European Union,United Kingdom,LSE,100
This is the desired JSON format, and I just cannot manage to create it. I will post my work in progress below. Any help or direction would be appreciated.
{"name":"Africa",
"children":[
{"name":"Kenya",
"children":[
{"name":"NAI","size":"109"},
{"name":"NAA","size":"160"}]}]},
{"name":"Asia",
"children":[
{"name":"India",
"children":[
{"name":"NSI","size":"100"},
{"name":"BSE","size":"60"}]},
{"name":"Pakistan",
"children":[
{"name":"ISE","size":"120"},
{"name":"ANO","size":"433"}]}]},
{"name":"European Union",
"children":[
{"name":"United Kingdom",
"children":[
{"name":"LSE","size":"550"},
{"name":"PLU","size":"123"}]}]}
Work in Progress.
$1 is the file with the csv values pasted above.
#!/bin/bash
pcountry=$(head -1 $1 | cut -d, -f2)
cat $1 | while read line ; do
  region=$(echo $line | cut -d, -f1)
  country=$(echo $line | cut -d, -f2)
  code=$(echo $line | cut -d, -f3-)
  size=$(echo $line | cut -d, -f4)
  if test "$pcountry" == "$country" ; then
    echo -e {\"name\":\"$region\", '\n' \"children\": [ '\n'{\"name\":\"$country\",'\n'\"children\": [ '\n' \{\"name\":\"NAI\",\"size\":\"$size\"\}
  else
    if test "$pregion" == "$region" ; then
      :
    else
      echo -e ,'\n'{\"name\":\"$region\", '\n' \"children\": [ '\n'{\"name\":\"$country\",'\n'\"children\": [ '\n' \{\"name\":\"NAI\",\"size\":\"$size\"\},
      pcountry=$country
      pregion=$region
    fi
  fi
done
The problem is that I cannot seem to find a way to detect where a country's values end.
As a number of the commenters have said, using the shell for this kind of conversion is a horrible idea, and it would be nigh impossible with just bash builtins; shell scripts are meant to combine standard Unix commands like sed, awk, and cut anyway. You should choose a language built for this kind of iterative parsing/processing to solve your problem.
However, because it's late and I've had too much coffee, I threw together a bash script (with a few bits of sed thrown in for parsing help) that takes your example .csv data and outputs the JSON in the format you noted. Here's the script:
#! /bin/bash
# Initial input file format:
#
#   Africa,Kenya,NAI,281
#   Africa,Kenya,NAA,281
#   Asia,India,NSI,100
#   Asia,India,BSE,160
#   Asia,Pakistan,ISE,100
#   Asia,Pakistan,ANO,100
#   European Union,United Kingdom,LSE,100
#
# Intermediate file format for parsing to JSON:
#
#   Africa|Kenya:NAI=281
#   Asia|India:BSE=160&NSI=100|Pakistan:ISE=100&ANO=100
#   European Union|United Kingdom:LSE=100
#
# Call as:
#
#   $ ./script INPUTFILE.csv >OUTPUTFILE.json

# temporary files for output/parsing
TMP="./tmp.dat"
TMP2="./tmp2.dat"
>$TMP
>$TMP2

# read through initial file and output intermediate format
while read line
do
  region=$(echo $line | cut -d, -f1)
  country=$(echo $line | cut -d, -f2)
  code=$(echo $line | cut -d, -f3)
  size=$(echo $line | cut -d, -f4)

  # region record already started
  if grep "^$region" $TMP 2>&1 >/dev/null ; then
    >$TMP2
    while read rec
    do
      if echo $rec | grep "^$region" 2>&1 >/dev/null
      then
        if echo "$rec" | grep "\|$country:" 2>&1 >/dev/null
        then
          echo "$rec" | sed -e 's/\('"$country"':[^\|][^\|]*\)/\1\&'"$code"'='"$size"'/' >>$TMP2
        else
          echo "$rec|$country:$code=$size" >>$TMP2
        fi
      else
        echo $rec >>$TMP2
      fi
    done < $TMP
    mv $TMP2 $TMP
  else
    # new region
    echo "$region|$country:$code=$size" >>$TMP
  fi
done < $1

# Parse through our intermediary format and output JSON to standard out
echo "["
country_count=$(cat $TMP | wc -l)
while read line
do
  country=$(echo $line | cut -d\| -f1)
  echo "{ \"name\": \"$country\", "
  echo " \"children\": ["
  region_count=$(echo $line | cut -d\| -f2- | sed -e 's/|/\n/g' | wc -l)
  echo $line | cut -d\| -f2- | sed -e 's/|/\n/g' |
    while read region
    do
      name=$(echo $region | cut -d: -f1)
      echo " { \"name\": \"$name\", "
      echo " \"children\": ["
      code_count=$(echo $region | sed -e 's/^'"$name"'://' -e 's/&/\n/g' | wc -l)
      echo $region | sed -e 's/^'"$name"'://' -e 's/&/\n/g' |
        while read code_size
        do
          code=$(echo $code_size | cut -d= -f1)
          size=$(echo $code_size | cut -d= -f2)
          code_count=$((code_count - 1))
          COMMA=""
          if [ $code_count -gt 0 ]; then
            COMMA=","
          fi
          echo " { \"name\": \"$code\", \"size\": \"$size\" }$COMMA "
        done
      echo " ]"
      region_count=$((region_count - 1))
      if [ $region_count -gt 0 ]; then
        echo " },"
      else
        echo " }"
      fi
    done
  echo " ]"
  country_count=$((country_count - 1))
  COMMA=""
  if [ $country_count -gt 0 ]; then
    COMMA=","
  fi
  echo "}$COMMA"
done < $TMP
echo "]"

exit 0
And, here's the resulting output from the above script:
[
  { "name": "Africa",
    "children": [
      { "name": "Kenya",
        "children": [
          { "name": "NAI", "size": "281" },
          { "name": "NAA", "size": "281" }
        ]
      }
    ]
  },
  { "name": "Asia",
    "children": [
      { "name": "India",
        "children": [
          { "name": "NSI", "size": "100" },
          { "name": "BSE", "size": "160" }
        ]
      },
      { "name": "Pakistan",
        "children": [
          { "name": "ISE", "size": "100" },
          { "name": "ANO", "size": "100" }
        ]
      }
    ]
  },
  { "name": "European Union",
    "children": [
      { "name": "United Kingdom",
        "children": [
          { "name": "LSE", "size": "100" }
        ]
      }
    ]
  }
]
Please don't use code like the above in any production environment.
Here is a solution using jq.
If filter.jq contains the following filter
reduce (
  split("\n")[]                         # split string into lines
  | split(",")                          # split data
  | select(length>0)                    # eliminate blanks
) as [$c1,$c2,$c3,$c4] (                # convert to object
  {}                                    # e.g. "Africa": { "Kenya": {
  ; setpath([$c1,$c2,"name"];$c3)       #        "name": "NAI",
  | setpath([$c1,$c2,"size"];$c4)       #        "size": "281"
)                                       #      }, }
| [                                     # then build final array of objects format:
    keys[] as $k1                       # [ {
    | {name: $k1, children: (           #     "name": "Africa",
        .[$k1]                          #     "children": {
        | keys[] as $k2                 #       "name": "Kenya",
        | {name: $k2, children:.[$k2]}  #       "children": { "name": "NAI", "size": "281" }
      )}                                #   ...
  ]
and data contains the sample data, then the command
$ jq -M -Rsr -f filter.jq data
produces
[
  {
    "name": "Africa",
    "children": {
      "name": "Kenya",
      "children": {
        "name": "NAI",
        "size": "281"
      }
    }
  },
  {
    "name": "Asia",
    "children": {
      "name": "India",
      "children": {
        "name": "BSE",
        "size": "160"
      }
    }
  },
  {
    "name": "Asia",
    "children": {
      "name": "Pakistan",
      "children": {
        "name": "ANO",
        "size": "100"
      }
    }
  },
  {
    "name": "European Union",
    "children": {
      "name": "United Kingdom",
      "children": {
        "name": "LSE",
        "size": "100"
      }
    }
  }
]
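Note that this output nests only a single child object at each level and repeats Asia, rather than collecting children into arrays as in the desired format. A group_by-based variant of the filter could produce the nested-array shape (a sketch, run with the same jq -M -Rsr invocation):
[ split("\n")[] | select(length>0) | split(",") ]   # rows as [region, country, code, size]
| group_by(.[0])                                    # group rows by region
| map({ name: .[0][0],
        children: (group_by(.[1])                   # group each region's rows by country
                   | map({ name: .[0][1],
                           children: map({name: .[2], size: .[3]}) })) })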
You'd be much better off using a tool like xidel that can manipulate CSV / raw text and understands JSON.
I'm going to assume so_24300508.csv:
Africa,Kenya,NAI,109
Africa,Kenya,NAA,160
Asia,India,NSI,100
Asia,India,BSE,60
Asia,Pakistan,ISE,120
Asia,Pakistan,ANO,433
European Union,United Kingdom,LSE,550
European Union,United Kingdom,PLU,123
(this is extracted from your JSON sample instead of the CSV sample you provided)
xidel -s so_24300508.csv --json-mode=deprecated --xquery '
  [
    let $csv:=x:lines($raw)
    for $region in distinct-values($csv ! tokenize(.,",")[1])
    return {
      "name":$region,
      "children":[
        for $country in distinct-values($csv[starts-with(.,$region)] ! tokenize(.,",")[2])
        return {
          "name":$country,
          "children":for $data in $csv[starts-with(.,$region) and contains(.,$country)]
                     let $value:=tokenize($data,",")
                     return {
                       "name":$value[3],
                       "size":$value[4]
                     }
        }
      ]
    }
  ]
'
(without --json-mode=deprecated, replace [ ] with array{ })
See this code snippet for intermediate steps leading to this query.
Also see this online xidelcgi demo.
Output:
[
  {
    "name": "Africa",
    "children": [
      {
        "name": "Kenya",
        "children": [
          {
            "name": "NAI",
            "size": "109"
          },
          {
            "name": "NAA",
            "size": "160"
          }
        ]
      }
    ]
  },
  {
    "name": "Asia",
    "children": [
      {
        "name": "India",
        "children": [
          {
            "name": "NSI",
            "size": "100"
          },
          {
            "name": "BSE",
            "size": "60"
          }
        ]
      },
      {
        "name": "Pakistan",
        "children": [
          {
            "name": "ISE",
            "size": "120"
          },
          {
            "name": "ANO",
            "size": "433"
          }
        ]
      }
    ]
  },
  {
    "name": "European Union",
    "children": [
      {
        "name": "United Kingdom",
        "children": [
          {
            "name": "LSE",
            "size": "550"
          },
          {
            "name": "PLU",
            "size": "123"
          }
        ]
      }
    ]
  }
]