jq: add fields from one file into another file (JSON)

I have 2 files (which are quite long):
file1.json (540 objects; I'll show just 2 mock entries for brevity)
[
  {
    "a": "apple",
    "b": "banana",
    "c": ["car1", "car2", "car3"],
    "d": ["doodle1", "doodle2", "doodle3"],
    "e": "elephant"
  },
  {
    "a": "aqua",
    "b": "bay",
    "c": ["carrot", "chile", "cucumber"],
    "d": ["dice", "drop", "dang"],
    "e": "elastic"
  }
]
file2.json (540 objects; again, just 2 mock entries)
[
  {
    "l": ["link1", "link2", "link3"]
  },
  {
    "l": ["link4", "link5", "link6"]
  }
]
Expected result:
[
  {
    "a": "apple",
    "b": "banana",
    "c": ["car1", "car2", "car3"],
    "d": ["doodle1", "doodle2", "doodle3"],
    "e": "elephant",
    "l": ["link1", "link2", "link3"]
  },
  {
    "a": "aqua",
    "b": "bay",
    "c": ["carrot", "chile", "cucumber"],
    "d": ["dice", "drop", "dang"],
    "e": "elastic",
    "l": ["link4", "link5", "link6"]
  }
]
Is it possible to achieve this with jq, or should I process it with another programming language like Python or JavaScript?

jq is a perfect choice for all kinds of JSON processing. In this case, you can read the arrays from both files into one array, transpose it so corresponding items are paired up, then merge each aligned pair using add:
jq -n '[inputs] | transpose | map(add)' file1.json file2.json
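To see how the pipeline works, here is each intermediate step on tiny stand-in arrays (not the real data); the # comments show what each stage produces:
echo '[{"a":1},{"a":2}] [{"l":"x"},{"l":"y"}]' | jq -nc '
  [inputs]    # [[{"a":1},{"a":2}],[{"l":"x"},{"l":"y"}]]
| transpose   # [[{"a":1},{"l":"x"}],[{"a":2},{"l":"y"}]]
| map(add)    # [{"a":1,"l":"x"},{"a":2,"l":"y"}]
'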


having trouble converting csv to json using bash jq

I have some stream output stored in a CSV file, and I need help converting it to JSON.
My CSV looks like this:
cat output.csv
"k","a1",1,"b1","c1","d1",1
"l","a2",2,"b2","c2","d2",2
"m","a3",3,"b3","c3","d3",3
"n","a4",4,"b4","c4","d4",4
"o","a5",5,"b5","c5","d5",5
Required output (note: I need the top-level key configuration added to the JSON):
{
  "configuration": {
    "k": {
      "a": "a1",
      "number1": "1",
      "c": "b1",
      "d": "c1",
      "e": "d1",
      "number2": "1"
    },
    "l": {
      "a": "a2",
      "number1": "2",
      "c": "b2",
      "d": "c2",
      "e": "d2",
      "number2": "2"
    },
    ...
  }
}
So far I've tried this with jq; my filter is:
cat api.jq
[
  inputs |
  split(",") |
  map(ltrimstr("\"")) |
  map(rtrimstr("\"")) |
  {
    a: .[1],
    number1: .[2],
    c: .[3],
    d: .[4],
    e: .[5],
    number2: .[6]
  }
] | {configuration: .}
Output:
jq -nRf api.jq output.csv
{
  "configuration": [
    {
      "a": "a1",
      "number1": "1",
      "c": "b1",
      "d": "c1",
      "e": "d1",
      "number2": "1"
    },
    {
      "a": "a2",
      "number1": "2",
      "c": "b2",
      "d": "c2",
      "e": "d2",
      "number2": "2"
    },
    {
      "a": "a3",
      "number1": "3",
      "c": "b3",
      "d": "c3",
      "e": "d3",
      "number2": "3"
    },
    {
      "a": "a4",
      "number1": "4",
      "c": "b4",
      "d": "c4",
      "e": "d4",
      "number2": "4"
    },
    {
      "a": "a5",
      "number1": "5",
      "c": "b5",
      "d": "c5",
      "e": "d5",
      "number2": "5"
    }
  ]
}
Here's a possible solution with Miller, an interesting tool (available for several OSs) that supports multiple input/output formats:
mlr --icsv -N put -q '
  @map[$1] = {"a": $2, "number1": $3, "c": $4, "d": $5, "e": $6, "number2": $7};
  end { dump { "configuration": @map } }
' output.csv
{
  "configuration": {
    "k": {
      "a": "a1",
      "number1": 1,
      "c": "b1",
      "d": "c1",
      "e": "d1",
      "number2": 1
    },
    "l": {
      ...
Note: to force the numbers to be treated as strings, you can use the --infer-none option.
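Presumably that would look like this, with the flag added before the put verb and everything else unchanged:
mlr --icsv --infer-none -N put -q '
  @map[$1] = {"a": $2, "number1": $3, "c": $4, "d": $5, "e": $6, "number2": $7};
  end { dump { "configuration": @map } }
' output.csv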
Insofar as your goal is to make the value of the a column the key, from_entries is suitable for that:
[
  inputs |
  split(",") |
  map(ltrimstr("\"")) |
  map(rtrimstr("\"")) |
  {
    "key": .[1],
    "value": {
      number1: .[2],
      c: .[3],
      d: .[4],
      e: .[5],
      number2: .[6]
    }
  }
] |
from_entries |
{ configuration: . }
When run with
jq -nR -f api.jq <output.csv
(the -n matters: without it, jq consumes the first CSV line as the initial input, so inputs never sees it and the a1 record is silently dropped), the output is:
{
  "configuration": {
    "a1": {
      "number1": "1",
      "c": "b1",
      "d": "c1",
      "e": "d1",
      "number2": "1"
    },
    "a2": {
      "number1": "2",
      "c": "b2",
      "d": "c2",
      "e": "d2",
      "number2": "2"
    },
    "a3": {
      "number1": "3",
      "c": "b3",
      "d": "c3",
      "e": "d3",
      "number2": "3"
    },
    "a4": {
      "number1": "4",
      "c": "b4",
      "d": "c4",
      "e": "d4",
      "number2": "4"
    },
    "a5": {
      "number1": "5",
      "c": "b5",
      "d": "c5",
      "e": "d5",
      "number2": "5"
    }
  }
}
If robustness of CSV parsing is a concern, you could easily adapt
the parser at rosettacode.org. The following converts the CSV rows to JSON arrays; since the "main" program below uses inputs, you'd use the -R and -n command-line options.
## The PEG * operator:
def star(E): (E | star(E)) // . ;

## Helper functions:

# Consume a regular expression rooted at the start of .remainder, or emit empty;
# on success, update .remainder and set .match, but do NOT update .result
def consume($re):
  # on failure, match yields empty
  (.remainder | match("^" + $re)) as $match
  | .remainder |= .[$match.length :]
  | .match = $match.string;

def parse($re):
  consume($re)
  | .result = .result + [.match];

def ws: consume(" *");

### Parse a string into comma-separated values

def quoted_field_content:
  parse("((\"\")|([^\"]))*")
  | .result[-1] |= gsub("\"\""; "\"");

def unquoted_field: parse("[^,\"]*");

def quoted_field: consume("\"") | quoted_field_content | consume("\"");

def field: (ws | quoted_field | ws) // unquoted_field;

def record: field | star(consume(",") | field);

def csv2array:
  {remainder: .} | record | .result;

inputs | csv2array
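For instance, assuming the above is saved as csv2array.jq (a hypothetical filename), a run over the sample data should produce one array of strings per row; with -c for compact output:
jq -cnR -f csv2array.jq output.csv
# ["k","a1","1","b1","c1","d1","1"]
# ["l","a2","2","b2","c2","d2","2"]
# ... and so on for the remaining rows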
I know you raised this as a bash+jq question, but if it were a bash+Python question, the solution would be trivial:
# csv2json.py
import sys, csv, json

data = {"configuration": {}}
for [k, a, n1, c, d, e, n2] in csv.reader(sys.stdin):
    data["configuration"][k] = {"a": a, "number1": n1, "c": c, "d": d, "e": e, "number2": n2}
print(json.dumps(data, indent=2))
Then, in bash (I'm assuming Ubuntu here), we could run:
python3 csv2json.py < output.csv
jq's from_entries can be used to generate objects with chosen keys.
In the example below, we first use Miller to convert the CSV to JSON more robustly (in a manner that supports values containing commas or quotes) before proceeding with jq.
Add a header line at the top so Miller knows which key name to associate with each value:
cat <(echo k,a,number1,c,d,e,number2) output.csv > output_with_header.csv
Convert the CSV to JSON with Miller:
mlr --icsv --ojson cat output_with_header.csv > output.json
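For reference, the intermediate output.json should presumably look like this (first record shown; Miller infers the bare numbers as integers):
[
  {
    "k": "k",
    "a": "a1",
    "number1": 1,
    "c": "b1",
    "d": "c1",
    "e": "d1",
    "number2": 1
  },
  ...
]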
Transform with jq, generating a list of {key, value} objects and combining them with from_entries:
jq '{configuration: ([.[] | {key: .k, value: del(.k)}] | from_entries)}' output.json
This results in:
{
  "configuration": {
    "k": {
      "a": "a1",
      "number1": 1,
      "c": "b1",
      "d": "c1",
      "e": "d1",
      "number2": 1
    },
    "l": {
      "a": "a2",
      "number1": 2,
      "c": "b2",
      "d": "c2",
      "e": "d2",
      "number2": 2
    },
    "m": {
      "a": "a3",
      "number1": 3,
      "c": "b3",
      "d": "c3",
      "e": "d3",
      "number2": 3
    },
    "n": {
      "a": "a4",
      "number1": 4,
      "c": "b4",
      "d": "c4",
      "e": "d4",
      "number2": 4
    },
    "o": {
      "a": "a5",
      "number1": 5,
      "c": "b5",
      "d": "c5",
      "e": "d5",
      "number2": 5
    }
  }
}
All together as a one-liner:
cat <(echo k,a,number1,c,d,e,number2) output.csv | mlr --icsv --ojson cat | jq '{configuration: ([.[]|{key: .k,value: (.|del(.k))}]|from_entries)}'

jq combine fields from array of objects in 2 json files into 3rd json file

file1.json
[
  {
    "a": "a",
    "b": "b"
  },
  {
    "a": "a",
    "b": "b"
  }
]
file2.json
[
  {
    "c": "c"
  },
  {
    "c": "c"
  }
]
Desired output: file3.json
[
  {
    "a": "a",
    "b": "b",
    "c": "c"
  },
  {
    "a": "a",
    "b": "b",
    "c": "c"
  }
]
For this type of problem, transpose (think zip) can often be used to produce compact solutions. In the present case:
jq -s 'transpose | map(add)' file1.json file2.json
jq's transpose also works with arrays of unequal length: shorter rows are padded with null, and since adding null to an object leaves the object unchanged, map(add) still does the right thing.
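A quick check of that padding behavior (with -c for compact output):
echo '[[{"a":1},{"b":2}],[{"c":3}]]' | jq -c transpose
# [[{"a":1},{"c":3}],[{"b":2},null]]
echo '[[{"a":1},{"b":2}],[{"c":3}]]' | jq -c 'transpose | map(add)'
# [{"a":1,"c":3},{"b":2}]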

How to filter some array in a sub-object with object in a json file with jq

I need to filter a JSON file with a nested structure like the one below.
All objects in array b whose attribute x contains a "z" should be filtered out; everything else should stay in the file.
{
  "a": {
    "b": [
      {
        "c": "1",
        "x": "aaa"
      },
      {
        "c": "2",
        "x": "aza"
      },
      {
        "c": "7",
        "x": "azb"
      }
    ]
  },
  "d": {
    "e": [
      "1"
    ],
    "f": [
      "2"
    ]
  }
}
Expected output:
{
  "a": {
    "b": [
      {
        "c": "1",
        "x": "aaa"
      }
    ]
  },
  "d": {
    "e": [
      "1"
    ],
    "f": [
      "2"
    ]
  }
}
Use select with contains, updating b in place with |=:
jq '.a.b |= [.[] | select(.x | contains("z") | not)]' file
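Equivalently, map can replace the [.[] | ...] construction; and if you want regex matching rather than literal substring containment, test works the same way:
jq '.a.b |= map(select(.x | contains("z") | not))' file
jq '.a.b |= map(select(.x | test("z") | not))' file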

Combining JSON by common key-value pairs

I'm currently working through an issue, and can't seem to figure this one out. Here's some data so you know what I'm talking about below:
foo.json
{
  "Schedule": [
    {
      "deviceId": 123,
      "reservationId": 123456,
      "username": "jdoe"
    },
    {
      "deviceId": 456,
      "reservationId": 589114,
      "username": "jsmith"
    }
  ],
  "serverTime": 1522863125.019958
}
bar.json
[
  {
    "a": {
      "b": "10.0.0.1",
      "c": "hostname1"
    },
    "deviceId": 123
  },
  {
    "a": {
      "b": "10.0.0.2",
      "c": "hostname2"
    },
    "deviceId": 456
  }
]
foobar.json
{
  "Schedule": [
    {
      "deviceId": 123,
      "reservationId": 123456,
      "username": "jdoe",
      "a": {
        "b": "10.0.0.1",
        "c": "hostname1"
      }
    },
    {
      "deviceId": 456,
      "reservationId": 589114,
      "username": "jsmith",
      "a": {
        "b": "10.0.0.2",
        "c": "hostname2"
      }
    }
  ],
  "serverTime": 1522863125.019958
}
I'm trying to use jq to do this, and had some help from this post: https://github.com/stedolan/jq/issues/1090
The goal is to be able to combine JSON, using some key as a common point between the documents. The data may be nested any number of levels; in this case foo.json has data nested two levels deep, which needs to be combined with data nested one level deep.
Any and all suggestions would be super helpful. I'm also happy to clarify and answer questions if needed. Thank you!
With foobar.jq as follows:
def dict(f):
  reduce .[] as $o ({}; .[$o | f | tostring] = $o);

($bar | dict(.deviceId)) as $dict
| .Schedule |= map(. + $dict[.deviceId | tostring])
the invocation:
jq -f foobar.jq --argfile bar bar.json foo.json
yields the output shown below.
Notice that the referents in the dictionary contain the full object (including the "deviceId" key/value pair), but there is no need to del(.deviceId): when jq adds two objects with +, the right-hand object's keys win, and here the values agree anyway.
Output
{
  "Schedule": [
    {
      "deviceId": 123,
      "reservationId": 123456,
      "username": "jdoe",
      "a": {
        "b": "10.0.0.1",
        "c": "hostname1"
      }
    },
    {
      "deviceId": 456,
      "reservationId": 589114,
      "username": "jsmith",
      "a": {
        "b": "10.0.0.2",
        "c": "hostname2"
      }
    }
  ],
  "serverTime": 1522863125.019958
}
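Note that --argfile has since been deprecated in favor of --slurpfile, which wraps the file's contents in an array. A sketch of the adaptation (only the $bar reference in foobar.jq changes, to index into that array):
def dict(f):
  reduce .[] as $o ({}; .[$o | f | tostring] = $o);

($bar[0] | dict(.deviceId)) as $dict
| .Schedule |= map(. + $dict[.deviceId | tostring])
invoked as:
jq -f foobar.jq --slurpfile bar bar.json foo.json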

Parsing float values from strings with jq

I have JSON data containing float values stored as strings, which I need to process conditionally over an array of objects. Here's an example:
[
  {
    "a": "0",
    "b": "66.67",
    "c": "0",
    "d": "0"
  },
  {
    "a": "12.33",
    "b": "0",
    "c": "60.2",
    "d": "19.3"
  },
  {
    "a": "70.0",
    "b": "92.67",
    "c": "0",
    "d": "0"
  }
]
and I wish to make a conditional selection like
cat mydata.json | jq '.[] | select((.a > 50) and (.b > 50))'
with the expected output being:
{
  "a": "70.0",
  "b": "92.67",
  "c": "0",
  "d": "0"
}
The problem is that my values are strings, and I have no idea how to parse them for a conditional selection.
Simply use jq's tonumber function (without the conversion, jq compares a string against a number by type ordering, in which all strings sort after all numbers, so .a > 50 would be true for every element):
jq '.[] | select((.a|tonumber) > 50 and (.b|tonumber) > 50)' mydata.json
The output:
{
  "a": "70.0",
  "b": "92.67",
  "c": "0",
  "d": "0"
}
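If some fields might not hold numeric strings, tonumber would raise an error on them; a defensive variant appends ? so that such elements are simply skipped instead:
jq '.[] | select(((.a | tonumber?) > 50) and ((.b | tonumber?) > 50))' mydata.json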