I have a bit of a weird data setup. I have the following JSON files:
file 1:
[
["04-05-2020",
12
],
["03-05-2020",
16
]
]
file 2:
[
["04-05-2020",
50
],
["03-05-2020",
70
]
]
I want to merge the 2 json files using the Dates (which are not specified as keys) and reassign keys and values to the output, such that the output is something like:
output:
[
{date: "04-05-2020",
value1 : 12,
value2 : 50
},
{date: "03-05-2020",
value1 : 16,
value2: 70
}
]
My thoughts are I might have to merge the files together first and do some kind of reduce operation on the dates in the array, but my attempts have so far been unsuccessful. Or perhaps I should be formatting the Array first into Key + Value and then do a jq -s 'add'? I'm actually not sure how to reformat this.
One way of doing it using reduce:
reduce inputs[] as [$date, $value] ({};
if has($date) then
.[$date] += {value2: $value}
else
.[$date] = {$date, value1: $value}
end
) | map(.)
Note that you need to specify -n/--null-input option on the command line in order for inputs to work.
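For completeness, here is a sketch of the full invocation. It assumes the two inputs are saved as file1.json and file2.json (names chosen here for illustration) and recreates them in a temporary directory so it can be run as-is:

```shell
# Recreate the two sample inputs; the file names are assumed for illustration.
tmp=$(mktemp -d)
printf '%s\n' '[["04-05-2020",12],["03-05-2020",16]]' > "$tmp/file1.json"
printf '%s\n' '[["04-05-2020",50],["03-05-2020",70]]' > "$tmp/file2.json"

# -n keeps jq from consuming file1.json as the implicit input,
# so `inputs` sees both files; -c prints the result on one line.
merged=$(jq -n -c '
  reduce inputs[] as [$date, $value] ({};
    if has($date) then .[$date] += {value2: $value}
    else .[$date] = {$date, value1: $value}
    end
  ) | map(.)
' "$tmp/file1.json" "$tmp/file2.json")

echo "$merged"
rm -r "$tmp"
```

Without -n, jq would read file1.json as the regular input before the program runs, and inputs would only yield file2.json.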
If the arrays in file1.json and file2.json are in lockstep as in your example, you could simply write:
.[0] |= map({date: .[0], value1: .[1]})
| .[1] |= map({date: .[0], value2: .[1]})
| transpose
| map(add)
using an invocation along the lines:
jq -s -f program.jq file1.json file2.json
Of course there are many variations.
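To see why this works, it may help to look at the last two steps in isolation. The sketch below starts from the state after both |= updates (typed in by hand here, not read from the files) and applies transpose and map(add):

```shell
paired=$(jq -n -c '
  [ [{date: "04-05-2020", value1: 12}, {date: "03-05-2020", value1: 16}],
    [{date: "04-05-2020", value2: 50}, {date: "03-05-2020", value2: 70}] ]
  | transpose      # pair up the elements that share an index
  | map(add)       # merge each pair of objects into one
')
echo "$paired"
```

Because the pairing is purely positional, this only gives the right answer when both files list the same dates in the same order.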
I have a requirement where two parameter files need to be merged into one using jq.
param1.json
[
"name=xyz",
"age=40",
"email=qqqq"
]
param2.json
[
"name=xyz",
"age=42",
"drivingLicense=2761"
]
I need a resultant value to be
[
"name=xyz",
"age=42",
"email=qqqq",
"drivingLicense=2761"
]
When I try to add them with jq -s '.[0] + .[1]' param1.json param2.json, the result is:
[
"name=xyz",
"age=40",
"email=qqqq",
"name=xyz",
"age=42",
"drivingLicense=2761"
]
I also tried jq '. * input' param1.json param2.json, but that does not work either.
What is the best way to merge them?
TIA
This approach makes use of the circumstance that object field names are unique. On collision, latter items overwrite former ones.
jq -s '[add | with_entries(.key = (.value | .[:index("=")]))[]]'
[
"name=xyz",
"age=42",
"email=qqqq",
"drivingLicense=2761"
]
Note: Instead of add you can, of course, still use .[0] + .[1] or . + input (the latter without -s).
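A small sketch of the overwrite-on-collision behaviour this relies on, using two hand-written entries that share the key "age":

```shell
collided=$(jq -n -c '
  [ {key: "age", value: "age=40"},
    {key: "age", value: "age=42"} ]
  | from_entries   # the later entry replaces the earlier one
')
echo "$collided"   # {"age":"age=42"}
```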
You can first convert your arrays into objects, then add those objects together; then convert to an array again:
$ jq -s 'map(map(./"="|{(first):.[1:]|join("=")})|add)|add|to_entries|map(join("="))' param1.json param2.json
[
"name=xyz",
"age=42",
"email=qqqq",
"drivingLicense=2761"
]
If your values cannot contain an equal sign, then {(first):.[1:]|join("=")} can be simplified to {(first):last}.
Or merging the arrays to one big array before converting to objects:
add
|map(./"="|{(first):.[1:]|join("=")})
|add
|to_entries
|map(join("="))
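To illustrate why the longer .[1:]|join("=") form matters, here is a sketch with a made-up parameter, token=a=b, whose value itself contains an equal sign:

```shell
pair='token=a=b'   # hypothetical parameter, not from the question

# Keeps everything after the first "=" as the value.
full=$(jq -n -c --arg p "$pair" '$p / "=" | {(first): (.[1:] | join("="))}')

# The simplified form keeps only the last fragment.
short=$(jq -n -c --arg p "$pair" '$p / "=" | {(first): last}')

echo "$full"    # {"token":"a=b"}
echo "$short"   # {"token":"b"}
```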
Leveraging the fact that this can be reformulated as a grouping problem, you can group by the "key" of your string, then select the last item in each group (a reusable function to build group objects can help but is not required).
$ jq -s 'add | map(./"=") | group_by(first) | map(last|join("="))' param1.json param2.json
[
"age=42",
"drivingLicense=2761",
"email=qqqq",
"name=xyz"
]
I have a json file I am parsing with jq. This is a sample of the file
[{
"key1":{...},
"key2":{...}
}]
[{
"key1":{...},
"key2":{...}
}]
...
Each line is a list containing a JSON object (I know the file as a whole is not technically valid JSON, but jq still works on such a file).
The below jq command works:
cat file.json | jq -r '.[] | [.key1,.key2]'
The above correctly shows:
[
<value_of_key1>,<value_of_key2>
]
[
<value_of_key1>,<value_of_key2>
]
However, I want .key1,.key2 to be dynamic since these keys can change. So I want to pass a variable to jq. Something like:
KEYS='.key1,.key2'
cat file.json | jq -r --arg var "$KEYS" '.[] | [$var]'
But the above is returning the keys themselves:
[
".key1,.key2"
]
[
".key1,.key2"
]
why is this happening? what is the correct command to make this happen?
This answer does not help me. I am not getting any errors as the OP in that question.
Fetching the value of a jq variable doesn't cause it to be executed as jq code.
Furthermore, jq lacks the facility to take a string, compile it as jq code, and evaluate the result. (This is commonly known as eval.)
So, short of writing a jq parser and evaluator in jq, you will need to impose limits and/or accept a different format.
For example,
keys='[ [ "key1", "childkey" ], [ "key2", "childkey2" ] ]' # JSON
jq --argjson keys "$keys" '.[] | [ getpath( $keys[] ) ]' file.json
or
keys='key1.childkey,key2.childkey2'
jq --arg keys "$keys" '
( ( $keys / "," ) | map( . / "." ) ) as $keys |
.[] | [ getpath( $keys[] ) ]
' file.json
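If only top-level keys are needed, a simpler variation might be enough. The sketch below assumes a comma-separated list of plain key names (no leading dots) and uses them as data to index each object; $names is a name made up for this example. The last command shows the original behaviour for contrast: the --arg value stays a string.

```shell
keys='key1,key2'   # plain key names, assumed format

# Split the list and use each name as an index into the object.
out=$(printf '%s\n' '[{"key1":1,"key2":2}]' '[{"key1":3,"key2":4}]' |
  jq -c --arg keys "$keys" '($keys / ",") as $names | .[] | [.[$names[]]]')
echo "$out"

# For contrast: a variable holding jq syntax is just a string.
literal=$(jq -n -c --arg var '.key1,.key2' '[$var]')
echo "$literal"   # [".key1,.key2"]
```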
Suppose you have:
cat file
[{
"key1":1,
"key2":2
}]
[{
"key1":1,
"key2":2
}]
You can use a jq command like so:
jq '.[] | [.key1,.key2]' file
[
1,
2
]
[
1,
2
]
You can use -f to execute a filter from a file and nothing keeps you from creating the file separately from the shell variables.
Example:
keys=".key1"
echo ".[] | [${keys}]" >jqf
jq -f jqf file
[
1
]
[
1
]
Or just build the string directly into jq:
# note double " causing string interpolation
jq ".[] | [${keys}]" file
You can use --argjson option and destructuring.
file.json
[{"key1":{"a":1},"key2":{"b":2}}]
[{"key1":{"c":1},"key2":{"d":2}}]
$ in='["key1","key2"]'; jq -c --argjson keys "$in" '$keys as [$key1,$key2] | .[] | [.[$key1,$key2]]' file.json
output:
[{"a":1},{"b":2}]
[{"c":1},{"d":2}]
Elaborating on ikegami's answer.
To start with here's my version of the answer:
$ in='key1.a,key2.b'; jq -c --arg keys "$in" '($keys/","|map(./".")) as $paths | .[] | [getpath($paths[])]' <<<$'[{"key1":{"a":1},"key2":{"b":2}}] [{"key1":{"a":3},"key2":{"b":4}}]'
This gives output
[1,2]
[3,4]
Let's try it.
We have input
[{"key1":{"a":1},"key2":{"b":2}}]
[{"key1":{"a":3},"key2":{"b":4}}]
And we want to construct array
[["key1","a"],["key2","b"]]
then pass it to the getpath(PATHS) builtin to extract values from our input.
To start with, we are given the shell variable in with the string value key1.a,key2.b. Inside jq, let's call this $keys.
Then $keys/"," gives
["key1.a","key2.b"]
["key1.a","key2.b"]
After that $keys/","|map(./".") gives what we want.
[["key1","a"],["key2","b"]]
[["key1","a"],["key2","b"]]
Let's call this $paths.
Now if we do .[]|[getpath($paths[])] we get the values from our input equivalent to
[.[] | .key1.a, .key2.b]
which is
[1,2]
[3,4]
I am attempting to parse a JSON structure to extract a dependency path, for use in an automation script.
The structure of this JSON is extracted to a format like this:
[
{
"Id": "abc",
"Dependencies": [
]
},
{
"Id": "def",
"Dependencies": [
"abc"
]
},
{
"Id": "ghi",
"Dependencies": [
"def"
]
}
]
Note: Lots of other irrelevant fields removed.
The plan is to be able to pass into my JQ command the Id of one of these and get back out a list.
Eg:
Input: abc
Expected Output: []
Input: def
Expected Output: ["abc"]
Input: ghi
Expected Output: ["abc", "def"]
Currently have a jq script like this (https://jqplay.org/s/NAhuXNYXXO):
jq
'. as $original | .[] |
select(.Id == "INPUTVARIABLE") |
[.Dependencies[]] as $level1Dep | [$original[] | select( [ .Id == $level1Dep[] ] | any )] as $level1Full | $level1Full[] |
[.Dependencies[]] as $level2Dep | [$original[] | select ( [ .Id == $level2Dep[] ] | any )] as $level2Full |
[$level1Dep[], $level2Dep[]]'
Input: abc
Output: empty
Input: def
Output: ["abc"]
Input: ghi
Output: ["def","abc"]
Great! However, as you can see this is not particularly scalable and will only handle two dependency levels (https://jqplay.org/s/Zs0xIvJ2Zn), and it also falls apart horribly when there are multiple dependencies on an item (https://jqplay.org/s/eB9zHQSH2r).
Is there a way of constructing this within JQ or do I need to move out to a different language?
I know that the data cannot have circular dependencies, it is pulled from a database that enforces this.
It's trivial then. Reduce your input JSON down to an object where each Id and corresponding Dependencies array are paired, and walk through it aggregating dependencies using a recursive function.
def deps($depdb; $id):
def _deps($id): $depdb[$id] // empty
| . + map(_deps(.)[]);
_deps($id);
deps(map({(.Id): .Dependencies}) | add; $fid)
Invocation:
jq -c --arg fid 'ghi' -f prog.jq file
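As a quick check, the sketch below runs the program above against the three-item sample from the question, once per Id. The data and program are passed inline (through the shell variables data and prog, names chosen for this example) instead of through files:

```shell
data='[{"Id":"abc","Dependencies":[]},{"Id":"def","Dependencies":["abc"]},{"Id":"ghi","Dependencies":["def"]}]'

prog='
def deps($depdb; $id):
  def _deps($id): $depdb[$id] // empty
    | . + map(_deps(.)[]);
  _deps($id);
deps(map({(.Id): .Dependencies}) | add; $fid)
'

for fid in abc def ghi; do
  result=$(printf '%s\n' "$data" | jq -c --arg fid "$fid" "$prog")
  printf '%s -> %s\n' "$fid" "$result"
done
```

For ghi this gives ["def","abc"], that is, direct dependencies first; pipe the result through reverse or sort if a different order is needed.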
Here's a short program that handles circular dependencies efficiently and illustrates how a subfunction can be defined after the creation of a local variable (here, $next) for efficiency:
def dependents($x):
(map( {(.Id): .Dependencies}) | add) as $next
# Input: array of dependents computed so far
# Output: array of all dependents
| def tc($x):
($next[$x] - .) as $new
| if $new == [] then .
else (. + $new | unique)
# avoid calling unique again:
| . + ([tc($new[])[]] - .)
end ;
[] | tc($x);
dependents($start)
Usage
With the given input and an invocation such as
jq --arg start START -f program.jq input.json
the output for various values of START is:
START   output
abc     []
def     ["abc"]
ghi     ["def", "abc"]
If the output must be sorted, then simply add a call to sort.
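To illustrate the claim about circular dependencies, here is a sketch that feeds the same program a hypothetical two-node cycle (a depends on b, b depends on a). This data is made up for the example; the question's data has no cycles.

```shell
cyclic='[{"Id":"a","Dependencies":["b"]},{"Id":"b","Dependencies":["a"]}]'

closure=$(printf '%s\n' "$cyclic" | jq -c --arg start a '
def dependents($x):
  (map( {(.Id): .Dependencies}) | add) as $next
  # Input: array of dependents computed so far
  # Output: array of all dependents
  | def tc($x):
      ($next[$x] - .) as $new
      | if $new == [] then .
        else (. + $new | unique)
        # avoid calling unique again:
        | . + ([tc($new[])[]] - .)
        end ;
    [] | tc($x);
dependents($start) | sort
')
echo "$closure"
```

The recursion stops because each call only follows Ids that are not already in the accumulated array. Note that in a cycle the starting Id ends up in its own result.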
[
{
"Description": "Copied for Destination xxx from Sourc 30c for Snapshot 1. Task created on X,52,87,14,76.",
"Encrypted": false,
"ID": "snap-074",
"Progress": "100%",
"Time": "2019-06-11T09:25:23.110Z",
"Owner": "883065",
"Status": "completed",
"Volume": "vol1",
"Size": 16
},
{
"Description": "Copied for Destination yy from Source 31c for Snapshot 2. Task created on X,52,87,14,76.",
"Encrypted": false,
"ID": "snap-096",
"Progress": "100%",
"Time": "2019-06-11T10:18:01.410Z",
"Owner": "1259",
"Status": "completed",
"Volume": "vol-2",
"Size": 4
}
]
I have that json file that I'm trying to convert to csv using the following command:
jq -r '. | map(.Description[], .Encrypted, .ID, .Progress, .Time, .Owner, .Status, .Volume, .Size | join(",")) | join("\n")' snapshots1.json
But I'm getting error:
jq: error (at snapshots1.json:24): Cannot iterate over string ("Copied for...)
I looked at a similar post, "jq: error: Cannot iterate over string", but can't figure out the error. Any help is appreciated.
jq -r '(map(keys) | add | unique) as $cols | map(. as $row | $cols | map($row[.])) as $rows | $cols, $rows[] | @csv' snapshots1.json >> myfile.csv
Found this post that explains this code and it worked for me.
I think you were on the right track. Here is how I'd do it:
jq -r '.[] | map(..) | @csv' snapshot1.json > snapshot1.csv
There's a couple of small problems with your code:
.Description[] - Description is a string, not an array, so the square brackets don't work: there's nothing to iterate over, hence the "Cannot iterate over string" error.
Suppose we get rid of the square brackets; you'll see that the code works insofar as it puts the contents of the objects into an array. However, it puts the contents into one array, so your CSV will only have one line (and I'm assuming that you want each object on a separate row). This is because the map function collects all its results into one array (see documentation: jq Manual), so you have to split open the outer array first.
The first part of your code with the dot (.) doesn't do anything: it simply returns the whole JSON as is. If you want to play around with it, try .[] and then experiment from there.
There's a risk in using .. here to extract the "values" in an object: what if the ordering of the keys in the input objects differs between objects?
Here's a generic filter which addresses this and other issues. It also emits a suitable "header" line:
def object2array(stream):
foreach stream as $x (null;
if . == null then $x | [true, keys_unsorted] else .[0]=false end;
(if .[0] then .[1] else empty end),
.[1] as $keys | $x | [getpath( $keys[] | [.]) ] );
Example
def data: [{a:1,b:2}, {b:22,a:11,c:0}];
object2array(data[])
produces:
["a","b"]
[1,2]
[11,22]
Just right for piping to @csv or @tsv.
Solution
So the solution to the original problem would essentially be:
object2array(.[]) | @csv
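To make the key-ordering risk concrete, here is a sketch with two made-up objects whose keys appear in different orders. The naive variant uses [.[]] to take values in whatever order each object stores them; the second variant uses the object2array filter from above.

```shell
data='[{"a":1,"b":2},{"b":22,"a":11}]'   # hypothetical sample

# Values in per-object key order: the second row comes out swapped.
naive=$(printf '%s\n' "$data" | jq -r '.[] | [.[]] | @csv')

# Columns fixed by the keys of the first object, plus a header line.
aligned=$(printf '%s\n' "$data" | jq -r '
  def object2array(stream):
    foreach stream as $x (null;
      if . == null then $x | [true, keys_unsorted] else .[0]=false end;
      (if .[0] then .[1] else empty end),
      .[1] as $keys | $x | [getpath( $keys[] | [.]) ] );
  object2array(.[]) | @csv
')

echo "$naive"
echo "$aligned"
```

The naive output has 22,11 as its second row, while the aligned output has 11,22 under an "a","b" header.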
I receive the following input file:
input.json:
[
{"ID":"aaa_12301248","time_CET":"00:00:00","VALUE":10,"FLAG":"0"},
{"ID":"aaa_12301248","time_CET":"00:15:00","VALUE":18,"FLAG":"0"},
{"ID":"aaa_12301248","time_CET":"00:30:00","VALUE":160,"FLAG":"0"},
{"ID":"bbb_0021122","time_CET":"00:00:00","VALUE":null,"FLAG":"?"},
{"ID":"bbb_0021122","time_CET":"00:15:00","VALUE":null,"FLAG":"?"},
{"ID":"bbb_0021122","time_CET":"00:30:00","VALUE":22,"FLAG":"0"},
{"ID":"ccc_0021122","time_CET":"00:00:00","VALUE":null,"FLAG":"?"},
{"ID":"ccc_0021122","time_CET":"00:15:00","VALUE":null,"FLAG":"?"},
{"ID":"ccc_0021122","time_CET":"00:30:00","VALUE":20,"FLAG":"0"},
{"ID":"ddd_122455","time_CET":"00:00:00","VALUE":null,"FLAG":"?"},
{"ID":"ddd_122455","time_CET":"00:15:00","VALUE":null,"FLAG":"?"},
{"ID":"ddd_122455","time_CET":"00:30:00","VALUE":null,"FLAG":"?"}
]
As you can see there are some valid values (FLAG: 0) and some invalid values (FLAG: "?").
Now I got a file looking like this (one for each ID):
aaa.json:
[
{"ID":"aaa_12301248","time_CET":"00:00:00","VALUE":10,"FLAG":"0"},
{"ID":"aaa_12301248","time_CET":"00:15:00","VALUE":null,"FLAG":"?"},
{"ID":"aaa_12301248","time_CET":"00:55:00","VALUE":45,"FLAG":"0"}
]
As you can see, object one is the same as in input.json but object two is invalid (FLAG: "?"). That's why object two has to be replaced by the correct object from input.json (with VALUE:18).
Objects can be identified by "time_CET" and "ID" element.
Additionally, there will be new objects in input.json, that have not been part of aaa.json etc. These objects should be added to the array, and valid objects from aaa.json should be kept.
In the end, aaa.json should look like this:
[
{"ID":"aaa_12301248","time_CET":"00:00:00","VALUE":10,"FLAG":"0"},
{"ID":"aaa_12301248","time_CET":"00:15:00","VALUE":18,"FLAG":"0"},
{"ID":"aaa_12301248","time_CET":"00:30:00","VALUE":160,"FLAG":"0"},
{"ID":"aaa_12301248","time_CET":"00:55:00","VALUE":45,"FLAG":"0"}
]
So, to summarize:
look for FLAG: "?" in aaa.json
replace this object with the matching object from input.json, using "ID" and "time_CET" for mapping
keep existing valid objects and add objects from input.json that did not exist in aaa.json before (this means only objects starting with "aaa" in the "ID" field)
repeat this for bbb.json, ccc.json and ddd.json
I am not sure if it's possible to get this done all at once with a command like this, because the output has to go back to the correct ID files (aaa.json, bbb.json, ccc.json):
jq --argfile aaa aaa.json --argfile bbb bbb.json .... -f prog.jq input.json
The problem is, that the number after the identifier (aaa, bbb, ccc etc.) may change. So to make sure objects are added to the correct file/array, a statement like this would be required:
if (."ID"|contains("aaa")) then ....
Or is it better to run the program several times with different input parameters? I am not sure..
Thank you in advance!!
Here is one approach
#!/bin/bash
# usage: update.sh input.json aaa.json bbb.json....
# updates each of aaa.json bbb.json....
input_json="$1"
shift
for i in "$@"; do
jq -M --argfile input_json "$input_json" '
# functions to restrict input.json to keys of current xxx.json file
def prefix: input_filename | split(".")[0];
def selectprefix: select(.ID | startswith(prefix));
# functions to build and probe a lookup table
def pk: [.ID, .time_CET];
def lookup($t;$k): $t | getpath($k);
def lookup($t): lookup($t;pk);
def organize(s): reduce s as $r ({}; setpath($r|pk; $r));
# functions to identify objects in input.json missing from xxx.json
def pks: paths | select(length==2);
def missing($t1;$t2): [$t1|pks] - [$t2|pks] | .[];
def getmissing($t1;$t2): [ missing($t1;$t2) as $p | lookup($t1;$p)];
# main routine
organize(.[]) as $xxx
| organize($input_json[] | selectprefix) as $inp
| map(if .FLAG != "?" then . else . += lookup($inp) end)
| . + getmissing($inp;$xxx)
' "$i" | sponge "$i"
done
The script uses jq in a loop to read and update each aaa.json... file.
The filter creates temporary objects to facilitate looking up values by [ID,time_CET], updates any values in the aaa.json with a FLAG=="?" and finally adds any values from input.json that are missing in aaa.json.
The temporary lookup table for input.json uses input_filename so that only keys starting with a prefix matching the name of the currently processed file will be included.
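As a rough illustration of the lookup table, the sketch below reuses the pk and organize definitions from the script on two hand-written records and then probes the nested object with getpath:

```shell
value=$(jq -n '
  def pk: [.ID, .time_CET];
  def organize(s): reduce s as $r ({}; setpath($r|pk; $r));

  [ {ID: "aaa_12301248", time_CET: "00:00:00", VALUE: 10, FLAG: "0"},
    {ID: "aaa_12301248", time_CET: "00:15:00", VALUE: 18, FLAG: "0"} ]
  | organize(.[])                                  # {ID: {time_CET: record}}
  | getpath(["aaa_12301248", "00:15:00"])          # look up one record
  | .VALUE
')
echo "$value"   # 18
```

The table is a two-level object keyed first by ID and then by time_CET, which is why the script can treat every path of length 2 as a primary key.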
Sample Run:
$ ./update.sh input.json aaa.json
aaa.json after run:
[
{
"ID": "aaa_12301248",
"time_CET": "00:00:00",
"VALUE": 10,
"FLAG": "0"
},
{
"ID": "aaa_12301248",
"time_CET": "00:15:00",
"VALUE": 18,
"FLAG": "0"
},
{
"ID": "aaa_12301248",
"time_CET": "00:55:00",
"VALUE": 45,
"FLAG": "0"
},
{
"ID": "aaa_12301248",
"time_CET": "00:30:00",
"VALUE": 160,
"FLAG": "0"
}
]