Create a nested object json file using bash

I have a small bash script that scours through a directory and its subs (media/) and adds the output to a json file.
The line that outputs the json is as follows:
printf '{"id":%s,"foldername":"%s","path":"%s","date":"%s","filename":"%s"},\n' $num "$folder" "/media/$file2" "$fullDate" "$filename" >> /media/files.json
The json file looks like this:
{"id":1,"foldername":"5101","path":"/media/5101/Musicali10.mp3","date":"2015-08-09:13:16","filename":"Musicali10"},
{"id":2,"foldername":"5101","path":"/media/5101/RumCollora.mp3","date":"2015-08-09:13:16","filename":"RumCollora"}
I would like it to group all files in a folder and output something like this:
[ {
"id":1,
"foldername":"5101",
"files":[
{
"path":"/media/5101/Musicali10.mp3",
"date":"2015-08-09:13:16",
"filename":"Musicali10"
},
{
"path":"/media/5101/RumCollora.mp3",
"date":"2015-08-09:13:16",
"filename":"RumCollora"
}
] },
{
"id":2,
"foldername":"3120",
"files":[
{
"path":"/media/3120/Marimba4.mp3",
"date":"2015-08-04:10:15",
"filename":"Marimba4"
},
{
"path":"/media/3120/Rumbidzaishe6.mp3",
"date":"2015-08-04:09:10",
"filename":"Rumbidzaishe6"
}
]
}
]
My question is how to create a json file that has nested "files" objects? I want each "foldername" to have a nested list of files. So far I am only able to output each file as a separate flat object using the printf statement above.
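One way to get the grouped structure is to collect the per-file records first and nest them afterwards. The sketch below is in Python rather than bash. It assumes the flat records are already available as dictionaries with the same keys as the printf line above; group_by_folder is a hypothetical helper name, and the ids are simply renumbered per folder in order of first appearance.

```python
import json

def group_by_folder(records):
    """Nest flat per-file records under their folder name."""
    groups = {}  # foldername -> list of file entries (dicts keep insertion order)
    for rec in records:
        entry = {
            "path": rec["path"],
            "date": rec["date"],
            "filename": rec["filename"],
        }
        groups.setdefault(rec["foldername"], []).append(entry)
    # One object per folder, numbered from 1 in order of first appearance.
    return [
        {"id": i, "foldername": folder, "files": files}
        for i, (folder, files) in enumerate(groups.items(), start=1)
    ]

# Sample records taken from the question.
flat = [
    {"id": 1, "foldername": "5101", "path": "/media/5101/Musicali10.mp3",
     "date": "2015-08-09:13:16", "filename": "Musicali10"},
    {"id": 2, "foldername": "5101", "path": "/media/5101/RumCollora.mp3",
     "date": "2015-08-09:13:16", "filename": "RumCollora"},
    {"id": 3, "foldername": "3120", "path": "/media/3120/Marimba4.mp3",
     "date": "2015-08-04:10:15", "filename": "Marimba4"},
]

nested = group_by_folder(flat)
print(json.dumps(nested, indent=2))
```

Building the whole structure in memory and serializing it once also avoids the trailing-comma problem that comes with appending one printf line per file.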

Related

JSON is parsed incorrectly in Powershell script and I do not know why

I have the following JSON (from an REST API):
{
"QueryBD_REST_CI_01Response": {
"rsStart": 0,
"rsCount": 3,
"rsTotal": 18872,
"BD_REST_CI_01Set": {
"CI": [
{
"CINAME": "ciname_number_one",
"CINUM": "cinum_number_one",
"CISPEC": [
{
"ALNVALUE": "value_one",
"ASSETATTRID": "SCOPEOFADMINISTRATION"
}
]
},
{
"CINAME": "ciname_number_two",
"CINUM": "cinum_number_two",
"CISPEC": [
{
"ALNVALUE": "value_two",
"ASSETATTRID": "SCOPEOFADMINISTRATION"
}
]
},
{
"CINAME": "ciname_number_three",
"CINUM": "cinum_number_two",
"CISPEC": [
{
"ALNVALUE": "value_three",
"ASSETATTRID": "SCOPEOFADMINISTRATION"
}
]
}
]
}
}
}
I have saved this particular piece of json in a file (dummydata.json). In Powershell I run the following command:
$dummydata = Get-Content -Raw -Path G:\CMDB_FNMSDB_vergelijking\dummydata.json | ConvertFrom-Json
echo $dummydata
In the results that I see, BD_REST_CI_01Set is empty where I expect to see data (the CI array and its contents). I checked the JSON on https://jsonlint.com/ and it validated successfully. Maybe I just don't know how to fetch that piece of data, but I believe it's not here in Powershell.
Why do I not see the data?
I typically write a script which reads the JSON from the file and then parses it. The script is something like this:
foreach ($name in $x) {
if ( $name.name_with_namespace -match ".*myProjectName.*" ) {
write-host "$($name.name_with_namespace) - $($name.id)"
}
}
Hope this helps.
-Suhas

jq: Conditional insert using "lookup" & "target" JSON objects

I'm trying to improve a bash script I wrote using jq (Python version), but can't quite get the conditional nature of the task at hand to work.
The task: insert array from one JSON object ("lookup") into another ("target") only if the key of the "lookup" matches a particular "higher-level" value in the "target". Assume that the two JSON objects are in lookup.json and target.json, respectively.
A minimal example to make this clearer:
"Lookup" JSON:
{
"table_one": [
"a_col_1",
"a_col_2"
],
"table_two": [
"b_col_1",
"b_col_2",
"b_col_3"
]
}
"Target" JSON:
{
"top_level": [
{
"name": "table_one",
"tests": [
{
"test_1": {
"param_1": "some_param"
}
},
{
"test_2": {
"param_1": "another_param"
}
}]
},
{
"name": "table_two",
"tests": [
{
"test_1": {
"param_1": "some_param"
}
},
{
"test_2": {
"param_1": "another_param"
}
}
]
}
]
}
I want the output to be:
{
"top_level": [{
"name": "table_one",
"tests": [{
"test_1": {
"param_1": "some_param"
}
},
{
"test_2": {
"param_1": "another_param",
"param_2": [
"a_col_1",
"a_col_2"
]
}
},
{
"name": "table_two",
"tests": [{
"test_1": {
"param_1": "some_param"
}
},
{
"test_2": {
"param_1": "another_param",
"param_2": [
"b_col_1",
"b_col_2",
"b_col_3"
]
}
}
]
}
]
}
]
}
Hopefully, that makes sense. Early attempts slurped both JSON blobs and assigned them to two variables. I'm trying to select for a match on [roughly] ($lookup | keys[]) == $target.top_level.name, but I can't quite get this match or the subsequent array insert working.
Any advice is well-received!
Assuming the JSON samples have been corrected, and that the following program is in the file "target.jq", the invocation:
jq --argfile lookup lookup.json -f target.jq target.json
produces the expected result.
target.jq
.top_level |= map(
$lookup[.name] as $value
| .tests |= map(
if has("test_2")
then .test_2.param_2 = $value
else . end) )
Caveat
Since --argfile is officially deprecated, you might wish to choose an alternative method of passing in the contents of lookup.json, but --argfile is supported by all extant versions of jq as of this writing.
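For comparison, the same update can be sketched in Python. This is only an illustration of the logic in target.jq, not a replacement for it: add_param_2 is a hypothetical helper name, and the sample data is the corrected JSON from the question built inline instead of being read from lookup.json and target.json.

```python
import copy
import json

def add_param_2(target, lookup):
    """Return a copy of target with test_2.param_2 filled in from lookup."""
    result = copy.deepcopy(target)  # leave the caller's data untouched
    for table in result["top_level"]:
        # Like $lookup[.name] in jq: None (null) when the name has no entry.
        value = lookup.get(table["name"])
        for test in table["tests"]:
            if "test_2" in test:
                test["test_2"]["param_2"] = value
    return result

lookup = {
    "table_one": ["a_col_1", "a_col_2"],
    "table_two": ["b_col_1", "b_col_2", "b_col_3"],
}
target = {
    "top_level": [
        {
            "name": name,
            "tests": [
                {"test_1": {"param_1": "some_param"}},
                {"test_2": {"param_1": "another_param"}},
            ],
        }
        for name in ("table_one", "table_two")
    ]
}

merged = add_param_2(target, lookup)
print(json.dumps(merged, indent=2))
```

With real files, the two dictionaries would come from json.load on the opened lookup.json and target.json.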
The jq answer is already given, but the ask itself is fascinating: it requires a cross-lookup from a source file into the file being inserted, so I could not resist also providing an alternative solution using the jtc utility:
<target.json jtc -w'<name>l:<N>v[-1][tests][-1:][0]' \
-i file.json -i'<N>t:' -T'{"param_2":{{}}}'
A brief overview of the options used:
-w'<name>l:<N>v[-1][tests][-1:][0]' - selects points of insertions in the source (target.json) by finding and memorizing into namespace N keys to be looked up in the inserted file, then rolling back 1 level up in the JSON tree, selecting tests label, then the last entry in it and finally addressing a 1st element of the last one
-i file.json - makes an insertion from the file
-i'<N>t:' - this walk over file.json finds recursively a tag (label) preserved in the namespace N from the respective walk -w (if not this insert option with the walk argument, then the whole file would get inserted into the insertion points -w..)
-T'{"param_2":{{}}}' - finally, a template operation is applied onto the insertion result transforming found entry (in file.json) into the one with the right label
PS. I'm the developer of the jtc - multithreading JSON processing utility for unix.
PPS. the disclaimer is required by SO.

Create .jsonl files from .csv

I want to use AutoML, specifically the Entity extraction, however, I'm asked to upload a .jsonl file.
I don't know what a .jsonl file is nor how to create it. I only have a .csv file.
So, how can I create a .jsonl file from a .csv file? And if that is not possible, how can I create a .jsonl file?
This is the JSON Lines format: http://jsonlines.org/
You can use Miller (https://github.com/johnkerl/miller). For example, if your input CSV is
fieldOne,FieldTwo
1,lorem
2,ipsum
you can run
mlr --c2j cat input_01.csv >output.json
to have
{ "fieldOne": 1, "FieldTwo": "lorem" }
{ "fieldOne": 2, "FieldTwo": "ipsum" }
This output is JSON Lines (one valid JSON object per row). If you want a single JSON array instead, add the --jlistwrap flag:
mlr --c2j --jlistwrap cat input.csv
to have
[
{ "fieldOne": 1, "FieldTwo": "lorem" }
,{ "fieldOne": 2, "FieldTwo": "ipsum" }
]
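If installing Miller is not an option, the same conversion can be sketched with Python's standard csv and json modules. csv_to_jsonl is a hypothetical helper name, and the in-memory io.StringIO objects stand in for real files. Note one difference from the Miller output above: csv.DictReader does no type inference, so every value stays a string.

```python
import csv
import io
import json

def csv_to_jsonl(csv_file, jsonl_file):
    """Write one JSON object per CSV row, using the header row as keys."""
    for row in csv.DictReader(csv_file):
        jsonl_file.write(json.dumps(row) + "\n")

# In-memory stand-ins for the input CSV and the output .jsonl file.
source = io.StringIO("fieldOne,FieldTwo\n1,lorem\n2,ipsum\n")
out = io.StringIO()
csv_to_jsonl(source, out)

jsonl_text = out.getvalue()
print(jsonl_text, end="")
```

With real files, open the CSV with newline="" (as the csv module documentation recommends) and pass the two file objects to the same function.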

how to extract multiple values from json object using jq command

I am trying to get multiple values from a json object.
{
"nextToken": "9i2x1mbCpfo5hQ",
"jobSummaryList": [
{
"jobName": "012210",
"jobId": "0196f81cae73"
}
]
}
I want nextToken's value and jobName in one jq command.
https://stedolan.github.io/jq/manual/
jq '.nextToken, .jobSummaryList[].jobName' file
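The jq filter above emits nextToken first and then one jobName per entry in jobSummaryList. A rough Python equivalent, given only as a sketch of what the filter does (the variable names doc and values are my own), would be:

```python
import json

doc = json.loads("""
{
  "nextToken": "9i2x1mbCpfo5hQ",
  "jobSummaryList": [
    {"jobName": "012210", "jobId": "0196f81cae73"}
  ]
}
""")

# nextToken first, then the jobName of every entry, in document order.
values = [doc["nextToken"]] + [job["jobName"] for job in doc["jobSummaryList"]]
for value in values:
    print(value)
```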

Need help! - Unable to load JSON using COPY command

Need your expertise here!
I am trying to load a JSON file (generated by JSON dumps) into Redshift using the COPY command. The file is in the following format:
[
{
"cookieId": "cb2278",
"environment": "STAGE",
"errorMessages": [
"70460"
]
}
,
{
"cookieId": "cb2271",
"environment": "STG",
"errorMessages": [
"70460"
]
}
]
We ran into the error "Invalid JSONPath format: Member is not an object."
When I got rid of the square brackets [] and removed the comma separator between the JSON dicts, it loaded perfectly fine:
{
"cookieId": "cb2278",
"environment": "STAGE",
"errorMessages": [
"70460"
]
}
{
"cookieId": "cb2271",
"environment": "STG",
"errorMessages": [
"70460"
]
}
But in reality most JSON files from APIs have this formatting.
I could do a string replace or regex to get rid of the , and [], but I am wondering if there is a better way to load into Redshift seamlessly without modifying the file.
One way to convert a JSON array into a stream of the array's elements is to pipe the former into jq '.[]'. The output is sent to stdout.
If the JSON array is in a file named input.json, then the following command will produce a stream of the array's elements on stdout:
$ jq ".[]" input.json
If you want the output in jsonlines format, then use the -c switch (i.e. jq -c ......).
For more on jq, see https://stedolan.github.io/jq
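If jq is not at hand, the same array-to-stream conversion can be sketched with Python's standard json module. array_to_jsonl is a hypothetical helper name. The output roughly mirrors jq -c '.[]', with one compact object per line.

```python
import json

def array_to_jsonl(text):
    """Turn a JSON array into JSON Lines text, one element per line."""
    items = json.loads(text)
    if not isinstance(items, list):
        raise ValueError("expected a top-level JSON array")
    # Compact separators keep each element on a single line without padding.
    return "".join(json.dumps(item, separators=(",", ":")) + "\n" for item in items)

array_text = """
[
  {"cookieId": "cb2278", "environment": "STAGE", "errorMessages": ["70460"]},
  {"cookieId": "cb2271", "environment": "STG", "errorMessages": ["70460"]}
]
"""

jsonl_text = array_to_jsonl(array_text)
print(jsonl_text, end="")
```

This still produces a converted copy of the data rather than loading the original array unchanged, so it is a substitute for the string-replace approach, not a way around the conversion itself.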