How to get a subobject out of JSON using jq, keeping final key in the result without Bash processing?

I'm writing a Bash function to get a portion of a JSON object. The API for the function is:
GetSubobject()
{
  local Filter="$1" # Filter is of the form .<key>.<key> ... .<key>
  local File="$2"   # File containing the JSON to get the subobject from
  # Code to get subobject using jq
  # ...
}
To illustrate what I mean by a subobject, consider the Bash function call:
GetSubobject .b.x.y example.json
where the file example.json contains:
{
  "a": { "p": 1, "q": 2 },
  "b":
  {
    "x":
    {
      "y": { "j": true, "k": [1,2,3] },
      "z": [4,5,6]
    }
  }
}
The result from the function call would be emitted to stdout:
{
  "y": {
    "j": true,
    "k": [
      1,
      2,
      3
    ]
  }
}
Note that the code jq -r "$Filter" "$File" would not give the desired answer. It would give:
{ "j": true, "k": [1,2,3] }
Please note that the answer I'm looking for needs to be something I can use in the Bash function API above. That is, the answer should use the Filter and File variables as shown above and not be specific to the example above.
I have come up with a solution; however, it relies on Bash to do part of the job. I am hoping that the solution can be pure jq without reliance on Bash processing.
#!/bin/bash
GetSubobject()
{
  local Filter="$1"
  local File="$2"
  # General case: separate
  #   .<key1>.<key2> ... .<keyN-1>.<keyN>
  # into:
  #   Prefix=.<key1>.<key2> ... .<keyN-1>
  #   Suffix=<keyN>
  local Suffix="${Filter##*.}"
  local Prefix="${Filter%.$Suffix}"
  # Edge case: where Filter = .<key>
  # Set:
  #   Prefix=.
  #   Suffix=<key>
  if [[ -z $Prefix ]]; then
    Prefix='.'
    Suffix="${Filter#.}"
  fi
  jq -r "$Prefix|to_entries|map(select(.key==\"$Suffix\"))|from_entries" "$File"
}
GetSubobject "$#"
How would I complete the above Bash function using jq to obtain the desired result, hopefully in a less brute-force way that takes advantage of jq's capabilities without having to do pre-processing in Bash?

Somewhat further simplifying the jq part but with the same general constraints as JawguyChooser's answer, how about the much simpler Bash function
GetSubobject () {
  local newroot=${1##*.}
  jq -r "{$newroot: $1}" "$2"
}
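For instance, running it against the example.json from the question (a quick check, not an exhaustive test):
GetSubobject .b.x.y example.json
{
  "y": {
    "j": true,
    "k": [
      1,
      2,
      3
    ]
  }
}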
I may be overlooking some nuances of your more-complex Bash processing, but this seems to work for the example you provided.

If I understand what you're trying to do correctly, it doesn't seem possible to me to do it "pure jq" having read the docs (and being a regular jq user myself). The closest I could come to helping here was to simplify the jq part itself:
jq -r "$Prefix| { $Suffix }" "$File"
This has the same behavior as your example (on this limited set of cases):
GetSubobject '.b.x.y' example.json
{
  "y": {
    "j": true,
    "k": [
      1,
      2,
      3
    ]
  }
}
This is really a case of metaprogramming: you want to programmatically operate on a jq program. Well, it makes sense (to me) that jq takes its program as input but doesn't allow you to alter the program itself. Bash seems like an appropriate choice for doing the metaprogramming here: converting one jq program into another and then running jq with the result.

If the goal is to do as little as possible in bash, then maybe the following bash function will fill the bill:
function GetSubobject {
  local Filter="$1" # Filter is of the form .<key>.<key> ... .<key>
  local File="$2"   # File containing the JSON to get the subobject from
  jq '(null|path('"$Filter"')) as $path
      | {($path[-1]): '"$Filter"'}' "$File"
}
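For the record, the trick here is that path/1 can be evaluated against a null input: with Filter set to .b.x.y, the expression yields the array ["b","x","y"], so $path[-1] is "y". A quick check directly in jq:
$ jq -n 'null | path(.b.x.y)'
[
  "b",
  "x",
  "y"
]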
An alternative would be to pass $Filter in as a string (e.g. --arg filter "$Filter"), have jq do the parsing, and then use getpath.
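A minimal sketch of that alternative, assuming the filter contains only simple keys (no quoted keys, array indices, or bracket syntax):
function GetSubobject {
  local Filter="$1"   # e.g. .b.x.y
  local File="$2"
  jq --arg filter "$Filter" '
    ($filter | ltrimstr(".") | split(".")) as $path   # ".b.x.y" -> ["b","x","y"]
    | {($path[-1]): getpath($path)}' "$File"
}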
It would of course be simplest if GetSubobject could be called with the path separated from the field of interest, like this:
GetSubobject .b.x y filename
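In that case the whole body collapses to a one-liner, sketched here under the assumption that the key is a plain identifier:
GetSubobject() {
  local Path="$1" Key="$2" File="$3"
  jq "$Path | {\"$Key\": .[\"$Key\"]}" "$File"
}
GetSubobject .b.x y example.json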

Related

How to add commas in between JSON objects using Linux Shell and SnowSQL?

While there are several posts about this topic on Stack Overflow, none match my exact use case. I am using a Linux shell script to run SnowSQL to generate a json file.
My json file needs to have a comma between json objects.
This:
{
  "CAMPAIGN": "Welcome_New",
  "UUID": "fe881781-bdc2-41b2-95f2-e0e8c19dc597"
}
{
  "CAMPAIGN": "Welcome_Existing",
  "UUID": "77a41c02-beb9-48bf-ada4-b2074c1a78cb"
}
...needs to look like this:
{
  "CAMPAIGN": "Welcome_New",
  "UUID": "fe881781-bdc2-41b2-95f2-e0e8c19dc597"
},
{
  "CAMPAIGN": "Welcome_Existing",
  "UUID": "77a41c02-beb9-48bf-ada4-b2074c1a78cb"
}
Here is my complete ksh script:
#!/usr/bin/ksh
. /appl/.snf_logon
export SNOW_PKEY_FILE=$(mktemp ./pkey-XXXXXX)
trap "rm -f ${SNOW_PKEY_FILE}" EXIT
LibGetSnowCred
{
outFile=JSON_FILE_TYPE_TEST.json
inDir=/testing
outFileNm=@my_db.my_schema.my_file_stage/${outFile}
snowsql \
--private-key-path $SNOW_PKEY_FILE \
-o exit_on_error=true \
-o friendly=false \
-o timing=false \
-o log_level=ERROR \
-o echo=true <<!
COPY INTO ${outFileNm}
FROM (SELECT object_construct(
'UUID',UUID
,'CAMPAIGN',CAMPAIGN)
FROM my_db.my_schema.JSON_Test_Table
LIMIT 2)
FILE_FORMAT=(
TYPE=JSON
COMPRESSION=NONE
)
OVERWRITE=True
HEADER=False
SINGLE=True
MAX_FILE_SIZE=4900000000
;
get ${outFileNm} file://${inDir}/;
rm ${outFileNm};
!
if [ $? -eq 0 ]; then
echo "Export successful"
else
echo "ERROR in export"
fi
}
Is it best practice to add the comma during the SELECT, or after the file is generated, and how?
With or without that comma, the text is still not JSON but just random text that looks like JSON. You export several rows, each row as an independent object. You need to gather all these objects into an array to produce valid JSON.
A JSON that encodes an array of rows looks like this:
[
  {
    "CAMPAIGN": "Welcome_New",
    "UUID": "fe881781-bdc2-41b2-95f2-e0e8c19dc597"
  },
  {
    "CAMPAIGN": "Welcome_Existing",
    "UUID": "77a41c02-beb9-48bf-ada4-b2074c1a78cb"
  }
]
The easiest way to produce this output would be to ask the database to wrap all the records into a list before generating the JSON (rather than exporting each record as a separate JSON document), if it supports that option.
If this is not possible, then you have a file that contains multiple JSON documents. You can use jq to combine these individual documents into a single JSON like the one described above (encoding an array of objects).
It is as simple as that:
jq --slurp '.' input_file > output_file
The --slurp option tells jq to read all the JSON documents from the file input_file into memory, parse them, and put them into an array. That array is the program input.
'.' is the jq program. It says "dump the current object". It does no processing of the input data. The current object is the array.
After it executes the program (which, in this case, doesn't change anything), jq dumps the resulting value (as JSON, of course) to the standard output (by default, on screen).
The > output_file part redirects this output to a file (named output_file) instead of showing it on screen.
You can see how it works on the jq playground.
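For completeness, the same result can be obtained without slurping, using the inputs builtin (a minor variation; -n stops jq from consuming the first document implicitly, so [inputs] collects them all):
jq -n '[inputs]' input_file > output_file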

JQ write each object to subdirectory file

I'm new to jq (around 24 hours). I'm getting the filtering/selection already, but I'm wondering about advanced I/O features. Let's say I have an existing jq query that works fine, producing a stream (not a list) of objects. That is, if I pipe them to a file, it produces:
{
  "id": "foo",
  "value": "123"
}
{
  "id": "bar",
  "value": "456"
}
Is there some fancy expression I can add to my jq query to output each object individually in a subdirectory, keyed by the id, in the form id/id.json? For example current-directory/foo/foo.json and current-directory/bar/bar.json?
As #pmf has pointed out, an "only-jq" solution is not possible. A solution using jq and awk is as follows, though it is far from robust:
<input.json jq -rc '.id, .' | awk '
  id=="" { id=$0; next; }
  {
    path=id; gsub(/[/]/, "_", path);   # sanitize the id for use as a directory name
    system("mkdir -p " path);
    print >> (path "/" id ".json");
    id="";
  }
'
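To make the hand-off between the two halves explicit: jq -rc '.id, .' emits, for each object, its raw id on one line followed by the compact object on the next, which is exactly the alternating stream the awk script pairs back up. For the two objects above:
$ <input.json jq -rc '.id, .'
foo
{"id":"foo","value":"123"}
bar
{"id":"bar","value":"456"}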
As you will need help from outside jq anyway (see #peak's answer using awk), you also might want to consider using another JSON processor instead which offers more I/O features. One that comes to my mind is mikefarah/yq, a jq-inspired processor for YAML, JSON, and other formats. It can split documents into multiple files, and since its v4.27.2 release it also supports reading multiple JSON documents from a single input source.
$ yq -p=json -o=json input.json -s '.id'
$ cat foo.json
{
  "id": "foo",
  "value": "123"
}
$ cat bar.json
{
  "id": "bar",
  "value": "456"
}
The argument following -s defines the evaluation filter for each output file's name, .id in this case (the .json suffix is added automatically), and can be adapted to further needs, e.g. -s '"file_with_id_" + .id'. However, adding slashes will not result in subdirectories being created, so this (from here on comparatively easy) part is left for post-processing in the shell, as sketched below.
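A sketch of that post-processing, assuming the split files landed in the current directory and the original input.json should be left in place:
for f in *.json; do
  [ "$f" = input.json ] && continue   # skip the source file
  id=${f%.json}
  mkdir -p "$id" && mv -- "$f" "$id/$id.json"
done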

Accessing JSON value via jq using variable key

I have a file with JSON like:
test.json
{
  "NUTS|/nuts/2010": {
    "type": "small",
    "mfg": "TSQQ",
    "colors": []
  }
}
I am getting "NUTS|/nuts/2010" from outside and storing it in a shell variable. I am trying to use the snippet below with the jq utility, but I am not able to access the corresponding JSON value for that key.
test.sh
#!/bin/bash
NUTS_PATH="NUTS|/nuts/2010" #Storing in shell variable
INPUT_FILE="test.json"
RESULT=($(jq -r --arg NUTS_PATH_ALIAS "$NUTS_PATH" '.[$NUTS_PATH_ALIAS]' $INPUT_FILE))
echo "Result: $RESULT"
echo $RESULT > item.json
When I run this, I am getting:
Result: {
But it should return
{
  "type": "small",
  "mfg": "TSQQ",
  "colors": []
}
Any help? Thanks.
The problem isn't associated with jq at all. What you have should work fine, but the issue is that you assign the result to an array when you probably intended to store it in a plain variable:
RESULT=($(jq -r --arg NUTS_PATH_ALIAS "$NUTS_PATH" '.[$NUTS_PATH_ALIAS]' $INPUT_FILE))
#      ^^^ =( .. ) stores the result into an array
A variable-like expansion of an array, of the form $RESULT, refers to the element at index 0, i.e. ${RESULT[0]}, which here contains just the { character of the raw JSON output.
You should ideally be doing
RESULT="$(jq -r --arg NUTS_PATH_ALIAS "$NUTS_PATH" '.[$NUTS_PATH_ALIAS]' "$INPUT_FILE")"
I always end up swearing at jq, too!
For me, this jq query works:
$ jq '.["NUTS|/nuts/2010"]' test.json
{
  "type": "small",
  "mfg": "TSQQ",
  "colors": []
}
However, because you've got pipes and slashes in your string, the variable quoting gets a bit funny.
NUTS_PATH='"NUTS|/nuts/2010"' #Note the two sets of quotes
INPUT_FILE="test.json"
RESULT=$(jq ".[$NUTS_PATH]" $INPUT_FILE)
echo "Result: $RESULT"
Result: {
  "type": "small",
  "mfg": "TSQQ",
  "colors": []
}
Disclaimer: I'm not a Bash expert; there may be (probably is) a better way to sort out the quoting.

Using JQ to merge two JSON snippets from one file

I've got output from a script that outputs two structurally identical JSON snippets into one file:
{
  "Objects": [
    {
      "Key": "somevalue",
      "VersionId": "someversion"
    }
  ],
  "Quiet": false
}
{
  "Objects": [
    {
      "Key": "someothervalue",
      "VersionId": "someotherversion"
    }
  ],
  "Quiet": false
}
I would like to pass this output through JQ to have one Objects[] list, concatenating all of the objects within the two lists, and outputting the same overall structure. I can accomplish it with piping between two separate JQ commands:
jq '.Objects[]' inputfile | jq -s '{"Objects":., "Quiet":false}' -
But I'm wondering if there is a more elegant way to do so using only one invocation of JQ.
I'm currently using JQ version 1.5 but can update if needed.
You don't need to invoke JQ twice there. The second object can be fetched using the input keyword.
.Objects += input.Objects
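For example, run against the two-snippet input file from the question (output as jq pretty-prints it):
$ jq '.Objects += input.Objects' inputfile
{
  "Objects": [
    {
      "Key": "somevalue",
      "VersionId": "someversion"
    },
    {
      "Key": "someothervalue",
      "VersionId": "someotherversion"
    }
  ],
  "Quiet": false
}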
You can use reduce:
jq -s 'reduce .[] as $item ({ Quiet: false }; .Objects += $item.Objects)'
As #oguz-ismail suggested in a comment, the -s (slurp) flag can be removed by using inputs to get the rest of the entries after the first one:
jq 'reduce inputs as $item (.; .Objects += $item.Objects)'
Both versions work with any number of entries in the input (the second version requires at least one).

unix command to filter the json

[
  {
    "name":"sandboxserver.tar.gz.part-aa",
    "hash":"010d126f8ccf199f3cd5f468a90d5ae1",
    "bytes":4294967296,
    "last_modified":"2018-10-10T01:32:00.069000",
    "content_type":"binary/octet-stream"
  },
  {
    "name":"sandboxserver.tar.gz.part-ab",
    "hash":"49a6f22068228f51488559c096aa06ce",
    "bytes":397973601,
    "last_modified":"2018-10-10T01:32:22.395000",
    "content_type":"binary/octet-stream"
  },
  {
    "name":"sandboxserver.tar.gz.part-ac",
    "hash":"2c5e845f46357e203214592332774f4c",
    "bytes":5179281858,
    "last_modified":"2018-10-11T08:20:11.566000",
    "content_type":"binary/octet-stream"
  }
]
I am getting the above JSON as a response when listing the objects in cloud object storage using curl -l -X GET. How can I get each object's "name" assigned to an array while looping through all the objects?
for example
array[1]="sandboxserver.tar.gz.part-aa"
array[2]="sandboxserver.tar.gz.part-ab"
array[3]="sandboxserver.tar.gz.part-ac"
You can use jq.
jq is a powerful tool that lets you read, filter, and write JSON in bash.
You might need to install it first.
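For example, on common platforms (standard package names assumed):
sudo apt-get install jq    # Debian/Ubuntu
brew install jq            # macOS (Homebrew)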
Try this:
I've pasted your json into a file:
~$ cat n1.json
[
  {
    "name":"sandboxserver.tar.gz.part-aa",
    "hash":"010d126f8ccf199f3cd5f468a90d5ae1",
    "bytes":4294967296,
    "last_modified":"2018-10-10T01:32:00.069000",
    "content_type":"binary/octet-stream"
  },
  {
    "name":"sandboxserver.tar.gz.part-ab",
    "hash":"49a6f22068228f51488559c096aa06ce",
    "bytes":397973601,
    "last_modified":"2018-10-10T01:32:22.395000",
    "content_type":"binary/octet-stream"
  },
  {
    "name":"sandboxserver.tar.gz.part-ac",
    "hash":"2c5e845f46357e203214592332774f4c",
    "bytes":5179281858,
    "last_modified":"2018-10-11T08:20:11.566000",
    "content_type":"binary/octet-stream"
  }
]
And then used jq to find the names:
~$ jq -r '.[].name' n1.json
sandboxserver.tar.gz.part-aa
sandboxserver.tar.gz.part-ab
sandboxserver.tar.gz.part-ac
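And to complete the loop the question asks about, the names can be read straight into a Bash array (a sketch; mapfile needs Bash 4+ and fills indices starting at 0):
mapfile -t array < <(jq -r '.[].name' n1.json)
printf '%s\n' "${array[@]}"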
If you don't want to depend on an external utility like jq, you can use a Python + Bash combo to do the trick.
response="$(cat data.json)"
declare -a array
array=($(python -c "import json,sys; data=[arr['name'] for arr in json.loads(sys.argv[1])]; print('\n'.join(data));" "$response"))
echo "${array[#]}"
Advice: embedded Python code can quickly become unreadable, so you may want to put the Python code in a separate script and run that script instead.