unix command to filter the json

[
{
"name":"sandboxserver.tar.gz.part-aa",
"hash":"010d126f8ccf199f3cd5f468a90d5ae1",
"bytes":4294967296,
"last_modified":"2018-10-10T01:32:00.069000",
"content_type":"binary/octet-stream"
},
{
"name":"sandboxserver.tar.gz.part-ab",
"hash":"49a6f22068228f51488559c096aa06ce",
"bytes":397973601,
"last_modified":"2018-10-10T01:32:22.395000",
"content_type":"binary/octet-stream"
},
{
"name":"sandboxserver.tar.gz.part-ac",
"hash":"2c5e845f46357e203214592332774f4c",
"bytes":5179281858,
"last_modified":"2018-10-11T08:20:11.566000",
"content_type":"binary/octet-stream"
}
]
I am getting the above JSON as a response while listing the objects in cloud object storage using curl -l -X GET. How can I get each object's "name" assigned to an array while looping through all the objects?
for example
array[1]="sandboxserver.tar.gz.part-aa"
array[2]="sandboxserver.tar.gz.part-ab"
array[3]="sandboxserver.tar.gz.part-ac"

You can use jq.
jq is a powerful tool that lets you read, filter, and write JSON in bash.
You might need to install it first.
Try this:
I've pasted your json into a file:
~$ cat n1.json
[
{
"name":"sandboxserver.tar.gz.part-aa",
"hash":"010d126f8ccf199f3cd5f468a90d5ae1",
"bytes":4294967296,
"last_modified":"2018-10-10T01:32:00.069000",
"content_type":"binary/octet-stream"
},
{
"name":"sandboxserver.tar.gz.part-ab",
"hash":"49a6f22068228f51488559c096aa06ce",
"bytes":397973601,
"last_modified":"2018-10-10T01:32:22.395000",
"content_type":"binary/octet-stream"
},
{
"name":"sandboxserver.tar.gz.part-ac",
"hash":"2c5e845f46357e203214592332774f4c",
"bytes":5179281858,
"last_modified":"2018-10-11T08:20:11.566000",
"content_type":"binary/octet-stream"
}
]
And then used jq to find the names:
~$ jq -r '.[].name' n1.json
sandboxserver.tar.gz.part-aa
sandboxserver.tar.gz.part-ab
sandboxserver.tar.gz.part-ac
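If you want the names in a bash array, as in the question, you can read that output into an array with mapfile (bash 4+). Note that bash arrays are zero-indexed:
mapfile -t array < <(jq -r '.[].name' n1.json)
echo "${array[0]}"   # sandboxserver.tar.gz.part-aa
echo "${array[@]}"   # all three names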

If you don't want to depend on an external utility like jq, you can use a python + bash combo to do the trick.
response="$(cat data.json)"
declare -a array
array=($(python -c "import json,sys; data=[arr['name'] for arr in json.loads(sys.argv[1])]; print('\n'.join(data));" "$response"))
echo "${array[#]}"
Advice: embedded python code can quickly become unreadable, so you may want to put the python code in a separate script and run that script.
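For example, a sketch of that split, with the python moved into a hypothetical helper script names.py:
~$ cat names.py
import json, sys
for obj in json.load(open(sys.argv[1])):
    print(obj["name"])
~$ mapfile -t array < <(python names.py data.json)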

How to add commas in between JSON objects using Linux Shell and SnowSQL?

While there are several posts about this topic on Stack Overflow, none match my exact use case. I am using a Linux shell script to run SnowSQL to generate a json file.
My json file needs to have a comma between json objects.
This:
{
"CAMPAIGN": "Welcome_New",
"UUID": "fe881781-bdc2-41b2-95f2-e0e8c19dc597"
}
{
"CAMPAIGN": "Welcome_Existing",
"UUID": "77a41c02-beb9-48bf-ada4-b2074c1a78cb"
}
...needs to look this:
{
"CAMPAIGN": "Welcome_New",
"UUID": "fe881781-bdc2-41b2-95f2-e0e8c19dc597"
},
{
"CAMPAIGN": "Welcome_Existing",
"UUID": "77a41c02-beb9-48bf-ada4-b2074c1a78cb"
}
Here is my complete ksh script:
#!/usr/bin/ksh
. /appl/.snf_logon
export SNOW_PKEY_FILE=$(mktemp ./pkey-XXXXXX)
trap "rm -f ${SNOW_PKEY_FILE}" EXIT
LibGetSnowCred
{
outFile=JSON_FILE_TYPE_TEST.json
inDir=/testing
outFileNm=@my_db.my_schema.my_file_stage/${outFile}
snowsql \
--private-key-path $SNOW_PKEY_FILE \
-o exit_on_error=true \
-o friendly=false \
-o timing=false \
-o log_level=ERROR \
-o echo=true <<!
COPY INTO ${outFileNm}
FROM (SELECT object_construct(
'UUID',UUID
,'CAMPAIGN',CAMPAIGN)
FROM my_db.my_schema.JSON_Test_Table
LIMIT 2)
FILE_FORMAT=(
TYPE=JSON
COMPRESSION=NONE
)
OVERWRITE=True
HEADER=False
SINGLE=True
MAX_FILE_SIZE=4900000000
;
get ${outFileNm} file://${inDir}/;
rm ${outFileNm};
!
if [ $? -eq 0 ]; then
echo "Export successful"
else
echo "ERROR in export"
fi
}
Is it best practice to add the comma during the SELECT, or after the file is generated, and how?
With or without that comma, the text is still not valid JSON, just text that looks like JSON. You are exporting several rows, each as an independent object. You need to gather all these objects into an array to produce valid JSON.
A JSON that encodes an array of rows looks like this:
[
{
"CAMPAIGN": "Welcome_New",
"UUID": "fe881781-bdc2-41b2-95f2-e0e8c19dc597"
},
{
"CAMPAIGN": "Welcome_Existing",
"UUID": "77a41c02-beb9-48bf-ada4-b2074c1a78cb"
}
]
The easiest way to produce this output would be to ask the database, if it supports this option, to wrap all the records into a single array before generating the JSON, instead of exporting each record as a separate JSON document.
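For example, Snowflake (which the question uses) can aggregate the rows into a single array with array_agg before export; a sketch of the modified SELECT, assuming the aggregated array stays within Snowflake's size limit for a single value:
COPY INTO ${outFileNm}
FROM (SELECT array_agg(object_construct(
'UUID',UUID
,'CAMPAIGN',CAMPAIGN))
FROM my_db.my_schema.JSON_Test_Table)
FILE_FORMAT=(TYPE=JSON COMPRESSION=NONE)
OVERWRITE=True SINGLE=True
;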
If this is not possible then you have a file that contains multiple JSONs. You can use jq to convert these individual JSONs into a JSON similar to the one described above (encoding an array of objects).
It is as simple as that:
jq --slurp '.' input_file > output_file
The --slurp option tells jq to read all the JSON values from input_file into memory, parse them, and put them into an array. That array is the program input.
'.' is the jq program. It says "output the current value" and does no processing of the input data. The current value is the array.
After executing the program (which, in this case, changes nothing), jq dumps the resulting value (as JSON, of course) to standard output (by default, the screen).
The > output_file part redirects this output to a file (named output_file) instead of showing it on screen.
You can see how it works on the jq playground.
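A quick way to see the effect on a small stream of objects:
~$ echo '{"a":1} {"b":2}' | jq --slurp '.'
[
  {
    "a": 1
  },
  {
    "b": 2
  }
]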

JQ write each object to subdirectory file

I'm new to jq (around 24 hours). I'm getting the filtering/selection already, but I'm wondering about advanced I/O features. Let's say I have an existing jq query that works fine, producing a stream (not a list) of objects. That is, if I pipe them to a file, it produces:
{
"id": "foo"
"value": "123"
}
{
"id": "bar"
"value": "456"
}
Is there some fancy expression I can add to my jq query to output each object individually in a subdirectory, keyed by the id, in the form id/id.json? For example current-directory/foo/foo.json and current-directory/bar/bar.json?
As @pmf has pointed out, an "only-jq" solution is not possible. A solution using jq and awk is as follows, though it is far from robust:
<input.json jq -rc '.id, .' | awk '
id=="" { id=$0; next }              # odd lines: remember the id
{ path=id; gsub(/[/]/, "_", path);  # sanitize slashes for the directory name
system("mkdir -p " path);
print >> (path "/" id ".json");     # even lines: write the object itself
id="";
}
'
As you will need help from outside jq anyway (see @peak's answer using awk), you might also want to consider using another JSON processor that offers more I/O features. One that comes to mind is mikefarah/yq, a jq-inspired processor for YAML, JSON, and other formats. It can split documents into multiple files, and since its v4.27.2 release it also supports reading multiple JSON documents from a single input source.
$ yq -p=json -o=json input.json -s '.id'
$ cat foo.json
{
"id": "foo",
"value": "123"
}
$ cat bar.json
{
"id": "bar",
"value": "456"
}
The argument following -s defines the evaluation filter for each output file's name, .id in this case (the .json suffix is added automatically), and can be adjusted to further needs, e.g. -s '"file_with_id_" + .id'. However, adding slashes will not result in subdirectories being created, so this (from here on comparatively easy) part is left for post-processing in the shell, as sketched below.
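If you do need the id/id.json layout from the question, a short shell loop can finish the job; a sketch, assuming the split files are the only *.json files in the directory besides the input:
for f in *.json; do
  [ "$f" = input.json ] && continue  # skip the source file
  id=${f%.json}                      # strip the extension to recover the id
  mkdir -p "$id"
  mv "$f" "$id/$id.json"
done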

How to replace parameter of a json file by a shell script?

Let's say 123.json with below content:
{
"LINE" : {
"A_serial" : "1234",
"B_serial" : "2345",
"C_serial" : "3456",
"X_serial" : "76"
}
}
If I want to use a shell script to change the parameter of X_serial by the original number +1 which is 77 in this example.
I have tried the below script to take out the parameter of X_serial:
grep "X_serial" 123.json | awk {print"$3"}
which outputs 76. But then I don't know how to make it into 77 and then put it back to the parameter of X_serial.
It's not a good idea to use line-oriented tools for parsing/manipulating JSON data. Use jq instead, for example:
$ jq '.LINE.X_serial |= "\(tonumber + 1)"' 123.json
{
"LINE": {
"A_serial": "1234",
"B_serial": "2345",
"C_serial": "3456",
"X_serial": "77"
}
}
This simply updates .LINE.X_serial by converting its value to a number, increasing the result by one, and converting it back to a string.
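Note that jq does not edit files in place; to put the new value back into 123.json, write to a temporary file and move it over the original:
jq '.LINE.X_serial |= "\(tonumber + 1)"' 123.json > 123.json.tmp && mv 123.json.tmp 123.json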
You need to install a powerful JSON processor like jq; you can easily install it from here.
Once jq is installed, try the following command to extract the value from the JSON:
value=$(jq -r '.LINE.X_serial' yourJsonFile.json)
You can then apply whatever operations you prefer to $value.
With pure JavaScript, using node.js and bash:
node <<EOF
var o=$(</tmp/file);
o["LINE"]["X_serial"] = parseInt(o["LINE"]["X_serial"]) + 1;
console.log(o);
EOF
 Output
{ LINE:
{ A_serial: '1234',
B_serial: '2345',
C_serial: '3456',
X_serial: 77 }
}
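Note that console.log prints JavaScript object notation rather than strict JSON; to write valid JSON back out (e.g. to redirect into a file), you could use console.log(JSON.stringify(o, null, 2)) instead.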
sed or perl, depending on whether you just need string substitution or something more sophisticated, like arithmetic.
Since you tried grep and awk, let's start with sed:
In all lines that contain TEXT, replace foo with bar
sed -n '/TEXT/ s/foo/bar/ p'
So in your case, something like:
sed -n '/X_serial/ s/\"76\"/\"77\"/ p'
or
$ cat 123.json | sed '/X_serial/ s/\"76\"/\"77\"/' > new.json
This performs a literal substitution: "76" -> "77"
If you would like to perform arithmetic, like "+1" or "+10" then use perl not sed:
$ cat 123.json | perl -pe 's/\d+/$&+10/e if /X_serial/'
{
"LINE" : {
"A_serial" : "1234",
"B_serial" : "2345",
"C_serial" : "3456",
"X_serial" : "86"
}
}
This operates on all lines containing X_serial (whether under "LINE" or under something else), as it is not a json parser.

CURL Get download link from request and download file

I'm using conversocial API:
https://api-docs.conversocial.com/1.1/reports/
Using the sample from the documentation, after all my tweaks I receive this output:
{
"report": {
"name": "dump", "generation_start_date": "2012-05-30T17:09:40",
"url": "https://api.conversocial.com/v1.1/reports/5067",
"date_from": "2012-05-21",
"generated_by": {
"url": "https://api.conversocial.com/v1.1/moderators/11599",
"id": "11599"
},
"generated_date": "2012-05-30T17:09:41",
"channel": {
"url": "https://api.conversocial.com/v1.1/channels/387",
"id": "387"
},
"date_to": "2012-05-28",
"download": "https://s3.amazonaws.com/conversocial/reports/70c68360-1234/#twitter-from-may-21-2012-to-may-28-2012.zip",
"id": "5067"
}
}
Currently, I can filter this JSON output down to download only, receiving this output:
{
"report" : {
"download" : "https://s3.amazonaws.com/conversocial/reports/70c68360-1234/#twitter-from-may-21-2012-to-may-28-2012.zip"
}
}
Is there any way of automating this process using curl, to make curl download the file?
To download I'm planning to use simple way as:
curl URL_LINK > FILEPATH/EXAMPLE.ZIP
What I'm currently thinking: is there a way to replace URL_LINK with the download link? Or is there any other way, method, or workaround?
Give this a try:
curl $(curl -s https://httpbin.org/get | jq ".url" -r) > file
Just replace the URL and the jq filter based on your JSON; in your case that would be:
jq ".report.download" -r
The -r option removes the double quotes.
The way it works is by using a command substitution $():
$(curl -s https://httpbin.org/get | jq ".url" -r)
This fetches your URL and extracts the download URL from the returned JSON using jq; the result is then passed to curl as an argument.
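Applied to the report JSON above, the pattern would look like this (a sketch; any authentication the conversocial API requires would need to be added to the inner curl call):
curl -o FILEPATH/EXAMPLE.ZIP "$(curl -s 'https://api.conversocial.com/v1.1/reports/5067' | jq -r '.report.download')"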

How to get a subobject out of JSON using jq, keeping final key in the result without Bash processing?

I'm writing a Bash function to get a portion of a JSON object. The API for the function is:
GetSubobject()
{
local Filter="$1" # Filter is of the form .<key>.<key> ... .<key>
local File="$2" # File is the JSON to get the subobject
# Code to get subobject using jq
# ...
}
To illustrate what I mean by a subobject, consider the Bash function call:
GetSubobject .b.x.y example.json
where the file example.json contains:
{
"a": { "p": 1, "q": 2 },
"b":
{
"x":
{
"y": { "j": true, "k": [1,2,3] },
"z": [4,5,6]
}
}
}
The result from the function call would be emitted to stdout:
{
"y": {
"j": true,
"k": [
1,
2,
3
]
}
}
Note that the code jq -r "$Filter" "$File" would not give the desired answer. It would give:
{ "j": true, "k": [1,2,3] }
Please note that the answer I'm looking for needs to be something I can use in the Bash function API above. So, the answer should use the Filter and File variables as shown above and not be specific to the example above.
I have come up with a solution; however, it relies on Bash to do part of the job. I am hoping that the solution can be pure jq without reliance on Bash processing.
#!/bin/bash
GetSubobject()
{
local Filter="$1"
local File="$2"
# General case: separate:
# .<key1>.<key2> ... .<keyN-1>.<keyN>
# into:
# Prefix=.<key1>.<key2> ... .<keyN-1>
# Suffix=<keyN>
local Suffix="${Filter##*.}"
local Prefix="${Filter%.$Suffix}"
# Edge case: where Filter = .<key>
# Set:
# Prefix=.
# Suffix=<key>
if [[ -z $Prefix ]]; then
Prefix='.'
Suffix="${Filter#.}"
fi
jq -r "$Prefix|to_entries|map(select(.key==\"$Suffix\"))|from_entries" "$File"
}
GetSubobject "$#"
How would I complete the above Bash function using jq to obtain the desired result, hopefully in a less brute-force way that takes advantage of jq's capabilities without having to do pre-processing in Bash?
Somewhat further simplifying the jq part but with the same general constraints as JawguyChooser's answer, how about the much simpler Bash function
GetSubobject () {
local newroot=${1##*.}
jq -r "{$newroot: $1}" "$2"
}
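For the example above this expands to jq -r '{y: .b.x.y}' example.json, which emits the desired object keyed by y.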
I may be overlooking some nuances of your more-complex Bash processing, but this seems to work for the example you provided.
If I understand what you're trying to do correctly, it doesn't seem possible to me to do it "pure jq" having read the docs (and being a regular jq user myself). The closest I could come to helping here was to simplify the jq part itself:
jq -r "$Prefix| { $Suffix }" "$File"
This has the same behavior as your example (on this limited set of cases):
GetSubobject '.b.x.y' example.json
{
"y": {
"j": true,
"k": [
1,
2,
3
]
}
}
This is really a case of metaprogramming, you want to programmatically operate on a jq program. Well, it makes sense (to me) that jq takes its program as input but doesn't allow you to alter the program itself. bash seems like an appropriate choice for doing the metaprogramming here: to convert a jq program into another one and then run jq using that.
If the goal is to do as little as possible in bash, then maybe the following bash function will fill the bill:
function GetSubobject {
local Filter="$1" # Filter is of the form .<key>.<key> ... .<key>
local File="$2" # File is the JSON to get the subobject
jq '(null|path('"$Filter"')) as $path
| {($path[-1]): '"$Filter"'}' "$File"
}
An alternative would be to pass $Filter in as a string (e.g. --arg filter "$Filter"), have jq do the parsing, and then use getpath.
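A minimal sketch of that alternative, assuming the keys themselves contain no dots:
function GetSubobject {
local Filter="$1" # Filter is of the form .<key>.<key> ... .<key>
local File="$2"
jq --arg filter "$Filter" '
($filter | ltrimstr(".") | split(".")) as $path
| {($path[-1]): getpath($path)}' "$File"
}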
It would of course be simplest if GetSubobject could be called with the path separated from the field of interest, like this:
GetSubobject .b.x y filename
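In that case the jq part collapses to a single object construction; a sketch:
GetSubobject()
{
local Prefix="$1" # e.g. .b.x
local Key="$2"    # e.g. y
local File="$3"
jq "$Prefix | {$Key}" "$File"
}
GetSubobject .b.x y example.json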