How can I pretty-print JSON on the command line, but allow invalid JSON objects to pass through? - json

I'm currently tailing some logs in bash that are half JSON, half text like below:
{"response":{"message":"asdfasdf"}}
{"log":{"example":"asdfasdf"}}
here is some text
{"another":{"example":"asdfasdf"}}
more text
Each line is either a full valid JSON object or some text that would fail a JSON parser.
I've looked at jq and underscore-cli to see if they have options to return the invalid object in the case of failure, but I'm not seeing any.
I've also tried using the || operator to fall back to cat on the piped input, but the value gets lost somehow. Maybe I should read up on pipes more? Example: getLogs -t | (underscore print || cat)
I think I could write a script that stores each input line, formats it, and returns the formatted output on success or the stored value on failure. I feel like there should be a simpler way, though. Any thoughts?

You can use this node library
install with
$ npm install -g js-beautify
Here is what I did:
$ js-beautify -r test.js
beautified test.js
I tested it with an incomplete json file and it worked

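jq can check for invalid JSON:
#!/bin/bash
# Read the file line by line; IFS= and -r keep whitespace and backslashes intact
while IFS= read -r p; do
    # jq -e exits non-zero when the line fails to parse (or evaluates to null or false)
    if jq -e . >/dev/null 2>&1 <<<"$p"; then
        jq . <<<"$p"
    else
        echo 'Skipping invalid json'
    fi
done < /tmp/tst.txt
Output: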
{
"response": {
"message": "asdfasdf"
}
}
{
"log": {
"example": "asdfasdf"
}
}
Skipping invalid json
{
"another": {
"example": "asdfasdf"
}
}
Skipping invalid json
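If jq and underscore-cli are not hard requirements, the pass-through behaviour asked for in the question can be sketched in Python with only the standard library. The helper name format_line and the sample list are illustrative, and leaving bare scalars such as 42 untouched is an assumption about what counts as ordinary log text.

```python
import json


def format_line(line: str) -> str:
    """Pretty-print the line if it is a JSON object or array, else return it unchanged."""
    text = line.rstrip("\n")
    try:
        parsed = json.loads(text)
    except ValueError:
        # Not JSON at all: pass the original text through.
        return text
    if not isinstance(parsed, (dict, list)):
        # Assumption: a bare scalar such as 42 or true is ordinary log text.
        return text
    return json.dumps(parsed, indent=2)


# Sample input taken from the question.
sample = [
    '{"response":{"message":"asdfasdf"}}',
    "here is some text",
    '{"another":{"example":"asdfasdf"}}',
    "more text",
]

for entry in sample:
    print(format_line(entry))
```

To use it as a filter, replace the loop over sample with a loop over sys.stdin (after import sys) and pipe the logs into the script, for example getLogs -t | python3 prettylog.py, where prettylog.py is a hypothetical file name.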

Related

How to add commas in between JSON objects using Linux Shell and SnowSQL?

While there are several posts about this topic on Stack Overflow, none match my exact use case. I am using a Linux shell script to run SnowSQL to generate a json file.
My json file needs to have a comma between json objects.
This:
{
"CAMPAIGN": "Welcome_New",
"UUID": "fe881781-bdc2-41b2-95f2-e0e8c19dc597"
}
{
"CAMPAIGN": "Welcome_Existing",
"UUID": "77a41c02-beb9-48bf-ada4-b2074c1a78cb"
}
...needs to look like this:
{
"CAMPAIGN": "Welcome_New",
"UUID": "fe881781-bdc2-41b2-95f2-e0e8c19dc597"
},
{
"CAMPAIGN": "Welcome_Existing",
"UUID": "77a41c02-beb9-48bf-ada4-b2074c1a78cb"
}
Here is my complete ksh script:
#!/usr/bin/ksh
. /appl/.snf_logon
export SNOW_PKEY_FILE=$(mktemp ./pkey-XXXXXX)
trap "rm -f ${SNOW_PKEY_FILE}" EXIT
LibGetSnowCred
{
outFile=JSON_FILE_TYPE_TEST.json
inDir=/testing
outFileNm=@my_db.my_schema.my_file_stage/${outFile}
snowsql \
--private-key-path $SNOW_PKEY_FILE \
-o exit_on_error=true \
-o friendly=false \
-o timing=false \
-o log_level=ERROR \
-o echo=true <<!
COPY INTO ${outFileNm}
FROM (SELECT object_construct(
'UUID',UUID
,'CAMPAIGN',CAMPAIGN)
FROM my_db.my_schema.JSON_Test_Table
LIMIT 2)
FILE_FORMAT=(
TYPE=JSON
COMPRESSION=NONE
)
OVERWRITE=True
HEADER=False
SINGLE=True
MAX_FILE_SIZE=4900000000
;
get ${outFileNm} file://${inDir}/;
rm ${outFileNm};
!
if [ $? -eq 0 ]; then
echo "Export successful"
else
echo "ERROR in export"
fi
}
Is it best practice to add the comma during the SELECT or after the file is generated, and how?
With or without that comma, the text is still not valid JSON, just text that looks like JSON. You export several rows, each as an independent object. To produce valid JSON, you need to gather all these objects into an array.
A JSON that encodes an array of rows looks like this:
[
{
"CAMPAIGN": "Welcome_New",
"UUID": "fe881781-bdc2-41b2-95f2-e0e8c19dc597"
},
{
"CAMPAIGN": "Welcome_Existing",
"UUID": "77a41c02-beb9-48bf-ada4-b2074c1a78cb"
}
]
The easiest way to produce this output is to ask the database, if it supports such an option, to wrap all the records in a list before generating the JSON instead of exporting each record as a separate JSON document.
If that is not possible, you have a file that contains multiple JSON documents. You can use jq to convert them into a single JSON document like the one above (an array of objects).
It is as simple as this:
jq --slurp '.' input_file > output_file
The option --slurp tells jq to read all the JSONs from the file input_file in memory, to parse them and to put them into an array. That is the program input.
'.' is the jq program. It says "dump the current object". It does not do any processing to the input data. The current object is the array.
After it executes the program (which, in this case doesn't do anything), jq dumps the modified value (as JSON, of course) to the standard output (by default, on screen).
The > output_file part redirects this output to a file (named output_file) instead of showing it on screen.
You can see how it works on the jq playground.
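Where jq is not available, the same slurp behaviour can be approximated in Python. This is a sketch rather than jq's implementation: the function name slurp is made up for illustration, and it relies on json.JSONDecoder.raw_decode to read one value at a time from a string of concatenated JSON documents.

```python
import json


def slurp(text: str) -> list:
    """Parse a string of concatenated JSON values into a Python list."""
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    end = len(text)
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values


# Two independent objects, as in the exported file from the question.
sample = """
{
  "CAMPAIGN": "Welcome_New",
  "UUID": "fe881781-bdc2-41b2-95f2-e0e8c19dc597"
}
{
  "CAMPAIGN": "Welcome_Existing",
  "UUID": "77a41c02-beb9-48bf-ada4-b2074c1a78cb"
}
"""

print(json.dumps(slurp(sample), indent=2))
```

Like --slurp, this holds the whole input in memory, so it suits files of modest size.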

How to use regex to parse this json: { "success" : true }?

I am using a shell script to make an API call and I need to verify that the json response is this:
{ "success" : true }
I am able to echo the call response and see that it has that value, but I need to validate the response in an if statement so that the script can continue. I have tried to do this in a number of ways, with no success:
Regex - I have used regex to extract values from other JSON responses, but I have not found a pattern that can extract the value of "success" from this JSON.
String Comparison - I thought of simply using this condition to attempt to match the strings:
if [ "$callResponse" = '{ "success" : true }' ]
However, I quickly ran into issues with the script reading the JSON due to its special characters. I tried using sed to add a backslash before each special character, but sed could not read the JSON either.
Lastly, I tried to pipe the response to python, but got the error "ValueError: No JSON object could be decoded" when using this command:
status=${$callResponse | python -c "import sys, json; print(json.load(sys.stdin)['success'])"}
Does anyone know a regex pattern that could find that specific json string? Is there another simple solution to this issue?
(Note that it is not possible to download jq or any other utilities for this PC)
Since the caller knows that the response is { "success" : true }, I can't think of any reason to not use jq in this case. For instance, you can try something like this:
if echo '{ "success" : true }' | jq --exit-status '.success == true' >/dev/null; then
    echo "success"
    # Do something: success is true in the response.
else
    echo "not success"
    # Do something else: success is not true or is absent in the response.
fi
If you want to make an API call and get the response, you can easily pass the JSON response directly from wget to jq instead of going the roundabout way of storing it in an intermediate variable by tweaking it like this:
if wget --timeout 10 -O - -q -t 1 https://your.api.com/endpoint | jq --exit-status '.success == true' >/dev/null; then
echo "success"
else
echo "not success"
fi
To match when the value of success is true in a flexible way:
"success"\s*:\s*"?true"?
This will match all of these:
{ "success" : true }
{ "success" : "true" }
{ "success":true}
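Since Python is available on the asker's machine, the flexible pattern can be tried out with the re module. This is only a quick check of the pattern's behaviour: the helper name is_success and the extra sample strings are illustrative, not part of the question.

```python
import re

# The flexible pattern from above, written as a Python raw string.
FLEXIBLE = re.compile(r'"success"\s*:\s*"?true"?')


def is_success(response: str) -> bool:
    """Return True if the flexible pattern is found anywhere in the response."""
    return FLEXIBLE.search(response) is not None


responses = [
    '{ "success" : true }',
    '{ "success" : "true" }',
    '{ "success":true}',
    '{ "success" : false }',
    '{ "success" : "true }',  # imbalanced quotes
]

for response in responses:
    print(f"{response!r}: {is_success(response)}")
```

The last sample has imbalanced quotes and still matches, because each quote in the pattern is optional on its own.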
To be strict and match the above, but not imbalanced quotes like { "success" : "true }, use this:
"success"\s*:\s*("?)true\1
I would highly recommend not doing it that way.
We used to do it this way long ago and got into trouble: a "200 OK" response code carrying a {"success": false} body contradicts itself.
A better approach is to use the response status codes instead.
Simply return 200 OK if success is true; otherwise return the appropriate error status code.
https://www.restapitutorial.com/httpstatuscodes.html
EDIT:
Bash script to help:
COOKIE_FILE="cookies.txt"
SERVER_IP="172.1.2.3"
LOGFILE="logs/api-calls.log"
WGETLOGFILE="logs/last-api-call.log"
#Helper function
on_wget_err ( )
{
EXITCODE=${1}
case ${EXITCODE} in
0) RESULT="OK";;
*) cat ${WGETLOGFILE} >> ${LOGFILE};
grep "HTTP/1.1" ${WGETLOGFILE} | gawk '{print substr($0,16)}'
exit ${EXITCODE};;
esac
}
if wget -O - -qT 4 -t 1 ${SERVER_IP} > /dev/null; then
echo "Server is up"
wget -S -O - --load-cookies ${COOKIE_FILE} "http://${SERVER_IP}${SERVER_ADDRESS}/my/api?param=$1" 2> ${WGETLOGFILE}
on_wget_err ${?}
echo "API was successful"
else
echo "Server or network down"
exit 1;
fi
Your Python attempt was close. Here's a working one:
callResponse='{ "success" : true }'
status=$(echo "$callResponse" |
python -c "import sys, json; print(json.load(sys.stdin)['success'])")
echo "$status"
Or alternatively, rewritten to go straight in an if statement:
callResponse='{ "success" : true }'
if echo "$callResponse" | python -c "import sys, json; sys.exit(0 if json.load(sys.stdin)['success'] else 1)"
then
echo "Success"
fi

How to use jq to give true or false when uri field is present in my output json

I have a JSON which goes like this:
{
"results":[
{
"uri":"www.xxx.com"
}
]
}
EDIT
When uri is not present, JSON looks like this:
{
"results":[
]
}
In some cases, uri is present and in some cases, it is not.
Now, I want to use jq to return boolean value if uri is present or not.
This is what I wrote so far but despite uri being present, it gives null.
${search_query_response} contains the JSON
file_status=$(jq -r '.uri' <<< ${search_query_response})
Can anyone guide me?
Since you use jq, it means you are working within a shell script context.
If the boolean result is to be handled by the shell script, you can make jq set its EXIT_CODE depending on the JSON request success or failure status, with jq -e
Example shell script using the EXIT_CODE from jq:
if uri=$(jq -je '.results[].uri' <<<"$search_query_response")
then
printf 'Search results contains an URI: %s.\n' "$uri"
else
echo 'No URI in search results.'
fi
See man jq:
-e / --exit-status:
Sets the exit status of jq to 0 if the last output values was neither false nor null, 1 if the last output value was either false or null, or 4 if no valid result was ever produced. Normally jq exits with 2 if there was any usage problem or system error, 3 if there was a jq program compile error, or 0 if the jq program ran.
Another way to set the exit status is with the halt_error builtin function.
The has function does the job:
jq '.results|map(has("uri"))|.[]'
This maps the has function over .results and prints one boolean per array element; for an empty results array it prints nothing.
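For comparison, the same presence check can be sketched in Python, which always yields a single boolean, including for the empty results array. The function name has_uri is hypothetical, and the sketch assumes the response is a JSON object with an optional results array.

```python
import json


def has_uri(response_text: str) -> bool:
    """Return True if any element of the results array has a uri key."""
    results = json.loads(response_text).get("results", [])
    return any(isinstance(item, dict) and "uri" in item for item in results)


# The two response shapes shown in the question.
with_uri = '{"results": [{"uri": "www.xxx.com"}]}'
without_uri = '{"results": []}'

print(has_uri(with_uri))
print(has_uri(without_uri))
```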

Loop through JSON array shell script

I am trying to write a shell script that loops through a JSON file and does some logic based on every object's properties. The script was initially written for Windows, but it does not work properly on macOS.
The initial code is as follows
documentsJson=""
jsonStrings=$(cat "$file" | jq -c '.[]')
while IFS= read -r document; do
# Get the properties from the document (json string)
currentKey=$(echo "$document" | jq -r '.Key')
encrypted=$(echo "$document" | jq -r '.IsEncrypted')
# If not encrypted then don't do anything with it
if [[ $encrypted != true ]]; then
echoComment " Skipping '$currentKey' as it's not marked for encryption"
documentsJson+="$document,"
continue
fi
# some more code
done <<< $jsonStrings
When run on macOS, the whole file is processed at once, so it does not loop through the objects.
The closest I got to making it work - after trying a lot of suggestions - is as follows:
jq -r '.[]' "$file" | while read i; do
for config in $i ; do
currentKey=$(echo "$config" | jq -r '.Key')
echo "$currentKey"
done
done
The console result is parse error: Invalid numeric literal at line 1, column 6
I just cannot find a proper way of grabbing the JSON object and reading its properties.
JSON file example
[
{
"Key": "PdfMargins",
"Value": {
"Left":0,
"Right":0,
"Top":20,
"Bottom":15
}
},
{
"Key": "configUrl",
"Value": "someUrl",
"IsEncrypted": true
}
]
Thank you in advance!
Try putting $jsonStrings in double quotes: done <<< "$jsonStrings"
Otherwise, the standard shell word splitting applies to the variable expansion, and you probably want to retain the line structure of jq's output.
You could also use this in bash:
while IFS= read -r document; do
...
done < <(jq -c '.[]' < "$file")
That would save some resources. I am not sure whether this works on macOS, though, so test it first.
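If the shell's quoting and splitting rules keep getting in the way, the loop itself can be sketched in Python, which parses the array once and needs no per-document jq call. The function name partition_documents is made up, and the print statements stand in for the encryption logic that the question elides.

```python
import json

# The JSON file example from the question, inlined for illustration.
sample = """
[
  {"Key": "PdfMargins", "Value": {"Left": 0, "Right": 0, "Top": 20, "Bottom": 15}},
  {"Key": "configUrl", "Value": "someUrl", "IsEncrypted": true}
]
"""


def partition_documents(text: str):
    """Split the documents into (skipped, to_encrypt) by the IsEncrypted flag."""
    skipped, to_encrypt = [], []
    for document in json.loads(text):
        if document.get("IsEncrypted") is True:
            to_encrypt.append(document)
        else:
            skipped.append(document)
    return skipped, to_encrypt


skipped, to_encrypt = partition_documents(sample)
for document in skipped:
    print(f"Skipping '{document['Key']}' as it's not marked for encryption")
for document in to_encrypt:
    print(f"Would encrypt '{document['Key']}'")
```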

Parsing JSON from shell script using JSON.sh

I'm working on parsing JSON data using JSON.sh. I want to read data from a JSON file (test.json) whose content looks something like this:
{
"/home/ukrishnan/projects/test.yml": {
"LOG_DRIVER": "syslog",
"IMAGE": "mysql:5.6"
},
"/home/ukrishnan/projects/mysql/app.xml": {
"ENV_ACCOUNT_BRIDGE_ENDPOINT": "/u01/src/test/sample.txt"
}
}
And I try to parse this JSON using JSON.sh by using,
test_parser=`sh ./lib/JSON.sh < test/test.json`
echo $test_parser
It prints,
["/home/ukrishnan/projects/test.yml","LOG_DRIVER"] "syslog" ["/home/ukrishnan/projects/test.yml","IMAGE"] "mysql:5.6" ["/home/ukrishnan/projects/test.yml"] {"LOG_DRIVER":"syslog","IMAGE":"mysql:5.6"} ["/home/ukrishnan/projects/mysql/app.xml","ENV_ACCOUNT_BRIDGE_ENDPOINT"] "/u01/src/test/sample.txt" ["/home/ukrishnan/projects/mysql/app.xml"] {"ENV_ACCOUNT_BRIDGE_ENDPOINT":"/u01/src/test/sample.txt"} [] {"/home/ukrishnan/projects/test.yml":{"LOG_DRIVER":"syslog","IMAGE":"mysql:5.6"},"/home/ukrishnan/projects/mysql/app.xml":{"ENV_ACCOUNT_BRIDGE_ENDPOINT":"/u01/src/test/sample.txt"}}
Whereas the same command (sh ./lib/JSON.sh < test/test.json), when run directly in the terminal, prints with line breaks:
["/home/ukrishnan/projects/test.yml","LOG_DRIVER"] "syslog"
["/home/ukrishnan/projects/test.yml","IMAGE"] "mysql:5.6"
["/home/ukrishnan/projects/test.yml"] {"LOG_DRIVER":"syslog","IMAGE":"mysql:5.6"}
["/home/ukrishnan/projects/mysql/app.xml","ENV_ACCOUNT_BRIDGE_ENDPOINT"] "/u01/src/test/sample.txt"
["/home/ukrishnan/projects/mysql/app.xml"] {"ENV_ACCOUNT_BRIDGE_ENDPOINT":"/u01/src/test/sample.txt"}
[] {"/home/ukrishnan/projects/test.yml":{"LOG_DRIVER":"syslog","IMAGE":"mysql:5.6"},"/home/ukrishnan/projects/mysql/app.xml":{"ENV_ACCOUNT_BRIDGE_ENDPOINT":"/u01/src/test/sample.txt"}}
I wanted to read this and assign to bash variables like,
file_name='/home/ukrishnan/projects/test.yml'
key='LOG_DRIVER'
value='syslog'
As I'm almost completely new to shell script and grep or awk, I don't have much idea of how to achieve this. Any help on this would be greatly appreciated.
I wrote a JSON serializer / deserializer for gawk, if you're interested. Save that script and modify it, replacing everything above # === FUNCTIONS === with the following:
#!/usr/bin/gawk -f
# capture JSON string from beginning to end into a scalar variable
{ json = json ORS $0 }
END {
# objectify JSON string to the multilevel array "obj"
deserialize(json, obj)
for (filename in obj) {
print "file_name=" quote(filename)
for (key in obj[filename]) {
# print key="value"
print key "=" quote(obj[filename][key])
}
}
}
Do chmod 755 json.awk and execute it. Output will resemble this:
$ ./json.awk test5.json
file_name="/home/ukrishnan/projects/mysql/app.xml"
ENV_ACCOUNT_BRIDGE_ENDPOINT="/u01/src/test/sample.txt"
file_name="/home/ukrishnan/projects/test.yml"
LOG_DRIVER="syslog"
IMAGE="mysql:5.6"
Hopefully the logic is reasonably easy to follow. If you prefer to output filename=, key=, and value= on every loop iteration, modify the nested for loops accordingly:
for (filename in obj) {
for (key in obj[filename]) {
print "file_name=" quote(filename)
print "key=" quote(key)
print "value=" quote(obj[filename][key])
}
}
That change will result in the following output:
$ ./json.awk test5.json
file_name="/home/ukrishnan/projects/mysql/app.xml"
key="ENV_ACCOUNT_BRIDGE_ENDPOINT"
value="/u01/src/test/sample.txt"
file_name="/home/ukrishnan/projects/test.yml"
key="LOG_DRIVER"
value="syslog"
file_name="/home/ukrishnan/projects/test.yml"
key="IMAGE"
value="mysql:5.6"
Anyway, with that output, you can do something silly in BASH like this to populate and act upon the variables:
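#!/bin/bash
./json.awk test5.json | while read -r line; do {
    # each line is a shell assignment such as key="LOG_DRIVER"
    eval "$line"
    # act once all three variables are set, i.e. after the value= line
    [ "${line/=*/}" = "value" ] && {
        echo "bash: file_name=$file_name"
        echo "bash: key=$key"
        echo "bash: value=$value"
        echo "------"
    }
}; done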
It'd probably be more graceful just to do all processing within gawk from start to finish and not mess with the polyglot handoff, though.
Getting back to json.awk, if you prefer to keep json.awk modular for easy reuse in future projects, you could remove everything above # === FUNCTIONS ===, create a separate main.awk containing the code block at the top of this answer, and @include "json.awk" as a helper library pretty much anywhere outside of END {...} (just below the shebang, for example).
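The idea of doing all the processing in one language, rather than handing text back to the shell, can also be sketched in Python. This is an alternative to the gawk script, not a port of it: the generator name iter_settings is hypothetical, and the sample is the test.json content from the question.

```python
import json

# Content of test.json from the question, inlined for illustration.
sample = """
{
  "/home/ukrishnan/projects/test.yml": {
    "LOG_DRIVER": "syslog",
    "IMAGE": "mysql:5.6"
  },
  "/home/ukrishnan/projects/mysql/app.xml": {
    "ENV_ACCOUNT_BRIDGE_ENDPOINT": "/u01/src/test/sample.txt"
  }
}
"""


def iter_settings(text: str):
    """Yield (file_name, key, value) for every setting in the nested object."""
    data = json.loads(text)
    for file_name, settings in data.items():
        for key, value in settings.items():
            yield file_name, key, value


for file_name, key, value in iter_settings(sample):
    print(f"file_name={file_name}")
    print(f"key={key}")
    print(f"value={value}")
    print("------")
```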
JSON.sh (from http://json.org) offers a nice bash-friendly means of flattening a JSON file, and your question already shows what its output looks like. The flattened form has this format:
[node] tab value
You have to think in UNIX scripting terms to extract the information you want. Note that the lines you're interested in follow this pattern:
["filename","key"] tab "value"
In regex notation, we replace:
filename with (.*)
key with (.*)
tab with \t
value with (.*)
We can retrieve the first, second and third matching groups with \1, \2, \3 respectively.
When used in sed (basic regular expressions), the literal brackets [ ] must be escaped with a backslash \, and the capture-group parentheses are written \( and \), resulting in the following script:
./lib/JSON.sh < test/test.json | sed 's/\["\(.*\)","\(.*\)\"]\t"\(.*\)"/\1,\2,\3/;t;d'
/home/ukrishnan/projects/test.yml,LOG_DRIVER,syslog
/home/ukrishnan/projects/test.yml,IMAGE,mysql:5.6
/home/ukrishnan/projects/mysql/app.xml,ENV_ACCOUNT_BRIDGE_ENDPOINT,/u01/src/test/sample.txt
Now we put the lines in a loop and for each line, we can extract out filename,key,value:
for line in $(./lib/JSON.sh < test/test.json | sed 's/\["\(.*\)","\(.*\)\"]\t"\(.*\)"/\1,\2,\3/;t;d')
do
IFS="," read -ra arr <<< "$line"
filename=${arr[0]}
key=${arr[1]}
value=${arr[2]}
cat <<EOF
filename : $filename
key : $key
value : $value
EOF
done
Which outputs:
filename : /home/ukrishnan/projects/test.yml
key : LOG_DRIVER
value : syslog
filename : /home/ukrishnan/projects/test.yml
key : IMAGE
value : mysql:5.6
filename : /home/ukrishnan/projects/mysql/app.xml
key : ENV_ACCOUNT_BRIDGE_ENDPOINT
value : /u01/src/test/sample.txt
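The flattened lines can also be parsed without a hand-written regex, because both the path and the value on each line are themselves valid JSON. The sketch below assumes, as described above, that JSON.sh separates the two with a tab; the function name parse_flattened is made up, and the sample lines are copied from the question.

```python
import json


def parse_flattened(lines):
    """Return (filename, key, value) for each line whose path has two elements."""
    rows = []
    for line in lines:
        path_text, sep, value_text = line.rstrip("\n").partition("\t")
        if not sep:
            # No tab on this line: not in the expected "path<TAB>value" form.
            continue
        path = json.loads(path_text)
        value = json.loads(value_text)
        # Keep only leaf entries of the form ["filename","key"] -> "value".
        if len(path) == 2 and isinstance(value, str):
            rows.append((path[0], path[1], value))
    return rows


# Lines as printed by JSON.sh in the question, with an explicit tab separator.
flattened = [
    '["/home/ukrishnan/projects/test.yml","LOG_DRIVER"]\t"syslog"',
    '["/home/ukrishnan/projects/test.yml","IMAGE"]\t"mysql:5.6"',
    '["/home/ukrishnan/projects/test.yml"]\t{"LOG_DRIVER":"syslog","IMAGE":"mysql:5.6"}',
]

for filename, key, value in parse_flattened(flattened):
    print(f"filename : {filename}")
    print(f"key : {key}")
    print(f"value : {value}")
```

Unlike the comma-joined sed output, this keeps values that contain commas or spaces intact.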