JSON jq execute with Process Substitution

I created a bash script to parse a JSON file and generate hosts entries. For that I use jq, but I cannot get it to update the variable domain_count.
domain_count=0
jq -r .domains[] variables.json | while read domain; do
host="0.0.0.0 ${domain}"
echo $host;
((domain_count++))
done
echo $domain_count
It is still 0.
I read that this happens because the loop runs in a subshell, and that process substitution avoids it. I tried changing it in different ways, but none of them work.
while read domain
do
host="0.0.0.0 ${domain}"
echo $host;
((domain_count++))
done < <(jq -r .domains[] variables.json)
echo $domain_count
I got the following error:
generate.sh: line 20: syntax error near unexpected token `<'
generate.sh: line 20: `done < <(jq -r .domains[] variables.json)'
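For reference, the answer below assumes a variables.json shaped roughly like this (the actual file was not shown in the question):
{
  "domains": ["example.com", "example.org"]
}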

Each stage of a pipeline runs in a subshell. Variables are inherited from the parent, but changes do not propagate back to the parent. (The syntax error you got from the process-substitution attempt typically means the script was run with sh instead of bash, e.g. sh generate.sh; process substitution is a bash feature.) One fix is to make your echo statement part of the same pipeline stage:
domain_count=0
jq -r '.domains[]' variables.json | {
while read -r domain; do
host="0.0.0.0 ${domain}"
echo "$host";
((domain_count++))
done
echo $domain_count
}
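An alternative worth knowing (not part of the original answer, and bash-specific): in a non-interactive bash 4.2+ script, shopt -s lastpipe makes the last stage of a pipeline run in the current shell, so the counter survives the loop:
shopt -s lastpipe   # bash 4.2+; takes effect when job control is off, the default in scripts
domain_count=0
jq -r '.domains[]' variables.json | while read -r domain; do
  echo "0.0.0.0 ${domain}"
  ((domain_count++))
done
echo "$domain_count"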
But if all you want to do is count the number of domains/hosts, that can be done with jq directly:
domain_count=$(jq '.domains|length' variables.json)
And to output all domains formatted as hosts:
jq -r '.domains[] | "0.0.0.0 \(.)"' variables.json
Summarized, without losing functionality, your script can be shortened to:
jq -r '.domains[] | "0.0.0.0 \(.)"' variables.json
domain_count=$(jq '.domains|length' variables.json)
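If the count only needs to be printed rather than kept in a shell variable, both outputs can even come from a single jq invocation; a sketch:
jq -r '.domains | (.[] | "0.0.0.0 \(.)"), length' variables.json
This prints one hosts line per domain, followed by the domain count.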

Related

Loop through JSON array shell script

I am trying to write a shell script that loops through a JSON file and applies some logic based on every object's properties. The script was initially written for Windows, but it does not work properly on macOS.
The initial code is as follows
documentsJson=""
jsonStrings=$(cat "$file" | jq -c '.[]')
while IFS= read -r document; do
# Get the properties from the document (json string)
currentKey=$(echo "$document" | jq -r '.Key')
encrypted=$(echo "$document" | jq -r '.IsEncrypted')
# If not encrypted then don't do anything with it
if [[ $encrypted != true ]]; then
echoComment " Skipping '$currentKey' as it's not marked for encryption"
documentsJson+="$document,"
continue
fi
# some more code
done <<< $jsonStrings
When run on macOS, the whole file is processed at once, so the loop does not iterate over the objects.
The closest I got to making it work - after trying a lot of suggestions - is as follows:
jq -r '.[]' "$file" | while read i; do
for config in $i ; do
currentKey=$(echo "$config" | jq -r '.Key')
echo "$currentKey"
done
done
The console result is parse error: Invalid numeric literal at line 1, column 6
I just cannot find a proper way of grabbing the JSON object and reading its properties.
JSON file example
[
  {
    "Key": "PdfMargins",
    "Value": {
      "Left": 0,
      "Right": 0,
      "Top": 20,
      "Bottom": 15
    }
  },
  {
    "Key": "configUrl",
    "Value": "someUrl",
    "IsEncrypted": true
  }
]
Thank you in advance!
Try putting $jsonStrings in double quotes: done <<< "$jsonStrings"
Otherwise standard shell word splitting applies to the variable expansion, and you probably want to retain the line structure of jq's output.
You could also use this in bash:
while IFS= read -r document; do
...
done < <(jq -c '.[]' < "$file")
That would save some resources. I am not sure about making this work on macOS, though, so test it first.
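Putting the pieces together for the sample file above, a sketch of the fixed loop (with echoComment replaced by a plain echo, since that helper isn't shown): jq -c keeps each object on a single line, so the inner jq calls receive complete JSON objects instead of fragments of pretty-printed output, which is what triggered the parse error.
documentsJson=""
while IFS= read -r document; do
  currentKey=$(jq -r '.Key' <<< "$document")
  encrypted=$(jq -r '.IsEncrypted' <<< "$document")
  # If not encrypted then don't do anything with it
  if [[ $encrypted != true ]]; then
    echo "Skipping '$currentKey' as it's not marked for encryption"
    documentsJson+="$document,"
    continue
  fi
  # some more code
done < <(jq -c '.[]' < "$file")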

use curl/bash command in jq

I am trying to get a list of URLs after redirection using bash scripting. Say, google.com gets redirected to http://www.google.com with a 301 status.
What I have tried is:
json='[{"url":"google.com"},{"url":"microsoft.com"}]'
echo "$json" | jq -r '.[].url' | while read line; do
curl -LSs -o /dev/null -w %{url_effective} $line 2>/dev/null
done
So, is it possible to use commands like curl inside jq for processing JSON objects?
I want to add the resulting URL to existing JSON structure like:
[
{
"url": "google.com",
"redirection": "http://www.google.com"
},
{
"url": "microsoft.com",
"redirection": "https://www.microsoft.com"
}
]
Thank you in advance!
curl is capable of making multiple transfers in a single process, and it can also read command line arguments from a file or stdin, so you don't need a loop at all; just put that JSON into a file and run this:
jq -r '"-o /dev/null\nurl = \(.[].url)"' file |
curl -sSLK- -w'%{url_effective}\n' |
jq -R 'fromjson | map(. + {redirection: input})' file -
This way only 3 processes will be spawned for the whole task, instead of n + 2 where n is the number of URLs.
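For reference (my illustration, not from the original answer), the first jq invocation turns the sample input into a curl config that -K- reads from stdin, one -o/url pair per transfer. Note this assumes file holds the JSON on a single line, since the final jq -R reads it back line by line and fromjson parses that line:
-o /dev/null
url = google.com
-o /dev/null
url = microsoft.com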
I would generate a dictionary with jq per url and slurp those dictionaries into the final list with jq -s:
json='[{"url":"google.com"},{"url":"microsoft.com"}]'
echo "$json" | jq -r '.[].url' | while read url; do
redirect=$(curl -LSs \
-o /dev/null \
-w '%{url_effective}' \
"${url}" 2>/dev/null)
jq --null-input --arg url "${url}" --arg redirect "${redirect}" \
'{url:$url, redirect: $redirect}'
done | jq -s '.'
Alternative (first) solution:
You can output the url and the effective_url as tab-separated data and create the output JSON with jq:
json='[{"url":"google.com"},{"url":"microsoft.com"}]'
echo "$json" | jq -r '.[].url' | while read line; do
prefix="${line}\t"
curl -LSs -o /dev/null -w "${prefix}"'%{url_effective}'"\n" "$line" 2>/dev/null
done | jq --raw-input --null-input '[inputs | split("\t") | {"url": .[0], "redirection": .[1]}]'
Both solutions will generate valid json, independently of whatever characters the url/effective_url might contain.
Trying to keep this in JSON all the way is pretty cumbersome. I would simply try to make Bash construct a new valid JSON fragment inside the loop.
So in other words, if $url is the URL and $redirect is where it redirects to, you can do something like
printf '{"url": "%s", "redirection": "%s"}\n' "$url" "$redirect"
to produce JSON output from these strings. So tying it all together
jq -r '.[].url' <<<"$json" |
while read -r url; do
printf '{"url:" "%s", "redirection": "%s"}\n' \
"$url" "$(curl -LSs -o /dev/null -w '%{url_effective}' "$url")"
done |
jq -s '.'
This is still pretty brittle; in particular, if either of the printf input strings could contain a literal double quote, that should properly be escaped.
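One way to close that gap (a sketch, reusing the --arg approach from the earlier answer, which lets jq do all the quoting):
jq -r '.[].url' <<<"$json" |
while read -r url; do
  jq -n --arg url "$url" \
        --arg redirection "$(curl -LSs -o /dev/null -w '%{url_effective}' "$url")" \
        '{$url, $redirection}'
done |
jq -s '.'
The {$url, $redirection} shorthand expands to {"url": $url, "redirection": $redirection}.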

How to parse JSON in shell script?

I run the curl command $(curl -i -o - --silent -X GET --cert "${CERT}" --key "${KEY}" "$some_url") and save the response in the variable response. ${response} is as shown below
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Content-Length: 34
Connection: keep-alive
Keep-Alive: timeout=5
X-XSS-Protection: 1;
{"status":"running","details":"0"}
I want to parse the JSON {"status":"running","details":"0"} and assign the status and details values to two different variables so that I can print both. Also, if the status equals error, the script should exit. I am doing the following to achieve this -
status1=$(echo "${response}" | awk '/^{.*}$/' | jq -r '.status')
details1=$(echo "${response}" | awk '/^{.*}$/' | jq -r '.details')
echo "Status: ${status1}"
echo "Details: ${details1}"
if [[ $status1 == 'error' ]]; then
exit 1
fi
Instead of parsing the JSON twice, I want to do it only once. Hence I want to combine the following lines but still assign the status and details to two separate variables -
status1=$(echo "${response}" | awk '/^{.*}$/' | jq -r '.status')
details1=$(echo "${response}" | awk '/^{.*}$/' | jq -r '.details')
First, stop using the -i argument to curl. That takes away the need for awk (or any other after-the-fact pruning of the headers).
Second:
{
IFS= read -r -d '' status1
IFS= read -r -d '' details1
} < <(jq -r '.status + "\u0000" + .details + "\u0000"' <<<"$response")
The advantage of using a NUL as a delimiter is that it's the sole character that can't be present in the value of a C-style string (which is how shell variables' values are stored).
You can use a construction like:
read status1 details1 < <(jq -r '.status + " " + .details' <<< "${response}")
You use read to assign the different inputs to two variables (or an array, if you want), and use jq to print the data you need separated by whitespace.
As Benjamin already suggested, retrieving only the JSON is a better way to go. Poshi's solution is solid.
However, if you're looking for the most compact way to do this, there's no need to save the response in a variable if the only thing you're going to do with it is extract other variables from it on a one-time basis. Just pipe curl directly into:
curl "whatever" | jq -r '[.status, .details] |#tsv'
or
curl "whatever" | jq -r '[.status, .details] |join("\t")'
and you'll get your values fielded for you.
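To land those fields in shell variables in one pass (a sketch using the question's curl options; @tsv escapes any embedded tabs, so each record stays on one line):
IFS=$'\t' read -r status1 details1 < <(
  curl --silent --cert "${CERT}" --key "${KEY}" "$some_url" |
  jq -r '[.status, .details] | @tsv'
)
echo "Status: ${status1}"
echo "Details: ${details1}"
if [[ $status1 == 'error' ]]; then
  exit 1
fi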

Using jq to combine json files, getting file list length too long error

Using jq to concat json files in a directory.
The directory contains a few hundred thousand files.
jq -s '.' *.json > output.json
returns an error that the file list is too long. Is there a way to write this so that it can handle more files?
If jq -s . *.json > output.json produces "argument list too long"; you could fix it using zargs in zsh:
$ zargs *.json -- cat | jq -s . > output.json
That you could emulate using find as shown in chepner's answer:
$ find -maxdepth 1 -name \*.json -exec cat {} + | jq -s . > output.json
"Data in jq is represented as streams of JSON values ... This is a cat-friendly format - you can just join two JSON streams together and get a valid JSON stream.":
$ echo '{"a":1}{"b":2}' | jq -s .
[
{
"a": 1
},
{
"b": 2
}
]
The problem is that the length of a command line is limited, and *.json produces too many arguments for one command line. One workaround is to expand the pattern in a for loop, which does not have the same limit, because bash can iterate over the result internally rather than having to construct an argument list for an external command:
for f in *.json; do
cat "$f"
done | jq -s '.' > output.json
This is rather inefficient, though, since it requires running cat once for each file. A more efficient solution is to use find to call cat with as many files as possible each time.
find . -name '*.json' -exec cat '{}' + | jq -s '.' > output.json
(You may be able to simply use
find . -name '*.json' -exec jq -s '.' '{}' + > output.json
as well; note that find may split the file list across several jq invocations, each of which would produce its own top-level array, so whether this works depends on what is in the files and on how multiple jq -s calls compare to a single one.)
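Another workaround in the same spirit (my sketch, not from the original answers): printf is a shell builtin, so the kernel's argument-list limit does not apply to it, and xargs batches the file names for cat:
printf '%s\0' *.json | xargs -0 cat | jq -s '.' > output.json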
[EDITED to use find]
One obvious thing to consider would be to process one file at a time, and then "slurp" them:
$ while IFS= read -r f; do cat "$f"; done < <(find . -maxdepth 1 -name "*.json") | jq -s .
This however would presumably require a lot of memory. Thus the following may be closer to what you need:
#!/bin/bash
# "slurp" a bunch of files
# Requires a version of jq with 'inputs'.
echo "["
while IFS= read -r f
do
jq -nr 'inputs | (., ",")' "$f"
done < <(find . -maxdepth 1 -name "*.json") | sed '$d'
echo "]"

verify that a json field exists with jq and bash?

I have a script that uses jq to parse a JSON string MESSAGE (read from another application). Meanwhile the JSON has changed and one field has been split in two: file_path is now split into folder and file. The script used to read file_path; now the folder may not be present, so to build the file's path I have to verify whether the field is there. I have searched the internet for a while and managed to come up with:
echo $(echo $MESSAGE | jq .folder -r)$'/'$(echo $MESSAGE | jq .file -r)
if [ $MESSAGE | jq 'has(".folder")' -r ]
then
echo $(echo $MESSAGE | jq .folder -r)$'/'$(echo $MESSAGE | jq .file -r)
else
echo $(echo $MESSAGE | jq .file -r)
fi
where MESSAGE='{"folder":"FLDR","file":"fl"}' or MESSAGE='{"file":"fl"}'
The first line prints FLDR/fl, or null/fl if the folder field is not present. So I thought I would add an if that verifies whether the folder field is present, but it seems I am doing it wrong and I cannot figure out what. The output is
bash: [: missing `]'
jq: ]: No such file or directory
null/fl
I'd do the whole thing in a jq filter (the select drops folder when it is absent, so you don't get a leading null or slash):
echo "$MESSAGE" | jq -r '[ .folder, .file ] | map(select(. != null)) | join("/")'
In the event that you want to do it with bash (or to learn how to do this sort of thing in bash), two points:
Shell variables should almost always be quoted when they are used (i.e., "$MESSAGE" instead of $MESSAGE). You will run into funny problems if you forget to do that and one of the strings in your JSON ever contains a shell metacharacter (such as *): the string will be subject to shell expansion (and that * will be expanded into a list of files in the current working directory).
A shell if accepts as condition a command, and the decision where to branch is made depending on the exit status of that command (true if the exit status is 0, false otherwise). The [ you attempted to use is just a command (an alias for test, see man test) and not special in any way.
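To see that for yourself (a quick illustration, not from the original answer):
type [                       # prints "[ is a shell builtin" in bash
[ hello = hello ]; echo $?   # prints 0: the test command succeeded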
So, the goal is to construct a command that exits with 0 if the JSON object has a folder property, non-zero otherwise. jq has a -e option that makes it return 0 if the last output value was not false or null and non-zero otherwise, so we can write
if echo "$MESSAGE" | jq -e 'has("folder")' > /dev/null; then
echo "$MESSAGE" | jq -r '.folder + "/" + .file'
else
echo "$MESSAGE" | jq -r .file
fi
The > /dev/null bit redirects the output from jq to /dev/null (where it is ignored) so that we don't see it on the console.
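A quick check with both sample messages from the question (my test harness, not part of the original answer):
for MESSAGE in '{"folder":"FLDR","file":"fl"}' '{"file":"fl"}'; do
  if echo "$MESSAGE" | jq -e 'has("folder")' > /dev/null; then
    echo "$MESSAGE" | jq -r '.folder + "/" + .file'
  else
    echo "$MESSAGE" | jq -r .file
  fi
done
# prints FLDR/fl, then fl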