Selection of multiple json keys using jq

As a newbie to bash and jq, I'm trying to download several URLs listed in a JSON file using jq in a bash script.
My items.json file looks like this:
[
{"title" : [bob], "link" :[a.b.c]},
{"title" : [alice], "link" :[d.e.f]},
{"title" : [carol], "link" :[]}
]
What I was doing initially was just filtering out the non-empty links, putting them in an array, and then downloading each element of the array:
#!/bin/bash
lnk=( $(jq -r '.[].link[0] | select (.!=null)' items.json) )
for element in "${lnk[@]}"
do
    wget "$element"
done
The problem with this approach is that all the downloaded files use the link as the file name.
I want to filter the JSON but still keep the title together with the link, so that I can rename the file in the wget command. I don't have any idea what structure to use here. How can I keep the title in the filter and use it afterwards?

You can use this:
IFS=$'\n' read -d '' -a titles < <(jq -r '.[] | select (.link[0]!=null) | .title[0]' items.json);
IFS=$'\n' read -d '' -a links < <(jq -r '.[] | select (.link[0]!=null) | .link[0]' items.json);
Then you can iterate over the arrays "${titles[@]}" and "${links[@]}":
for i in "${!titles[@]}"; do
    wget -O "${titles[i]}" "${links[i]}"
done
EDIT: An easier and safer approach:
jq -r '.[] | select(.link[0] != null) | @sh "wget -O \(.title[0]) \(.link[0])"' items.json | bash
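Before piping to bash, it can help to inspect what the filter actually generates; with the corrected items.json above, the output would look roughly like this (shown for illustration):
$ jq -r '.[] | select(.link[0] != null) | @sh "wget -O \(.title[0]) \(.link[0])"' items.json
wget -O 'bob' 'a.b.c'
wget -O 'alice' 'd.e.f'
Because @sh shell-quotes the interpolated values, titles or links containing spaces or quotes can't break the generated commands.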

Here is a bash script demonstrating reading the result of a jq filter into bash variables.
#!/bin/bash
jq -M -r '
.[]
| select(.link[0]!=null)
| .title[0], .link[0]
' items.json | \
while read -r title; read -r url; do
echo "$title: $url" # replace with wget command
done
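A variant of the same idea keeps each title/link pair on one tab-separated line via @tsv, so the two reads cannot get out of sync (a sketch; it assumes titles contain no tabs or newlines):
#!/bin/bash
# Emit "title<TAB>link" for every entry with a non-null link,
# then read both fields back in one go.
jq -r '.[] | select(.link[0] != null) | [.title[0], .link[0]] | @tsv' items.json |
while IFS=$'\t' read -r title url; do
    echo "$title: $url"   # replace with: wget -O "$title" "$url"
done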

Related

Can you separate distinct JSON attributes into two files using jq?

I am following this tutorial from Vault about creating your own certificate authority. I'd like to separate the response (switch the command to the equivalent cURL API call if you want to see the raw response) into two distinct files: one containing the certificate and issuing_ca attributes, the other containing the private_key. The tutorial uses jq to parse JSON objects, but my unfamiliarity with jq isn't helping here, and most searches return info on how to merge JSON using jq.
I've tried running something like
vault write -format=json pki_int/issue/example-dot-com \
common_name="test.example.com" \
ttl="24h" \
format=pem \
jq -r '.data.certificate, .data.issuing_ca' > test.cert.pem \
jq -r '.data.private_key' > test.key.pem
or
vault write -format=json pki_int/issue/example-dot-com \
common_name="test.example.com" \
ttl="24h" \
format=pem \
| jq -r '.data.certificate, .data.issuing_ca' > test.cert.pem \
| jq -r '.data.private_key' > test.key.pem
but no dice.
This is not an issue with the jq invocation but with the way the output files get written. With the usage you show, once the first jq has consumed the pipe and written test.cert.pem, the JSON on the read end of the pipe is no longer available for extracting the private_key.
To duplicate the contents at the write end of the pipe, use tee along with process substitution. The following works in bash, zsh, or ksh93, but not in the POSIX Bourne shell sh:
vault write -format=json pki_int/issue/example-dot-com \
common_name="test.example.com" \
ttl="24h" \
format=pem \
| tee >( jq -r '.data.certificate, .data.issuing_ca' > test.cert.pem) \
>(jq -r '.data.private_key' > test.key.pem) \
>/dev/null
See this in action
jq -n '{data:{certificate: "foo", issuing_ca: "bar", private_key: "zoo"}}' \
| tee >( jq -r '.data.certificate, .data.issuing_ca' > test.cert.pem) \
>(jq -r '.data.private_key' > test.key.pem) \
>/dev/null
and then observe the contents of both files.
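For illustration, the expected contents would be:
$ cat test.cert.pem
foo
bar
$ cat test.key.pem
zoo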
You could abuse jq's ability to write to standard error (version 1.6 or later) separately from standard output.
vault write -format=json pki_int/issue/example-dot-com \
common_name="test.example.com" \
ttl="24h" \
format=pem \
| jq -r '.data as $f | ($f.private_key | stderr) | ($f.certificate, $f.issuing_ca)' > test.cert.pem 2> test.key.pem
There's a general technique for this type of problem that is worth mentioning because it has minimal prerequisites (just jq and awk) and because it scales well with the number of output files. It is also quite efficient, in that only one invocation each of jq and awk is needed. The idea is to set up a pipeline of the form: jq ... | awk ...
There are many variants of the technique, but in the present case the following would suffice:
jq -rc '
.data
| "test.cert.pem",
"\t\(.certificate)",
"\t\(.issuing_ca)",
"test.key.pem",
"\t\(.private_key)"
' | awk -F\\t 'NF == 1 {fn = $1; next} {sub(/^\t/, ""); print > fn}'
Notice that this works even if the items of interest are strings with embedded tabs.
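To see it in action with the same toy document used in the tee example (an illustrative run, not part of the original answer):
jq -n '{data:{certificate: "foo", issuing_ca: "bar", private_key: "zoo"}}' |
jq -rc '
  .data
  | "test.cert.pem",
    "\t\(.certificate)",
    "\t\(.issuing_ca)",
    "test.key.pem",
    "\t\(.private_key)"
' | awk -F\\t 'NF == 1 {fn = $1; next} {sub(/^\t/, ""); print > fn}'
After this, test.cert.pem contains the lines foo and bar, and test.key.pem contains zoo.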

Create a json from given list of filenames in unix script

Hello, I am trying to write a unix script/command that lists all filenames in a given directory matching the format string-{number}.txt (e.g. filename-1.txt, filename-2.txt) and builds a JSON object from them. Any pointers would be helpful.
[
  {
    "filenumber": "1",
    "name": "filename-1.txt"
  },
  {
    "filenumber": "2",
    "name": "filename-2.txt"
  }
]
In the JSON above, filenumber should be taken from the {number} part of each filename.
A single call to jq should suffice:
shopt -s extglob
printf "%s\0" *-+([0-9]).txt | \
jq -sR 'split("\u0000") |
map({filenumber:capture(".*-(?<n>.*)\\.txt").n,
name:.})'
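With the two example files from the question in the current directory, this prints (illustrative output):
[
  {
    "filenumber": "1",
    "name": "filename-1.txt"
  },
  {
    "filenumber": "2",
    "name": "filename-2.txt"
  }
]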
Very easy for the command-line tool xidel and its integrated EXPath File Module:
$ xidel -se '
array{
for $x in file:list(.,false(),"*.txt")
return {
"filenumber":extract($x,"(\d+)\.txt",1),
"name":$x
}
}
'
Intuitively, I'd say you can do this with jq. However, in practice I've rarely been able to achieve what I wanted with jq :-)
With some lunch break puzzling, I've come up with this beauty:
ls | jq -R '{filenumber:input_line_number, name:.}' | jq -s .
Instead of ls you could use any other command that produces a newline separated list of strings.
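If the number must come from the filename itself rather than from its position in the listing, a capture-based variant along the same lines (a sketch, not from the original answer) would be:
ls *.txt | jq -R 'capture("-(?<n>[0-9]+)\\.txt") as $m | {filenumber: $m.n, name: .}' | jq -s .
Filenames that don't match the pattern simply produce no output.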
I tried multiple examples to achieve my exact use case and finally found the following, which works exactly the way I wanted. Thanks!
for file in $(ls *.txt); do
    file_version=$(echo $file | sed 's/\(^.*-\)\(.*\)\(.txt.*$\)/\2/')
    jq -n --arg name "$file_version" --arg path "$file" '{name: $name, path: $path}'
done | jq -n '.urls |= [inputs]'

How to ignore particular keys inside .properties files while converting to json

I have a .properties file which I'm trying to convert to a JSON file using bash command(s), and I want to exclude particular keys from the JSON output. Below are the properties inside the file; I want to exclude properties 4 and 5 (user and pass) from being converted to JSON:
app.database.address=127.0.0.70
app.database.host=database.myapp.com
app.database.port=5432
app.database.user=dev-user-name
app.database.pass=dev-password
app.database.main=dev-database
Here's the bash command I use for converting to JSON, but it converts all the properties:
cat fle.properties | jq -R -s 'split("\n") | map(split("=")) | map({(.[0]): .[1]}) | add' > zppprop.json
Is there any way to exclude these properties when converting to JSON?
With xidel:
XPath + JSONiq solution
$ xidel -s fle.properties -e '
{|
x:lines($raw)[not(position() = (4,5))] ! {
substring-before(.,"="):substring-after(.,"=")
}
|}
'
{
"app.database.address": "127.0.0.70",
"app.database.host": "database.myapp.com",
"app.database.port": "5432",
"app.database.main": "dev-database"
}
x:lines($raw) is a shorthand for tokenize($raw,'\r\n?|\n') and turns $raw, the raw input, into a sequence where every new line is another item.
[not(position() = (4,5))] if it's always the 4th and 5th line you want to exclude. Otherwise, use [not(contains(.,"user") or contains(.,"pass"))] as seen below.
XQuery solution
$ xidel -s --xquery '
map:merge(
for $x in file:read-text-lines("fle.properties")[not(contains(.,"user") or contains(.,"pass"))]
let $kv:=tokenize($x,"=")
return
{$kv[1]:$kv[2]}
)
'
{
"app.database.address": "127.0.0.70",
"app.database.host": "database.myapp.com",
"app.database.port": "5432",
"app.database.main": "dev-database"
}
You can use file:read-text-lines() to do everything "in-query".
You may filter out unneeded lines with grep:
cat fle.properties | grep -v -E "user|pass" | jq -R -s 'split("\n") | map(select(length > 0)) | map(split("=")) | map({(.[0]): .[1]}) | add'
It is also necessary to remove the empty string at the end of the array returned by split; that is what map(select(length > 0)) does.
You can do the exclusion within the jq script:
properties2json
#!/usr/bin/env -S jq -sRf
split("\n") |
map(split("=")) |
map(
if .[0] | test(".*\\.(user|pass)";"i")
then
{}
else
{(.[0]): .[1]}
end
) |
add
# Make it executable
chmod +x properties2json
# Run it
./properties2json file.properties >file.json

use curl/bash command in jq

I am trying to get a list of URLs after redirection using bash scripting. Say google.com gets redirected to http://www.google.com with a 301 status.
What I have tried is:
json='[{"url":"google.com"},{"url":"microsoft.com"}]'
echo "$json" | jq -r '.[].url' | while read line; do
curl -LSs -o /dev/null -w %{url_effective} $line 2>/dev/null
done
So, is it possible to use commands like curl inside jq for processing JSON objects?
I want to add the resulting URL to the existing JSON structure, like:
[
  {
    "url": "google.com",
    "redirection": "http://www.google.com"
  },
  {
    "url": "microsoft.com",
    "redirection": "https://www.microsoft.com"
  }
]
Thank you in advance..!
curl is capable of making multiple transfers in a single process, and it can also read command line arguments from a file or stdin. So you don't need a loop at all; just put that JSON into a file and run this:
jq -r '"-o /dev/null\nurl = \(.[].url)"' file |
curl -sSLK- -w'%{url_effective}\n' |
jq -R 'fromjson | map(. + {redirection: input})' file -
This way only 3 processes will be spawned for the whole task, instead of n + 2 where n is the number of URLs.
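For reference, the first jq invocation turns the sample JSON (saved as file) into a curl config of this shape (shown for illustration):
-o /dev/null
url = google.com
-o /dev/null
url = microsoft.com
curl -K- reads that config from stdin and performs one transfer per url entry, -w prints each effective URL on its own line, and the final jq pairs those lines back up with the original objects via input.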
I would generate a dictionary with jq for each URL and slurp those dictionaries into the final list with jq -s:
json='[{"url":"google.com"},{"url":"microsoft.com"}]'
echo "$json" | jq -r '.[].url' | while read url; do
redirect=$(curl -LSs \
-o /dev/null \
-w '%{url_effective}' \
"${url}" 2>/dev/null)
jq --null-input --arg url "${url}" --arg redirect "${redirect}" \
'{url:$url, redirect: $redirect}'
done | jq -s .
Alternative (first) solution:
You can output the url and the effective_url as tab separated data and create the output json with jq:
json='[{"url":"google.com"},{"url":"microsoft.com"}]'
echo "$json" | jq -r '.[].url' | while read line; do
prefix="${line}\t"
curl -LSs -o /dev/null -w "${prefix}"'%{url_effective}'"\n" "$line" 2>/dev/null
done | jq -r --raw-input 'split("\t")|{"url":.[0],"redirection":.[1]}'
Both solutions will generate valid json, independently of whatever characters the url/effective_url might contain.
Trying to keep this in JSON all the way is pretty cumbersome. I would simply try to make Bash construct a new valid JSON fragment inside the loop.
So in other words, if $url is the URL and $redirect is where it redirects to, you can do something like
printf '{"url": "%s", "redirection": "%s"}\n' "$url" "$redirect"
to produce JSON output from these strings. So tying it all together
jq -r '.[].url' <<<"$json" |
while read -r url; do
printf '{"url:" "%s", "redirection": "%s"}\n' \
"$url" "$(curl -LSs -o /dev/null -w '%{url_effective}' "$url")"
done |
jq -s .
This is still pretty brittle; in particular, if either of the printf input strings could contain a literal double quote, that should properly be escaped.
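One way to sidestep the escaping problem (a sketch, in the spirit of the --arg answer above rather than part of this one) is to let jq quote the strings instead of printf:
jq -r '.[].url' <<<"$json" |
while read -r url; do
    jq -n --arg url "$url" \
          --arg redirection "$(curl -LSs -o /dev/null -w '%{url_effective}' "$url")" \
          '{url: $url, redirection: $redirection}'
done |
jq -s .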

Using jq to combine json files, getting file list length too long error

Using jq to concat json files in a directory.
The directory contains a few hundred thousand files.
jq -s '.' *.json > output.json
returns an error that the file list is too long. Is there a way to write this so that it can take in more files?
If jq -s . *.json > output.json produces "argument list too long", you could fix it using zargs in zsh:
$ zargs *.json -- cat | jq -s . > output.json
You can emulate that using find, as shown in @chepner's answer:
$ find -maxdepth 1 -name \*.json -exec cat {} + | jq -s . > output.json
"Data in jq is represented as streams of JSON values ... This is a cat-friendly format - you can just join two JSON streams together and get a valid JSON stream.":
$ echo '{"a":1}{"b":2}' | jq -s .
[
{
"a": 1
},
{
"b": 2
}
]
The problem is that the length of a command line is limited, and *.json produces too many arguments for one command line. One workaround is to expand the pattern in a for loop, which does not have the same limits as a command line, because bash can iterate over the result internally rather than having to construct an argument list for an external command:
for f in *.json; do
cat "$f"
done | jq -s '.' > output.json
This is rather inefficient, though, since it requires running cat once for each file. A more efficient solution is to use find to call cat with as many files as possible each time.
find . -name '*.json' -exec cat '{}' + | jq -s '.' > output.json
(You may be able to simply use
find . -name '*.json' -exec jq -s '.' '{}' + > output.json
as well; it may depend on what is in the files and how multiple calls to jq using the -s option compare to a single call.)
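If find does end up splitting the file list across several jq invocations, each invocation emits its own array; those arrays could then be merged with one more pass (a sketch, not part of the original answer):
find . -name '*.json' -exec jq -s '.' '{}' + | jq -s 'add' > output.json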
[EDITED to use find]
One obvious thing to consider would be to process one file at a time, and then "slurp" them:
$ while IFS= read -r f; do cat "$f"; done < <(find . -maxdepth 1 -name "*.json") | jq -s .
This however would presumably require a lot of memory. Thus the following may be closer to what you need:
#!/bin/bash
# "slurp" a bunch of files
# Requires a version of jq with 'inputs'.
echo "["
while read -r f
do
    jq -nr 'inputs | (., ",")' "$f"
done < <(find . -maxdepth 1 -name "*.json") | sed '$d'
echo "]"