jq --arg variable used in quoted string within select() - json

I want to select() an object based on a string containing a jq variable ($ARCH), passed in with jq's --arg option. Here's the use case, looking for "/bin/linux/$ARCH/kubeadm" in Google's Kubernetes release bucket...
# You may need to install `xml2json` first, i.e.:
# sudo gem install --no-rdoc --no-ri xml2json
# and then run the script I wrote to do the XML-to-JSON conversion:
#!/usr/bin/ruby
# Written by Jim Conner
require 'xml2json'
xml = ARGV[0]
begin
  if xml == '-'
    # Read the XML from stdin
    xdata = ARGF.read.chomp
    puts XML2JSON.parse(xdata)
  else
    # Read the XML from the file named on the command line
    puts XML2JSON.parse(File.read(xml).chomp)
  end
rescue => e
  $stderr.puts 'Unable to comply: %s' % [e.message]
end
Then run the following:
curl -sSL https://storage.googleapis.com/kubernetes-release/ | tee /var/tmp/k8s.xml | \
xml2json - | \
jq --arg ARCH amd64 '[.ListBucketResult.Contents[] | select(.Key | contains("/bin/linux/$ARCH/kubeadm"))]'
...which returns an empty set because jq doesn't interpolate variables inside quoted strings. I know I can get around this by using multiple select/contains() calls, but I'd prefer not to if possible.
jq simply may not do it, but if someone knows a way to do it, I'd much appreciate it.

jq does support string interpolation, and in your case the string would be:
"/bin/linux/\($ARCH)/kubeadm"
Notice that this is not a JSON string: the occurrence of "\(" signals that the string is subject to interpolation. Very nifty.
(Alternatively, you could of course use string concatenation:
"/bin/linux/" + $ARCH + "/kubeadm")
Btw, you might wish to avoid contains here. Its semantics are quite complex and perhaps counter-intuitive. Consider using startswith, index, or (for regex matches) test.
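For instance, a test-based variant of the same filter (test treats its argument as a regex, and string interpolation works there as well):
jq --arg ARCH amd64 '[.ListBucketResult.Contents[] | select(.Key | test("/bin/linux/\($ARCH)/kubeadm"))]'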

Related

How to insert JSON as string into another JSON

I am writing a script (a bash script for an Azure pipeline) and I need to combine JSON from different variables. For example, I have:
TYPE='car'
COLOR='blue'
ADDITIONAL_PARAMS='{"something": "big", "etc":"small"}'
So, as you can see, I have several string variables and one that contains JSON.
I need to combine these variables in this format (and I can't get it to work):
some_script --extr-vars --extra_vars '{"var_type": "'$TYPE'", "var_color": "'$COLOR'", "var_additional_data": "'$ADDITIONAL_PARAMS'"}'
But this combination does not work; I end up with a string like:
some_script --extr-vars --extra_vars '{"var_type": "car", "var_color": "blue", "var_additional_data": " {"something": "big", "etc":"small"} "}'
which is not valid JSON.
How can I combine existing JSON (already formatted with double quotes ") with other variables? I am using bash / the console / the yq utility (to convert YAML to JSON).
Use jq to generate the JSON. (You can probably do this in one step with yq, but I'm not as familiar with that tool.)
ev=$(jq --arg t "$TYPE" \
--arg c "$COLOR" \
--argjson ap "$ADDITIONAL_PARAMS" \
-n '{var_type: $t, var_color: $c, var_additional_data: $ap}')
some_script --extr-vars --extra_vars "$ev"
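With the example values above, $ev should end up containing the following (jq pretty-prints by default):
{
  "var_type": "car",
  "var_color": "blue",
  "var_additional_data": {
    "something": "big",
    "etc": "small"
  }
}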

Get JSON files from particular interval based on date field

I have a lot of JSON files whose structure looks like this:
{
key1: 'val1'
key2: {
'key21': 'someval1',
'key22': 'someval2',
'key23': 'someval3',
'date': '2018-07-31T01:30:30Z',
'key25': 'someval4'
}
key3: []
... some other objects
}
My goal is to get only those files where the date field falls within some period,
for example from 2018-05-20 to 2018-07-20.
I can't rely on the creation dates of these files, because they were all generated on the same day.
Maybe it is possible using sed or a similar program?
Fortunately, the date in this format can be compared as a string. You only need something to parse the JSONs, e.g. Perl:
perl -l -0777 -MJSON::PP -ne '
$date = decode_json($_)->{key2}{date};
print $ARGV if $date gt "2018-07-01T00:00:00Z";
' *.json
-0777 makes perl slurp the whole files instead of reading them line by line
-l adds a newline to print
$ARGV contains the name of the currently processed file
See JSON::PP for details. If you have JSON::XS or Cpanel::JSON::XS, you can switch to them for faster processing.
I had to fix the input (replace ' by ", add commas, etc.) in order to make the parser happy.
If your files actually contain valid JSON, the task can be accomplished in a one-liner with jq, e.g.:
jq 'if .key2.date[0:10] | (. >= "2018-05-20" and . <= "2018-07-31") then input_filename else empty end' *.json
This is just an illustration. jq has date-handling functions for dealing with more complex requirements.
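For instance, a sketch using the builtin fromdateiso8601 to compare actual timestamps rather than string prefixes (the bounds are the ones from the question):
jq '(.key2.date | fromdateiso8601) as $t
    | if $t >= ("2018-05-20T00:00:00Z" | fromdateiso8601)
         and $t <= ("2018-07-20T23:59:59Z" | fromdateiso8601)
      then input_filename else empty end' *.json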
Handling quasi-JSON
If your files contain quasi-JSON, then you could use jq in conjunction with a JSON rectifier. If your sample is representative, then hjson
could be used, e.g.
for f in *.qjson
do
hjson -j "$f" | jq --arg f "$f" '
if .key2.date[0:7] == "2018-07" then $f else empty end'
done
Try it like this:
Find an online converter (for example: https://codebeautify.org/json-to-excel-converter#) and convert the JSON to CSV.
Open the CSV file with Excel.
Filter your data.

Modifying JSON by using jq

I want to modify a JSON file by using the Linux command line.
I tried these steps:
[root@localhost]# INPUT="dsa"
[root@localhost]# echo $INPUT
dsa
[root@localhost]# CONF_FILE=test.json
[root@localhost]# echo $CONF_FILE
test.json
[root@localhost]# cat $CONF_FILE
{
"global" : {
"name" : "asd",
"id" : 1
}
}
[root@localhost]# jq -r '.global.name |= '""$INPUT"" $CONF_FILE > tmp.$$.json && mv tmp.$$.json $CONF_FILE
jq: error: dsa/0 is not defined at <top-level>, line 1:
.global.name |= dsa
jq: 1 compile error
Desired output:
[root@localhost]# cat $CONF_FILE
{ "global" : {
"name" : "dsa",
"id" : 1 } }
Your only problem was that the script passed to jq was quoted incorrectly.
In your particular case, using a single double-quoted string with embedded \-escaped " instances is probably simplest:
jq -r ".global.name = \"$INPUT\"" "$CONF_FILE" > tmp.$$.json && mv tmp.$$.json "$CONF_FILE"
Generally, however, chepner's helpful answer shows a more robust alternative to embedding the shell variable reference directly in the script: Using the --arg option to pass a value as a jq variable allows single-quoting the script, which is preferable, because it avoids confusion over what elements are expanded by the shell up front and obviates the need for escaping $ instances that should be passed through to jq.
Also:
Just = is sufficient to assign the value; while |=, the so-called update operator, works too, it behaves the same as = in this instance, because the RHS is a literal, not an expression referencing the LHS - see the manual.
You should routinely double-quote your shell-variable references and you should avoid use of all-uppercase variable names in order to avoid conflicts with environment variables and special shell variables.
As for why your quoting didn't work:
'.global.name |= '""$INPUT"" is composed of the following tokens:
String literal .global.name |= (due to single-quoting)
String literal "" - i.e., the empty string - the quotes will be removed by the shell before jq sees the script
An unquoted reference to variable $INPUT (which makes its value subject to word-splitting and globbing).
Another instance of literal "".
With your sample value, jq ended up seeing the following string as its script:
.global.name |= dsa
As you can see, the double quotes are missing, causing jq to interpret dsa as a function name rather than a string literal, and since no argument was passed to (non-existent) function dsa, jq's error message referenced it as dsa/0 - a function with no (0) arguments.
It's much simpler and safer to pass the value using the --arg option:
jq -r --arg newname "$INPUT" '.global.name |= $newname' "$CONF_FILE"
This ensures that the exact value of $INPUT is used and quoted as a JSON value.
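For instance, with INPUT=dsa and the sample test.json from the question, this should print the updated document:
{
  "global": {
    "name": "dsa",
    "id": 1
  }
}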
Using jq with a straightforward filter should do it for you.
.global.name = "dsa"
i.e.
jq '.global.name = "dsa"' json-file
{
"global": {
"name": "dsa",
"id": 1
}
}
You can play around with your jq filters here.

How to delete the last character of prior line with sed

I'm trying to delete a line, together with the last character of the prior line, using sed.
I have a JSON file:
{
"name":"John",
"age":"16",
"country":"Spain"
}
I would like to delete country from all entries; to do that, I also have to delete the comma on the prior line to keep the JSON syntax valid.
I'm using these patterns:
sed '/country/d' test.json
sed -n '/resolved//.$//{x;d;};1h;1!{x;p;};${x;p;}' test.json
Editor's note:
The OP later clarified the following additional requirements, which invalidated some of the existing answers:
- multiple occurrences of country properties should be removed
- across all levels of the object hierarchy
- whitespace variations should be tolerated
Using a proper JSON parser such as jq is generally the best choice (see below), but if installing a utility is not an option, try this GNU sed command:
$ sed -zr 's/,\s*"country":[^\n]+//g' test.json
{
"name":"John",
"age":"16"
}
-z splits the input into records by NULs, which in this case means that the whole file is read at once, enabling cross-line substitutions.
-r enables extended regular expressions for a more modern syntax with more features.
s/,\n"country":\s*//g replaces all occurrences of a comma followed by a (possibly empty) run of whitespace (including possibly a newline) and then "country" through the end of that line with the empty string, i.e., effectively removes the matched strings.
Note that this assumes that no other property or closing } follows such a country property on the same line.
To demonstrate a more robust solution based on jq:
Bertrand Martel's helpful answer contains a jq solution which, however, does not address the requirement (added later) of removing country properties anywhere in the input object hierarchy.
In a not-yet-released version of jq higher than v1.5.2, a builtin walk/1 function will be available, which enables the following simple solution:
# Walk all nodes and remove a "country" property from any object.
jq 'walk(if type == "object" then del(.country) else . end)' test.json
In v1.5.2 and below, you can define a simplified variant of walk yourself:
jq '
# Define recursive function walk_objects/1 that walks all objects in the
# hierarchy.
def walk_objects(f): . as $in |
if type == "object" then
reduce keys[] as $key
( {}; . + { ($key): ($in[$key] | walk_objects(f)) } ) | f
elif type == "array" then map( walk_objects(f) )
else . end;
# Walk all objects and remove a "country" property, if present.
walk_objects(del(.country))
' test.json
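For the sample file from this question, both variants should produce:
{
  "name": "John",
  "age": "16"
}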
As pointed out before, you should really consider using a JSON parser to parse JSON.
That said, you can slurp the whole file, remove the newlines and then replace
accordingly:
$ sed ':a;N;$!ba;s/\n//g;s/,"country"[^}]*//' test.json
{"name":"John","age":"16"}
Breakdown:
:a; # Define label 'a'
N; # Append next line to pattern space
$!ba; # Goto 'a' unless it's the last line
s/\n//g; # Replace all newlines with nothing
s/,"country"[^}]*// # Replace ',"country...' with nothing
This might work for you (GNU sed):
sed 'N;s/,\s*\n\s*"country".*//;P;D' file
Read two lines into the pattern space and remove the matched string via substitution.
N.B. This allows for spaces on either side of the lines.
You can use a JSON parser like jq to parse the JSON file. The following will return the document without the country field and write the new document to result.json:
jq 'del(.country)' file.json > result.json

Grepping out data from a returned wget

I am writing a bash script to use with badips.com
This command:
wget https://www.badips.com/get/key -qO -
Will return something like this:
{"err":"","suc":"new key 5f72253b673eb49fc64dd34439531b5cca05327f has been set.","key":"5f72253b673eb49fc64dd34439531b5cca05327f"}
Or like this:
{"err":"","suc":"Your Key was already present! To overwrite, see http:\/\/www.badips.com\/apidoc.","key":"5f72253b673eb49fc64dd34439531b5cca05327f"}
I need to parse the key value out (5f72253b673eb49fc64dd34439531b5cca05327f) into a variable in the script. I would prefer to use grep to do it but can't get it right.
Instead of parsing with some grep, you have the perfect tool for this: jq.
See:
jq '.key' file
or
.... your_commands .... | jq '.key'
will return
"5f72253b673eb49fc64dd34439531b5cca05327f"
See another example, for example to get the suc attribute:
$ cat a
{"err":"","suc":"new key 5f72253b673eb49fc64dd34439531b5cca05327f has been set.","key":"5f72253b673eb49fc64dd34439531b5cca05327f"}
{"err":"","suc":"Your Key was already present! To overwrite, see http:\/\/www.badips.com\/apidoc.","key":"5f72253b673eb49fc64dd34439531b5cca05327f"}
$ jq '.suc' a
"new key 5f72253b673eb49fc64dd34439531b5cca05327f has been set."
"Your Key was already present! To overwrite, see http://www.badips.com/apidoc."
You could try the grep command below:
grep -oP '"key":"\K[^"]*(?=")' file
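To capture the result into a shell variable, wrap it in command substitution (again, the name key is arbitrary):
key=$(wget https://www.badips.com/get/key -qO - | grep -oP '"key":"\K[^"]*(?=")')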
Using perl:
wget https://www.badips.com/get/key -qO - |
perl -MJSON -MFile::Slurp=slurp -le '
my $s = slurp "/dev/stdin";
my $d = JSON->new->decode($s);
print $d->{key}
'
Not as robust as the preceding one, but this doesn't require installing new modules; a stock perl can do it:
wget https://www.badips.com/get/key -qO - |
perl -lne 'print $& if /"key":"\K[[:xdigit:]]+/'
awk keeps it simple
wget ... - | awk -F: '{split($NF,k,"\"");print k[2]}'
the field separator is :;
the key is always in the last field, which awk exposes as $NF (NF is the number of fields, so $NF is the last one);
the split function splits $NF into the array k according to the separator "\"", which is just a single double-quote character;
the second element of the k array is what you want.