Accessing JSON value via jq using variable key

I have a file with JSON like:
test.json
{
"NUTS|/nuts/2010": {
"type": "small",
"mfg": "TSQQ",
"colors": []
}
}
I am getting "NUTS|/nuts/2010" from outside and storing it in a shell variable. I am trying to use the snippet below with the jq utility, but I am not able to access the corresponding JSON value for that key.
test.sh
#!/bin/bash
NUTS_PATH="NUTS|/nuts/2010" #Storing in shell variable
INPUT_FILE="test.json"
RESULT=($(jq -r --arg NUTS_PATH_ALIAS "$NUTS_PATH" '.[$NUTS_PATH_ALIAS]' $INPUT_FILE))
echo "Result: $RESULT"
echo $RESULT > item.json
When I run this, I am getting:
Result: {
But it should return
{
"type": "small",
"mfg": "TSQQ",
"colors": []
}
Any help would be appreciated. Thanks.

The problem isn't associated with jq at all. What you have should work fine; the issue is that you assign the result to an array when you probably intended to store it in a plain variable:
RESULT=($(jq -r --arg NUTS_PATH_ALIAS "$NUTS_PATH" '.[$NUTS_PATH_ALIAS]' $INPUT_FILE))
# ^^^ =( .. ) result is stored into an array
Expanding an array as if it were a scalar, i.e. $RESULT, refers only to the element at index 0, that is ${RESULT[0]}, which holds just the { character of the raw JSON output, because the unquoted command substitution was word-split on whitespace.
You should ideally be doing
RESULT="$(jq -r --arg NUTS_PATH_ALIAS "$NUTS_PATH" '.[$NUTS_PATH_ALIAS]' "$INPUT_FILE")"

I always end up swearing at jq, too!
For me, this jq query works:
$ jq '.["NUTS|/nuts/2010"]' test.json
{
"type": "small",
"mfg": "TSQQ",
"colors": []
}
However, because you've got pipes and slashes in your string, the variable quoting gets a bit funny.
NUTS_PATH='"NUTS|/nuts/2010"' #Note the two sets of quotes
INPUT_FILE="test.json"
RESULT=$(jq ".[$NUTS_PATH]" $INPUT_FILE)
echo "Result: $RESULT"
Result: {
"type": "small",
"mfg": "TSQQ",
"colors": []
}
Disclaimer: I'm not a Bash expert; there may be (and probably is) a better way to sort out the quoting.
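For what it's worth, the --arg approach from the answer above sidesteps the quoting question entirely, because jq receives the key as data rather than as program text:
jq -r --arg k "NUTS|/nuts/2010" '.[$k]' test.json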


Use jq to replace many values with variable values

Using jq, is it possible to replace the value of each parameter in the sample JSON with the value of the shell variable whose name is the initial value?
In my scenario, Azure DevOps does not carry out any kind of variable substitution on the JSON file, so I need to do it manually. So, for example, if $SUBSCRIPTION_ID is set to abc-123, I'd like to use jq to update the JSON file accordingly.
I can pull out the values using .parameters[].value, but I can't seem to find a way of setting each individual value.
The main challenge here is that the solution should be reusable, and different JSON files will have different parameters, so I don't think I can use --argjson.
Example
Original JSON
{
"$schema": "https://schema.management.azure.com/schemas/2015-01-01/parametersTemplate.json#",
"contentVersion": "1.0.0.0",
"parameters": {
"subscriptionId": {
"value": "$SUBSCRIPTION_ID"
},
"topicName": {
"value": "$TOPIC_NAME"
}
}
}
Variables
SUBSCRIPTION_ID="abc-123"
TOPIC_NAME="SomeTopic"
Desired JSON
{
"$schema": "https://schema.management.azure.com/schemas/2015-01-01/parametersTemplate.json#",
"contentVersion": "1.0.0.0",
"parameters": {
"subscriptionId": {
"value": "abc-123"
},
"topicName": {
"value": "SomeTopic"
}
}
}
Export those variables so that you can access them from within jq.
export SUBSCRIPTION_ID TOPIC_NAME
jq '.parameters[].value |= (env[.[1:]] // .)' file
Here .[1:] strips the leading $ so the rest of the placeholder can be looked up as an environment variable name; the // . part leaves values whose variables are absent from the environment as-is, and you can drop it if that's not needed.
Use --arg; essentially, you are just going to ignore the attempt at parameterizing the JSON and simply replace the values unconditionally. (The values here are plain strings, so --arg is the right option; --argjson would reject abc-123 as invalid JSON.)
jq --arg x "$SUBSCRIPTION_ID" \
--arg y "$TOPIC_NAME" \
'.parameters.subscriptionId.value = $x | .parameters.topicName.value = $y' \
config.json
Here is a "data-driven" approach based on the contents of the schema and the available environment variables:
export SUBSCRIPTION_ID="abc-123"
export TOPIC_NAME="SomeTopic"
< schema.json jq '.parameters
|= map_values(if .value | (startswith("$") and env[.[1:]])
then .value |= env[.[1:]] else . end)'
Notice that none of the template names appear in the jq program.
If your shell supports it, you could avoid the "export" commands by prefacing the jq command with the variable assignments along the lines of:
SUBSCRIPTION_ID="abc-123" TOPIC_NAME="SomeTopic" jq -f program.jq schema.json
Caveat
Using environment variables to pass in the parameter values may not be such a great idea. Two alternatives would be to provide the name-value pairs in a text file or as a JSON object. See also Using jq as a template engine
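For instance, here is a minimal sketch of the JSON-object alternative, passing all the name-value pairs in a single --argjson object (the $vars name is just an illustration):
jq --argjson vars '{"SUBSCRIPTION_ID":"abc-123","TOPIC_NAME":"SomeTopic"}' '.parameters[].value |= ($vars[ltrimstr("$")] // .)' file
This also keeps the values out of the process environment entirely.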

Retrieve one (last) value from influxdb

I'm trying to retrieve the last value inserted into a table in influxdb. What I need to do is then post it to another system via HTTP.
I'd like to do all this in a bash script, but I'm open to Python also.
$ curl -sG 'https://influx.server:8086/query' --data-urlencode "db=iotaWatt" --data-urlencode "q=SELECT LAST(\"value\") FROM \"grid\" ORDER BY time DESC" | jq -r
{
"results": [
{
"statement_id": 0,
"series": [
{
"name": "grid",
"columns": [
"time",
"last"
],
"values": [
[
"2018-01-17T04:15:30Z",
690.1
]
]
}
]
}
]
}
What I'm struggling with is getting this value into a clean format I can use. I don't really want to use sed, and I've tried jq but it complains the data is a string and not an index:
jq: error (at <stdin>:1): Cannot index array with string "series"
Anyone have a good suggestion?
Pipe that curl to the jq below
$ your_curl_stuff_here | jq '.results[].series[]|.name,.values[0][]'
"grid"
"2018-01-17T04:15:30Z"
690.1
The results could be stored into a bash array and used later.
$ results=( $(your_curl_stuff_here | jq '.results[].series[]|.name,.values[0][]') )
$ echo "${results[#]}"
"grid" "2018-01-17T04:15:30Z" 690.1
# Individual values could be accessed using "${results[0]}" and so, mind quotes
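If any of the values could ever contain whitespace, a safer sketch reads the lines into the array with mapfile (bash 4+) instead of relying on word splitting:
$ mapfile -t results < <(your_curl_stuff_here | jq '.results[].series[]|.name,.values[0][]')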
All good :-)
Given the JSON shown, the jq query:
.results[].series[].values[]
produces:
[
"2018-01-17T04:15:30Z",
690.1
]
This seems to be the output you want, but from the point of view of someone who is not familiar with influxdb, the requirements seem very opaque, so you might want to consider a variant, such as:
.results[-1].series[-1].values[-1]
which in this case produces the same result, as it happens.
If you just want the atomic values, you could simply append [] to either of the queries above.
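And if you only need the single number for the HTTP post, you can index into that row directly and capture it in a shell variable, as in this sketch (reusing the curl placeholder from the answer above):
$ value=$(your_curl_stuff_here | jq -r '.results[-1].series[-1].values[-1][1]')
$ echo "$value"
690.1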

How to get a subobject out of JSON using jq, keeping final key in the result without Bash processing?

I'm writing a Bash function to get a portion of a JSON object. The API for the function is:
GetSubobject()
{
local Filter="$1" # Filter is of the form .<key>.<key> ... .<key>
local File="$2" # File is the JSON to get the subobject
# Code to get subobject using jq
# ...
}
To illustrate what I mean by a subobject, consider the Bash function call:
GetSubobject .b.x.y example.json
where the file example.json contains:
{
"a": { "p": 1, "q": 2 },
"b":
{
"x":
{
"y": { "j": true, "k": [1,2,3] },
"z": [4,5,6]
}
}
}
The result from the function call would be emitted to stdout:
{
"y": {
"j": true,
"k": [
1,
2,
3
]
}
}
Note that the code jq -r "$Filter" "$File" would not give the desired answer. It would give:
{ "j": true, "k": [1,2,3] }
Please note that the answer I'm looking for needs to be something I can use in the Bash function API above. So, the answer should use the Filter and File variables as shown above and not be specific to the example above.
I have come up with a solution; however, it relies on Bash to do part of the job. I am hoping that the solution can be pure jq without reliance on Bash processing.
#!/bin/bash
GetSubobject()
{
local Filter="$1"
local File="$2"
# General case: separate:
# .<key1>.<key2> ... .<keyN-1>.<keyN>
# into:
# Prefix=.<key1>.<key2> ... .<keyN-1>
# Suffix=<keyN>
local Suffix="${Filter##*.}"
local Prefix="${Filter%.$Suffix}"
# Edge case: where Filter = .<key>
# Set:
# Prefix=.
# Suffix=<key>
if [[ -z $Prefix ]]; then
Prefix='.'
Suffix="${Filter#.}"
fi
jq -r "$Prefix|to_entries|map(select(.key==\"$Suffix\"))|from_entries" "$File"
}
GetSubobject "$#"
How would I complete the above Bash function using jq to obtain the desired result, hopefully in a less brute-force way that takes advantage of jq's capabilities without having to do pre-processing in Bash?
Somewhat further simplifying the jq part but with the same general constraints as JawguyChooser's answer, how about the much simpler Bash function
GetSubobject () {
local newroot=${1##*.}
jq -r "{$newroot: $1}" "$2"
}
I may be overlooking some nuances of your more-complex Bash processing, but this seems to work for the example you provided.
If I understand what you're trying to do correctly, it doesn't seem possible to me to do it "pure jq" having read the docs (and being a regular jq user myself). The closest I could come to helping here was to simplify the jq part itself:
jq -r "$Prefix| { $Suffix }" "$File"
This has the same behavior as your example (on this limited set of cases):
GetSubobject '.b.x.y' example.json
{
"y": {
"j": true,
"k": [
1,
2,
3
]
}
}
This is really a case of metaprogramming, you want to programmatically operate on a jq program. Well, it makes sense (to me) that jq takes its program as input but doesn't allow you to alter the program itself. bash seems like an appropriate choice for doing the metaprogramming here: to convert a jq program into another one and then run jq using that.
If the goal is to do as little as possible in bash, then maybe the following bash function will fill the bill:
function GetSubobject {
local Filter="$1" # Filter is of the form .<key>.<key> ... .<key>
local File="$2" # File is the JSON to get the subobject
jq '(null|path('"$Filter"')) as $path
| {($path[-1]): '"$Filter"'}' "$File"
}
An alternative would be to pass $Filter in as a string (e.g. --arg filter "$Filter"), have jq do the parsing, and then use getpath.
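A minimal sketch of that alternative, assuming the simple .<key>.<key> ... .<key> form of Filter stated in the API:
GetSubobject()
{
local Filter="$1"
local File="$2"
jq --arg filter "$Filter" '($filter | ltrimstr(".") | split(".")) as $path | {($path[-1]): getpath($path)}' "$File"
}
Because the filter arrives as data rather than as program text, jq does all the parsing and no Bash string surgery is needed.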
It would of course be simplest if GetSubobject could be called with the path separated from the field of interest, like this:
GetSubobject .b.x y filename
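In that case the function collapses to something like this sketch (the argument order follows the call above):
GetSubobject()
{
local Prefix="$1" Key="$2" File="$3"
jq --arg key "$Key" "$Prefix | {(\$key): .[\$key]}" "$File"
}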

Linux CLI - How to get substring from JSON jq + grep?

I need to pull a substring from JSON. In the JSON doc below, I need the end of the value of jq '.[].networkProfile.networkInterfaces[].id' In other words, I need just A10NICvw4konls2vfbw-data to pass to another command. I can't seem to figure out how to pull a substring using grep. I've seen regex examples out there but haven't been successful with them.
[
{
"id": "/subscriptions/blah/resourceGroups/IPv6v2/providers/Microsoft.Compute/virtualMachines/A10VNAvw4konls2vfbw",
"instanceView": null,
"licenseType": null,
"location": "centralus",
"name": "A10VNAvw4konls2vfbw",
"networkProfile": {
"networkInterfaces": [
{
"id": "/subscriptions/blah/resourceGroups/IPv6v2/providers/Microsoft.Network/networkInterfaces/A10NICvw4konls2vfbw-data",
"resourceGroup": "IPv6v2"
}
]
}
}
]
In your case, sub(".*/";"") will do the trick as * is greedy:
.[].networkProfile.networkInterfaces[].id | sub(".*/";"")
Try this:
jq -r '.[]|.networkProfile.networkInterfaces[].id | split("/") | last'
The -r tells jq to print the output in "raw" form - in this case, that means no double-quotes around the string value.
As for the jq expression, after you access the id you want, piping it (still inside jq) through split("/") turns it into an array of the parts between the slashes. Piping that through the last function (thanks, @Thor) returns just the last element of the array.
If you want to do it with grep here is one way:
jq -r '.[].networkProfile.networkInterfaces[].id' | grep -o '[^/]*$'
Output:
A10NICvw4konls2vfbw-data
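As an aside, once the id is in a shell variable anyway, plain Bash parameter expansion does the same job without any extra process (##*/ strips everything up to and including the last slash):
id=$(jq -r '.[].networkProfile.networkInterfaces[].id' file.json)
echo "${id##*/}"   # A10NICvw4konls2vfbw-data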

Search and replace string in a very big file

I have a preference for shell commands to get things done. I have a very, very big file -- about 2.8 GB and the content is that of JSON. Everything is on one line, and I was told there are at least 1.5 million records in there.
I must prepare the file for consumption. Each record must be on its own line. Sample:
{"RomanCharacters":{"Alphabet":[{"RecordId":"1",...]},{"RecordId":"2",...},{"RecordId":"3",...},{"RecordId":"4",...},{"RecordId":"5",...} }}
Or, use the following...
{"Accounts":{"Customer":[{"AccountHolderId":"9c585258-c94c-442b-a2f0-1ebbcc274795","Title":"Mrs","Forename":"Tina","Surname":"Wright","DateofBirth":"1988-01-01","Contact":[{"Contact_Info":"9168777943","TypeId":"Mobile Number","PrimaryFlag":"No","Index":"1","Superseded":"No" },{"Contact_Info":"9503588153","TypeId":"Home Telephone","PrimaryFlag":"Yes","Index":"2","Superseded":"Yes" },{"Contact_Info":"acne.pimple#microchimerism.com","TypeId":"Email Address","PrimaryFlag":"No","Index":"3","Superseded":"No" },{"Contact_Info":"swati.singh#microchimerism.com","TypeId":"Email Address","PrimaryFlag":"Yes","Index":"4","Superseded":"Yes" }, {"Contact_Info":"christian.bale#hollywood.com","TypeId":"Email Address","PrimaryFlag":"No","Index":"5","Superseded":"NO" },{"Contact_Info":"15482475584","TypeId":"Mobile_Phone","PrimaryFlag":"No","Index":"6","Superseded":"No" }],"Address":[{"AddressPtr":"5","Line1":"Flat No.14","Line2":"Surya Estate","Line3":"Baner","Line4":"Pune ","Line5":"new","Addres_City":"pune","Country":"India","PostCode":"AB100KP","PrimaryFlag":"No","Superseded":"No"},{"AddressPtr":"6","Line1":"A-602","Line2":"Viva Vadegiri","Line3":"Virar","Line4":"new","Line5":"banglow","Addres_City":"Mumbai","Country":"India","PostCode":"AB10V6T","PrimaryFlag":"Yes","Superseded":"Yes"}],"Account":[{"Field_A":"6884133655531279","Field_B":"887.07","Field_C":"A Loan Product",...,"FieldY_":"2015-09-18","Field_Z":"24275627"}]},{"AccountHolderId":"92a5788f-cd8f-423d-ae5f-4eb0ceb457fd","_Title":"Dr","_Forename":"Christopher","_Surname":"Carroll","_DateofBirth":"1977-02-02","Contact":[{"Contact_Info":"9168777943","TypeId":"Mobile Number","PrimaryFlag":"No","Index":"7","Superseded":"No" },{"Contact_Info":"9503588153","TypeId":"Home Telephone","PrimaryFlag":"Yes","Index":"8","Superseded":"Yes" },{"Contact_Info":"acne.pimple#microchimerism.com","TypeId":"Email Address","PrimaryFlag":"No","Index":"9","Superseded":"No" },{"Contact_Info":"swati.singh#microchimerism.com","TypeId":"Email Address","PrimaryFlag":"Yes","Index":"10","Superseded":"Yes" }],"Address":[{"AddressPtr":"11","Line1":"Flat No.14","Line2":"Surya Estate","Line3":"Baner","Line4":"Pune ","Line5":"new","Addres_City":"pune","Country":"India","PostCode":"AB11TXF","PrimaryFlag":"No","Superseded":"No"},{"AddressPtr":"12","Line1":"A-602","Line2":"Viva Vadegiri","Line3":"Virar","Line4":"new","Line5":"banglow","Addres_City":"Mumbai","Country":"India","PostCode":"AB11O8W","PrimaryFlag":"Yes","Superseded":"Yes"}],"Account":[{"Field_A":"4121879819185553","Field_B":"887.07","Field_C":"A Loan Product",...,"Field_X":"2015-09-18","Field_Z":"25679434"}]},{"AccountHolderId":"4aa10284-d9aa-4dc0-9652-70f01d22b19e","_Title":"Dr","_Forename":"Cheryl","_Surname":"Ortiz","_DateofBirth":"1977-03-03","Contact":[{"Contact_Info":"9168777943","TypeId":"Mobile Number","PrimaryFlag":"No","Index":"13","Superseded":"No" },{"Contact_Info":"9503588153","TypeId":"Home Telephone","PrimaryFlag":"Yes","Index":"14","Superseded":"Yes" },{"Contact_Info":"acne.pimple#microchimerism.com","TypeId":"Email Address","PrimaryFlag":"No","Index":"15","Superseded":"No" },{"Contact_Info":"swati.singh#microchimerism.com","TypeId":"Email Address","PrimaryFlag":"Yes","Index":"16","Superseded":"Yes" }],"Address":[{"AddressPtr":"17","Line1":"Flat No.14","Line2":"Surya Estate","Line3":"Baner","Line4":"Pune ","Line5":"new","Addres_City":"pune","Country":"India","PostCode":"AB12SQR","PrimaryFlag":"No","Superseded":"No"},{"AddressPtr":"18","Line1":"A-602","Line2":"Viva 
Vadegiri","Line3":"Virar","Line4":"new","Line5":"banglow","Addres_City":"Mumbai","Country":"India","PostCode":"AB12BAQ","PrimaryFlag":"Yes","Superseded":"Yes"}],"Account":[{"Field_A":"3288214945919484","Field_B":"887.07","Field_C":"A Loan Product",...,"Field_Y":"2015-09-18","Field_Z":"66264768"}]}]}}
Final outcome should be:
{"RomanCharacters":{"Alphabet":[{"RecordId":"1",...]},
{"RecordId":"2",...},
{"RecordId":"3",...},
{"RecordId":"4",...},
{"RecordId":"5",...} }}
Attempted commands:
sed -e 's/,{"RecordId"/}]},\n{"RecordId"/g' sample.dat
awk '{gsub(",{\"RecordId\"",",\n{\"RecordId\"",$0); print $0}' sample.dat
The attempted commands work perfectly fine for small files, but they do not work for the 2.8 GB file that I must manipulate. sed quit midway after 10 minutes for no apparent reason with nothing done, awk errored out with a segmentation fault (core dump) after many hours, and Perl's search and replace died with "Out of memory".
Any help/ ideas would be great!
Additional info on my machine:
More than 105 GB disk space available.
8 GB memory
4 cores CPU
Running Ubuntu 14.04
Since you've tagged your question with sed, awk AND perl, I gather that what you really need is a recommendation for a tool. While that's kind of off-topic, I believe that jq is something you could use for this. It will be better than sed or awk because it actually understands JSON. Everything shown here with jq could also be done in perl with a bit of programming.
Assuming content like the following (based on your sample):
{"RomanCharacters":{"Alphabet": [ {"RecordId":"1","data":"data"},{"RecordId":"2","data":"data"},{"RecordId":"3","data":"data"},{"RecordId":"4","data":"data"},{"RecordId":"5","data":"data"} ] }}
You can easily reformat this to "prettify" it:
$ jq '.' < data.json
{
"RomanCharacters": {
"Alphabet": [
{
"RecordId": "1",
"data": "data"
},
{
"RecordId": "2",
"data": "data"
},
{
"RecordId": "3",
"data": "data"
},
{
"RecordId": "4",
"data": "data"
},
{
"RecordId": "5",
"data": "data"
}
]
}
}
And we can dig in to the data to retrieve only the records you're interested in (regardless of what they're wrapped in):
$ jq '.[][][]' < data.json
{
"RecordId": "1",
"data": "data"
}
{
"RecordId": "2",
"data": "data"
}
{
"RecordId": "3",
"data": "data"
}
{
"RecordId": "4",
"data": "data"
}
{
"RecordId": "5",
"data": "data"
}
This is much more readable, both by humans and by tools like awk which process content line-by-line. If you want to join your lines for processing per your question, the awk becomes much simpler:
$ jq '.[][][]' < data.json | awk '{printf("%s ",$0)} /}/{printf("\n")}'
{ "RecordId": "1", "data": "data" }
{ "RecordId": "2", "data": "data" }
{ "RecordId": "3", "data": "data" }
{ "RecordId": "4", "data": "data" }
{ "RecordId": "5", "data": "data" }
Or, as @peak suggested in comments, eliminate the awk portion of this entirely by using jq's -c (compact output) option:
$ jq -c '.[][][]' < data.json
{"RecordId":"1","data":"data"}
{"RecordId":"2","data":"data"}
{"RecordId":"3","data":"data"}
{"RecordId":"4","data":"data"}
{"RecordId":"5","data":"data"}
Regarding perl: Try setting the input line separator $/ to }, like this:
#!/usr/bin/perl
$/= "},";
while (<>){
print "$_\n";
}
or, as a one-liner:
$ perl -e '$/="},";while(<>){print "$_\n"}' sample.dat
Try using } as the record separator, e.g. in Perl:
perl -l -0175 -ne 'print $_, $/' < input
You might need to glue back lines containing only }.
This avoids the memory problem by not looking at the data as a single record, but may go too far the other way with respect to performance (processing a single character at a time). Also note that it requires gawk for the built-in RT variable (value of the current record separator):
$ cat j.awk
BEGIN { RS="[[:print:]]" }
RT == "{" { bal++}
RT == "}" { bal-- }
{ printf "%s", RT }
RT == "," && bal == 2 { print "" }
END { print "" }
$ gawk -f j.awk j.txt
{"RomanCharacters":{"Alphabet":[{"RecordId":"1",...]},
{"RecordId":"2",...},
{"RecordId":"3",...},
{"RecordId":"4",...},
{"RecordId":"5",...} }}
Using the sample data provided here (the one that begins with {"Accounts":{"Customer"... ), the solution reads the file while counting the number of delimiters defined in $/. For every 10,000 delimiters counted it writes out a new file, and it appends a newline after each delimiter found. Here is what the script looks like:
#!/usr/bin/perl
$base="/home/dat789/incoming";
#$_="sample.dat";
$/= "}]},"; # delimiter to find and insert new line after
$n = 0;
$match="";
$filecount=0;
$recsPerFile=10000; # set number of records in a file
print "Processing " . $ARGV[0] . "\n";
while (<>){
$match=$match.$_."\n"; # append every record, including the one that fills the batch
$n++;
print "."; #This is so that we'd know it has done something
if ($n >= $recsPerFile) {
my $newfile="partfile".$recsPerFile."-".$filecount . ".dat";
open ( OUTPUT,'>', $newfile );
print OUTPUT $match;
close ( OUTPUT );
$match="";
$filecount++;
$n=0;
print "Wrote file " . $newfile . "\n";
}
}
# flush whatever is left over after the last full batch
if ($match ne "") {
my $newfile="partfile".$recsPerFile."-".$filecount . ".dat";
open ( OUTPUT,'>', $newfile );
print OUTPUT $match;
close ( OUTPUT );
print "Wrote file " . $newfile . "\n";
}
print "Finished\n\n";
print "Finished\n\n";
I've used this script against the big 2.8 GB file whose content is unformatted one-line JSON. The resulting output files are missing the correct JSON headers and footers, but this can be easily fixed.
Thank you so much guys for contributing!