Using jq, is it possible to replace the value of each parameter in the sample JSON with the value of the variable named by its initial value?
In my scenario, Azure DevOps does not carry out any kind of variable substitution on the JSON file, so I need to do it manually. For example, if $SUBSCRIPTION_ID is set to abc-123, I'd like to use jq to write that value into the JSON file.
I can pull out the values using .parameters[].value, but I can't seem to find a way of setting each individual value.
The main challenge here is that the solution should be reusable, and different JSON files will have different parameters, so I don't think I can use --argjson.
Example
Original JSON
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/parametersTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "subscriptionId": {
      "value": "$SUBSCRIPTION_ID"
    },
    "topicName": {
      "value": "$TOPIC_NAME"
    }
  }
}
Variables
SUBSCRIPTION_ID="abc-123"
TOPIC_NAME="SomeTopic"
Desired JSON
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/parametersTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "subscriptionId": {
      "value": "abc-123"
    },
    "topicName": {
      "value": "SomeTopic"
    }
  }
}
Export those variables so that you can access them from within jq.
export SUBSCRIPTION_ID TOPIC_NAME
jq '.parameters[].value |= (env[.[1:]] // .)' file
The // . part leaves values whose variables are absent from the environment as-is; you can drop it if that's not needed.
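For the sample file above, with the variables exported, this produces exactly the desired JSON:

$ export SUBSCRIPTION_ID="abc-123" TOPIC_NAME="SomeTopic"
$ jq '.parameters[].value |= (env[.[1:]] // .)' file
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/parametersTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "subscriptionId": {
      "value": "abc-123"
    },
    "topicName": {
      "value": "SomeTopic"
    }
  }
}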
Use --arg (rather than --argjson, since these values are plain strings); essentially, you are just going to ignore the attempt at parameterizing the JSON and simply replace the values unconditionally.
jq --arg x "$SUBSCRIPTION_ID" \
   --arg y "$TOPIC_NAME" \
   '.parameters.subscriptionId.value = $x
    | .parameters.topicName.value = $y' \
   config.json
Here is a "data-driven" approach based on the contents of the schema and the available environment variables:
export SUBSCRIPTION_ID="abc-123"
export TOPIC_NAME="SomeTopic"
< schema.json jq '.parameters
|= map_values(if .value | (startswith("$") and env[.[1:]])
then .value |= env[.[1:]] else . end)'
Notice that none of the template names appear in the jq program.
If your shell supports it, you could avoid the "export" commands by prefacing the jq command with the variable assignments along the lines of:
SUBSCRIPTION_ID="abc-123" TOPIC_NAME="SomeTopic" jq -f program.jq schema.json
Caveat
Using environment variables to pass in the parameter values may not be such a great idea. Two alternatives would be to provide the name-value pairs in a text file or as a JSON object. See also "Using jq as a template engine".
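For instance, a minimal sketch of the JSON-object alternative (the $vars name and the inline object are illustrative, not part of the original answer):

jq --argjson vars '{"SUBSCRIPTION_ID": "abc-123", "TOPIC_NAME": "SomeTopic"}' \
   '.parameters[].value |= ($vars[.[1:]] // .)' file

This keeps the program reusable across files, since the parameter names still never appear in it.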
I'm writing a Bash function to get a portion of a JSON object. The API for the function is:
GetSubobject()
{
    local Filter="$1" # Filter is of the form .<key>.<key> ... .<key>
    local File="$2"   # File is the JSON to get the subobject
    # Code to get subobject using jq
    # ...
}
To illustrate what I mean by a subobject, consider the Bash function call:
GetSubobject .b.x.y example.json
where the file example.json contains:
{
  "a": { "p": 1, "q": 2 },
  "b":
  {
    "x":
    {
      "y": { "j": true, "k": [1,2,3] },
      "z": [4,5,6]
    }
  }
}
The result from the function call would be emitted to stdout:
{
  "y": {
    "j": true,
    "k": [
      1,
      2,
      3
    ]
  }
}
Note that the code jq -r "$Filter" "$File" would not give the desired answer. It would give:
{ "j": true, "k": [1,2,3] }
Please note that the answer I'm looking for needs to be something I can use in the Bash function API above. So, the answer should use the Filter and File variables as shown above and not be specific to the example above.
I have come up with a solution; however, it relies on Bash to do part of the job. I am hoping that the solution can be pure jq without reliance on Bash processing.
#!/bin/bash
GetSubobject()
{
    local Filter="$1"
    local File="$2"
    # General case: separate:
    #   .<key1>.<key2> ... .<keyN-1>.<keyN>
    # into:
    #   Prefix=.<key1>.<key2> ... .<keyN-1>
    #   Suffix=<keyN>
    local Suffix="${Filter##*.}"
    local Prefix="${Filter%.$Suffix}"
    # Edge case: where Filter = .<key>
    # Set:
    #   Prefix=.
    #   Suffix=<key>
    if [[ -z $Prefix ]]; then
        Prefix='.'
        Suffix="${Filter#.}"
    fi
    jq -r "$Prefix|to_entries|map(select(.key==\"$Suffix\"))|from_entries" "$File"
}
GetSubobject "$@"
How would I complete the above Bash function using jq to obtain the desired result, hopefully in a less brute-force way that takes advantage of jq's capabilities without having to do pre-processing in Bash?
Somewhat further simplifying the jq part but with the same general constraints as JawguyChooser's answer, how about the much simpler Bash function
GetSubobject () {
    local newroot=${1##*.}
    jq -r "{$newroot: $1}" "$2"
}
I may be overlooking some nuances of your more-complex Bash processing, but this seems to work for the example you provided.
If I understand what you're trying to do correctly, it doesn't seem possible to me to do it "pure jq" having read the docs (and being a regular jq user myself). The closest I could come to helping here was to simplify the jq part itself:
jq -r "$Prefix| { $Suffix }" "$File"
This has the same behavior as your example (on this limited set of cases):
GetSubobject '.b.x.y' example.json
{
  "y": {
    "j": true,
    "k": [
      1,
      2,
      3
    ]
  }
}
This is really a case of metaprogramming: you want to programmatically operate on a jq program. Well, it makes sense (to me) that jq takes its program as input but doesn't allow you to alter the program itself. bash seems like an appropriate choice for doing the metaprogramming here: converting a jq program into another one and then running jq with it.
If the goal is to do as little as possible in bash, then maybe the following bash function will fill the bill:
function GetSubobject {
    local Filter="$1" # Filter is of the form .<key>.<key> ... .<key>
    local File="$2"   # File is the JSON to get the subobject
    jq '(null|path('"$Filter"')) as $path
        | {($path[-1]): '"$Filter"'}' "$File"
}
An alternative would be to pass $Filter in as a string (e.g. --arg filter "$Filter"), have jq do the parsing, and then use getpath.
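For instance, a minimal sketch of that alternative, assuming the filter is always a plain dot-separated path (no array indices or quoted keys):

GetSubobject()
{
    local Filter="$1"
    local File="$2"
    # Pass the filter in as a string and let jq split it into a path
    jq --arg filter "$Filter" '
        ($filter | ltrimstr(".") | split(".")) as $path
        | {($path[-1]): getpath($path)}' "$File"
}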
It would of course be simplest if GetSubobject could be called with the path separated from the field of interest, like this:
GetSubobject .b.x y filename
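In that case the function body reduces to something like this sketch (again only for simple key paths):

GetSubobject()
{
    local Prefix="$1"   # e.g. .b.x
    local Key="$2"      # e.g. y
    local File="$3"
    jq --arg key "$Key" "$Prefix | {(\$key): .[\$key]}" "$File"
}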
I have a preference for shell commands to get things done. I have a very, very big file -- about 2.8 GB -- and the content is JSON. Everything is on one line, and I was told there are at least 1.5 million records in there.
I must prepare the file for consumption. Each record must be on its own line. Sample:
{"RomanCharacters":{"Alphabet":[{"RecordId":"1",...]},{"RecordId":"2",...},{"RecordId":"3",...},{"RecordId":"4",...},{"RecordId":"5",...} }}
Or, use the following...
{"Accounts":{"Customer":[{"AccountHolderId":"9c585258-c94c-442b-a2f0-1ebbcc274795","Title":"Mrs","Forename":"Tina","Surname":"Wright","DateofBirth":"1988-01-01","Contact":[{"Contact_Info":"9168777943","TypeId":"Mobile Number","PrimaryFlag":"No","Index":"1","Superseded":"No" },{"Contact_Info":"9503588153","TypeId":"Home Telephone","PrimaryFlag":"Yes","Index":"2","Superseded":"Yes" },{"Contact_Info":"acne.pimple#microchimerism.com","TypeId":"Email Address","PrimaryFlag":"No","Index":"3","Superseded":"No" },{"Contact_Info":"swati.singh#microchimerism.com","TypeId":"Email Address","PrimaryFlag":"Yes","Index":"4","Superseded":"Yes" }, {"Contact_Info":"christian.bale#hollywood.com","TypeId":"Email Address","PrimaryFlag":"No","Index":"5","Superseded":"NO" },{"Contact_Info":"15482475584","TypeId":"Mobile_Phone","PrimaryFlag":"No","Index":"6","Superseded":"No" }],"Address":[{"AddressPtr":"5","Line1":"Flat No.14","Line2":"Surya Estate","Line3":"Baner","Line4":"Pune ","Line5":"new","Addres_City":"pune","Country":"India","PostCode":"AB100KP","PrimaryFlag":"No","Superseded":"No"},{"AddressPtr":"6","Line1":"A-602","Line2":"Viva Vadegiri","Line3":"Virar","Line4":"new","Line5":"banglow","Addres_City":"Mumbai","Country":"India","PostCode":"AB10V6T","PrimaryFlag":"Yes","Superseded":"Yes"}],"Account":[{"Field_A":"6884133655531279","Field_B":"887.07","Field_C":"A Loan Product",...,"FieldY_":"2015-09-18","Field_Z":"24275627"}]},{"AccountHolderId":"92a5788f-cd8f-423d-ae5f-4eb0ceb457fd","_Title":"Dr","_Forename":"Christopher","_Surname":"Carroll","_DateofBirth":"1977-02-02","Contact":[{"Contact_Info":"9168777943","TypeId":"Mobile Number","PrimaryFlag":"No","Index":"7","Superseded":"No" },{"Contact_Info":"9503588153","TypeId":"Home Telephone","PrimaryFlag":"Yes","Index":"8","Superseded":"Yes" },{"Contact_Info":"acne.pimple#microchimerism.com","TypeId":"Email Address","PrimaryFlag":"No","Index":"9","Superseded":"No" },{"Contact_Info":"swati.singh#microchimerism.com","TypeId":"Email Address","PrimaryFlag":"Yes","Index":"10","Superseded":"Yes" }],"Address":[{"AddressPtr":"11","Line1":"Flat No.14","Line2":"Surya Estate","Line3":"Baner","Line4":"Pune ","Line5":"new","Addres_City":"pune","Country":"India","PostCode":"AB11TXF","PrimaryFlag":"No","Superseded":"No"},{"AddressPtr":"12","Line1":"A-602","Line2":"Viva Vadegiri","Line3":"Virar","Line4":"new","Line5":"banglow","Addres_City":"Mumbai","Country":"India","PostCode":"AB11O8W","PrimaryFlag":"Yes","Superseded":"Yes"}],"Account":[{"Field_A":"4121879819185553","Field_B":"887.07","Field_C":"A Loan Product",...,"Field_X":"2015-09-18","Field_Z":"25679434"}]},{"AccountHolderId":"4aa10284-d9aa-4dc0-9652-70f01d22b19e","_Title":"Dr","_Forename":"Cheryl","_Surname":"Ortiz","_DateofBirth":"1977-03-03","Contact":[{"Contact_Info":"9168777943","TypeId":"Mobile Number","PrimaryFlag":"No","Index":"13","Superseded":"No" },{"Contact_Info":"9503588153","TypeId":"Home Telephone","PrimaryFlag":"Yes","Index":"14","Superseded":"Yes" },{"Contact_Info":"acne.pimple#microchimerism.com","TypeId":"Email Address","PrimaryFlag":"No","Index":"15","Superseded":"No" },{"Contact_Info":"swati.singh#microchimerism.com","TypeId":"Email Address","PrimaryFlag":"Yes","Index":"16","Superseded":"Yes" }],"Address":[{"AddressPtr":"17","Line1":"Flat No.14","Line2":"Surya Estate","Line3":"Baner","Line4":"Pune ","Line5":"new","Addres_City":"pune","Country":"India","PostCode":"AB12SQR","PrimaryFlag":"No","Superseded":"No"},{"AddressPtr":"18","Line1":"A-602","Line2":"Viva 
Vadegiri","Line3":"Virar","Line4":"new","Line5":"banglow","Addres_City":"Mumbai","Country":"India","PostCode":"AB12BAQ","PrimaryFlag":"Yes","Superseded":"Yes"}],"Account":[{"Field_A":"3288214945919484","Field_B":"887.07","Field_C":"A Loan Product",...,"Field_Y":"2015-09-18","Field_Z":"66264768"}]}]}}
Final outcome should be:
{"RomanCharacters":{"Alphabet":[{"RecordId":"1",...]},
{"RecordId":"2",...},
{"RecordId":"3",...},
{"RecordId":"4",...},
{"RecordId":"5",...} }}
Attempted commands:
sed -e 's/,{"RecordId"/}]},\n{"RecordId"/g' sample.dat
awk '{gsub(",{\"RecordId\"",",\n{\"RecordId\"",$0); print $0}' sample.dat
The attempted commands work perfectly fine for small files, but not for the 2.8 GB file that I must manipulate. sed quits midway after 10 minutes with nothing done; awk errors out with a segmentation fault (core dump) after many hours; and perl's search and replace dies with "Out of memory".
Any help/ideas would be great!
Additional info on my machine:
More than 105 GB disk space available.
8 GB memory
4 cores CPU
Running Ubuntu 14.04
Since you've tagged your question with sed, awk AND perl, I gather that what you really need is a recommendation for a tool. While that's kind of off-topic, I believe that jq is something you could use for this. It will be better than sed or awk because it actually understands JSON. Everything shown here with jq could also be done in perl with a bit of programming.
Assuming content like the following (based on your sample):
{"RomanCharacters":{"Alphabet": [ {"RecordId":"1","data":"data"},{"RecordId":"2","data":"data"},{"RecordId":"3","data":"data"},{"RecordId":"4","data":"data"},{"RecordId":"5","data":"data"} ] }}
You can easily reformat this to "prettify" it:
$ jq '.' < data.json
{
  "RomanCharacters": {
    "Alphabet": [
      {
        "RecordId": "1",
        "data": "data"
      },
      {
        "RecordId": "2",
        "data": "data"
      },
      {
        "RecordId": "3",
        "data": "data"
      },
      {
        "RecordId": "4",
        "data": "data"
      },
      {
        "RecordId": "5",
        "data": "data"
      }
    ]
  }
}
And we can dig in to the data to retrieve only the records you're interested in (regardless of what they're wrapped in):
$ jq '.[][][]' < data.json
{
  "RecordId": "1",
  "data": "data"
}
{
  "RecordId": "2",
  "data": "data"
}
{
  "RecordId": "3",
  "data": "data"
}
{
  "RecordId": "4",
  "data": "data"
}
{
  "RecordId": "5",
  "data": "data"
}
This is much more readable, both by humans and by tools like awk which process content line-by-line. If you want to join your lines for processing per your question, the awk becomes much more simple:
$ jq '.[][][]' < data.json | awk '{printf("%s ",$0)} /}/{printf("\n")}'
{ "RecordId": "1", "data": "data" }
{ "RecordId": "2", "data": "data" }
{ "RecordId": "3", "data": "data" }
{ "RecordId": "4", "data": "data" }
{ "RecordId": "5", "data": "data" }
Or, as @peak suggested in comments, eliminate the awk portion of this entirely by using jq's -c (compact output) option:
$ jq -c '.[][][]' < data.json
{"RecordId":"1","data":"data"}
{"RecordId":"2","data":"data"}
{"RecordId":"3","data":"data"}
{"RecordId":"4","data":"data"}
{"RecordId":"5","data":"data"}
Regarding perl: Try setting the input record separator $/ to "}," like this:
#!/usr/bin/perl
$/ = "},";
while (<>) {
    print "$_\n";
}
or, as a one-liner:
$ perl -e '$/="},";while(<>){print "$_\n"}' sample.dat
Try using } as the record separator, e.g. in Perl:
perl -l -0175 -ne 'print $_, $/' < input
You might need to glue back lines containing only }.
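One way to do that gluing, sketched as a line-by-line awk pass so it stays memory-friendly, is to append any line consisting only of } to the line before it:

awk '{
    if (NR > 1 && $0 ~ /^[[:space:]]*}[[:space:]]*$/) {
        prev = prev $0              # glue a lone "}" onto the previous line
    } else {
        if (NR > 1) print prev      # emit the completed previous line
        prev = $0
    }
}
END { if (NR > 0) print prev }' input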
The following approach avoids the memory problem by not looking at the data as a single record, but it may go too far the other way with respect to performance (processing a single character at a time). Also note that it requires gawk for the built-in RT variable (the value of the current record separator):
$ cat j.awk
BEGIN { RS="[[:print:]]" }
RT == "{" { bal++}
RT == "}" { bal-- }
{ printf "%s", RT }
RT == "," && bal == 2 { print "" }
END { print "" }
$ gawk -f j.awk j.txt
{"RomanCharacters":{"Alphabet":[{"RecordId":"1",...]},
{"RecordId":"2",...},
{"RecordId":"3",...},
{"RecordId":"4",...},
{"RecordId":"5",...} }}
Using the sample data provided here (the one that begins with {"Accounts":{"Customer":...), the solution is a script that reads in the file and, as it is reading, counts the delimiters defined in $/. After every 10,000 delimiters it writes out a new file, and after each delimiter found it inserts a new line. Here is what the script looks like:
#!/usr/bin/perl
$base = "/home/dat789/incoming";
$/ = "}]},";          # delimiter to find and insert a new line after
$n = 0;
$match = "";
$filecount = 0;
$recsPerFile = 10000; # set number of records in a file
print "Processing " . $ARGV[0] . "\n";

while (<>) {
    $match = $match . $_ . "\n";
    $n++;
    print ".";        # This is so that we'd know it has done something
    &WritePart if $n >= $recsPerFile;
}
&WritePart if $match ne "";   # flush the final, partially filled batch
print "Finished\n\n";

sub WritePart {
    my $newfile = "partfile" . $recsPerFile . "-" . $filecount . ".dat";
    open(OUTPUT, '>', $newfile) or die "Cannot open $newfile: $!";
    print OUTPUT $match;
    close(OUTPUT);
    $match = "";
    $filecount++;
    $n = 0;
    print "Wrote file " . $newfile . "\n";
}
I've used this script against the big 2.8 GB file, whose content is an unformatted one-line JSON document. The resulting output files are missing the correct JSON headers and footers, but this can be easily fixed.
Thank you so much guys for contributing!