Adding a JSON array element via jq introduces Unicode characters in the string

I have a JSON file to which I want to append an array element, using bash and the latest jq. I am able to append it, but the resulting string contains Unicode escape characters, as can be seen below. The first element in the validators array is the original and the second is the appended one. (This is not the whole JSON file.)
"validators": [
{
"address": "85BAF568E7F89277E47D3FC8E111775A4F6992FA",
"pub_key": {
"type": "tendermint/PubKeyEd25519",
"value": "BCzCLcW7rZ9VJgAtEUoDN17qcZw8ZvpYbPsL6eOy3No="
},
"power": "10",
"name": ""
},
{
"address": "\u001b[32m\"F75E15A3949824B685A3C5BFCDEED7E3DA4277AE\"\u001b[0m\r",
"pub_key": "\u001b[37m{\u001b[0m\u001b[34;1m\"type\"\u001b[0m\u001b[37m:\u001b[0m\u001b[32m\"tendermint/PubKeyEd25519\"\u001b[0m\u001b[37m,\u001b[0m\u001b[34;1m\"value\"\u001b[0m\u001b[37m:\u001b[0m\u001b[32m\"INeR51z41k6jPAEJ5rV+1TY+4sxnbIykc4bfJFmSCQ8=\"\u001b[0m\u001b[37m\u001b[37m}\u001b[0m\r",
"power": "10",
"name": "node2"
}
]
Printing the element separately prints it without any UTF/Unicode escape characters:
{
"type": "tendermint/PubKeyEd25519",
"value": "BCzCLcW7rZ9VJgAtEUoDN17qcZw8ZvpYbPsL6eOy3No="
}
I append the element using the following command:
cat genesis.json.src | jq --arg pub_key $PK --arg name node$i --arg addr $ADDR '.validators+= [{address: $addr, pub_key: $pub_key, power:"10",name:$name}]' > genesis.json.dest
I am running macOS. Any help or suggestion would be appreciated.

As @choroba mentioned in the comments, these are ANSI colour escape sequences. I removed them by adding the -M flag to jq, which disables colours.
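For anyone hitting the same problem, here is a minimal sketch of the corrected capture. It assumes $ADDR and $PK were originally read from another jq invocation (the source file name priv_validator.json and the node counter $i are assumptions); -M forces monochrome output, and --argjson makes pub_key a real JSON object instead of a string:
ADDR=$(jq -r -M '.address' priv_validator.json)   # -r strips the surrounding quotes
PK=$(jq -c -M '.pub_key' priv_validator.json)     # -c emits the object on one line
jq --arg addr "$ADDR" --argjson pub_key "$PK" --arg name "node$i" \
   '.validators += [{address: $addr, pub_key: $pub_key, power: "10", name: $name}]' \
   genesis.json.src > genesis.json.dest
The stray \r in the captured values also suggests CRLF line endings; piping through tr -d '\r' would strip those if they persist.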

Related

Replacing a json object value with constant string

I have the JSON content below in a JSON file. The "pos" value is variable, and I need to replace it with a constant like NNNN.
{
"qual_table": "TKGGU1.TCUSTORD",
"op_type": "INSERT",
"op_ts": "YYYY-MM-DDTHH:MI:SS.FFFFFFZ",
"pos": "G-AAAAAJcRAAAAAAAAAAAAAAAAAAAHAAoAAA==10687850.2.31.996",
"xid": "0.2.31.996",
"after": {
"CUST_CODE": "BILL",
"ORDER_DATE": "YYYY-MM-DD:HH:MI:SS",
"PRODUCT_CODE": "CAR",
"ORDER_ID": 765,
"PRODUCT_PRICE": 1500000,
"PRODUCT_AMOUNT": 3,
"TRANSACTION_ID": 100
}
}
I tried perl -i -pe 's/[A-Z]-[A-Z]*\.==*/ NNNN/g' filename.json but this is not working.
Could you suggest a regular expression for this? Please note the value is variable and has a different length every time.
jq is the de-facto standard tool for working with JSON from scripts and the command line. It makes updating a field of an object with a new value trivial:
$ jq '.pos = "NNNN"' input.json
{
"qual_table": "TKGGU1.TCUSTORD",
"op_type": "INSERT",
"op_ts": "YYYY-MM-DDTHH:MI:SS.FFFFFFZ",
"pos": "NNNN",
"xid": "0.2.31.996",
"after": {
"CUST_CODE": "BILL",
"ORDER_DATE": "YYYY-MM-DD:HH:MI:SS",
"PRODUCT_CODE": "CAR",
"ORDER_ID": 765,
"PRODUCT_PRICE": 1500000,
"PRODUCT_AMOUNT": 3,
"TRANSACTION_ID": 100
}
}
(Note that jq pretty-prints by default even when standard output is a pipe or file; only colourised output depends on a terminal. Use the -c flag if you want compact, single-line output.)
The equivalent perl one(ish)-liner would be something like
perl -MJSON::MaybeXS -MFile::Slurper=read_text -E '
my $json = JSON::MaybeXS->new;
my $obj = $json->decode(read_text($ARGV[0]));
$obj->{pos} = "NNNN";
say $json->encode($obj);' input.json
(Adjust as needed for your preferred JSON and file-reading modules)
perl -pe 's/"pos"\s*:\s*"\K[^"]*/NNNN/g' filename.json
should work for any length of pos as long as there is no quote character or newline in the string.
As Shawn pointed out it'd be better to use a json-manipulating tool instead of a string-manipulating one.
Otherwise you'll have to rely on assumptions.
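One more note: jq has no equivalent of perl's -i flag, so for an in-place update a common pattern (a sketch, assuming mktemp is available) is to write to a temporary file and move it over the original:
tmp=$(mktemp) &&
jq '.pos = "NNNN"' filename.json > "$tmp" &&
mv "$tmp" filename.json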

Rename duplicate keys in JSON array data

I have JSON data as below.
{
"Data":
[
"User": [
{"Name": "Solomon", "Age":20},
{"Name": "Absolom", "Age":30},
]
"Country": [
{"Name" : "US", "Resident" : "Permanent"},
{"Name" : "UK", "Resident" : "Temporary"}
]]}
There are two tags with the same keys:
in User there is a Name key, and in Country I also have a Name key. I need to preprocess the JSON file to differentiate the keys. My expected result is below. I tried awk and sed commands, but could not find a proper solution. Any suggestion would be helpful.
Expected result:
{
"Data":
[
"User": [
{"User_Name": "Solomon", "User_Age":20},
{"User_Name": "Absolom", "User_Age":30},
]
"Country": [
{"Country_Name" : "US", "Country_Resident" : "Permanent"},
{"Country_Name" : "UK", "Country_Resident" : "Temporary"}
]]}
The tag name should be prefixed to the attribute name.
This is what I have tried:
jq '[.[] | .["User_Name"] = .Name]' file_name.json
But it changes the keys in both tags, User as well as Country.
With the permission of the OP, here's a jtc-based solution while waiting for the jq one (assuming the input JSON is fixed):
bash $ <file.json jtc -w'<Data>l[:]<L>k<.*>L:<>k' -u'"{L}_{}";' -tc
{
"Data": {
"Country": [
{ "Country_Name": "US", "Country_Resident": "Permanent" },
{ "Country_Name": "UK", "Country_Resident": "Temporary" }
],
"User": [
{ "User_Age": 20, "User_Name": "Solomon" },
{ "User_Age": 30, "User_Name": "Absolom" }
]
}
}
bash $
Explanation of the jtc parameters:
-w'<Data>l[:]<L>k<.*>L:<>k':
the walk path (-w) selects the Data label (<Data>l),
then each of the nested elements ([:]),
memorizes each element's key/label in the namespace L (<L>k),
then finds every labelled element using a REGEX label search (<.*>L:),
and finally reinterprets the found element's key/label as a value (<>k).
-u'"{L}_{}";':
for each label found in step 1, the update operation (-u) is applied using the template "{L}_{}";, where {L} is interpolated with the value preserved in the namespace L and {} is interpolated with the currently found label (at each iteration of the walk path);
the trailing ; (or any other symbol) is required to distinguish the argument of -u from a literal JSON.
-tc displays the JSON in a semi-compact form.
PS. I'm the creator of jtc, a unix JSON processing tool. The disclaimer is required by SO.
As originally posted, neither the illustrative input nor the corresponding output is valid JSON, but the following has been tested using JSON based on the shown input:
.Data |= ( (.User |= map(with_entries(.key |= ("User_" + .))))
| (.Country |= map(with_entries(.key |= ("Country_" + .)))) )
Of course, the above may need tweaking depending on the actual requirements, and can be generalized in various ways, e.g. as shown below.
A generalization
.Data |= with_entries( (.key + "_") as $newkey
| .value |= map(with_entries(.key |= ($newkey + .))))
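Assuming the input has first been corrected to valid JSON (with Data as an object holding the User and Country arrays, in a file assumed to be named file_name.json), the generalized filter runs in the usual way:
jq '.Data |= with_entries( (.key + "_") as $newkey
      | .value |= map(with_entries(.key |= ($newkey + .))))' file_name.json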
Here is an approach using jq streaming:
fromstream(tostream | .[0] |= if length < 4 then . else .[3]="\(.[1])_\(.[3])" end)
It works by using tostream to convert your input to a stream of arrays
[["Data","Country",0,"Name"],"US"]
[["Data","Country",0,"Resident"],"Permanent"]
[["Data","Country",0,"Resident"]]
[["Data","Country",1,"Name"],"UK"]
[["Data","Country",1,"Resident"],"Temporary"]
[["Data","Country",1,"Resident"]]
[["Data","Country",1]]
[["Data","User",0,"Age"],20]
[["Data","User",0,"Name"],"Solomon"]
[["Data","User",0,"Name"]]
[["Data","User",1,"Age"],30]
[["Data","User",1,"Name"],"Absolom"]
[["Data","User",1,"Name"]]
[["Data","User",1]]
[["Data","User"]]
[["Data"]]
then applying a simple update assignment |= expression to transform the stream into
[["Data","Country",0,"Country_Name"],"US"]
[["Data","Country",0,"Country_Resident"],"Permanent"]
[["Data","Country",0,"Country_Resident"]]
[["Data","Country",1,"Country_Name"],"UK"]
[["Data","Country",1,"Country_Resident"],"Temporary"]
[["Data","Country",1,"Country_Resident"]]
[["Data","Country",1]]
[["Data","User",0,"User_Age"],20]
[["Data","User",0,"User_Name"],"Solomon"]
[["Data","User",0,"User_Name"]]
[["Data","User",1,"User_Age"],30]
[["Data","User",1,"User_Name"],"Absolom"]
[["Data","User",1,"User_Name"]]
[["Data","User",1]]
[["Data","User"]]
[["Data"]]
then reversing the transformation with fromstream.
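Putting it together as a single invocation (again assuming the corrected input is in file_name.json):
jq 'fromstream(tostream | .[0] |= if length < 4 then . else .[3]="\(.[1])_\(.[3])" end)' file_name.json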

How to parse / substitute a value in a JSON file with jq

How do I parse / substitute a value with jq?
The JSON file is as below:
{
"09800214851900C3": {
"label": "P7-R1-R16:S2",
"name": "Geist Upgradable rPDU",
"state": "normal",
"order": 0,
"type": "i03",
"snmpInstance": 1,
"lifetimeEnergy": "20155338",
"outlet": {},
"alarm": {
"severity": "",
"state": "none"
},
"layout": {
"0": [
"entity/total0",
"entity/phase0",
"entity/phase1",
"entity/phase2"
]
}
}
}
I want to do something like below, but it is not working. Any ideas/leads on this will be appreciated.
a=09800214851900C3
jsonfile.json | jq '.${a}.label'
Your current attempt has the following problems:
jsonfile.json isn't a command, so you can't use it as the first token of a command line. You could cat jsonfile.json | jq ..., but the preferred way to have jq work on a file is jq 'command' file
you define a variable a in your shell, but you try to reference it inside a single-quoted string, which prevents the shell from expanding it to its actual value. A shell based solution is to use double-quotes to have the variable expanded, but it's preferable to define the variable in the context of jq itself, using a --arg varname value option
09800214851900C3 isn't considered a "simple, identifier-like key" by jq (because it starts with a digit), so the standard way of accessing the value associated to this key (.key) doesn't work and you need to use ."09800214851900C3" or .["09800214851900C3"] instead
In conclusion, I believe you will want to use the following command:
jq --arg a 09800214851900C3 '.[$a].label' jsonfile.json
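For completeness, here is the whole thing with the shell variable, run against the file shown above; without -r, jq prints the value with its quotes:
a=09800214851900C3
jq --arg a "$a" '.[$a].label' jsonfile.json
"P7-R1-R16:S2"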

How to parse an asterisk in a JSON file as a string via jq

Here is a JSON file named test.json for testing:
{
"name": "Google",
"location": {
"street": "1600 Amphitheatre Parkway",
"city": "Mountain View",
"state": "California",
"country": "US"
},
"employees": [
{
"name": "Michael",
"division": "Engineering"
},
{
"name": "Laura",
"division": "HR"
},
{
"name": "Elise",
"division": "Marketing * test"
}
]
}
If I use the jq command below to parse it:
cat test.json | jq -r '.employees[2].division'
it works well and gives the correct result:
Marketing * test
But if I use $(), bad things happen!
echo $(cat test.json | jq -r '.employees[2].division')
The result lists all the file names in the current folder, like:
my1.json my2.json test.json test ...
I guess $() expands the asterisk * as a shell glob rather than treating it as a string.
So how do I keep the asterisk (*) from the JSON file as a plain string when using jq? I am using Google Cloud Platform and Ubuntu 17.10.
Always use double-quotes around a command substitution to prevent the * from being expanded. The * is a special character in the shell: a wildcard that expands to all the files in the current working directory. You need to quote it to strip it of its special meaning (refer to the GNU bash man page under the Parameters section).
Also, jq can read the file directly, so you can avoid the useless use of cat.
result="$(jq -r '.employees[2].division' < test.json)"
echo "$result"
should produce the result as expected.
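To see the difference side by side (a sketch using the same test.json):
# Unquoted: the shell word-splits the substituted text and glob-expands the *,
# so it matches the file names in the current directory.
echo $(jq -r '.employees[2].division' < test.json)
# Quoted: the substituted text is passed through verbatim.
echo "$(jq -r '.employees[2].division' < test.json)"
Only the second echo prints Marketing * test.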

mongoexport - Leaf Level - JSON to CSV conversion - egrep not working with multiple patterns using "|" pipe or with -f option

Why is egrep not giving me all the matching entries?
This is my simple JSON blob:
[nukaNUKA@dev-machine csv]$ cat jsonfile.json
{"number": 303,"projectName": "giga","queueId":8881,"result":"SUCCESS"}
This is my pattern file (so that I don't scare the editor):
[nukaNUKA@dev-machine csv]$ cat egrep-pattern.txt
\"number\":.*\"projectName
\"projectName\":.*,\"queueId
\"queueId\":.*,\"result
\"result\":\".*$
This is the egrep/grep command for individual searches, and it works:
[nukaNUKA@dev-machine csv]$ egrep -o "\"number\":.*\"projectName" jsonfile.json
"number": 303,"projectName
[nukaNUKA@dev-machine csv]$ egrep -o "\"projectName\":.*,\"queueId" jsonfile.json
"projectName": "giga","queueId
[nukaNUKA@dev-machine csv]$ egrep -o "\"queueId\":.*,\"result" jsonfile.json
"queueId":8881,"result
[nukaNUKA@dev-machine csv]$ egrep -o "\"result\":\".*$" jsonfile.json
"result":"SUCCESS"}
So why didn't this work? I don't wear glasses, yes.
[nukaNUKA@dev-machine csv]$ egrep -o "\"number\":.*\"projectName|\"projectName\":.*,\"queueId|\"queueId\":.*,\"result|\"result\":\".*$" jsonfile.json
"number": 303,"projectName
"queueId":8881,"result
[nukaNUKA@dev-machine csv]$ egrep -o -f egrep-pattern.txt jsonfile.json
"number": 303,"projectName
"queueId":8881,"result
[nukaNUKA@dev-machine csv]$
I have a complex nested JSON blob, and because everything is unstructured it seems I can't use jq or JSONV or a Python script: the data I'm looking for is stored in arrays of one-entry dictionaries (key=value) that reuse the same key names (ex: { "parameters": [ { "name": "jobname", "value": "shenzi" }, { "name": "pipelineVersion", "value": "1.2.3.4" }, ...so on..., ... ] }), and jobname, pipelineVersion, and similar parameter names are not at the same index[X] location in every JSON entry I have.
Worst case, I can add conditional checks to see whether the key at each index matches jobname etc., and then grab the fields I'm looking for, but there are hundreds of such fields and I don't want to hard-code them if possible.
I thought that since each JSON entry is on one line, I could simply write some cool patterns (ugly, I know); then I wouldn't need the conditional code and could just use BASH/sed/tr/cut power to get what I need. But as shown above, egrep -f and -o didn't work.
Sample JSON blob object (from one Jenkins job). There are different Jenkins build jobs' JSON blob entries (each with a different JSON structure, parameters, etc.) in a single JenkinsJobsBuild collection in MongoDB.
A sample JSON blob object follows:
{
"_id": {
"$oid": "5120349es967yhsdfs907c4f"
},
"actions": [
{
"causes": [
{
"shortDescription": "Started by an SCM change"
}
]
},
{
},
{
"oneClickDeployPossible": false,
"oneClickDeployReady": false,
"oneClickDeployValid": false
},
{
},
{
},
{
},
{
"cspec": "element * ...\/MyProject_latest_int\/LATESTnelement * ...\/MyProject_integration\/LATESTnelement \/vobs\/some_vob\/gigi \/main\/myproject_integration\/MyProject_Slot_0_maint_int\/LATESTnelement * ...\/myproject_integration\/LATESTnelement \/vobs\/some_vob \/main\/LATEST",
"latestBlsOnConfiguredStream": null,
"stream": null
},
{
},
{
"parameters": [
{
"name": "CLEARCASE_VIEWTAG",
"value": "jenkins_MyProject_latest"
},
{
"name": "BUILD_DEBUG",
"value": false
},
{
"name": "CLEAN_BUILD",
"value": true
},
{
"name": "BASEVERSION",
"value": "7.4.1"
},
{
"name": "ARTIFACTID",
"value": "lowercaseprojectname"
},
{
"name": "SYSTEM",
"value": "myprojectSystem"
},
{
"name": "LOT",
"value": "02"
},
{
"name": "PIPENUMBER",
"value": "7.4.1.303"
}
]
},
{
},
{
},
{
"parameters": [
{
"name": "DESCRIPTION_SETTER_DESCRIPTION",
"value": "lowercaseprojectname_V7.4.1.303"
}
]
},
{
},
{
},
{
},
{
}
],
"artifacts": [
],
"building": false,
"builtOn": "servername",
"changeSet": {
"items": [
{
"affectedPaths": [
"vobs\/some_vob\/myproject\/apps\/app1\/Java\/test\/src\/com\/giga\/highlevelproject\/myproject\/schedule\/validation\/SomeActivityTest.java"
],
"author": {
"absoluteUrl": "http:\/\/11.22.33.44:8080\/user\/hitj1620",
"fullName": "name1, name2 A"
},
"commitId": null,
"date": {
"$numberLong": "1489439532000"
},
"dateStr": "13\/03\/2017 21:12:12",
"elements": [
{
"action": "create version",
"editType": "edit",
"file": "vobs\/some_vob\/myproject\/apps\/app1\/Java\/test\/src\/com\/giga\/highlevelproject\/myproject\/schedule\/validation\/SomeActivityTest.java",
"operation": "checkin",
"version": "\/main\/MyProject_latest_int\/2"
}
],
"msg": "",
"timestamp": -1,
"user": "user111"
}
],
"kind": null
},
"culprits": [
{
"absoluteUrl": "http:\/\/11.22.33.44:8080\/user\/nuka1620",
"fullName": "nuka, Chuck"
}
],
"description": "lowercaseprojectname_V7.4.1.303",
"displayName": "#303",
"duration": 525758,
"estimatedDuration": 306374,
"executor": null,
"fullDisplayName": "MyProject \u00bb MyProject-build #303",
"highlevelproject_metrics_source_url": "http:\/\/11.22.33.44:8080\/job\/MyProject\/job\/MyProject-build\/303\/\/api\/json",
"id": "303",
"keepLog": false,
"number": 303,
"projectName": "MyProject-build",
"queueId": 8201,
"result": "SUCCESS",
"timeToRepair": null,
"timestamp": {
"$numberLong": "1489439650307"
},
"url": "http:\/\/11.22.33.44:8080\/job\/MyProject\/job\/MyProject-build\/303\/"
}
When the regexes are in a file, you don't have to escape double quotes; you don't have to fight to get your double quotes past the shell.
"number":.*"projectName
"projectName":.*,"queueId
"queueId":.*,"result
"result":".*$
When that's fixed, I get:
$ egrep -o -f egrep-pattern.txt jsonfile.json
"number": 303,"projectName
"queueId":8881,"result
$
The trouble now is, I think, that you've consumed the projectName with the first pattern, so the others don't get a chance to match it. Change the patterns to read up to a comma and you can get better results:
"number":[^,]*
"projectName":[^,]*
"queueId":[^,]*
"result":".*$
yields:
"number": 303
"projectName": "giga"
"queueId":8881
"result":"SUCESS"}
You could try to be more delicate, but you rapidly reach a point where a JSON-aware tool becomes more sensible. Commas in a string value would mess up the modified regexes, for example. (So, if the project name was "Giga, if not Tera", you'd have problems.)
Matching more general JSON name:value notation
As long as you're looking for simple "key":"quoted value" objects, you can use the following grep -E (aka egrep) command:
grep -Eoe '"[^"]+":"((\\(["\\/bfnrt]|u[0-9a-fA-F]{4}))|[^"])*"' data
Given the JSON-like data (in the file called data):
{"key1":"value","key2":"value2 with \"quoted\" text","key3":"value3 with \\ and \/ and \f and \uA32D embedded"}
that script produces:
"key1":"value"
"key2":"value2 with \"quoted\" text"
"key3":"value3 with \\ and \/ and \f and \uA32D embedded"
You can upgrade it to handle almost any valid JSON "key":value by using:
grep -Eoe '"[^"]+":(("((\\(["\\/bfnrt]|u[0-9a-fA-F]{4}))|[^"])*")|true|false|null|(-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][-+]?[0-9]+)?))' data
With a new data file containing:
{"key1":"value","key2":"value2 with \"quoted\" text"}
{"key3":"value3 with \\ and \/ and \f and \uA32D embedded"}
{"key4":false,"key5":true,"key6":null,"key7":7,"key8":0,"key9":0.123E-23}
{"key10":10,"key11":3.14159,"key12":0.876,"key13":-543.123}
the script produces:
"key1":"value"
"key2":"value2 with \"quoted\" text"
"key3":"value3 with \\ and \/ and \f and \uA32D embedded"
"key4":false
"key5":true
"key6":null
"key7":7
"key8":0
"key9":0.123E-23
"key10":10
"key11":3.14159
"key12":0.876
"key13":-543.123
You can follow the railroad diagrams in the outline JSON specification at http://json.org to see how I created the regex.
It could be enhanced by the judicious addition of [[:space:]]* in places where spaces are permitted but not required — before the key string, before the colon, after the colon (you could add it after the value too, but you probably don't want that).
Another simplification I've made is that the key doesn't allow for the various escape sequences that the value string does; you could replicate that handling for the key too.
And, of course, this only works for 'leaf' name:value pairs; if a value is itself an object {…} or an array […], this doesn't handle the value as a whole.
However, this just goes to emphasize that it gets very messy very quickly and you would be better off using a special-purpose JSON query tool. One such tool is jq, as mentioned in a comment to the main query.
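For example, with the Jenkins blob shown above saved as jsonfile.json (a sketch; PIPENUMBER is just one of the parameter names from the sample), jq can select a parameter by name instead of by index, which sidesteps the shifting-index problem described in the question:
jq -r '.actions[] | select(.parameters?) | .parameters[]
       | select(.name == "PIPENUMBER") | .value' jsonfile.json
This prints 7.4.1.303 regardless of where in the actions array the parameters block sits.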
The complex JSON blob I had was from Jenkins (i.e., a Jenkins job's REST API data) stored in a MongoDB database.
To grab it from MongoDB, I successfully used the mongoexport command to generate the JSON blob (non-JSON-array, non-pretty format):
#!/bin/bash
server=localhost
db=database_Jenkins          # the Jenkins database named in the mongo URI below
collectionFile=collections.txt
exportDir=~/mongoDB_fetch    # where the per-collection JSON exports land
## Generate a collection file containing all collections in the Jenkins database in MongoDB.
( set -x
mongo "mongoDbServer.company.com/database_Jenkins" --eval "rs.slaveOk();db.getCollectionNames()" --quiet > ${collectionFile}
)
## create collection based JSON files
for collection in $(cat ${collectionFile} | sed -e 's:,: :g')
do
mongoexport --host ${server} --db ${db} --collection "${collection}" --out ${exportDir}/${collection}.json
##mongoexport --host ${server} --db ${db} --collection "${collection}" --type=csv --fieldFile ~/mongoDB_fetch/get_these_csv_fields.txt --out ${exportDir}/${collection}.csv; ## This didn't work if you have nested fields. fieldFile file was just containing field name per line in a particular xyz.IndexNumber.yyy format.
done
I tried the built-in mongoexport command's --type=csv with -f fields to catch topfield.0.subField, field2, field3.7.parameters.7.. but nothing worked.
PS: The number after the . mark is how you define indexes if you are creating a CSV file and using fields (mandatory) with the mongoexport command.
As my JSON structure was all unstructured (Jenkins version bumps/upgrades in the past meant the data about a job did not share one structure), I tried this final sed trick (each JSON entry was on its own individual line).
This sed command (shown below) will give you all the keys and their values (in key=value format), one per line, at the LEAF level of almost any JSON blob, or at least of the Jenkins one. Once you have this info, you can feed the output of the command to a temporary file, read the value part (after the = mark), and create your CSV file accordingly. Yes, you have to sort it so that your CSV file's fields stay in order for the header names, and thus the values are inserted into the right column/field. I calculated the field names from the temporary key=value output generated for all the different collection JSON files; then I read each temporary collection file and added the values into the final CSV file under the respective header/field/column.
OK, this is a weird solution, but at least it's a solution, in a one-liner.
cat myJenkinsJob.json | sed "s/{}//g;s/,,*/,/g;s/},\"/\n/g;s/},{/\n/g;s/\([^\"a-zA-Z]\),\"/\1\n/g;s/:\[{/\n/g;s/\"name\":\"//g;s/\",\"value//g;s/,\"/\n/g;s/\":\"*/=/g;s/\"//g;s/[\[}\]]//g;s/[{}]//g;s/\$[a-zA-Z][a-zA-Z]*=//g"|grep "=" | sed "s/,$//"|egrep -v "=-|=$|=\[|^_class="
Tweak the sed part a little to fit your own data if your JSON blob shows funny characters that you don't want. The order of the sed operations above is important. I'm also excluding redundant variables that I don't need at this time (for example, the JSON blob contained _class="..." values), so I filter those out via egrep -v after the last | pipe.
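For comparison, jq can emit a similar leaf-level key=value listing without the sed gymnastics (a sketch; the dotted-path key format is my own choice, not the exact format the one-liner above produces):
jq -r 'leaf_paths as $p | "\($p | map(tostring) | join("."))=\(getpath($p))"' myJenkinsJob.json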
Tweak this acc. to your own solution for the sed part a little bit, if your JSON blob shows you funny characters that you don't want. The order of sed operations below is important. I'm also excluding any redundant variables (that I don't need at this time, for ex: JSON blob contained _class="..." values) so I'm excluding those via egrep -v after the last | pipe.