Transform JSON to add array objects

I need to transform an array by adding additional objects -
I have:
"user_id":"testuser"
"auth_token":"abcd"
I need:
"key":"user_id"
"value":"testuser"
"key":"auth_token"
"value":"abcd"
I have been using jq but can't figure out how to do it. Do I need to transform this into a multi-dimensional array first?
I have tried multiple jq queries but can't find a suitable one.
When I try using jq I get:
jq: error: syntax error, unexpected $end, expecting QQSTRING_TEXT or QQSTRING_INTERP_START or QQSTRING_END (Unix shell quoting issues?) at , line 1

Your input is not json, it's just a bunch of what could be thought of as key/value pairs. Assuming your json input actually looked like this:
{
"user_id": "testuser",
"auth_token": "abcd"
}
You could get an array of key/value pair objects using to_entries.
$ jq 'to_entries' input.json
[
{
"key": "user_id",
"value": "testuser"
},
{
"key": "auth_token",
"value": "abcd"
}
]
If on the other hand your input was actually that, you would need to convert it to a format that can be processed. Fortunately you could read it in as a raw string and probably parse using regular expressions or basic string manipulation.
$ jq -Rn '[inputs|capture("\"(?<key>[^\"]+)\":\"(?<value>[^\"]*)\"")]' input.txt
$ jq -Rn '[inputs|split(":")|map(fromjson)|{key:.[0],value:.[1]}]' input.txt

You can use to_entries filter for that.

Robust conversion of key:value lines to JSON.
If the key:value specifications would be valid JSON except for the missing punctuation (opening and closing braces etc.), then a simple and quite robust approach to converting these key:value pairs to a single valid JSON object is illustrated by the following:
cat <<EOF | jq -nc -R '["{" + inputs + "}" | fromjson] | add'
"user_id": "testuser"
"auth_token" : "abcd"
EOF
Output
{
"user_id": "testuser",
"auth_token": "abcd"
}
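The wrapping trick above can also be piped straight into to_entries to get the key/value array the question asks for. This is a minimal sketch, assuming every input line is a valid JSON key/value pair; the variable name `entries` is only illustrative.

```shell
# Wrap each raw line in braces, parse it, merge the objects, then list the entries.
entries=$(printf '%s\n' '"user_id":"testuser"' '"auth_token":"abcd"' |
  jq -nRc '[inputs | "{" + . + "}" | fromjson] | add | to_entries')
printf '%s\n' "$entries"
```

With the two sample lines this should print a compact array holding one {"key": ..., "value": ...} object per line of input.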

Related

How to prettify json with jq given a string with escaped double quotes

I would like to pretty print a json string I copied from an API call which contains escaped double quotes.
Similar to this:
"{\"name\":\"Hans\", \"Hobbies\":[\"Car\",\"Swimming\"]}"
However, when I execute pbpaste | jq "." the result is unchanged and still on one line.
I guess the problem is the escaped double quotes, but I don't know how to tell jq to remove them, if that is possible, or what other workaround exists.
What you have is a JSON string which happens to contain a JSON object. You can decode the contents of the string with the fromjson function.
$ pbpaste | jq "."
"{\"name\":\"Hans\", \"Hobbies\":[\"Car\",\"Swimming\"]}"
$ pbpaste | jq "fromjson"
{
"name": "Hans",
"Hobbies": [
"Car",
"Swimming"
]
}
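Once the string has been decoded, the usual path expressions apply to the inner document. The sketch below assumes the same sample string as above, held in an illustrative shell variable `encoded` instead of the clipboard.

```shell
# A JSON string whose content is itself a JSON document (e.g. copied from an API response).
encoded='"{\"name\":\"Hans\", \"Hobbies\":[\"Car\",\"Swimming\"]}"'

# Decode the inner document, then pick a field from it and join the array elements.
hobbies=$(printf '%s\n' "$encoded" | jq -r 'fromjson | .Hobbies | join(",")')
printf '%s\n' "$hobbies"
```

The single quotes keep the backslashes literal, so jq receives the string exactly as it appears in the question.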

Convert json filtered into csv with jq

I have a file that looks like this:
$ cat sample-test.json |jq .
{
"logRef": "c4fa4367-23f6-462f-b5fd-f972d0916a30",
"timestamp": 1563268297545,
"someOtherField": "nonImportantValue"
}
{
"logRef": "c4fa4367-23f6-462f-b5fd-f972d0916a31",
"timestamp": 1563268297595,
"someOtherField2": "nonImportantValue3"
}
And I would like to convert it to csv like this:
logRef;timestamp
c4fa4367-23f6-462f-b5fd-f972d0916a30;1563268297545
c4fa4367-23f6-462f-b5fd-f972d0916a31;1563268297595
I was trying
$ cat sample-test.json |jq '.logRef, .timestamp |@csv'
jq: error (at <stdin>:1): string ("c4fa4367-2...) cannot be csv-formatted, only array
jq: error (at <stdin>:2): string ("c4fa4367-2...) cannot be csv-formatted, only array
Your input is fine (it's a JSON stream).
The problem with your filter is that @csv expects an array. So this will work:
[.logRef,.timestamp] | @csv
However, it quotes strings, so if you want your strings unquoted (which might mean the result won't be CSV), then you could use:
"\(.logRef),\(.timestamp)"
In all cases, you'll need to use jq's -r command-line option.
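To get the semicolon-separated layout from the question, including the header row, one option is to build each line with join instead of a CSV formatter. This is a sketch rather than a general CSV writer: it assumes no field contains a semicolon, a quote, or a newline, and the variable name `csv` is illustrative.

```shell
# -n plus `inputs` lets the header be emitted once, before the stream is consumed.
csv=$(printf '%s\n' \
  '{"logRef":"c4fa4367-23f6-462f-b5fd-f972d0916a30","timestamp":1563268297545,"someOtherField":"nonImportantValue"}' \
  '{"logRef":"c4fa4367-23f6-462f-b5fd-f972d0916a31","timestamp":1563268297595,"someOtherField2":"nonImportantValue3"}' |
  jq -rn '"logRef;timestamp", (inputs | [.logRef, .timestamp] | map(tostring) | join(";"))')
printf '%s\n' "$csv"
```

The map(tostring) step converts the numeric timestamp explicitly, so the join does not depend on how a given jq version treats non-string elements.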
The problem is in your JSON file. It looks like it has an incorrect format (no root array element [] and no commas between documents). If you fix it, jq will work as expected.
> cat sample-test.json
[{
"logRef": "c4fa4367-23f6-462f-b5fd-f972d0916a30",
"timestamp": 1563268297545,
"someOtherField": "nonImportantValue"
},
{
"logRef": "c4fa4367-23f6-462f-b5fd-f972d0916a31",
"timestamp": 1563268297595,
"someOtherField2": "nonImportantValue3"
}]
cat sample-test.json |jq -r 'map(.logRef), map(.timestamp) | @csv'
"c4fa4367-23f6-462f-b5fd-f972d0916a30","c4fa4367-23f6-462f-b5fd-f972d0916a31"
1563268297545,1563268297595
I've also fixed the command with map() function.

jq - parsing & replacement based on key-value pairs within json

I have a json file in the form of a key-value map. For example:
{
"users":[
{
"key1":"user1",
"key2":"user2"
}
]
}
I have another JSON file. The values in the second file have to be replaced based on the keys in the first file.
For example 2nd file is:
{
"info":
{
"users":["key1","key2","key3","key4"]
}
}
This second file should be replaced with
{
"info":
{
"users":["user1","user2","key3","key4"]
}
}
This is because the value of key1 in the first file is user1. This could be done with any Python program, but I am learning jq and would like to see whether it is possible with jq itself. I tried different combinations of reading the file using slurpfile, then select and walk, etc., but couldn't arrive at the required solution.
Any suggestions for the same will be appreciated.
Since .users[0] is a JSON dictionary, it would make sense to use it as such (e.g. for efficiency):
Invocation:
jq -c --slurpfile users users.json -f program.jq input.json
program.jq:
$users[0].users[0] as $dict
| .info.users |= map($dict[.] // .)
Output:
{"info":{"users":["user1","user2","key3","key4"]}}
Note: the above assumes that the dictionary contains no null or false values, or rather that any such values in the dictionary should be ignored. This avoids the double lookup that would otherwise be required. If this assumption is invalid, then a solution using has or in (e.g. as provided by RomanPerekhrest) would be appropriate.
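The difference described in the note can be seen with a dictionary that stores a null. The sketch below uses made-up sample data passed with --argjson instead of the files from the question; `dict`, `doc`, `with_alt`, and `with_has` are illustrative names.

```shell
# Hypothetical data: key2 is present in the dictionary but maps to null.
dict='{"key1":"user1","key2":null}'
doc='{"info":{"users":["key1","key2","key3"]}}'

# Alternative operator: a null (or false) dictionary value falls back to the original key.
with_alt=$(printf '%s\n' "$doc" |
  jq -c --argjson dict "$dict" '.info.users |= map($dict[.] // .)')

# Explicit membership test: the null stored under key2 is kept.
with_has=$(printf '%s\n' "$doc" |
  jq -c --argjson dict "$dict" \
    '.info.users |= map(. as $k | if ($dict | has($k)) then $dict[$k] else $k end)')

printf '%s\n' "$with_alt" "$with_has"
```

The first command leaves "key2" in place, while the second replaces it with null, which is the double-lookup behaviour the note refers to.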
Solution to supplemental problem
(See "comments".)
$users[0].users[0] as $dict
| second
| .info.users |= (map($dict[.] | select(. != null)))
sponge
It is highly inadvisable to use redirection to overwrite an input file.
If you have or can install sponge, then it would be far better to use it. For further details, see e.g. "What is jq's equivalent of sed -i?" in the jq FAQ.
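If sponge is not available, a common alternative is to write to a temporary file and rename it over the original only when jq succeeds. The sketch below assumes a scratch directory and a made-up filter (ascii_upcase) purely for demonstration; none of the file names come from the question.

```shell
# Work in a scratch directory so the example does not touch real files.
workdir=$(mktemp -d)
printf '%s\n' '{"info":{"users":["key1","key2"]}}' > "$workdir/input.json"

# Write the result to a temporary file first; replace the original only if jq succeeded.
tmp=$(mktemp "$workdir/input.json.XXXXXX")
jq -c '.info.users |= map(ascii_upcase)' "$workdir/input.json" > "$tmp" &&
  mv "$tmp" "$workdir/input.json"

updated=$(cat "$workdir/input.json")
printf '%s\n' "$updated"
rm -r "$workdir"
```

Because the output goes to a separate file, a failing jq run leaves the original input untouched, which plain redirection onto the input file cannot guarantee.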
jq solution:
jq --slurpfile users 1st.json '$users[0].users[0] as $users
| .info.users |= map(if in($users) then $users[.] else . end)' 2nd.json
The output:
{
"info": {
"users": [
"user1",
"user2",
"key3",
"key4"
]
}
}

jq construct with value strings spanning multiple lines

I am trying to form a JSON construct using jq that should ideally look like below:-
{
"api_key": "XXXXXXXXXX-7AC9-D655F83B4825",
"app_guid": "XXXXXXXXXXXXXX",
"time_start": 1508677200,
"time_end": 1508763600,
"traffic": [
"event"
],
"traffic_including": [
"unattributed_traffic"
],
"time_zone": "Australia/NSW",
"delivery_format": "csv",
"columns_order": [
"attribution_attribution_action",
"attribution_campaign",
"attribution_campaign_id",
"attribution_creative",
"attribution_date_adjusted",
"attribution_date_utc",
"attribution_matched_by",
"attribution_matched_to",
"attribution_network",
"attribution_network_id",
"attribution_seconds_since",
"attribution_site_id",
"attribution_site_id",
"attribution_tier",
"attribution_timestamp",
"attribution_timestamp_adjusted",
"attribution_tracker",
"attribution_tracker_id",
"attribution_tracker_name",
"count",
"custom_dimensions",
"device_id_adid",
"device_id_android_id",
"device_id_custom",
"device_id_idfa",
"device_id_idfv",
"device_id_kochava",
"device_os",
"device_type",
"device_version",
"dimension_count",
"dimension_data",
"dimension_sum",
"event_name",
"event_time_registered",
"geo_city",
"geo_country",
"geo_lat",
"geo_lon",
"geo_region",
"identity_link",
"install_date_adjusted",
"install_date_utc",
"install_device_version",
"install_devices_adid",
"install_devices_android_id",
"install_devices_custom",
"install_devices_email_0",
"install_devices_email_1",
"install_devices_idfa",
"install_devices_ids",
"install_devices_ip",
"install_devices_waid",
"install_matched_by",
"install_matched_on",
"install_receipt_status",
"install_san_original",
"install_status",
"request_ip",
"request_ua",
"timestamp_adjusted",
"timestamp_utc"
]
}
What I have tried unsuccessfully thus far is below:-
json_construct=$(cat <<EOF
{
"api_key": "6AEC90B5-4169-59AF-7AC9-D655F83B4825",
"app_guid": "komacca-s-rewards-app-au-ios-production-cv8tx71",
"time_start": 1508677200,
"time_end": 1508763600,
"traffic": ["event"],
"traffic_including": ["unattributed_traffic"],
"time_zone": "Australia/NSW",
"delivery_format": "csv"
"columns_order": ["attribution_attribution_action","attribution_campaign","attribution_campaign_id","attribution_creative","attribution_date_adjusted","attribution_date_utc","attribution_matched_by","attribution_matched_to","attributio
network","attribution_network_id","attribution_seconds_since","attribution_site_id","attribution_tier","attribution_timestamp","attribution_timestamp_adjusted","attribution_tracker","attribution_tracker_id","attribution_tracker_name","
unt","custom_dimensions","device_id_adid","device_id_android_id","device_id_custom","device_id_idfa","device_id_idfv","device_id_kochava","device_os","device_type","device_version","dimension_count","dimension_data","dimension_sum","ev
t_name","event_time_registered","geo_city","geo_country","geo_lat","geo_lon","geo_region","identity_link","install_date_adjusted","install_date_utc","install_device_version","install_devices_adid","install_devices_android_id","install_
vices_custom","install_devices_email_0","install_devices_email_1","install_devices_idfa","install_devices_ids","install_devices_ip","install_devices_waid","install_matched_by","install_matched_on","install_receipt_status","install_san_
iginal","install_status","request_ip","request_ua","timestamp_adjusted","timestamp_utc"]
}
EOF)
followed by:-
echo "$json_construct" | jq '.'
I get the following error:-
parse error: Expected separator between values at line 10, column 15
I am guessing it is because the string literal spans multiple lines and jq is unable to parse it.
Use jq itself:
my_formatted_json=$(jq -n '{
"api_key": "XXXXXXXXXX-7AC9-D655F83B4825",
"app_guid": "XXXXXXXXXXXXXX",
"time_start": 1508677200,
"time_end": 1508763600,
"traffic": ["event"],
"traffic_including": ["unattributed_traffic"],
"time_zone": "Australia/NSW",
"delivery_format": "csv",
"columns_order": [
"attribution_attribution_action",
"attribution_campaign",
...,
"timestamp_utc"
]
}')
Your input "JSON" is not valid JSON, as indicated by the error message.
The first error is that a comma is missing after the key/value pair: "delivery_format": "csv", but there are others -- notably, JSON strings cannot be split across lines. Once you fix the key/value pair problem and the JSON strings that are split incorrectly, jq . will work with your text. (Note that once your input is corrected, the longest JSON string is quite short -- 50 characters or so -- whereas jq has no problems processing strings of length 10^8 quite speedily ...)
Generally, jq is rather permissive when it comes to JSON-like input, but if you're ever in doubt, it would make sense to use a validator such as the online validator at jsonlint.com
By the way, the jq FAQ does suggest various ways for handling input that isn't strictly JSON -- see https://github.com/stedolan/jq/wiki/FAQ#processing-not-quite-valid-json
Along the lines of chepner's suggestion, since jq can read raw text data, you could just use a jq filter to generate a legal JSON object from your script variables. For example:
#!/bin/bash
# whatever logic you have to obtain bash variables goes here
key=XXXXXXXXXX-7AC9-D655F83B4825
guid=XXXXXXXXXXXXXX
# now use jq filter to read raw text and construct legal json object
json_construct=$(jq -MRn '[inputs]|map(split(" ")|{(.[0]):.[1]})|add' <<EOF
api_key $key
app_guid $guid
EOF
)
echo $json_construct
Sample Run (assumes executable script is in script.sh)
$ ./script.sh
{ "api_key": "XXXXXXXXXX-7AC9-D655F83B4825", "app_guid": "XXXXXXXXXXXXXX" }
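Another way to get shell variables into a JSON object is jq's --arg option, which passes each value in as a properly escaped JSON string. This is a minimal sketch using the placeholder values from the script above and only a few of the fields from the question.

```shell
key=XXXXXXXXXX-7AC9-D655F83B4825
guid=XXXXXXXXXXXXXX

# -n: no input is read; --arg binds each shell value to a jq variable as a string.
json_construct=$(jq -cn --arg key "$key" --arg guid "$guid" \
  '{api_key: $key, app_guid: $guid, traffic: ["event"]}')
printf '%s\n' "$json_construct"
```

Since the values never pass through string interpolation in the jq program, quotes or backslashes inside them cannot break the resulting JSON.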

Linux CLI - How to get substring from JSON jq + grep?

I need to pull a substring from JSON. In the JSON doc below, I need the end of the value of jq '.[].networkProfile.networkInterfaces[].id'. In other words, I need just A10NICvw4konls2vfbw-data to pass to another command. I can't seem to figure out how to pull a substring using grep. I've seen regex examples out there but haven't been successful with them.
[
{
"id": "/subscriptions/blah/resourceGroups/IPv6v2/providers/Microsoft.Compute/virtualMachines/A10VNAvw4konls2vfbw",
"instanceView": null,
"licenseType": null,
"location": "centralus",
"name": "A10VNAvw4konls2vfbw",
"networkProfile": {
"networkInterfaces": [
{
"id": "/subscriptions/blah/resourceGroups/IPv6v2/providers/Microsoft.Network/networkInterfaces/A10NICvw4konls2vfbw-data",
"resourceGroup": "IPv6v2"
}
]
}
}
]
In your case, sub(".*/";"") will do the trick as * is greedy:
.[].networkProfile.networkInterfaces[].id | sub(".*/";"")
Try this:
jq -r '.[]|.networkProfile.networkInterfaces[].id | split("/") | last'
The -r tells jq to print the output in "raw" form - in this case, that means no double-quotes around the string value.
As for the jq expression, after you access the id you want, piping it (still inside jq) through split("/") turns it into an array of the parts between slashes. Piping that through the last function (thanks, @Thor) returns just the last element of the array.
If you want to do it with grep here is one way:
jq -r '.[].networkProfile.networkInterfaces[].id' | grep -o '[^/]*$'
Output:
A10NICvw4konls2vfbw-data
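If the value is going into a shell variable anyway, POSIX parameter expansion can strip the path prefix without grep. The sketch below assumes the filter yields exactly one id and uses a trimmed-down copy of the sample document; `id` and `nic_name` are illustrative names.

```shell
id=$(printf '%s\n' '[{"networkProfile":{"networkInterfaces":[{"id":"/subscriptions/blah/resourceGroups/IPv6v2/providers/Microsoft.Network/networkInterfaces/A10NICvw4konls2vfbw-data"}]}}]' |
  jq -r '.[].networkProfile.networkInterfaces[].id')

# ${id##*/} removes the longest prefix ending in "/", leaving the last path segment.
nic_name=${id##*/}
printf '%s\n' "$nic_name"
```

With several interfaces the variable would hold multiple lines, so in that case one of the jq-only forms above is the safer choice.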