Splitting nested arrays as separate entities - json

I have some JSON data which contains attributes and some array elements. I would like to push a given set of fields into the array elements and then separate the arrays as separate entities.
Source data looks like this
[
{
"phones": [
{
"phone": "555-555-1234",
"type": "home"
},
{
"phone": "555-555-5678",
"type": "mobile"
}
],
"email": [
{
"email": "a@b.com",
"type": "work"
},
{
"email": "x@c.com",
"type": "home"
}
],
"name": "john doe",
"year": "2012",
"city": "cupertino",
"zip": "555004"
},
{
"phones": [
{
"phone": "555-666-1234",
"type": "home"
},
{
"phone": "555-666-5678",
"type": "mobile"
}
],
"email": [
{
"email": "a@b.com",
"type": "work"
},
{
"email": "x@c.com",
"type": "home"
}
],
"name": "jane doe",
"year": "2000",
"city": "los angeles",
"zip": "555004"
}
]
I expect a result like this
{
"person": [
{
"name": "john doe",
"year": "2012",
"city": "cupertino",
"zip": "555004"
},
{
"name": "jane doe",
"year": "2000",
"city": "los angeles",
"zip": "555004"
}
],
"phones": [
{
"name": "john doe",
"year": "2012",
"phone": "555-555-1234",
"type": "home"
},
{
"name": "john doe",
"year": "2012",
"phone": "555-555-5678",
"type": "mobile"
},
{
"name": "jane doe",
"year": "2000",
"phone": "555-666-1234",
"type": "home"
},
{
"name": "jane doe",
"year": "2000",
"phone": "555-666-5678",
"type": "mobile"
}
],
"email": [
{
"name": "john doe",
"year": "2012",
"email": "a@b.com",
"type": "work"
},
{
"name": "john doe",
"year": "2012",
"email": "x@c.com",
"type": "home"
},
{
"name": "jane doe",
"year": "2000",
"email": "a@b.com",
"type": "work"
},
{
"name": "jane doe",
"year": "2000",
"email": "x@c.com",
"type": "home"
}
]
}
I have been able to get the desired result, but I can't make it work in a generic way.
experiment on jqterm
The code below does the job, but I would like to pass in three parameters instead of hard-coding them: the array of columns to inject into the child arrays, the name of the primary result, and an array of the array field names.
["phones", "email"] as $children
| ["name", "year"] as $ids
|{person: map(with_entries(
. as $data | select($children|contains([$data.key])|not)
))}
+ {"phones": split_child($children[0];$ids)}
+ {"email": split_child($children[1];$ids)}

It's a lot easier to achieve this using multiple reduces, like:
def split_data($parent; $ids; $arr_cols):
($arr_cols | map([.])) as $p
| reduce .[] as $in ({}; .[$parent] += [$in | delpaths($p)]
| (reduce $ids[] as $k ({}; . + {($k): $in[$k]})) as $s
| reduce $arr_cols[] as $k (.; .[$k] += [$in[$k][] + $s])
);
split_data("person"; ["name", "year"]; ["phones", "email"])

Here's a straightforward solution to the generic problem (it uses reduce only once, in a helper function). To understand it, it might be helpful to see it as an abstraction of this concrete solution:
{ person: [.[] | {name, year, city, zip} ]}
+ { phones: [.[] | ({name, year} + .phones[]) ]}
+ { email: [.[] | ({name, year} + .email[]) ]}
Helper function
Let's first define a helper function for constructing an object by selecting a set of keys:
def pick($ary):
. as $in
| reduce $ary[] as $k ({};
. + {($k): $in[$k]});
split_data
Here finally is the function that takes as arguments the $parent, $ids, and columns of interest. The main complication is ensuring that the supplemental keys ("city" and "zip") are dealt with in the proper order.
def split_data($parent; $ids; $arr_cols):
(.[0]|keys_unsorted - $arr_cols - $ids) as $extra
| { ($parent): [.[] | pick($ids + $extra)] }
+ ([$arr_cols[] as $k
| {($k): [.[] | pick($ids) + .[$k][]] }] | add) ;
The invocation:
split_data("person"; ["name", "year"]; ["phones", "email"])
produces the desired result.
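For readers who want to cross-check the jq result outside jq, here is a minimal Python sketch of the same idea. It is not part of the answer above: the function names mirror the jq definitions, the sample record is a trimmed copy of the question's data, and it assumes every record has the same keys as the first one.

```python
def pick(obj, keys):
    """Return a new dict holding only the given keys, in the given order."""
    return {k: obj[k] for k in keys}


def split_data(records, parent, ids, arr_cols):
    """Split nested arrays into top-level lists, copying the id fields into each child."""
    # Keys that are neither ids nor array columns, in the order of the first record
    # (assumption: all records share the first record's keys).
    extra = [k for k in records[0] if k not in ids and k not in arr_cols]
    result = {parent: [pick(r, ids + extra) for r in records]}
    for col in arr_cols:
        # Id fields come first, then the child's own fields.
        result[col] = [{**pick(r, ids), **child} for r in records for child in r[col]]
    return result


# Trimmed sample based on the question's data.
people = [
    {
        "phones": [{"phone": "555-555-1234", "type": "home"}],
        "email": [{"email": "a@b.com", "type": "work"}],
        "name": "john doe",
        "year": "2012",
        "city": "cupertino",
        "zip": "555004",
    }
]

out = split_data(people, "person", ["name", "year"], ["phones", "email"])
print(out)
```

As in the jq version, the id fields are merged in before the child's fields so that "name" and "year" appear first in each child object.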

Related

How to extract a particular key from the json

I am trying to extract values from JSON that I obtained with a curl command during API testing. The JSON is shown below. How can I extract the value "20456" from it?
{
"meta": {
"status": "OK",
"timestamp": "2022-09-16T14:45:55.076+0000"
},
"links": {},
"data": {
"id": 24843,
"username": "abcd",
"firstName": "abc",
"lastName": "xyz",
"email": "abc@abc.com",
"phone": "",
"title": "",
"location": "",
"licenseType": "FLOATING",
"active": true,
"uid": "u24843",
"type": "users"
}
}
{
"meta": {
"status": "OK",
"timestamp": "2022-09-16T14:45:55.282+0000",
"pageInfo": {
"startIndex": 0,
"resultCount": 1,
"totalResults": 1
}
},
"links": {
"data.createdBy": {
"type": "users",
"href": "https://abc#abc.com/rest/v1/users/{data.createdBy}"
},
"data.fields.user1": {
"type": "users",
"href": "https://abc#abc.com/rest/v1/users/{data.fields.user1}"
},
"data.modifiedBy": {
"type": "users",
"href": "https://abc#abc.com/rest/v1/users/{data.modifiedBy}"
},
"data.fields.projectManager": {
"type": "users",
"href": "https://abc#abc.com/rest/v1/users/{data.fields.projectManager}"
},
"data.parent": {
"type": "projects",
"href": "https://abc#abc.com/rest/v1/projects/{data.parent}"
}
},
"data": [
{
"id": 20456,
"projectKey": "Stratus",
"parent": 20303,
"isFolder": false,
"createdDate": "2018-03-12T23:46:59.000+0000",
"modifiedDate": "2020-04-28T22:14:35.000+0000",
"createdBy": 18994,
"modifiedBy": 18865,
"fields": {
"projectManager": 18373,
"user1": 18628,
"projectKey": "Stratus",
"text1": "",
"name": "Stratus",
"description": "",
"date2": "2019-03-12",
"date1": "2018-03-12"
},
"type": "projects"
}
]
}
I have tried the following, but I end up getting an error:
▶ cat jqTrial.txt | jq '.data[].id'
jq: error (at <stdin>:21): Cannot index number with string "id"
20456
Also tried this but I get strings outside the object that I am not sure how to remove:
cat jqTrial.txt | jq '.data[]'
Assuming you want the project id not the user id:
jq '
.data
| if type == "object" then . else .[] end
| select(.type == "projects")
| .id
' file.json
There's probably a better way to write the 2nd expression
Indeed, thanks to @pmf:
.data | objects // arrays[] | select(.type == "projects").id
Your input consists of two JSON documents; both have a data field at the top level. In the first document, .data is itself an object that has an .id field; in the second, .data is an array containing one object, which also has an .id field.
To retrieve both, you could use the --slurp (or -s) option which wraps both top-level objects into an array, then you can address them separately by index:
jq --slurp '.[0].data.id, .[1].data[].id' jqTrial.txt
24843
20456
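The same "object or array" handling can be sketched in Python for comparison. This is only an illustration, not part of the answers above: the helper names are made up, the sample is a heavily trimmed stand-in for the two documents in the question, and the input is assumed to be a string of concatenated JSON documents like jqTrial.txt.

```python
import json


def iter_json_documents(text):
    """Yield each top-level JSON value from a string of concatenated documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace between documents.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        value, pos = decoder.raw_decode(text, pos)
        yield value


def project_ids(text):
    """Collect .id from every 'projects' item, whether .data is an object or an array."""
    ids = []
    for doc in iter_json_documents(text):
        data = doc["data"]
        items = [data] if isinstance(data, dict) else data
        ids.extend(item["id"] for item in items if item.get("type") == "projects")
    return ids


# Trimmed stand-in for the two documents in the question.
sample = """
{"data": {"id": 24843, "type": "users"}}
{"data": [{"id": 20456, "type": "projects"}]}
"""

print(project_ids(sample))
```

Wrapping a lone object in a one-element list plays the same role as `objects // arrays[]` in the jq filter: both shapes end up as a stream of items that can be filtered the same way.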

Compare 2 JSON and retrieve subset from one of them based on condition in Powershell

I have two JSON files abc.json and xyz.json.
Content in abc.json is:
[{"id": "121",
"name": "John",
"location": "europe"
},
{"id": "100",
"name": "Jane",
"location": "asia"
},
{"id": "202",
"name": "Doe",
"location": "america"
}
]
Updated -> Content in xyz.json is:
{
"value": [
{
"id": "111",
"city": "sydney",
"profession": "painter"
},
{
"id": "200",
"city": "istanbul",
"profession": "actor"
},
{
"id": "202",
"city": "seattle",
"profession": "doctor"
}
],
"count": {
"type": "Total",
"value": 3
}
}
I want to get those records of abc.json whose id also appears in xyz.json. In this case:
{"id": "202",
"name": "Doe",
"location": "america"
}
I need to do this in PowerShell, and the version I am using is 5.1. This is what I have tried:
$OutputList = @{}
$abcHash = Get-Content 'path\to\abc.json' | Out-String | ConvertFrom-Json
$xyzHash = Get-Content 'path\to\xyz.json' | Out-String | ConvertFrom-Json
$xyzResp = $xyzHash.value
foreach($item in $xyzResp){
foreach ($record in $abcHash){
if ($item.id -eq $record.id){
$OutputList.Add($record, $null)
}
}
}
Write-Output $OutputList
But when I print $OutputList, I get output like this:
Key: @{"id": "202",
"name": "Doe",
"location": "america"
}
Value:
Name:@{"id": "202",
"name": "Doe",
"location": "america"
}
What I require is more of a PSObject like:
id: 202
name:Doe
location:america
I tried using Get-Member cmdlet but could not quite reach there.
Is there any suggestion I could use?
I have corrected your example xyz.json because it contained an extra comma that should not be there. Also, the example did not have an item with id 202, so there would be no match at all.
xyz.json
{
"value": [
{
"id": "111",
"city": "sydney",
"profession": "painter"
},
{
"id": "202",
"city": "denver",
"profession": "painter"
},
{
"id": "111",
"city": "sydney",
"profession": "painter"
}
],
"count": {
"type": "Total",
"value": 3
}
}
That said, you can use a simple Where-Object {...} to get the item(s) with matching ids like this:
$abc = Get-Content 'path\to\abc.json' -Raw | ConvertFrom-Json
$xyz = Get-Content 'path\to\xyz.json' -Raw | ConvertFrom-Json
# get the items with matching id's as object(s)
$abc | Where-Object { $xyz.value.id -contains $_.id}
Output:
id name location
-- ---- --------
202 Doe america
Of course, you can capture the output in a variable first and then display it as a list, save it to CSV, or convert it back to JSON and save that.
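As a language-neutral illustration of the filter above, here is a small Python sketch of the same matching logic. It is not a replacement for the PowerShell answer; the function name is hypothetical, and the data is inlined from the question instead of being read from abc.json and xyz.json.

```python
def records_with_matching_ids(abc, xyz):
    """Return the records of abc whose id also occurs in xyz["value"]."""
    # Build the set of ids once so each lookup is constant time.
    wanted = {item["id"] for item in xyz["value"]}
    return [record for record in abc if record["id"] in wanted]


# Data inlined from the question.
abc = [
    {"id": "121", "name": "John", "location": "europe"},
    {"id": "100", "name": "Jane", "location": "asia"},
    {"id": "202", "name": "Doe", "location": "america"},
]
xyz = {
    "value": [
        {"id": "111", "city": "sydney", "profession": "painter"},
        {"id": "200", "city": "istanbul", "profession": "actor"},
        {"id": "202", "city": "seattle", "profession": "doctor"},
    ],
    "count": {"type": "Total", "value": 3},
}

matches = records_with_matching_ids(abc, xyz)
print(matches)
```

The set lookup corresponds to `$xyz.value.id -contains $_.id`: each record of abc is kept only if its id is among the ids found under "value".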

Reshape an array into an object with jq

I am learning advanced concepts of jq, and I made a tiny JSON file containing an array of some Charlie Chaplin films:
[
{
"title": "The Great Dictator",
"year": 1940,
"country": "USA",
"genre": "political satire"
},
{
"title": "Modern Times ",
"year": 1936,
"country": "USA",
"genre": "comedy"
},
{
"title": "The Gold Rush",
"year": 1925,
"country": "USA",
"genre": "comedy"
},
{
"title": "The Kid",
"year": 1921,
"country": "USA",
"genre": "drama"
}
]
I want to reshape it into an object with the genres as keys and the list of films for each genre as an array (comedy is the only genre with two elements in its array):
{
"comedy": [
{
"title": "Modern Times ",
"year": 1936,
"country": "USA"
},
{
"title": "The Gold Rush",
"year": 1925,
"country": "USA"
}
],
"political satire": [
{
"title": "The Great Dictator",
"year": 1940,
"country": "USA"
}
],
"drama": [
{
"title": "The Kid",
"year": 1921,
"country": "USA"
}
]
}
But I can't do it. As a first step, I tried to create an object with each genre as a key and the string "foo" as its value, but it fails: cat c.json | jq '{.[] | (.genre): "foo" ]}'
It can be done in three lines:
[group_by(.genre)[]
| {(.[0].genre): map_values(del(.genre))}]
| add
aggregate_by/3
The relevant generic abstraction here is:
def aggregate_by(s; f; g):
reduce s as $x (null; .[$x|f] += [$x|g]);
This allows the solution to be written directly as:
aggregate_by(.[]; .genre; del(.genre))
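To make the shape of this abstraction concrete, here is a rough Python analogue. It is a sketch only: the Python function borrows the jq name for readability, the key and value arguments are ordinary callables, and the film list is shortened from the question.

```python
def aggregate_by(items, key, value):
    """Group value(item) into lists keyed by key(item), keeping first-seen key order."""
    groups = {}
    for item in items:
        groups.setdefault(key(item), []).append(value(item))
    return groups


def without_genre(film):
    """Return a copy of the film without its 'genre' field (like del(.genre))."""
    return {k: v for k, v in film.items() if k != "genre"}


# Shortened sample from the question.
films = [
    {"title": "The Great Dictator", "year": 1940, "country": "USA", "genre": "political satire"},
    {"title": "Modern Times ", "year": 1936, "country": "USA", "genre": "comedy"},
    {"title": "The Gold Rush", "year": 1925, "country": "USA", "genre": "comedy"},
]

by_genre = aggregate_by(films, lambda film: film["genre"], without_genre)
print(by_genre)
```

Like the reduce-based jq definition, this makes a single pass over the input and needs no sorting, so keys appear in the order in which they are first seen.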
I found:
$ cat c.json | jq '
group_by(.genre)
| map({"genre": .[0].genre,
"film": map(. | del(.genre))})
| [ .[] | {(.genre): .film}]
| add'
{
"comedy": [
{
"title": "Modern Times ",
"year": 1936,
"country": "USA"
},
{
"title": "The Gold Rush",
"year": 1925,
"country": "USA"
}
],
"drama": [
{
"title": "The Kid",
"year": 1921,
"country": "USA"
}
],
"political satire": [
{
"title": "The Great Dictator",
"year": 1940,
"country": "USA"
}
]
}
Maybe it is not the best solution, because it takes a lot of steps, but it works.
You can use a modified version of Jeff Mercado's answer on the page you linked to.
jq 'reduce .[] as $i ({}; .[$i.genre] += [$i])'
That groups the objects as you want but leaves the genre key-value pair. You can delete them like so.
jq 'reduce .[] as $i ({}; .[$i.genre] += [$i|del(.genre)])'
Really, this is just a concrete version of peak's "generic abstraction".

Add key value to parent subelement if child has specific key:value

I'm trying to understand the best way to add a JSON element to a child's parent
if that child contains a specific key:value, and then print the entire JSON using jq.
Let me explain with an example.
The input json is:
{
"family": {
"surname": "Smith"
},
"components": [
{
"name": "John",
"details": {
"hair": "brown",
"eyes": "brown",
"age": "56"
},
"role": "father"
},
{
"name": "Mary",
"details": {
"hair": "blonde",
"eyes": "green",
"age": "45"
},
"role": "mother"
},
{
"name": "George",
"details": {
"hair": "blonde",
"eyes": "brown",
"age": "25"
},
"role": "child"
}
]
}
I want to add:
"description": "5 years less than 30"
at the same level of "details" if "age" is equal to "25" and then print the result:
{
"family": {
"surname": "Smith"
},
"components": [
{
"name": "John",
"details": {
"hair": "brown",
"eyes": "brown",
"age": "56"
},
"role": "father"
},
{
"name": "Mary",
"details": {
"hair": "blonde",
"eyes": "green",
"age": "45"
},
"role": "mother"
},
{
"name": "George",
"details": {
"hair": "blonde",
"eyes": "brown",
"age": "25"
},
"role": "child",
"description": "5 years less than 30"
}
]
}
The only solution I've found was to apply the update while printing only the "components" content;
then I removed "components" from the JSON and finally inserted the previously saved, modified "components" content, like this:
cat sample.json | jq -c ' .components[] | select(.details.age=="25") |= . + {description: "5 years less than 30" } ' > /tmp/saved-components.tmp
cat sample.json | jq --slurpfile savedcomponents /tmp/saved-components.tmp 'del(.components) | . + { components: [ $savedcomponents ] }'
I don't think it's the best way to solve this kind of problem, so I'd like to know what
the right "jq approach" is.
I forgot to say: I prefer to use jq only, not other tools.
Thank you
Marco
You can select the object matching the condition and add the new field to it, as shown below. The key is to apply the update-assignment operator += to the parenthesized path expression, so that the whole document is returned and the other objects are not lost:
(.components[] | select(.details.age == "25")) += { "description": "5 years less than 30" }
Here's a straightforward ("no magic") and efficient solution:
.components |=
map(if .details.age=="25" then .description = "5 years less than 30" else . end)
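For comparison, the same in-place conditional update can be sketched in Python. This is an illustration only and not something the asker requested: the function name and parameters are made up, and the sample document is trimmed from the question.

```python
import copy


def add_description(doc, age, text):
    """Return a copy of doc where components whose details.age equals age get a description."""
    result = copy.deepcopy(doc)  # leave the caller's document untouched
    for component in result["components"]:
        if component["details"]["age"] == age:
            component["description"] = text
    return result


# Trimmed sample from the question.
family = {
    "family": {"surname": "Smith"},
    "components": [
        {"name": "John", "details": {"age": "56"}, "role": "father"},
        {"name": "George", "details": {"age": "25"}, "role": "child"},
    ],
}

updated = add_description(family, "25", "5 years less than 30")
print(updated)
```

As with `|=` in jq, the whole document is returned with only the matching component changed, so no temporary file or second pass is needed.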

Combining Nested Json using PowerShell

I have the following JSON:
{
"merchant_info": {
"email": "merchant@example.com",
"first_name": "David",
"last_name": "Larusso",
"business_name": "Mitchell & Murray",
"phone": {
"country_code": "001",
"national_number": "4085551234"
},
"address": {
"line1": "1234 First Street",
"city": "Anytown",
"state": "CA",
"postal_code": "98765",
"country_code": "US"
}
},
"billing_info": [{
"email": "bill-me@example.com",
"first_name": "Stephanie",
"last_name": "Meyers"
}
],
"shipping_info": {
"first_name": "Stephanie",
"last_name": "Meyers",
"address": {
"line1": "1234 Main Street",
"city": "Anytown",
"state": "CA",
"postal_code": "98765",
"country_code": "US"
}
},
"items": [{
"name": "Zoom System wireless headphones",
"quantity": 2,
"unit_price": {
"currency": "USD",
"value": "120"
},
"tax": {
"name": "Tax",
"percent": 8
}
}, {
"name": "Bluetooth speaker",
"quantity": 1,
"unit_price": {
"currency": "USD",
"value": "145"
},
"tax": {
"name": "Tax",
"percent": 8
}
}
],
"discount": {
"percent": 1
},
"shipping_cost": {
"amount": {
"currency": "USD",
"value": "10"
}
},
"note": "Thank you for your business.",
"terms": "No refunds after 30 days."
}
And I want to use PowerShell to get the following Record and export it to CSV:
So far I created the following Script:
$JsonFile = "C:\Users\me\Documents\myfile.json"
$OutputFile = "C:\Users\me\Documents\newtext.csv"
Get-Content -Path $OutputFile
$json = ConvertFrom-Json (Get-Content $JsonFile -Raw)
$json.merchant_info | Select "first_name","last_name",@{Label = "phone"; Expression = {$_.phone.national_number}} |
Export-Csv $OutputFile -NoTypeInformation
I am able to bring in values from merchant_info, shipping_info, and items separately, but how do I bring it all in combined like in my screen shot above.
but how do I bring it all in combined like in my screen shot above.
We can only guess. Assuming this entire JSON block is one order, with one merchant and one customer but multiple items, each row is an item, so start with the items as the input.
For each item, create an output record (PSCustomObject) holding the repeated data followed by the individual item data:
$json.items | ForEach-Object {
[PSCustomObject]@{
MerchantInfoFirstName = $json.merchant_info.first_name
MerchantInfoLastName = $json.merchant_info.last_name
MerchantInfoPhoneNumber = $json.merchant_info.phone.national_number
ShippingInfoFirstName = $json.shipping_info.first_name
ShippingInfoLastName = $json.shipping_info.last_name
ItemName = $_.name
ItemQuantity = $_.quantity
}
} | Export-Csv ... etc.
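The "one row per item, with the order-level fields repeated" idea can also be sketched in Python. This is only an illustration of the flattening step under the same guess as above: the column names are copied from the PowerShell record, the function names are hypothetical, and the invoice is a trimmed copy of the question's JSON.

```python
import csv
import io

# Column names copied from the PowerShell record above.
FIELDNAMES = [
    "MerchantInfoFirstName", "MerchantInfoLastName", "MerchantInfoPhoneNumber",
    "ShippingInfoFirstName", "ShippingInfoLastName", "ItemName", "ItemQuantity",
]


def item_rows(invoice):
    """Yield one flat row per item, repeating the merchant and shipping fields."""
    merchant = invoice["merchant_info"]
    shipping = invoice["shipping_info"]
    for item in invoice["items"]:
        yield {
            "MerchantInfoFirstName": merchant["first_name"],
            "MerchantInfoLastName": merchant["last_name"],
            "MerchantInfoPhoneNumber": merchant["phone"]["national_number"],
            "ShippingInfoFirstName": shipping["first_name"],
            "ShippingInfoLastName": shipping["last_name"],
            "ItemName": item["name"],
            "ItemQuantity": item["quantity"],
        }


def to_csv(invoice):
    """Render the flattened rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(item_rows(invoice))
    return buffer.getvalue()


# Trimmed copy of the question's JSON.
invoice = {
    "merchant_info": {"first_name": "David", "last_name": "Larusso",
                      "phone": {"country_code": "001", "national_number": "4085551234"}},
    "shipping_info": {"first_name": "Stephanie", "last_name": "Meyers"},
    "items": [
        {"name": "Zoom System wireless headphones", "quantity": 2},
        {"name": "Bluetooth speaker", "quantity": 1},
    ],
}

print(to_csv(invoice))
```

Iterating over the items and reading the merchant and shipping fields from the enclosing document mirrors the ForEach-Object loop, which reads `$json.merchant_info` and `$json.shipping_info` inside the per-item script block.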