Map conditional child elements - json

I am working with a JSON file that contains a lot of data that can be removed before sending it to an API.
I found that jq can be used to achieve this, but I am not sure how to write the mapping to get the desired result.
Input JSON
{
"name": "Sample name",
"id": "123",
"userStory": {
"id": "234",
"storyName": "Story Name",
"narrative": "Narrative",
"type": "feature"
},
"testSteps": [
{
"number": 1,
"description": "Step 1",
"level": 0,
"children": [
{
"number": 2,
"description": "Description",
"children": [
{
"number": 3,
"description": "Description"
}
]
},
{
"number": 4,
"anotherfield": "another field"
}
]
}
]
}
Desired Output
{
"name": "Sample name",
"userStory": {
"storyName": "Story Name"
},
"testSteps": [
{
"description": "Step 1",
"children": [
{
"description": "Description",
"children": [
{
"description": "Description"
}
]
},
{
"anotherfield": "another field"
}
]
}
]
}
I tried the following jq command:
map_values(..|{name, id, userStory})
but I am not sure how to keep only userStory.storyName.
Thanks in advance.
Note: The actual JSON has different child elements that are repeated in some cases.

To delete .id from the root object:
del(.id)
To leave only .storyName in .userStory:
.userStory |= {storyName}
To delete .number and .level from every object on any level in .testSteps:
.testSteps |= walk(if type == "object" then del(.number, .level) else . end)
Putting it all together:
del(.id) | (.userStory |= {storyName}) | (.testSteps |=
walk(if type == "object" then del(.number, .level) else . end))
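For readers who want to cross-check the result outside jq, here is a minimal Python sketch of the same three steps. It is not part of the answer above; the helper names strip_keys and trim are made up for illustration, and the sample document is a shortened version of the input from the question.

```python
import json

def strip_keys(node, unwanted):
    """Recursively drop the unwanted keys from every dict, at any depth."""
    if isinstance(node, dict):
        return {k: strip_keys(v, unwanted) for k, v in node.items() if k not in unwanted}
    if isinstance(node, list):
        return [strip_keys(item, unwanted) for item in node]
    return node

def trim(doc):
    # Counterpart of del(.id): copy everything except the root "id".
    out = {k: v for k, v in doc.items() if k != "id"}
    # Counterpart of .userStory |= {storyName}: keep only storyName.
    out["userStory"] = {"storyName": doc["userStory"].get("storyName")}
    # Counterpart of the walk(...) step: drop number/level at every level.
    out["testSteps"] = strip_keys(doc["testSteps"], {"number", "level"})
    return out

# Shortened sample input (hypothetical, modelled on the question).
sample = {
    "name": "Sample name",
    "id": "123",
    "userStory": {"id": "234", "storyName": "Story Name", "type": "feature"},
    "testSteps": [
        {"number": 1, "description": "Step 1", "level": 0,
         "children": [
             {"number": 2, "description": "Description",
              "children": [{"number": 3, "description": "Description"}]},
             {"number": 4, "anotherfield": "another field"},
         ]},
    ],
}

print(json.dumps(trim(sample), indent=2))
```

Like the jq filter, this removes number and level wherever they occur under testSteps, so repeated or deeper child elements need no extra handling.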

Related

How to extract a particular key from the json

I am trying to extract values from a json that I obtained using the curl command for api testing. My json looks as below. I need some help extracting the value "20456" from here?
{
"meta": {
"status": "OK",
"timestamp": "2022-09-16T14:45:55.076+0000"
},
"links": {},
"data": {
"id": 24843,
"username": "abcd",
"firstName": "abc",
"lastName": "xyz",
"email": "abc#abc.com",
"phone": "",
"title": "",
"location": "",
"licenseType": "FLOATING",
"active": true,
"uid": "u24843",
"type": "users"
}
}
{
"meta": {
"status": "OK",
"timestamp": "2022-09-16T14:45:55.282+0000",
"pageInfo": {
"startIndex": 0,
"resultCount": 1,
"totalResults": 1
}
},
"links": {
"data.createdBy": {
"type": "users",
"href": "https://abc#abc.com/rest/v1/users/{data.createdBy}"
},
"data.fields.user1": {
"type": "users",
"href": "https://abc#abc.com/rest/v1/users/{data.fields.user1}"
},
"data.modifiedBy": {
"type": "users",
"href": "https://abc#abc.com/rest/v1/users/{data.modifiedBy}"
},
"data.fields.projectManager": {
"type": "users",
"href": "https://abc#abc.com/rest/v1/users/{data.fields.projectManager}"
},
"data.parent": {
"type": "projects",
"href": "https://abc#abc.com/rest/v1/projects/{data.parent}"
}
},
"data": [
{
"id": 20456,
"projectKey": "Stratus",
"parent": 20303,
"isFolder": false,
"createdDate": "2018-03-12T23:46:59.000+0000",
"modifiedDate": "2020-04-28T22:14:35.000+0000",
"createdBy": 18994,
"modifiedBy": 18865,
"fields": {
"projectManager": 18373,
"user1": 18628,
"projectKey": "Stratus",
"text1": "",
"name": "Stratus",
"description": "",
"date2": "2019-03-12",
"date1": "2018-03-12"
},
"type": "projects"
}
]
}
I have tried the following, but I end up getting an error:
▶ cat jqTrial.txt | jq '.data[].id'
jq: error (at <stdin>:21): Cannot index number with string "id"
20456
Also tried this but I get strings outside the object that I am not sure how to remove:
cat jqTrial.txt | jq '.data[]'
Assuming you want the project id not the user id:
jq '
.data
| if type == "object" then . else .[] end
| select(.type == "projects")
| .id
' file.json
There's probably a better way to write the second expression.
Indeed there is, thanks to @pmf:
.data | objects // arrays[] | select(.type == "projects").id
Your input consists of two JSON documents; both have a data field at the top level. In the first document, .data is itself an object with an .id field. In the second, .data is an array containing one object, which also has an .id field. That is why .data[].id fails on the first document (it iterates over the object's values and tries to index the number 24843 with "id") but succeeds on the second.
To retrieve both, you could use the --slurp (or -s) option, which wraps both top-level objects into an array; you can then address them separately by index:
jq --slurp '.[0].data.id, .[1].data[].id' jqTrial.txt
24843
20456
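The same "object or array" distinction can be cross-checked in Python. This is only a sketch under assumptions: the function names iter_documents and project_ids are hypothetical, and the sample stream is a heavily shortened stand-in for jqTrial.txt. It reads the concatenated documents one at a time with json.JSONDecoder.raw_decode, then treats an object-valued data field as a one-item list.

```python
import json

def iter_documents(text):
    """Yield each JSON document from a string of concatenated documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            return
        doc, pos = decoder.raw_decode(text, pos)
        yield doc

def project_ids(text):
    """Collect .id from every data item whose type is "projects"."""
    ids = []
    for doc in iter_documents(text):
        data = doc["data"]
        # An object is handled as a single item, an array item by item.
        items = [data] if isinstance(data, dict) else data
        ids.extend(item["id"] for item in items if item.get("type") == "projects")
    return ids

# Shortened stand-in for the two documents in the question.
stream = '''
{"data": {"id": 24843, "type": "users"}}
{"data": [{"id": 20456, "type": "projects"}]}
'''

print(project_ids(stream))  # [20456]
```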

JQ - unique count of each value in an array

I need to solve this with jq. I have a large list of arrays in my JSON file and need to do some sort | uniq -c style processing on them. Specifically, I have a relatively nasty-looking fruit array whose contents I need to break down. I'm aware of unique and similar functions, and I imagine there is a simple way to do this, but I've been going down the path of assigning variables and appending, and I can't get even the most basic part working: counting the unique values in each fruit array, ideally without breaking the rest of the content (hence the variable ideas). Please tell me I'm overthinking this.
I'd like to turn this;
[
{
"uid": "123abc",
"tID": [
"T19"
],
"fruit": [
"Kiwi",
"Apple",
"",
"",
"",
"Kiwi",
"",
"Kiwi",
"",
"",
"Mango",
"Kiwi"
]
},
{
"uid": "456xyz",
"tID": [
"T15"
],
"fruit": [
"",
"Orange"
]
}
]
Into this;
[
{
"uid": "123abc",
"tID": [
"T19"
],
"metadata": [
{
"name": "fruit",
"value": "Kiwi - 4"
},
{
"name": "fruit",
"value": "Mango - 1"
},
{
"name": "fruit",
"value": "Apple - 1"
}
]
},
{
"uid": "456xyz",
"tID": [
"T15"
],
"metadata": [
{
"name": "fruit",
"value": "Orange - 1"
}
]
}
]
Using group_by and length would be one way:
jq '
map(with_entries(select(.key == "fruit") |= (
.value |= (group_by(.) | map(
{name: "fruit", value: "\(.[0] | select(. != "")) - \(length)"}
))
| .key = "metadata"
)))
'
[
{
"uid": "123abc",
"tID": [
"T19"
],
"metadata": [
{
"name": "fruit",
"value": "Apple - 1"
},
{
"name": "fruit",
"value": "Kiwi - 4"
},
{
"name": "fruit",
"value": "Mango - 1"
}
]
},
{
"uid": "456xyz",
"tID": [
"T15"
],
"metadata": [
{
"name": "fruit",
"value": "Orange - 1"
}
]
}
]
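As a cross-check of the counts, the same group-and-count logic can be sketched in Python. The function name fruit_metadata is made up for illustration. The sketch mirrors the jq approach (sort, group equal neighbours, count, skip the empty string), so the groups come out in sorted order rather than in the order from the question.

```python
from itertools import groupby

def fruit_metadata(record):
    """Replace the "fruit" array with per-fruit counts under "metadata"."""
    if "fruit" not in record:
        return dict(record)
    out = {k: v for k, v in record.items() if k != "fruit"}
    # Sorting first makes equal values adjacent, as group_by(.) does in jq.
    counts = [(name, len(list(group))) for name, group in groupby(sorted(record["fruit"]))]
    out["metadata"] = [
        {"name": "fruit", "value": f"{name} - {n}"}
        for name, n in counts
        if name != ""  # drop the empty-string group
    ]
    return out

records = [
    {"uid": "123abc", "tID": ["T19"],
     "fruit": ["Kiwi", "Apple", "", "", "", "Kiwi", "", "Kiwi", "", "", "Mango", "Kiwi"]},
    {"uid": "456xyz", "tID": ["T15"], "fruit": ["", "Orange"]},
]

for record in records:
    print(fruit_metadata(record))
```

Note that the input contains "Kiwi" four times, which is why the count is 4.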

Filtering deeply within tree

I'm trying to prune nodes deep within a JSON structure, and I'm puzzled why empty seems to behave differently from a normal value here.
Input
[
{
"name": "foo",
"children": [{
"name": "foo.0",
"color": "red"
}]
},
{
"name": "bar",
"children": [{
"name": "bar.0",
"color": "green"
},
{
"name": "bar.1"
}]
},
{
"name": "baz",
"children": [{
"name": "baz.0"
},
{
"name": "baz.1"
}]
}
]
Program
jq '(.[].children|.[])|=if has("color") then . else empty end' foo.json
Actual output
[
{
"name": "foo",
"children": [
{
"name": "foo.0",
"color": "red"
}
]
},
{
"name": "bar",
"children": [
{
"name": "bar.0",
"color": "green"
}
]
},
{
"name": "baz",
"children": [
{
"name": "baz.1"
}
]
}
]
Expected output
The output I get, except without the baz.1 child, as that one doesn't have a color.
Question
Apart from the right solution, I'm also curious about one thing: replacing empty in the script with a regular value like 42 replaces the children without colors with 42, as expected. With empty, however, it looks like the else branch doesn't always get executed. Why?
.[].children |= map(select(.color))
This removes children that have no color (more precisely, children whose .color is missing, null, or false), so the output becomes:
[
{
"name": "foo",
"children": [
{
"name": "foo.0",
"color": "red"
}
]
},
{
"name": "bar",
"children": [
{
"name": "bar.0",
"color": "green"
}
]
},
{
"name": "baz",
"children": []
}
]
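As for why your filter does not seem to like empty:
this appears to be caused by a known jq issue, where updating multiple elements with empty fails. In your output, baz.0 is removed but baz.1 survives, which is consistent with elements being deleted while their paths are still being iterated: once baz.0 is gone, baz.1 moves to index 0 and is never visited.
In this case you can use del instead: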
del(.[].children[] | select(has("color") | not))
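To illustrate why deleting several elements in one pass can skip items, here is a small Python simulation. It is a sketch of the suspected mechanism only, not jq's actual implementation; both function names are made up. The first function deletes in place while walking indices that were fixed before any deletion, and the second builds a filtered copy, which is closer to what map(select(...)) and del(...) achieve.

```python
def delete_while_iterating(children, keep):
    """Suspected buggy behaviour: visit indices fixed up front, deleting in place."""
    children = list(children)
    for index in range(len(children)):  # indices computed before any deletion
        if index < len(children) and not keep(children[index]):
            del children[index]  # later elements shift left by one
    return children

def filter_copy(children, keep):
    """Safe behaviour: build a new list of the elements to keep."""
    return [child for child in children if keep(child)]

def has_color(child):
    return "color" in child

baz = [{"name": "baz.0"}, {"name": "baz.1"}]
bar = [{"name": "bar.0", "color": "green"}, {"name": "bar.1"}]

print(delete_while_iterating(baz, has_color))  # baz.1 survives
print(delete_while_iterating(bar, has_color))  # only bar.0 remains
print(filter_copy(baz, has_color))             # []
```

The simulation reproduces the "Actual output" from the question for both bar and baz, which supports the explanation but does not prove it.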

Using jq to convert object to key with values

I have been playing around with jq to format a json file but I am having some issues trying to solve a particular transformation. Given a test.json file in this format:
[
{
"name": "A", // This would be the first key
"number": 1,
"type": "apple",
"city": "NYC" // This would be the second key
},
{
"name": "A",
"number": "5",
"type": "apple",
"city": "LA"
},
{
"name": "A",
"number": 2,
"type": "apple",
"city": "NYC"
},
{
"name": "B",
"number": 3,
"type": "apple",
"city": "NYC"
}
]
I was wondering, how can I format it this way using jq?
[
{
"key": "A",
"values": [
{
"key": "NYC",
"values": [
{
"number": 1,
"type": "a"
},
{
"number": 2,
"type": "b"
}
]
},
{
"key": "LA",
"values": [
{
"number": 5,
"type": "b"
}
]
}
]
},
{
"key": "B",
"values": [
{
"key": "NYC",
"values": [
{
"number": 3,
"type": "apple"
}
]
}
]
}
]
I have followed the thread "Using jq, convert array of name/value pairs to object with named keys" and tried to group the JSON using this expression:
jq '. | group_by(.name) | group_by(.city) ' ./test.json
but I have not been able to add the keys in the output.
You'll want to group the items at each level and build the result objects as you go.
group_by(.name) | map({
key: .[0].name,
values: (group_by(.city) | map({
key: .[0].city,
values: map({number,type})
}))
})
Just keep in mind that group_by/1 yields its groups sorted by the grouping key, so "LA" would come before "NYC" here. If the original order matters, you'll want an implementation that preserves the input order instead:
def group_by_unsorted(key_selector):
reduce .[] as $i ({};
.["\($i|key_selector)"] += [$i]
)|[.[]];
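The difference between sorted and order-preserving grouping can be checked with a short Python sketch. The names group_preserving_order and nest are hypothetical. The sketch relies on Python dicts keeping insertion order, which plays the same role as the object built by reduce in the jq definition above.

```python
def group_preserving_order(items, key):
    """Group dicts by items[key], keeping groups in first-seen order."""
    groups = {}
    for item in items:
        groups.setdefault(item[key], []).append(item)
    return groups

def nest(rows):
    """Build the name -> city -> values structure from flat rows."""
    return [
        {
            "key": name,
            "values": [
                {
                    "key": city,
                    "values": [{"number": r["number"], "type": r["type"]} for r in city_rows],
                }
                for city, city_rows in group_preserving_order(name_rows, "city").items()
            ],
        }
        for name, name_rows in group_preserving_order(rows, "name").items()
    ]

rows = [
    {"name": "A", "number": 1, "type": "apple", "city": "NYC"},
    {"name": "A", "number": 5, "type": "apple", "city": "LA"},
    {"name": "A", "number": 2, "type": "apple", "city": "NYC"},
    {"name": "B", "number": 3, "type": "apple", "city": "NYC"},
]

for group in nest(rows):
    print(group["key"], [city["key"] for city in group["values"]])
```

With this input, "NYC" stays ahead of "LA" under "A", matching the order in the desired output.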

Convert JSON to CSV - string manipulation (jq, bash, awk, sed, etc.)

I'm in dire need of help with a script that converts JSON text to CSV text, in an attempt to copy users from one AWS Cognito user pool to another.
The export JSON looks like this:
{
"Users": [
{
"Username": "user.name",
"Attributes": [
{
"Name": "sub",
"Value": "some-value"
},
{
"Name": "email_verified",
"Value": "true"
},
{
"Name": "custom:jobtitle",
"Value": "Director"
},
{
"Name": "custom:user_id",
"Value": "38"
},
{
"Name": "email",
"Value": "foo.bar#email.com"
}
],
"UserCreateDate": some-value,
"UserLastModifiedDate": some-value,
"Enabled": some-value,
"UserStatus": "some-value"
}
[more lines down here]...
] }
Then the CSV file would contain these lines:
,,,,,,,,,foo.bar#email.com,TRUE,,,,,,FALSE,,,Director,,38,FALSE,foo.bar
[more lines down here]...
So, the variables would be like this for JSON:
{
"Users": [
{
"Username": "%USERNAME%",
"Attributes": [
{
"Name": "sub",
"Value": "some-value"
},
{
"Name": "email_verified",
"Value": "true"
},
{
"Name": "custom:jobtitle",
"Value": "%JOB_TITLE%"
},
{
"Name": "custom:user_id",
"Value": "%USER_ID%"
},
{
"Name": "email",
"Value": "%EMAIL%"
}
],
"UserCreateDate": some-value,
"UserLastModifiedDate": some-value,
"Enabled": some-value,
"UserStatus": "some-value"
}
...
]
}
And like this for CSV:
,,,,,,,,,%EMAIL%,TRUE,,,,,,FALSE,,,%JOB_TITLE%,,%USER_ID%,FALSE,%USERNAME%
where %EMAIL%, %JOB_TITLE%, %USER_ID%, and %USERNAME% are variables; everything else should be literal text.
I appreciate your help in advance, guys.
Consider first this filter:
.Users[].Attributes
| map(select(.Name | . == "custom:jobtitle" or . == "custom:user_id" or . == "email") )
| from_entries
| [ .email, .["custom:jobtitle"], .["custom:user_id"] ]
| @csv
The trick used here is the use of from_entries to convert the array of Name/Value pairs to an object with the Names as keys.
Assuming valid JSON input along the lines shown in the question, invoking jq with the -r option would yield:
"foo.bar#email.com","Director","38"
Unfortunately the precise requirements are not so clear to me, but you should be able to adapt the above in accordance with your needs.
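For comparison, the same Name/Value-to-object trick can be sketched in Python with the standard csv module. The function name user_rows and the WANTED tuple are made up for illustration, and the sample export is a shortened, valid version of the JSON in the question. Unlike jq's @csv, this sketch quotes every field, including missing ones, which come out as empty quoted strings.

```python
import csv
import io

# Columns to emit, in order (hypothetical selection, matching the filter above).
WANTED = ("email", "custom:jobtitle", "custom:user_id")

def user_rows(export):
    """Return one CSV line per user, built from the Attributes array."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for user in export["Users"]:
        # Turn the list of Name/Value pairs into a dict keyed by Name.
        attrs = {a["Name"]: a["Value"] for a in user["Attributes"]}
        writer.writerow([attrs.get(name, "") for name in WANTED])
    return buf.getvalue().splitlines()

export = {
    "Users": [
        {
            "Username": "user.name",
            "Attributes": [
                {"Name": "sub", "Value": "some-value"},
                {"Name": "email_verified", "Value": "true"},
                {"Name": "custom:jobtitle", "Value": "Director"},
                {"Name": "custom:user_id", "Value": "38"},
                {"Name": "email", "Value": "foo.bar#email.com"},
            ],
        }
    ]
}

for line in user_rows(export):
    print(line)
```

Extending the row to the full column layout from the question would mean adding the fixed empty and TRUE/FALSE fields around these three values, plus the Username.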