Selecting in JQ with Contains in an Array - json

I want to select a particular item from the array using contains and get only the first match using JQ.
JQ:
.amazon.items[] | select(.name | contains ("shoes"))
JSON:
{
"amazon": {
"activeitem": 2,
"items": [
{
"id": 1,
"name": "harry potter",
"state": "sold"
},
{
"id": 2,
"name": "adidas shoes",
"state": "in inventory"
},
{
"id": 3,
"name": "watch",
"state": "returned"
},{
"id": 4,
"name": "adidas shoes",
"state": "in inventory"
}
]
}
}
Expected Result:
{
"activeitem": 2,
"item": {
"id": 2,
"name": "adidas shoes",
"state": "in inventory"
}
}
Actual:
I tried various options like the following, but none of them gives the intended response.
.amazon.items[] | select(.name | contains ("shoes"))
.amazon.items | select(.[].name | contains ("shoes")) | .[0]
Also, when I try to combine activeitem and item, I get something like this, which is also wrong:
{
"activeitem": 2,
"item": {
"id": 2,
"name": "adidas shoes",
"state": "in inventory"
}
},
{
"activeitem": 2,
"item": {
"id": 2,
"name": "adidas shoes",
"state": "in inventory"
}
}

To edit "in-place" you could write:
.amazon
| .items |= map(select(.name | contains ("shoes")))[0]
If you really want to change the name 'items' to 'item', you could tweak the above as follows:
.amazon
| .item = (.items | map(select(.name | contains ("shoes")))[0])
| del(.items)
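Both filters above rely on map(select(...))[0] to keep only the first match. For anyone who wants to check that logic outside jq, here is a minimal plain-Python sketch of the same idea. The function name first_match and the trimmed sample data are my own, not part of the question.

```python
import json

# Sample input, trimmed from the question.
doc = json.loads("""
{"amazon": {"activeitem": 2, "items": [
  {"id": 1, "name": "harry potter", "state": "sold"},
  {"id": 2, "name": "adidas shoes", "state": "in inventory"},
  {"id": 3, "name": "watch", "state": "returned"},
  {"id": 4, "name": "adidas shoes", "state": "in inventory"}
]}}
""")


def first_match(amazon, needle):
    """Return activeitem plus the first item whose name contains needle."""
    # next() stops at the first hit, like taking [0] of the filtered array;
    # the None default mirrors jq giving null when nothing matches.
    item = next((i for i in amazon["items"] if needle in i["name"]), None)
    return {"activeitem": amazon["activeitem"], "item": item}


result = first_match(doc["amazon"], "shoes")
print(json.dumps(result, indent=2))
```

The duplicated output in the question most likely comes from using .items[] inside the object constructor: a jq object constructor emits one object per result of each value expression, so two matching items give two objects. Collecting the matches into an array first and taking [0] avoids that.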

Related

Pyspark transform json into multiple dataframes

I have multiple JSON files with this structure (Association can have one or more objects, and Characteristic doesn't always have the same number of key-value pairs):
{
"vl:VNETList": {
"Template": {
"ID": "SomeId",
"Object": [
{
"ID": "my_first_id",
"Context": {
"ID": "Avngate"
},
"Name": "Model Description",
"ClassID": "PID",
"Association": [
{
"Object": {
"ID": "test.svg",
"Context": {
"ID": "Avngate"
}
},
"#type": "is fulfilled by"
},
{
"Object": {
"ID": "Project Description",
"Context": {
"ID": "Avngate"
}
},
"#type": "is an element of"
}
],
"Characteristic": [
{
"Name": "InfoType",
"Value": "image/svg+xml"
},
{
"Name": "LOCK",
"Value": false
},
{
"Name": "EXFI",
"Value": 10000
}
]
},
{
"ID": "my_second_id",
"Context": {
"ID": "Avngate2"
},
"Name": "Model Description2",
"ClassID": "PID2",
"Association": [
{
"Object": {
"ID": "test2.svg",
"Context": {
"ID": "Avngate"
}
},
"#type": "is fulfilled by"
}
],
"Characteristic": [
{
"Name": "Dbtencoding",
"Value": "unicode"
}
]
}
]
}
}
I would like to build two dataframes: the first with one row per object and its characteristics as columns, and the second with one row per association, referencing the object ID.
What's the best approach? If too complex, I would be able also to save the characteristics as a separate table referencing the objectId like with the association.
Read the JSON, then use explode with groupBy and pivot for the first dataframe, and explode with a plain select for the second one.
from pyspark.sql import functions as f

# Assumes an existing SparkSession named `spark`.
df1 = spark.read.json('test.json', multiLine=True)
df2 = df1.select(f.explode('vl:VNETList.Template.Object').alias('value')) \
.select('value.*')
df_f1 = df2.withColumn('Characteristic', f.explode('Characteristic')) \
.groupBy('ID', 'Name', 'ClassId') \
.pivot('Characteristic.Name') \
.agg(f.first('Characteristic.Value'))
df_f2 = df2.withColumn('Association', f.explode('Association')) \
.select('ID', 'Association.Object.ID', 'Association.#Type') \
.toDF('ID', 'AssociationId', 'AssociationType')
df_f1.show()
df_f2.show()
+------------+------------------+-------+-----------+-----+-------------+-----+
| ID| Name|ClassId|Dbtencoding| EXFI| InfoType| LOCK|
+------------+------------------+-------+-----------+-----+-------------+-----+
| my_first_id| Model Description| PID| null|10000|image/svg+xml|false|
|my_second_id|Model Description2| PID2| unicode| null| null| null|
+------------+------------------+-------+-----------+-----+-------------+-----+
+------------+-------------------+----------------+
| ID| AssociationId| AssociationType|
+------------+-------------------+----------------+
| my_first_id| test.svg| is fulfilled by|
| my_first_id|Project Description|is an element of|
|my_second_id| test2.svg| is fulfilled by|
+------------+-------------------+----------------+
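To see what the pivot and the explode each produce without starting Spark, the same reshaping can be sketched on plain Python dicts. This is only an illustration on data trimmed from the question (the Context fields are left out, and the names rows_f1 and rows_f2 are mine). It is not a replacement for the Spark code above.

```python
# Two objects trimmed from the question's JSON (Context fields omitted).
objects = [
    {
        "ID": "my_first_id", "Name": "Model Description", "ClassID": "PID",
        "Association": [
            {"Object": {"ID": "test.svg"}, "#type": "is fulfilled by"},
            {"Object": {"ID": "Project Description"}, "#type": "is an element of"},
        ],
        "Characteristic": [
            {"Name": "InfoType", "Value": "image/svg+xml"},
            {"Name": "LOCK", "Value": False},
            {"Name": "EXFI", "Value": 10000},
        ],
    },
    {
        "ID": "my_second_id", "Name": "Model Description2", "ClassID": "PID2",
        "Association": [
            {"Object": {"ID": "test2.svg"}, "#type": "is fulfilled by"},
        ],
        "Characteristic": [{"Name": "Dbtencoding", "Value": "unicode"}],
    },
]

# "Pivot": one row per object, one key per characteristic name.
# Names an object lacks are simply absent here; Spark shows them as null.
rows_f1 = [
    {"ID": o["ID"], "Name": o["Name"], "ClassID": o["ClassID"],
     **{c["Name"]: c["Value"] for c in o["Characteristic"]}}
    for o in objects
]

# "Explode": one row per (object, association) pair.
rows_f2 = [
    {"ID": o["ID"], "AssociationId": a["Object"]["ID"], "AssociationType": a["#type"]}
    for o in objects
    for a in o["Association"]
]

for row in rows_f1 + rows_f2:
    print(row)
```

One caveat with this sketch: a characteristic literally named ID, Name or ClassID would overwrite the object's own field. The sample data has no such clash.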

How to select items in JQ based on values in array

I have a json file with data like this:
{
"data": {
"all": {
"members": [
{
"id": 10,
"name": "First"
},
{
"id": 12,
"name": "Second"
},
{
"id": 14,
"name": "Third"
}
],
"live": {
"online": [
10,
14
]
}
}
}
}
How can I use jq to select and show only the JSON values in data.all.members that have their id in data.all.live.online array?
So the output would be something like:
{
"members": [
{
"id": 10,
"name": "First"
},
{
"id": 14,
"name": "Third"
}
]
}
One way:
jq '.data.all
| .live.online as $online
| {members}
| .members |= map( select(.id | IN($online[])) )
' data.json
Here's a solution that should be fairly easy to understand:
.data.all
| .live.online as $online
| { members: .members | map(select([.id] | inside($online))) }
Output:
{
"members": [
{
"id": 10,
"name": "First"
},
{
"id": 14,
"name": "Third"
}
]
}
And if you need more flexibility in how your ids are matched:
.data.all
| .live.online as $online
| { members: .members | map(select(.id as $id | $online | any(. == $id))) }
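All three filters do the same thing: keep a member only if its id appears in the online list. If it helps to see that outside jq, here is a small plain-Python sketch on the question's data; the variable names are my own.

```python
data = {"data": {"all": {
    "members": [
        {"id": 10, "name": "First"},
        {"id": 12, "name": "Second"},
        {"id": 14, "name": "Third"},
    ],
    "live": {"online": [10, 14]},
}}}

everything = data["data"]["all"]
# A set gives constant-time membership checks, playing the role of $online.
online = set(everything["live"]["online"])
# Keep only members whose id is online, preserving the original order.
result = {"members": [m for m in everything["members"] if m["id"] in online]}
print(result)
```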

How do I use jq to flatten a complex json structure?

I have the following simplified json structure: Notice an array of values, which have children, whose children could have children.
{
"value": [
{
"id": "12",
"text": "Beverages",
"state": "closed",
"attributes": null,
"iconCls": null
},
{
"id": "10",
"text": "Foods",
"state": "closed",
"attributes": null,
"iconCls": null,
"children": [
{
"id": "33",
"text": "Mexican",
"state": "closed",
"attributes": null,
"iconCls": null,
"children": [
{
"id": "6100",
"text": "Taco",
"count": "3",
"attributes": null,
"iconCls": ""
}
]
}
]
}
]
}
How do I flatten a json structure using jq? I would like to print each element just once, but in a flat structure. An example output:
{
"id": "12",
"category": "Beverages"
},
{
"id": "10",
"category": "Foods"
},
{
"id": "33",
"category": "Mexican"
},
{
"id": "6100",
"category": "Taco"
}
My attempt doesn't seem to work at all:
cat simple.json - | jq '.value[] | {id: .id, category: .text} + {id: .children[]?.id, category: .children[]?.text}'
.. is your friend:
.. | objects | select( .id and .text) | {id, category: .text}
If your actual input is that simple, recursively extracting id and text from each object under value should work.
[ .value | recurse | objects | {id, category: .text} ]
"I was totally going in the wrong direction"
Not really. Going in that direction, you would have something like:
.value[]
| recurse(.children[]?)
| {id, category: .text}
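recurse(.children[]?) visits each node first and then its children, so the output comes out in pre-order. A rough plain-Python equivalent, using a generator on a trimmed copy of the question's data, looks like this. The function name flatten is mine.

```python
tree = {"value": [
    {"id": "12", "text": "Beverages"},
    {"id": "10", "text": "Foods", "children": [
        {"id": "33", "text": "Mexican", "children": [
            {"id": "6100", "text": "Taco"},
        ]},
    ]},
]}


def flatten(nodes):
    """Yield each node before its descendants (pre-order)."""
    for node in nodes:
        yield {"id": node["id"], "category": node["text"]}
        # A missing "children" key is treated as no children, like .children[]?
        yield from flatten(node.get("children", []))


flat = list(flatten(tree["value"]))
print(flat)
```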

Conditionally add a field in JSON transformation using JQ

I am trying to transform my JSON into a different structure using JQ. I am able to produce the new structure, but fields that are not present in the source come out with null values, and my client wants those fields removed.
As I iterate, I get the structure in the new format, but these extra null fields come along with it.
Code Snippet- https://jqplay.org/s/w2N_Ozg9Ag
JSON
{
"amazon": {
"activeitem": 2,
"createdDate": "2019-01-15T17:36:31.588Z",
"lastModifiedDate": "2019-01-15T17:36:31.588Z",
"user": "net",
"userType": "new",
"items": [
{
"id": 1,
"name": "harry potter",
"state": "sold",
"type": {
"branded": false,
"description": "artwork",
"contentLevel": "season"
}
},
{
"id": 2,
"name": "adidas shoes",
"state": null ,
"type": {
"branded": false,
"description": "Spprts",
"contentLevel": "season"
}
},
{
"id": 3,
"name": "watch",
"type": {
"branded": false,
"description": "walking",
"contentLevel": "special"
}
},
{
"id": 4,
"name": "adidas shoes",
"state": "in inventory",
"type": {
"branded": false,
"description": "running",
"contentLevel": "winter"
}
}
],
"product": {
"id": 4,
"name": "adidas shoes",
"source": "dealer",
"destination": "resident"
}
}
}
JQ Query:
.amazon | { userType: .userType, userName: .user, itemCatalog: (.items | map({ itemId: .id, name, state} )) }
Expected Response:
{
"userType": "new",
"userName": "net",
"itemCatalog": [
{
"itemId": 1,
"name": "harry potter",
"state": "sold"
},
{
"itemId": 2,
"name": "adidas shoes"
},
{
"itemId": 3,
"name": "watch"
},
{
"itemId": 4,
"name": "adidas shoes",
"state": "in inventory"
}
]
}
With the query I have, I am getting state: null for the entries whose state is missing or null. I want to omit the field itself in these cases.
Horrible solution: delete them after the query. There must be a neater way? (I started using jq today.)
.amazon | { userType: .userType, userName: .user, itemCatalog: (.items | map({ itemId: .id, name, state} )) }| del(.itemCatalog[].state |select(. == null))
https://jqplay.org/s/jlNYmJNi25
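The idea behind the del(...) approach is to build the full entry first and then drop whatever ended up null. Here is a minimal plain-Python sketch of that same two-step logic on a trimmed copy of the question's items. The function name to_catalog_entry is my own.

```python
items = [
    {"id": 1, "name": "harry potter", "state": "sold"},
    {"id": 2, "name": "adidas shoes", "state": None},
    {"id": 3, "name": "watch"},
    {"id": 4, "name": "adidas shoes", "state": "in inventory"},
]


def to_catalog_entry(item):
    """Rename id to itemId and keep name/state only when they are not None."""
    entry = {
        "itemId": item["id"],
        "name": item.get("name"),
        "state": item.get("state"),  # absent and explicit null both become None
    }
    # Build the full entry first, then drop the null-valued fields in one pass.
    return {key: value for key, value in entry.items() if value is not None}


catalog = [to_catalog_entry(item) for item in items]
print(catalog)
```

Filtering on "is not None" rather than on truthiness matters here: a legitimate false or 0 value would survive, and only real nulls are dropped.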

jq get first value by priority and condition

I have the following JSON:
{
"Detail": {
"Response": [
{
"ID": "8000000D-1483989576",
"Name": "",
"FullName": "FullName 1"
},
{
"ID": "8000000C-1483985849",
"Name": "Name 1"
},
{
"ID": "80000006-1481277410",
"Name": "Name 2",
"FullName": "FullName 2"
},
{
"ID": "8000000B-1481537384",
"Name": "Name 3"
}
]
}
}
I'm trying to create another JSON document that uses .Name when it is non-empty and not null, and otherwise falls back to .FullName, even if that is empty or null. The final JSON would look like the following:
[
{
"id": "8000000D-1483989576",
"name": "FullName 1"
},
{
"id": "8000000C-1483985849",
"name": "Name 1"
},
{
"id": "80000006-1481277410",
"name": "Name 2"
},
{
"id": "8000000B-1481537384",
"name": "Name 3"
}
]
The temporary solution I came up with is to use join:
jq '[.Detail.Response[] | {id: .ID, name: [.Name, .FullName] | join("") }]'
But of course, it only works when one of the two fields is empty or null; when both have a value, they get concatenated.
This should get you on your way:
.Detail.Response[]
| { id: .ID, name: (if (.Name // "") != "" then .Name else .FullName end) }
I figured out a way to do it using map and select.
jq '[.Detail.Response[] | {id: .ID, name: [.Name, .FullName] | map(select(length > 0)) | first }]'
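The rule being implemented is "take .Name if it is non-empty, otherwise take .FullName, whatever it is". As a cross-check outside jq, the same priority rule can be sketched in plain Python on the question's data. The helper name pick_name is mine.

```python
responses = [
    {"ID": "8000000D-1483989576", "Name": "", "FullName": "FullName 1"},
    {"ID": "8000000C-1483985849", "Name": "Name 1"},
    {"ID": "80000006-1481277410", "Name": "Name 2", "FullName": "FullName 2"},
    {"ID": "8000000B-1481537384", "Name": "Name 3"},
]


def pick_name(entry):
    """Return Name if it is non-empty, else FullName (which may be None)."""
    # `or` skips both None and "", so a missing or empty Name falls through.
    return entry.get("Name") or entry.get("FullName")


result = [{"id": r["ID"], "name": pick_name(r)} for r in responses]
print(result)
```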