We are using the common_schema library in MySQL 5.6 to extract values from a JSON array. The format is given below, but the call returns NULL. Can you please help us parse the JSON array using common_schema?
select common_schema.extract_json_value('"batter":
[
{ "id": "1001", "type": "Regular" },
{ "id": "1002", "type": "Chocolate" },
{ "id": "1003", "type": "Blueberry" },
{ "id": "1004", "type": "Devils Food" }
]','/id');
The expected output should be saved in a table as:
id type
1001 Regular
1002 Chocolate
1003 Blueberry
1004 Devils Food
Please let us know how we can achieve this parsing.
Thanks
Kalyan
It does not seem easy to get what you need directly.
An option to obtain a single value is:
SET @`json` := '
{
"batter":
[
{ "id": "1001", "type": "Regular" },
{ "id": "1002", "type": "Chocolate" },
{ "id": "1003", "type": "Blueberry" },
{ "id": "1004", "type": "Devils Food" }
]
}
';
SELECT
`common_schema`.`extract_json_value`(@`json`,'descendant-or-self::id[1]') `id`,
`common_schema`.`extract_json_value`(@`json`,'descendant-or-self::type[1]') `type`;
+------+---------+
| id | type |
+------+---------+
| 1001 | Regular |
+------+---------+
1 row in set (0.04 sec)
Related
I am trying to extract values from JSON that I obtained using a curl command for API testing. My JSON looks as below. I need some help extracting the value "20456" from it.
{
"meta": {
"status": "OK",
"timestamp": "2022-09-16T14:45:55.076+0000"
},
"links": {},
"data": {
"id": 24843,
"username": "abcd",
"firstName": "abc",
"lastName": "xyz",
"email": "abc@abc.com",
"phone": "",
"title": "",
"location": "",
"licenseType": "FLOATING",
"active": true,
"uid": "u24843",
"type": "users"
}
}
{
"meta": {
"status": "OK",
"timestamp": "2022-09-16T14:45:55.282+0000",
"pageInfo": {
"startIndex": 0,
"resultCount": 1,
"totalResults": 1
}
},
"links": {
"data.createdBy": {
"type": "users",
"href": "https://abc@abc.com/rest/v1/users/{data.createdBy}"
},
"data.fields.user1": {
"type": "users",
"href": "https://abc@abc.com/rest/v1/users/{data.fields.user1}"
},
"data.modifiedBy": {
"type": "users",
"href": "https://abc@abc.com/rest/v1/users/{data.modifiedBy}"
},
"data.fields.projectManager": {
"type": "users",
"href": "https://abc@abc.com/rest/v1/users/{data.fields.projectManager}"
},
"data.parent": {
"type": "projects",
"href": "https://abc@abc.com/rest/v1/projects/{data.parent}"
}
},
"data": [
{
"id": 20456,
"projectKey": "Stratus",
"parent": 20303,
"isFolder": false,
"createdDate": "2018-03-12T23:46:59.000+0000",
"modifiedDate": "2020-04-28T22:14:35.000+0000",
"createdBy": 18994,
"modifiedBy": 18865,
"fields": {
"projectManager": 18373,
"user1": 18628,
"projectKey": "Stratus",
"text1": "",
"name": "Stratus",
"description": "",
"date2": "2019-03-12",
"date1": "2018-03-12"
},
"type": "projects"
}
]
}
I have tried the following, but end up getting an error:
▶ cat jqTrial.txt | jq '.data[].id'
jq: error (at <stdin>:21): Cannot index number with string "id"
20456
I also tried this, but I get strings outside the object that I am not sure how to remove:
cat jqTrial.txt | jq '.data[]'
Assuming you want the project id, not the user id:
jq '
.data
| if type == "object" then . else .[] end
| select(.type == "projects")
| .id
' file.json
There's probably a better way to write the second expression.
Indeed, thanks to @pmf:
.data | objects // arrays[] | select(.type == "projects").id
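For reference, here is how that filter behaves on a trimmed-down stand-in for the two documents (the file path and sample data are illustrative, not from the question):

```shell
# Two top-level JSON documents: data is an object in the first,
# an array in the second (same shape as the question's input).
cat > /tmp/jqTrial.txt <<'EOF'
{"data": {"id": 24843, "type": "users"}}
{"data": [{"id": 20456, "type": "projects"}]}
EOF

# jq applies the filter once per document; objects passes data through
# when it is an object, arrays[] iterates it when it is an array, and
# the select keeps only the "projects" object.
jq '.data | objects // arrays[] | select(.type == "projects").id' /tmp/jqTrial.txt
# prints 20456
```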
Your input consists of two JSON documents; both have a data field at the top level. But while in the first one data is itself an object with an .id field, in the second it is an array with one object item, which also has an .id field.
To retrieve both, you could use the --slurp (or -s) option which wraps both top-level objects into an array, then you can address them separately by index:
jq --slurp '.[0].data.id, .[1].data[].id' jqTrial.txt
24843
20456
Demo
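If you prefer not to slurp, jq's inputs builtin reads the same stream of documents, so the same indexing works. A sketch with illustrative sample data standing in for jqTrial.txt:

```shell
cat > /tmp/jqTrial.txt <<'EOF'
{"data": {"id": 24843}}
{"data": [{"id": 20456}]}
EOF

# -n suppresses the automatic first read; [inputs] then collects every
# top-level document into one array, just like --slurp.
jq -n '[inputs] | .[0].data.id, .[1].data[].id' /tmp/jqTrial.txt
# prints 24843 and 20456
```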
I have the following simplified JSON structure. Notice an array of values whose elements can have children, and those children can have children of their own.
{
"value": [
{
"id": "12",
"text": "Beverages",
"state": "closed",
"attributes": null,
"iconCls": null
},
{
"id": "10",
"text": "Foods",
"state": "closed",
"attributes": null,
"iconCls": null,
"children": [
{
"id": "33",
"text": "Mexican",
"state": "closed",
"attributes": null,
"iconCls": null,
"children": [
{
"id": "6100",
"text": "Taco",
"count": "3",
"attributes": null,
"iconCls": ""
}
]
}
]
}
]
}
How do I flatten a JSON structure using jq? I would like to print each element just once, but in a flat structure. An example output:
{
"id": "12",
"category": "Beverages"
},
{
"id": "10",
"category": "Foods"
},
{
"id": "33",
"category": "Mexican"
},
{
"id": "6100",
"category": "Tacos"
}
My attempt doesn't seem to work at all:
cat simple.json - | jq '.value[] | {id: .id, category: .text} + {id: .children[]?.id, category: .children[]?.text}'
.. is your friend:
.. | objects | select( .id and .text) | {id, category: .text}
If your actual input is that simple, recursively extracting id and text from each object under value should work.
[ .value | recurse | objects | {id, category: .text} ]
Online demo
I was totally going in the wrong direction
Not really. Going in that direction, you would have something like:
.value[]
| recurse(.children[]?)
| {id, category: .text}
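As a runnable end-to-end check, here is that recurse variant applied to a trimmed copy of the sample (the file path is illustrative):

```shell
cat > /tmp/simple.json <<'EOF'
{"value": [
  {"id": "12", "text": "Beverages"},
  {"id": "10", "text": "Foods", "children": [
    {"id": "33", "text": "Mexican", "children": [
      {"id": "6100", "text": "Taco"}
    ]}
  ]}
]}
EOF

# recurse(.children[]?) visits each object, then descends into its
# children; the ? keeps jq quiet when a node has no children field.
jq '.value[] | recurse(.children[]?) | {id, category: .text}' /tmp/simple.json
# prints one {id, category} object per node: 12, 10, 33, 6100
```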
I am trying to convert the sample input below into the output below using jq:
Input JSON
"elements": [
{
"type": "CustomObjectData",
"id": "2185",
"fieldValues": [
{
"type": "FieldValue",
"id": "169",
"value": "9/6/2017 12:00:00 AM"
},
{
"type": "FieldValue",
"id": "190",
"value": "ABC"
}
]
},
{
"type": "CustomObjectData",
"id": "2186",
"contactId": "13",
"fieldValues": [
{
"type": "FieldValue",
"id": "169",
"value": "8/31/2017 12:00:00 AM"
},
{
"type": "FieldValue",
"id": "190",
"value": "DEF"
}
]
}
]
Desired Output (group by id)
Essentially, I am trying to extract the "value" field from each "fieldValues" object and group them by "id".
{
"id:"2185",
"value": "9/6/2017 12:00:00 AM",
"value": "ABC"
},
{
"id:"2186",
"value": "8/31/2017 12:00:00 AM",
"value": "DEF"
}
What jq syntax should I use to achieve this? Thanks very much!
Assuming the input shown in the Q has been modified in the obvious way to make it valid JSON, the following filter will produce the output as shown below, that is, a stream of valid JSON values that is similar to the allegedly expected output included in the Q. If a single array is desired, one possibility would be to wrap the program in square brackets.
program.jq
.elements[]
| {id, values: [ .fieldValues[].value] }
Output
{
"id": "2185",
"values": [
"9/6/2017 12:00:00 AM",
"ABC"
]
}
{
"id": "2186",
"values": [
"8/31/2017 12:00:00 AM",
"DEF"
]
}
Producing CSV
One of many possibilities:
.elements[]
| [.id] + [.fieldValues[].value]
| @csv
With the -r command-line option, this produces the following CSV:
"2185","9/6/2017 12:00:00 AM","ABC"
"2186","8/31/2017 12:00:00 AM","DEF"
I have a JSON column containing a JSON array. My scenario is to get all the records where the url value "example.com/user1" is present. I am having trouble writing the query for this operation.
Record1
[
{
"id": "1",
"firstname": "user1",
"url": "example.com/user1"
},
{
"id": "2",
"firstname": "user2",
"url": "example.com/user2"
}
]
Record2
[
{
"id": "1",
"firstname": "user3",
"url": "example.com/user3"
},
{
"id": "2",
"firstname": "user2",
"url": "example.com/user2"
}
]
......
......
......
Record10
[
{
"id": "1",
"firstname": "user10",
"url": "example.com/user10"
},
{
"id": "2",
"firstname": "user1",
"url": "example.com/user1"
}
]
The query which I ran is:
Select internal_id from users_dummy where JSON_EXTRACT(user_friends, '$[0].url') = "example.com/user1" or JSON_EXTRACT(user_friends, '$[1].url') = "example.com/user1";
So the output was:
Record1, Record10
Is this the proper way to search for the values across the records?
Thanks in advance.
You can use JSON_SEARCH like this:
SELECT *
FROM users_dummy
WHERE JSON_SEARCH(user_friends, 'one', 'example.com/user1', NULL, '$[*].url') IS NOT NULL
demo on dbfiddle.uk
You can use the following solution in case you are using objects instead of arrays:
SELECT *
FROM users_dummy
WHERE JSON_SEARCH(user_friends, 'one', 'example.com/user1', NULL, '$.*.url') IS NOT NULL
demo on dbfiddle.uk
I have some rather large JSON files (~500 MB - 4 GB compressed) which I cannot load into memory for manipulation, so I am using the --stream option with jq.
For example, my JSON might look like this, only bigger:
[{
"id": "0001",
"type": "donut",
"name": "Cake",
"ppu": 0.55,
"batters": {
"batter": [{
"id": "1001",
"type": "Regular"
}, {
"id": "1002",
"type": "Chocolate"
}, {
"id": "1003",
"type": "Blueberry"
}, {
"id": "1004",
"type": "Devil's Food"
}]
},
"topping": [{
"id": "5001",
"type": "None"
}, {
"id": "5002",
"type": "Glazed"
}, {
"id": "5005",
"type": "Sugar"
}, {
"id": "5007",
"type": "Powdered Sugar"
}, {
"id": "5006",
"type": "Chocolate with Sprinkles"
}, {
"id": "5003",
"type": "Chocolate"
}, {
"id": "5004",
"type": "Maple"
}]
}, {
"id": "0002",
"type": "donut",
"name": "Raised",
"ppu": 0.55,
"batters": {
"batter": [{
"id": "1001",
"type": "Regular"
}]
},
"topping": [{
"id": "5001",
"type": "None"
}, {
"id": "5002",
"type": "Glazed"
}, {
"id": "5005",
"type": "Sugar"
}, {
"id": "5003",
"type": "Chocolate"
}, {
"id": "5004",
"type": "Maple"
}]
}, {
"id": "0003",
"type": "donut",
"name": "Old Fashioned",
"ppu": 0.55,
"batters": {
"batter": [{
"id": "1001",
"type": "Regular"
}, {
"id": "1002",
"type": "Chocolate"
}]
},
"topping": [{
"id": "5001",
"type": "None"
}, {
"id": "5002",
"type": "Glazed"
}, {
"id": "5003",
"type": "Chocolate"
}, {
"id": "5004",
"type": "Maple"
}]
}]
If this were the type of file I could hold in memory, and I wanted to select objects that only have batter type "Chocolate", I could use:
cat sample.json | jq '.[] | select(.batters.batter[].type == "Chocolate")'
And I would only get back the full objects with ids "0001" and "0003".
But with streaming I know it's different.
I am reading through the jq documentation on streaming here and here, but I am still quite confused, as the examples don't really demonstrate real-world problems with JSON.
Namely, is it even possible to select whole objects after streaming through their paths and identifying a notable event, or in this case a property value that matches a certain string?
I know that I can use:
cat sample.json | jq --stream 'select(.[0][1] == "batters" and .[0][2] == "batter" and .[0][4] == "type") | .[1]'
to give me all of the batter types. But is there a way to say: "If it's Chocolate, grab the object this leaf is a part of"?
Command:
$ jq -cn --stream 'fromstream(1|truncate_stream(inputs))' array_of_objects.json |
jq 'select(.batters.batter[].type == "Chocolate") | .id'
Output:
"0001"
"0003"
The first invocation of jq converts the array of objects into a stream of objects. The second is based on your invocation and can be tailored further to your needs.
Of course the two invocations can (and probably should) be combined into one, but you might want to use the first invocation to save the big file as a file containing the stream of objects.
By the way, it would probably be better to use the following select:
select( any(.batters.batter[]; .type == "Chocolate") )
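Combining the two invocations into one, with that any() select, might look like the sketch below (the sample file is trimmed to the fields that matter):

```shell
cat > /tmp/array_of_objects.json <<'EOF'
[{"id": "0001",
  "batters": {"batter": [{"id": "1001", "type": "Regular"},
                         {"id": "1002", "type": "Chocolate"}]}},
 {"id": "0002",
  "batters": {"batter": [{"id": "1001", "type": "Regular"}]}},
 {"id": "0003",
  "batters": {"batter": [{"id": "1002", "type": "Chocolate"}]}}]
EOF

# fromstream(1|truncate_stream(inputs)) drops the outer array index from
# each streamed path, rebuilding one top-level object per array element
# without ever holding the whole array in memory; any() then keeps the
# objects that have at least one Chocolate batter.
jq -cn --stream '
  fromstream(1 | truncate_stream(inputs))
  | select(any(.batters.batter[]; .type == "Chocolate"))
  | .id
' /tmp/array_of_objects.json
# prints "0001" and "0003"
```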
Here is another approach. Start with a streaming filter filter1.jq that extracts the record number and the minimum set of attributes you need to process. E.g.
select(length==2)
| . as [$p, $v]
| {r:$p[0]}
| if $p[1] == "id" then .id = $v
elif $p[1] == "batters" and $p[-1] == "type" then .type = $v
else empty
end
Running this with
jq -M -c --stream -f filter1.jq bigdata.json
produces values like
{"r":0,"id":"0001"}
{"r":0,"type":"Regular"}
{"r":0,"type":"Chocolate"}
{"r":0,"type":"Blueberry"}
{"r":0,"type":"Devil's Food"}
{"r":1,"id":"0002"}
{"r":1,"type":"Regular"}
{"r":2,"id":"0003"}
{"r":2,"type":"Regular"}
{"r":2,"type":"Chocolate"}
Now pipe this into a second filter, filter2.jq, which does the processing you want on those attributes for each record:
foreach .[] as $i (
{c: null, r:null, id:null, type:null}
; .c = $i
| if .r != .c.r then .id=null | .type=null | .r=.c.r else . end # control break
| .id = if .c.id == null then .id else .c.id end
| .type = if .c.type == null then .type else .c.type end
; if ([.id, .type] | contains([null])) then empty else . end
)
| select(.type == "Chocolate").id
with a command like
jq -M -c --stream -f filter1.jq bigdata.json | jq -M -s -r -f filter2.jq
to produce
0001
0003
filter1.jq and filter2.jq do a little more than what you need for this specific problem but they can be generalized easily.