I have the following data frame:
id_1 id_2 id_3 id_4 id_5
0133 11 kelly AA-1 1
2119 22 Wade AA-2 1
3903 33 John BB-1 1
3903 33 John BB-2 1
3903 33 John BB-3 1
5133 44 Emily C-1 1
9148 99 Pete BB-34 1
9148 99 Pete BB-23 1
2910 111 Mark DD-3 1
I want to iterate through it and group the rows that share the same id_1 value, including the cases where that value occurs more than once.
For each group I only want to capture columns id_4 and id_5. This will ultimately be added to a JSON object, so the end result would be:
{"id_1": "0133", "id_2": "11", "id_3": "kelly", "items": [{"id_4":"AA-1", "id_5":"1"}]}
{"id_1": "2119", "id_2": "22", "id_3": "Wade", "items": [{"id_4":"AA-2", "id_5":"1"}]}
{"id_1": "3903", "id_2": "33", "id_3": "John", "items": [{"id_4":"BB-1", "id_5":"1"}, {"id_4":"BB-2", "id_5":"1"}, {"id_4":"BB-3", "id_5":"1"}]}
{"id_1": "5133", "id_2": "44", "id_3": "Emily", "items": [{"id_4":"C-1", "id_5":"1"}]}
{"id_1": "9148", "id_2": "99", "id_3": "Pete", "items": [{"id_4":"BB-34", "id_5":"1"}, {"id_4":"BB-23", "id_5":"1"}]}
{"id_1": "2910", "id_2": "111", "id_3": "Mark", "items": [{"id_4":"DD-3", "id_5":"1"}]}
Would anyone know the best approach to accomplish something like this? Any insight is greatly appreciated.
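You can try grouping on the identifying columns and collecting id_4 and id_5 into a list of records per group, something like:
df.groupby(['id_1', 'id_2', 'id_3'], sort=False)[['id_4', 'id_5']].apply(lambda g: g.to_dict('records')).reset_index(name='items')
Then you can convert the result to JSON, e.g. with to_json(orient='records', lines=True).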
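For reference, the grouping itself can also be sketched without pandas. This is a minimal sketch, assuming the rows are available as plain dictionaries with string values (the shape df.to_dict('records') would give if every column was read as a string); the helper name group_items is made up for illustration.

```python
import json

# Rows as plain dicts; with pandas, df.to_dict("records") yields this shape
# (assumption: all columns were read as strings, so "0133" keeps its zero).
rows = [
    {"id_1": "0133", "id_2": "11", "id_3": "kelly", "id_4": "AA-1", "id_5": "1"},
    {"id_1": "2119", "id_2": "22", "id_3": "Wade", "id_4": "AA-2", "id_5": "1"},
    {"id_1": "3903", "id_2": "33", "id_3": "John", "id_4": "BB-1", "id_5": "1"},
    {"id_1": "3903", "id_2": "33", "id_3": "John", "id_4": "BB-2", "id_5": "1"},
    {"id_1": "3903", "id_2": "33", "id_3": "John", "id_4": "BB-3", "id_5": "1"},
    {"id_1": "5133", "id_2": "44", "id_3": "Emily", "id_4": "C-1", "id_5": "1"},
    {"id_1": "9148", "id_2": "99", "id_3": "Pete", "id_4": "BB-34", "id_5": "1"},
    {"id_1": "9148", "id_2": "99", "id_3": "Pete", "id_4": "BB-23", "id_5": "1"},
    {"id_1": "2910", "id_2": "111", "id_3": "Mark", "id_4": "DD-3", "id_5": "1"},
]


def group_items(records):
    """Collect id_4/id_5 pairs under each distinct (id_1, id_2, id_3)."""
    grouped = {}  # dicts keep insertion order, so first-seen order is preserved
    for rec in records:
        key = (rec["id_1"], rec["id_2"], rec["id_3"])
        entry = grouped.setdefault(
            key,
            {"id_1": rec["id_1"], "id_2": rec["id_2"], "id_3": rec["id_3"], "items": []},
        )
        entry["items"].append({"id_4": rec["id_4"], "id_5": rec["id_5"]})
    return list(grouped.values())


result = group_items(rows)
for obj in result:
    print(json.dumps(obj))  # one JSON object per line, as in the question
```

Single-row groups simply end up with a one-element items list, which matches the expected output in the question.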
I have a table where one column is a json, like this:
{"type":"select","description":"Rota","default":"",
"required":"0","listOptions":[{"text": "1 - Jardins", "value": "1"}, {"text": "2 - Praia do Canto/Shop Vix", "value": "2"}, {"text": "3 - Hotéis Vitória/Serra", "value": "3"}, {"text": "6 - Hotéis Vila Velha/Padarias Praia da Costa", "value": "6"}, {"text": "9 - Cariacica", "value": "9"}, {"text": "5 - Vitória/Vila Velha", "value": "5"}, {"text": "10 - Baú/Reboque", "value": "10"}]}
I can select like this: select atributos->"$.listOptions" from table
My question is, how can I select the values from listOptions?
Use [*] to extract all the array item values.
SELECT atributos->"$.listOptions[*].value" FROM test;
To extract a specific item, use its index, e.g. the first item's value:
SELECT atributos->"$.listOptions[0].value" FROM test;
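To illustrate which values the two paths above pick out, here is a rough Python equivalent using only the json module. It mirrors the result of the paths, not how MySQL evaluates them, and the document is an abbreviated copy with three of the options.

```python
import json

# Abbreviated copy of the atributos document (three of the options only).
atributos = json.loads("""
{"type": "select", "description": "Rota", "default": "", "required": "0",
 "listOptions": [{"text": "1 - Jardins", "value": "1"},
                 {"text": "2 - Praia do Canto/Shop Vix", "value": "2"},
                 {"text": "9 - Cariacica", "value": "9"}]}
""")

# Mirrors "$.listOptions[*].value": the value of every array element.
all_values = [option["value"] for option in atributos["listOptions"]]

# Mirrors "$.listOptions[0].value": the value of the first element only.
first_value = atributos["listOptions"][0]["value"]

print(all_values)   # ['1', '2', '9']
print(first_value)  # 1
```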
In MySQL 8 you can use JSON_TABLE:
CREATE TABLE TAB1 (atributos json);
INSERT INTO TAB1 VALUES ('{"type":"select","description":"Rota","default":"",
"required":"0","listOptions":[{"text": "1 - Jardins", "value": "1"}, {"text": "2 - Praia do Canto/Shop Vix", "value": "2"}, {"text": "3 - Hotéis Vitória/Serra", "value": "3"}, {"text": "6 - Hotéis Vila Velha/Padarias Praia da Costa", "value": "6"}, {"text": "9 - Cariacica", "value": "9"}, {"text": "5 - Vitória/Vila Velha", "value": "5"}, {"text": "10 - Baú/Reboque", "value": "10"}]}');
SELECT atributos->"$.description", tt1.*
FROM TAB1,
     JSON_TABLE(
         atributos,
         "$.listOptions[*]"
         COLUMNS (
             mytext VARCHAR(100) PATH "$.text" DEFAULT '0' ON EMPTY DEFAULT '-99' ON ERROR,
             myvalue INT PATH "$.value"
         )
     ) tt1;
atributos->"$.description" | mytext | myvalue
:------------------------- | :--------------------------------------------- | ------:
"Rota" | 1 - Jardins | 1
"Rota" | 2 - Praia do Canto/Shop Vix | 2
"Rota" | 3 - Hotéis Vitória/Serra | 3
"Rota" | 6 - Hotéis Vila Velha/Padarias Praia da Costa | 6
"Rota" | 9 - Cariacica | 9
"Rota" | 5 - Vitória/Vila Velha | 5
"Rota" | 10 - Baú/Reboque | 10
MySQL Server (the ->> operator is available since version 5.7.13):
SELECT [field with json blob]->>"$.json_field" FROM mytable;
MariaDB Server (since version 10.2.3):
SELECT JSON_EXTRACT([field with json blob], "$.json_field") FROM mytable;
I've recently discovered that PostgreSQL can be used to store JSON.
Before I import loads of data, I need to understand how to retrieve it, in particular the nested objects.
This PostgreSQL tutorial is a good starting point, but it doesn't really explain how to query a nested JSON array.
In the sample below I need to select codes -> code so that the entry with codes -> level: 1 (adminCode1_iso) is related to adminName1 and, if it exists, the entry with codes -> level: 2 (adminCode2_iso) is related to adminName2.
CREATE TABLE gn_json (
id serial NOT NULL PRIMARY KEY,
info json NOT NULL
);
comment on table gn_json is 'How PG holds json';
insert into gn_json (info)
VALUES ('{
"adminCode2": "C3",
"codes": [
{
"code": "ENG",
"level": "1",
"type": "ISO3166-2"
},
{
"code": "CAM",
"level": "2",
"type": "ISO3166-2"
}
],
"adminCode3": "12UE",
"adminName4": "Yelling",
"adminName3": "Huntingdonshire",
"adminCode1": "ENG",
"adminName2": "Cambridgeshire",
"distance": 0,
"countryCode": "GB",
"countryName": "United Kingdom",
"adminName1": "England",
"adminCode4": "12UE085"
}'),
('{
"codes": [
{
"code": "81",
"level": "1",
"type": "ISO3166-2"
}
],
"adminCode1": "63",
"distance": 0,
"countryCode": "TH",
"countryName": "Thailand",
"adminName1": "Krabi"
}');
select info ->> 'countryName' as countryName,info ->> 'countryCode' as countryCode,
info ->> 'adminName1' as adminName1, info ->> 'adminCode1' as adminCode1,
info ->> 'adminName2' as adminName2, info ->> 'adminCode2' as adminCode2,
info -> 'codes' -> 0 ->> 'code' as adminCode1_iso,
info -> 'codes' -> 1 ->> 'code' as adminCode2_iso
FROM gn_json;
Edit: Expected outcome
countryname    | countrycode | adminname1 | admincode1 | adminname2     | admincode2 | admincode1_iso | admincode2_iso
United Kingdom | GB          | England    | ENG        | Cambridgeshire | C3         | ENG            | CAM
Thailand       | TH          | Krabi      | 63         | NULL           | NULL       | 81             | NULL
I have a table with the name mainapp_project_data which has a jsonb column project_user_data
TABLE
public | mainapp_project_data | table | admin
select project_user_data from mainapp_project_data;
project_user_data
-----------------------------------------------------------------------------------------------------------------
[{"name": "john", "age": "21", "gender": "M"}, {"name": "randy", "age": "23", "gender": "M"}]
[{"name": "donald", "age": "31", "gender": "M"}, {"name": "wick", "age": "32",
"gender": "M"}]
[{"name": "orton", "age": "18", "gender": "M"}, {"name": "russel", "age": "55",
"gender": "M"}]
[{"name": "angelina", "age": "open", "gender": "F"}, {"name": "josep", "age": "21",
"gender": "M"}]
(4 rows)
I would like to count how often each distinct value of the keys gender and age occurs in the JSON.
output format : [{key:count(repeated_values)}]
filtering on `gender` : [{"M":7},{"F":1}]
filtering on `age` : [{"21":2},{"23":1},{"31":1}.....]
WITH flat AS (
SELECT
kv.key,
-- make into a JSON object with a single value and count, e.g., '{"M": 7}'
jsonb_build_object(kv.value, COUNT(*)) AS val_count
FROM mainapp_project_data AS mpd
-- Flatten the JSON arrays into single objects per row
CROSS JOIN LATERAL jsonb_array_elements(mpd.project_user_data) AS unarrayed(udata)
-- Convert to a long, flat list of key-value pairs
CROSS JOIN LATERAL jsonb_each_text(unarrayed.udata) AS kv(key, value)
GROUP BY kv.key, kv.value
)
SELECT
-- de-duplicated object keys
flat.key,
-- aggregation of all values and counts per key
jsonb_agg(flat.val_count) AS value_counts
FROM flat
GROUP BY flat.key
Returns
key | value_counts
--------+---------------------------------------------------------------------------------------------------------------------
gender | [{"M": 7}, {"F": 1}]
name | [{"josep": 1}, {"russel": 1}, {"orton": 1}, {"donald": 1}, {"wick": 1}, {"john": 1}, {"randy": 1}, {"angelina": 1}]
age | [{"18": 1}, {"32": 1}, {"21": 2}, {"23": 1}, {"open": 1}, {"31": 1}, {"55": 1}]
This returns a count for every key-value pair. If you only want genders and ages, add a WHERE clause before the first GROUP BY clause:
WHERE kv.key IN ('gender', 'age')
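To sanity-check the expected counts outside the database, the same flatten-then-count logic can be sketched in plain Python. This is only an illustration on the four sample rows from the question; the function name count_values and its keys parameter are made up here.

```python
from collections import Counter, defaultdict

# The four project_user_data arrays shown in the question.
rows = [
    [{"name": "john", "age": "21", "gender": "M"}, {"name": "randy", "age": "23", "gender": "M"}],
    [{"name": "donald", "age": "31", "gender": "M"}, {"name": "wick", "age": "32", "gender": "M"}],
    [{"name": "orton", "age": "18", "gender": "M"}, {"name": "russel", "age": "55", "gender": "M"}],
    [{"name": "angelina", "age": "open", "gender": "F"}, {"name": "josep", "age": "21", "gender": "M"}],
]


def count_values(rows, keys=("gender", "age")):
    """Count how often each value occurs, per key, across all array elements."""
    counts = defaultdict(Counter)
    for people in rows:          # one JSON array per table row
        for person in people:    # one object per array element
            for key, value in person.items():
                if key in keys:  # plays the role of WHERE kv.key IN (...)
                    counts[key][value] += 1
    # Shape each key's counts like the requested output, e.g. [{"M": 7}, {"F": 1}]
    return {key: [{value: n} for value, n in counter.items()]
            for key, counter in counts.items()}


value_counts = count_values(rows)
print(value_counts["gender"])  # [{'M': 7}, {'F': 1}]
print(value_counts["age"])
```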
Does something like this work for you?
postgres=# select count(*), (foo->'gender')::text as g from (select jsonb_array_elements(project_user_data) as foo from mainapp_project_data) as j group by (foo->'gender')::text;
count | g
-------+-----
7 | "M"
1 | "F"
(2 rows)
postgres=# select count(*), (foo->'age')::text as g from (select jsonb_array_elements(project_user_data) as foo from mainapp_project_data) as j group by (foo->'age')::text;
count | g
-------+--------
2 | "21"
1 | "32"
1 | "open"
1 | "23"
1 | "18"
1 | "55"
1 | "31"
(7 rows)
create table store (id integer primary key, name text);
create table opening (store integer references store(id),
wday text, start integer, end integer);
insert into store (name) values ('foo'), ('bar');
insert into opening (store, wday, start, end)
values (1, 'mon', 0, 60),
(1, 'mon', 60, 120),
(1, 'tue', 180, 240),
(1, 'tue', 300, 360),
(2, 'wed', 0, 60),
(2, 'wed', 60, 120),
(2, 'thu', 180, 240);
I'm trying to get in a single query all the stores and their respective openings by weekday as JSON.
{
"1": {
"name": "foo",
"openings": {
"mon": [ [ 0, 60 ], [ 60, 120 ] ],
"tue": [ [180, 240 ], [ 300, 360 ] ]
}
},
"2": {
"name": "bar",
"openings": {
"wed": [ [0,60], [60,120] ],
"thu": [ [180,240] ]
}
}
}
Here's the evolution of what I have tried. I'm missing a way to do a multi-level json_group_object, I suppose.
select * from opening;
store wday start end
---------- ---------- ---------- ----------
1 mon 0 60
1 mon 60 120
1 tue 180 240
1 tue 300 360
2 wed 0 60
2 wed 60 120
2 thu 180 240
select * from opening group by store;
store wday start end
---------- ---------- ---------- ----------
1 mon 0 60
2 wed 0 60
select json_group_object(store, wday) from opening group by store;
json_group_object(store, wday)
-----------------------------------------
{"1":"mon","1":"mon","1":"tue","1":"tue"}
{"2":"wed","2":"wed","2":"thu"}
select store, wday, json_group_array(json_array(start, end))
from opening group by store, wday;
store wday json_group_array(json_array(start, end))
---------- ---------- ----------------------------------------
1 mon [[0,60],[60,120]]
1 tue [[180,240],[300,360]]
2 thu [[180,240]]
2 wed [[0,60],[60,120]]
select json_object('id', store,
'openings', json_group_object(wday, json_group_array(json_array(start, end)))
) from opening group by store, wday;
Error: near line 17: misuse of aggregate function json_group_array()
select json_object('id', store,
'openings', json_object(wday, json_group_array(json_array(start, end)))
) from opening group by store, wday;
{"id":1,"openings":{"mon":[[0,60],[60,120]]}}
{"id":1,"openings":{"tue":[[180,240],[300,360]]}}
{"id":2,"openings":{"thu":[[180,240]]}}
{"id":2,"openings":{"wed":[[0,60],[60,120]]}}
How can I group on same id here?
A row is returned for each unique combination of values in the group by columns. Thus, the outermost select must have a group by store.
select json_group_object(store, x)
from (
select
store,
json_object(
'id', store,
'openings', json_object(wday, json_group_array(json_array(start, end)))
) x
from opening group by store, wday
) group by store;
However, the outer query treats the inner query's JSON as a plain string and escapes it. It seems silly to decode the inner JSON just to then encode it all again in the outermost query.
{"1":"{\"id\":1,\"openings\":{\"mon\":[[0,60],[60,120]]}}","1":"{\"id\":1,\"openings\":{\"tue\":[[180,240],[300,360]]}}"}
{"2":"{\"id\":2,\"openings\":{\"thu\":[[180,240]]}}","2":"{\"id\":2,\"openings\":{\"wed\":[[0,60],[60,120]]}}"}
IIRC in Postgres this inner query that returns JSON wouldn't return literal JSON but either way I'm confused how to continue.
Thanks for any help.
Adding an example for general reference. Shawn's point about using json(x) in the outer selects is key. Here's an example with multiple levels of nested arrays.
The sample data: select * from tblSmall
region|subregion |postalcode|locality |lat |lng |
------|-------------|----------|-------------------------------|-------|-------|
Delhi |Central Delhi| 110001|Connaught Place |28.6431|77.2197|
Delhi |Central Delhi| 110001|Parliament House |28.6407|77.2154|
Delhi |Central Delhi| 110003|Pandara Road |28.6431|77.2197|
Delhi |Central Delhi| 110004|Rashtrapati Bhawan |28.6453|77.2128|
Delhi |Central Delhi| 110005|Karol Bagh |28.6514|77.1907|
Delhi |Central Delhi| 110005|Anand Parbat |28.6431|77.2197|
Delhi |North Delhi | 110054|Civil Lines (North Delhi) |28.6804|77.2263|
Delhi |North Delhi | 110084|Burari |28.7557|77.1994|
Delhi |North Delhi | 110084|Jagatpur |28.7414|77.2199|
Delhi |North Delhi | 110086|Kirari Suleman Nagar |28.7441|77.0732|
Each region has multiple subregion values, each subregion has multiple postalcode values, and each postalcode has multiple locality values.
Here's the SQL:
select json_object('region', A2.region,
                   'subregions', json_group_array(json(A2.json_obj2)))
from (
    select A1.region,
           json_object('subregion', A1.subregion,
                       'postalCodes', json_group_array(json(A1.json_obj1))) as json_obj2
    from (
        select region, subregion,
               json_object('postalCode', postalcode,
                           'localities', json_group_array(
                               json_object('locality', locality,
                                           'latitude', lat,
                                           'longitude', lng))) as json_obj1
        from tblSmall
        where subregion in ('Central Delhi', 'North Delhi')
        group by region, subregion, postalcode
    ) as A1
    group by A1.region, A1.subregion
) as A2
group by A2.region
Note the json(A1.json_obj1) and json(A2.json_obj2) bits to handle the decode/re-encode of json coming out of the inner queries.
Here's the result (kind of long because of pretty-print) - there's a subregions array, which contains a postalcodes array, which contains a localities array:
{
"region": "Delhi",
"subregions": [
{
"subregion": "Central Delhi",
"postalCodes": [
{
"postalCode": 110001,
"localities": [
{
"locality": "Connaught Place",
"latitude": 28.6431,
"longitude": 77.2197
},
{
"locality": "Parliament House",
"latitude": 28.6407,
"longitude": 77.2154
}
]
},
{
"postalCode": 110003,
"localities": [
{
"locality": "Pandara Road",
"latitude": 28.6431,
"longitude": 77.2197
}
]
},
{
"postalCode": 110004,
"localities": [
{
"locality": "Rashtrapati Bhawan",
"latitude": 28.6453,
"longitude": 77.2128
}
]
},
{
"postalCode": 110005,
"localities": [
{
"locality": "Karol Bagh",
"latitude": 28.6514,
"longitude": 77.1907
},
{
"locality": "Anand Parbat",
"latitude": 28.6431,
"longitude": 77.2197
}
]
},
{
"postalCode": 110060,
"localities": [
{
"locality": "Rajender Nagar",
"latitude": 28.5329,
"longitude": 77.2004
}
]
},
{
"postalCode": 110069,
"localities": [
{
"locality": "Union Public Service Commission",
"latitude": 28.5329,
"longitude": 77.2004
}
]
},
{
"postalCode": 110100,
"localities": [
{
"locality": "Foreign Post Delhi IBC",
"latitude": 28.6563,
"longitude": 77.1366
}
]
}
]
},
{
"subregion": "North Delhi",
"postalCodes": [
{
"postalCode": 110054,
"localities": [
{
"locality": "Timarpur",
"latitude": 28.7038,
"longitude": 77.2227
},
{
"locality": "Civil Lines (North Delhi)",
"latitude": 28.6804,
"longitude": 77.2263
}
]
},
{
"postalCode": 110084,
"localities": [
{
"locality": "Burari",
"latitude": 28.7557,
"longitude": 77.1994
},
{
"locality": "Jagatpur",
"latitude": 28.7414,
"longitude": 77.2199
}
]
},
{
"postalCode": 110086,
"localities": [
{
"locality": "Kirari Suleman Nagar",
"latitude": 28.7441,
"longitude": 77.0732
}
]
}
]
}
]
}
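Applied to the store and opening tables from the question, the same json(x) technique can be sketched as follows. This is a minimal sketch run through Python's sqlite3 module, assuming the underlying SQLite build includes the JSON functions; the aliases slots and store_obj are illustrative, and the end column is quoted only because END is also an SQL keyword.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table store (id integer primary key, name text);
    create table opening (store integer references store(id),
                          wday text, start integer, "end" integer);
    insert into store (name) values ('foo'), ('bar');
    insert into opening (store, wday, start, "end")
    values (1, 'mon', 0, 60), (1, 'mon', 60, 120),
           (1, 'tue', 180, 240), (1, 'tue', 300, 360),
           (2, 'wed', 0, 60), (2, 'wed', 60, 120),
           (2, 'thu', 180, 240);
""")

QUERY = """
select json_group_object(id, json(store_obj))        -- level 3: keyed by store id
from (
    select id,
           json_object('name', name,
                       'openings', json_group_object(wday, json(slots))
           ) as store_obj                            -- level 2: one object per store
    from (
        select s.id as id, s.name as name, o.wday as wday,
               json_group_array(json_array(o.start, o."end")) as slots  -- level 1
        from opening as o
        join store as s on s.id = o.store
        group by s.id, s.name, o.wday
    )
    group by id, name
)
"""

# The outermost select aggregates over all stores, so exactly one row comes back.
(raw,) = conn.execute(QUERY).fetchone()
stores = json.loads(raw)
print(json.dumps(stores, indent=2))
```

Each json(...) call tells SQLite that the text coming out of the subquery is already JSON, so it is embedded as a nested value instead of being escaped as a string. The order of the pairs inside each weekday array follows the order in which SQLite visits the rows, which is not guaranteed without an explicit ordering.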