PostgreSQL query : distinct count of jsonb values column - mysql

I have a table with the name mainapp_project_data which has a jsonb column project_user_data
TABLE
public | mainapp_project_data | table | admin
select project_user_data from mainapp_project_data;
project_user_data
-----------------------------------------------------------------------------------------------------------------
[{"name": "john", "age": "21", "gender": "M"}, {"name": "randy", "age": "23", "gender": "M"}]
[{"name": "donald", "age": "31", "gender": "M"}, {"name": "wick", "age": "32",
"gender": "M"}]
[{"name": "orton", "age": "18", "gender": "M"}, {"name": "russel", "age": "55",
"gender": "M"}]
[{"name": "angelina", "age": "open", "gender": "F"}, {"name": "josep", "age": "21",
"gender": "M"}]
(4 rows)
(END)
I would like to count the distinct values of keys gender and age of JSON.
output format : [{key:count(repeated_values)}]
filtering on `gender` : [{"M":7},{"F":1}]
filtering on `age` : [{"21":2},{"23":1},{"31":1}.....]

WITH flat AS (
SELECT
kv.key,
-- make into a JSON object with a single value and count, e.g., '{"M": 7}'
jsonb_build_object(kv.value, COUNT(*)) AS val_count
FROM mainapp_project_data AS mpd
-- Flatten the JSON arrays into single objects per row
CROSS JOIN LATERAL jsonb_array_elements(mpd.project_user_data) AS unarrayed(udata)
-- Convert to a long, flat list of key-value pairs
CROSS JOIN LATERAL jsonb_each_text(unarrayed.udata) AS kv(key, value)
GROUP BY kv.key, kv.value
)
SELECT
-- de-deplicated object keys
flat.key,
-- aggregation of all values and counts per key
jsonb_agg(flat.val_count) AS value_counts
FROM flat
GROUP BY flat.key
Returns
key | value_counts
--------+---------------------------------------------------------------------------------------------------------------------
gender | [{"M": 7}, {"F": 1}]
name | [{"josep": 1}, {"russel": 1}, {"orton": 1}, {"donald": 1}, {"wick": 1}, {"john": 1}, {"randy": 1}, {"angelina": 1}]
age | [{"18": 1}, {"32": 1}, {"21": 2}, {"23": 1}, {"open": 1}, {"31": 1}, {"55": 1}]
This will provide any key-value pair instance count. If you just want genders and ages, just add a where clause before the first GROUP BY clause.
WHERE kv.key IN ('gender', 'age')

Does something like this work for you?
postgres=# select count(*), (foo->'gender')::text as g from (select json_array_elements(project_user_data) as foo from mainapp_project_data) as j group by (foo->'gender')::text;
count | g
-------+-----
7 | "M"
1 | "F"
(2 rows)
postgres=# select count(*), (foo->'age')::text as g from (select json_array_elements(project_user_data) as foo from mainapp_project_data) as j group by (foo->'age')::text;
count | g
-------+--------
2 | "21"
1 | "32"
1 | "open"
1 | "23"
1 | "18"
1 | "55"
1 | "31"
(7 rows) ```

Related

Count json tags in sql

I have this json strings
[{"count": 9, "name": "fixkit", "label": "Repair Kit"}, {"count": 1, "name": "phone", "label": "Telefoon"}]
[{"count": 3, "name": "phone", "label": "Telefoon"}]
[{"count": 5, "name": "kunststof", "label": "Kunststof"}, {"count": 6, "name": "papier", "label": "Papier"}, {"count": 2, "name": "metaal", "label": "Metaal"}, {"count": 2, "name": "inkt", "label": "Inkt"}, {"count": 3, "name": "kabels", "label": "Kabels"}, {"count": 2, "name": "klei", "label": "Klei"}, {"count": 2, "name": "glas", "label": "Glas"}, {"count": 12, "name": "phone", "label": "Telefoon"}]
[{"count": 77, "name": "weed", "label": "Cannabis"}, {"count": 1, "name": "firework1", "label": "Vuurpijl 1"}]
And know i want the following output
Phone | Number of phones (in this case: 16)
Fixkit | Number of fixkits (in this case: 9)
I wanted to do this with a sql query. If you know how to do this, thanks in advance!
If you're not using MySQL 8, this is a bit more complicated. First you have to find a path to a name element that has the value phone (or fixkit); then you can replace name in that path with count and extract the count field from that path; these values can then be summed:
SELECT param, SUM(JSON_EXTRACT(counts, REPLACE(JSON_UNQUOTE(JSON_SEARCH(counts, 'one', param, NULL, '$[*].name')), 'name', 'count'))) AS count
FROM data
CROSS JOIN (
SELECT 'phone' AS param
UNION ALL
SELECT 'fixkit'
) params
WHERE JSON_SEARCH(counts, 'one', param, NULL, '$[*].name') IS NOT NULL
GROUP BY param
Output:
param count
fixkit 9
phone 16
Demo on dbfiddle
If you are running MySQL 8.0, you can unnest the arrays into rows with json_table(), then filter on the names you are interested in, and aggregate.
Assuming that your table is mytable and that the json column is called js, that would be:
select j.name, sum(j.cnt) cnt
from mytable t
cross join json_table (
t.js,
'$[*]' columns(
cnt int path '$.count',
name varchar(50) path '$.name'
)
) j
where j.name in ('phone', 'fixkit')
group by j.name
Demo on DB Fiddle:
| name | cnt |
| ------ | --- |
| fixkit | 9 |
| phone | 16 |

Select certain keys from array of objects

Say I have a JSON column foo where each value is an array of objects, i.e:
[{"date": somedate, "value": somevalue, "othervaluesidontneed":...},
{"date": somedate, "value": somevalue, "othervaluesidontneed":...},
{"date": somedate, "value": somevalue, "othervaluesidontneed":...},...]
I want to select this column but for each row, to only include the keys date and value, so the returned value is:
[{"date": somedate, "value": somevalue},
{"date": somedate, "value": somevalue},
{"date": somedate, "value": somevalue},...]
Is this possible?
A solution would be to use json_table() (available in MySQ 8.0 only) to expand the array as a set of rows, and then generate a new array of objects that contain only the requested keys with json_arrayagg():
select
json_arrayagg(json_object( 'date', tt.date, 'value', tt.value)) new_js
from
mytable t,
json_table(
js,
"$[*]"
columns(
date datetime path "$.date",
value int path "$.value"
)
) as tt
group by t.id
This requires that some column can be used to uniquely identify a row in the initial table: I called that column id.
Demo on DB Fiddle:
with mytable as (
select 1 id, '[
{ "date": "2019-01-01", "value": 1, "othervaluesidontneed": 12 },
{ "date": "2019-01-02", "value": 2, "othervaluesidontneed": 55 },
{ "date": "2019-01-03", "value": 3, "othervaluesidontneed": 72}
]' js
)
select
json_arrayagg(json_object( 'date', tt.date, 'value', tt.value)) new_js
from
mytable t,
json_table(
js,
"$[*]"
columns(
date datetime path "$.date",
value int path "$.value"
)
) as tt
group by t.id
| new_js |
| :----------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [{"date": "2019-01-01 00:00:00.000000", "value": 1}, {"date": "2019-01-02 00:00:00.000000", "value": 2}, {"date": "2019-01-03 00:00:00.000000", "value": 3}] |

How to query nested array of jsonb

I am working on a PostgreSQL 11 table with a column of nested and multiple jsonb objects
to simulate the issue: -
CREATE TABLE public.test
(
id integer NOT NULL DEFAULT nextval('test_id_seq'::regclass),
testcol jsonb
)
insert into test (testcol) values
('[{"type": {"value": 1, "displayName": "flag1"}, "value": "10"},
{"type": {"value": 2, "displayName": "flag2"}, "value": "20"},
{"type": {"value": 3, "displayName": "flag3"}, "value": "30"},
{"type": {"value": 4, "displayName": "flag4"}},
{"type": {"value": 4, "displayName": "flag4"}},
{"type": {"value": 6, "displayName": "flag6"}, "value": "40"}]');
I am trying to:
get outer value if type= specific value. e.g. get the value 30, if flag3 is in displayname.
count occurrence of flag4 in inner json
You could use json_to_recordset to parse it:
WITH cte AS (
SELECT test.id, sub."type"->'value' AS t_value, sub."type"->'displayName' AS t_name, value
FROM test
,LATERAL jsonb_to_recordset(testcol) sub("type" jsonb, "value" int)
)
SELECT *
FROM cte
-- WHERE ...
-- GROUP BY ...;
db<>fiddle demo

Extract key from JSON string in MySQL

My table contains string in json format. I need to get the sum and average of each key.
+----+------------------------------------------------------------------------------------+------------+
| id | json_data | subject_id |
+----+------------------------------------------------------------------------------------+------------+
| 1 | {"id": "a", "value": "30"}, {"id": "b", "value": "20"}, {"id": "c", "value": "30"} | 1 |
+----+------------------------------------------------------------------------------------+------------+
| 2 | {"id": "a", "value": "40"}, {"id": "b", "value": "50"}, {"id": "c", "value": "60"} | 1 |
+----+------------------------------------------------------------------------------------+------------+
| 3 | {"id": "a", "value": "20"} | 1 |
+----+------------------------------------------------------------------------------------+------------+
Expected result is
{"id": "a", "sum": 90, "avg": 30},
{"id": "b", "sum": 70, "avg": 35},
{"id": "c", "sum": 120, "avg": 40}
I've tried
SELECT (
JSON_OBJECT('id', id, 'sum', sum_data, 'avg', avg_data)
) FROM (
SELECT
JSON_EXTRACT(json_data, "$.id") as id,
SUM(JSON_EXTRACT(json_data, "$.sum_data")) as sum_data,
AVG(JSON_EXTRACT(json_data, "$.avg_data")) as avg_data
FROM Details
GROUP BY JSON_EXTRACT(json_data, "$.id")
) as t
But no luck. How can I sort this out?
Input json needs to correct
create table json_sum (id int primary key auto_increment, json_data json);
insert into json_sum values (0,'[{"id": "a", "value": "30"}, {"id": "b", "value": "20"}, {"id": "c", "value": "30"}]');
insert into json_sum values (0,'[{"id": "a", "value": "40"}, {"id": "b", "value": "50"}, {"id": "c", "value": "60"}]');
insert into json_sum values (0,'[{"id": "a", "value": "20"}]');
select
json_object("id", jt.id, "sum", sum(jt.value), "avg", avg(jt.value))
from json_sum, json_table(json_data, "$[*]" columns (
row_id for ordinality,
id varchar(10) path "$.id",
value varchar(10) path "$.value")
) as jt
group by jt.id
Output:
json_object("id", jt.id, "sum", sum(jt.value), "avg", avg(jt.value))
{"id": "a", "avg": 30.0, "sum": 90.0}
{"id": "b", "avg": 35.0, "sum": 70.0}
{"id": "c", "avg": 45.0, "sum": 90.0}

PostgreSQL return JSON objects as key-value pairs

A PostgreSQL instance stores data in JSONB format:
CREATE TABLE myschema.mytable (
id BIGSERIAL PRIMARY KEY,
data JSONB NOT NULL
)
The data array might contain objects like:
{
"observations": {
"temperature": {
"type": "float",
"unit": "C",
"value": "23.1"
},
"pressure": {
"type": "float",
"unit": "mbar",
"value": "1011.3"
}
}
}
A selected row should be returned as key-value pairs in a format like:
temperature,type,float,value,23.1,unit,C,pressure,type,float,value,1011.3,unit,mbar
The following query returns at least each object, while still JSON:
SELECT id, value FROM mytable JOIN jsonb_each_text(mytable.data->'observations') ON true;
1 | {"type": "float", "unit": "mbar", "value": 1140.5}
1 | {"type": "float", "unit": "C", "value": -0.9}
5 | {"type": "float", "unit": "mbar", "value": "1011.3"}
5 | {"type": "float", "unit": "C", "value": "23.1"}
But the results are splitted and not in text format.
How can I return key-value pairs of all objects in data?
This will flatten the json structure and effectively just concatenate the values, along with the top-level key names (e.g. temperature and pressure), for the expected "depth" level. See if this is what you had in mind.
SELECT
id,
(
SELECT STRING_AGG(conc, ',')
FROM (
SELECT CONCAT_WS(',', key, STRING_AGG(value, ',')) AS conc
FROM (
SELECT key, (jsonb_each_text(value)).value
FROM jsonb_each(data->'observations')
) AS x
GROUP BY key
) AS csv
) AS csv
FROM mytable
Result:
| id | csv |
| --- | --------------------------------------------------- |
| 1 | pressure,float,mbar,1011.3,temperature,float,C,23.1 |
| 2 | pressure,bigint,unk,455,temperature,int,F,45 |
https://www.db-fiddle.com/f/ada5mtMgYn5acshi3WLR7S/0