JSON reformatting in PostgreSQL

In a PostgreSQL table I have a column with a JSON like:
{"elements":[{"val":"value1", "column":"column1"}, {"val":"val2", "column":"column2"}, ...]}.
Is there any way to transform this into a result set like:
column1 | column2 | ...
-----------------------
value1 | value2 | ...
I played around with the PostgreSQL JSON functions but didn't find an answer.

The number of columns of a query needs to be known before the query is executed, so you will have to write one expression for each possible column in your array.
With Postgres 12, you can do this with a JSON/Path expression:
select jsonb_path_query_first(input -> 'elements', '$[*] ? (@.column == "column1").val' ) #>> '{}' as column_1,
jsonb_path_query_first(input -> 'elements', '$[*] ? (@.column == "column2").val' ) #>> '{}' as column_2
from data;
You need to repeat the jsonb_path_query_first() part for every possible column in the array.
The #>> '{}' is there to convert the jsonb value returned by the function to a text value.
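For reference, a minimal setup this query assumes (the table and column names are taken from the query itself, the sample row from the question):
create table data (input jsonb);
insert into data values
  ('{"elements":[{"val":"value1", "column":"column1"}, {"val":"val2", "column":"column2"}]}');
With that sample row, the query returns value1 in column_1 and val2 in column_2.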

You can use the json_to_recordset function to convert JSON into a rowset. However, the final rowset cannot have a dynamic number of columns, i.e. whatever solution you choose, you will have to list them explicitly in some way.
For example, in the select clause when doing a manual transposition of the 1:1-converted JSON:
with t(d) as (values
('{"elements":[{"val":"value1", "column":"column1"}, {"val":"val2", "column":"column2"}]}'::json)
), matrix(val,col) as (
select x.val, x."column"
from t
inner join lateral json_to_recordset((t.d->>'elements')::json) as x(val text, "column" text) on true
)
select (select val from matrix where col = 'column1') as column1
, (select val from matrix where col = 'column2') as column2
Or in the as x(column1 text, column2 text) column definition list when using the crosstab extension (see this question, and the sketch below).
Or in some other transformed or converted-to-XML form of the JSON.
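A minimal sketch of the crosstab route mentioned above (it requires the tablefunc extension, and the row key 1 is just an illustrative constant):
select *
from crosstab(
  $$select 1, x."column", x.val
    from (values ('{"elements":[{"val":"value1", "column":"column1"}, {"val":"val2", "column":"column2"}]}'::json)) as t(d)
    cross join lateral json_to_recordset(t.d->'elements') as x(val text, "column" text)
    order by 1, 2$$
) as ct(row_id int, column1 text, column2 text);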

This sample JSON value can be dynamically unpivoted into col and val columns with the json_array_elements_text() and json_each() functions by this query:
SELECT json_array_elements_text(v)::json->>'column' AS col,
json_array_elements_text(v)::json->>'val' AS val
FROM tab t
CROSS JOIN json_each(jsval) as js(k,v)
but to pivot the results coming from the above query, the columns still have to be listed individually, depending on the number of columns wanted in the resulting query, for example by using conditional aggregation:
SELECT MAX(val) FILTER (WHERE col = 'column1') as column1,
MAX(val) FILTER (WHERE col = 'column2') as column2,
MAX(val) FILTER (WHERE col = 'column3') as column3
FROM
(
SELECT json_array_elements_text(v)::json->>'column' AS col,
json_array_elements_text(v)::json->>'val' AS val
FROM tab t
CROSS JOIN json_each(jsval) as js(k,v)
) q
Demo
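For reference, the tab table and jsval column assumed by these queries would correspond to a setup like:
create table tab (jsval json);
insert into tab values
  ('{"elements":[{"val":"value1", "column":"column1"}, {"val":"val2", "column":"column2"}]}');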

Based on @a_horse_with_no_name's proposal:
select
jsonb_path_query_first(raw_data_1thbase.data -> 'elements', '$[*] ? (@.column == "column1").val' ) #>> '{}' as column1,
jsonb_path_query_first(raw_data_1thbase.data -> 'elements', '$[*] ? (@.column == "column2").val' ) #>> '{}' as column2
from raw_data_1thbase;
worked for me.

Related

Athena/Presto find key with the max value in JSON object

I have a column in Athena (type string) with json like this:
{
"key1": 1.1,
"key2":2.2,
"key3": 3.3
}
How do I write a query which will return, for each row, the JSON key with the highest value (in this example key3) and the associated value (3.3)?
Note: I don't know the key names in advance (and there can be several).
You can cast your json as MAP(VARCHAR, INTEGER) and process it. For example (this uses the map_entries function to turn the map into an array of rows, the reduce array function, and the default ROW field naming convention):
WITH dataset AS (
SELECT *
FROM (VALUES
(JSON '{
"key1": 1.1,
"key2":2.2,
"key3": 3.3
}'),
(JSON '{
"key0": 1.1,
"key1":4.4,
"key2": 3.3
}')) AS t (json))
SELECT row.field0 as key, row.field1 as value
FROM
(SELECT reduce(
map_entries(CAST(json as MAP(VARCHAR, INTEGER))),
ROW (null, null),
(agg, curr) -> IF (agg.field1 > curr.field1, agg, curr),
s -> s) as row
FROM dataset)
Output:
key  | value
-----+------
key3 |     3
key1 |     4
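Note that the CAST to MAP(VARCHAR, INTEGER) truncates the decimal values, which is why the output above shows 3 and 4 rather than 3.3 and 4.4; casting to MAP(VARCHAR, DOUBLE) instead should preserve them.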
So I have found a way, but it seems very convoluted; I would appreciate it if anyone else has a better solution. Assuming there is a column called id, and the json is stored in a separate column:
with d as (
select id,
CAST(json_extract(json_col, '$') AS MAP(VARCHAR, VARCHAR)) as s
from TABLE_NAME
),
d2 as (
select *,
element_at(s, key) AS value
from d
cross join unnest(map_keys(s)) AS sx(key)
),
d3 as (
select id, key, value,
rank() over (partition by id order by value desc) as rnk
from d2
order by id, rnk
)
select id, key, value from d3 where rnk = 1
Basically: first cast the JSON object into a map; then unnest the map keys with a cross join, storing the corresponding value in a separate column; then compute the rank partitioned by id and ordered by value descending; and finally choose only the rows with rank = 1.
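A somewhat shorter variant along the same lines (a sketch, not tested on Athena) is to unnest the map into key/value pairs and let the max_by aggregate pick the key associated with the largest value:
with d as (
  select id,
         CAST(json_extract(json_col, '$') AS MAP(VARCHAR, DOUBLE)) as s
  from TABLE_NAME
)
select id,
       max_by(key, value) as key, -- the key belonging to the largest value
       max(value) as value
from d
cross join unnest(s) as u(key, value)
group by id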

Combine two json array as key-value in mysql and create one json object

I have two JSON array fields in MySQL like this:
["a", "b", "c"]
["apple", "banana", "coconut"]
Now I want to combine them into one JSON object like this:
{"a":"apple", "b":"banana", "c":"coconut"}
Is there any MySQL function for this?
I would approach this in a simple way.
Unnest the two JSON structures using JSON_TABLE().
Join the two tables together.
Construct the appropriate JSON objects and aggregate.
The following implements this logic. The first CTE extracts the keys. The second extracts the values, and finally these are combined:
WITH the_keys as (
SELECT j.*
FROM t CROSS JOIN
JSON_TABLE(t.jsdata1,
'$[*]'
columns (seqnum for ordinality, the_key varchar(255) path '$')
) j
),
the_values as (
SELECT j.*
FROM t CROSS JOIN
JSON_TABLE(t.jsdata2,
'$[*]'
columns (seqnum for ordinality, val varchar(255) path '$')
) j
)
select json_objectagg(the_keys.the_key, the_values.val)
from the_keys join
the_values
on the_keys.seqnum = the_values.seqnum;
Here is a db<>fiddle.
Note that this is quite generalizable (you can add more elements to the rows). You can readily adjust it to return multiple rows of data if you have key/value pairs on different rows, and it uses no deprecated functionality.
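For reference, a setup matching the query above (a single-row table t with the two arrays from the question):
CREATE TABLE t (jsdata1 JSON, jsdata2 JSON);
INSERT INTO t VALUES ('["a", "b", "c"]', '["apple", "banana", "coconut"]');
-- the query then returns {"a": "apple", "b": "banana", "c": "coconut"}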
You can extract each element by its index within the arrays with the JSON_EXTRACT() function, generating the row numbers with the help of a table from information_schema, and then aggregate all results with JSON_OBJECTAGG() applied to the subquery, such as
SELECT JSON_OBJECTAGG(Js1,Js2)
FROM
(
SELECT JSON_UNQUOTE(JSON_EXTRACT(jsdata1,CONCAT('$[',@rn+1,']'))) AS Js1,
JSON_UNQUOTE(JSON_EXTRACT(jsdata2,CONCAT('$[',@rn+1,']'))) AS Js2,
@rn := @rn + 1 AS rn
FROM tab AS t1
JOIN (SELECT @rn:=-1) AS r
JOIN information_schema.tables AS t2
WHERE @rn < JSON_LENGTH(jsdata1) - 1 -- redundant for MariaDB, but needed for MySQL
) AS j
where
'["a", "b", "c"]' is assumed to be the value of the column jsdata1 and
'["apple", "banana", "coconut"]' is assumed to be the value of the column jsdata2
within a table (tab) containing only one inserted row.
Demo
The basic way to do it using JSON functions:
select JSON_OBJECT(
JSON_UNQUOTE(JSON_EXTRACT(a, '$[0]')), JSON_EXTRACT(b, '$[0]'),
JSON_UNQUOTE(JSON_EXTRACT(a, '$[1]')), JSON_EXTRACT(b, '$[1]'),
JSON_UNQUOTE(JSON_EXTRACT(a, '$[2]')), JSON_EXTRACT(b, '$[2]')
) result from tbl;
SQL sandbox

WHERE x IN works with a single value, not with multiple on json

There's a hard-to-understand issue with querying a json field in MySQL. The data column is of type json.
The following query works perfectly fine
SELECT * FROM `someTable` WHERE data->'$.someData' in ('A')
However the following one returns nothing.
SELECT * FROM `someTable` WHERE data->'$.someData' in ('A','B')
Funnily enough this also works:
SELECT * FROM `someTable` WHERE data->'$.someData'='A' OR data->'$.someData'='B'
I'm clueless as to why this happens. I originally thought that WHERE x IN executed against a json value might be doing something like &&, but even if the values are ('A','A') it still returns nothing, which essentially shows that more than one value in WHERE x IN won't work.
SAMPLE DATA (any would do really)
id | data (json)
1 | {"someData":"A"}
2 | {"someData":"B"}
Too long for a comment...
This seems to be related to an optimisation MySQL performs when there is only one value in the IN expression (probably converting it to an a = b expression), which then ignores the JSON quoting. Strictly speaking,
SELECT *
FROM `someTable`
WHERE data->'$.someData' in ('A')
or
SELECT *
FROM `someTable`
WHERE data->'$.someData' = 'A'
should return no data because
SELECT data->'$.someData'
FROM someTable;
returns
"A"
"B"
which is not the same as A. You need to use JSON_UNQUOTE (or if you have MySQL 5.7.13 or later the ->> operator) to get the actual value of the someData key:
SELECT JSON_UNQUOTE(data->'$.someData') FROM someTable;
SELECT data->>'$.someData' FROM someTable;
which gives
A
B
which then works fine with an IN expression:
SELECT *
FROM `someTable`
WHERE JSON_UNQUOTE(data->'$.someData') in ('A','B')
-- or use WHERE data->>'$.someData' in ('A','B')
Output:
id | data
---+------------------
1  | {"someData":"A"}
2  | {"someData":"B"}
Demo on dbfiddle
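For reference, a setup reproducing the behaviour above (names taken from the question's sample data):
CREATE TABLE `someTable` (id INT, data JSON);
INSERT INTO `someTable` (id, data) VALUES
  (1, '{"someData": "A"}'),
  (2, '{"someData": "B"}');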
You could try using a join on a subquery instead of an IN clause:
SELECT *
FROM `someTable` s
INNER JOIN (
select 'A' col
union
select 'B'
) t ON t.col = s.data->>'$.someData'

Is it possible to do an `IN` query over a JSON array?

In SQL Server, is there a way to do an IN query over a JSON array?
eg.
There's a column foo which contains a json array
row1 -> {"foo":["a", "b"]}
row2 -> {"foo":["c", "a", "b"]}
I need to query the rows which have b in the json array.
JSON_QUERY can return the array, but there's no way to do something like
SELECT *
FROM table1
WHERE "b" in JSON_QUERY(foo)
A LIKE query will work, but is inefficient.
You can combine OPENJSON with JSON_QUERY and use CROSS APPLY to break the result down to the array-element level:
declare #tmp table (foo nvarchar(max))
insert into #tmp values
('{"foo":["a", "b"]}')
,('{"foo":["c", "a", "b"]}')
,('{"foo":["c", "a", "y"]}')
SELECT foo
FROM #tmp AS c
CROSS APPLY OPENJSON(JSON_QUERY(foo, '$.foo')) AS x
where x.[value]='b'
Sample output (the rows whose foo array contains 'b'):
foo
-------------------------
{"foo":["a", "b"]}
{"foo":["c", "a", "b"]}
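A slightly shorter variant (a sketch; OPENJSON also accepts a path argument directly, so the JSON_QUERY call can be dropped):
SELECT foo
FROM @tmp AS c
CROSS APPLY OPENJSON(foo, '$.foo') AS x
where x.[value]='b'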

Querying a JSON array of objects in Postgres

I have a postgres db with a json data field.
The json I have is an array of objects:
[{"name":"Mickey Mouse","age":10},{"name":"Donald Duck","age":5}]
I'm trying to return values for a specific key in a JSON array, so in the above example I'd like to return the values for name.
When I use the following query I just get a NULL value returned:
SELECT data->'name' AS name FROM json_test
I'm assuming this is because it's an array of objects? Is it possible to directly address the name key?
Ultimately what I need to do is return a count of every unique name; is this possible?
Thanks!
You have to unnest the array of JSON objects first using the json_array_elements function (or jsonb_array_elements if you have the jsonb data type); then you can access the values by specifying the key.
WITH json_test (col) AS (
values (json '[{"name":"Mickey Mouse","age":10},{"name":"Donald Duck","age":5}]')
)
SELECT
y.x->'name' "name"
FROM json_test jt,
LATERAL (SELECT json_array_elements(jt.col) x) y
-- outputs:
name
--------------
"Mickey Mouse"
"Donald Duck"
To get a count of unique names, it's a similar query to the above, except the count distinct aggregate function is applied to y.x->>'name':
WITH json_test (col) AS (
values (json '[{"name":"Mickey Mouse","age":10},{"name":"Donald Duck","age":5}]')
)
SELECT
COUNT( DISTINCT y.x->>'name') distinct_names
FROM json_test jt,
LATERAL (SELECT json_array_elements(jt.col) x) y
It is necessary to use ->> instead of -> as the former (->>) casts the extracted value as text, which supports equality comparison (needed for distinct count), whereas the latter (->) extracts the value as json, which does not support equality comparison.
Alternatively, convert the json to jsonb and use jsonb_array_elements. JSONB supports equality comparison, thus it is possible to use COUNT DISTINCT along with extraction via ->, i.e.
COUNT(DISTINCT (y.x::jsonb)->'name')
Updated answer for PostgreSQL versions 12+:
It is now possible to extract / unnest specific keys from a list of objects using jsonb path queries, so long as the field queried is jsonb and not json.
example:
WITH json_test (col) AS (
values (jsonb '[{"name":"Mickey Mouse","age":10},{"name":"Donald Duck","age":5}]')
)
SELECT jsonb_path_query(col, '$[*].name') "name"
FROM json_test
-- replaces this original snippet:
-- SELECT
-- y.x->'name' "name"
-- FROM json_test jt,
-- LATERAL (SELECT json_array_elements(jt.col) x) y
You can do it like this (the containment operator @> requires jsonb, so cast the column if it is json):
SELECT * FROM json_test WHERE (column_name::jsonb @> '[{"name": "Mickey Mouse"}]');
You can use jsonb_array_elements (when using jsonb) or json_array_elements (when using json) to expand the array elements.
For example:
WITH sample_data_array(arr) AS (
VALUES ('[{"name":"Mickey Mouse","age":10},{"name":"Donald Duck","age":5}]'::jsonb)
)
, sample_data_elements(elem) AS (
SELECT jsonb_array_elements(arr) FROM sample_data_array
)
SELECT elem->'name' AS extracted_name FROM sample_data_elements;
In this example, sample_data_elements is equivalent to a table with a single jsonb column called elem, with two rows (the two array elements in the initial data).
The result consists of two rows (one jsonb column, or of type text if you used ->>'name' instead):
extracted_name
----------------
"Mickey Mouse"
"Donald Duck"
(2 rows)
You should then be able to group and aggregate as usual to return the count of individual names, as sketched below.
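For instance, a minimal sketch of that final step, reusing the sample data from above:
WITH sample_data_array(arr) AS (
  VALUES ('[{"name":"Mickey Mouse","age":10},{"name":"Donald Duck","age":5}]'::jsonb)
), sample_data_elements(elem) AS (
  SELECT jsonb_array_elements(arr) FROM sample_data_array
)
SELECT elem->>'name' AS name, count(*) AS name_count
FROM sample_data_elements
GROUP BY elem->>'name';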