I have a column in Athena (type string) with json like this:
{
"key1": 1.1,
"key2":2.2,
"key3": 3.3
}
How do I write a query that returns, for each row, the JSON key with the highest value (key3 in this example) and the associated value (3.3)?
Note: I don't know the key names in advance (and there can be several).
You can cast your JSON as MAP(VARCHAR, DOUBLE) and process it (the values here are decimals, so casting to INTEGER would truncate them). For example (this uses the map_entries function to turn the map into an array of rows, the reduce array function, and relies on the default ROW field naming convention):
WITH dataset AS (
SELECT *
FROM (VALUES
(JSON '{
"key1": 1.1,
"key2":2.2,
"key3": 3.3
}'),
(JSON '{
"key0": 1.1,
"key1":4.4,
"key2": 3.3
}')) AS t (json))
SELECT row.field0 as key, row.field1 as value
FROM
(SELECT reduce(
map_entries(CAST(json as MAP(VARCHAR, DOUBLE))),
ROW (null, null),
(agg, curr) -> IF (agg.field1 > curr.field1, agg, curr),
s -> s) as row
FROM dataset)
Output:
 key  | value
------+-------
 key3 |   3.3
 key1 |   4.4
So I have found a way, but it seems very convoluted; I would appreciate it if anyone has a better solution. Assuming there is a column called id and the JSON is stored in a separate column:
with d as (
select id,
CAST(json_extract(json_col, '$') AS MAP(VARCHAR, DOUBLE)) as s
from TABLE_NAME
),
d2 as (
select *,
element_at(s, key) AS value
from d
cross join unnest(map_keys(s)) AS sx(key)
),
d3 as (
select id, key, value,
rank() over (partition by id order by value desc) as rnk
from d2
)
select id, key, value from d3 where rnk = 1
Basically: first cast the JSON object into a map, then unnest the map keys with a cross join and store each key's value in a separate column, then compute the rank partitioned by id and ordered by value descending, and finally keep only the rows with rank = 1.
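A shorter variant is also possible with the max_by() aggregate, which Presto/Athena provides; this is a sketch assuming a table my_table with columns id and json_col (both placeholder names):

SELECT id,
       max_by(key, value) AS key,
       max(value) AS value
FROM my_table
-- unnesting a MAP yields one row per entry, with key and value columns
CROSS JOIN UNNEST(CAST(json_extract(json_col, '$') AS MAP(VARCHAR, DOUBLE))) AS t(key, value)
GROUP BY id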
Let's say I try to insert some json data in my column like this:
[{"key": "a", "value": 1}, {"key":"a", "value": 20}]
The same key-value object could occur multiple times in the array. But there is only one array in the column.
I would like to have it so that only the first occurrence of the same key gets entered into the database, in this array.
So the end result would be
[{"key": "a", "value": 1}]
Either that, or a separate SQL UPDATE statement run after the insert to filter out all the duplicates.
Is that possible with MySQL 5.7 or 8?
My situation is similar to this question, but for MySQL.
Test this:
WITH cte AS (
SELECT test.id,
json_parse.*,
ROW_NUMBER() OVER (PARTITION BY test.id, json_parse.keyname ORDER BY json_parse.rowid) rn
FROM test
CROSS JOIN JSON_TABLE(val,
"$[*]" COLUMNS (rowid FOR ORDINALITY,
keyname VARCHAR(255) PATH "$.key",
keyvalue VARCHAR(255) PATH "$.value")) json_parse
)
SELECT id, JSON_ARRAY(JSON_OBJECTAGG(keyname, keyvalue)) output
FROM cte
WHERE rn = 1
GROUP BY id
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=de17680ab3f962359c7c2d40e26a5c8c
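The post-insert cleanup the question also asks about can reuse the same query as a derived table feeding an UPDATE. A sketch against the same test(id, val) table as in the fiddle (MySQL 8 only, since it relies on window functions):

UPDATE test
JOIN (
    SELECT id, JSON_ARRAY(JSON_OBJECTAGG(keyname, keyvalue)) output
    FROM (
        SELECT test.id,
               json_parse.*,
               ROW_NUMBER() OVER (PARTITION BY test.id, json_parse.keyname
                                  ORDER BY json_parse.rowid) rn
        FROM test
        CROSS JOIN JSON_TABLE(val,
            "$[*]" COLUMNS (rowid FOR ORDINALITY,
                            keyname VARCHAR(255) PATH "$.key",
                            keyvalue VARCHAR(255) PATH "$.value")) json_parse
    ) numbered
    WHERE rn = 1
    GROUP BY id
) dedup ON dedup.id = test.id  -- derived table is materialized, so reading test here is allowed
SET test.val = dedup.output;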
If the key names vary (are dynamic), first gather all key names present with the JSON_KEYS() function, then build the corresponding SQL text and execute it as a prepared statement, using the code above as a pattern.
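The prepared-statement mechanics look like this (a minimal skeleton only; in practice the string assigned to @sql would be assembled with CONCAT() from the key names that JSON_KEYS() returns):

SET @sql := 'SELECT id, val FROM test';  -- build this text dynamically instead
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;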
I have two JSON array fields in MySQL like this:
["a", "b", "c"]
["apple", "banana", "coconut"]
Now I want to combine them into one JSON object like this:
{"a":"apple", "b":"banana", "c":"coconut"}
Is there any MySQL function for this?
I would approach this in a simple way.
Unnest the two JSON structures using JSON_TABLE().
Join the two tables together.
Construct the appropriate JSON objects and aggregate.
The following implements this logic. The first CTE extracts the keys. The second extracts the values, and finally these are combined:
WITH the_keys as (
SELECT j.*
FROM t CROSS JOIN
JSON_TABLE(t.jsdata1,
'$[*]'
columns (seqnum for ordinality, the_key varchar(255) path '$')
) j
),
the_values as (
SELECT j.*
FROM t CROSS JOIN
JSON_TABLE(t.jsdata2,
'$[*]'
columns (seqnum for ordinality, val varchar(255) path '$')
) j
)
select json_objectagg(the_keys.the_key, the_values.val)
from the_keys join
the_values
on the_keys.seqnum = the_values.seqnum;
Here is a db<>fiddle.
Note that this is quite generalizable (you can add more elements to the arrays). You can readily adjust it to return multiple rows of data if you have key/value pairs on different rows, as shown below, and it uses no deprecated functionality.
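For instance, returning one combined object per row might look like this (a sketch that assumes t also carries an id column to group on):

WITH the_keys as (
      SELECT t.id, j.*
      FROM t CROSS JOIN
           JSON_TABLE(t.jsdata1,
                      '$[*]'
                      columns (seqnum for ordinality, the_key varchar(255) path '$')
                     ) j
     ),
     the_values as (
      SELECT t.id, j.*
      FROM t CROSS JOIN
           JSON_TABLE(t.jsdata2,
                      '$[*]'
                      columns (seqnum for ordinality, val varchar(255) path '$')
                     ) j
     )
select k.id, json_objectagg(k.the_key, v.val) as combined
from the_keys k join
     the_values v
     on k.id = v.id and k.seqnum = v.seqnum
group by k.id;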
You can extract each element by its index within the arrays with the JSON_EXTRACT() function, generating row numbers through a join against a table from information_schema, and then aggregate all the results returned from the subquery with JSON_OBJECTAGG(), such as:
SELECT JSON_OBJECTAGG(Js1,Js2)
FROM
(
SELECT JSON_UNQUOTE(JSON_EXTRACT(jsdata1,CONCAT('$[',@rn+1,']'))) AS Js1,
       JSON_UNQUOTE(JSON_EXTRACT(jsdata2,CONCAT('$[',@rn+1,']'))) AS Js2,
       @rn := @rn + 1 AS rn
FROM tab AS t1
JOIN (SELECT @rn := -1) AS r
JOIN information_schema.tables AS t2
WHERE @rn < JSON_LENGTH(jsdata1) - 1 -- redundant for MariaDB, but needed for MySQL
) AS j
where
'["a", "b", "c"]' is assumed to be the value of the column jsdata1, and
'["apple", "banana", "coconut"]' is assumed to be the value of the column jsdata2,
within a table (tab) containing only one inserted row.
Demo
The basic way is to use JSON functions directly, when the array length is known in advance:
select JSON_OBJECT(
JSON_UNQUOTE(JSON_EXTRACT(a, '$[0]')), JSON_EXTRACT(b, '$[0]'),
JSON_UNQUOTE(JSON_EXTRACT(a, '$[1]')), JSON_EXTRACT(b, '$[1]'),
JSON_UNQUOTE(JSON_EXTRACT(a, '$[2]')), JSON_EXTRACT(b, '$[2]')
) result from tbl;
SQL sandbox
I have a table in which one column contains JSON array data. In some rows this JSON is very big (with 10,000+ JSON objects), like below. Is there any way to select just the first 250 objects from the array?
[
{
"product":"Vegetable",
"name":"Potato",
"price":"$60.00"
},
{
"product":"Fruit",
"name":"Mango",
"price":"$3.30"
},
{
"product":"Milk",
"name":"Milk",
"price":"$1.08"
},
.....10,000
]
Well, I investigated the question and found that one can use json_query to select single entries from the JSON.
CREATE TABLE json_table ( JSON varchar(1024) NOT NULL , constraint CK_JSON_IS_JSON check (JSON is json));
insert into json_table (JSON) values ('[ { "product":"Vegetable", "name":"Potato", "price":"$60.00" }, { "product":"Fruit", "name":"Mango", "price":"$3.30" }, { "product":"Milk", "name":"Milk", "price":"$1.08" }]');
select json_query(JSON, '$[0]'),
json_query(JSON, '$[1]'),
json_query(JSON, '$[2]'),
json_query(JSON, '$[3]')
from json_table;
This selects entries 0 to 3, with 3 not being found and being NULL.
You could probably stitch together a database stored procedure to return a list of the first n entries in the JSON.
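Alternatively, if your Oracle version accepts range subscripts in SQL/JSON path expressions, the first 250 entries can be sliced in a single call (a sketch against the json_table table above; verify the syntax on your version):

-- WITH WRAPPER collects the matched entries back into one JSON array
select json_query(JSON, '$[0 to 249]' WITH WRAPPER) from json_table;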
An analytic function such as ROW_NUMBER() can be used within a subquery to apply the restriction (ordered by the array ordinality, so that "first 250" is well defined), and then a JSON_ARRAYAGG() and JSON_OBJECT() combination gets back the reduced array:
SELECT JSON_ARRAYAGG(
        JSON_OBJECT('product' VALUE product,
                    'name' VALUE name,
                    'price' VALUE price)
        ORDER BY rn ) AS "Result"
FROM
(
  SELECT t.*, ROW_NUMBER() OVER (ORDER BY t.idx) AS rn
  FROM tab
  CROSS JOIN
       JSON_TABLE(jsdata, '$[*]' COLUMNS (
                  idx FOR ORDINALITY,
                  product VARCHAR(100) PATH '$.product',
                  name VARCHAR(100) PATH '$.name',
                  price VARCHAR(100) PATH '$.price'
                 )
       ) t
)
WHERE rn <= 250
Demo
Given a jsonb column called pairs with data such as the following in a single record:
{ "foo": 1, "bar": 2 }
How do I query for records where a given value is one of the values in the above field?
For example, query for 1 would match the above record.
Query for 3 would not match.
PostgreSQL 9.5
In Postgres 9.5 use the function jsonb_each_text() in a lateral join:
with my_table(pairs) as (
values
('{ "foo": 1, "bar": 2 }'::jsonb)
)
select t.*
from my_table t
cross join jsonb_each_text(pairs)
where value = '1';
Upgrade to Postgres 12 and use json path functions, e.g.:
select *
from my_table
where jsonb_path_exists(pairs, '$.* ? (@ == 1)')
Read more: JSON Functions and Operators.
I have a postgres db with a json data field.
The json I have is an array of objects:
[{"name":"Mickey Mouse","age":10},{"name":"Donald Duck","age":5}]
I'm trying to return values for a specific key in a JSON array, so in the above example I'd like to return the values for name.
When I use the following query I just get a NULL value returned:
SELECT data->'name' AS name FROM json_test
I'm assuming this is because it's an array of objects? Is it possible to directly address the name key?
Ultimately what I need to do is to return a count of every unique name, is this possible?
Thanks!
You have to unnest the array of JSON objects first using the function json_array_elements (or jsonb_array_elements if you have the jsonb data type); then you can access the values by specifying the key.
WITH json_test (col) AS (
values (json '[{"name":"Mickey Mouse","age":10},{"name":"Donald Duck","age":5}]')
)
SELECT
y.x->'name' "name"
FROM json_test jt,
LATERAL (SELECT json_array_elements(jt.col) x) y
-- outputs:
name
--------------
"Mickey Mouse"
"Donald Duck"
To get a count of unique names, it's a similar query to the above, except the count distinct aggregate function is applied to y.x->>'name':
WITH json_test (col) AS (
values (json '[{"name":"Mickey Mouse","age":10},{"name":"Donald Duck","age":5}]')
)
SELECT
COUNT( DISTINCT y.x->>'name') distinct_names
FROM json_test jt,
LATERAL (SELECT json_array_elements(jt.col) x) y
It is necessary to use ->> instead of -> as the former (->>) casts the extracted value as text, which supports equality comparison (needed for distinct count), whereas the latter (->) extracts the value as json, which does not support equality comparison.
Alternatively, convert the json to jsonb and use jsonb_array_elements. JSONB supports the equality comparison, thus it is possible to use COUNT DISTINCT along with extraction via ->, i.e.
COUNT(DISTINCT (y.x::jsonb)->'name')
Updated answer for PostgreSQL versions 12+
It is now possible to extract / unnest specific keys from a list of objects using jsonb path queries, so long as the field queried is jsonb and not json.
example:
WITH json_test (col) AS (
values (jsonb '[{"name":"Mickey Mouse","age":10},{"name":"Donald Duck","age":5}]')
)
SELECT jsonb_path_query(col, '$[*].name') "name"
FROM json_test
-- replaces this original snippet:
-- SELECT
-- y.x->'name' "name"
-- FROM json_test jt,
-- LATERAL (SELECT json_array_elements(jt.col) x) y
You can also test containment directly with the @> operator (the column must be of type jsonb, or be cast to it):
SELECT * FROM json_test WHERE column_name::jsonb @> '[{"name": "Mickey Mouse"}]';
You can use jsonb_array_elements (when using jsonb) or json_array_elements (when using json) to expand the array elements.
For example:
WITH sample_data_array(arr) AS (
VALUES ('[{"name":"Mickey Mouse","age":10},{"name":"Donald Duck","age":5}]'::jsonb)
)
, sample_data_elements(elem) AS (
SELECT jsonb_array_elements(arr) FROM sample_data_array
)
SELECT elem->'name' AS extracted_name FROM sample_data_elements;
In this example, sample_data_elements is equivalent to a table with a single jsonb column called elem, with two rows (the two array elements in the initial data).
The result consists of two rows (one jsonb column, or of type text if you used ->>'name' instead):
extracted_name
----------------
"Mickey Mouse"
"Donald Duck"
(2 rows)
You should then be able to group and aggregate as usual to return the count of individual names.
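For example, continuing the CTEs above, a count per name (COUNT(DISTINCT elem->>'name') would instead give the number of unique names):

WITH sample_data_array(arr) AS (
    VALUES ('[{"name":"Mickey Mouse","age":10},{"name":"Donald Duck","age":5}]'::jsonb)
)
, sample_data_elements(elem) AS (
    SELECT jsonb_array_elements(arr) FROM sample_data_array
)
-- ->> extracts text, which supports the grouping comparison
SELECT elem->>'name' AS name, COUNT(*) AS name_count
FROM sample_data_elements
GROUP BY elem->>'name';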