oracle - fetch first n json objects from json array - json

I have a table in which one column holds JSON array data. In some rows the array is very large (10,000+ JSON objects), like the one below. Is there a way to select just the first 250 objects from the array?
[
{
"product":"Vegetable",
"name":"Potato",
"price":"$60.00"
},
{
"product":"Fruit",
"name":"Mango",
"price":"$3.30"
},
{
"product":"Milk",
"name":"Milk",
"price":"$1.08"
},
.....10,000
]

Well I investigated the question, and I found one can use json_query to select single entries from the JSON.
CREATE TABLE json_table ( JSON varchar(1024) NOT NULL , constraint CK_JSON_IS_JSON check (JSON is json));
insert into json_table (JSON) values ('[ { "product":"Vegetable", "name":"Potato", "price":"$60.00" }, { "product":"Fruit", "name":"Mango", "price":"$3.30" }, { "product":"Milk", "name":"Milk", "price":"$1.08" }]');
select json_query(JSON, '$[0]'),
json_query(JSON, '$[1]'),
json_query(JSON, '$[2]'),
json_query(JSON, '$[3]')
from json_table;
This selects entries 0 to 3; entry 3 does not exist, so it comes back as NULL.
You could probably put together a stored procedure that returns a list of the first n entries in the JSON.

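An analytic function such as ROW_NUMBER() can be used within the subquery to number the array elements, and a combination of JSON_ARRAYAGG() and JSON_OBJECT() can then rebuild the reduced array. Note that ORDER BY 1 is a constant here, so the numbering simply follows whatever order JSON_TABLE returns the rows in; a FOR ORDINALITY column would make the array position explicit: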
SELECT JSON_ARRAYAGG(
JSON_OBJECT('product' VALUE product,
'name' VALUE name,
'price' VALUE price) ) AS "Result"
FROM
(
SELECT t.*, ROW_NUMBER() OVER (ORDER BY 1) AS rn
FROM tab
CROSS JOIN
JSON_TABLE(jsdata, '$[*]' COLUMNS (
product VARCHAR(100) PATH '$.product',
name VARCHAR(100) PATH '$.name',
price VARCHAR(100) PATH '$.price'
)
) t
)
WHERE rn <= 250
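If trimming the array in the application is acceptable, the same "first n elements" logic can be sketched outside the database. This is only an illustration in Python, not part of the answers above; the function name first_n_objects and the sample variable are made up for the example.

```python
import json

def first_n_objects(json_text: str, n: int) -> str:
    """Return a JSON array string holding only the first n elements."""
    items = json.loads(json_text)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array")
    # Slicing past the end is safe: a short array is returned unchanged.
    return json.dumps(items[:n])

# Hypothetical sample mirroring the three objects from the question.
sample = (
    '[{"product":"Vegetable","name":"Potato","price":"$60.00"},'
    '{"product":"Fruit","name":"Mango","price":"$3.30"},'
    '{"product":"Milk","name":"Milk","price":"$1.08"}]'
)

print(first_n_objects(sample, 2))
```

The whole array still has to be fetched and parsed, so for very large documents the in-database approach is likely the better fit.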

Related

Athena/Presto find key with the max value in JSON object

I have a column in Athena (type string) with json like this:
{
"key1": 1.1,
"key2":2.2,
"key3": 3.3
}
How do I write a query that returns, for each row, the JSON key with the highest value (key3 in this example) and the associated value (3.3)?
Note: I don't know the key names in advance (and there can be several).
You can cast your JSON to MAP(VARCHAR, DOUBLE) and process it. The example below uses the map_entries function to turn the map into an array of rows, reduces that array to the entry with the largest value, and relies on the default row field naming convention (field0, field1). Casting the values to DOUBLE keeps the fractional part, which a cast to INTEGER would lose:
WITH dataset AS (
SELECT *
FROM (VALUES
(JSON '{
"key1": 1.1,
"key2":2.2,
"key3": 3.3
}'),
(JSON '{
"key0": 1.1,
"key1":4.4,
"key2": 3.3
}')) AS t (json))
SELECT row.field0 as key, row.field1 as value
FROM
(SELECT reduce(
map_entries(CAST(json as MAP(VARCHAR, DOUBLE))),
ROW (null, null),
(agg, curr) -> IF (agg.field1 > curr.field1, agg, curr),
s -> s) as row
FROM dataset)
Output:
key   value
key3  3.3
key1  4.4
So I have found a way, but it seems very convoluted; I would appreciate it if anyone has a better solution. Assuming there is a column called id, and the JSON is stored in a separate column (json_col below):
with d as (
select id,
CAST(json_extract(json_col, '$') AS MAP(VARCHAR, DOUBLE)) as s
from TABLE_NAME
),
d2 as (
select *,
element_at(s, key) AS value
from d
cross join unnest(map_keys(s)) AS sx(key)
),
d3 as (
select id, key, value,
rank() over (partition by id order by value desc) as rnk
from d2
order by id, rnk
)
select id, key, value from d3 where rnk = 1
Basically: first cast the JSON object to a map (with DOUBLE values, so that they compare numerically rather than as text), then unnest the map keys with a cross join and look up each key's value in a separate column, then compute the rank partitioned by id and ordered by value descending, and finally keep only the rows with rank = 1. The rank column is named rnk because order is a reserved word.
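To sanity-check what either query should return for a given row, the "key with the highest value" logic can be sketched in a few lines of Python. This is an illustration only; the function name max_entry is made up here and the inputs are the two sample objects from the first answer.

```python
import json

def max_entry(json_text: str):
    """Return (key, value) for the largest value in a flat JSON object.

    Returns None for an empty object. Ties go to the first key encountered.
    """
    obj = json.loads(json_text)
    if not obj:
        return None
    key = max(obj, key=obj.get)
    return key, obj[key]

print(max_entry('{"key1": 1.1, "key2": 2.2, "key3": 3.3}'))
print(max_entry('{"key0": 1.1, "key1": 4.4, "key2": 3.3}'))
```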

MYSQL Json remove objects from array based on duplicate field

Let's say I try to insert some json data in my column like this:
[{"key": "a", "value": 1}, {"key":"a", "value": 20}]
The same key-value object could occur multiple times in the array. But there is only one array in the column.
I would like only the first occurrence of each key to be stored in the database, in this array.
So the end result would be
[{"key": "a", "value": 1}]
Either that, or a separate SQL UPDATE statement run after the insert to filter out all the duplicates.
Is that possible with MySQL 5.7 or 8?
My situation is similar to this question, but for MySQL.
Test this:
WITH cte AS (
SELECT test.id,
json_parse.*,
ROW_NUMBER() OVER (PARTITION BY test.id, json_parse.keyname ORDER BY json_parse.rowid) rn
FROM test
CROSS JOIN JSON_TABLE(val,
"$[*]" COLUMNS (rowid FOR ORDINALITY,
keyname VARCHAR(255) PATH "$.key",
keyvalue VARCHAR(255) PATH "$.value")) json_parse
)
SELECT id, JSON_ARRAY(JSON_OBJECTAGG(keyname, keyvalue)) output
FROM cte
WHERE rn = 1
GROUP BY id
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=de17680ab3f962359c7c2d40e26a5c8c
If the key names may vary (are dynamic), gather all key names present (with the JSON_KEYS() function), build the corresponding SQL text, and execute it as a prepared statement, using the code above as a pattern.
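If the duplicates can instead be filtered before the insert, the "keep the first occurrence of each key" rule is short to express in application code. A minimal sketch in Python, assuming every element has a "key" field; the function name drop_duplicate_keys is hypothetical.

```python
import json

def drop_duplicate_keys(json_text: str, field: str = "key") -> list:
    """Keep only the first object seen for each distinct value of `field`."""
    seen = set()
    kept = []
    for obj in json.loads(json_text):
        marker = obj[field]
        if marker not in seen:
            seen.add(marker)
            kept.append(obj)
    return kept

raw = '[{"key": "a", "value": 1}, {"key": "a", "value": 20}]'
print(json.dumps(drop_duplicate_keys(raw)))
```

Unlike the query above, this keeps each surviving object in its original {"key": ..., "value": ...} shape.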

Combine two json array as key-value in mysql and create one json object

I have two JSON array fields in MySQL like this:
["a", "b", "c"]
["apple", "banana", "coconut"]
Now I want to combine them into one JSON object like this:
{"a":"apple", "b":"banana", "c":"coconut"}
Is there any MySQL function for this?
I would approach this in a simple way.
Unnest the two JSON structures using JSON_TABLE().
Join the two tables together.
Construct the appropriate JSON objects and aggregate.
The following implements this logic. The first CTE extracts the keys. The second extracts the values, and finally these are combined:
WITH the_keys as (
SELECT j.*
FROM t CROSS JOIN
JSON_TABLE(t.jsdata1,
'$[*]'
columns (seqnum for ordinality, the_key varchar(255) path '$')
) j
),
the_values as (
SELECT j.*
FROM t CROSS JOIN
JSON_TABLE(t.jsdata2,
'$[*]'
columns (seqnum for ordinality, val varchar(255) path '$')
) j
)
select json_objectagg(the_keys.the_key, the_values.val)
from the_keys join
the_values
on the_keys.seqnum = the_values.seqnum;
Note that this is quite generalizable (you can add more elements to the rows). You can readily adjust it to return multiple rows of data if you have key/value pairs on different rows, and it uses no deprecated functionality.
Alternatively, you can extract each element by its index with JSON_EXTRACT(), generating the indexes with a user variable while joining to a table from information_schema, and then aggregate the results of the subquery with JSON_OBJECTAGG(), such as
SELECT JSON_OBJECTAGG(Js1,Js2)
FROM
(
SELECT JSON_UNQUOTE(JSON_EXTRACT(jsdata1,CONCAT('$[',@rn+1,']'))) AS Js1,
JSON_UNQUOTE(JSON_EXTRACT(jsdata2,CONCAT('$[',@rn+1,']'))) AS Js2,
@rn := @rn + 1 AS rn
FROM tab AS t1
JOIN (SELECT @rn:=-1) AS r
JOIN information_schema.tables AS t2
-- WHERE @rn < JSON_LENGTH(jsdata1) - 1 -- redundant on MariaDB, but needed on MySQL (uncomment there)
) AS j
where
'["a", "b", "c"]' is assumed to be the value of the column jsdata1 and
'["apple", "banana", "coconut"]' is assumed to be the value of the column jsdata2
within a table(tab) containing only one row inserted.
The most basic way is to use the JSON functions directly, addressing each index explicitly:
select JSON_OBJECT(
JSON_UNQUOTE(JSON_EXTRACT(a, '$[0]')), JSON_EXTRACT(b, '$[0]'),
JSON_UNQUOTE(JSON_EXTRACT(a, '$[1]')), JSON_EXTRACT(b, '$[1]'),
JSON_UNQUOTE(JSON_EXTRACT(a, '$[2]')), JSON_EXTRACT(b, '$[2]')
) result from tbl;
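All three answers pair the two arrays by position. For comparison, or to verify the expected result, the same pairing can be sketched in Python; the function name combine is made up for this illustration, and it assumes both arrays have the same length.

```python
import json

def combine(keys_json: str, values_json: str) -> dict:
    """Pair the i-th key with the i-th value, like joining on ordinality."""
    keys = json.loads(keys_json)
    values = json.loads(values_json)
    if len(keys) != len(values):
        raise ValueError("arrays must have the same length")
    return dict(zip(keys, values))

print(json.dumps(combine('["a", "b", "c"]', '["apple", "banana", "coconut"]')))
```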

Facing Problem in PostgresSQL query for JSON data

I have the following data:
{
"City": "Fontana",
"Timezone": "America/Los_Angeles",
"Longitude": "-117.4864123",
"Timestamp": "2020-07-15T12:13:00-07:00",
"refs": ["123", "456", "789"], "tZone": "PPP"
}
The above data is stored in the analytics.col_json column.
I have the following table structure:
CREATE TABLE analytics
(
id bigint NOT NULL,
col_typ character varying(255) COLLATE pg_catalog."default",
col_json json,
cre_dte timestamp without time zone,
CONSTRAINT clbk_logs_pkey PRIMARY KEY (id)
);
There are n rows of such records.
I am trying to fetch records based on 'refs' by passing a list of strings; that is, I have a separate list of values to filter my table against.
My query is as follows:
select * FROM public.analytics
where col_json-> 'refs' in (
'123',
'pqa',
'bhu',
'qwerty'
);
But the above query is not working for me.
The more advanced JSON capabilities are only available when using the jsonb type, so you will have to cast your column every time you want to do something non-trivial. It would be better to define the column as jsonb in the long run.
You can use the ?| operator, which checks whether any of the given strings appears as an element of the array:
select a.*
from analytics a
where col_json::jsonb -> 'refs' ?| array['123','pqa','bhu','qwerty'];
Note that this only works if all array elements are strings. It does not work with numbers: if the JSON contained "refs": [123,456], there would be no match.
Alternatively you can use an EXISTS condition with a sub-query:
select a.*
from analytics a
where exists (select *
from json_array_elements_text(a.col_json -> 'refs') as x(item)
where x.item in ('123','pqa','bhu','qwerty'));
If you want refs to contain all of the values in your list, you can use the contains operator @>:
select a.*
from analytics a
where a.col_json::jsonb -> 'refs' @> '["123", "456"]';
Or alternatively: where a.col_json::jsonb @> '{"refs": ["123", "456"]}'
The above will only return rows where both values are contained in the refs array.
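The difference between the "any of" and "all of" conditions can be made concrete with a small Python sketch. It only mirrors the matching rules described above and is not how PostgreSQL implements them; the function names has_any and has_all are hypothetical.

```python
import json

def has_any(refs: list, wanted: list) -> bool:
    """True if at least one wanted value is in refs (the 'any of' case)."""
    return not set(refs).isdisjoint(wanted)

def has_all(refs: list, wanted: list) -> bool:
    """True if every wanted value is in refs (the 'contains all' case)."""
    return set(wanted) <= set(refs)

doc = json.loads('{"City": "Fontana", "refs": ["123", "456", "789"], "tZone": "PPP"}')

print(has_any(doc["refs"], ["123", "pqa", "bhu", "qwerty"]))  # one value matches
print(has_all(doc["refs"], ["123", "456"]))                   # both values present
print(has_all(doc["refs"], ["123", "pqa"]))                   # "pqa" is missing
```

As with the strings-only caveat above, the number 123 and the string "123" are different values here too.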

Querying a JSON array of objects in Postgres

I have a postgres db with a json data field.
The json I have is an array of objects:
[{"name":"Mickey Mouse","age":10},{"name":"Donald Duck","age":5}]
I'm trying to return values for a specific key in a JSON array, so in the above example I'd like to return the values for name.
When I use the following query I just get a NULL value returned:
SELECT data->'name' AS name FROM json_test
I'm assuming this is because it's an array of objects. Is it possible to directly address the name key?
Ultimately, what I need to do is return a count of every unique name. Is this possible?
Thanks!
You have to unnest the array of JSON objects first, using json_array_elements (or jsonb_array_elements if the column has the jsonb data type); then you can access the values by specifying the key.
WITH json_test (col) AS (
values (json '[{"name":"Mickey Mouse","age":10},{"name":"Donald Duck","age":5}]')
)
SELECT
y.x->'name' "name"
FROM json_test jt,
LATERAL (SELECT json_array_elements(jt.col) x) y
-- outputs:
name
--------------
"Mickey Mouse"
"Donald Duck"
To get a count of unique names, the query is similar to the one above, except that the COUNT(DISTINCT ...) aggregate is applied to y.x->>'name':
WITH json_test (col) AS (
values (json '[{"name":"Mickey Mouse","age":10},{"name":"Donald Duck","age":5}]')
)
SELECT
COUNT( DISTINCT y.x->>'name') distinct_names
FROM json_test jt,
LATERAL (SELECT json_array_elements(jt.col) x) y
It is necessary to use ->> instead of -> as the former (->>) casts the extracted value as text, which supports equality comparison (needed for distinct count), whereas the latter (->) extracts the value as json, which does not support equality comparison.
Alternatively, cast the json to jsonb and use jsonb_array_elements. jsonb supports equality comparison, so COUNT(DISTINCT ...) can be used together with extraction via ->, i.e.
COUNT(DISTINCT (y.x::jsonb)->'name')
Updated answer for PostgreSQL versions 12+
It is now possible to extract / unnest specific keys from a list of objects using jsonb path queries, so long as the field queried is jsonb and not json.
Example:
WITH json_test (col) AS (
values (jsonb '[{"name":"Mickey Mouse","age":10},{"name":"Donald Duck","age":5}]')
)
SELECT jsonb_path_query(col, '$[*].name') "name"
FROM json_test
-- replaces this original snippet:
-- SELECT
-- y.x->'name' "name"
-- FROM json_test jt,
-- LATERAL (SELECT json_array_elements(jt.col) x) y
Do it like this (the containment operator @> requires the column to be of type jsonb):
SELECT * FROM json_test WHERE (column_name @> '[{"name": "Mickey Mouse"}]');
You can use jsonb_array_elements (when using jsonb) or json_array_elements (when using json) to expand the array elements.
For example:
WITH sample_data_array(arr) AS (
VALUES ('[{"name":"Mickey Mouse","age":10},{"name":"Donald Duck","age":5}]'::jsonb)
)
, sample_data_elements(elem) AS (
SELECT jsonb_array_elements(arr) FROM sample_data_array
)
SELECT elem->'name' AS extracted_name FROM sample_data_elements;
In this example, sample_data_elements is equivalent to a table with a single jsonb column called elem, with two rows (the two array elements in the initial data).
The result consists of two rows (one jsonb column, or of type text if you used ->>'name' instead):
extracted_name
----------------
"Mickey Mouse"
"Donald Duck"
(2 rows)
You should then be able to group and aggregate as usual to return the count of individual names.
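As a cross-check for the grouped result, the per-name count can be sketched in Python with collections.Counter. This is an illustration only; the function name name_counts is made up, and objects without a "name" key are skipped by assumption.

```python
import json
from collections import Counter

def name_counts(json_text: str) -> Counter:
    """Count how often each name occurs in a JSON array of objects."""
    return Counter(
        obj["name"] for obj in json.loads(json_text) if "name" in obj
    )

data = '[{"name":"Mickey Mouse","age":10},{"name":"Donald Duck","age":5}]'
counts = name_counts(data)
print(dict(counts))   # count per name
print(len(counts))    # number of distinct names
```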