PostgreSQL JSON aggregation issue

How can I use a Postgres aggregation function to merge rows from one table into an array field of a parent JSON object?
What I have: a Sector table and a Project table. What I need: a result where each sector row embeds its projects.
My SQL query:
select row_to_json(t)
from (
select id, data,
(
select array_to_json(array_agg(row_to_json(p)))
from (
select id, data
from public."Project"
where (s.data ->> 'projectId') :: UUID = id
) p
) as projects
from public."Sector" s
) t;
It doesn't work: projects is always null. What I need is to unwind the data field and join each projectId inside data against the Project table, like $unwind and $lookup in MongoDB.
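One possible fix, assuming data on "Sector" is a jsonb array of objects that each carry a projectId key (an assumption; the question does not show the column's shape), is to unwind the array with jsonb_array_elements() before joining, which plays the role of MongoDB's $unwind/$lookup:

```sql
-- Sketch only: assumes "Sector".data is a jsonb array such as
-- [{"projectId": "..."}, {"projectId": "..."}]
select row_to_json(t)
from (
  select s.id, s.data,
    (
      select coalesce(json_agg(row_to_json(p)), '[]'::json)
      from jsonb_array_elements(s.data) as elem   -- unwind the array ($unwind)
      join public."Project" p                     -- match each element ($lookup)
        on p.id = (elem ->> 'projectId')::uuid
    ) as projects
  from public."Sector" s
) t;
```

The coalesce() also replaces the null you are seeing with an empty JSON array for sectors without projects.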

Related

How Can I Iterate over each element in JSON array and join with rows in SQL SERVER

I have a table named Employee and I want to pivot its JSON column into a new table.
The pivot has to be dynamic, because the number of attributes in the JSON data can change.
How can I do this dynamically in SQL Server 2017?
You can use OPENJSON with a table definition for this.
You only need to use REPLACE first to make the stored value valid JSON:
SELECT
e.id,
e.name,
j.score,
j.point
FROM Employee AS e
CROSS APPLY OPENJSON(REPLACE(REPLACE(e.info, 'point', '"point"'), 'score', '"score"'))
WITH (score int, point int) AS j;
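If the attribute names are not known in advance, OPENJSON without a WITH clause returns generic key/value rows, which can feed a dynamic pivot; a sketch under the same REPLACE assumption:

```sql
-- One row per JSON attribute, regardless of how many attributes exist:
SELECT e.id
     , e.name
     , j.[key]      -- attribute name, e.g. 'score' or 'point'
     , j.[value]    -- attribute value as text
FROM Employee AS e
CROSS APPLY OPENJSON(REPLACE(REPLACE(e.info, 'point', '"point"'), 'score', '"score"')) AS j;
```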

Get bottom-up nested json for query in postgresql

Given the following table, I want to find a category by ID and get a JSON object that nests each ancestor row under a parent attribute. If I look up category ID 999, I would like the following JSON structure.
How can I achieve this?
{
  "id": 999,
  "name": "Sprinting",
  "slug": "sprinting",
  "description": "sprinting is fast running",
  "parent": {
    "id": 2,
    "name": "Running",
    "slug": "running",
    "description": "All plans related to running.",
    "parent": {
      "id": 1,
      "name": "Sport",
      "slug": "sport",
      "description": null
    }
  }
}
CREATE TABLE public.categories (
id integer NOT NULL,
name text NOT NULL,
description text,
slug text NOT NULL,
parent_id integer
);
INSERT INTO public.categories (id, name, description, slug, parent_id) VALUES (1, 'Sport', NULL, 'sport', NULL);
INSERT INTO public.categories (id, name, description, slug, parent_id) VALUES (2, 'Running', 'All plans related to running.', 'running', 1);
INSERT INTO public.categories (id, name, description, slug, parent_id) VALUES (999, 'Sprinting', 'sprinting is fast running', 'sprinting', 2);
demo:db<>fiddle
(Explanation below)
WITH RECURSIVE hierarchy AS (
SELECT id, parent_id
FROM categories
WHERE id = 999
UNION
SELECT
c.id, c.parent_id
FROM categories c
JOIN hierarchy h ON h.parent_id = c.id
),
jsonbuilder AS (
SELECT
c.id,
h.parent_id,
jsonb_build_object('id', c.id, 'name', c.name, 'description', c.description, 'slug', c.slug) as jsondata
FROM hierarchy h
JOIN categories c ON c.id = h.id
WHERE h.parent_id IS NULL
UNION
SELECT
c.id,
h.parent_id,
jsonb_build_object('id', c.id, 'name', c.name, 'description', c.description, 'slug', c.slug, 'parent', j.jsondata)
FROM hierarchy h
JOIN categories c ON c.id = h.id
JOIN jsonbuilder j ON j.id = h.parent_id
)
SELECT
jsondata
FROM jsonbuilder
WHERE id = 999
Generally you need a recursive query to create nested JSON objects. The naive approach is:
Get record with id = 999, create a JSON object
Get the record whose id equals the parent_id of record 999 (id = 2), build a JSON object, and add it as the parent attribute to the previous object.
Repeat step 2 until parent is NULL
Unfortunately I saw no simple way to add a nested parent this way: each step nests the JSON one level deeper. There surely is a way, e.g. storing the path of parents and calling jsonb_set() on it every time, but it is cumbersome.
On the other hand, it is much simpler to embed the JSON object built so far into a new one. In other words, the approach is to build the JSON from the deepest level upward. To do that you need the parent path as well, but instead of creating and storing it while building the JSON objects, you can compute it first with a separate recursive query:
WITH RECURSIVE hierarchy AS (
SELECT id, parent_id
FROM categories
WHERE id = 999
UNION
SELECT
c.id, c.parent_id
FROM categories c
JOIN hierarchy h ON h.parent_id = c.id
)
SELECT * FROM hierarchy
This fetches the record with id = 999 together with its parent_id; then the parent's record with its id and parent_id; and so on until parent_id is NULL.
This yields:
id | parent_id
--: | --------:
999 | 2
2 | 1
1 | null
Now we have a simple mapping list that shows the traversal path. What is the difference from our original data? If the data contained two or more children for the record with id = 1, we would not know which child to take to eventually reach child 999. This result, however, lists only the ancestor relations and returns no siblings.
Well having this, we are able to traverse the tree from the topmost element which can be embedded at the deepest level:
Fetch the record which has no parent. Create a JSON object from its data.
Fetch the child of the previous record. Create a JSON object from its data and embed the previous JSON data as parent.
Continue until there is no child.
How does it work?
This query uses recursive CTEs. The first part is the initial query, the first record, so to speak. The second part, after UNION, is the recursive part, which references the CTE itself; this reference always points to the previous iteration.
The JSON part simply creates a JSON object with jsonb_build_object(), which takes an arbitrary number of arguments. So we can use the current record's data and, for the parent attribute, the JSON data already built in the previous iteration.

In a SQL Server table how do I filter records based on JSON search on a column having JSON values

I am facing a challenge while filtering records in a SQL Server 2017 table which has a VARCHAR column having JSON type values:
Sample table rows with JSON column values:
Row # 1. {"Department":["QA"]}
Row # 2. {"Department":["DEV","QA"]}
Row # 3. {"Group":["Group 2","Group 12"],"Cluster":["Cluster 11"],"Vertical":["XYZ"],"Department":["QAT"]}
Row # 4. {"Group":["Group 20"],"Cluster":["Cluster 11"],"Vertical":["XYZ"],"Department":["QAT"]}
Now I need to filter records from this table based on an input parameter which can be in the following format:
Sample JSON input parameter to query:
1. `'{"Department":["QA"]}'` -> This should return Row # 1 as well as Row # 2.
2. `'{"Group":["Group 2"]}'` -> This should return only Row # 3.
So the search should be like if the column value contains "any available json tag with any matching value" then return those matching records.
Note - this is exactly what the PostgreSQL jsonb containment operator does, as shown below:
PostgreSQL filter clause:
TableName.JSONColumnName @> '{"Department":["QA"]}'::jsonb
By researching on internet I found OPENJSON capability that is available in SQL Server which works as below.
OPENJSON sample example:
SELECT * FROM
tbl_Name UA
CROSS APPLY OPENJSON(UA.JSONColumnTags)
WITH ([Department] NVARCHAR(500) '$.Department', [Market] NVARCHAR(300) '$.Market', [Group] NVARCHAR(300) '$.Group'
) AS OT
WHERE
OT.Department in ('X','Y','Z')
and OT.Market in ('A','B','C')
But the problem with this approach is that if in future there is a need to support any new tag in JSON (like 'Area'), that will also need to be added to every stored procedure where this logic is implemented.
Is there any existing SQL Server 2017 capability I am missing or any dynamic way to implement the same?
The only option I could think of when using OPENJSON is to break your search string down into its key/value pairs, break the JSON stored in your table down into its key/value pairs as well, and join the two.
There would be limitations to be aware of:
This solution would not work with nested arrays in your json
The search is OR, not AND: if I passed in multiple "Department" values to search for, like '{"Department":["QA", "DEV"]}', it would return rows containing either value, not only rows containing both.
Here's a working example:
DECLARE @TestData TABLE
(
[TestData] NVARCHAR(MAX)
);
--Load Test Data
INSERT INTO @TestData (
[TestData]
)
VALUES ( '{"Department":["QA"]}' )
, ( '{"Department":["DEV","QA"]}' )
, ( '{"Group":["Group 2","Group 12"],"Cluster":["Cluster 11"],"Vertical": ["XYZ"],"Department":["QAT"]}' )
, ( '{"Group":["Group 20"],"Cluster":["Cluster 11"],"Vertical":["XYZ"],"Department":["QAT"]}' );
--Here is the value we are searching for
DECLARE @SearchValue NVARCHAR(MAX) = '{"Department":["QA"]}';
DECLARE @SearchJson TABLE
(
[Key] NVARCHAR(MAX)
, [Value] NVARCHAR(MAX)
);
--Load the search value into a table variable as its key\value pairs.
INSERT INTO @SearchJson (
[Key]
, [Value]
)
SELECT [a].[Key]
, [b].[Value]
FROM OPENJSON(@SearchValue) [a]
CROSS APPLY OPENJSON([a].[Value]) [b];
--Break down TestData into its key\value pairs and then join back to the search table.
SELECT [TestData].[TestData]
FROM (
SELECT [a].[TestData]
, [b].[Key]
, [c].[Value]
FROM @TestData [a]
CROSS APPLY OPENJSON([a].[TestData]) [b]
CROSS APPLY OPENJSON([b].[Value]) [c]
) AS [TestData]
INNER JOIN @SearchJson [srch]
ON [srch].[Key] COLLATE DATABASE_DEFAULT = [TestData].[Key]
AND [srch].[Value] = [TestData].[Value];
Which gives you the following results:
TestData
-----------------------------
{"Department":["QA"]}
{"Department":["DEV","QA"]}

SQL JSON array sort

record in DB:
id info
0 [{"name":"a", "time":"2017-9-25 17:20:21"},{"name":"b", "time":"2017-9-25 23:23:41"},{"name":"c", "time":"2017-9-25 12:56:78"}]
My goal is to sort the JSON array column info based on time, like:
id info
0 [{"name":"c", "time":"2017-9-25 12:56:78"},{"name":"a", "time":"2017-9-25 17:20:21"},{"name":"b", "time":"2017-9-25 23:23:41"}]
I'm using Spark SQL and have no clue how to approach this.
You can do this by converting the json array into a sql result set, extract the sorting column, and finally convert it back to a json array:
DECLARE @json NVARCHAR(MAX);
SET @json = '[
{"name":"a", "time":"2017-09-25 17:20:21"},
{"name":"b", "time":"2017-09-25 23:23:41"},
{"name":"c", "time":"2017-09-25 12:56:59"}
]';
WITH T AS (
SELECT [Value] AS array_element
, TRY_CAST(JSON_VALUE([Value], 'strict $.time') AS DATETIME) AS sorting
FROM OPENJSON(@json, 'strict $')
)
SELECT STRING_AGG(T.array_element, ',') WITHIN GROUP (ORDER BY sorting)
FROM T
Notice:
I changed the sample data slightly, due to invalid months and seconds.
The STRING_AGG() function is only available from SQL 2017/Azure SQL Database. For older versions, use the classic "FOR XML PATH" method, which I will leave as an exercise to the reader.
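For completeness, a hedged, self-contained sketch of that pre-2017 FOR XML PATH variant, using the same corrected sample data:

```sql
DECLARE @json NVARCHAR(MAX) = '[
{"name":"a", "time":"2017-09-25 17:20:21"},
{"name":"b", "time":"2017-09-25 23:23:41"},
{"name":"c", "time":"2017-09-25 12:56:59"}
]';
WITH T AS (
    SELECT [Value] AS array_element
         , TRY_CAST(JSON_VALUE([Value], 'strict $.time') AS DATETIME) AS sorting
    FROM OPENJSON(@json, 'strict $')
)
-- Concatenate in time order, then strip the leading comma with STUFF():
SELECT STUFF((
    SELECT ',' + T.array_element
    FROM T
    ORDER BY T.sorting
    FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)'), 1, 1, '');
```

Note that OPENJSON itself still requires SQL Server 2016+ (compatibility level 130), so this only helps on versions between 2016 and 2017.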
If you want to apply it to a full sql table, use CROSS APPLY as follows:
DECLARE @json NVARCHAR(MAX);
SET @json = '[
{"name":"a", "time":"2017-09-25 17:20:21"},
{"name":"b", "time":"2017-09-25 23:23:41"},
{"name":"c", "time":"2017-09-25 12:56:59"}
]';
WITH dat AS (
SELECT * FROM (VALUES (1,@json), (2,@json)) AS T(id, info)
)
)
, T AS (
SELECT id, [Value] AS array_element
, TRY_CAST(JSON_VALUE(Value, 'strict $.time') AS DATETIME) AS sorting
FROM dat
CROSS APPLY OPENJSON(info, 'strict $')
)
SELECT id
, STRING_AGG(T.array_element, ',') WITHIN GROUP (ORDER BY sorting) AS info
FROM T
GROUP BY id
What I Suggest
Alternate Database Storage System
I would not recommend storing data in this manner. It simply makes your data less accessible and malleable as you are experiencing now. If you store your data like this:
id name time
0 a 2017-9-25 17:20:21
0 b 2017-9-25 23:23:41
0 c 2017-9-25 12:56:71
1 ... ...
Then you can select the data in time order using the ORDER BY method at the end of a select query. For example:
SELECT name, time FROM table_name where id=0 ORDER BY time asc;
If you have more columns in this table that you did not show, you may need another table to store the information efficiently, but the performance benefits of foreign keys and joins between such tables outweigh keeping all the data in one table as inconvenient JSON arrays.
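A minimal DDL sketch of that suggested layout (table and column names are illustrative, matching the SELECT above):

```sql
-- Flattened layout: one row per array element instead of one JSON blob per id.
CREATE TABLE table_name (
    id   INT         NOT NULL,  -- groups rows that formerly shared one JSON array
    name VARCHAR(50) NOT NULL,
    time DATETIME    NOT NULL
);
```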

Splitting JSON data into array<string> columns

I have many json arrays stored in a table like this:
{"p_id":
{"id_type":"XXX","id":"ABC111"},
"r_ids":[
{"id_type":"HAWARE_ABCDA1","id":"dfe234fhgt"},
{"id_type":"HAWARE_CDFE2","id":"sgteth5673"}
]
}
My requirement is to get data in below format:
p_id , p_id_type ,r_ids (array string), r_id_type (array string)
Ex: XXX,ABC111,[dfe234fhgt,sgteth5673],[HAWARE_ABCDA1,HAWARE_CDFE2]
I am able to get the whole set in exploded format, but how do I generate the array columns?
My current query:
select p_id
,p_id_type
,get_json_object(c.qqqq,'$.id') as r_id
,get_json_object(c.qqqq,'$.id_type') as r_id_type
from
(
select p_id
,p_id_type
,qqqq
from
(
select
get_json_object(a.main_pk,'$.id_type') as p_id_type
,get_json_object(a.main_pk,'$.id') as p_id
,split(regexp_replace(regexp_replace(a.r_ids,'\\}\\,\\{','\\}\\;\\{'),'\\[|\\]',''),'\\;') as yyyy
from
(
select
get_json_object(json_string,'$.p_id') as main_pk
,get_json_object(json_string, '$.r_ids') as r_ids
from sample_table limit 10
) a
) b lateral view explode(b.yyyy) yyyy_exploded as qqqq
)c
Can anyone tell me what I am doing wrong? Any suggestions are appreciated.
If you use a JsonSerDe, complex data types become much easier to handle.
Here is a small example you can adapt:
CREATE TABLE table_json (
p_id struct<id_type:string,
id:string>,
r_ids array<struct<id_type:string,
id:string>>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe';
LOAD DATA LOCAL INPATH '<path>/your_file.json'
OVERWRITE INTO TABLE table_json;
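Once the table is defined through the SerDe with p_id as a struct and r_ids as an array of structs (matching the JSON shown in the question), the requested array columns can be produced directly: Hive lets you project a struct field across an array of structs, yielding an array of that field. A sketch:

```sql
-- Projecting a field over an array<struct<...>> yields an array of that field:
SELECT p_id.id       AS p_id,
       p_id.id_type  AS p_id_type,
       r_ids.id      AS r_ids,      -- e.g. ["dfe234fhgt","sgteth5673"]
       r_ids.id_type AS r_id_type   -- e.g. ["HAWARE_ABCDA1","HAWARE_CDFE2"]
FROM table_json;
```

This avoids the explode/regexp_replace round-trip of the original query entirely.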