Postgres convert json with duplicate IDs - json

With this select:
json_agg(json_build_object("id", price::money))
I get the resulting value:
[
{"6" : "$475.00"},
{"6" : "$1,900.00"},
{"3" : "$3,110.00"},
{"3" : "$3,110.00"}
]
I would like the data in this format instead:
{
"6": ["$475.00","$1,900.00"],
"3": ["$3,110.00","$3,110.00"]
}
When queried on the server or used with jsonb, the IDs are duplicates, so only one of the key-value pairs makes it through.

Aggregate the prices grouped by id, then combine the groups with the aggregate function json_object_agg(). This requires a derived table (a subquery in the FROM clause) because aggregate calls cannot be nested:
select json_object_agg(id, prices)
from (
select id, json_agg(price::money) as prices
from my_table
group by id
) s
Working example in rextester.
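To see what the nested aggregation computes, here is a minimal Python sketch of the same grouping on the sample values from the question. The `rows` list and the `group_prices` function are illustrative assumptions, not part of the original answer; inside the database the query above does this work.

```python
import json
from collections import defaultdict

def group_prices(rows):
    """Group (id, price) pairs into {id: [price, ...]}, keeping input order."""
    grouped = defaultdict(list)
    for row_id, price in rows:
        # JSON object keys are always strings, so the id is converted here.
        grouped[str(row_id)].append(price)
    return dict(grouped)

# Hypothetical rows matching the values shown in the question.
rows = [(6, "$475.00"), (6, "$1,900.00"), (3, "$3,110.00"), (3, "$3,110.00")]
result = group_prices(rows)
print(json.dumps(result, indent=2))
```

Each id appears once as a key, which is why the duplicate-key problem from the question goes away.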

Related

Select and display named elements of an array in a JSON column

In Azure SQL I have a table, "violation", with a JSON column, "course_json" that contains an array. An example is:
[{
"course_int": "1465",
"course_key": "LEND1254",
"course_name": "Mortgage Servicing Introduction",
"test_int": "0"
}, {
"course_int": "1464",
"course_key": "LEND1211",
"course_name": "Mortgage Servicing Transfer",
"test_int": "0"
}]
I would like to select rows in the violation table and display columns of the table and the "course_key" as:
LEND1254,LEND1211
If there were always a fixed number of course_key values I could use:
select person_id,event_date, JSON_VALUE(course_json, '$[0].course_key') + ',' + JSON_VALUE(course_json, '$[1].course_key') from violation
But they aren't fixed... there may be one, two, ten... I'll never know.
So, is it possible to iterate through all the course_keys and display them all in a comma separated format?
Instead of JSON_VALUE, use OPENJSON to get all the courses and STRING_AGG to build the course_key delimited list.
SELECT
person_id
, event_date
, (SELECT STRING_AGG(course_key,',')
FROM OPENJSON(course_json)
WITH (
course_key nvarchar(MAX) '$.course_key'
)) AS course_key
FROM dbo.violation;
person_id | event_date | course_key
----------+------------+-------------------
1         | 2022-12-21 | LEND1254,LEND1211
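The idea of the answer (expand the array into one row per course, then join the keys) can be sketched outside the database as well. This Python version is only an illustration; the `course_keys` function name is an assumption, and the sample string is the array from the question.

```python
import json

def course_keys(course_json):
    """Return the course_key values of a JSON array as one comma-separated string."""
    courses = json.loads(course_json)
    # Entries without a course_key are skipped rather than raising an error.
    return ",".join(c["course_key"] for c in courses if "course_key" in c)

sample = """[
  {"course_int": "1465", "course_key": "LEND1254",
   "course_name": "Mortgage Servicing Introduction", "test_int": "0"},
  {"course_int": "1464", "course_key": "LEND1211",
   "course_name": "Mortgage Servicing Transfer", "test_int": "0"}
]"""

print(course_keys(sample))
```

The function works for any array length, which is the point of replacing the fixed-index JSON_VALUE calls.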

Count JSON column in Snowflake

I have a table called HISTORY in Snowflake with a column called RECORD of the VARIANT datatype, which contains JSON data. I would like to add a new column to the HISTORY table that counts the JSON values in each row. Please help.
Json data starts like:
{"prizes":
[ {"year":"2018",
"category":"physics",
"laureates":[ {"id":"960","firstname":"Arthur","surname":"Ashkin"}
, { "id":"961","firstname":"G\u00e9rard","surname":"Mourou" }
]
},
...
]
}
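First flatten the data to the lowest level you need (laureates), then apply the filter to the "year" element, which is one level above the laureates element. You can also filter on the lowest-level columns if you need to. The query below uses a table named NobelPrizeJson with a column named json; substitute your own table and column names.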
select
count(*)
from NobelPrizeJson
, lateral flatten(INPUT=>json:prizes) prizes
, lateral flatten(INPUT=>prizes.value:laureates) laureates
where prizes.value:year::int > 2010;
This is posted at:
https://community.snowflake.com/s/question/0D50Z00008xAQSY/i-have-a-query-that-counts-the-number-of-objects-inside-a-large-json-document-and-now-i-need-to-filter-on-only-objects-with-a-specific-keyvalue-pair-inside-those-objects-how-can-i-filter
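As a rough illustration of what the two flatten steps and the year filter compute, here is a Python sketch over an in-memory document. The `count_laureates` function and the second prize entry are hypothetical; only the 2018 physics entry comes from the sample data above.

```python
def count_laureates(doc, min_year):
    """Count laureates across all prizes whose year is greater than min_year."""
    total = 0
    for prize in doc.get("prizes", []):        # outer level: one item per prize
        if int(prize["year"]) > min_year:      # filter one level above laureates
            # inner level: one item per laureate of this prize
            total += len(prize.get("laureates", []))
    return total

doc = {
    "prizes": [
        {"year": "2018", "category": "physics", "laureates": [
            {"id": "960", "firstname": "Arthur", "surname": "Ashkin"},
            {"id": "961", "firstname": "G\u00e9rard", "surname": "Mourou"},
        ]},
        # Hypothetical placeholder entry, added only to exercise the filter.
        {"year": "2009", "category": "example", "laureates": [
            {"id": "0", "firstname": "Placeholder", "surname": "Person"},
        ]},
    ]
}

print(count_laureates(doc, 2010))
```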

Querying element inside a collection on a json field - Postgres

I have the following json structure on my Postgres. The table is named "customers" and the field that contains the json is named "data"
{
customerId: 1,
something: "..."
list: [{ nestedId: 1, attribute: "a" }, { nestedId: 2, attribute: "b" }]
}
I'm trying to query all customers that have an element inside the field "list" with nestedId = 1.
I accomplished that poorly through the query:
SELECT data FROM customers a, jsonb_array_elements(data->'list') e WHERE (e->'nestedId')::int = 1
I said poorly because, since I'm using jsonb_array_elements in the FROM clause, it is not used as a filter, resulting in a seq scan.
I tried something like:
SELECT data FROM customers where data->'list' @> '{"nestedId": 1, attribute: "a"}'::jsonb
But it does not return anything. I imagine because the "list" field is seen as an array and not as each type of my records.
Any ideas how to perform that query filtering nestedId on the WHERE condition?
Try this query:
SELECT data FROM customers where data->'list' @> '[{"nestedId": 1}]';
This query will work in Postgres 9.4+.
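The test the answer relies on asks whether the "list" array holds at least one object that includes the wanted key-value pair. A minimal Python sketch of that check follows; the `list_contains` function and the second customer row are assumptions made for illustration.

```python
def list_contains(data, wanted):
    """True if some object in data['list'] has every key/value pair in wanted."""
    return any(
        all(k in item and item[k] == v for k, v in wanted.items())
        for item in data.get("list", [])
    )

customers = [
    {"customerId": 1, "something": "...",
     "list": [{"nestedId": 1, "attribute": "a"}, {"nestedId": 2, "attribute": "b"}]},
    # Hypothetical second row that should not match.
    {"customerId": 2, "something": "...",
     "list": [{"nestedId": 3, "attribute": "c"}]},
]

matches = [c for c in customers if list_contains(c, {"nestedId": 1})]
print([c["customerId"] for c in matches])
```

Note that the wanted pattern is wrapped in an array in the SQL version because the left-hand side is itself an array.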

How to iterate through PostgreSQL jsonb array values for purposes of matching within a query

My table has many rows, each containing a jsonb object.
This object holds an array, in which there can potentially be multiple keys of the same name but with different values.
My goal is to scan my entire table and verify which rows contain duplicate values within this json object's array.
Row 1 example data:
{
"Name": "Bobb Smith",
"Identifiers": [
{
"Content": "123",
"RecordID": "123",
"SystemID": "Test",
"LastUpdated": "2017-09-12T02:23:30.817Z"
},
{
"Content": "abc",
"RecordID": "abc",
"SystemID": "Test",
"LastUpdated": "2017-09-13T10:10:21.598Z"
},
{
"Content": "def",
"RecordID": "def",
"SystemID": "Test",
"LastUpdated": "2017-09-13T10:10:21.598Z"
}
]
}
Row 2 example data:
{
"Name": "Bob Smith",
"Identifiers": [
{
"Content": "abc",
"RecordID": "abc",
"SystemID": "Test",
"LastUpdated": "2017-09-13T10:10:26.020Z"
}
]
}
My current query was originally used to find duplicates based on a name value, but, in cases where the names may be flubbed, using a record ID is a more foolproof method.
However, I am having trouble figuring out how to essentially iterate over each 'Record ID' within every row and compare that 'Record ID' to every other 'Record ID' in every row within the same table to locate matches.
My current query to match 'Name':
discard temporary;
with dupe as (
select
json_document->>'Name' as name,
json_document->'Identifiers'->0->'RecordID' as record_id
from staging
)
select name as "Name", record_id::text as "Record ID"
from dupe da
where ( select count(*) from dupe db where db.name = da.name) > 1
order by name;
The above query would return the matching rows IF the 'Name' field in both rows contained the same spelling of 'Bob'.
I need this same functionality using the nested value of the 'RecordID' field.
The problem here is that
json_document->'Identifiers'->0->'RecordID'
only returns the 'RecordID' at index 0 within the array.
For example, this does NOT work:
discard temporary;
with dupe as (
select
json_document->>'Name' as name,
json_document->'Identifiers'->0->'RecordID' as record_id
from staging
)
select name as "Name", record_id::text as "Record ID"
from dupe da
where ( select count(*) from dupe db where db.record_id = da.record_id) > 1
order by name;
...because the query only checks the 'RecordID' value at index 0 of the 'Identifiers' array.
How could I essentially perform something like
SELECT json_document#>'RecordID'
in order to have my query check every index within the 'Identifiers' array for the 'RecordID' value?
Any and all help is greatly appreciated! Thanks!
I'm hoping to accomplish this with only a Postgres query and NOT by accessing this data with an external language. (Python, etc.)
I solved this by essentially performing the 'unnest()'-like jsonb_array_elements() on my nested jsonb array.
By doing this in a subquery, then scanning those results using a variation of my original query, I was able to achieve my desired result.
Here is what I came up with.
with dupe as (
select
json_document->>'Name' as name,
identifiers->'RecordID' as record_id
from (
select *,
jsonb_array_elements(json_document->'Identifiers') as identifiers
from staging
) sub
group by record_id, json_document
order by name
)
select * from dupe da where (select count(*) from dupe db where
db.record_id = da.record_id) > 1;
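For readers who want to check the logic on small data, the following Python sketch roughly mirrors the approach: unnest every identifier, then keep the record IDs that occur in more than one row. The `duplicate_record_ids` function is a hypothetical name, and the sample rows are the two documents from the question trimmed to the fields that matter.

```python
from collections import Counter

def duplicate_record_ids(rows):
    """Return (name, record_id) pairs whose record_id occurs in more than one row."""
    pairs = []
    for doc in rows:
        # Unnest the array; a set removes repeats inside a single row.
        ids = {ident["RecordID"] for ident in doc.get("Identifiers", [])}
        pairs.extend((doc["Name"], rid) for rid in sorted(ids))
    counts = Counter(rid for _, rid in pairs)
    return [(name, rid) for name, rid in pairs if counts[rid] > 1]

rows = [
    {"Name": "Bobb Smith", "Identifiers": [
        {"RecordID": "123"}, {"RecordID": "abc"}, {"RecordID": "def"}]},
    {"Name": "Bob Smith", "Identifiers": [{"RecordID": "abc"}]},
]

print(duplicate_record_ids(rows))
```

The match is found even though the names are spelled differently, because only the record IDs are compared.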

How to create an empty JSON object in postgresql?

Datamodel
A person is represented in the database as a meta table row with a name and with multiple attributes which are stored in the data table as key-value pair (key and value are in separate columns).
Simplified data-model
Now there is a query to retrieve all users (name) with all their attributes (data). The attributes are returned as JSON object in a separate column. Here is an example:
name    | data
--------+--------------------------------
Florian | { "age":25 }
Markus  | { "age":25, "color":"blue" }
Thomas  | {}
The SQL command looks like this:
SELECT
m.name,
json_object_agg(d.key, d.value) AS data
FROM meta AS m
JOIN (
SELECT d.fk_id, d.key, d.value AS value FROM data AS d
) AS d
ON d.fk_id = m.id
GROUP BY m.name;
Problem
Now the problem I am facing is that users like Thomas, who do not have any attributes stored in the key-value table, are not returned by my select. This is because it does only a JOIN and not a LEFT OUTER JOIN.
If I use LEFT OUTER JOIN, I run into the problem that json_object_agg tries to aggregate NULL values and dies with an error.
Approaches
1. Return empty list of keys and values
So I tried to check if the key-column of a user is NULL and return an empty array so json_object_agg would just create an empty JSON object.
But there is not really a function to create an empty array in SQL. The nearest thing I found was this:
select '{}'::text[];
In combination with COALESCE the query looks like this:
json_object_agg(COALESCE(d.key, '{}'::text[]), COALESCE(d.value, '{}'::text[])) AS data
But if I try to use this I get following error:
ERROR: COALESCE types text and text[] cannot be matched
LINE 10: json_object_agg(COALESCE(d.key, '{}'::text[]), COALES...
^
Query failed
PostgreSQL said: COALESCE types text and text[] cannot be matched
So it looks like that at runtime d.key is a single value and not an array.
2. Split up JSON creation and return empty list
So I tried to take json_object_agg and replace it with json_object which does not aggregate the keys for me:
json_object(COALESCE(array_agg(d.key), '{}'::text[]), COALESCE(array_agg(d.value), '{}'::text[])) AS data
But there I get the error that null value not allowed for object key. So COALESCE does not check that the array is empty.
Question
So, is there a function to check if a joined column is empty, and if yes return just a simple JSON object?
Or is there any other solution which would solve my problem?
Use left join with coalesce(). As default value use '{}'::json.
select name, coalesce(d.data, '{}'::json) as data
from meta m
left join (
select fk_id, json_object_agg(d.key, d.value) as data
from data d
group by 1
) d
on m.id = d.fk_id;
name | data
---------+------------------------------------
Florian | { "age" : "25" }
Markus  | { "age" : "25", "color" : "blue" }
Thomas | {}
(3 rows)
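The pattern in this answer (aggregate first, left join second, fall back to an empty object) can be illustrated with a small Python sketch. The `attributes_by_person` function and the numeric ids are assumptions; the names and attributes follow the example in the question, and names are assumed to be unique.

```python
def attributes_by_person(meta, data):
    """Map each person name to a dict of attributes, {} when there are none."""
    grouped = {}
    for fk_id, key, value in data:
        # Aggregate key-value rows per person, like the grouped subquery.
        grouped.setdefault(fk_id, {})[key] = value
    # Every meta row is kept (left join); .get(..., {}) plays the role of coalesce.
    return {name: grouped.get(person_id, {}) for person_id, name in meta}

# Hypothetical ids; the values are stored as text, as in a key-value table.
meta = [(1, "Florian"), (2, "Markus"), (3, "Thomas")]
data = [(1, "age", "25"), (2, "age", "25"), (2, "color", "blue")]

result = attributes_by_person(meta, data)
print(result)
```

Because the aggregation happens before the join, no NULL key ever reaches the aggregate, which is what caused the error in the earlier attempts.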