postgresql - query to build up json

Running: PostgreSQL 9.6.2
I have data stored in a table in the form of key/value pairs. The "key" is actually a path into a JSON object, each element being a property name. So, for example, if the key were "cogs","props1","value", the JSON object would look like this:
{
    "cogs": {
        "props1": {
            "value": 100
        }
    }
}
I'd like to somehow reconstruct a json object via a SQL query if possible. Here is the test data set:
drop table if exists test_table;

CREATE TABLE test_table
(
    id serial,
    file_id integer NOT NULL,
    key character varying[],
    value character varying,
    status character varying
)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;
insert into test_table (file_id, key, value, status)
values (1, '{"cogs","description"}', 'some awesome cog', 'approved');
insert into test_table (file_id, key, value, status)
values (1, '{"cogs","display"}', 'Giant Cog', null);
insert into test_table (file_id, key, value, status)
values (1, '{"cogs","props1","value"}', '100', 'not verified');
insert into test_table (file_id, key, value, status)
values (1, '{"cogs","props1","id"}', 26, 'approved');
insert into test_table (file_id, key, value, status)
values (1, '{"cogs","props1","dimensions"}', '{"200", "300"}', null);
insert into test_table (file_id, key, value, status)
values (1, '{"cogs","props2","value"}', '200', 'not verified');
insert into test_table (file_id, key, value, status)
values (1, '{"cogs","props2","id"}', 27, 'approved');
insert into test_table (file_id, key, value, status)
values (1, '{"cogs","props2","dimensions"}', '{"700", "800"}', null);
insert into test_table (file_id, key, value, status)
values (1, '{"widgets","description"}', 'some awesome widget', 'approved');
insert into test_table (file_id, key, value, status)
values (1, '{"widgets","display"}', 'Giant Widget', null);
insert into test_table (file_id, key, value, status)
values (1, '{"widgets","props1","value"}', '100', 'not verified');
insert into test_table (file_id, key, value, status)
values (1, '{"widgets","props1","id"}', 28, 'approved');
insert into test_table (file_id, key, value, status)
values (1, '{"widgets","props1","dimensions"}', '{"200", "300"}', null);
insert into test_table (file_id, key, value, status)
values (1, '{"widgets","props2","value"}', '200', 'not verified');
insert into test_table (file_id, key, value, status)
values (1, '{"widgets","props2","id"}', 29, 'approved');
insert into test_table (file_id, key, value, status)
values (1, '{"widgets","props2","dimensions"}', '{"900", "1000"}', null);
The output I'm looking for is in this format:
{
    "cogs": {
        "description": "some awesome cog",
        "display": "Giant Cog",
        "props1": {
            "value": 100,
            "id": 26,
            "dimensions": [200, 300]
        },
        "props2": {
            "value": 200,
            "id": 27,
            "dimensions": [700, 800]
        }
    },
    "widgets": {
        "description": "some awesome widget",
        "display": "Giant Widget",
        "props1": {
            "value": 100,
            "id": 28,
            "dimensions": [200, 300]
        },
        "props2": {
            "value": 200,
            "id": 29,
            "dimensions": [900, 1000]
        }
    }
}
Some issues I'm facing:
1. The "value" column can hold text, numbers, and an array. For whatever reason, the server-side code using knex.js stores an array of integers (i.e. [100,300]) in Postgres in the following format: {"100","300"}. I need to ensure I extract this back out as an array of integers as well.
2. Making this as dynamic as possible; maybe a recursive procedure to figure out the depth of the "key" path rather than hard-coding array lookup values.
3. json_object_agg works well to group properties into a single object, but it breaks when it hits a null value. So if the "key" column has only two elements (i.e. "cogs","description") and I attempt to aggregate it alongside an array of length three (i.e. "cogs","props1","value"), it breaks unless I filter to arrays of length 3 only (see the sketch after this list).
4. Preserving the ordering of the input. @klin's solution below is amazing and gets me 95% of the way there; however, I failed to mention that the ordering of the input should also be preserved...
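For example, a minimal sketch of the third issue against the test data above; without the length filter, key[3] is null for the two-element paths and json_object_agg() fails with "field name must not be null":

select json_object_agg(key[3], value)
from test_table
where array_length(key, 1) = 3;  -- only fixed-depth rows aggregate cleanly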

A dynamic solution needs some work.
First, we need a function to convert a text array and a value to a jsonb object.
create or replace function keys_to_object(keys text[], val text)
returns jsonb language plpgsql as $$
declare
    i int;
    rslt jsonb = to_jsonb(val);
begin
    -- walk the key path from the innermost element outwards,
    -- wrapping the accumulated value in a new object at each step
    for i in select generate_subscripts(keys, 1, true) loop
        rslt := jsonb_build_object(keys[i], rslt);
    end loop;
    return rslt;
end $$;
select keys_to_object(array['key', 'subkey', 'subsub'], 'value');
keys_to_object
------------------------------------------
{"key": {"subkey": {"subsub": "value"}}}
(1 row)
Next, another function to merge jsonb objects (see Merging JSONB values in PostgreSQL).
create or replace function jsonb_merge(a jsonb, b jsonb)
returns jsonb language sql as $$
select
    jsonb_object_agg(
        coalesce(ka, kb),
        case
            when va isnull then vb
            when vb isnull then va
            when jsonb_typeof(va) <> 'object' or jsonb_typeof(vb) <> 'object' then vb
            else jsonb_merge(va, vb)
        end)
from jsonb_each(a) e1(ka, va)
full join jsonb_each(b) e2(kb, vb) on ka = kb
$$;
select jsonb_merge('{"key": {"subkey1": "value1"}}', '{"key": {"subkey2": "value2"}}');
jsonb_merge
-----------------------------------------------------
{"key": {"subkey1": "value1", "subkey2": "value2"}}
(1 row)
Finally, let's create an aggregate based on the above function,
create aggregate jsonb_merge_agg(jsonb)
(
    sfunc = jsonb_merge,
    stype = jsonb
);
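Before wiring the aggregate into the full query, it can be sanity-checked on a couple of literal rows (the values here are mine, not from the test data):

select jsonb_merge_agg(j)
from (values ('{"a": {"x": 1}}'::jsonb),
             ('{"a": {"y": 2}}'::jsonb)) t(j);
-- result: {"a": {"x": 1, "y": 2}}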
and we are done:
select jsonb_pretty(jsonb_merge_agg(keys_to_object(key, translate(value, '{}"', '[]'))))
from test_table;
jsonb_pretty
----------------------------------------------
{
    "cogs": {
        "props1": {
            "id": "26",
            "value": "100",
            "dimensions": "[200, 300]"
        },
        "props2": {
            "id": "27",
            "value": "200",
            "dimensions": "[700, 800]"
        },
        "display": "Giant Cog",
        "description": "some awesome cog"
    },
    "widgets": {
        "props1": {
            "id": "28",
            "value": "100",
            "dimensions": "[200, 300]"
        },
        "props2": {
            "id": "29",
            "value": "200",
            "dimensions": "[900, 1000]"
        },
        "display": "Giant Widget",
        "description": "some awesome widget"
    }
}
(1 row)
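One wrinkle remains relative to the desired output: every leaf lands as a JSON string ("100", "[200, 300]") because keys_to_object wraps the text with to_jsonb(). A possible refinement, my own sketch rather than part of the original answer, is a helper that first tries to parse the text as JSON and falls back to a plain string:

-- hypothetical helper: numbers and arrays parse as real JSON values,
-- anything that fails to parse stays a JSON string
create or replace function try_jsonb(val text)
returns jsonb language plpgsql immutable as $$
begin
    return val::jsonb;       -- '100' -> 100, '[200, 300]' -> [200, 300]
exception when others then
    return to_jsonb(val);    -- 'Giant Cog' -> "Giant Cog"
end $$;

Initializing keys_to_object with rslt jsonb = try_jsonb(val); instead of to_jsonb(val) then yields real numbers and arrays in the aggregated result.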

Related

Inserting data from a for loop in database

I am fetching data from an API and extracting part of it. The data comes in nested dictionaries and lists, and I used a nested for loop to extract the variables. I want to insert it into a MySQL db, but I'm not sure how, since some of the columns will receive a different number of values. For example, cars could be 1, 2, 3 or 4.
All vehicle_id values fetched should be inserted into a column all_vehicles; I am not sure how to do this either.
datetime_received = datetime.now()
car_dealer_id = 11
int_id = 8
dealer_name = 'XXX'

for car in cars:
    code = car['Code']
    start_date = car['RDate']
    end_date = car['RDate']
    for portion in car['Consists']['Portions']:
        location = portion['Location']
        for consist in portion['Consist']:
            ext_id = consist['ExtId']
            for vehicle in consist['Vehicles']:
                vehicle_id = vehicle['Id']

sql = """
INSERT INTO table
(`datetime_received`, `car_dealer_id`, `ind_id`, `dealer_name`, `code`, `start_date`, `start_time`, `end_date`, `location`, `ext_id`, `all_vehilces`)
VALUES ('%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s')"""
cursor.executemany(sql, data)
connection.commit()
connection.close()
Data:
cars = {
    "Consists": {
        "Portions": [
            {
                "Consist": [
                    {
                        "ext_id": "755411",
                        "Position": "0",
                        "Vehicles": [
                            {"Id": "92", "Position": "1"},
                            {"Id": "921", "Position": "2"},
                            {"Id": "932", "Position": "3"},
                            {"Id": "34", "Position": "4"},
                            {"Id": "92", "Position": "5"}
                        ]
                    }
                ],
                "Location": "ATA"
            }
        ],
        "Updated": "2022-07-21T04:25:08.0000000+01:00"
    },
    "Code": "575",
    "RDate": "2022-07-21T08:25:00.0000000+01:00",
    "RunDate": "2022-07-21T00:00:00.0000000+01:00",
}
EDITED: Thanks to Barmar, I managed to insert the values.
I have one final value to insert into data[]. Based on the ext_id value, I have a function that returns the corresponding my_system_id. I want to insert the my_system_ids as well, but I am not calling the function from the correct place, so it is not being inserted into the db table.
Here is the function:
def get_my_system_id(ext_id):
    cursor = db_conn.cursor()
    sql = ("""SELECT my_system_id FROM table
              WHERE ext_id = %s""")
    data = (ext_id,)
    cursor.execute(sql, data)
    id_row = cursor.fetchone()
    if id_row is not None:
        my_id = id_row[0]
        return my_id
    else:
        return None
Use ','.join() to combine all the vehicle IDs into a comma-delimited list.
In the prepared statement, %s should not be quoted. You also had only 10 of them, but you're inserting into 11 columns.
With the edit, add a call to get_my_system_id(ext_id) to the loop, and add that value to the data list.
data = []
for car in cars:
    code = car['Code']
    # 'RDate' looks like '2022-07-21T08:25:00...': split off the time part
    start_date, start_time = car['RDate'].split('T')
    end_date = car['RDate']
    for portion in car['Consists']['Portions']:
        location = portion['Location']
        for consist in portion['Consist']:
            ext_id = consist['ExtId']
            # combine all vehicle IDs into one comma-delimited string
            vehicle_ids = ','.join(v['Id'] for v in consist['Vehicles'])
            system_id = get_my_system_id(ext_id)
            if not system_id:
                print(f"No system ID found for ext_id = {ext_id}, skipping")
                continue
            data.append((datetime_received, car_dealer_id, int_id, dealer_name, code,
                         start_date, start_time, end_date, location, ext_id,
                         system_id, vehicle_ids))

sql = """
INSERT INTO table
(`datetime_received`, `car_dealer_id`, `ind_id`, `dealer_name`, `code`, `start_date`, `start_time`, `end_date`, `location`, `ext_id`, `my_system_id`, `all_vehicles`)
VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"""
cursor.executemany(sql, data)
connection.commit()
connection.close()

how to return query in postgresql function

I have a PostgreSQL function that takes a JSON data object, and I need to return some values.
This is my function:
CREATE OR REPLACE FUNCTION "public"."insert_from_json"("in_json_txt" json)
RETURNS "pg_catalog"."void" AS $BODY$
INSERT INTO json_test2 (name, age, location_id)
WITH t1 AS (
    SELECT (rec->>'name')::text, (rec->>'age')::integer
    FROM json_array_elements(in_json_txt->'data') rec
), t2 AS (
    WITH my_v_table (jsonblob) AS (VALUES (in_json_txt::jsonb))
    SELECT ((my_v_table.jsonblob ->> 'Store_IntegrationCode')::numeric) AS store_id
    FROM my_v_table
)
SELECT * FROM t1, t2
$BODY$
LANGUAGE sql VOLATILE
COST 100
When I use RETURN QUERY I get an error :(
This is the call statement:
select insert_from_json('{
    "Customer_IntegrationCode": "558889999",
    "XretialOrderCode": "000020430",
    "ShippingAddress": "Cairo, Nasr City, 01128777733",
    "ShippingAddress_IntegrationCode": null,
    "PaymentOption": 1,
    "CreationDate": "2021-01-04T07:38:57.033Z",
    "Total": 73.0,
    "Currency": "EGP",
    "Note": null,
    "ShippingCost": 15.0,
    "CODFee": 25.0,
    "ShipmentProvider": null,
    "Plateform": 1,
    "SubTotal": 33.0,
    "TotalDiscountAmount_PerOrderLevel": 0,
    "OriginalSubTotal": 33.0,
    "TaxPercentage": null,
    "TaxValue": null,
    "Store_IntegrationCode": "1234567",
    "data": [
        {
            "name": "12345678",
            "age": "23456789",
            "Qty": 3,
            "UnitPrice": 11.0,
            "NetPrice": 11.0,
            "SKUDiscount": 0,
            "Total": 33.0,
            "ShipmentCost": 0.0,
            "SubTotal": 33.0
        },
        {
            "name": "999999",
            "age": "988888",
            "Qty": 3,
            "UnitPrice": 11.0,
            "NetPrice": 11.0,
            "SKUDiscount": 0,
            "Total": 33.0,
            "ShipmentCost": 0.0,
            "SubTotal": 33.0
        }
    ]
}')
When I add RETURN QUERY to the function, I get this error:
ERROR: syntax error at or near "RETURN"
LINE 18: RETURN query SELECT * from t1,t2
You cannot return values from a function returning VOID.
If you want to return the rows after insertion, you can try the function definition below:
CREATE OR REPLACE FUNCTION "public"."insert_from_json"("in_json_txt" json)
RETURNS table (name_ text, age_ int, location_ numeric)
AS $BODY$
BEGIN
RETURN QUERY
INSERT INTO json_test2 (name, age, location_id)
(
    WITH t1 AS (
        SELECT (rec->>'name')::text, (rec->>'age')::integer
        FROM json_array_elements(in_json_txt->'data') rec
    ), t2 AS (
        WITH my_v_table (jsonblob) AS (VALUES (in_json_txt::jsonb))
        SELECT ((my_v_table.jsonblob ->> 'Store_IntegrationCode')::numeric) AS store_id
        FROM my_v_table
    )
    SELECT * FROM t1, t2
)
RETURNING name, age, location_id;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100
DEMO
The same thing can be written in a shorter and simpler way:
CREATE OR REPLACE FUNCTION "public"."insert_from_json"("in_json_txt" json)
RETURNS table (name_ text, age_ int, location_ numeric)
AS $BODY$
BEGIN
RETURN QUERY
INSERT INTO json_test2 (name, age, location_id)
SELECT (rec->>'name')::text,
       (rec->>'age')::integer,
       (in_json_txt->>'Store_IntegrationCode')::numeric
FROM json_array_elements(in_json_txt->'data') rec
RETURNING name, age, location_id;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100
DEMO
You can reduce the function to just an INSERT ... RETURNING by converting it to an SQL function:
create or replace function insert_from_json(in_json_txt json)
returns setof json_test2
language sql
as $$
insert into json_test2 (name, age, location)
select (rec->>'name')::text
     , (rec->>'age')::integer
     , (in_json_txt->>'Store_IntegrationCode')::numeric
from json_array_elements(in_json_txt->'data') rec
returning *;
$$;
See Example
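As a quick check, a set-returning SQL function like this can be called in the FROM clause. A minimal sketch, assuming json_test2 is defined roughly as create table json_test2 (name text, age integer, location numeric):

select name, age, location
from insert_from_json('{
    "Store_IntegrationCode": "1234567",
    "data": [
        {"name": "12345678", "age": "23456789"},
        {"name": "999999", "age": "988888"}
    ]
}');
-- inserts the two rows and returns them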

SQL Server For JSON Path dynamic column name

We are exploring the JSON feature in SQL Server, and for one of the scenarios we want to come up with SQL that can return JSON like the example below:
[
    {
        "field": {
            "uuid": "uuid-field-1"
        },
        "value": {
            "uuid": "uuid-value"    // value is an object
        }
    },
    {
        "field": {
            "uuid": "uuid-field-2"
        },
        "value": 1                  // value is a simple integer
    }
    ... more rows
]
The value field can be a simple integer/string or a nested object.
We are able to come up with a table that looks like this:

field.uuid   | value.uuid | value
-------------|------------|------
uuid-field-1 | value-uuid | null
uuid-field-2 | null       | 1
... more rows
But as soon as we apply FOR JSON PATH, it fails saying:
Property 'value' cannot be generated in JSON output due to a conflict with another column name or alias. Use different names and aliases for each column in SELECT list.
Is it possible to generate this somehow? The value will either be in value.uuid or in value, never both.
Note: we are also open to converting each row to an individual JSON object and adding all of them to an array.
select
    json_query((select v.[field.uuid] as 'uuid' for json path, without_array_wrapper)) as 'field',
    value as 'value',
    json_query((select v.[value.uuid] as 'uuid' where v.[value.uuid] is not null for json path, without_array_wrapper)) as 'value'
from
(
    values
        ('uuid-field-1', 'value-uuid1', null),
        ('uuid-field-2', null, 2),
        ('uuid-field-3', 'value-uuid3', null),
        ('uuid-field-4', null, 4)
) as v([field.uuid], [value.uuid], value)
for json auto;--, without_array_wrapper;
The reason for this error is that (as is mentioned in the documentation) ... FOR JSON PATH clause uses the column alias or column name to determine the key name in the JSON output. If an alias contains dots, the PATH option creates nested objects. In your case value.uuid and value both generate a key with name value.
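The conflict is easy to reproduce in isolation. This minimal sketch fails with the same error, because the alias value and the PATH-nested value.uuid both claim the key value:

-- raises: Property 'value' cannot be generated in JSON output ...
select 1 as [value], 'uuid-value' as [value.uuid]
for json path;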
I can suggest an approach (probably not the best one), which uses JSON_MODIFY() to generate the expected JSON from an empty JSON array:
Table:
CREATE TABLE Data (
    [field.uuid] varchar(100),
    [value.uuid] varchar(100),
    [value] int
)

INSERT INTO Data
    ([field.uuid], [value.uuid], [value])
VALUES
    ('uuid-field-1', 'value-uuid', NULL),
    ('uuid-field-2', NULL, 1),
    ('uuid-field-3', NULL, 3),
    ('uuid-field-4', NULL, 4)
Statement:
DECLARE @json nvarchar(max) = N'[]'

SELECT @json = JSON_MODIFY(
    @json,
    'append $',
    JSON_QUERY(
        CASE
            WHEN [value.uuid] IS NOT NULL THEN (SELECT d.[field.uuid], [value.uuid] FOR JSON PATH, WITHOUT_ARRAY_WRAPPER)
            WHEN [value] IS NOT NULL THEN (SELECT d.[field.uuid], [value] FOR JSON PATH, WITHOUT_ARRAY_WRAPPER)
        END
    )
)
FROM Data d

SELECT @json
Result:
[
    {
        "field": {
            "uuid": "uuid-field-1"
        },
        "value": {
            "uuid": "value-uuid"
        }
    },
    {
        "field": {
            "uuid": "uuid-field-2"
        },
        "value": 1
    },
    {
        "field": {
            "uuid": "uuid-field-3"
        },
        "value": 3
    },
    {
        "field": {
            "uuid": "uuid-field-4"
        },
        "value": 4
    }
]

Update every value in an array in postgres json

In my postgres database I have json that looks similar to this:
{
    "myArray": [
        {
            "myValue": 1
        },
        {
            "myValue": 2
        },
        {
            "myValue": 3
        }
    ]
}
Now I want to rename myValue to otherValue. I can't be sure about the length of the array! Preferably I would like to use something like jsonb_set with a wildcard as the array index, but that does not seem to be supported. So what is the nicest solution?
You have to decompose the whole jsonb object, modify the individual elements, and build the object back up.
This custom function will be helpful:
create or replace function jsonb_change_keys_in_array(arr jsonb, old_key text, new_key text)
returns jsonb language sql as $$
select jsonb_agg(
    case
        when value->old_key is null then value
        else value - old_key || jsonb_build_object(new_key, value->old_key)
    end)
from jsonb_array_elements(arr)
$$;
Use:
with my_table (id, data) as (
    values (1,
        '{
            "myArray": [
                {"myValue": 1},
                {"myValue": 2},
                {"myValue": 3}
            ]
        }'::jsonb)
)
select
    id,
    jsonb_build_object(
        'myArray',
        jsonb_change_keys_in_array(data->'myArray', 'myValue', 'otherValue')
    )
from my_table;
id | jsonb_build_object
----+------------------------------------------------------------------------
1 | {"myArray": [{"otherValue": 1}, {"otherValue": 2}, {"otherValue": 3}]}
(1 row)
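To persist the rename rather than just select the rewritten document, the same expression can drive an UPDATE (a sketch, assuming a real my_table of that shape):

update my_table
set data = jsonb_build_object(
    'myArray',
    jsonb_change_keys_in_array(data->'myArray', 'myValue', 'otherValue')
);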
Using the JSON functions is definitely the most elegant approach, but you can get by with character replacement: cast the json(b) to text, perform the replace, then cast it back to json(b). In this example I included the quotes and the colon so the text replace targets the JSON keys without clashing with the values.
CREATE TABLE mytable ( id INT, data JSONB );
INSERT INTO mytable VALUES (1, '{"myArray": [{"myValue": 1},{"myValue": 2},{"myValue": 3}]}');
INSERT INTO mytable VALUES (2, '{"myArray": [{"myValue": 4},{"myValue": 5},{"myValue": 6}]}');
SELECT * FROM mytable;
UPDATE mytable
SET data = REPLACE(data :: TEXT, '"myValue":', '"otherValue":') :: JSONB;
SELECT * FROM mytable;
http://sqlfiddle.com/#!17/1c28a/9/4

How to update jsonb string with PostgreSQL?

I'm using PostgreSQL 9.4.5. I'd like to update a jsonb column.
My table is structured this way:
CREATE TABLE my_table (
    gid serial PRIMARY KEY,
    "data" jsonb
);
JSON strings are like this:
{"files": [], "ident": {"id": 1, "country": null, "type ": "20"}}
The following SQL doesn't do the job (syntax error - SQL state = 42601):
UPDATE my_table SET "data" -> 'ident' -> 'country' = 'Belgium';
Is there a way to achieve that?
OK, here are two functions:
create or replace function set_jsonb_value(p_j jsonb, p_key text, p_value jsonb)
returns jsonb as $$
select jsonb_object_agg(t.key, t.value)
from (
    select
        key,
        case
            when jsonb_typeof(value) = 'object' then set_jsonb_value(value, p_key, p_value)
            when key = p_key then p_value
            else value
        end as value
    from jsonb_each(p_j)) as t;
$$ language sql immutable;
The first one just changes the value of an existing key, regardless of the key path:
postgres=# select set_jsonb_value(
'{"files": [], "country": null, "ident": {"id": 1, "country": null, "type ": "20"}}',
'country',
'"foo"');
set_jsonb_value
--------------------------------------------------------------------------------------
{"files": [], "ident": {"id": 1, "type ": "20", "country": "foo"}, "country": "foo"}
(1 row)
create or replace function set_jsonb_value(p_j jsonb, p_path text[], p_value jsonb)
returns jsonb as $$
select jsonb_object_agg(t.key, t.value)
from (
    select
        key,
        case
            when jsonb_typeof(value) = 'object' then set_jsonb_value(value, p_path[2:1000], p_value)
            when key = p_path[1] then p_value
            else value
        end as value
    from jsonb_each(p_j)
    union all
    select
        p_path[1],
        case
            when array_length(p_path, 1) = 1 then p_value
            else set_jsonb_value('{}', p_path[2:1000], p_value)
        end
    where not p_j ? p_path[1]) as t;
$$ language sql immutable;
The second one changes the value of an existing key using the specified path, or creates the key if the path does not exist:
postgres=# select set_jsonb_value(
'{"files": [], "country": null, "ident": {"id": 1, "type ": "20"}}',
'{ident,country}'::text[],
'"foo"');
set_jsonb_value
-------------------------------------------------------------------------------------
{"files": [], "ident": {"id": 1, "type ": "20", "country": "foo"}, "country": null}
(1 row)
postgres=# select set_jsonb_value(
'{"files": [], "country": null, "ident": {"id": 1, "type ": "20"}}',
'{ident,foo,bar,country}'::text[],
'"foo"');
set_jsonb_value
-------------------------------------------------------------------------------------------------------
{"files": [], "ident": {"id": 1, "foo": {"bar": {"country": "foo"}}, "type ": "20"}, "country": null}
(1 row)
Hope it helps someone who is using PostgreSQL < 9.5.
Disclaimer: Tested on PostgreSQL 9.5
In PG 9.4 you are out of luck with "easy" solutions like jsonb_set() (9.5). Your only option is to unpack the JSON object, make the changes and re-build the object. That sounds very cumbersome and it is indeed: JSON is horrible to manipulate, no matter how advanced or elaborate the built-in functions.
CREATE TYPE data_ident AS (id integer, country text, "type" integer);

UPDATE my_table
SET "data" = json_build_object('files', "data"->'files', 'ident', ident.j)::jsonb
FROM (
    SELECT gid, json_build_object('id', j.id, 'country', 'Belgium', 'type', j."type") AS j
    FROM my_table
    JOIN LATERAL jsonb_populate_record(null::data_ident, "data"->'ident') j ON true
) ident
WHERE my_table.gid = ident.gid;
In the SELECT clause "data"->'ident' is unpacked into a record (for which you need to CREATE TYPE a structure). Then it is built right back into a JSON object with the new country name. In the UPDATE that "ident" object is re-joined with the "files" object and the whole thing cast to a jsonb.
A pure thing of beauty -- just so long as speed is not your thing...
My previous solution relied on 9.5 functionality.
I would recommend instead either going with abelisto's solutions below or using pl/perlu, plpythonu, or plv8js to write json mutators in a language that has better support for them.
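For completeness: from 9.5 on, the built-in jsonb_set covers the original UPDATE directly, with no custom functions required:

-- PostgreSQL 9.5+: replace data->'ident'->'country' in place
UPDATE my_table
SET "data" = jsonb_set("data", '{ident,country}', '"Belgium"');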