How can I merge records inside two JSON arrays? - json

I have two Postgres SQL queries returning JSON arrays:
q1:
[
{"id": 1, "a": "text1a", "b": "text1b"},
{"id": 2, "a": "text2a", "b": "text2b"},
{"id": 2, "a": "text3a", "b": "text3b"},
...
]
q2:
[
{"id": 1, "percent": 12.50},
{"id": 2, "percent": 75.00},
{"id": 3, "percent": 12.50}
...
]
I want the result to be a union of both array unique elements:
[
{"id": 1, "a": "text1a", "b": "text1b", "percent": 12.50},
{"id": 2, "a": "text2a", "b": "text2b", "percent": 75.00},
{"id": 3, "a": "text3a", "b": "text3b", "percent": 12.50},
...
]
How can this be done with SQL in Postgres 9.4?

Assuming data type jsonb and that you want to merge records of each JSON array that share the same 'id' value.
Postgres 9.5
makes it simpler with the new concatenate operator || for jsonb values:
SELECT json_agg(elem1 || elem2) AS result
FROM (
SELECT elem1->>'id' AS id, elem1
FROM (
SELECT '[
{"id":1, "percent":12.50},
{"id":2, "percent":75.00},
{"id":3, "percent":12.50}
]'::jsonb AS js
) t, jsonb_array_elements(t.js) elem1
) t1
FULL JOIN (
SELECT elem2->>'id' AS id, elem2
FROM (
SELECT '[
{"id": 1, "a": "text1a", "b": "text1b", "percent":12.50},
{"id": 2, "a": "text2a", "b": "text2b", "percent":75.00},
{"id": 3, "a": "text3a", "b": "text3b", "percent":12.50}]'::jsonb AS js
) t, jsonb_array_elements(t.js) elem2
) t2 USING (id);
The FULL [OUTER] JOIN makes sure you don't lose records without match in the other array.
The type jsonb has the convenient property to only keep the latest value for each key in the record. Hence, the duplicate 'id' key in the result is merged automatically.
The Postgres 9.5 manual also advises:
Note: The || operator concatenates the elements at the top level of
each of its operands. It does not operate recursively. For example, if
both operands are objects with a common key field name, the value of
the field in the result will just be the value from the right hand operand.
Postgres 9.4
Is a bit less convenient. My idea would be to extract array elements, then extract all key/value pairs, UNION both results, aggregate into a single new jsonb values per id value and finally aggregate into a single array.
SELECT json_agg(j) -- ::jsonb
FROM (
SELECT json_object_agg(key, value)::jsonb AS j
FROM (
SELECT elem->>'id' AS id, x.*
FROM (
SELECT '[
{"id":1, "percent":12.50},
{"id":2, "percent":75.00},
{"id":3, "percent":12.50}]'::jsonb AS js
) t, jsonb_array_elements(t.js) elem, jsonb_each(elem) x
UNION ALL -- or UNION, see below
SELECT elem->>'id' AS id, x.*
FROM (
SELECT '[
{"id": 1, "a": "text1a", "b": "text1b", "percent":12.50},
{"id": 2, "a": "text2a", "b": "text2b", "percent":75.00},
{"id": 3, "a": "text3a", "b": "text3b", "percent":12.50}]'::jsonb AS js
) t, jsonb_array_elements(t.js) elem, jsonb_each(elem) x
) t
GROUP BY id
) t;
The cast to jsonb removes duplicate keys. Alternatively you could use UNION to fold duplicates (for instance if you want json as result). Test which is faster for your case.
Related:
How to turn json array into postgres array?
Merging Concatenating JSON(B) columns in query

For any single jsonb element this use of the concat || operator works well for me with strip_nulls and another trick to cast the result back to jsonb (not an array).
select jsonb_array_elements(jsonb_strip_nulls(jsonb_agg(
'{
"a" : "unchanged value",
"b" : "old value",
"d" : "delete me"
}'::jsonb
|| -- The concat operator works as merge on jsonb, the right operand takes precedence
-- NOTE: it only works one JSON level deep
'{
"b" : "NEW value",
"c" : "NEW field",
"d" : null
}'::jsonb
)));
This gives the result
{"a": "unchanged value", "b": "NEW value", "c": "NEW field"}
which is properly typed jsonb

Related

Use MySQL 8 json_table to dynamically extract JSON keys and nested value

I've been attempting to use MySQL 8's JSON_TABLE to extract the root keys and then their nested values. The problem is the root keys are dynamic and the nested key/value pairs might not exist.
JSON:
{
"Foo": {
"A": 3
},
"Bar": {
"A": 1,
"B": 368
},
"Biz": {
"C": 2,
"D": 10
}
}
In this JSON the root keys "Foo", "Bar", and "Biz" are dynamic and for each of their objects I want to extract the "A" key's value, which may or may not exist. For example, the above code would return this result set:
json_key
a_value
Foo
3
Bar
1
Biz
null
I've been something along these lines but no luck (just returns one row of nulls):
select * from json_table('{"Foo": {"A": 3}, "Bar": {"A": 1, "B": 368}, "Biz": {"C": 2}}',
'$' COLUMNS(
json_key varchar(255) path '$.*',
sub_value integer path '$.*.A'
)
) as i;
In the worst case I can try to restructure the JSON but it's already in the database so I'm hoping to leverage MySQL's JSON capability. Any ideas?
SELECT json_key,
JSON_EXTRACT(#json_value, CONCAT('$.', json_key, '.A')) a_value
FROM JSON_TABLE(JSON_KEYS(#json_value),
'$[*]' COLUMNS (json_key VARCHAR(255) PATH '$')) keystable
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=dd9c01e77d57206d587dd2d17340bc02

How to declare a JSON string in postgres?

There are many examples of json parsing in POSTGRES, which pull data from a table. I have a raw json string handy and would like to practice using JSON functions and operators. Is it possible to do this without using tables? Or ... what is the most straightfoward way to declare it as a variable? Something like...
# Declare
foojson = "{'a':'foo', 'b':'bar'}"
# Use
jsonb_array_elements(foojson) -> 'a'
Basically I'd like the last line to print to console or be wrappable in a SELECT statement so I can rapidly "play" with some of these operators.
You can pass it directly to the function
select '{"a": "foo", "b": "bar"}'::jsonb ->> 'a';
select *
from jsonb_each('{"a": "foo", "b": "bar"}');
select *
from jsonb_array_elements('[{"a": "foo"}, {"b": "bar"}]');
Or if you want to pretend, it's part of a table:
with data (json_value) as (
values
('{"a": "foo", "b": "bar"}'::jsonb),
('{"foo": 42, "x": 100}')
)
select e.*
from data d
cross join jsonb_each(d.json_value) as e;
with data (json_value) as (
values
('{"a": 1, "b": "x"}'::jsonb),
('{"a": 42, "b": "y"}')
)
select d.json_value ->> 'a',
d.json_value ->> 'b'
from data d;

How to search a json array for some items with a certain key?

Table: some_table, Records:
id|data (json array of json object)
1|[{"a": 1, "b": 2}, {"a": 3, "b": 4}]
2|[{"a": 5, "b": 6}, {"a": 3, "b": 4}]
Would u show me the SQL (MySQL), to find any record whose data has a item (with a key "a" and value 5). Only record #2 will be found. I tried the following SQL but failed, because I'm new to use SQL for JSON.
select * from some_table where json_contains(data, '5', '$.a') = 1;
select id, data
from
(
select 1 as id, '[{"a": 1, "b": 2}, {"a": 3, "b": 4}]' as data
union select 2 as id, '[{"a": 5, "b": 6}, {"a": 3, "b": 4}]' as data
) as tmp
where json_contains(data -> '$[*].a','5')
You have two fundamental flaws in your current SQL statement:
For checking if keys exist in a JSON document, you'll want to utilize json_contains_path() as opposed to json_contains(), which will allow you to specify a specific key path without actually having to provide a value for which you're looking for said key.
The path you're leveraging to look for key a doesn't respect that that key would be contained within an array structure - you need to modify your path to look everywhere in the document for a key a. (H/T to MySQL :: MySQL 8.0 Reference Manual :: 11.6 The JSON Data Type)
With this knowledge, your query would now look like this:
select * from some_table where json_contains_path(data, 'one', '$**.a')
which, utilizing your sample dataset, returns:
+----+--------------------------------------+
| id | data |
+----+--------------------------------------+
| 1 | [{"a": 1, "b": 2}, {"a": 3, "b": 4}] |
| 2 | [{"a": 5, "b": 6}, {"a": 3, "b": 4}] |
+----+--------------------------------------+
DB Fiddle

Postgresql 9.4 - Invalid input syntax when converting to JSONB

I've made a simple function to update a jsonb with new values:
CREATE OR REPLACE FUNCTION jsonupdate(
IN "pJson" jsonb, IN "pNewValues" jsonb)
RETURNS jsonb AS
$BODY$
DECLARE
jsonreturn jsonb;
BEGIN
jsonreturn := (SELECT json_object_agg(keyval.key, keyval.value::jsonb)
FROM (SELECT key,
CASE WHEN "pNewValues" ? key THEN
(SELECT "pNewValues" ->> key)
ELSE
value
END
FROM jsonb_each_text("pJson")) keyval);
RETURN jsonreturn;
END; $BODY$
LANGUAGE plpgsql IMMUTABLE
COST 100;
Sample inputs and outputs:
IN: SELECT jsonupdate('{"a" : "1", "b" : "2"}', '{"a": "3"}');
OUT: {"a": 3, "b": 2}
IN: SELECT jsonupdate('{"a" : "3", "b" : { "c": "text", "d": 1 }}', '{"b": { "c": "another text" }}');
OUT: {"a": 3, "b": {"c": "another text"}}
IN: SELECT jsonupdate('{"a" : "1", "b" : "2", "c": 3, "d": 4}', '{"a": "5", "d": 6}');
OUT: {"a": 5, "b": 2, "c": 3, "d": 6}
The problem happens when using inputs like this one: SELECT jsonupdate('{"a" : "1", "b" : ""}', '{"a": "5"}') or this one: SELECT jsonupdate('{"a" : "1", "b" : "2"}', '{"a": "."}') or this one: SELECT jsonupdate('{"a" : "1", "b" : "2"}', '{"a": ""}') it gives me an error
ERROR: invalid input syntax for type json
DETAIL: The input string ended unexpectedly.
CONTEXT: JSON data, line 1:
What's wrong here?
You sould use the jsonb_each() function (instead of jsonb_each_text()). Also, the -> operator (instead of ->>):
CREATE OR REPLACE FUNCTION jsonupdate(IN "pJson" jsonb, IN "pNewValues" jsonb)
RETURNS jsonb
LANGUAGE sql
IMMUTABLE AS
$BODY$
SELECT json_object_agg(key, CASE
WHEN "pNewValues" ? key THEN "pNewValues" -> key
ELSE value
END)
FROM jsonb_each("pJson")
$BODY$;
jsonb_each_text() and the ->> operator converts any non-string JSON value to their string representation. Converting those back to JSON will modify your data in a way you probably don't want to.
But I have to admit, what you are trying to achieve is almost the || (concatenation) operator. I.e.
SELECT jsonb '{"a" : "1", "b" : "2"}' || jsonb '{"a": "3"}'
will give you your desired output. The only difference between || and your function is when pNewValues contains key(s), which are not in pJson: || will append those too, while your function does not append them (it only modifies existing ones).
Update: for simulating the || operator on 9.4, you can use the following function:
CREATE OR REPLACE FUNCTION jsonb_merge_objects(jsonb, jsonb)
RETURNS jsonb
LANGUAGE sql
IMMUTABLE AS
$func$
SELECT json_object_agg(key, COALESCE(b.value, a.value))
FROM jsonb_each($1) a
LEFT JOIN jsonb_each($2) b USING (key)
$func$;

jsonb_agg: Avoid objects wrapped with "jsonb_build_object"

I can create JSON objects using jsonb_build_object the way I want them. E.g.
SELECT jsonb_build_object('id', id) FROM (SELECT generate_series(1,3) id) objects;
results in
jsonb_build_object
------------------
{"id": 1}
{"id": 2}
{"id": 3}
But when I want to add them to an array, they are wrapped in an additional object, using the column name as key:
SELECT jsonb_build_object(
'foo', 'bar',
'collection', jsonb_agg(collection)
)
FROM (
SELECT jsonb_build_object('id', id)
FROM (
SELECT generate_series(1,3) id
) objects
) collection;
results in
{"foo": "bar", "collection": [{"jsonb_build_object": {"id": 1}}, {"jsonb_build_object": {"id": 2}}, {"jsonb_build_object": {"id": 3}}]}
How can I get
{"foo": "bar", "collection": [{"id": 1}, {"id": 2}, {"id": 3}]}
instead?
Use jsonb_agg(collection.jsonb_build_object). You can use aliases too, but the point is that collection refers to the entire row, which has a (single) jsonb_build_object named (by default) column, which is the JSON you want to aggregate.
With simplifying and aliases, you query can be:
SELECT jsonb_build_object(
'foo', 'bar',
'collection', jsonb_agg(js)
)
FROM generate_series(1,3) id
CROSS JOIN LATERAL jsonb_build_object('id', id) js;
Notes:
LATERAL is implicit, I just wrote it for clarity
aliasing like this in the FROM clause creates a table & a column alias too, with the same name. So it is equivalent to jsonb_build_object('id', id) AS js(js)