I am at a loss as to how to use JSON_TABLE in MySQL 8.0 on only a small part of another table when selecting that part involves a JOIN or a subquery.
Let's start with a table predictions, which contains the value of each prediction as well as a foreign key:
SELECT * FROM predictions
id | ml_model_id | value
+--+-------------+------
1 | 1 | [{"class": "dog", "confidence": 1}, {"class": "cat", "confidence": 0.03}]
Here is the selection of the whole table that works well:
SELECT *
FROM predictions,
JSON_TABLE(
predictions.value,
"$[*]"
COLUMNS (class VARCHAR(30) PATH "$.class",
confidence FLOAT PATH "$.confidence")
) myjson
Original problem: a subquery cannot be used as a table to build the JSON_TABLE on
However, this no longer works if I replace the predictions table with a subquery on it:
SELECT * FROM (
SELECT * FROM predictions
WHERE ml_model_id = 1
# Do any type of filtering here, or even none at all
) preds,
JSON_TABLE(
preds.value,
"$[*]"
COLUMNS (class VARCHAR(30) PATH "$.class",
confidence FLOAT PATH "$.confidence")
) myjson
This raises SQL error 1210, "Incorrect argument to JSON_TABLE".
Limited workaround: If the filtering is on a table column, you can use WHERE...
My first thought was to move the filtering outside of the subquery, which works to a certain extent:
SELECT * FROM predictions,
JSON_TABLE(...) myjson
WHERE predictions.ml_model_id = 1
This solution works, but it already strikes me as inelegant, because it means that the comma between predictions and JSON_TABLE is an implicit lateral join rather than a cross join. (Edit: note that using CROSS JOIN instead of the comma also performs an implicit lateral join.)
...But it does not work if your filtering involves JOINs
However, if I want to filter with a JOIN instead of a WHERE, I get an even more cryptic error:
SELECT * FROM predictions,
JSON_TABLE(...) myjson
INNER JOIN ml_models
ON ml_models.id = predictions.ml_model_id
AND ml_models.usage = "classification"
This raises SQL error [1054] Unknown column 'predictions.ml_model_id' in 'on clause'. Why would this column not be found? Running the query without the inner join actually returns it!
Inelegant workaround that seems to work in all cases: Using EXISTS in WHERE clause
Simply replacing the INNER JOIN with a WHERE EXISTS works, but having to do it this way is really counterintuitive.
SELECT * FROM predictions,
JSON_TABLE(...) myjson
WHERE EXISTS (SELECT 1 FROM ml_models
WHERE ml_models.id = predictions.ml_model_id
AND ml_models.usage = "classification")
Is there something I don't understand about JSON_TABLE that prevents me from using a subquery with it?
I finally managed to make it work with a subquery as the first part of the join, by casting the JSON field to JSON again inside JSON_TABLE:
SELECT * FROM (
SELECT * FROM predictions
WHERE ml_model_id = 1
# Do any type of filtering here, or even none at all
) preds,
JSON_TABLE(
CAST(preds.value AS JSON),
"$[*]"
COLUMNS (class VARCHAR(30) PATH "$.class",
confidence FLOAT PATH "$.confidence")
) myjson
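I am, however, even more at a loss as to why this works, so I will keep the question open in case someone can explain why it works this way and not without the cast.
This works for me; it appears to be a matter of the order of operations among the joins. JOIN binds more tightly than the comma, so placing the INNER JOIN before the comma keeps the predictions table in scope for the ON clause.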
SELECT * FROM predictions preds
INNER JOIN ml_models
ON ml_models.id = preds.ml_model_id
AND ml_models.usage = "classification",
JSON_TABLE(preds.value, ...) myjson
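For reference, here is the same reordering written out in full as a sketch. It assumes the predictions and ml_models tables from the question, reuses the JSON_TABLE arguments from the working whole-table query, and replaces the comma with CROSS JOIN (which, as the question notes, behaves the same way here). The select list is only an illustration.

```sql
-- Sketch for MySQL 8.0: filter with the INNER JOIN first, then expand the JSON.
SELECT preds.id, myjson.class, myjson.confidence
FROM predictions AS preds
INNER JOIN ml_models
        ON ml_models.id = preds.ml_model_id
       AND ml_models.`usage` = 'classification'
-- JSON_TABLE may refer to columns of tables that precede it in the FROM clause.
CROSS JOIN JSON_TABLE(
    preds.value,
    '$[*]'
    COLUMNS (
        class VARCHAR(30) PATH '$.class',
        confidence FLOAT PATH '$.confidence'
    )
) AS myjson;
```

Because only explicit joins are used, there is no precedence difference between a comma and JOIN to worry about.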
I have a json column inside my PostgreSQL table that looks something like this:
{"example--4--":"test 1","another example--6--":"test 2","final example--e35b172a-af71-4207-91be-d1dc357fe8f3--Equipment":"ticked"}
{"example--4--":"test 4","another example--6--":"test 5","final example--e35b172a-af71-4207-91be-d1dc357fe8f3--Equipment":"ticked"}
Each key contains a map which is separated by --. The prefix is unique, ie: "example", "another example" and "final example".
I need to query on the unique prefix and so far, nothing I'm trying is even close.
select some_table.json_column from some_table
left join lateral (select array(select * from json_object_keys(some_table.json_column) as keys) k on true
where (select SPLIT_PART(k::text, '--', 1) as part_name) = 'example'
and some_table.json_column->>k = 'test 1'
The above is resulting in the following error (last line):
operator does not exist: json -> record
My expected output would be any records where "example--4--":"test 1" is present (in my above example, the only result would be)
{"example--4--":"test 1","another example--6--":"test 2","final example--e35b172a-af71-4207-91be-d1dc357fe8f3--Equipment":"ticked"}
Any help appreciated. After debugging for a while, I can see that the main issue revolves around the implicit cast to ::text. k seems to be a "record" of the keys, which I need to loop over and split in order to compare; currently I am casting a record to text, which is causing the issue.
One way to do it is to use an EXISTS condition together with jsonb_each_text():
select *
from the_table
where exists (select *
from jsonb_each_text(data) as x(key,value)
where x.key like 'example%'
and x.value = 'test 1')
If your column isn't a jsonb (which it should be), you need to use json_each_text() instead
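As a self-contained sketch of the EXISTS approach, the query below inlines two shortened sample rows based on the question's data (the CTE and its name the_table are only there to make it runnable). It also swaps the LIKE test for split_part(), so that only keys whose prefix is exactly 'example' match; that variation is an assumption about what the asker wants, not part of the original answer.

```sql
WITH the_table(data) AS (
    VALUES
        ('{"example--4--":"test 1","another example--6--":"test 2"}'::jsonb),
        ('{"example--4--":"test 4","another example--6--":"test 5"}'::jsonb)
)
SELECT t.data
FROM the_table AS t
WHERE EXISTS (
    SELECT 1
    FROM jsonb_each_text(t.data) AS x(key, value)
    -- Compare only the part of the key before the first "--".
    WHERE split_part(x.key, '--', 1) = 'example'
      AND x.value = 'test 1'
);
```

With these two sample rows, only the first one has a key with prefix 'example' whose value is 'test 1'.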
Another option is to use a JSON path expression:
select *
from the_table
where data @? '$.keyvalue() ? (@.key like_regex "^example" && @.value == "test 1")'
If I have a table with a column named json_stuff, and I have two rows with
{ "things": "stuff" } and { "more_things": "more_stuff" }
in their json_stuff column, what query can I make across the table to receive [ things, more_things ] as a result?
Use this:
select jsonb_object_keys(json_stuff) from table;
(Or just json_object_keys if you're using just json.)
The PostgreSQL json documentation is quite good. Take a look.
As stated in the documentation, the function only returns the outermost keys. So if the data is a nested JSON structure, the function will not return any of the deeper keys.
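A quick way to see the outermost-keys behaviour is to run the function on a literal value. The sample document here is made up for illustration.

```sql
-- Returns one row per top-level key ("a" and "c"); the nested key "b" is not returned.
SELECT jsonb_object_keys('{"a": {"b": 1}, "c": 2}'::jsonb) AS key;
```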
WITH t(json_stuff) AS ( VALUES
('{"things": "stuff"}'::JSON),
('{"more_things": "more_stuff"}'::JSON)
)
SELECT array_agg(stuff.key) result
FROM t, json_each(t.json_stuff) stuff;
Here is an example if you want to get the key list of each object:
select array_agg(json_keys),id from (
select json_object_keys(json_stuff) as json_keys,id from table) a group by a.id
Here id is the identifier or unique value of each row. If the row cannot be distinguished by identifier, maybe it's better to try PL/pgSQL.
Here's a solution that implements the same semantics as MySQL's JSON_KEYS(), which...:
is NULL safe (i.e. when the array is empty, it produces [], not NULL, or an empty result set)
produces a JSON array, which is what I would have expected from how the question was phrased.
SELECT
o,
(
SELECT coalesce(json_agg(j), json_build_array())
FROM json_object_keys(o) AS j (j)
)
FROM (
VALUES ('{}'::json), ('{"a":1}'::json), ('{"a":1,"b":2}'::json)
) AS t (o)
Replace json by jsonb if needed.
Producing:
|o |coalesce |
|-------------|----------|
|{} |[] |
|{"a":1} |["a"] |
|{"a":1,"b":2}|["a", "b"]|
Substitute your own JSON column and table name:
select distinct(tableProps.props) from (
select jsonb_object_keys(<json_column>) as props from <table>
) as tableProps
I wanted to get the number of keys in a JSONB structure, so I'm doing something like this:
select into cur some_jsonb from mytable where foo = 'bar';
select into keys array_length(array_agg(k), 1) from jsonb_object_keys(cur) as k;
I feel it is a little bit wrong, but it works. It's unfortunate that we can't get an array directly from the json_object_keys() function. That would save us some code.
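One possible shortcut, sketched here in plain SQL rather than PL/pgSQL, is to count the rows returned by jsonb_object_keys() directly instead of aggregating them into an array first. It assumes the same mytable, foo and some_jsonb names as above, and that some_jsonb holds a JSON object (the function raises an error for arrays and scalars).

```sql
-- Count the top-level keys of each matching row's JSONB value.
SELECT (SELECT count(*) FROM jsonb_object_keys(t.some_jsonb)) AS key_count
FROM mytable AS t
WHERE t.foo = 'bar';
```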
Datamodel
A person is represented in the database as a meta table row with a name and with multiple attributes which are stored in the data table as key-value pair (key and value are in separate columns).
Simplified data-model
Now there is a query to retrieve all users (name) with all their attributes (data). The attributes are returned as JSON object in a separate column. Here is an example:
name data
Florian { "age":25 }
Markus { "age":25, "color":"blue" }
Thomas {}
The SQL command looks like this:
SELECT
name,
json_object_agg(d.key, d.value) AS data
FROM meta AS m
JOIN (
SELECT d.fk_id, d.key, d.value AS value FROM data AS d
) AS d
ON d.fk_id = m.id
GROUP BY m.name;
Problem
The problem I am facing is that users like Thomas, who do not have any attributes stored in the key-value table, are not returned by my select. This is because it does a plain JOIN rather than a LEFT OUTER JOIN.
If I use a LEFT OUTER JOIN, I run into the problem that json_object_agg tries to aggregate NULL values and dies with an error.
Approaches
1. Return empty list of keys and values
So I tried to check if the key-column of a user is NULL and return an empty array so json_object_agg would just create an empty JSON object.
But there is not really a function to create an empty array in SQL. The nearest thing I found was this:
select '{}'::text[];
In combination with COALESCE the query looks like this:
json_object_agg(COALESCE(d.key, '{}'::text[]), COALESCE(d.value, '{}'::text[])) AS data
But if I try to use this I get following error:
ERROR: COALESCE types text and text[] cannot be matched
LINE 10: json_object_agg(COALESCE(d.key, '{}'::text[]), COALES...
^
Query failed
PostgreSQL said: COALESCE types text and text[] cannot be matched
So it looks like that at runtime d.key is a single value and not an array.
2. Split up JSON creation and return empty list
So I tried to take json_object_agg and replace it with json_object which does not aggregate the keys for me:
json_object(COALESCE(array_agg(d.key), '{}'::text[]), COALESCE(array_agg(d.value), '{}'::text[])) AS data
But then I get the error "null value not allowed for object key". So COALESCE does not check whether the array is empty.
Question
So, is there a function to check if a joined column is empty, and if yes return just a simple JSON object?
Or is there any other solution which would solve my problem?
Use left join with coalesce(). As default value use '{}'::json.
select name, coalesce(d.data, '{}'::json) as data
from meta m
left join (
select fk_id, json_object_agg(d.key, d.value) as data
from data d
group by 1
) d
on m.id = d.fk_id;
name | data
---------+------------------------------------
Florian | { "age" : "25" }
 Markus  | { "age" : "25", "color" : "blue" }
Thomas | {}
(3 rows)
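An alternative sketch, assuming PostgreSQL 9.4 or later and the same meta and data tables, keeps the aggregation in the outer query and uses an aggregate FILTER clause to skip the NULL keys that the LEFT JOIN produces for people without attributes. Grouping by m.id is an assumption that id is the key of meta.

```sql
SELECT m.name,
       COALESCE(
           -- Rows with a NULL key (no attributes) are excluded from the aggregate;
           -- if nothing is left, json_object_agg yields NULL and COALESCE supplies {}.
           json_object_agg(d.key, d.value) FILTER (WHERE d.key IS NOT NULL),
           '{}'::json
       ) AS data
FROM meta AS m
LEFT JOIN data AS d ON d.fk_id = m.id
GROUP BY m.id, m.name;
```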
Here is a brief explanation of what I'm trying to accomplish; my query follows below.
There are 4 tables and 1 view which are relevant for this particular query (sorry the names look messy, but they follow a strict convention that would make sense if you saw the full list):
Performances may have many Performers, and those associations are stored in PPerformer. Fans can have favorites, which are stored in Favorite_Performer. The _UpcomingPerformances view contains all the information needed to display a user-friendly list of upcoming performances.
My goal is to select all the data from _UpcomingPerformances, then include one additional column that specifies whether the given Performance has a Performer which the Fan added as their favorite. This involves selecting the list of Performers associated with the Performance, and also the list of Performers who are in Favorite_Performer for that Fan, and intersecting the two arrays to determine if anything is in common.
When I execute the below query, I get the error #1054 - Unknown column 'up.pID' in 'where clause'. I suspect it's somehow related to a misuse of Correlated Subqueries but as far as I can tell what I'm doing should work. It works when I replace up.pID (in the WHERE clause of t2) with a hard-coded number, and yes, pID is an existing column of _UpcomingPerformances.
Thanks for any help you can provide.
SELECT
up.*,
CASE
WHEN EXISTS (
SELECT * FROM (
SELECT RID FROM Favorite_Performer
WHERE FanID = 107
) t1
INNER JOIN
(
SELECT r.ID as RID
FROM PPerformer pr
JOIN Performer r ON r.ID = pr.Performer_ID
WHERE pr.Performance_ID = up.pID
) t2
ON t1.RID = t2.RID
)
THEN "yes"
ELSE "no"
END as pText
FROM
_UpcomingPerformances up
The problem is scope related. The nested Selects make the up table invisible inside the internal select. Try this:
SELECT
up.*,
CASE
WHEN EXISTS (
SELECT *
FROM Favorite_Performer fp
JOIN Performer r ON fp.RID = r.ID
JOIN PPerformer pr ON r.ID = pr.Performer_ID
WHERE fp.FanID = 107
AND pr.Performance_ID = up.pID
)
THEN 'yes'
ELSE 'no'
END as pText
FROM
_UpcomingPerformances up
I'm new to joins and I'm sure this is ridiculously simple. If I remove one join in the query the remainder of the query works regardless of which join I remove. But as shown it gives the error saying the column doesn't exist. Any pointers?
select
loc_carr.address1 as carr_addr1,
loc_cust.address1 as cust_addr1
from db_name.carrier, db_name.customer
join db_name.location as loc_carr on vats.carrier.location_id=loc_carr.location_id
join db_name.location as loc_cust on vats.customer.location_id=loc_cust.location_id
thanks
I'll take a guess that there is a column named something like carrier_id that can be used to join the carrier and customer tables. Given that assumption, try this:
select
loc_carr.address1 as carr_addr1
, loc_cust.address1 as cust_addr1
from vats.carrier as a
join vats.customer as b
on b.carrier_id=a.carrier_id
join vats.location as loc_carr
on loc_carr.location_id=a.location_id
join vats.location as loc_cust
on loc_cust.location_id=b.location_id
Notice the use of aliases for the table references to make things easier to read. Also note how I'm using explicit SQL join syntax (instead of listing tables separated by commas).
@Bob Duell has the solution for your problem. To better understand why this error is produced, notice that in the FROM clause you join tables using both the explicit JOIN syntax and the implicit comma join, which is (almost) equivalent to a CROSS JOIN. However, JOIN has higher precedence than the comma operator, so that part is parsed like this:
FROM
( db_name.carrier )
,
( ( db_name.customer
JOIN db_name.location AS loc_carr
ON carrier.location_id = loc_carr.location_id -- this line
) -- gives the error
JOIN db_name.location AS loc_cust
ON customer.location_id = loc_cust.location_id
)
In the line marked above, vats.carrier.location_id throws the error, as there is no carrier table in that scope (inside those parentheses).
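If the Cartesian product between carrier and customer is really intended, one way to sidestep the precedence issue is to replace the comma with an explicit CROSS JOIN, so that all joins are evaluated left to right. This is only a sketch: it uses db_name for every table and introduces the aliases carr and cust, which are assumptions made for readability.

```sql
SELECT loc_carr.address1 AS carr_addr1,
       loc_cust.address1 AS cust_addr1
FROM db_name.carrier AS carr
CROSS JOIN db_name.customer AS cust
-- Both carr and cust are now in scope for the ON clauses below.
JOIN db_name.location AS loc_carr
  ON carr.location_id = loc_carr.location_id
JOIN db_name.location AS loc_cust
  ON cust.location_id = loc_cust.location_id;
```

Note that this still pairs every carrier with every customer, exactly as the original comma join would have; joining the two tables on a shared column, as in the previous answer, is usually what is wanted.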