MySQL JSON_EXTRACT ignores some fields when doing 'not matching' requests - mysql

I ran into some trouble when performing SELECT queries with JSON_EXTRACT on JSON data stored in a MySQL database.
The rows don't all have exactly the same JSON structure. Everything works well when I use JSON_EXTRACT to select rows matching a condition.
The problem arises when I try to select rows that do not match the condition. Only rows that have the key (with non-matching data, of course) are returned.
You'll find a fiddle here that reproduces this behavior.
I think this is intended, but I wonder whether there is a neat workaround that produces the result of the fiddle's fourth request without adding another condition (in the real case, the requests are generated programmatically from a specific API syntax, and adding contextual conditions would be a pain).

One way around your problem is to select the ids that match the expression, and then use them in an IN or NOT IN expression depending on whether you want to check for a match or a non-match, e.g.
SELECT *
FROM `test`
WHERE id IN (SELECT id
FROM `test`
WHERE data->>'$.test' = 'passed');
or
SELECT *
FROM `test`
WHERE id NOT IN (SELECT id
FROM `test`
WHERE data->>'$.test' = 'passed');
The only difference in the queries is the addition of the word NOT to negate the match.
Demo
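The behaviour in the question comes from SQL's three-valued logic: when the key is missing, the extracted value is NULL, and NULL <> 'passed' is NULL rather than true, so the row is dropped. A minimal sketch of that effect and of the NOT IN workaround follows. It uses Python's sqlite3 module with an in-memory database as a stand-in for MySQL, and a plain nullable column named status (an assumption for illustration) in place of data->>'$.test'.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; `status` plays the role of
# data->>'$.test' and is NULL for rows whose JSON lacks the key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO test (id, status) VALUES (?, ?)",
    [(1, "passed"), (2, "failed"), (3, None)],
)


def ids(sql):
    """Run a query and return the selected ids in ascending order."""
    return sorted(row[0] for row in conn.execute(sql))


# NULL <> 'passed' evaluates to NULL, so row 3 is filtered out.
naive = ids("SELECT id FROM test WHERE status <> 'passed'")

# Negating the set of matching ids keeps row 3 as well.
negated = ids(
    "SELECT id FROM test WHERE id NOT IN "
    "(SELECT id FROM test WHERE status = 'passed')"
)

print(naive, negated)  # prints: [2] [2, 3]
```

Because only the word NOT changes between the two forms, a query generator can build the inner match once and wrap it either way.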

Related

Is sql IN clause with single parameter the same as a query for single column?

In case I only provide a single value, are the following sql statements equivalent (eg in terms of performance)?
SELECT * FROM mytable where lastname IN(:lastnames);
SELECT * FROM mytable where lastname = :lastname;
Background: I have service that should serve a list, and a service that serves a single result. Now I thought why creating two database query endpoints, if I could achieve the same thing with just one query (means: also a single result could be queried by using the IN clause).
I tried it on my MariaDB database on a small table with hundreds of records, and the query with IN is a bit slower than the one with = (which is to be expected), but we are talking about a 0.02 sec difference.
Even assuming your database engine is optimised to detect a single value in the IN list and "convert" it to an equality comparison, it would still technically take longer than a plain equality.
Also see this about IN performance.
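As a rough sketch of the single-endpoint idea from the question, the helper below builds one placeholder per value, so the same code path serves one name or many. It runs against an in-memory SQLite database rather than MariaDB, and find_by_lastnames is a hypothetical name introduced only for this illustration. It shows that the results are the same, not how the timings compare.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, lastname TEXT)")
conn.executemany(
    "INSERT INTO mytable (lastname) VALUES (?)",
    [("Smith",), ("Jones",), ("Smith",), ("Brown",)],
)


def find_by_lastnames(lastnames):
    """Hypothetical helper: one query path for one or many last names."""
    if not lastnames:
        # An empty IN list is a syntax error in MySQL, so short-circuit.
        return []
    placeholders = ", ".join("?" for _ in lastnames)
    sql = (
        "SELECT id, lastname FROM mytable "
        f"WHERE lastname IN ({placeholders}) ORDER BY id"
    )
    return conn.execute(sql, list(lastnames)).fetchall()


# A one-element IN list returns the same rows as a plain equality test.
single = find_by_lastnames(["Smith"])
equals = conn.execute(
    "SELECT id, lastname FROM mytable WHERE lastname = ? ORDER BY id",
    ("Smith",),
).fetchall()
```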
Use this query.
It may solve your problem.
SELECT * FROM mytable where lastname IN(SELECT lastname FROM mytable where lastname = :lastname);

Check if MySQL Table is empty: COUNT(*) is zero vs. LIMIT(0,1) has a result?

This is a simple question about efficiency specifically related to the MySQL implementation. I want to just check if a table is empty (and if it is empty, populate it with the default data). Would it be best to use a statement like SELECT COUNT(*) FROM `table` and then compare to 0, or would it be better to do a statement like SELECT `id` FROM `table` LIMIT 0,1 then check if any results were returned (the result set has next)?
Although I need this for a project I am working on, I am also interested in how MySQL works with those two statements and whether the reason people seem to suggest using COUNT(*) is because the result is cached or whether it actually goes through every row and adds to a count as it would intuitively seem to me.
You should definitely go with the second query rather than the first.
When using COUNT(*), MySQL is scanning at least an index and counting the records. Even if you would wrap the call in a LEAST() (SELECT LEAST(COUNT(*), 1) FROM table;) or an IF(), MySQL will fully evaluate COUNT() before evaluating further. I don't believe MySQL caches the COUNT(*) result when InnoDB is being used.
Your second query results in only one row being read, furthermore an index is used (assuming id is part of one). Look at the documentation of your driver to find out how to check whether any rows have been returned.
By the way, the id field may be omitted from the query (MySQL will use an arbitrary index):
SELECT 1 FROM table LIMIT 1;
However, I think the simplest and most performant solution is the following (as indicated in Gordon's answer):
SELECT EXISTS (SELECT 1 FROM table);
EXISTS returns 1 if the subquery returns any rows, otherwise 0. Because of this semantic MySQL can optimize the execution properly.
Any fields listed in the subquery are ignored, thus 1 or * is commonly written.
See the MySQL Manual for more info on the EXISTS keyword and its use.
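The check-then-populate flow from the question could look roughly like this. It is a sketch against an in-memory SQLite database, which also accepts SELECT EXISTS (...); the settings table and the table_is_empty helper are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE settings (id INTEGER PRIMARY KEY, name TEXT)")


def table_is_empty(connection):
    """Return True when the (hypothetical) settings table has no rows."""
    # EXISTS yields 1 if the subquery returns any row, otherwise 0.
    (has_rows,) = connection.execute(
        "SELECT EXISTS (SELECT 1 FROM settings)"
    ).fetchone()
    return has_rows == 0


was_empty = table_is_empty(conn)
if was_empty:
    # Populate the default data only when nothing is there yet.
    conn.execute("INSERT INTO settings (name) VALUES ('default')")
is_empty_after = table_is_empty(conn)
```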
It is better to use the second method, or just EXISTS. Specifically, something like:
if exists (select id from table)
should be the fastest way to do what you want. You don't need the limit; the SQL engine takes care of that for you.
By the way, never put identifiers (table and column names) in single quotes.

Need a SQL Server 2008 case statement to evaluate a table and return two values

I'm a novice SQL programmer and have been banging my head against this all morning, so please bear with me. My situation is this: I have a table of SKUs that need to be sent to our eCommerce website. Each of these SKUs has a 'quantity', an 'active' value, and a 'discontinued' value. This was easy enough to handle when we were dealing with one SKU at a time, but now I have to send kits, which contain one or more SKUs.
For example, if my Kit's ID is 000920_001449_001718_999999 (a combination of four SKUs) I need to collect data for the entire set of SKUs like so:
Here's the logic I need to incorporate:
If any of the SKUs have null or WEBNO as an IsActive value, the entire kit must return WEBNO. Otherwise, return WEBYES.
If any of the SKUs have null or '1' as an IsDiscontinued value, the entire kit must return IsDiscontinued = '1'. Otherwise, return a 0.
My code is a bit of a mess, but here's what I've managed so far:
SELECT
CASE WHEN 'WEBNO' in
(
SELECT IsActive
FROM #SkusToSend as Sending
RIGHT JOIN
(
SELECT * FROM [eCommerce].[dbo].[Split] (
'000920_001449_001718_999999'
,'_')
) as SplitSkus
on Sending.SKU = SplitSkus.items
) THEN 'WEBNO'
ELSE 'WEBYES'
END
My question is this: Is it possible to write a statement that parses through my example table, returning only one row of 'IsActive' and 'IsDiscontinued'? I've tried using GROUP BY and HAVING statements on those fields, but always get multiple rows returned.
The code I have handles the WEBNO value, but not NULL, and doesn't even start to take into consideration the IsDiscontinued field yet. Is there a concise way to parse this together, or a better way to handle this type of problem?
I think a combination of ISNULL and MIN / MAX should do the trick:
SELECT
-- 'WEBNO' sorts before 'WEBYES', so MIN returns WEBNO if any SKU is WEBNO or NULL
MIN(ISNULL(Sending.IsActive, 'WEBNO')) AS IsActive,
-- MAX returns 1 if any SKU is discontinued or NULL
MAX(ISNULL(Sending.IsDiscontinued, 1)) AS IsDiscontinued
FROM
(
SELECT * FROM [eCommerce].[dbo].[Split] (
'000920_001449_001718_999999'
,'_')
) AS SplitSkus
LEFT JOIN #SkusToSend AS Sending
ON Sending.SKU = SplitSkus.items
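To see how the MIN / MAX idea collapses a kit to one row, here is a small sketch run against an in-memory SQLite database instead of SQL Server. Several things are assumptions for illustration: COALESCE replaces ISNULL, a VALUES list built in Python replaces the Split function, the temp table is an ordinary table named SkusToSend, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SkusToSend "
    "(SKU TEXT PRIMARY KEY, IsActive TEXT, IsDiscontinued INTEGER)"
)
conn.executemany(
    "INSERT INTO SkusToSend VALUES (?, ?, ?)",
    [
        ("000920", "WEBYES", 0),
        ("001449", "WEBYES", 0),
        ("001718", None, 0),  # NULL IsActive must force WEBNO
        # SKU 999999 is deliberately absent, so the LEFT JOIN yields NULLs
    ],
)


def kit_status(kit_id):
    """Return one (IsActive, IsDiscontinued) row for a whole kit."""
    skus = kit_id.split("_")
    values = ", ".join("(?)" for _ in skus)
    sql = f"""
        WITH SplitSkus(items) AS (VALUES {values})
        SELECT MIN(COALESCE(s.IsActive, 'WEBNO')),
               MAX(COALESCE(s.IsDiscontinued, 1))
        FROM SplitSkus
        LEFT JOIN SkusToSend AS s ON s.SKU = SplitSkus.items
    """
    return conn.execute(sql, skus).fetchone()


print(kit_status("000920_001449_001718_999999"))  # prints: ('WEBNO', 1)
print(kit_status("000920_001449"))                # prints: ('WEBYES', 0)
```

Aggregating without a GROUP BY is what guarantees a single row, which addresses the multiple-row results mentioned in the question.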
I think this would be easier with a working example of sample data from those tables. My guess is that you have a table function that splits a string and returns multiple rows. Your temp table is right joined to it, so every row from the function is returned, even where the temp table has no match and yields nulls. This can return multiple rows: if you expect a single entity from a left or right join and there is sometimes a null, you will get multiples, and if a value is repeated you will also get multiples. You would have to ensure that you get only one result, I believe, from your
Case when 'WEBNO' in
(
While the logic may be correct in returning the 'WEBNO' answer, it may repeat the row result multiple times, as the engine may interpret 'this happened' once, twice, or three times. You could potentially alleviate this with a
'Select Distinct IsActive'
which makes the expression return only a single, distinct result for that column.
Again, this would be easier if we could see examples of the data those objects contain, but this is my guess.

MySQL IN() clause multiple returns

I have a special data environment where I need data returned in a certain way to populate a table.
This is my current query:
SELECT
bs_id,
IF(bs_board = 0, 'All Boards', (SELECT b_name FROM certboards WHERE b_id IN (REPLACE(bs_board, ';', ',')))) AS board
FROM boardsubs
As you can see I have an if statement then a special subselect.
The reason I have this is that the field bs_board is a varchar field containing multiple row IDs like so:
1;2;6;17
So, the query as it stands works fine, but it only returns the first matched b_name. I need it to return all matches. For instance, if the field were 1;2, it should return two boards, Board 1 and Board 2, in the same column. Later I can deal with adding a <br> between the results.
The problem I am dealing with is that everything has to come back in a single column, whether that is two names or all names, since the field can contain as many ids as the original editor selected.
This will not work the way you're thinking it will work.
Let's say bs_board is '1;2;3'
In your query, REPLACE(bs_board, ';', ',') will resolve to '1,2,3', which is a single literal string. This makes your final subquery:
SELECT b_name FROM certboards WHERE b_id IN ('1,2,3')
which is equivalent to:
SELECT b_name FROM certboards WHERE b_id = '1,2,3'
The most correct solution to the problem is to normalize your database. Your current system of storing multiple values in a single field is exactly what you should never do with an RDBMS, and this is exactly why. The database is not designed to handle this kind of field. You should have a separate table with one row for each bs_board, and then JOIN the tables.
There are no good solutions to this problem. It's a fundamental schema design flaw. The easiest way around it is to fix it with application logic. First you run:
SELECT bs_id, bs_board FROM boardsubs
From there you parse the bs_board field in your application logic and build the actual query you want to run:
SELECT bs_id,
IF(bs_board = 0, 'All Boards', (SELECT b_name FROM certboards WHERE b_id IN (<InsertedStringHere>))) AS board
FROM boardsubs
There are other ways around the problem, but you will have problems with sorting order, matching, and numerous other problems. The best solution is to add a table and move this multi-valued field to that table.
The b_id IN (REPLACE(bs_board, ';', ',')) will result in b_id IN ('1,2,6,7') which is different from b_id IN (1,2,6,7) which is what you are looking for.
To make it work either parse the string before doing the query, or use prepared statements.
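A minimal sketch of the "parse the string before doing the query" route is shown below. It uses an in-memory SQLite database in place of MySQL, invented sample rows, and a hypothetical board_names helper; it also assumes that a bs_board value of "0" means all boards, as the IF in the question suggests. Each id becomes its own bound parameter, so the IN list holds separate values rather than one string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE certboards (b_id INTEGER PRIMARY KEY, b_name TEXT)")
conn.execute("CREATE TABLE boardsubs (bs_id INTEGER PRIMARY KEY, bs_board TEXT)")
conn.executemany(
    "INSERT INTO certboards VALUES (?, ?)",
    [(1, "Board 1"), (2, "Board 2"), (6, "Board 6")],
)
conn.executemany(
    "INSERT INTO boardsubs VALUES (?, ?)",
    [(10, "0"), (11, "1;2"), (12, "6")],
)


def board_names(bs_board):
    """Hypothetical helper: turn '1;2' into 'Board 1, Board 2'."""
    if bs_board == "0":
        return "All Boards"
    ids = [int(part) for part in bs_board.split(";") if part]
    if not ids:
        return ""
    # One placeholder per id, so IN receives separate values.
    placeholders = ", ".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT b_name FROM certboards WHERE b_id IN ({placeholders}) "
        "ORDER BY b_id",
        ids,
    ).fetchall()
    return ", ".join(name for (name,) in rows)


subs = conn.execute(
    "SELECT bs_id, bs_board FROM boardsubs ORDER BY bs_id"
).fetchall()
result = [(bs_id, board_names(bs_board)) for bs_id, bs_board in subs]
```

The separator passed to join could just as well be the <br> mentioned in the question. This runs one extra query per row, which is part of the cost of keeping the multi-valued field.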

Mysql Joins - How to know which row is retrieved from which table...?

Consider that I am using a join on three tables to get a desired result set.
Now, in that result set, is there any way I can find out which row comes from which table?
Update:
I see that I have phrased the question rather badly. As pointed out in one of the answers below, a result set returned by a join may contain rows made up of columns from multiple tables.
So the question should actually be "Consider that I am using a union on three tables to get a desired result set..."
You can add a table identifier column to each:
select 'A' as table_name, col1, col2 from A
union
select 'B' as table_name, col1, col2 from B
union
...
This returns a single result set which is handled by your application as any ordinary select statement:
while ( rows available ) {
row = fetchrow
if row.table_name == 'A' then
do special handling for table A
else if row.table_name == 'B' then
do special handling for table B
else if ...
}
The actual syntax is dependent on the language you are using, but most procedural languages follow the scheme above.
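For instance, in Python the scheme above might be sketched as follows, using an in-memory SQLite database with two made-up tables A and B. UNION ALL is used here on the assumption that duplicate rows should be kept; plain UNION would still remove duplicates that come from the same table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for name in ("A", "B"):
    conn.execute(f"CREATE TABLE {name} (col1 INTEGER, col2 TEXT)")
conn.execute("INSERT INTO A VALUES (1, 'from a')")
conn.execute("INSERT INTO B VALUES (1, 'from b')")

# The literal in the first column tags each row with its source table.
rows = conn.execute(
    "SELECT 'A' AS table_name, col1, col2 FROM A "
    "UNION ALL "
    "SELECT 'B' AS table_name, col1, col2 FROM B "
    "ORDER BY table_name"
).fetchall()

handled = []
for table_name, col1, col2 in rows:
    if table_name == "A":
        handled.append(("handled as A", col1, col2))  # special handling for A
    elif table_name == "B":
        handled.append(("handled as B", col1, col2))  # special handling for B
```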
If you are asking this question, then your database is probably not structured correctly. (correctly being a subjective term).
A proper SQL query on a normalized database should not depend, nor be concerned with, where the data comes from.
Each row would be a combination of all tables, with null values being inserted in columns for left/right/outer joins which do not match the joining criteria. You could perhaps test if a column (from a particular table) is null, and derive from that that the non-null values must originate from the opposite table(s).
Then again, if you were actually performing a UNION, as Marcelo suggested, you would have to look at ancillary columns to determine the source of the data, as that information is lost in the combination.