MySQL Case function behave strange and inconstent - mysql

We are using MySQL 8 as our java application DB.
We have a query with the following format:
select
id,
group_concat(NAME ORDER BY ID separator ',,') AS Code,
CASE
WHEN MAX(p.VARIABLEfactor) = 1 THEN MAX(i.factor) ELSE MAX(p.factor) END AS factor
from MA_TABLE
join TABLE_P p on (...)
join TABLE_I i on (...)
group by id
The query worked very fine in our development environments until deploy with client where the factor column is getting null.
We have run the same query in the client environment from MySQL Workbench and we can see that the factor column is getting well populated.
After some debugging,we changed :
CASE
WHEN MAX(p.VARIABLEfactor) = 1 THEN MAX(i.factor) ELSE MAX(p.factor) END AS factor
to
MAX(
WHEN p.VARIABLEfactor = 1 THEN i.factor ELSE p.factor END ) AS factor,
and the query worked correctly.
Any help here please?

From your explanation I gather that you don't understand the difference of your two case expressions. But they are very different. Let's look at an example for one ID:
ID
VARIABLEfactor
i.factor
p.factor
100
0
null
10
100
1
null
20
Your expression
CASE WHEN MAX(p.VARIABLEfactor) = 1 THEN MAX(i.factor) ELSE MAX(p.factor) END
looks at the maximum VARIABLEfactor, which is 1, so the THEN case applies and the maximum i.factor is returned. This is null, as all i.factor are null.
Your expression
MAX(WHEN p.VARIABLEfactor = 1 THEN i.factor ELSE p.factor END)
looks at each row's VARIABLEfactor. For the first row this is 0, so the ELSE case applies and p.factor 10 is used. For the second row the VARIABLEfactor is 1, so its i.factor null gets used. Of these you take the maximum, which is 10.
To recap: The first expression is just a CASE expression on the aggregation results. It returns null here. The second expression is a conditional aggregation. It returns 10 for the sample data.

Related

Fetch how many failure and success did each person (name) have

I have a table buggy, the dummy dataset link can be see here
https://github.com/FirzaCank/Project/blob/main/SQL/IFG%20test/Dataset%20Dummy%20no%205.sql
Which contains:
id (INT)
name (VARCHAR)
bug (INT, contains the numbers 0 and 1)
With dataset explanations on 'bug' column are:
0, it means fault / failure
1, it means success
If there is no 'fault', then the 'fault' value will be filled with '0' (null is okay too), so is 'success'
I've tried a MySQL query like this:
SELECT name,
CASE
WHEN bug = 0 THEN COUNT(bug)
END AS failure,
CASE
WHEN bug = 1 THEN COUNT(bug)
END AS success
FROM buggy
GROUP BY name;
The desire output is like This, but as far as I've tried in the above syntax it just came out like this
Thank you for the help!
You should use SUM instead of Count.
SELECT
name,
SUM(IF(bug = 0, 1, 0)) as fault,
SUM(IF(bug = 1, 1, 0)) as success
FROM buggy
GROUP BY name
This counts the number of rows satisfying the conditions inside the IF function.
this sql will give wanted result
SELECT t.name , SUM(t.failure) as failure , SUM(t.success) as success
from ( SELECT name , CASE
WHEN bug < 1 THEN COUNT(bug) ELSE 0
END AS failure,
CASE
WHEN bug = 1 THEN COUNT(bug) ELSE 0
END AS success
FROM buggy
GROUP BY name,bug ) t
GROUP BY t.name;

having trouble understanding CASE WHEN statement in SQL

SELECT state,
COUNT(CASE WHEN elevation >= 2000 THEN 1 ELSE NULL END) as count_high_elevation_aiports
FROM airports
GROUP BY state;
In the above statement, what is the THEN 1 and what does '1' signify?
How does that value '1' after THEN affect the output?
First, note that these three expressions are equivent:
CASE WHEN elevation >= 2000 THEN 1 ELSE NULL END
IF(elevation >= 2000, 1, NULL)
((elevation >= 2000) OR NULL)
If elevation >= 2000, the expression evaluates as "1", otherwise the expression evaluates as NULL.
"1" is conventionally used as boolean true, and you could substitute the MySQL literal TRUE in the above expressions with equivalent results... but that isn't what the "1" is for, here.
When used with COUNT(), in cases like this, the only real significance of 1 is that it is not NULL.
This is important, because -- contrary to popular belief -- COUNT() does not count rows. It counts values.
What's the difference? NULL is not technically a value. Instead, it is a marker that signifies the absence of a value, thus COUNT(expr) only counts rows where expr is not null.
By using an expression like the one here, you're asking the server to count the rows with elevation => 2000, and you do this by giving COUNT() a NULL for rows you want not to be counted... and a non-null value for rows you do.
Aggregate (GROUP BY) functions operate on values -- and NULL, again, is not a value in this sense.
Another aggregate function that makes this rationale perhaps even more clear is AVG(). If you had 3 rows... with values 5, NULL, and 10... what's the average? If you said 7.5, that's correct: the average of these 3 rows is (5 + 10) ÷ 2 = 5 because the 3 rows have only two values. NULL is not 0, otherwise the average would be (5 + 0 + 10) ÷ 3 = 5, which it is not.
So, that's how and why this works.
How does that value '1' after THEN affect the output?
It really doesn't. You could just as easily have said COUNT(CASE WHEN elevation >= 2000 THEN 'cat videos are funny' ELSE NULL END) because, just like the literal 1, the literal string 'cat videos are funny' is also not null, and non-null values -- anything not null -- is what count will count.
A novice might try to accomplish this task with COUNT(elevation >= 2000), but that gives the wrong answer, because the 0 (false) for rows where elevation is < 2000 is not null, so these rows would still be counted.
You may then ask, "why not just use COUNT(*) ... WHERE elevation >= 2000?" Good question. The reasons vary, but if you GROUP BY state and there are states with no rows matching WHERE, those states would be entirely eliminated from the results, which is often not what you want. This query includes them, with a count of zero.
Note that ((elevation >= 2000) OR NULL), the third example expression at the top, doesn't actually need the parentheses. I included them because this form is not necessarilly intuitive at first glance. The natural precedence of operations will cause this to be evaluated correctly if written simply elevation >= 2000 OR NULL. This expression is equivalent to the other two because elevation >= 2000 first evaluates to 1 if true, 0 if false, or NULL if elevation is null. Then the lower-precedence OR is evaluated, and you get one of these: 1 OR NULL => 1 ... 0 OR NULL => NULL ... NULL OR NULL => NULL... and you may actually be awarded a SQL wizard badge by the elders of the Internet at the point when writing queries with COUNT(elevation >= 2000 OR NULL) comes naturally to you.
Query will simply return 1 if elevation is > or = 2000, else it will return NULL (this is use full for boolean representation of field because NULL represents 0),
Now returned value will be set into count_high_elevation_airports.

MySQL Help - Combining 2 MySQL Selects - Result in 1

I have two MySQL statements (see below), I would like to combine them together so if they both result in a 1 then the end result will be 1. I'm not sure how to construct this and was hoping for some help.
select count(*)
from monitor
where name='job_starttime' and value < ( UNIX_TIMESTAMP( ) -600)
select count(*)
from monitor
where name='job_active' and value = 0
So for example I would like when both statements are true to result in a value of 1, if 1 or none are true it results in a 0.

MAX with extra criteria

I have the following part of a query I'm working on in MYSQL.
SELECT
MAX(CAST(MatchPlayerBatting.BatRuns AS SIGNED)) AS HighestScore
FROM
MatchPlayerBatting
It returns the correct result. However there is another column I need it to work off.
That is if the maximum value it finds also has a value of "not out" within "BatHowOut", it should show the result as for example 96* rather than just 96.
How could this be done?
To help make the data concrete, consider two cases:
BatRuns BatHowOut
96 not out
96 lbw
BatRuns BatHowOut
96 not out
102 lbw
For the first data, the answer should be '96*'; for the second, '102'.
You can achieve this using self-join like this:
SELECT t1.ID
, CONCAT(t1.BatRuns,
CASE WHEN t1.BatHowOut = 'Not Out' THEN '*' ELSE '' END
) AS HighScore
FROM MatchPlayerBatting t1
JOIN
(
SELECT MAX(BatRuns) AS HighestScore
FROM MatchPlayerBatting
) t2
ON t1.BatRuns = t2.HighestScore
See this sample SQLFiddle with highest "Not Out"
See this another sample SQLFiddle with highest "Out"
See this another sample SQLFiddle with two highest scores
How about ordering the scores in descending order and selecting only the first record?
select concat(BatRuns , case when BatHowOut = 'not out' then '*' else '' end)
from mytable
order by cast(BatRuns as signed) desc,
(case when BatHowOut = 'not out' then 1 else 2 end)
limit 1;
Sample here.
If you want to find highest score score for each player, here is a solution that may not be elegant, but quite effective.
select PlayerID,
case when runs != round(runs)
then concat(round(runs),'*')
else
round(runs)
end highest_score
from (select PlayerID,
max(cast(BatRuns as decimal) +
case when BatHowOut = 'not out' then 0.1 else 0 end
) runs
from MatchPlayerBatting
group by PlayerID) max_runs;
This takes advantage of the fact that, runs can never be fractions, only whole numbers. When there is a tie for highest score and one of them is unbeaten,
adding 0.1 to the unbeaten score will make it the highest. This can be later removed and concatenated with *.
Sample here.

Comparing 2 Columns in same table

I need to compare 2 columns in a table and give 3 things:
Count of rows checked (Total Rows that were checked)
Count of rows matching (Rows in which the 2 columns matched)
Count of rows different (Rows in which the 2 columns differed)
I've been able to get just rows matching using a join on itself, but I'm unsure how to get the others all at once. The importance of getting all of the information at the same time is because this is a very active table and the data changes with great frequency.
I cannot post the table schema as there is a lot of data in it that is irrelevant to this issue. The columns in question are both int(11) unsigned NOT NULL DEFAULT '0'. For purposes of this, I'll call them mask and mask_alt.
select
count(*) as rows_checked,
sum(col = col2) as rows_matching,
sum(col != col2) as rows_different
from table
Note the elegant use of sum(condition).
This works because in mysql true is 1 and false is 0. Summing these counts the number of times the condition is true. It's much more elegant than case when condition then 1 else 0 end, which is the SQL equivalent of coding if (condition) return true else return false; instead of simply return condition;.
Assuming you mean you want to count the rows where col1 is or is not equal to col2, you can use an aggregate SUM() coupled with CASE:
SELECT
COUNT(*) AS total,
SUM(CASE WHEN col = col2 THEN 1 ELSE 0 END )AS matching,
SUM(CASE WHEN col <> col2 THEN 1 ELSE 0 END) AS non_matching
FROM table
It may be more efficient to get the total COUNT(*) in a subquery though, and use that value to subtract the matching to get the non-matching, if the above is not performant enough.
SELECT
total,
matching,
total - matching AS non_matching
FROM
(
SELECT
COUNT(*) AS total,
SUM(CASE WHEN col = col2 THEN 1 ELSE 0 END )AS matching
FROM table
) sumtbl