count and sum in case statement - mysql

What is the difference between the queries below if I use COUNT instead of SUM? I believe I would get the same output?
SELECT SUM(CASE WHEN salary > 100000 THEN 1 ELSE 0 END) AS Total
SELECT COUNT(CASE WHEN salary > 100000 THEN 1 END) AS Total
SELECT COUNT(CASE WHEN salary > 100000 THEN 1 ELSE NULL END) AS Total
Thanks!

Per the other answer, all forms are equivalent. There are also a couple of other forms that are more compact and achieve the same result:
count_if(salary > 100000)
count(if(salary > 100000, 1))
However, the idiomatic and more general way to do this in Trino (formerly known as Presto SQL) is:
SELECT count(*) FILTER (WHERE salary > 100000) AS Total
FROM ...
See the documentation for more details about filtered aggregations.
All other forms except for the one based on SUM should, per the SQL specification, raise a warning to indicate that null values have been eliminated. This behavior is not yet implemented in Trino, but will be added at some point in the future.

The three are equivalent. All of them count the number of rows that meet the particular condition (salary > 100000). All return 0/1 and would not return NULL values for the column.
From a performance perspective, all should be equivalent as well. I have a personal preference for the first version. I consider the third to be unnecessarily verbose because else NULL is the default for a case expression.
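A quick way to check the equivalence yourself is to run the three forms side by side. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL or Trino; the employees table and its rows are hypothetical. It also runs the same query on an empty table, an edge case worth checking in your own engine, because SUM over zero rows yields NULL while COUNT yields 0.

```python
import sqlite3

# Hypothetical table and data, used only to compare the three forms.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("a", 50000), ("b", 120000), ("c", 150000), ("d", None)],
)

QUERY = """
SELECT
    SUM(CASE WHEN salary > 100000 THEN 1 ELSE 0 END),
    COUNT(CASE WHEN salary > 100000 THEN 1 END),
    COUNT(CASE WHEN salary > 100000 THEN 1 ELSE NULL END)
FROM employees
"""

# With rows present, all three forms agree.
populated = conn.execute(QUERY).fetchone()
print(populated)  # (2, 2, 2)

# With no rows at all, SUM returns NULL (None) while COUNT returns 0.
conn.execute("DELETE FROM employees")
empty = conn.execute(QUERY).fetchone()
print(empty)  # (None, 0, 0)
```

If the NULL matters for your use case, wrapping the SUM form in COALESCE(..., 0) is one way to make it match the COUNT forms.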

Related

How to calculate AVG, MAX and MIN number of rows in a column

I try to collect general statistics on the depth of correspondence: average, maximum and minimum number of messages of each type per one request. Have 2 tables:
First:
ticketId,ticketQueueId,ticketCreatedDate
Second:
articleId,articleCreatedDt,articleType (can be IN or OUT - support responses), ticketId
I reasoned like this:
SELECT AVG(COUNT(articleType='IN')) AS AT_IN, AVG(COUNT(articleType='OUT')) AS AT_OUT
FROM tickets.tickets JOIN tickets.articles
ON tickets.ticketId=articles.ticketId;
GROUP BY tickets.ticketId
but it doesn't work.
Error Code: 1111. Invalid use of group function
You can't nest aggregate functions (AVG(COUNT())). Use a subquery instead: compute the per-ticket counts in the subquery, then apply the outer aggregate function to its result.
Your use of COUNT is also incorrect.
COUNT counts every row where the given expression is not null. In your case the expression articleType='IN' (or articleType='OUT') evaluates to 0 or 1, which is never null, so every row gets counted.
select AVG(T_IN), AVG(T_OUT)
from (
SELECT sum(case when articleType='IN' then 1 else 0 END) AS T_IN, sum(case when articleType='OUT' then 1 else 0 END) AS T_OUT
FROM tickets.tickets
JOIN tickets.articles ON tickets.ticketId=articles.ticketId
GROUP BY tickets.ticketId
) t
(You also have a stray semicolon before GROUP BY.)
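The two-step aggregation can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module rather than MySQL, drops the tickets. schema prefix, and fills the tables with made-up rows, so treat it as an illustration of the pattern only. MAX and MIN are added alongside AVG because the question asks for all three.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical versions of the two tables from the question.
conn.executescript("""
CREATE TABLE tickets (ticketId INTEGER PRIMARY KEY);
CREATE TABLE articles (
    articleId INTEGER PRIMARY KEY,
    articleType TEXT,
    ticketId INTEGER
);
INSERT INTO tickets VALUES (1), (2);
INSERT INTO articles (articleType, ticketId) VALUES
    ('IN', 1), ('IN', 1), ('OUT', 1),
    ('IN', 2), ('OUT', 2), ('OUT', 2), ('OUT', 2);
""")

# Inner query: one row per ticket with its IN and OUT counts.
# Outer query: statistics over those per-ticket counts.
row = conn.execute("""
SELECT AVG(T_IN), MAX(T_IN), MIN(T_IN),
       AVG(T_OUT), MAX(T_OUT), MIN(T_OUT)
FROM (
    SELECT SUM(CASE WHEN articleType = 'IN' THEN 1 ELSE 0 END) AS T_IN,
           SUM(CASE WHEN articleType = 'OUT' THEN 1 ELSE 0 END) AS T_OUT
    FROM tickets
    JOIN articles ON tickets.ticketId = articles.ticketId
    GROUP BY tickets.ticketId
) t
""").fetchone()

print(row)  # (1.5, 2, 1, 2.0, 3, 1)
```

Note that the inner join leaves out tickets that have no articles at all; whether those should count as zeros in the averages depends on what you want to measure.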

Query - To display the Failure rate %

I have written this query to get my data, and all the data is fine.
I have one column which has either Pass or Fail. I want to calculate the percentage of bookings that failed, and output it as a single value.
I will have to write another query to show that one number.
For example: in the data below, I have 4 bookings, out of which 2 failed, so the failure rate is 50%. I am omitting some columns in the display, but they can be seen in the query.
That's an aggregation over all records and simple math:
select count(case when decision = 'Fail' then 1 end) / count(*) * 100
from (<your query here>) results;
Explanation: COUNT(something) counts non null values. case when decision = 'Fail' then 1 end is 1 (i.e. not null) for failures and null otherwise (as null is the default for no match in CASE/WHEN; you could as well write else null end explicitly).
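Here is a minimal sketch of that calculation on the 4-booking example from the question, run through Python's built-in sqlite3 module. The results table and its columns are hypothetical stand-ins for the output of the original query. One difference from MySQL to be aware of: SQLite divides integers as integers, so the sketch multiplies by 100.0 first, whereas MySQL's / operator already returns a decimal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the result set of the original query.
conn.execute("CREATE TABLE results (booking_id INTEGER, decision TEXT)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?)",
    [(1, "Pass"), (2, "Fail"), (3, "Fail"), (4, "Pass")],
)

# COUNT(CASE ...) counts only the failures; COUNT(*) counts all bookings.
failure_rate = conn.execute("""
SELECT 100.0 * COUNT(CASE WHEN decision = 'Fail' THEN 1 END) / COUNT(*)
FROM results
""").fetchone()[0]

print(failure_rate)  # 50.0
```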
Modify your original condition to the following. Notice that there is no need to wrap your query in a subquery.
CONCAT(FORMAT((100 * SUM(CASE WHEN trip_rating.rating <= 3 AND
(TIMESTAMPDIFF(MINUTE,booking.pick_up_time,booking_activity.activity_time) -
ROUND(booking_tracking_detail.google_adjusted_duration_driver_coming/60)) /
TIMESTAMPDIFF(MINUTE,booking.pick_up_time,booking_activity.activity_time)*100 >= 15
THEN 1
ELSE 0
END) / COUNT(*)), 2), '%') AS failureRate
This will also format your failure rate with two decimal places and a percentage sign, e.g. 50.00%.

Query to retrieve values from DB on multiple criteria

I need to retrieve values from a database to plot them in a graph. The values have to be grouped by criteria: data matching each criterion should be returned as a separate row or column of my query result.
For example:
I have a table called TABLEA which has a column TIME. I need the count of rows matching TIME>1 and TIME<10 as one result, the count of rows matching TIME>11 and TIME<20 as another, and so on. Is it possible to get these values in a single query? I use MySQL with JDBC.
I will plot all the counts in a graph.
Thanks in advance.
select sum(case when `time` between 2 and 9 then 1 else 0 end) as count_1,
sum(case when `time` between 12 and 19 then 1 else 0 end) as count_2
from your_table
This can be done with CASE statements, but they can get kind of verbose. You may just want to rely on Boolean (true/false) logic:
SELECT
SUM(TIME BETWEEN 1 AND 10) as `1 to 10`,
SUM(TIME BETWEEN 11 and 20) as `11 to 20`,
SUM(TIME BETWEEN 21 and 30) as `21 to 30`
FROM
TABLEA
The expression TIME BETWEEN 1 AND 10 returns either TRUE or FALSE for each record. Since TRUE is equivalent to 1 and FALSE is equivalent to 0, we then only need to sum the results and give our new field a name.
I also made the assumption that you wanted records where 1 <= TIME <= 10 instead of 1 < TIME < 10 which you stated since, as stated, it would drop values where the TIME was 1,10,20, etc. If that was your intended result, then you can just adjust the TIME BETWEEN 1 AND 10 to be TIME BETWEEN 2 AND 9 instead.
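To see how the inclusive and exclusive bounds differ in practice, the sketch below sums boolean expressions over a handful of made-up TIME values. It runs on Python's built-in sqlite3 module, which also treats comparison results as 1 and 0; the lowercase tablea table and its contents are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data that includes the boundary values 1, 10, 11 and 20.
conn.execute("CREATE TABLE tablea (time INTEGER)")
conn.executemany(
    "INSERT INTO tablea VALUES (?)",
    [(1,), (5,), (10,), (11,), (15,), (20,), (25,)],
)

row = conn.execute("""
SELECT
    SUM(time BETWEEN 1 AND 10)   AS inclusive_1_to_10,
    SUM(time BETWEEN 11 AND 20)  AS inclusive_11_to_20,
    SUM(time BETWEEN 21 AND 30)  AS inclusive_21_to_30,
    SUM(time > 1 AND time < 10)  AS exclusive_1_to_10,
    SUM(time > 11 AND time < 20) AS exclusive_11_to_20
FROM tablea
""").fetchone()

# BETWEEN includes both endpoints; the strict comparisons drop them.
print(row)  # (3, 3, 1, 1, 1)
```

The boundary rows (1, 10, 11, 20) are counted by the BETWEEN buckets but by none of the strict-comparison buckets, which is the difference the answer points out.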

MYSQL Query for Grouping Column Based on Different Conditions

I have project table in following format:
And, I need to have MYSQL that can give me data in following format:
Basically, I have to group the data based on location, then count the successful and unsuccessful projects. The "Successful" column has the total count of projects for which percentageRaised is greater than or equal to 1, and the "Unsuccessful" column has the total count of projects for which percentageRaised is less than 1.
I just have a basic understanding of MySQL. I need your advice.
select location
, sum(case when PercentageRaised >= 1.0 then 1 else 0 end) as successful
, sum(case when PercentageRaised < 1.0 then 1 else 0 end) as unsuccessful
from YourTable
group by
location
MySQL supports boolean arithmetic.
SELECT Location,
SUM(percentageRaised >= 1) successful,
SUM(percentageRaised < 1) unsuccessful
FROM tableName
GROUP BY Location
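The boolean-arithmetic version can be checked against a few sample rows. This is a sketch on Python's built-in sqlite3 module, which evaluates comparisons to 1 and 0 much like MySQL; the projects table, the city names and the percentages are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical project data: percentageRaised >= 1 means fully funded.
conn.execute("CREATE TABLE projects (location TEXT, percentageRaised REAL)")
conn.executemany(
    "INSERT INTO projects VALUES (?, ?)",
    [("Paris", 1.2), ("Paris", 0.4), ("Paris", 1.0), ("Rome", 0.9)],
)

rows = conn.execute("""
SELECT location,
       SUM(percentageRaised >= 1) AS successful,
       SUM(percentageRaised < 1)  AS unsuccessful
FROM projects
GROUP BY location
ORDER BY location
""").fetchall()

print(rows)  # [('Paris', 2, 1), ('Rome', 0, 1)]
```

Because each comparison yields 0 rather than NULL for a non-matching row, a location with no successful projects shows 0 in that column.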

Combine multiple room availability queries into one

I'm currently trying to optimize a database by combining queries, but I keep hitting dead ends while optimizing a room availability query.
I have a room availability table where each record states the available number of rooms per date. It's formatted like so:
room_availability_id (PK)
room_availability_rid (fk_room_id)
room_availability_date (2011-02-11)
room_availability_number (number of rooms available)
The trouble is getting a list of rooms that are available for EACH of the provided days. When I use IN() like so:
WHERE room_availability_date IN('2011-02-13','2011-02-14','2011-02-15')
AND room_availability_number > 0
If the 14th has availability 0 it still gives me the other 2 dates. But I only want that room_id when it is available on ALL three dates.
Please tell me there is a way to do this in MySQL other than querying each date/room/availability combination separately (that is what is done now :-( )
I tried all sorts of combinations, tried to use room_availability_date = ALL (...), tried some dirty repeating subqueries but to no avail.
Thank you in advance for any thoughts!
You would need to construct a query to group on the room ID and then check that there is availability on each date, which can be done using the having clause. Leaving the where clause predicate in for room_availability_date will help to keep the query efficient (as indexes etc. can't be used with a having clause easily).
SELECT
room_availability_rid
FROM room_availability -- table name assumed; the question does not state it
WHERE room_availability_date IN ('2011-02-13','2011-02-14','2011-02-15')
AND room_availability_number > 0
GROUP BY room_availability_rid
HAVING count(case room_availability_date when '2011-02-13' THEN 1 END) > 0
AND count(case room_availability_date when '2011-02-14' THEN 1 END) > 0
AND count(case room_availability_date when '2011-02-15' THEN 1 END) > 0
I think I can improve on a'r's answer:
SELECT
room_availability_rid, count(*) n
FROM room_availability -- table name assumed; the question does not state it
WHERE room_availability_date IN ('2011-02-13','2011-02-14','2011-02-15')
AND room_availability_number > 0
GROUP BY room_availability_rid
HAVING n=3
Edit: This of course assumes that there is only one table entry per room per day. Is this a valid assumption?
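If you are not sure the one-row-per-room-per-day assumption holds, one possible variation is to count distinct dates instead of rows. The sketch below tries that on Python's built-in sqlite3 module; the table name room_availability and the sample rows are assumptions, since the question only lists the columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table name and rows are hypothetical; columns follow the question.
conn.execute("""
CREATE TABLE room_availability (
    room_availability_id INTEGER PRIMARY KEY,
    room_availability_rid INTEGER,
    room_availability_date TEXT,
    room_availability_number INTEGER
)
""")
conn.executemany(
    "INSERT INTO room_availability "
    "(room_availability_rid, room_availability_date, room_availability_number) "
    "VALUES (?, ?, ?)",
    [
        (1, "2011-02-13", 2), (1, "2011-02-14", 1), (1, "2011-02-15", 3),
        (2, "2011-02-13", 1), (2, "2011-02-14", 0), (2, "2011-02-15", 2),
        (3, "2011-02-13", 4), (3, "2011-02-15", 4),  # no row for the 14th
    ],
)

# Keep only rooms with availability on all three requested dates.
available_rooms = [
    r[0]
    for r in conn.execute("""
    SELECT room_availability_rid
    FROM room_availability
    WHERE room_availability_date IN ('2011-02-13', '2011-02-14', '2011-02-15')
      AND room_availability_number > 0
    GROUP BY room_availability_rid
    HAVING COUNT(DISTINCT room_availability_date) = 3
    """)
]

print(available_rooms)  # [1]
```

Room 2 drops out because its row for the 14th has zero availability, and room 3 drops out because it has no row for the 14th at all.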
You can group by room ID, generate a list of dates available, and then see if all the dates you need are included.
This will give you a list of dates each room is available:
select `room_availability_rid`,group_concat(`room_availability_date`) as `datelist`
from `table` where room_availability_number>0
group by `room_availability_rid`
Then we can add a having clause to get the rooms that are available on all of the dates we need:
select `room_availability_rid`,group_concat(`room_availability_date`) as `datelist`
from `table` where room_availability_number>0
group by `room_availability_rid`
having find_in_set('2011-02-13',`datelist`) and
find_in_set('2011-02-14',`datelist`) and
find_in_set('2011-02-15',`datelist`)
This should work. Test it for me will ya? :)