mysql increment variable using case - mysql

I have two tables marks and exams.
In the marks table I have studentid, mark1, mark2 and examid (a foreign key to exams) for the different exams.
I want to get each distinct student id and that student's number of failures in one single query.
The condition for failure is mark1+mark2 < 50 or mark1 < 30. For example, if the student with studentid 1 has 15 entries (15 exams) in the marks table and failed 6 of them, I want to get '1' and '6' in two columns, and similarly for all students. For this I wrote the query below using 'case':
select
distinct t1.studentid,
(@arrear:=
case
when (t1.mark1+t1.mark2) <50 OR t1.mark1 < 30
then @arrear+1 else @arrear
end) as failures
from marks t1, exams t2,
(select @arrear := 0) r
where t1.examid = t2.examid group by t1.studentid;
But the above query failed to give correct result. How can I modify the query to get correct result?

Try this. You don't need to use variables to help you.
select
m.studentid,
sum(case when m.mark1 + m.mark2 < 50 or m.mark1 < 30 then 1 else 0 end) as failures
from
marks m inner join exams e
on
m.examid = e.examid
group by
m.studentid
The CASE expression works out whether each row is a failure and returns 1 for a fail, 0 otherwise. Summing that result (grouped by studentid) gives you the number of fails per studentid.
Also, the explicit INNER JOIN states the join condition in the ON clause, which is easier to read than the comma-separated join with the condition in WHERE :)
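To make the SUM(CASE ...) idea concrete, here is a minimal sketch that runs the same query against an in-memory SQLite database through Python's sqlite3 module. SQLite is only standing in for MySQL here, and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE exams (examid INTEGER PRIMARY KEY);
    CREATE TABLE marks (studentid INTEGER, mark1 INTEGER, mark2 INTEGER, examid INTEGER);
    INSERT INTO exams VALUES (1), (2), (3);
    INSERT INTO marks VALUES
        (1, 20, 40, 1),  -- fail: mark1 < 30
        (1, 35, 10, 2),  -- fail: mark1 + mark2 < 50
        (1, 40, 40, 3),  -- pass
        (2, 45, 30, 1),  -- pass
        (2, 30, 25, 2);  -- pass: total is 55 and mark1 is not below 30
""")

# Each row contributes 1 when it meets the failure condition, 0 otherwise;
# SUM then counts the failures per student.
failures = conn.execute("""
    SELECT m.studentid,
           SUM(CASE WHEN m.mark1 + m.mark2 < 50 OR m.mark1 < 30 THEN 1 ELSE 0 END) AS failures
    FROM marks m
    INNER JOIN exams e ON m.examid = e.examid
    GROUP BY m.studentid
    ORDER BY m.studentid
""").fetchall()

print(failures)
```

With this sample data, student 1 gets 2 failures and student 2 gets 0; note that a student with no failures still appears in the result.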

You don't need the variable @arrear. You can get the result with a single query that uses conditional aggregation.
Try this:
select
distinct t1.studentid,
sum(
case
when (t1.mark1+t1.mark2) <50 OR t1.mark1 < 30
then 1
else 0
end
) as failures
from marks t1, exams t2
where t1.examid = t2.examid group by t1.studentid;

Related

How do I COUNT rows of a GROUP BY query where a condition matches?

This is my persons table:
neighborhood birthyear
a 1958
a 1959
b 1970
c 1980
I'd like to get the COUNT of people in an age group within every neighborhood. For example, if I wanted to get everyone under the age of 18, I would get:
neighborhood count
a 0
b 0
c 0
If I wanted to get everyone over 50, I'd get
neighborhood count
a 2
b 0
c 0
I tried
SELECT neighborhood, COUNT(*)
FROM persons
WHERE YEAR(NOW()) - persons.birthyear < 18
GROUP BY neighborhood;
but this gives me 0 rows, when instead I want 3 rows with distinct neighborhoods and 0 count for each. How would I accomplish this?
You can use conditional aggregation:
SELECT neighborhood, SUM(YEAR(NOW()) - p.birthyear < 18) as under_18,
SUM(YEAR(NOW()) - p.birthyear BETWEEN 34 AND 42) as age_34_42
FROM persons p
GROUP BY neighborhood;
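A small sketch of why this keeps the zero-count neighborhoods: the condition moves out of WHERE and into the aggregate, so no row is filtered away before grouping. It uses Python's sqlite3 as a stand-in for MySQL, writes the condition as a portable CASE expression, and fixes the current year at 2015 (an assumption made only so the output is reproducible) in place of YEAR(NOW()).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (neighborhood TEXT, birthyear INTEGER);
    INSERT INTO persons VALUES ('a', 1958), ('a', 1959), ('b', 1970), ('c', 1980);
""")

current_year = 2015  # assumption: a fixed year standing in for YEAR(NOW())

# No WHERE clause: every neighborhood reaches GROUP BY, and the CASE
# expressions decide which rows are counted in each column.
counts = conn.execute("""
    SELECT neighborhood,
           SUM(CASE WHEN ? - birthyear < 18 THEN 1 ELSE 0 END) AS under_18,
           SUM(CASE WHEN ? - birthyear > 50 THEN 1 ELSE 0 END) AS over_50
    FROM persons
    GROUP BY neighborhood
    ORDER BY neighborhood
""", (current_year, current_year)).fetchall()

print(counts)
```

All three neighborhoods come back, with under_18 equal to 0 for each and over_50 equal to 2 only for neighborhood a.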
I think that if the count is 0, the row doesn't appear, because the WHERE clause removes every row of that neighborhood before grouping.
Otherwise your code seems correct to me: if you try it on the example with age over 50, it should give you one row with the expected values (neighborhood: a, count: 2).
I would recommend using a sub query:
SELECT
count(*) AS `group-by-count-greater-than-ten`
FROM
(
SELECT
columnFoo,
count(*) cnt
FROM barTable
WHERE columnBaz = "barbaz"
GROUP BY columnFoo
)
AS subQuery
WHERE cnt > 10
In the above, the main query uses the subquery's result set like any other table.
To the main query, cnt is an ordinary column rather than a computed field, so it can be filtered in WHERE without referencing the count() function.
Inside the subquery, however, a WHERE clause cannot refer to the alias cnt or to count() at all; a filter on the aggregate has to go in a HAVING clause, where standard SQL expects count(*) to be repeated (MySQL also accepts the alias there).
In your case using a subquery would look something like this.
SELECT
neighborhood,
age,
count(*) as cnt
FROM
(
SELECT
*,
(YEAR(NOW()) - birthyear) as age
FROM persons
) as WithAge
WHERE age < 18
GROUP BY neighborhood, age

MySQL -- Finding % of orders with a transaction failure

I have an order_transactions table with 3 relevant columns. id (unique id for the transaction attempt), order_id (the id of the order for which the attempt is being made), and success an int which is 0 if failed, and 1 if successful.
There can be 0 or more failed transactions before a successful transaction, for each order_id.
The question is, how do I find:
The number of orders which never had a successful transaction
The number of orders which had a transaction with a failure (eventually successful or not)
The number of orders which never had a failed transaction (success only)
I realize this is some combination of distinct, group by, maybe a subselect, etc, I'm just not well versed in this enough. Thanks.
To get the number of orders which never had a successful transaction you can use:
SELECT COUNT(*)
FROM (
SELECT order_id
FROM transactions
GROUP BY order_id
HAVING COUNT(CASE WHEN success = 1 THEN 1 END) = 0) AS t
The number of orders which had a transaction with a failure (eventually successful or not) can be obtained using the query:
SELECT COUNT(*)
FROM (
SELECT order_id
FROM transactions
GROUP BY order_id
HAVING COUNT(CASE WHEN success = 0 THEN 1 END) > 0) AS t
Finally, to get the number of orders which never had a failed transaction (success only):
SELECT COUNT(*)
FROM (
SELECT order_id
FROM transactions
GROUP BY order_id
HAVING COUNT(CASE WHEN success = 0 THEN 1 END) = 0) AS t
You want "counts" of orders that meet specific conditions over multiple rows, so I'd start with a GROUP BY order_id
SELECT ...
FROM mytable t
GROUP BY t.order_id
To find out if a particular order ever had a failed transaction, etc. we can use aggregates on expressions that "test" for conditions.
For example:
SELECT MAX(t.success=1) AS succeeded
, MAX(t.success=0) AS failed
, IF(MAX(t.success=1),0,1) AS never_succeeded
FROM mytable t
GROUP BY t.order_id
The expressions in the SELECT list of that query are MySQL shorthand. We could use longer expressions (MySQL IF() function or ANSI CASE expressions) to achieve an equivalent result, e.g.
CASE WHEN t.success = 1 THEN 1 ELSE 0 END
We could include the `order_id` column in the SELECT list for testing. We can compare the results for each order_id to the rows in the original table, to verify that the results returned meet the specification.
To get "counts" of orders, we can reference the query as an inline view, and use aggregate expressions in the SELECT list.
For example:
SELECT SUM(r.succeeded) AS cnt_succeeded
, SUM(r.failed) AS cnt_failed
, SUM(r.never_succeeded) AS cnt_never_succeeded
FROM (
SELECT MAX(t.success=1) AS succeeded
, MAX(t.success=0) AS failed
, IF(MAX(t.success=1),0,1) AS never_succeeded
FROM mytable t
GROUP BY t.order_id
) r
Since the expressions in the SELECT list return either 0, 1 or NULL, we can use the SUM() aggregate to get a count. To make use of a COUNT() aggregate, we would need to return NULL in place of a 0 (FALSE) value.
SELECT COUNT(IF(r.succeeded,1,NULL)) AS cnt_succeeded
, COUNT(IF(r.failed,1,NULL)) AS cnt_failed
, COUNT(IF(r.never_succeeded,1,NULL)) AS cnt_never_succeeded
FROM (
SELECT MAX(t.success=1) AS succeeded
, MAX(t.success=0) AS failed
, IF(MAX(t.success=1),0,1) AS never_succeeded
FROM mytable t
GROUP BY t.order_id
) r
If you want a count of all order_id values, add a COUNT(1) (or SUM(1)) expression in the outer query. If you need percentages, do the division and multiply by 100.
For example:
SELECT SUM(r.succeeded) AS cnt_succeeded
, SUM(r.failed) AS cnt_failed
, SUM(r.never_succeeded) AS cnt_never_succeeded
, SUM(1) AS cnt_all_orders
, SUM(r.failed)/SUM(1)*100.0 AS pct_with_a_failure
, SUM(r.succeeded)/SUM(1)*100.0 AS pct_succeeded
, SUM(r.never_succeeded)/SUM(1)*100.0 AS pct_never_succeeded
FROM (
SELECT MAX(t.success=1) AS succeeded
, MAX(t.success=0) AS failed
, IF(MAX(t.success=1),0,1) AS never_succeeded
FROM mytable t
GROUP BY t.order_id
) r
(The percentages here are relative to the count of distinct order_id values, not to the total number of rows in the table.)
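As a quick check of the per-order flag approach, here is a sketch that runs it on a handful of made-up rows using Python's sqlite3 in place of MySQL. The flags are written with CASE expressions rather than the MySQL boolean shorthand so that the query is portable; the table name order_transactions follows the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_transactions (id INTEGER PRIMARY KEY, order_id INTEGER, success INTEGER);
    INSERT INTO order_transactions (order_id, success) VALUES
        (1, 0), (1, 1),   -- order 1: failed once, then succeeded
        (2, 1),           -- order 2: succeeded on the first attempt
        (3, 0), (3, 0),   -- order 3: never succeeded
        (4, 1);           -- order 4: succeeded on the first attempt
""")

# Inner query: one row per order with 0/1 flags.
# Outer query: summing the flags counts the orders that meet each condition.
row = conn.execute("""
    SELECT SUM(never_succeeded) AS cnt_never_succeeded,
           SUM(failed) AS cnt_failed,
           SUM(1 - failed) AS cnt_never_failed,
           100.0 * SUM(failed) / COUNT(*) AS pct_with_a_failure
    FROM (
        SELECT order_id,
               MAX(CASE WHEN success = 0 THEN 1 ELSE 0 END) AS failed,
               1 - MAX(CASE WHEN success = 1 THEN 1 ELSE 0 END) AS never_succeeded
        FROM order_transactions
        GROUP BY order_id
    ) AS r
""").fetchone()

print(row)
```

For these four orders the result is 1 order that never succeeded, 2 with a failure, 2 that never failed, and 50.0 percent of orders with a failure.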
Orders with a successful transaction:
select count(*) from
( select distinct order_id from my_table where success = 1 ) as t;
Orders with a failed transaction:
select count(*) from
( select distinct order_id from my_table where success = 0 ) as t;
Orders that never had a failed transaction:
select count(*) from
( select distinct order_id from my_table where order_id not in
(select distinct order_id from my_table where success = 0) ) as t;

Count DISTINCT on a single column over multiple conditions

I have a table, and I want to get the DISTINCT count of usernames over a certain period of time. Currently I'm running this query
SELECT DISTINCT username FROM user_activity WHERE company_id = 9 AND timestamp BETWEEN '2015-09-00' AND '2015-10-01' AND action = "Login Success";
It works great, however, I have multiple Companies that I want to select the count for. How do I expand the previous query to show me the distinct counts for multiple companies?
select count(distinct username),
sum(case when company_id = 1 then 1 else 0 end) A,
sum(case when company_id = 9 then 1 else 0 end) B
from `user_activity` Where timestamp BETWEEN '2015-09-00' AND '2015-10-01' AND action = "Login Success"
I've done something like this, however, I'm not getting the correct numbers. Ideally I would like to list each count as a different value for ease of reading, like the previous query illustrates. I don't need the count(distinct username) column to appear in my result, just the conditionals.
Thanks in advance.
If you don't mind two rows instead of two columns:
SELECT company_id, COUNT(DISTINCT username)
FROM user_activity
WHERE company_id IN (1,9)
AND timestamp >= '2015-09-01'
AND timestamp < '2015-09-01' + INTERVAL 1 MONTH
AND action = "Login Success"
GROUP BY company_id
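If you do want one column per company, as in the question, one option is to put the condition inside COUNT(DISTINCT ...). The sketch below tries that on made-up rows with Python's sqlite3 standing in for MySQL; the usernames and timestamps are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_activity (username TEXT, company_id INTEGER, timestamp TEXT, action TEXT);
    INSERT INTO user_activity VALUES
        ('ann', 1, '2015-09-02 08:00:00', 'Login Success'),
        ('ann', 1, '2015-09-03 08:00:00', 'Login Success'),
        ('bob', 1, '2015-09-04 08:00:00', 'Login Success'),
        ('cat', 9, '2015-09-05 08:00:00', 'Login Success'),
        ('cat', 9, '2015-09-06 08:00:00', 'Login Failure'),
        ('dan', 9, '2015-10-02 08:00:00', 'Login Success');
""")

# CASE yields the username only for the matching company and NULL otherwise;
# COUNT(DISTINCT ...) ignores NULLs, so each column counts distinct users
# of one company.
row = conn.execute("""
    SELECT COUNT(DISTINCT CASE WHEN company_id = 1 THEN username END) AS company_1,
           COUNT(DISTINCT CASE WHEN company_id = 9 THEN username END) AS company_9
    FROM user_activity
    WHERE timestamp >= '2015-09-01'
      AND timestamp < '2015-10-01'
      AND action = 'Login Success'
""").fetchone()

print(row)
```

Here company 1 has 2 distinct users and company 9 has 1. The SUM(CASE ...) version in the question counts login rows rather than distinct usernames, which is why its numbers come out too high.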

SQL maximum number of records with a common value

Please consider the following two tables:
Holidays
HolidayID (PK)
Destination
Length
MaximumNumber
...
Bookings
BookingID (PK)
HolidayID (FK)
Name
...
Customers can book holidays (e.g. go to Hawaii). But, suppose that a given holiday has a maximum number of places. e.g. there are only 75 holidays to Hawaii this year (ignoring other years).
So if some customer wants to book a holiday to Hawaii, I need to count the records in the Bookings table, and if that number is greater than 75 I have to tell the customer it's too late.
This I can do using 2 MySQL queries (1 to get MaximumNumber for the holiday, 2 to get the current total from Bookings) and PHP (for example) to compare the count value with the maximum number of Hawaii holidays.
But I want to know if there is a way to do this purely in SQL (MySQL in this case)? i.e. count the number of bookings for Hawaii and compare against Hawaii's MaximumNumber value.
EDIT:
My method:
$query1 = "SELECT MaximumNumber FROM Holidays WHERE HolidayID=$hawaiiID";
$query2 = "SELECT COUNT(BookingID) FROM Bookings WHERE HolidayID=$hawaiiID";
So if the first query gives 75 and the second query gives 75 I can compare these values in PHP. But I wondered if there was a way to do this somehow in SQL alone.
Maybe I am missing something, but why not use a subquery to determine the total bookings for each holidayid:
select *
from holidays h
left join
(
select count(*) TotalBooked, HolidayId
from bookings
group by holidayId
) b
on h.holidayId = b.holidayId
WHERE h.HolidayID=$hawaiiID;
Then you could use a CASE expression to compare TotalBooked to MaximumNumber, similar to this:
select h.destination,
case
when b.totalbooked >= h.MaximumNumber
then 'Not Available'
else 'You can still book' end Availability
from holidays h
left join
(
select count(*) TotalBooked, HolidayId
from bookings
group by holidayId
) b
on h.holidayId = b.holidayId
WHERE h.HolidayID=$hawaiiID;
You will notice that I used a LEFT JOIN which will return all rows from the Holidays table even if there are not matching rows in the Bookings table.
Something like this will work. You can fill in the details:
select case
when
(select count(*)
from Bookings
where holidayID = $hawaiiid)
<= MaximumNumber then 'available' else 'sold out' end status
from holidays
etc
You might try something like this:
select case when coalesce(b.freq, 0) < h.MaximumNumber
then 'Available'
else 'Not Available'
end as MyResult
from Holidays h
left join (
select HolidayID
, count(*) as freq
from Bookings
where HolidayID=$hawaiiID
group by HolidayID
) b
on h.HolidayID=b.HolidayID
where h.HolidayID=$hawaiiID
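For what it's worth, here is a runnable sketch of the single-query check, using Python's sqlite3 in place of MySQL and PHP. The availability helper and the sample holidays are hypothetical. The LEFT JOIN keeps a holiday that has no bookings yet, and COUNT(b.BookingID) returns 0 for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Holidays (HolidayID INTEGER PRIMARY KEY, Destination TEXT, MaximumNumber INTEGER);
    CREATE TABLE Bookings (BookingID INTEGER PRIMARY KEY, HolidayID INTEGER, Name TEXT);
    INSERT INTO Holidays VALUES (1, 'Hawaii', 2), (2, 'Iceland', 3);
    INSERT INTO Bookings (HolidayID, Name) VALUES (1, 'Ann'), (1, 'Bob');
""")

def availability(holiday_id):
    """Hypothetical helper: compare the booking count to MaximumNumber in one query."""
    # The id is passed as a bound parameter rather than interpolated into the SQL string.
    return conn.execute("""
        SELECT CASE WHEN COUNT(b.BookingID) < h.MaximumNumber
                    THEN 'Available' ELSE 'Not Available' END
        FROM Holidays h
        LEFT JOIN Bookings b ON b.HolidayID = h.HolidayID
        WHERE h.HolidayID = ?
        GROUP BY h.HolidayID, h.MaximumNumber
    """, (holiday_id,)).fetchone()[0]

print(availability(1), availability(2))
```

In this sample, Hawaii has 2 bookings against a maximum of 2 and reports 'Not Available', while Iceland has none and reports 'Available'.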

SQL query that reports N or more consecutive absents from attendance table

I have a table that looks like this:
studentID | subjectID | attendanceStatus | classDate  | classTime | lecturerID
12345678  | 1234      | 1                | 2012-06-05 | 15:30:00  | 87654321
12345678  | 1234      | 0                | 2012-06-08 | 02:30:00  |
I want a query that reports whether a student has been absent for 3 or more consecutive classes, based on studentID and a specific subject, between 2 specific dates. Each class can have a different time. The schema for that table is:
PK(`studentID`, `classDate`, `classTime`, `subjectID`, `lecturerID`)
Attendance Status: 1 = Present, 0 = Absent
Edit: Worded question so that it is more accurate and really describes what was my intention.
I wasn't able to create an SQL query for this. So instead, I tried a PHP solution:
Select all rows from table, ordered by student, subject and date
Create a running counter for absents, initialized to 0
Iterate over each record:
    If student and/or subject is different from previous row
        Reset the counter to 0 (present) or 1 (absent)
    Else, that is when student and subject are same
        Set the counter to 0 (present) or plus 1 (absent)
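The steps above can be sketched in a few lines. This version is in Python rather than PHP, and the sample rows and the function name students_with_absent_run are made up for illustration.

```python
from itertools import groupby

# Hypothetical sample rows: (studentID, subjectID, classDate, attendanceStatus)
rows = [
    (1, 10, "2012-06-01", 0),
    (1, 10, "2012-06-04", 0),
    (1, 10, "2012-06-05", 0),
    (1, 10, "2012-06-08", 1),
    (2, 10, "2012-06-01", 0),
    (2, 10, "2012-06-04", 1),
    (2, 10, "2012-06-05", 0),
    (2, 10, "2012-06-08", 0),
]

def students_with_absent_run(rows, n=3):
    """Return (studentID, subjectID) pairs with n or more consecutive absences."""
    flagged = []
    ordered = sorted(rows)  # orders by student, subject, then date
    for key, group in groupby(ordered, key=lambda r: (r[0], r[1])):
        run = 0  # the counter starts over for each new student/subject pair
        for _, _, _, status in group:
            # present (1) resets the run, absent (0) extends it
            run = 0 if status == 1 else run + 1
            if run >= n:
                flagged.append(key)
                break
    return flagged

result = students_with_absent_run(rows)
print(result)
```

Student 1 misses three classes in a row and is reported; student 2 is absent three times in total but never three times consecutively, so is not.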
I then realized that this logic can easily be implemented using MySQL variables, so:
SET @studentID = 0;
SET @subjectID = 0;
SET @absentRun = 0;
SELECT *,
CASE
WHEN (@studentID = studentID) AND (@subjectID = subjectID) THEN @absentRun := IF(attendanceStatus = 1, 0, @absentRun + 1)
WHEN (@studentID := studentID) AND (@subjectID := subjectID) THEN @absentRun := IF(attendanceStatus = 1, 0, 1)
END AS absentRun
FROM table4
ORDER BY studentID, subjectID, classDate
You can probably nest this query inside another query that selects records where absentRun >= 3.
This query works for intended result:
SELECT DISTINCT first_day.studentID
FROM student_visits first_day
LEFT JOIN student_visits second_day
ON first_day.studentID = second_day.studentID
AND DATE(second_day.classDate) - INTERVAL 1 DAY = date(first_day.classDate)
LEFT JOIN student_visits third_day
ON first_day.studentID = third_day.studentID
AND DATE(third_day.classDate) - INTERVAL 2 DAY = date(first_day.classDate)
WHERE first_day.attendanceStatus = 0 AND second_day.attendanceStatus = 0 AND third_day.attendanceStatus = 0
It joins the table 'student_visits' (let's call your original table that) to itself step by step on 3 consecutive dates for each student and finally checks for absence on those days. DISTINCT makes sure that the result won't contain duplicate rows when a student is absent for more than 3 consecutive days.
This query doesn't consider absence in a specific subject, just consecutive absence for each student for 3 or more days. To take the subject into account, simply add a subjectID condition to each ON clause:
ON first_day.subjectID = second_day.subjectID
P.S.: not sure that it's the fastest way (at least it's not the only).
Unfortunately, MySQL versions before 8.0 do not support window functions. This would be much easier with row_number() or better yet cumulative sums (as supported in Oracle).
I will describe the solution. Imagine that you have two additional columns in your table:
ClassSeqNum -- a sequence starting at 1 and incrementing by 1 for each class date.
AbsentSeqNum -- a sequence starting at 1 on the student's first absence and incrementing by 1 on each subsequent absence.
The key observation is that the difference between these two values is constant for consecutive absences. Because you are using MySQL, you might consider adding these columns to the table. They are a bit challenging to add in the query, which is why this answer is so long.
Given the key observation, the answer to your question is provided by the following query:
select studentid, subjectid, absenceid, count(*) as cnt
from (select a.*, (ClassSeqNum - AbsentSeqNum) as absenceid
from Attendance a
where a.AttendanceStatus = 0
) a
group by studentid, subjectid, absenceid
having count(*) > 2
(Okay, this gives every sequence of absences for a student for each subject, but I think you can figure out how to whittle this down just to a list of students.)
How do you assign the sequence numbers? In mysql, you need to do a self join. So, the following adds the ClassSeqNum:
select a.StudentId, a.SubjectId, a.ClassDate, count(*) as ClassSeqNum
from Attendance a join
Attendance a1
on a.studentid = a1.studentid and a.SubjectId = a1.Subjectid and
a.ClassDate >= a1.classDate
group by a.StudentId, a.SubjectId, a.ClassDate
And the following adds the absence sequence number:
select a.StudentId, a.SubjectId, a.ClassDate, count(*) as AbsentSeqNum
from Attendance a join
Attendance a1
on a.studentid = a1.studentid and a.SubjectId = a1.Subjectid and
a.ClassDate >= a1.classDate
where a.AttendanceStatus = 0 and a1.AttendanceStatus = 0
group by a.StudentId, a.SubjectId, a.ClassDate
So the final query looks like:
-- note: WITH (common table expressions) requires MySQL 8.0 or later
with cs as (
select a.StudentId, a.SubjectId, a.ClassDate, count(*) as ClassSeqNum
from Attendance a join
Attendance a1
on a.studentid = a1.studentid and a.SubjectId = a1.Subjectid and
a.ClassDate >= a1.classDate
group by a.StudentId, a.SubjectId, a.ClassDate
),
ab as (
select a.StudentId, a.SubjectId, a.ClassDate, count(*) as AbsentSeqNum
from Attendance a join
Attendance a1
on a.studentid = a1.studentid and a.SubjectId = a1.Subjectid and
a.ClassDate >= a1.classDate
where a.AttendanceStatus = 0 and a1.AttendanceStatus = 0
group by a.StudentId, a.SubjectId, a.ClassDate
)
select studentid, subjectid, absenceid, count(*) as cnt
from (select cs.studentid, cs.subjectid,
(cs.ClassSeqNum - ab.AbsentSeqNum) as absenceid
from cs join
ab
on cs.studentid = ab.studentid and cs.subjectid = ab.subjectid and
cs.ClassDate = ab.ClassDate
) t
group by studentid, subjectid, absenceid
having count(*) > 2
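To check the sequence-number difference idea end to end, here is a sketch that runs it on a few made-up rows with Python's sqlite3 standing in for MySQL. It assumes at most one class per student, subject and date (classTime is ignored), and it computes each sequence number per class date with a self join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Attendance (StudentId INTEGER, SubjectId INTEGER, ClassDate TEXT, AttendanceStatus INTEGER);
    INSERT INTO Attendance VALUES
        (1, 10, '2012-06-01', 1),
        (1, 10, '2012-06-04', 0),
        (1, 10, '2012-06-05', 0),
        (1, 10, '2012-06-08', 0),
        (1, 10, '2012-06-11', 1),
        (1, 10, '2012-06-12', 0),
        (2, 10, '2012-06-01', 0),
        (2, 10, '2012-06-04', 1),
        (2, 10, '2012-06-05', 0);
""")

# cs numbers every class per student/subject; ab numbers only the absences.
# Within one unbroken run of absences both numbers grow together, so their
# difference stays constant and can serve as a grouping key.
runs = conn.execute("""
    WITH cs AS (
        SELECT a.StudentId, a.SubjectId, a.ClassDate, COUNT(*) AS ClassSeqNum
        FROM Attendance a
        JOIN Attendance a1
          ON a.StudentId = a1.StudentId AND a.SubjectId = a1.SubjectId
         AND a.ClassDate >= a1.ClassDate
        GROUP BY a.StudentId, a.SubjectId, a.ClassDate
    ),
    ab AS (
        SELECT a.StudentId, a.SubjectId, a.ClassDate, COUNT(*) AS AbsentSeqNum
        FROM Attendance a
        JOIN Attendance a1
          ON a.StudentId = a1.StudentId AND a.SubjectId = a1.SubjectId
         AND a.ClassDate >= a1.ClassDate
        WHERE a.AttendanceStatus = 0 AND a1.AttendanceStatus = 0
        GROUP BY a.StudentId, a.SubjectId, a.ClassDate
    )
    SELECT cs.StudentId, cs.SubjectId, COUNT(*) AS cnt
    FROM cs
    JOIN ab
      ON cs.StudentId = ab.StudentId AND cs.SubjectId = ab.SubjectId
     AND cs.ClassDate = ab.ClassDate
    GROUP BY cs.StudentId, cs.SubjectId, cs.ClassSeqNum - ab.AbsentSeqNum
    HAVING COUNT(*) > 2
""").fetchall()

print(runs)
```

Only student 1 is reported, with a run of 3 absences (2012-06-04 to 2012-06-08); the later single absence and student 2's scattered absences fall into groups of size 1.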