Determining values based on the count of that item in - mysql

I have two tables. ticket & ticketlist.
The sold out column in the ticket table needs to be 1 if that item is sold out.
Table ticket needs to be soldout when the count for that item in table ticketlist is 2.
ticket
ticketid, quantity, soldout
21 2 1
ticketlist
ticketlistid, ticketid
3 21
4 21
The logic is:
soldout should be '1' if ticket.quantity - (COUNT(ticketlist.ticketlistid) WHERE ticket.ticketid = ticketlist.ticketlistid) > 0
This is the MySQL that I tried
UPDATE ticket
SET soldout = '1'
WHERE quantity - (SELECT ticket.ticketid, COUNT(ticketlist.ticketlistid)
FROM ticket, ticketlist
WHERE ticket.ticketid = ticketlist.ticketid) > '0';
Any help will be appreciated.

In your subselect:
You should only return one column.
Don't select the same table you already have from your update.
You probably also want to set soldout to 1 when quantity - (SELECT ...) <= 0, rather than > 0 as you are currently doing.
Change the query to this:
UPDATE ticket
SET soldout = '1'
WHERE quantity - (
SELECT COUNT(ticketlist.ticketlistid)
FROM ticketlist
WHERE ticket.ticketid = ticketlist.ticketid
) <= 0;
Also your database is denormalized. You are storing information in one table that can be derived from the data in another table. This redundancy can cause errors if the two ever get out of sync. I'd recommend only doing this if you need it for performance reasons.
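A minimal sketch of the update with the `<= 0` comparison recommended above. It runs against SQLite through Python's sqlite3 module, which is only a stand-in for MySQL so that the example is self-contained. Ticket 22 and its single sale are hypothetical rows added to show a ticket that is not sold out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ticket (ticketid INTEGER PRIMARY KEY, quantity INTEGER, soldout INTEGER DEFAULT 0);
    CREATE TABLE ticketlist (ticketlistid INTEGER PRIMARY KEY, ticketid INTEGER);
    INSERT INTO ticket (ticketid, quantity) VALUES (21, 2), (22, 5);  -- 22 is hypothetical
    INSERT INTO ticketlist VALUES (3, 21), (4, 21), (5, 22);
""")

# Flag a ticket once quantity minus the number of sold rows reaches zero or less.
conn.execute("""
    UPDATE ticket
    SET soldout = 1
    WHERE quantity - (
        SELECT COUNT(ticketlist.ticketlistid)
        FROM ticketlist
        WHERE ticket.ticketid = ticketlist.ticketid
    ) <= 0
""")

soldout_by_ticket = dict(conn.execute("SELECT ticketid, soldout FROM ticket"))
print(soldout_by_ticket)
```

Only ticket 21 is flagged, because both of its 2 units appear in ticketlist.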

You're better off implementing this as a view; otherwise you risk the soldout value getting out of sync.
CREATE VIEW vw_tickets AS
SELECT t.ticketid,
t.quantity,
COUNT(tl.ticketid) AS sold, -- counts only matched rows, so unsold tickets report 0
CASE
WHEN t.quantity = COUNT(tl.ticketid) THEN 1
WHEN t.quantity < COUNT(tl.ticketid) THEN -1 -- oversold
ELSE 0
END AS soldout
FROM ticket t
LEFT JOIN ticketlist tl ON tl.ticketid = t.ticketid
GROUP BY t.ticketid, t.quantity;
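To check how such a view behaves, here is a small sketch in Python with sqlite3 (a stand-in for MySQL, used so the example can run anywhere). It counts tl.ticketid rather than * so that a ticket with no ticketlist rows reports 0 sold instead of 1. Tickets 22 and 23 are hypothetical rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ticket (ticketid INTEGER PRIMARY KEY, quantity INTEGER);
    CREATE TABLE ticketlist (ticketlistid INTEGER PRIMARY KEY, ticketid INTEGER);
    INSERT INTO ticket VALUES (21, 2), (22, 5), (23, 1);  -- 22 and 23 are hypothetical
    INSERT INTO ticketlist VALUES (3, 21), (4, 21), (5, 22);

    CREATE VIEW vw_tickets AS
    SELECT t.ticketid,
           t.quantity,
           COUNT(tl.ticketid) AS sold,
           CASE
               WHEN t.quantity = COUNT(tl.ticketid) THEN 1
               WHEN t.quantity < COUNT(tl.ticketid) THEN -1  -- oversold
               ELSE 0
           END AS soldout
    FROM ticket t
    LEFT JOIN ticketlist tl ON tl.ticketid = t.ticketid
    GROUP BY t.ticketid, t.quantity;
""")

# Each row is (ticketid, sold, soldout), computed on the fly from ticketlist.
view_rows = conn.execute(
    "SELECT ticketid, sold, soldout FROM vw_tickets ORDER BY ticketid"
).fetchall()
print(view_rows)
```

Because the flag is derived at query time, there is no stored soldout column that could drift from the ticketlist data.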

One problem I see is this:
(SELECT ticket.ticketid, COUNT(ticketlist.ticketlistid)
FROM ticket, ticketlist
WHERE ticket.ticketid = ticketlist.ticketid)
You are testing the result of that query against "> 0"; however, it returns both ticketid and the count. You need to remove ticket.ticketid.

Try this:
UPDATE `ticket`
SET `soldout` = 1
WHERE `ticketid` IN (
SELECT `ticketid` FROM `ticketlist` GROUP BY `ticketid` HAVING COUNT(`ticketlistid`) = 2 )

Related

Mysql - most efficient way to retrieve data based on multiple selects and wheres

I'm having trouble finding the most efficient way of retrieving several different summed values from a MySQL table.
Let's say I've got 4 columns - userid, amount, paid, referral.
I'd like to retrieve the following based on a user id:
1 - the sum of amount that is paid (marked as 1)
2 - the sum of amount that is unpaid (marked as 0)
3 - the sum of amount that is paid and referral (marked as 1 on both paid and referral columns)
4 - the sum of amount that is unpaid and referral (marked as 0 on paid and 1 on referral columns)
I've tried an embedded select statement like this:
SELECT (
SELECT sum(payout)
FROM table1
WHERE ispaid = 0 and userid = '100'
) AS unpaid,
(
SELECT sum(payout)
FROM table1
WHERE ispaid = 1 and userid = '100'
) AS paid,
(
SELECT sum(payout)
FROM table1
WHERE ispaid = 0 and isreferral = 1 and userid = '100'
) AS refpending,
(
SELECT sum(payout)
FROM table1
WHERE ispaid = 1 and isreferral = 1 and userid = '100'
) AS refpaid
This works, but it's slow (or at least feels like it could be quicker) on my server, around 1.5 seconds.
I'm sure there is a better way of doing this with a group statement but can't get my head around it!
Any help is much appreciated.
Thanks
You can use conditional expressions inside SUM():
SELECT
SUM(CASE WHEN ispaid=0 THEN payout END) AS unpaid,
SUM(CASE WHEN ispaid=1 THEN payout END) AS paid,
SUM(CASE WHEN ispaid=0 AND isreferral=1 THEN payout END) AS refpending,
SUM(CASE WHEN ispaid=1 AND isreferral=1 THEN payout END) AS refpaid
FROM table1
WHERE userid = '100'
If a given row is not matched by any CASE...WHEN clause, then the value of the expression is NULL, and SUM() ignores NULLs. You could also have an ELSE 0 clause in there if you want to be more explicit, since SUM() will not be increased by a 0.
Also make sure you have an index on userid in this table to select only the rows you need.
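Here is a small runnable sketch of the conditional-sum approach, using Python's sqlite3 module as a stand-in for MySQL. The sample rows and the index name idx_table1_userid are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (userid TEXT, payout INTEGER, ispaid INTEGER, isreferral INTEGER);
    CREATE INDEX idx_table1_userid ON table1 (userid);  -- hypothetical index name
    INSERT INTO table1 VALUES
        ('100', 10, 0, 0),
        ('100', 20, 1, 0),
        ('100',  5, 0, 1),
        ('100',  7, 1, 1),
        ('200', 99, 1, 1);  -- another user, filtered out by the WHERE clause
""")

# One pass over the user's rows; each CASE yields NULL for non-matching rows,
# and SUM() skips NULLs.
totals = conn.execute("""
    SELECT
        SUM(CASE WHEN ispaid = 0 THEN payout END) AS unpaid,
        SUM(CASE WHEN ispaid = 1 THEN payout END) AS paid,
        SUM(CASE WHEN ispaid = 0 AND isreferral = 1 THEN payout END) AS refpending,
        SUM(CASE WHEN ispaid = 1 AND isreferral = 1 THEN payout END) AS refpaid
    FROM table1
    WHERE userid = '100'
""").fetchone()
print(totals)
```

The table is scanned once for the user instead of once per subquery, which is where the speedup would be expected to come from.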

MySQL: Returning multiple columns from an in-line subquery with complex date comparison operators

I have a MySQL procedure which counts records in a very large MySQL table called overview_table. More or less the algorithm is the following:
Create a temporary table called step1 which filters down the overview_table to contain records for only 1 week and only the required data.
Go through each record in the newly created step1 table and do a count of 2 distinct dates in the entire overview_table
My query works. However, I am doing an inner SELECT twice. This is inefficient since I could technically retrieve both data points at the same time.
I read a few articles on how to overcome this (for example this: MySQL: Returning multiple columns from an in-line subquery), but my problem is the date filter which I apply inside of the subquery.
How do I re-write this query so that I only query once for the data I am trying to get in the inner select? This means I need to somehow return the following 2 data points at the same time:
COUNT(a_date)
COUNT(d_date)
Thank you.
CREATE DEFINER=`abc`@`%` PROCEDURE `abc_help`()
BEGIN
# Filter overview table (VERY LARGE!) to contain records for 1 week
CREATE TABLE step1
SELECT
wj_id,
jp_id,
p_id,
d_date,
a_date,
r_date,
h_date,
create_date
FROM overview_table
WHERE create_date BETWEEN '2020-02-01' AND '2020-02-01';
# Go through each record in step 1 and count number of date occurrences for each record.
# a_date and d_date will sometimes be NULL, therefore COUNT() works by counting non NULL values
CREATE TABLE step2
SELECT
step1.wj_id,
-- Historical counter 1
( SELECT COUNT(a_date) FROM overview_table AS search WHERE
search.a_date < step1.create_date AND
search.p_id = step1.p_id AND
search.jp_id = step1.jp_id
) AS date_a_count,
-- Historical counter 2
( SELECT COUNT(d_date) FROM overview_table AS search WHERE
search.d_date < step1.create_date AND
search.p_id = step1.p_id AND
search.jp_id = step1.jp_id
) AS date_d_count
FROM step1
GROUP BY 1;
END
You can do it with a LEFT JOIN of step1 to overview_table:
SELECT
step1.wj_id,
COUNT(CASE WHEN search.a_date < step1.create_date THEN 1 END) AS date_a_count,
COUNT(CASE WHEN search.d_date < step1.create_date THEN 1 END) AS date_d_count
FROM step1 LEFT JOIN overview_table AS search
ON ((search.a_date < step1.create_date) OR (search.d_date < step1.create_date)) AND
search.p_id = step1.p_id AND
search.jp_id = step1.jp_id
GROUP BY step1.wj_id;
If you want to count only distinct dates then:
SELECT
step1.wj_id,
COUNT(DISTINCT CASE WHEN search.a_date < step1.create_date THEN search.a_date END) AS date_a_count,
COUNT(DISTINCT CASE WHEN search.d_date < step1.create_date THEN search.d_date END) AS date_d_count
FROM step1 LEFT JOIN overview_table AS search
ON ((search.a_date < step1.create_date) OR (search.d_date < step1.create_date)) AND
search.p_id = step1.p_id AND
search.jp_id = step1.jp_id
GROUP BY step1.wj_id;
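To see that the single LEFT JOIN returns the same counts as the two correlated subqueries, here is a sketch in Python with sqlite3 (a stand-in for MySQL). The rows are hypothetical, the alias s replaces search, and step1 is built with the same one-day filter as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE overview_table (
        wj_id INTEGER, jp_id INTEGER, p_id INTEGER,
        a_date TEXT, d_date TEXT, create_date TEXT
    );
    -- Hypothetical rows; ISO date strings compare correctly as text
    INSERT INTO overview_table VALUES
        (1, 1, 1, '2020-01-10', NULL,         '2020-01-05'),
        (2, 1, 1, '2020-01-20', '2020-01-25', '2020-01-15'),
        (3, 1, 1, NULL,         '2020-02-10', '2020-02-01'),
        (4, 1, 2, '2020-01-01', '2020-01-02', '2019-12-30');
    CREATE TABLE step1 AS
        SELECT * FROM overview_table
        WHERE create_date BETWEEN '2020-02-01' AND '2020-02-01';
""")

# Original shape: one correlated subquery per counter.
two_subqueries = conn.execute("""
    SELECT step1.wj_id,
        (SELECT COUNT(s.a_date) FROM overview_table AS s
          WHERE s.a_date < step1.create_date
            AND s.p_id = step1.p_id AND s.jp_id = step1.jp_id) AS date_a_count,
        (SELECT COUNT(s.d_date) FROM overview_table AS s
          WHERE s.d_date < step1.create_date
            AND s.p_id = step1.p_id AND s.jp_id = step1.jp_id) AS date_d_count
    FROM step1
    ORDER BY step1.wj_id
""").fetchall()

# Rewritten shape: one join, with the date filters moved into COUNT(CASE ...).
single_join = conn.execute("""
    SELECT step1.wj_id,
        COUNT(CASE WHEN s.a_date < step1.create_date THEN 1 END) AS date_a_count,
        COUNT(CASE WHEN s.d_date < step1.create_date THEN 1 END) AS date_d_count
    FROM step1 LEFT JOIN overview_table AS s
        ON (s.a_date < step1.create_date OR s.d_date < step1.create_date)
        AND s.p_id = step1.p_id AND s.jp_id = step1.jp_id
    GROUP BY step1.wj_id
    ORDER BY step1.wj_id
""").fetchall()
print(two_subqueries, single_join)
```

Both forms agree on this sample. Note that the join form assumes wj_id identifies one row in step1, since it groups by that column.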

Mysql replace column value with other column value

I have 2 tables:
table: transaction:
====================
id billed_date amount
1 2016-09-30 5
2 2016-10-04 15
3 2016-10-06 10
table: report_date
====================
transaction_id report_date
1 2016-10-01
I want to:
Create a report that sums the amounts of all transactions in October 2016
Base it on the report date, not the billed date
Fall back to billed_date when the report date is not set
In the example above, the expected result is 30 (not 25)
I tried two queries:
The First:
SELECT
sum(t.amount),
CASE WHEN d.report_date IS NOT NULL THEN d.report_date ELSE t.billed_date END AS new_date
FROM
transaction t LEFT JOIN report_date d ON t.id = d.transaction_id
WHERE new_date BETWEEN '2016-10-01' AND '2016-10-30'
The Second:
SELECT sum(amount) FROM
(SELECT t.amount,
CASE WHEN d.report_date IS NOT NULL THEN d.report_date ELSE t.billed_date END AS date
FROM transaction t LEFT JOIN report_date d ON t.id = d.transaction_id
) t
WHERE t.date BETWEEN '2016-10-01' AND '2016-10-30'
Result:
The First:
Unknown column 'new_date' in 'where clause'
If I replace 'new_date' with 'date', the result is 25 (id=1 is excluded)
The Second:
result = 30 => Correct, but in my case, when the transaction table has about 30k records, the query is too slow.
Can anybody help me?
First of all - the part
CASE WHEN d.report_date IS NOT NULL THEN d.report_date ELSE t.billed_date END
can be written shorter as
COALESCE(d.report_date, t.billed_date)
or as
IFNULL(d.report_date, t.billed_date)
In your first query you are using a column alias in the WHERE clause, which is not allowed. You can fix it by moving the expression behind the alias to the WHERE clause:
SELECT sum(t.amount)
FROM transaction t LEFT JOIN report_date d ON t.id = d.transaction_id
WHERE COALESCE(d.report_date, t.billed_date) BETWEEN '2016-10-01' AND '2016-10-30'
This is almost the same as your own solution.
Your second query is slow because MySQL has to store the subquery result (30K rows) into a temporary table. Trying to optimize it, you will end up with the same solution above.
However if you have indexes on transaction.billed_date and report_date.report_date this query still can not use them. In order to use the indexes, you can split the query into two parts:
Entries with a report (will use report_date.report_date index):
SELECT sum(amount)
FROM transaction t JOIN report_date d ON id = transaction_id
WHERE d.report_date BETWEEN '2016-10-01' AND '2016-10-30'
Entries without a report (will use transaction.billed_date index):
SELECT sum(amount)
FROM transaction t LEFT JOIN report_date d ON id = transaction_id
WHERE d.report_date IS NULL AND t.billed_date BETWEEN '2016-10-01' AND '2016-10-30'
Both queries can use an index. You just need to sum the results, which can also be done by combining the two queries:
-- COALESCE keeps the total from becoming NULL when one branch matches no rows
SELECT COALESCE((
SELECT sum(amount)
FROM transaction t JOIN report_date d ON id = transaction_id
WHERE d.report_date BETWEEN '2016-10-01' AND '2016-10-30'
), 0) + COALESCE((
SELECT sum(amount)
FROM transaction t LEFT JOIN report_date d ON id = transaction_id
WHERE d.report_date IS NULL AND t.billed_date BETWEEN '2016-10-01' AND '2016-10-30'
), 0) AS sum_amount
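As a quick check that the COALESCE filter and the split version agree, here is a sketch in Python with sqlite3 (a stand-in for MySQL) on the sample data from the question. The table name is double-quoted only because transaction is a keyword in SQLite, and the COALESCE(..., 0) wrappers around the two partial sums are an addition so that an empty branch does not turn the total into NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "transaction" is double-quoted because it is a keyword in SQLite.
conn.executescript("""
    CREATE TABLE "transaction" (id INTEGER PRIMARY KEY, billed_date TEXT, amount INTEGER);
    CREATE TABLE report_date (transaction_id INTEGER, report_date TEXT);
    INSERT INTO "transaction" VALUES
        (1, '2016-09-30', 5), (2, '2016-10-04', 15), (3, '2016-10-06', 10);
    INSERT INTO report_date VALUES (1, '2016-10-01');
""")

# Single query: fall back to billed_date when there is no report row.
coalesce_total = conn.execute("""
    SELECT SUM(t.amount)
    FROM "transaction" t LEFT JOIN report_date d ON t.id = d.transaction_id
    WHERE COALESCE(d.report_date, t.billed_date) BETWEEN '2016-10-01' AND '2016-10-30'
""").fetchone()[0]

# Split query: reported rows plus unreported rows, each filtered on a plain column.
split_total = conn.execute("""
    SELECT COALESCE((
        SELECT SUM(t.amount)
        FROM "transaction" t JOIN report_date d ON t.id = d.transaction_id
        WHERE d.report_date BETWEEN '2016-10-01' AND '2016-10-30'
    ), 0) + COALESCE((
        SELECT SUM(t.amount)
        FROM "transaction" t LEFT JOIN report_date d ON t.id = d.transaction_id
        WHERE d.report_date IS NULL
          AND t.billed_date BETWEEN '2016-10-01' AND '2016-10-30'
    ), 0)
""").fetchone()[0]
print(coalesce_total, split_total)
```

Both totals include transaction 1, whose billed date is in September but whose report date is in October.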
I finally found the solution with help from my brother:
SELECT sum(amount)
FROM transaction t LEFT JOIN report_date d ON id = transaction_id
WHERE (report_date BETWEEN '2016-10-01' AND '2016-10-30') OR (report_date IS NULL AND billed_date BETWEEN '2016-10-01' AND '2016-10-30')
Thank you for your help!
Would filling the report_date table with the missing values from the transaction table be an option?
-- Backfill a report_date for October transactions that have none yet
INSERT INTO report_date (transaction_id, report_date)
SELECT id, billed_date FROM transaction WHERE billed_date BETWEEN '2016-10-01' AND '2016-10-30' AND id NOT IN (SELECT transaction_id FROM report_date);
SELECT sum(t.amount) FROM transaction t LEFT JOIN report_date d ON (t.id = d.transaction_id) WHERE d.report_date BETWEEN '2016-10-01' AND '2016-10-30';
Your second query is correct; there is no need to rewrite it. However, one thing will help a lot when you deal with thousands or millions of records. On a table that large, a query takes time to execute, and it may also cause locking or "database gone away" kinds of issues. To avoid this, create an INDEX on the column that is used in your WHERE clause. In your case, you can create an INDEX on the billed_date column of the transaction table, because your result is based on that table. For more details on how to create an index in MySQL/phpMyAdmin, see http://www.yourwebskills.com/dbphpmyadmintable.php.
I faced the same issue at one point and solved it by creating an INDEX on the column. I am now working with millions of records in MySQL.

How to GROUP BY consecutive data (date in this case)

I have a products table and a sales table that keeps record of how many items a given product sold during each date. Of course, not all products have sales everyday.
I need to generate a report that tells me how many consecutive days a product has had sales (from the latest date to the past) and how many items it sold during those days only.
I'd like to tell you how many things I've tried so far, but the only successful (and slow, recursive) ones are solutions inside my application and not inside SQL, which is what I want.
I also have browsed several similar questions on SO but I haven't found one that lets me have a clear idea of what I really need.
I've set up a SQLFiddle here to show you what I'm talking about. There you will see the only query I can think of, which doesn't give me the result I need. I also added comments there showing what the result of the query should be.
I hope someone here knows how to accomplish that. Thanks in advance for any comments!
Francisco
http://sqlfiddle.com/#!2/20108/1
Here is a stored procedure that does the job:
CREATE PROCEDURE myProc()
BEGIN
-- Drop and create the temp table
DROP TABLE IF EXISTS reached;
CREATE TABLE reached (
sku CHAR(32) PRIMARY KEY,
record_date date,
nb int,
total int)
ENGINE=HEAP;
-- Initial insert, the starting point is the MAX sales record_date of each product
INSERT INTO reached
SELECT products.sku, max(sales.record_date), 0, 0
FROM products
join sales on sales.sku = products.sku
group by products.sku;
-- loop until there is no more updated rows
iterloop: LOOP
-- Update the temptable with the values of the date - 1 row if found
update reached
join sales on sales.sku=reached.sku and sales.record_date=reached.record_date
set reached.record_date = reached.record_date - INTERVAL 1 day,
reached.nb=reached.nb+1,
reached.total=reached.total + sales.items;
-- If no more rows are updated, every product's streak has ended
IF ROW_COUNT() = 0 THEN
LEAVE iterloop;
END IF;
END LOOP iterloop;
-- select the results of the temp table
SELECT products.sku, products.title, products.price, reached.total as sales, reached.nb as days_sold
from reached
join products on products.sku=reached.sku;
END//
Then you just have to do
call myProc()
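For readers who want to see what the loop in the procedure computes, here is the same walk-back logic as a plain Python sketch. The sales rows are hypothetical, and the function name latest_streaks is made up for this example.

```python
from datetime import date, timedelta

# Hypothetical sales rows: (sku, record_date, items)
sales = [
    ("A", date(2013, 4, 22), 9),  # the 23rd is missing, so this is outside the latest streak
    ("A", date(2013, 4, 24), 3),
    ("A", date(2013, 4, 25), 2),
    ("A", date(2013, 4, 26), 4),
    ("B", date(2013, 4, 26), 1),
]

def latest_streaks(rows):
    """Return {sku: (items_sold, days_sold)} for the most recent run of consecutive days."""
    items_by_day = {}
    for sku, day, items in rows:
        per_sku = items_by_day.setdefault(sku, {})
        per_sku[day] = per_sku.get(day, 0) + items

    result = {}
    for sku, per_sku in items_by_day.items():
        cursor = max(per_sku)      # start at the latest sales date
        total = days_sold = 0
        while cursor in per_sku:   # step back one day until a day without sales
            total += per_sku[cursor]
            days_sold += 1
            cursor -= timedelta(days=1)
        result[sku] = (total, days_sold)
    return result

streaks = latest_streaks(sales)
print(streaks)
```

Product A sold on the 24th, 25th and 26th without a gap, so only those three days are counted.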
A solution in pure SQL without a stored procedure (Fiddle):
SELECT sku
, COUNT(1) AS consecutive_days
, SUM(items) AS items
FROM
(
SELECT sku
, items
-- generate a new guid for each group of consecutive date
-- ie : starting with day_before is null
, @guid := IF(@sku = sku and day_before IS NULL, UUID(), @guid) AS uuid
, @sku := sku AS dummy_sku
FROM
(
SELECT currents.sku
, befores.record_date as day_before
, currents.items
FROM sales currents
LEFT JOIN sales befores
ON currents.sku = befores.sku
AND currents.record_date = befores.record_date + INTERVAL 1 DAY
ORDER BY currents.sku, currents.record_date
) AS main_join
CROSS JOIN (SELECT @sku:=0) foo_sku
CROSS JOIN (SELECT @guid:=UUID()) foo_guid
) AS result_to_group
GROUP BY uuid, sku
The query is really not that hard. Declare variables via a cross join such as (SELECT @type:=0) type. Then, in the select list, you can set the variables' values row by row. This is necessary for simulating a rank function.
select
p.*,
sum(s.items) sales,
count(s.record_date) days_sold
from
products p
join
sales s
on
s.sku = p.sku
where s.record_date between '2013-04-18 00:00:00' and '2013-04-26 00:00:00'
group by p.sku;

SQL query that reports N or more consecutive absents from attendance table

I have a table that looks like this:
studentID | subjectID | attendanceStatus | classDate | classTime | lecturerID
12345678 | 1234 | 1 | 2012-06-05 | 15:30:00 | 87654321
12345678 | 1234 | 0 | 2012-06-08 | 02:30:00 |
I want a query that reports whether a student has been absent for 3 or more consecutive classes, for a given studentID and a specific subject, between 2 specific dates. Each class can have a different time. The schema for that table is:
PK(`studentID`, `classDate`, `classTime`, `subjectID`, `lecturerID`)
Attendance Status: 1 = Present, 0 = Absent
Edit: Worded question so that it is more accurate and really describes what was my intention.
I wasn't able to create an SQL query for this. So instead, I tried a PHP solution:
Select all rows from table, ordered by student, subject and date
Create a running counter for absents, initialized to 0
Iterate over each record:
If student and/or subject is different from previous row
Reset the counter to 0 (present) or 1 (absent)
Else, that is when student and subject are same
Set the counter to 0 (present) or plus 1 (absent)
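The steps above can be sketched in Python (standing in for the PHP version, which is not shown here). The rows are hypothetical, and the function name absent_runs is made up for this example.

```python
# Hypothetical rows: (studentID, subjectID, classDate, attendanceStatus); 1 = present, 0 = absent
rows = [
    (1, 10, "2012-06-01", 1),
    (1, 10, "2012-06-05", 0),
    (1, 10, "2012-06-08", 0),
    (1, 10, "2012-06-12", 0),
    (1, 10, "2012-06-15", 1),
    (2, 10, "2012-06-01", 0),
    (2, 10, "2012-06-05", 0),
]

def absent_runs(records):
    """Yield (studentID, subjectID, classDate, run) with a running count of absences."""
    previous_key = None
    run = 0
    # Sorting the tuples orders them by student, subject and date.
    for student, subject, class_date, status in sorted(records):
        key = (student, subject)
        if key != previous_key:
            run = 0                 # a new student/subject starts a fresh counter
            previous_key = key
        run = 0 if status == 1 else run + 1
        yield student, subject, class_date, run

# Student/subject pairs that reach 3 or more consecutive absences.
flagged = {(student, subject)
           for student, subject, _, run in absent_runs(rows)
           if run >= 3}
print(flagged)
```

Student 2 has only two absences in a row, so only student 1 is reported.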
I then realized that this logic can easily be implemented using MySQL variables, so:
SET @studentID = 0;
SET @subjectID = 0;
SET @absentRun = 0;
SELECT *,
CASE
WHEN (@studentID = studentID) AND (@subjectID = subjectID) THEN @absentRun := IF(attendanceStatus = 1, 0, @absentRun + 1)
WHEN (@studentID := studentID) AND (@subjectID := subjectID) THEN @absentRun := IF(attendanceStatus = 1, 0, 1)
END AS absentRun
FROM table4
ORDER BY studentID, subjectID, classDate
You can probably nest this query inside another query that selects records where absentRun >= 3.
SQL Fiddle
This query works for intended result:
SELECT DISTINCT first_day.studentID
FROM student_visits first_day
LEFT JOIN student_visits second_day
ON first_day.studentID = second_day.studentID
AND DATE(second_day.classDate) - INTERVAL 1 DAY = date(first_day.classDate)
LEFT JOIN student_visits third_day
ON first_day.studentID = third_day.studentID
AND DATE(third_day.classDate) - INTERVAL 2 DAY = date(first_day.classDate)
WHERE first_day.attendanceStatus = 0 AND second_day.attendanceStatus = 0 AND third_day.attendanceStatus = 0
It joins the table 'student_visits' (let's call your original table that) to itself step by step on 3 consecutive dates for each student, and finally checks for absence on those days. DISTINCT makes sure the result won't contain duplicate rows when a student is absent for more than 3 consecutive days.
This query doesn't consider absence in a specific subject, just consecutive absence of each student for 3 or more days. To take the subject into account, simply add a subjectID comparison to each ON clause:
ON first_day.subjectID = second_day.subjectID
P.S.: not sure that it's the fastest way (at least it's not the only).
Unfortunately, MySQL versions before 8.0 do not support window functions. This would be much easier with row_number() or, better yet, cumulative sums (as supported in Oracle).
I will describe the solution. Imagine that you have two additional columns in your table:
ClassSeqNum -- a sequence starting at 1 and incrementing by 1 for each class date.
AbsentSeqNum -- a sequence starting at 1 each time a student misses a class and then increments by 1 on each subsequent absence.
The key observation is that the difference between these two values is constant for consecutive absences. Because you are using mysql, you might consider adding these columns to the table. They are a bit challenging to add in the query, which is why this answer is so long.
Given the key observation, the answer to your question is provided by the following query:
select studentid, subjectid, absenceid, count(*) as cnt
from (select a.*, (ClassSeqNum - AbsentSeqNum) as absenceid
from Attendance a
) a
group by studentid, subjectid, absenceid
having count(*) > 2
(Okay, this gives every sequence of absences for a student for each subject, but I think you can figure out how to whittle this down just to a list of students.)
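The key observation can be checked with a few lines of Python. This is only an illustration for one student and one subject, with a hypothetical attendance list already sorted by class date. AbsentSeqNum is taken here as a cumulative count of absences, which is an assumption about how the column would be filled.

```python
from collections import Counter

# Hypothetical attendance for one student and subject, in class-date order.
# 1 = present, 0 = absent
statuses = [1, 0, 0, 0, 1, 0, 0]

run_lengths = Counter()
absent_seq_num = 0
for class_seq_num, status in enumerate(statuses, start=1):
    if status == 0:
        absent_seq_num += 1
        # Within one run of absences both counters advance together,
        # so their difference stays constant and identifies the run.
        absence_id = class_seq_num - absent_seq_num
        run_lengths[absence_id] += 1

run_lengths = dict(run_lengths)
# Same filter as "having count(*) > 2" in the query above.
long_runs = {absence_id: n for absence_id, n in run_lengths.items() if n > 2}
print(run_lengths, long_runs)
```

The three absences in classes 2 to 4 share the difference 1, while the two absences in classes 6 and 7 share the difference 2, so only the first run passes the filter.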
How do you assign the sequence numbers? In mysql, you need to do a self join. So, the following adds the ClassSeqNum:
select a.StudentId, a.SubjectId, a.ClassDate, count(*) as ClassSeqNum
from Attendance a join
Attendance a1
on a.studentid = a1.studentid and a.SubjectId = a1.Subjectid and
a.ClassDate >= a1.classDate
group by a.StudentId, a.SubjectId, a.ClassDate
And the following adds the absence sequence number:
select a.StudentId, a.SubjectId, a.ClassDate, count(*) as AbsentSeqNum
from Attendance a join
Attendance a1
on a.studentid = a1.studentid and a.SubjectId = a1.Subjectid and
a.ClassDate >= a1.classDate
where a.AttendanceStatus = 0 and a1.AttendanceStatus = 0
group by a.StudentId, a.SubjectId, a.ClassDate
So the final query looks like:
with cs as (
select a.StudentId, a.SubjectId, a.ClassDate, count(*) as ClassSeqNum
from Attendance a join
Attendance a1
on a.studentid = a1.studentid and a.SubjectId = a1.Subjectid and
a.ClassDate >= a1.classDate
group by a.StudentId, a.SubjectId, a.ClassDate
),
a as (
select a.StudentId, a.SubjectId, a.ClassDate, count(*) as AbsentSeqNum
from Attendance a join
Attendance a1
on a.studentid = a1.studentid and a.SubjectId = a1.Subjectid and
a.ClassDate >= a1.classDate
where a.AttendanceStatus = 0 and a1.AttendanceStatus = 0
group by a.StudentId, a.SubjectId, a.ClassDate
)
select studentid, subjectid, absenceid, count(*) as cnt
from (select cs.studentid, cs.subjectid,
(cs.ClassSeqNum - a.AbsentSeqNum) as absenceid
from cs join
a
on cs.studentid = a.studentid and cs.subjectid = a.subjectid and
cs.ClassDate = a.ClassDate
) a
group by studentid, subjectid, absenceid
having count(*) > 2