Create View with rank column in limited amount of data - mysql

So I have a MySQL table structured like this:
CREATE TABLE `spenttime` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`userid` int(11) NOT NULL,
`serverid` int(11) NOT NULL,
`time` int(11) NOT NULL,
`day` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `dbid_sid_day` (`userid`,`serverid`,`day`)
);
This is where I store the time spent on my game servers each day for each registered player. time is the amount of time spent, in seconds; day is a Unix timestamp for the beginning of each day. I want to create a view on my database that shows, for each user, the time spent on each server every week, together with a column giving the rank of that time, computed independently for each server and each week. Example data (for clarity I use the date format Y-M-D instead of a Unix timestamp for the day column in this example):
INSERT INTO `spenttime` (`userid`, `serverid`, `time`, `day`) VALUES
(1, 1, 200, '2013-04-01'),
(1, 1, 150, '2013-04-02'),
(2, 1, 100, '2013-04-02'),
(3, 1, 500, '2013-04-04'),
(2, 2, 400, '2013-04-04'),
(1, 1, 300, '2013-04-08'),
(3, 1, 200, '2013-04-08');
For that data, the view named spenttime_week should contain:
+--------+----------+--------+------------+------+
| userid | serverid | time | yearweek | rank |
+--------+----------+--------+------------+------+
| 1 | 1 | 350 | '2013-W14' | 2 |
| 2 | 1 | 100 | '2013-W14' | 3 |
| 3 | 1 | 500 | '2013-W14' | 1 |
| 2 | 2 | 400 | '2013-W14' | 1 |
| 1 | 1 | 300 | '2013-W15' | 1 |
| 3 | 1 | 200 | '2013-W15' | 2 |
+--------+----------+--------+------------+------+
I know how to generate the view without the rank; I only have trouble with the rank column...
How can I make that happen?
//edit
Additionally, this column MUST appear in the view. I cannot generate it in a SELECT from that view, because the app where I will use it doesn't allow that...

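First, create a view that sums the time spent by each user on each server per week. Note that YEARWEEK() expects a date, so if day holds a Unix timestamp as in the question, wrap it as FROM_UNIXTIME(day):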
CREATE VIEW total_spent_time AS
SELECT userid,
serverid,
sum(time) AS total_time,
yearweek(day, 3) as week
FROM spenttime
GROUP BY userid, serverid, week;
Then you can create your view like this (rank is quoted with backticks because it became a reserved word in MySQL 8.0):
CREATE VIEW spenttime_week AS
SELECT
s1.userid,
s1.serverid,
s1.total_time,
s1.week,
count(s2.total_time)+1 AS `rank`
FROM
total_spent_time s1 LEFT JOIN total_spent_time s2
ON s1.serverid=s2.serverid
AND s1.userid!=s2.userid
AND s1.week = s2.week
AND s1.total_time<=s2.total_time
GROUP BY
s1.userid,
s1.serverid,
s1.total_time,
s1.week
ORDER BY
s1.week, s1.serverid, s1.userid
Please see a fiddle here.
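To sanity-check the two-view approach without a MySQL server, here is a minimal sketch that replays the same logic in SQLite through Python's built-in sqlite3 module. It is an illustration under assumptions, not the fiddle from the answer: the week label is computed in Python with isocalendar() instead of YEARWEEK(), the time column is renamed seconds, the rank column is called rnk, and the join uses a strict < so that tied users would share a rank.

```python
import sqlite3
from datetime import date

# Sample rows from the question: (userid, serverid, seconds, day).
rows = [
    (1, 1, 200, "2013-04-01"),
    (1, 1, 150, "2013-04-02"),
    (2, 1, 100, "2013-04-02"),
    (3, 1, 500, "2013-04-04"),
    (2, 2, 400, "2013-04-04"),
    (1, 1, 300, "2013-04-08"),
    (3, 1, 200, "2013-04-08"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE spenttime (userid INT, serverid INT, seconds INT, yearweek TEXT)"
)
for userid, serverid, seconds, day in rows:
    iso = date.fromisoformat(day).isocalendar()  # (ISO year, ISO week, weekday)
    label = f"{iso[0]}-W{iso[1]:02d}"            # e.g. 2013-W14
    conn.execute(
        "INSERT INTO spenttime VALUES (?, ?, ?, ?)",
        (userid, serverid, seconds, label),
    )

conn.executescript(
    """
    -- Step 1: total time per user, server and week.
    CREATE VIEW total_spent_time AS
    SELECT userid, serverid, yearweek, SUM(seconds) AS total_time
    FROM spenttime
    GROUP BY userid, serverid, yearweek;

    -- Step 2: rank = 1 + number of users with a strictly higher total
    -- on the same server in the same week.
    CREATE VIEW spenttime_week AS
    SELECT s1.userid, s1.serverid, s1.total_time, s1.yearweek,
           COUNT(s2.userid) + 1 AS rnk
    FROM total_spent_time s1
    LEFT JOIN total_spent_time s2
           ON s1.serverid = s2.serverid
          AND s1.yearweek = s2.yearweek
          AND s1.total_time < s2.total_time
    GROUP BY s1.userid, s1.serverid, s1.total_time, s1.yearweek;
    """
)

ranked = conn.execute(
    "SELECT userid, serverid, total_time, yearweek, rnk "
    "FROM spenttime_week ORDER BY yearweek, serverid, rnk"
).fetchall()
for row in ranked:
    print(row)
```

The rank lives inside the view itself, which is what the question asks for. On a large table the self join compares every pair of users within a server and week, so the cost grows quickly with the number of users per group.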

There are lots of ways to get the yearweek column; I've used a quick, lazy lookup-table solution for clarity (I doubt that's the part you're struggling with). Here's how you can get the rank.
Use a self join to pair each row with the rows that have a higher time value than the current row, then count those rows.
This is much easier in MSSQL, which is where I live 99% of the time, and where you can just use the RANK() function. I hadn't realised until today there wasn't an equivalent in mysql. Fun to work out how to get the same result without MS's helping hand.
Prep stuff for context:
CREATE TABLE spenttime (userid int, serverid int, [time] int, [day] DATETIME)
CREATE TABLE weeklookup (weekname VARCHAR(10), weekstart DATETIME, weekend DATETIME)
INSERT INTO spenttime (userid, serverid, [time], [day]) VALUES
(1, 1, 200, '2013-apr-01'),
(1, 1, 150, '2013-apr-02'),
(2, 1, 100, '2013-apr-02'),
(3, 1, 500, '2013-apr-04'),
(2, 2, 400, '2013-apr-04'),
(1, 1, 300, '2013-apr-08'),
(3, 1, 200, '2013-apr-08');
INSERT INTO weeklookup(weekname, weekstart, weekend) VALUES
('2013-w14', '01/apr/2013', '08/apr/2013'),
('2013-w15', '08/apr/2013', '15/apr/2013')
GO
CREATE VIEW weekgroup AS
SELECT a.userid ,
a.serverid ,
a.[time] ,
w1.weekname
FROM spenttime a
INNER JOIN weeklookup w1 ON [day] >= w1.weekstart
AND [day] < w1.weekend
GO
Select statement for the view:
SELECT wv1.userid ,
wv1.serverid ,
wv1.[time] ,
wv1.weekname AS yearweek ,
COUNT(wv2.[time]) + 1 AS rank
FROM weekgroup wv1
LEFT JOIN weekgroup wv2 ON wv1.[time] < wv2.[time]
AND wv1.weekname = wv2.weekname
AND wv1.serverid = wv2.serverid
GROUP BY wv1.userid ,
wv1.serverid ,
wv1.[time] ,
wv1.weekname
ORDER BY wv1.weekname ,
wv1.[time] DESC

If you want to store the rank, you would use an insert trigger. The insert trigger would calculate the rank, as something like:
select count(*)
from spenttime_week w
where w.yearweek = new.yearweek and time >= new.time
However, I would not recommend this, because you then have to create an update trigger as well, and modify rank values that are already inserted.
Instead, access the table using SQL like:
select w.*,
(select count(*) from spenttime_week w2 where w2.yearweek = w.yearweek and w2.time >= w.time
) as rank
from spenttime_week w
This SQL may vary, depending on how you want to handle ties in the data. For performance reasons, you should have an index on at least yearweek, and probably on yearweek, time.
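To make the remark about ties concrete, here is a small sketch (SQLite via Python's sqlite3, used as a stand-in for MySQL) that runs the correlated subquery in two variants on made-up data with one tie. The four rows and the column name seconds are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE spenttime_week (userid INT, yearweek TEXT, seconds INT)"
)
# Hypothetical data: users 2 and 3 are tied at 300 seconds.
conn.executemany(
    "INSERT INTO spenttime_week VALUES (?, ?, ?)",
    [
        (1, "2013-W14", 500),
        (2, "2013-W14", 300),
        (3, "2013-W14", 300),
        (4, "2013-W14", 100),
    ],
)

# Variant A (as in the answer): count rows with time >= this row's time.
# Each row counts itself, and tied rows all get the LOWEST position.
inclusive = conn.execute(
    """
    SELECT w.userid,
           (SELECT COUNT(*) FROM spenttime_week w2
             WHERE w2.yearweek = w.yearweek
               AND w2.seconds >= w.seconds) AS rnk
    FROM spenttime_week w
    ORDER BY w.userid
    """
).fetchall()

# Variant B: 1 + count of rows with a strictly higher time.
# Tied rows all get the HIGHEST position (standard competition ranking).
strict = conn.execute(
    """
    SELECT w.userid,
           1 + (SELECT COUNT(*) FROM spenttime_week w2
                 WHERE w2.yearweek = w.yearweek
                   AND w2.seconds > w.seconds) AS rnk
    FROM spenttime_week w
    ORDER BY w.userid
    """
).fetchall()

print(inclusive)  # users 2 and 3 both rank 3
print(strict)     # users 2 and 3 both rank 2
```

Which variant is right depends on how the application wants to display ties; neither breaks them.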

Related

How do I select the max(timestamp) from a relational mysql table fast

We are developing a ticket system, and for the dashboard we want to show each ticket with its latest status. We have two tables: the first for the ticket itself and the second for the individual edits.
The system is already running, but the dashboard performs very badly (6 seconds for ~1300 tickets). At first we used a statement that selected 'where timestamp = (select max(Timestamp))' for every ticket. In a second step we created a view that contains only the latest timestamp for every ticket, but we are not able to include the correct status in this view as well.
So the main problem might be that we can't build a table in which both the latest ins_date and the latest status are selected for every ticket.
The simplified database looks like this:
CREATE TABLE `ticket` (
`id` int(10) NOT NULL,
`betreff` varchar(100) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `ticket_relation` (
`id` int(11) NOT NULL,
`ticket` int(10) NOT NULL,
`info` varchar(10000) DEFAULT NULL,
`status` int(1) NOT NULL DEFAULT '0',
`ins_date` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
`ins_user` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `ticket` (`id`, `betreff`) VALUES
(1, 'Technische Frage'),
(2, 'Ticket 2'),
(3, 'Weitere Fragen');
INSERT INTO `ticket_relation` (`id`, `ticket`, `info`, `status`, `ins_date`, `ins_user`) VALUES
(1, 1, 'Betreff 1', 0, '2019-05-28 11:02:18', 123),
(2, 1, 'Betreff 2', 3, '2019-05-28 12:07:36', 123),
(3, 2, 'Betreff 3', 0, '2019-05-29 06:49:32', 123),
(4, 3, 'Betreff 4', 1, '2019-05-29 07:44:07', 123),
(5, 2, 'Betreff 5', 1, '2019-05-29 07:49:32', 123),
(6, 2, 'Betreff 6', 3, '2019-05-29 08:49:32', 123),
(7, 3, 'Betreff 7', 2, '2019-05-29 09:49:32', 123),
(8, 2, 'Betreff 8', 1, '2019-05-29 10:49:32', 123),
(9, 3, 'Betreff 9', 2, '2019-05-29 11:49:32', 123),
(10, 3, 'Betreff 10', 3, '2019-05-29 12:49:32', 123);
I have created a SQL Fiddle: http://sqlfiddle.com/#!9/a873b6/3
The first three statements are attempts that either don't work correctly or are way too slow. The last one is the key, I think, but I don't understand why it gets the status wrong.
The attempt to create the table with latest ins_date AND status for each ticket:
SELECT
ticket, status, MAX(ins_date) as max_date
FROM
ticket_relation
GROUP BY
ticket
ORDER BY
ins_date DESC;
This query gets the correct (latest) ins_date for every ticket, but not the latest status:
+--------+--------+----------------------+
| ticket | status | max_date |
+--------+--------+----------------------+
| 3 | 1 | 2019-05-29T12:49:32Z |
+--------+--------+----------------------+
| 2 | 0 | 2019-05-29T10:49:32Z |
+--------+--------+----------------------+
| 1 | 0 | 2019-05-28T12:07:36Z |
+--------+--------+----------------------+
Expected output would be this:
+--------+--------+----------------------+
| ticket | status | max_date |
+--------+--------+----------------------+
| 3 | 3 | 2019-05-29T12:49:32Z |
+--------+--------+----------------------+
| 2 | 1 | 2019-05-29T10:49:32Z |
+--------+--------+----------------------+
| 1 | 3 | 2019-05-28T12:07:36Z |
+--------+--------+----------------------+
Is there an efficient way to select the latest timestamp and status for every ticket in the ticket table?
Another approach is to think in terms of filtering rather than GROUPing.
Query
SELECT
ticket_relation_1.ticket
, ticket_relation_1.status
, ticket_relation_1.ins_date
FROM
ticket_relation AS ticket_relation_1
LEFT JOIN
ticket_relation AS ticket_relation_2
ON
ticket_relation_1.ticket = ticket_relation_2.ticket
AND
ticket_relation_1.ins_date < ticket_relation_2.ins_date
WHERE
ticket_relation_2.id IS NULL
ORDER BY
ticket_relation_1.id DESC
Result
| ticket | status | ins_date |
| ------ | ------ | ------------------- |
| 3 | 3 | 2019-05-29 12:49:32 |
| 2 | 1 | 2019-05-29 10:49:32 |
| 1 | 3 | 2019-05-28 12:07:36 |
see demo
This query would require an index KEY(ticket, ins_date, id) to get maximum performance.
One solution would be to use a subquery to compute the latest insert date for each ticket, and then to join the results with the original table, like:
SELECT t.ticket, t.status, t.ins_date
FROM ticket_relation t
INNER JOIN (
SELECT ticket, max(ins_date) max_ins_date
FROM ticket_relation
GROUP BY ticket
) x ON t.ticket = x.ticket AND t.ins_date = x.max_ins_date
For better performance with this query, you want an index on (ticket, ins_date).
Another option would be to use a NOT EXISTS condition to ensure that only the latest record is selected, like:
SELECT t.ticket, t.status, t.ins_date
FROM ticket_relation t
WHERE NOT EXISTS (
SELECT 1
FROM ticket_relation t1
WHERE t1.ticket = t.ticket AND t1.ins_date > t.ins_date
)
NB: when dealing with GROUP BY, all non-aggregated columns must appear in the GROUP BY clause. Otherwise, you will get either an error or unpredictable results (depending on whether the server option ONLY_FULL_GROUP_BY is, respectively, enabled or disabled).
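As a quick check that the join-against-MAX and NOT EXISTS variants pick the same rows, here is a minimal sketch using SQLite through Python's sqlite3 as a stand-in for MySQL. The info and ins_user columns are left out for brevity, and the index name idx_ticket_date is an assumption; the data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ticket_relation (
        id       INTEGER PRIMARY KEY,
        ticket   INT  NOT NULL,
        status   INT  NOT NULL,
        ins_date TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO ticket_relation VALUES (?, ?, ?, ?)",
    [
        (1, 1, 0, "2019-05-28 11:02:18"),
        (2, 1, 3, "2019-05-28 12:07:36"),
        (3, 2, 0, "2019-05-29 06:49:32"),
        (4, 3, 1, "2019-05-29 07:44:07"),
        (5, 2, 1, "2019-05-29 07:49:32"),
        (6, 2, 3, "2019-05-29 08:49:32"),
        (7, 3, 2, "2019-05-29 09:49:32"),
        (8, 2, 1, "2019-05-29 10:49:32"),
        (9, 3, 2, "2019-05-29 11:49:32"),
        (10, 3, 3, "2019-05-29 12:49:32"),
    ],
)
# Composite index suggested in the answers (name is hypothetical).
conn.execute("CREATE INDEX idx_ticket_date ON ticket_relation (ticket, ins_date)")

# Variant 1: join each row against the per-ticket maximum ins_date.
join_max = conn.execute(
    """
    SELECT t.ticket, t.status, t.ins_date
    FROM ticket_relation t
    JOIN (SELECT ticket, MAX(ins_date) AS max_ins_date
          FROM ticket_relation
          GROUP BY ticket) x
      ON t.ticket = x.ticket AND t.ins_date = x.max_ins_date
    ORDER BY t.ticket
    """
).fetchall()

# Variant 2: keep a row only if no later row exists for the same ticket.
not_exists = conn.execute(
    """
    SELECT t.ticket, t.status, t.ins_date
    FROM ticket_relation t
    WHERE NOT EXISTS (
        SELECT 1 FROM ticket_relation t1
        WHERE t1.ticket = t.ticket AND t1.ins_date > t.ins_date
    )
    ORDER BY t.ticket
    """
).fetchall()

print(join_max)
```

Both variants return more than one row for a ticket if two edits share the exact same latest ins_date; adding id as a tie-breaker would be one way around that.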
If you are able to upgrade to a recent version of mysql (8.0), then window functions can be used to simplify the query and possibly increase its performance, like:
SELECT ticket, status, ins_date
FROM (
SELECT
ticket,
status,
ins_date,
row_number() over(partition by ticket order by ins_date desc) rn
FROM ticket_relation
) x WHERE rn = 1
You can try the query below:
SELECT
ticket, status, ins_date as max_date
FROM ticket_relation a
where ins_date in (select max(ins_date) from ticket_relation b where a.ticket=b.ticket)

Custom query with group by and then count

I am using events. I would like to know how to calculate the sum in an event or with a single query.
http://sqlfiddle.com/#!9/ad6d1c/1
DDL for question:
CREATE TABLE `table1` (
`id` int(11) NOT NULL,
`group_id` int(11) NOT NULL DEFAULT '0',
`in_use` tinyint(1) NOT NULL DEFAULT '1' COMMENT '0->in_use,1->not_in_use',
`auto_assign` tinyint(1) NOT NULL DEFAULT '0' COMMENT '0->Yes,1->No'
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
ALTER TABLE `table1`
ADD PRIMARY KEY (`id`);
ALTER TABLE `table1`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT;
INSERT INTO `table1` (`id`, `group_id`, `in_use`, `auto_assign`) VALUES
(1, 3, 1, 0),(2, 2, 0,1),(3, 1, 1, 1),(4, 3, 1, 0),(5, 3, 0, 0),(6, 3, 0, 1),
(7, 3, 1, 0),(8, 3, 0, 1),(9, 3, 0, 1),(10, 3, 0, 1),(11, 3, 0, 1),(12, 3, 1, 1),
(13, 3, 1, 0),(14, 3, 0, 0),(15, 3, 0, 0),(16, 3, 0, 0),(17, 3, 0, 0),(18, 3, 1, 1),
(19, 3, 0, 0),(20, 3, 0, 0)
Expected Output :
| count | in_use | auto_assign | sum | check_count |
|-------|--------|-------------|------|------------ |
| 7 | 0 | 0 | 11 | 5 |
| 5 | 0 | 1 | 07 | 3 |
| 4 | 1 | 0 | 11 | 5 |
| 2 | 1 | 1 | 07 | 3 |
Here we can see that auto_assign=0 has a total count of 11 (7+4) and
auto_assign=1 has a total count of 7 (5+2). This total should be stored in the new column sum.
The check_count column is a percentage of the sum column. The percentage will be predefined.
Let's take 50%: for a sum of 11, 50% = 5.5, which is taken as the integer 5 (rounded down). In the same way, for a sum of 7, 50% = 3.5, which is taken as the integer 3.
Here 5 > 4 (the count for auto_assign=0 and in_use=1), so a record has to be inserted into another table (table2); otherwise nothing is inserted.
In the same way, if 3 > 2, a record also needs to be inserted into table2; otherwise not.
Note: I would like to implement this logic in an event.
This is a bit complicated, but please suggest how to do this in an event.
Detailed clarification:
Here percentage_value is 5 for auto_assign=0, but the count for auto_assign=0 and in_use=1 is 4, which is less than 5, so a record has to be inserted into table2.
If, say, the count for auto_assign=0 and in_use=1 were 6, there would be no need to insert a record into table2.
In the same way,
percentage_value is 3 for auto_assign=1, but the count for auto_assign=1 and in_use=1 is 2, which is less than 3, so a record has to be inserted into table2.
If the count for auto_assign=1 and in_use=1 were 4, there would be no need to insert a record into table2.
Insert query into table2:
Insert into table2(cli_group_id,auto_assign,percentage_value,result_value) values(3,0,5,4)
DEMO Fiddle
Break the problem down: we need a count of the records by auto_assign, so we generate a derived table (B) with that value and join it back to your base table on auto_assign. This gives us the column we need for the sum, and we use the truncate function and a division to get check_count.
SELECT count(*), in_use, A.Auto_Assign, B.SumC, truncate(B.SumC/2,0) as check_Count
FROM table1 A
INNER JOIN (Select Auto_Assign, count(*) sumC
from table1
where Group_ID = 3
Group by Auto_Assign) B
on A.Auto_Assign = B.Auto_Assign
WHERE GROUP_ID = 3
Group by in_use, A.Auto_Assign
We can eliminate the double WHERE clause by joining on it:
SELECT count(*), in_use, A.Auto_Assign, B.SumC, truncate(B.SumC/2,0) as check_Count
FROM table1 A
INNER JOIN (Select Auto_Assign, count(*) sumC, Group_ID
from table1
where Group_ID = 3
Group by Auto_Assign, Group_ID) B
on A.Auto_Assign = B.Auto_Assign
and A.Group_ID = B.Group_ID
Group by in_use, A.Auto_Assign
I'd need clarification on the rest of the question: I'm not sure which 5 > 4 you're looking at, and I see no 3 other than the check count, but that's not "the same way", so I'm not sure what you're after.
Here 5 > 4(auto_assign=0 and in_use=1 ).So have to insert record into another table(table2). if not then not.
Same way, If 3 >2 then also need to insert record into another table(table2).if not then not.
Note : This logic I would like to implement in event
This is bit complicated, but please suggest me how to do this in event.
So to create the event: DOCS
Which results in:
CREATE EVENT myevent
ON SCHEDULE AT CURRENT_TIMESTAMP + INTERVAL 6 MINUTE
DO
INSERT INTO table2
SELECT count(*) as mCount
, in_use
, A.Auto_Assign
, B.SumC, truncate(B.SumC/2,0) as check_Count
FROM table1 A
INNER JOIN (SELECT Auto_Assign, count(*) sumC, Group_ID
FROM table1
WHERE Group_ID = 3
GROUP BY Auto_Assign, Group_ID) B
ON A.Auto_Assign = B.Auto_Assign
AND A.Group_ID = B.Group_ID
GROUP BY in_use, A.Auto_Assign
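The event above inserts every aggregated row. If the intent is to insert only when the in_use=1 count falls below check_count, as the question's clarification describes, the condition can be expressed with a HAVING clause. The following is a hedged sketch in SQLite via Python's sqlite3, not tested against MySQL: the table2 columns are taken from the INSERT in the question, and SQLite's integer division stands in for truncate(B.SumC/2,0).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (id INTEGER PRIMARY KEY, group_id INT, "
    "in_use INT, auto_assign INT)"
)
conn.execute(
    "CREATE TABLE table2 (cli_group_id INT, auto_assign INT, "
    "percentage_value INT, result_value INT)"
)
# Sample data from the question: (id, group_id, in_use, auto_assign).
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [
        (1, 3, 1, 0), (2, 2, 0, 1), (3, 1, 1, 1), (4, 3, 1, 0),
        (5, 3, 0, 0), (6, 3, 0, 1), (7, 3, 1, 0), (8, 3, 0, 1),
        (9, 3, 0, 1), (10, 3, 0, 1), (11, 3, 0, 1), (12, 3, 1, 1),
        (13, 3, 1, 0), (14, 3, 0, 0), (15, 3, 0, 0), (16, 3, 0, 0),
        (17, 3, 0, 0), (18, 3, 1, 1), (19, 3, 0, 0), (20, 3, 0, 0),
    ],
)

# Insert one row per auto_assign value, but only when the number of
# in_use = 1 rows is below 50% (rounded down) of the total for that value.
conn.execute(
    """
    INSERT INTO table2 (cli_group_id, auto_assign, percentage_value, result_value)
    SELECT a.group_id, a.auto_assign, b.sumc / 2, COUNT(*)
    FROM table1 a
    JOIN (SELECT group_id, auto_assign, COUNT(*) AS sumc
          FROM table1
          WHERE group_id = 3
          GROUP BY group_id, auto_assign) b
      ON a.group_id = b.group_id AND a.auto_assign = b.auto_assign
    WHERE a.in_use = 1
    GROUP BY a.group_id, a.auto_assign, b.sumc
    HAVING COUNT(*) < b.sumc / 2
    """
)

inserted = conn.execute(
    "SELECT cli_group_id, auto_assign, percentage_value, result_value "
    "FROM table2 ORDER BY auto_assign"
).fetchall()
print(inserted)
```

With the sample data both auto_assign values qualify (4 < 5 and 2 < 3). An auto_assign value with no in_use=1 rows at all produces no row here, so that edge case would need a LEFT JOIN instead.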

MYSQL multiple conditional statements for count

I'm very new to MySQL and have looked at many answers on this site, but can't get the following to work...
Table is "member"
3 fields are "id" (Integer); and 2 date fields "dob" and "expiry"
I need to count the number of records where all are current members, ie
expiry<curdate()
then I need to know the count of records with the following conditions:
year(curdate())-year(dob) <25 as young
year(curdate())-year(dob) >25 and <=50 as Medium
year(curdate())-year(dob) >50 as Older
So I expect to get a single row with many columns and the count of each of these conditions.
Effectively I'm filtering current members for their age grouping.
I've tried a subquery but failed to get that to work.
Thanks
If you really want the end result you describe, you could use views. It is a long way round, but here it is. I created the following member table and inserted data as follows.
CREATE TABLE member (
id int(11) AUTO_INCREMENT PRIMARY KEY,
dob date DEFAULT NULL,
expiry date DEFAULT NULL
);
INSERT INTO member (id, dob, expiry) VALUES
(1, '1980-01-01', '2020-05-05'),
(2, '1982-05-05', '2020-01-01'),
(3, '1983-05-05', '2020-01-01'),
(4, '1981-05-05', '2020-01-01'),
(5, '1994-05-05', '2020-01-01'),
(6, '1992-05-05', '2020-01-01'),
(7, '1960-05-05', '2020-01-01'),
(8, '1958-05-05', '2020-01-01'),
(9, '1958-07-07', '2020-05-05');
Following is the member table with data.
id | dob | expiry
--------------------------------
1 | 1980-01-01 | 2020-05-05
2 | 1982-05-05 | 2020-01-01
3 | 1983-05-05 | 2020-01-01
4 | 1981-05-05 | 2020-01-01
5 | 1994-05-05 | 2020-01-01
6 | 1992-05-05 | 2020-01-01
7 | 1960-05-05 | 2020-01-01
8 | 1958-05-05 | 2020-01-01
9 | 1958-07-07 | 2020-05-05
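Then I created a separate view named current_members for all the current members, as follows. The expiry date is compared directly with today's date, because TIMESTAMPDIFF(YEAR, ...) truncates towards zero and would still return 0 for members who expired less than a year ago.
CREATE VIEW current_members AS (SELECT * FROM member WHERE member.expiry >= CURDATE());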
Then, querying from that view, I created 3 separate views containing the counts for each of the age ranges young, middle and old, as follows. The boundaries are chosen so that nobody aged exactly 25 or 50 is counted twice.
CREATE VIEW young AS (SELECT COUNT(*) as Young FROM (SELECT TIMESTAMPDIFF(YEAR, current_members.dob, CAST(CURRENT_TIMESTAMP AS DATE)) AS age FROM current_members HAVING age < 25) yng);
CREATE VIEW middle AS (SELECT COUNT(*) as Middle FROM (SELECT TIMESTAMPDIFF(YEAR, current_members.dob, CAST(CURRENT_TIMESTAMP AS DATE)) AS age FROM current_members HAVING age BETWEEN 25 AND 50) mid);
CREATE VIEW old AS (SELECT COUNT(*) as Old FROM (SELECT TIMESTAMPDIFF(YEAR, current_members.dob, CAST(CURRENT_TIMESTAMP AS DATE)) AS age FROM current_members HAVING age > 50) old);
Finally, the three views were cross joined in order to get the counts of each age range into a single row of one final table as follows.
SELECT * FROM young, middle, old;
This will give you the following result.
Young | Middle | Old
----------------------
2 | 4 | 3
Suggestion: for the tedious time-difference calculations above, you could write your own stored procedure to simplify the code.
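For comparison, the same single-row result can usually be had from one query with conditional aggregation instead of four views. This is a swapped-in technique, not the method of the answer above, sketched in SQLite via Python's sqlite3. Assumptions: a fixed reference date of 2017-01-01 replaces CURDATE() so the output is reproducible, age 25 is placed in the medium bucket (the question leaves it unassigned), and row 10 is a hypothetical expired member added to show the filter working.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE member (id INTEGER PRIMARY KEY, dob TEXT, expiry TEXT)")
conn.executemany(
    "INSERT INTO member VALUES (?, ?, ?)",
    [
        (1, "1980-01-01", "2020-05-05"),
        (2, "1982-05-05", "2020-01-01"),
        (3, "1983-05-05", "2020-01-01"),
        (4, "1981-05-05", "2020-01-01"),
        (5, "1994-05-05", "2020-01-01"),
        (6, "1992-05-05", "2020-01-01"),
        (7, "1960-05-05", "2020-01-01"),
        (8, "1958-05-05", "2020-01-01"),
        (9, "1958-07-07", "2020-05-05"),
        (10, "1990-01-01", "2016-06-30"),  # hypothetical: already expired
    ],
)

REFERENCE_DATE = "2017-01-01"  # assumed stand-in for CURDATE()

sql = """
SELECT SUM(age < 25)              AS young,
       SUM(age BETWEEN 25 AND 50) AS medium,
       SUM(age > 50)              AS older
FROM (
    -- Age in full years: year difference, minus 1 if the birthday
    -- has not yet occurred in the reference year.
    SELECT CAST(strftime('%Y', :today) AS INTEGER)
         - CAST(strftime('%Y', dob) AS INTEGER)
         - (strftime('%m-%d', :today) < strftime('%m-%d', dob)) AS age
    FROM member
    WHERE expiry >= :today          -- current members only
) AS ages
"""
counts = conn.execute(sql, {"today": REFERENCE_DATE}).fetchone()
print(counts)
```

In MySQL the inner expression could be TIMESTAMPDIFF(YEAR, dob, CURDATE()), and SUM(condition) works there too because comparisons evaluate to 0 or 1.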

MySQL storing and querying latest software version

With the following sample table, I want to create a MySQL query that returns the latest version for each of the following fictional applications (based on traditional software version numbering). I am using MySQL version 5.5.17.
I would also consider using a stored function, if a function can be created that makes a more elegant query.
app | major | minor | patch
------+-------+-------+--------
cat | 2 | 15 | 0
cat | 2 | 15 | 1
cat | 2 | 2 | 0
dog | 1 | 0 | 1
dog | 1 | 7 | 2
dog | 3 | 0 | 0
fish | 2 | 2 | 5
fish | 2 | 3 | 1
fish | 2 | 11 | 0
Expected query result:
app | major | minor | patch
------+-------+-------+--------
cat | 2 | 15 | 1
dog | 3 | 0 | 0
fish | 2 | 11 | 0
You can use this sql to create the table called my_table, so you can test.
CREATE TABLE IF NOT EXISTS `my_table` (
`app` varchar(10) NOT NULL,
`major` int(11) NOT NULL DEFAULT '0',
`minor` int(11) NOT NULL DEFAULT '0',
`patch` int(11) NOT NULL DEFAULT '0'
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `my_table` (`app`, `major`, `minor`, `patch`) VALUES
('cat', 2, 15, 1),
('cat', 2, 15, 0),
('cat', 2, 2, 0),
('dog', 1, 0, 1),
('dog', 1, 7, 2),
('dog', 3, 0, 0),
('fish', 2, 2, 5),
('fish', 2, 3, 1),
('fish', 2, 11, 0);
If you assume that the minor version and patch always stay below 1000, you can combine them into a single number major*1000000 + minor*1000 + patch. Then you can apply one of the techniques from "SQL Select only rows with Max Value on a Column" after calculating this for each row.
SELECT m.*
FROM my_table AS m
JOIN (SELECT app, MAX(major*1000000 + minor*1000 + patch) AS maxversion
FROM my_table
GROUP BY app) AS m1
ON m.app = m1.app AND major*1000000 + minor*1000 + patch = maxversion
DEMO
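Here is a small sketch that runs the combined-number query against the sample data, using SQLite through Python's sqlite3 as a stand-in for MySQL 5.5. It relies on the same assumption as the answer: minor and patch stay below 1000, so the three parts never overflow into each other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (app TEXT, major INT, minor INT, patch INT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        ("cat", 2, 15, 1), ("cat", 2, 15, 0), ("cat", 2, 2, 0),
        ("dog", 1, 0, 1), ("dog", 1, 7, 2), ("dog", 3, 0, 0),
        ("fish", 2, 2, 5), ("fish", 2, 3, 1), ("fish", 2, 11, 0),
    ],
)

# Encode each version as one sortable integer, find the maximum per app,
# then join back to recover the original major/minor/patch columns.
latest = conn.execute(
    """
    SELECT m.app, m.major, m.minor, m.patch
    FROM my_table AS m
    JOIN (SELECT app,
                 MAX(major * 1000000 + minor * 1000 + patch) AS maxversion
          FROM my_table
          GROUP BY app) AS m1
      ON m.app = m1.app
     AND m.major * 1000000 + m.minor * 1000 + m.patch = m1.maxversion
    ORDER BY m.app
    """
).fetchall()
print(latest)
```

Note that cat 2.15.0 correctly beats cat 2.2.0 here (2015000 versus 2002000), which a plain string comparison of "2.15" and "2.2" would get wrong.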
There are three approaches I can think of. All of them are pretty ugly, and all of them involve subqueries:
1) use correlated subqueries in the SELECT list of a GROUP BY query;
2) use an inline view to get the max of a canonical string concatenation of (zero padded) major_minor_patch, e.g. 0002_0015_0001, and then either unpack the string representation or join to the table to get the matching row;
3) use a query that orders the rows by app, then by highest version of each app, and a trick (unsupported) with user-defined variables to flag the "first" row for each app.
None of these is pretty.
Here's a demonstration of one approach.
We start with this, to get each app:
SELECT t.app
FROM my_table t
GROUP BY t.app
Next step, get the highest "major" for each app, we can do something like this:
SELECT t.app
, MAX(t.major) AS major
FROM my_table t
GROUP BY t.app
To get the highest minor within that major, we can make that an inline view... wrap it in parens and reference it like a table in another query
SELECT t2.app
, t2.major
, MAX(t2.minor) AS minor
FROM my_table t2
JOIN (
SELECT t.app
, MAX(t.major) AS major
FROM my_table t
GROUP BY t.app
) t1
ON t2.app = t1.app
AND t2.major = t1.major
GROUP BY t2.app, t2.major
To get the highest patch, we follow the same pattern. Using the previous query as an inline view.
SELECT t4.app
, t4.major
, t4.minor
, MAX(t4.patch) AS patch
FROM my_table t4
JOIN ( -- query from above goes here
SELECT t2.app
, t2.major
, MAX(t2.minor) AS minor
FROM my_table t2
JOIN ( SELECT t.app
, MAX(t.major) AS major
FROM my_table t
GROUP BY t.app
) t1
ON t2.app = t1.app
AND t2.major = t1.major
GROUP BY t2.app, t2.major
) t3
ON t4.app = t3.app
AND t4.major = t3.major
AND t4.minor = t3.minor
GROUP BY t4.app, t4.major, t4.minor
That's just an example of one approach.
FOLLOWUP:
For another approach (getting a canonical representation of the version, that is, combining the values of "major", "minor" and "patch" in a single expression so that the result can be "ordered" by that expression to get the highest version), see the answer from Gordon.

Find sum of stacked/overlapping date intersections in SQL table

I have the following table which represents bookings of articles:
+---+------------+----------+-------------+-------------+
|id | article_id | quantity | starts_at | ends_at |
+---+------------+----------+-------------+-------------+
| 1 | 1 | 1 | 2015-03-01 | 2015-03-20 |
| 2 | 1 | 2 | 2015-03-02 | 2015-03-03 |
| 3 | 1 | 3 | 2015-03-04 | 2015-03-15 |
| 4 | 1 | 2 | 2015-03-16 | 2015-03-22 |
| 5 | 1 | 2 | 2015-03-11 | 2015-03-19 |
| 6 | 2 | 2 | 2015-03-06 | 2015-03-22 |
| 7 | 2 | 3 | 2015-03-02 | 2015-03-04 |
+---+------------+----------+-------------+-------------+
From this table I want to extract the following information:
+------------+----------+
| article_id | sum |
+------------+----------+
| 1 | 6 |
| 2 | 3 |
+------------+----------+
Sum represents the maximum summed quantity of stacked/overlapping bookings of an article over the given time ranges. In the first table, the article with id=1 reaches its maximum from bookings 1, 3 and 5.
Is there any MySQL solution to obtain this information from a table like this?
Thank you very much!
EDIT: The date intersections are crucial. Say booking 5 started at 2015-03-17: the sum for article_id=1 would then be 5, because bookings 3 and 5 would no longer overlap. The SQL should automatically consider all possible overlaps.
My answer is going to seem crazy complicated, perhaps; but it isn't, if one accepts that the use of a calendar table is an excellent MySQL idiom for dealing with date range related issues. I've closely adapted calendar table code from Artful Software's calendar table article. Artful Software's query techniques are a wonderful resource for doing complicated things in MySQL. The calendar table gives you a row per individual date that you are working with, which makes many things much easier.
For the whole thing below, you can go to this sqlfiddle for a place to play around with the code. It'll take a while to load.
First, here is your data:
CREATE TABLE articles
(`id` int, `article_id` int, `quantity` int, `starts_at` datetime, `ends_at` datetime);
INSERT INTO articles
(`id`, `article_id`, `quantity`, `starts_at`, `ends_at`)
VALUES
(1, 1, 1, '2015-03-01 00:00:00', '2015-03-20 00:00:00'),
(2, 1, 2, '2015-03-02 00:00:00', '2015-03-03 00:00:00'),
(3, 1, 3, '2015-03-04 00:00:00', '2015-03-15 00:00:00'),
(4, 1, 2, '2015-03-16 00:00:00', '2015-03-22 00:00:00'),
(5, 1, 2, '2015-03-11 00:00:00', '2015-03-19 00:00:00'),
(6, 2, 2, '2015-03-06 00:00:00', '2015-03-22 00:00:00'),
(7, 2, 3, '2015-03-02 00:00:00', '2015-03-04 00:00:00');
Next, here is the creation of the calendar table--I've created somewhat more date rows than needed (going back to start of year, and forward to start of next year). Ideally you just permanently keep a more massive calendar table on hand, covering a span of dates that will handle anything you could ever need. All the stuff below is going to seem quite lengthy and complex. But if you already have a calendar table lying around, the whole next section is not necessary.
CREATE TABLE calendar ( dt datetime primary key );
/* the views below will be joined and rejoined to themselves to
get the effect of creating many rows. V ends up with 10 rows. */
CREATE OR REPLACE VIEW v3 as SELECT 1 n UNION ALL SELECT 1 UNION ALL SELECT 1;
CREATE OR REPLACE VIEW v as SELECT 1 n FROM v3 a, v3 b UNION ALL SELECT 1;
/* Going to limit the calendar table to first of year of min date
and first of year after max date */
SELECT @min := makedate(year(min(starts_at)),1) FROM articles;
SELECT @max := makedate(year(max(ends_at))+1,1) FROM articles;
SET @inc = -1;
/* below we work with @min date + @inc days successively, with @inc:=@inc+1
acting like ++variable, so we start with minus 1.
We insert as many individual date rows as we want by self-joining v,
and using some kind of limit via WHERE to keep the calendar table small
for our example. For n occurrences of v below, you get a max
of 10^n rows in the calendar table. We are using v as row-creation
engine. */
INSERT INTO calendar
SELECT @min + interval @inc:=@inc+1 day as dt
FROM v a, v b, v c, v d # , v e , v f
WHERE @inc < datediff(@max,@min);
Now we are ready to find the stackings. Assuming the above (big assumption, I know), this becomes pretty easy. I'm going to do it through a few views for readability.
/* now create a view that will let us easily view the articles
related to individual dates when we query.
Not necessary, just makes things easier to read. */
CREATE OR REPLACE VIEW articles_to_dates as
SELECT c.dt, article_id
FROM articles a
INNER JOIN calendar c on c.dt between (SELECT min(starts_at) FROM articles) and (SELECT max(ends_at) FROM articles)
GROUP BY article_id, c.dt;
-- SELECT * FROM articles_to_dates -- This query would show the view's result
/* next view is the total amount of articles booked per individual date */
CREATE OR REPLACE VIEW booked_quantities_per_day AS
SELECT a2d.dt,a2d.article_id, SUM(a.quantity) as booked_quantity
FROM articles_to_dates a2d
INNER JOIN articles a on a2d.dt between a.starts_at and a.ends_at and a.article_id = a2d.article_id
GROUP BY a2d.dt, a2d.article_id
ORDER by a2d.article_id, a2d.dt;
-- SELECT * from booked_quantities_per_day -- this query would show the view's result
Finally, here are the desired results:
SELECT article_id, max(booked_quantity) max_stacked
FROM booked_quantities_per_day
GROUP BY article_id;
Results:
article_id max_stacked
1 6
2 3
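If maintaining a physical calendar table is not an option, the same per-day stacking can be sketched with a recursive common table expression that generates the dates on the fly. This is a variation on the answer, not its original code: it runs in SQLite via Python's sqlite3, stores dates as YYYY-MM-DD text, and assumes a database with WITH RECURSIVE support (in MySQL that means 8.0 or later, where the date arithmetic syntax also differs).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INT, article_id INT, quantity INT, "
    "starts_at TEXT, ends_at TEXT)"
)
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 1, "2015-03-01", "2015-03-20"),
        (2, 1, 2, "2015-03-02", "2015-03-03"),
        (3, 1, 3, "2015-03-04", "2015-03-15"),
        (4, 1, 2, "2015-03-16", "2015-03-22"),
        (5, 1, 2, "2015-03-11", "2015-03-19"),
        (6, 2, 2, "2015-03-06", "2015-03-22"),
        (7, 2, 3, "2015-03-02", "2015-03-04"),
    ],
)

max_stacked = conn.execute(
    """
    WITH RECURSIVE calendar(dt) AS (
        -- one row per day from the earliest start to the latest end
        SELECT MIN(starts_at) FROM articles
        UNION ALL
        SELECT date(dt, '+1 day') FROM calendar
        WHERE dt < (SELECT MAX(ends_at) FROM articles)
    ),
    booked_quantities_per_day AS (
        -- total quantity booked per article on each individual day
        SELECT c.dt, a.article_id, SUM(a.quantity) AS booked_quantity
        FROM calendar c
        JOIN articles a ON c.dt BETWEEN a.starts_at AND a.ends_at
        GROUP BY c.dt, a.article_id
    )
    SELECT article_id, MAX(booked_quantity) AS max_stacked
    FROM booked_quantities_per_day
    GROUP BY article_id
    ORDER BY article_id
    """
).fetchall()
print(max_stacked)
```

Article 1 peaks on 2015-03-11 through 2015-03-15, when bookings 1, 3 and 5 overlap (1 + 3 + 2 = 6). Article 2's two bookings never overlap, so its maximum is the larger single quantity, 3.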
This should work.
Use two groupings: the first gets the distinct list of possible quantity values per article, the second sums them.
SELECT article_id, SUM(sub.quantity) FROM
(SELECT article_id, quantity FROM table GROUP BY article_id, quantity) as sub
GROUP BY article_id
select sum(quantity) from ...
group by article_id
where
... select your date range ...