MySQL count rows within the same intervals to each other - mysql

I have a table where one column is the date:
+----------+---------------------+
| id | date |
+----------+---------------------+
| 5 | 2012-12-10 10:12:37 |
+----------+---------------------+
| 4 | 2012-12-10 09:09:55 |
+----------+---------------------+
| 3 | 2012-12-09 21:12:35 |
+----------+---------------------+
| 2 | 2012-12-09 20:15:07 |
+----------+---------------------+
| 1 | 2012-12-09 20:01:42 |
+----------+---------------------+
What I need is to count the rows that are, for example, within 3 hours of each other. In this example I want to group the top row with the 2nd row, and the 3rd row with the 4th and 5th rows. So my output should be like this:
+----------+---------------------+---------+
| id | date | count |
+----------+---------------------+---------+
| 5 | 2012-12-10 10:12:37 | 2 |
+----------+---------------------+---------+
| 3 | 2012-12-09 21:12:35 | 3 |
+----------+---------------------+---------+
How could I do this?

I think you need a self-join for this:
select t.id, t.date, COUNT(t2.id)
from t left outer join
t t2
on t.date between t2.date - interval 3 hour and t2.date + interval 3 hour
group by t.id, t.date
(This is untested code so it might have a syntax error.)
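To see what the self-join computes, here is a minimal Python sketch of the same logic on the sample rows from the question. The function name and the in-memory list are just for illustration; they are not part of the original query.

```python
from datetime import datetime, timedelta

# Sample rows from the question: (id, timestamp).
rows = [
    (5, datetime(2012, 12, 10, 10, 12, 37)),
    (4, datetime(2012, 12, 10, 9, 9, 55)),
    (3, datetime(2012, 12, 9, 21, 12, 35)),
    (2, datetime(2012, 12, 9, 20, 15, 7)),
    (1, datetime(2012, 12, 9, 20, 1, 42)),
]

WINDOW = timedelta(hours=3)


def neighbours_within_window(rows, window=WINDOW):
    """For each row, count the rows (itself included) within +/- window.

    This mirrors the BETWEEN ... - interval 3 hour AND ... + interval 3 hour
    join condition, which is inclusive on both ends.
    """
    return {
        row_id: sum(1 for _, other in rows if abs(ts - other) <= window)
        for row_id, ts in rows
    }


counts = neighbours_within_window(rows)
print(counts)
```

Note that every row gets its own count, so the result has five rows rather than two. To get one row per group you would still have to pick a representative (for example the latest row of each group).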
If you are trying to divide everything into 3-hour intervals, you can do something like:
select max(t.date), max(t.id), count(*)
from (select t.*,
(date(date)*100 + floor(hour(date)/3)*3) as bucket
from t
) t
group by bucket
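A quick Python sketch of the fixed 3-hour bucketing idea, again on the question's sample data. The bucket key here is a (date, starting hour) pair instead of the numeric expression in the query, which is my own simplification.

```python
from collections import Counter
from datetime import datetime

# Timestamps from the question's sample table.
timestamps = [
    datetime(2012, 12, 10, 10, 12, 37),
    datetime(2012, 12, 10, 9, 9, 55),
    datetime(2012, 12, 9, 21, 12, 35),
    datetime(2012, 12, 9, 20, 15, 7),
    datetime(2012, 12, 9, 20, 1, 42),
]


def bucket_key(ts):
    """Return (date, start hour of the 3-hour slot) for a timestamp."""
    return (ts.date().isoformat(), ts.hour // 3 * 3)


bucket_counts = Counter(bucket_key(ts) for ts in timestamps)
for key, count in sorted(bucket_counts.items(), reverse=True):
    print(key, count)
```

Fixed buckets are not the same as "within 3 hours of each other": the 21:12 row lands in the 21:00 slot while the two 20:xx rows land in the 18:00 slot, so this gives counts of 2, 1 and 2 rather than the 2 and 3 the question asks for.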

I am not sure how to do this with MySQL, but I was able to build a query in SQL Server 2005 that provides the intended results. Here is the working sample; it may be overly complex, but that's how I was able to get the desired result:
WITH BaseData AS
(
SELECT 5 AS ID, '2012-12-10 10:12:37' AS Date
UNION ALL
SELECT 4 AS ID, '2012-12-10 09:09:55' AS Date
UNION ALL
SELECT 3 AS ID, '2012-12-09 21:12:35' AS Date
UNION ALL
SELECT 2 AS ID, '2012-12-09 20:15:07' AS Date
UNION ALL
SELECT 1 AS ID, '2012-12-09 20:01:42' AS Date
),
BaseDataWithRowNum AS
(
SELECT ID,DATE, ROW_NUMBER() OVER (ORDER BY Date DESC) AS RowNum
FROM BaseData
),
InterRelatedDates AS
(
SELECT B1.RowNum AS RowNum1,B2.RowNum AS RowNum2
FROM BaseDataWithRowNum B1
INNER JOIN BaseDataWithRowNum B2
ON B1.Date BETWEEN B2.Date AND DATEADD(hh,3,B2.Date)
AND B1.RowNum < B2.RowNum
AND B1.ID != B2.ID
),
InterRelatedDatesWithinMultipleGroups AS
(
SELECT G1.RowNum1,G2.RowNum2
FROM InterRelatedDates G1
LEFT JOIN InterRelatedDates G2
ON G1.RowNum2 = G2.RowNum2
AND G1.RowNum1 != G2.RowNum1
)
SELECT BN.ID,
BN.Date,
CountExcludingOriginalGrouppingRecord +1 AS C
FROM
(
SELECT RowNum1 AS RowNum,COUNT(1) AS CountExcludingOriginalGrouppingRecord
FROM
(
-- If a row was used in only one group then it is ok. use as it is
SELECT D1.RowNum1
FROM InterRelatedDatesWithinMultipleGroups AS D1
WHERE D1.RowNum2 IS NULL
UNION ALL
-- In case a row was selected in two groups, choose the one with higher date
SELECT Min(D1.RowNum1)
FROM InterRelatedDatesWithinMultipleGroups AS D1
WHERE D1.RowNum2 IS NOT NULL
GROUP BY D1.RowNum2
) T
GROUP BY RowNum1
) T2
INNER JOIN BaseDataWithRowNum BN
ON BN.RowNum = T2.RowNum

Related

Create a row for every day in a date range?

I have a table like this:
+----+---------+------------+
| id | price | date |
+----+---------+------------+
| 1 | 340 | 2018-09-02 |
| 2 | 325 | 2018-09-05 |
| 3 | 358 | 2018-09-08 |
+----+---------+------------+
And I need to make a view which has a row for every day. Something like this:
+----+---------+------------+
| id | price | date |
+----+---------+------------+
| 1 | 340 | 2018-09-02 |
| 1 | 340 | 2018-09-03 |
| 1 | 340 | 2018-09-04 |
| 2 | 325 | 2018-09-05 |
| 2 | 325 | 2018-09-06 |
| 2 | 325 | 2018-09-07 |
| 3 | 358 | 2018-09-08 |
+----+---------+------------+
I can do that using PHP with a loop (foreach) and a temp variable which holds the previous price until there is a new date.
But I need to make a view, so I should do it in pure SQL. Any idea how I can do that?
You could use a recursive CTE to generate the records in the "gaps". To avoid endlessly "filling" the gap after the last date, first get the maximum date in the source data and make sure the recursion does not go past that date.
I have called your table tbl:
with recursive cte as (
select id,
price,
date,
(select max(date) date from tbl) mx
from tbl
union all
select cte.id,
cte.price,
date_add(cte.date, interval 1 day),
cte.mx
from cte
left join tbl
on tbl.date = date_add(cte.date, interval 1 day)
where tbl.id is null
and cte.date <> cte.mx
)
select id,
price,
date
from cte
order by 3;
demo with mysql 8
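If you have no MySQL 8 at hand, the same recursive idea can be tried out with Python's built-in sqlite3 module. This is only a sketch: it uses SQLite's date(x, '+1 day') in place of date_add, and the table contents are copied from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl (id INTEGER, price INTEGER, date TEXT);
    INSERT INTO tbl VALUES (1, 340, '2018-09-02'),
                           (2, 325, '2018-09-05'),
                           (3, 358, '2018-09-08');
""")

# Each row keeps spawning "next day" copies of itself until it reaches a day
# that already exists in tbl, or until it is the row with the maximum date.
filled = conn.execute("""
    WITH RECURSIVE cte(id, price, date, mx) AS (
        SELECT id, price, date, (SELECT MAX(date) FROM tbl) FROM tbl
        UNION ALL
        SELECT cte.id, cte.price, date(cte.date, '+1 day'), cte.mx
        FROM cte
        LEFT JOIN tbl ON tbl.date = date(cte.date, '+1 day')
        WHERE tbl.id IS NULL
          AND cte.date <> cte.mx
    )
    SELECT id, price, date FROM cte ORDER BY date
""").fetchall()

for row in filled:
    print(row)
```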
Here is an approach which should work without analytic functions. This answer uses a calendar table join approach. The first CTE below is the base table on which the rest of the query is based. We use a correlated subquery to find the most recent date earlier than the current date in the CTE which has a non NULL price. This is the basis for finding out what the id and price values should be for those dates coming in from the calendar table which do not appear in the original data set.
WITH cte AS (
SELECT cal.date, t.price, t.id
FROM
(
SELECT '2018-09-02' AS date UNION ALL
SELECT '2018-09-03' UNION ALL
SELECT '2018-09-04' UNION ALL
SELECT '2018-09-05' UNION ALL
SELECT '2018-09-06' UNION ALL
SELECT '2018-09-07' UNION ALL
SELECT '2018-09-08'
) cal
LEFT JOIN yourTable t
ON cal.date = t.date
),
cte2 AS (
SELECT
t1.date,
t1.price,
t1.id,
(SELECT MAX(t2.date) FROM cte t2
WHERE t2.date <= t1.date AND t2.price IS NOT NULL) AS nearest_date
FROM cte t1
)
SELECT
(SELECT t2.id FROM yourTable t2 WHERE t2.date = t1.nearest_date) id,
(SELECT t2.price FROM yourTable t2 WHERE t2.date = t1.nearest_date) price,
t1.date
FROM cte2 t1
ORDER BY
t1.date;
Demo
Note: To make this work on MySQL versions earlier than 8.0, you would need to inline the CTEs above. It would result in verbose code, but it should still work.
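The "most recent earlier date with a price" lookup is easier to see outside SQL. Here is a rough Python sketch of that step, assuming the source rows are already sorted by date; fill_daily and the tuple layout are names I made up for the example.

```python
from bisect import bisect_right
from datetime import date, timedelta

# Source rows from the question as (date, id, price), sorted by date.
source = [
    (date(2018, 9, 2), 1, 340),
    (date(2018, 9, 5), 2, 325),
    (date(2018, 9, 8), 3, 358),
]


def fill_daily(source):
    """Return one (id, price, day) row per calendar day in the covered range."""
    known_dates = [d for d, _, _ in source]
    first, last = known_dates[0], known_dates[-1]
    filled = []
    for offset in range((last - first).days + 1):
        day = first + timedelta(days=offset)
        # Index of the latest known date that is <= day (the "nearest_date").
        nearest = bisect_right(known_dates, day) - 1
        _, row_id, price = source[nearest]
        filled.append((row_id, price, day.isoformat()))
    return filled


daily = fill_daily(source)
for row in daily:
    print(row)
```

The loop over offsets plays the role of the calendar table, and the bisect lookup plays the role of the correlated MAX(date) subquery.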
Since you are using MariaDB, it is rather trivial:
MariaDB [test]> SELECT '2019-01-01' + INTERVAL seq-1 DAY FROM seq_1_to_31;
+-----------------------------------+
| '2019-01-01' + INTERVAL seq-1 DAY |
+-----------------------------------+
| 2019-01-01 |
| 2019-01-02 |
| 2019-01-03 |
| 2019-01-04 |
| 2019-01-05 |
| 2019-01-06 |
(etc)
There are variations on this wherein you generate a large range of dates, but then use a WHERE to chop to what you need. And use LEFT JOIN with the sequence 'derived table' on the 'left'.
Use something like the above as a derived table in your query.

MySQL - count with coalesce and add missing rows

I have one table with two date columns (Date_open and Date_closed). All I want to do is count occurrences per day, to see how many were opened and closed each day. We look at the last 7 days from today. The problem is that some dates are not present in either of the columns, and I cannot find a way to either link the tables with a subquery (example code 1) or get COALESCE to work (example code 2).
The table looks like this:
+------+------------+-------------+------+
| code | Date_open | Date_closed | Prio |
+------+------------+-------------+------+
| 1 | 2018-01-08 | 2018-01-08 | A |
| 2 | 2018-01-01 | 2018-01-08 | B |
| 3 | 2018-01-06 | 2018-01-07 | C |
| 4 | 2018-01-06 | 2018-01-06 | A |
| 5 | 2018-01-04 | 2018-01-06 | B |
| 6 | 2018-01-03 | 2018-01-01 | C |
| 7 | 2018-01-03 | 2018-01-02 | C |
| 8 | 2018-01-03 | 2018-01-02 | C |
+------+------------+-------------+------+
And the results I want are as follows:
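+------------+--------+---------+
| Date       | OpenNo | CloseNo |
+------------+--------+---------+
| 2018-01-01 | 1      | 1       |
| 2018-01-02 |        | 2       |
| 2018-01-03 | 3      |         |
| 2018-01-04 | 1      |         |
| 2018-01-05 |        |         |
| 2018-01-06 | 2      | 2       |
| 2018-01-07 |        | 1       |
| 2018-01-08 | 1      | 2       |
+------------+--------+---------+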
The first code I tried was:
SELECT *
FROM
(SELECT t1.Date_open,
COUNT(t1.Date_open) AS 'OpenNo'
FROM
Tbl AS t1
GROUP BY t1.Date_open)
AS A
JOIN
(SELECT t2.Date_closed,
COUNT(t2.Date_closed) AS 'CloseNo'
FROM
Tbl AS t2
GROUP BY t2.Date_closed)
AS B ON A.Date_open = B.Date_closed;
This code works as long as there is data for each day.
The second code I tried was:
SELECT
COALESCE (Date_open, Date_closed) AS Date1,
COUNT(Date_closed) AS ClosedNo,
COUNT(Date_open) AS OpenNo
FROM tbl
GROUP BY Date1;
Both do not work. Any ideas please?
Below is the code to create tbl.
create table Tbl(
code int(10) primary key,
Date_open DATE not null,
Date_closed DATE not null,
Prio varchar(10));
insert into Tbl values (1,'2018-01-08','2018-01-08' ,'A');
insert into Tbl values (2,'2018-01-01','2018-01-08' ,'B');
insert into Tbl values (3,'2018-01-06','2018-01-07' ,'C');
insert into Tbl values (4,'2018-01-06','2018-01-06' ,'A');
insert into Tbl values (5,'2018-01-04','2018-01-06' ,'B');
insert into Tbl values (6,'2018-01-03','2018-01-01' ,'C');
insert into Tbl values (7,'2018-01-03','2018-01-02' ,'C');
insert into Tbl values (8,'2018-01-03','2018-01-02' ,'C');
You may use a calendar table, and then left join to your current table twice to generate the counts for each date:
SELECT
d.dt,
COALESCE(t1.open_cnt, 0) AS OpenNo,
COALESCE(t2.closed_cnt, 0) AS CloseNo
FROM
(
SELECT '2018-01-01' AS dt UNION ALL
SELECT '2018-01-02' UNION ALL
SELECT '2018-01-03' UNION ALL
SELECT '2018-01-04' UNION ALL
SELECT '2018-01-05' UNION ALL
SELECT '2018-01-06' UNION ALL
SELECT '2018-01-07' UNION ALL
SELECT '2018-01-08'
) d
LEFT JOIN
(
SELECT Date_open, COUNT(*) AS open_cnt
FROM Tbl
GROUP BY Date_open
) t1
ON d.dt = t1.Date_open
LEFT JOIN
(
SELECT Date_closed, COUNT(*) AS closed_cnt
FROM Tbl
GROUP BY Date_closed
) t2
ON d.dt = t2.Date_closed
ORDER BY
d.dt;
Demo
The reason I aggregate the open and closed date counts in separate subqueries is that if we were to try to just do a straight join across all tables involved, we would have to deal with double counting.
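For anyone who wants to check the calendar-table approach against the sample data without a MySQL server, here is a small sketch using Python's sqlite3 module. It is not the exact query above: as an assumption on my part, the calendar is generated with a recursive CTE instead of a hand-written UNION ALL list, and dates are stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tbl (
        code INTEGER PRIMARY KEY,
        Date_open TEXT NOT NULL,
        Date_closed TEXT NOT NULL,
        Prio TEXT
    );
    INSERT INTO Tbl VALUES
        (1, '2018-01-08', '2018-01-08', 'A'),
        (2, '2018-01-01', '2018-01-08', 'B'),
        (3, '2018-01-06', '2018-01-07', 'C'),
        (4, '2018-01-06', '2018-01-06', 'A'),
        (5, '2018-01-04', '2018-01-06', 'B'),
        (6, '2018-01-03', '2018-01-01', 'C'),
        (7, '2018-01-03', '2018-01-02', 'C'),
        (8, '2018-01-03', '2018-01-02', 'C');
""")

# The calendar drives the result; each count is pre-aggregated in its own
# subquery so that the two LEFT JOINs cannot multiply rows.
per_day = conn.execute("""
    WITH RECURSIVE calendar(dt) AS (
        SELECT '2018-01-01'
        UNION ALL
        SELECT date(dt, '+1 day') FROM calendar WHERE dt < '2018-01-08'
    )
    SELECT c.dt,
           COALESCE(o.open_cnt, 0)    AS OpenNo,
           COALESCE(cl.closed_cnt, 0) AS CloseNo
    FROM calendar c
    LEFT JOIN (SELECT Date_open, COUNT(*) AS open_cnt
               FROM Tbl GROUP BY Date_open) o
           ON o.Date_open = c.dt
    LEFT JOIN (SELECT Date_closed, COUNT(*) AS closed_cnt
               FROM Tbl GROUP BY Date_closed) cl
           ON cl.Date_closed = c.dt
    ORDER BY c.dt
""").fetchall()

for row in per_day:
    print(row)
```

Days with no activity come out as 0 rather than blank because of the COALESCE calls.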
Edit:
If you wanted to just use the current date and seven days immediately preceding it, then here is a CTE which would do that:
WITH dates AS (
SELECT CURDATE() AS dt UNION ALL
SELECT DATE_SUB(CURDATE(), INTERVAL 1 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(), INTERVAL 2 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(), INTERVAL 3 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(), INTERVAL 4 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(), INTERVAL 5 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(), INTERVAL 6 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(), INTERVAL 7 DAY)
)
You could inline the above into my original query which is aliased as d, and it should work.
Coalesce can be confusing - it returns the first non-null value from the list you provide to it.
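A tiny illustration of that behaviour, run through Python's sqlite3 module (SQLite's COALESCE follows the same first-non-null rule; the literal dates are just sample values):

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def scalar(sql):
    """Run a single-value query and return that value."""
    return conn.execute(sql).fetchone()[0]


# The first non-NULL argument wins, reading left to right.
first_non_null = scalar("SELECT COALESCE(NULL, '2018-01-02', '2018-01-03')")

# If the first argument is never NULL, the later ones are never used.
first_wins = scalar("SELECT COALESCE('2018-01-03', '2018-01-01')")

# If every argument is NULL, the result is NULL (None in Python).
all_null = scalar("SELECT COALESCE(NULL, NULL)")

print(first_non_null, first_wins, all_null)
```

Since Date_open is declared NOT NULL in the question's table, COALESCE(Date_open, Date_closed) behaves like the second case and always returns Date_open.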
I don't know if this question requires a super-complex answer.
To get the count of open and closed for each unique date, the following could work.
SELECT
COALESCE (Date_open, Date_closed) AS Date1,
SUM(IF(Date_closed IS NOT NULL,1,0)) AS ClosedNo,
SUM(IF(Date_open IS NOT NULL,1,0)) AS OpenNo
FROM tbl
GROUP BY Date1;

Count of dates that are 5 days from the previous

I have a table with dates and users. I need help trying to figure out how to count only the dates that are 5 days after the previous date.
id|users_id|date
--------------------------------
1 | 1 | 2013-08-01 00:00:00
2 | 2 | 2013-08-03 00:00:00
3 | 1 | 2013-08-04 00:00:00
4 | 1 | 2013-08-06 00:00:00
5 | 2 | 2013-08-06 00:00:00
6 | 2 | 2013-08-10 00:00:00
7 | 2 | 2013-08-11 00:00:00
With the example above, I should get 2 for user 1 and 2 for user 2. I tried to do a subquery, but I was unable to pass in the timestamp to compare against. Any help would be much appreciated. Thank you.
Here are some of my example queries.
SELECT
tbl1.users_id,
COUNT(tbl2.date)
FROM table tbl1
LEFT JOIN table tbl2
ON tbl2.users_id = tbl1.users_id
AND tbl2.date > DATE_ADD(tbl1.date, INTERVAL 5 DAY)
GROUP BY tbl1.users_id;
SELECT
tbl1.users_id,
COUNT(tbl2.date)
FROM table tbl1
LEFT JOIN (
SELECT users_id, date
FROM table
WHERE date > DATE_ADD(tbl1.date, INTERVAL 5 DAY)
) tbl2 ON tbl1.users_id = tbl1.users_id
GROUP BY tbl1.users_id;
The last approach obviously doesn't work, since I can't put tbl1's date in the subquery.
Not tested, but something like this maybe:-
SELECT tbl1.users_id, COUNT(tbl2.date)
FROM (SELECT users_id, date, @Counter:=IF(users_id = @PrevId, @Counter + 1, 1) AS SequenceCtr, @PrevId:=users_id
FROM atable
CROSS JOIN (SELECT @Counter:=0, @PrevId:=0) Sub1
ORDER BY users_id, date) AS tbl1
LEFT OUTER JOIN (SELECT users_id, date, @Counter:=IF(users_id = @PrevId, @Counter + 1, 1) AS SequenceCtr, @PrevId:=users_id
FROM atable
CROSS JOIN (SELECT @Counter:=0, @PrevId:=0) Sub1
ORDER BY users_id, date) AS tbl2
ON tbl1.users_id = tbl2.users_id
AND tbl1.SequenceCtr + 1 = tbl2.SequenceCtr
AND tbl2.date > DATE_ADD(tbl1.date, INTERVAL 5 DAY)
GROUP BY tbl1.users_id;
Couple of subselects to get the list of dates but with a sequence number added. Then join as you have done, but also joining where the sequence numbers are 1 different.
EDIT - had a quick test on SQL fiddle and it seems to work:-
http://sqlfiddle.com/#!2/6c69b/1
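In case the sequence-number join is hard to follow, here is roughly the same logic as a Python sketch: per user, sort the dates and compare each date only with the one directly before it. The event list below is hypothetical data I made up, not the table from the question, and count_large_gaps is just an illustrative name.

```python
from collections import defaultdict
from datetime import date, timedelta


def count_large_gaps(events, min_gap=timedelta(days=5)):
    """Per user, count dates that are more than min_gap after the previous one."""
    by_user = defaultdict(list)
    for user_id, day in events:
        by_user[user_id].append(day)

    result = {}
    for user_id, days in by_user.items():
        days.sort()
        # zip(days, days[1:]) pairs each date with its successor, which is
        # what joining on SequenceCtr + 1 = SequenceCtr does in the query.
        result[user_id] = sum(
            1 for prev, cur in zip(days, days[1:]) if cur - prev > min_gap
        )
    return result


# Hypothetical events: (users_id, date).
events = [
    (1, date(2013, 8, 1)),
    (1, date(2013, 8, 8)),   # 7 days after the previous date: counted
    (1, date(2013, 8, 10)),  # 2 days after the previous date: not counted
    (2, date(2013, 8, 3)),
    (2, date(2013, 8, 20)),  # 17 days after the previous date: counted
]

gap_counts = count_large_gaps(events)
print(gap_counts)
```

As in the query, the comparison is strict, so a date exactly 5 days after the previous one is not counted.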

SUM a pair of COUNTs from two tables based on a time variable

Been searching for an answer to this for the better part of an hour without much luck. I have two regional tables laid out with the same column names and I can put out a result list for either table based on the following query (swap Table2 for Table1):
SELECT Table1.YEAR, FORMAT(COUNT(Table1.id),0) AS Total
FROM Table1
WHERE Table1.variable='Y'
GROUP BY Table1.YEAR
Ideally I'd like to get a result that gives me a total sum of the counts by year, so instead of:
| REGION 1 | | REGION 2 |
| YEAR | Total | | YEAR | Total |
| 2010 | 5 | | 2010 | 1 |
| 2009 | 2 | | 2009 | 3 |
| | | | 2008 | 4 |
I'd have:
| MERGED |
| YEAR | Total |
| 2010 | 6 |
| 2009 | 5 |
| 2008 | 4 |
I've tried a variety of JOINs and other ideas but I think I'm caught up on the SUM and COUNT issue. Any help would be appreciated, thanks!
SELECT `YEAR`, FORMAT(SUM(`count`), 0) AS `Total`
FROM (
SELECT `Table1`.`YEAR`, COUNT(*) AS `count`
FROM `Table1`
WHERE `Table1`.`variable` = 'Y'
GROUP BY `Table1`.`YEAR`
UNION ALL
SELECT `Table2`.`YEAR`, COUNT(*) AS `count`
FROM `Table2`
WHERE `Table2`.`variable` = 'Y'
GROUP BY `Table2`.`YEAR`
GROUP BY `Table2`.`YEAR`
) AS `union`
GROUP BY `YEAR`
You should use a UNION:
SELECT
t.YEAR,
COUNT(*) as TOTAL
FROM (
SELECT *
FROM Table1
UNION ALL
SELECT *
FROM Table2
) t
WHERE t.variable='Y'
GROUP BY t.YEAR;
Select year, sum(counts) from (
SELECT Table1.YEAR, FORMAT(COUNT(Table1.id),0) AS Total
FROM Table1
WHERE Table1.variable='Y'
GROUP BY Table1.YEAR
UNION ALL
SELECT Table2.YEAR, FORMAT(COUNT(Table2.id),0) AS Total
FROM Table2
WHERE Table2.variable='Y'
GROUP BY Table2.YEAR ) GROUP BY year
To improve upon Shehzad's answer:
SELECT YEAR, FORMAT(SUM(counts),0) AS total FROM (
SELECT Table1.YEAR, COUNT(Table1.id) AS counts
FROM Table1
WHERE Table1.variable='Y'
GROUP BY Table1.YEAR
UNION ALL
SELECT Table2.YEAR, COUNT(Table2.id) AS counts
FROM Table2
WHERE Table2.variable='Y'
GROUP BY Table2.YEAR ) AS newTable GROUP BY YEAR
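Here is a runnable sketch of the count-then-sum pattern using Python's sqlite3 module, with rows made up to reproduce the per-region totals from the question. FORMAT is left out because SQLite has no equivalent of MySQL's FORMAT; the add_rows helper is only there to build the test data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER PRIMARY KEY, YEAR INTEGER, variable TEXT);
    CREATE TABLE Table2 (id INTEGER PRIMARY KEY, YEAR INTEGER, variable TEXT);
""")


def add_rows(table, year, n, flag="Y"):
    """Insert n rows for the given year into one of the two fixed tables."""
    conn.executemany(
        f"INSERT INTO {table} (YEAR, variable) VALUES (?, ?)",
        [(year, flag)] * n,
    )


# Region 1: 5 rows for 2010 and 2 for 2009, plus rows the filter must skip.
add_rows("Table1", 2010, 5)
add_rows("Table1", 2009, 2)
add_rows("Table1", 2010, 3, flag="N")
# Region 2: 1 row for 2010, 3 for 2009 and 4 for 2008.
add_rows("Table2", 2010, 1)
add_rows("Table2", 2009, 3)
add_rows("Table2", 2008, 4)

# Count per table first, stack the two results, then sum the counts per year.
merged = conn.execute("""
    SELECT YEAR, SUM(counts) AS total
    FROM (
        SELECT YEAR, COUNT(id) AS counts FROM Table1
        WHERE variable = 'Y' GROUP BY YEAR
        UNION ALL
        SELECT YEAR, COUNT(id) AS counts FROM Table2
        WHERE variable = 'Y' GROUP BY YEAR
    ) AS newTable
    GROUP BY YEAR
    ORDER BY YEAR DESC
""").fetchall()

print(merged)
```

UNION ALL matters here: a plain UNION would drop one of two identical (year, count) rows, which would silently lose a region's count whenever both regions have the same total for a year.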

MySQL JOIN based on highest date and non-unique columns

I need some help with a MySQL query I'm working on. I have data as follows.
Table 1
id date1 text number
---|------------|--------|-------
1 | 2012-12-12 | hi | 399
2 | 2011-11-11 | so | 399
5 | 2010-10-10 | what | 555
3 | 2009-09-09 | bye | 300
4 | 2008-08-08 | you | 300
Table 2
id number date2 ref
---|--------|------------|----
1 | 399 | 2012-06-06 | 40
2 | 399 | 2011-06-06 | 50
5 | 555 | 2011-03-03 | 60
For each row in Table 1, I want to get zero or one ref values from Table 2. There should be a row in the result for each row in Table 1. The number column isn't unique to either table, so the join must be made using the date1 & date2 columns, where date2 is the highest value for the number without exceeding date1 for that number.
The desired result from the above example would be like so.
date1 text number ref
------------|--------|--------|-----
2012-12-12 | hi | 399 | 40
2011-11-11 | so | 399 | 50
2010-10-10 | what | 555 | null
2009-09-09 | bye | 300 | null
2008-08-08 | you | 300 | null
You can see in the result's first row that ref 40 was chosen because in table2 the record with ref=40 had a date2 that was less than date1, and it was the highest date that met that condition.
In the result's second row, ref 50 was chosen because in table2 the record with ref=50 had a date2 that was less than date1, and it was the highest date that met that condition.
The rest of the results have null refs because date1 is always less than date2 or a corresponding number doesn't exist in table2.
I've got to a certain point but I'm stuck. The query I have so far is like this.
SELECT date1, text, number, ref
FROM table1
LEFT JOIN (
SELECT *
FROM (
SELECT *
FROM table2
WHERE date2 <= '2012-12-12'
ORDER BY date2 DESC
) tmp
GROUP BY msisdn
) tmp ON table1.number = table2.number;
The problem is that the hard coded date won't do, it should be based on date1, but I can't use date1 because it's in the outer query. Is there a way I can make this work?
I tried a similar example with different tables just now and was able to get what you wanted. Below is a similar query modified to fit your needs. You might want to change < to <= if that is what you are looking for.
SELECT a.date1, a.text, b.ref
FROM table1 a LEFT JOIN table2 b ON
( a.number = b.number
AND a.date1 > b.date2
AND b.date2 = ( SELECT MAX(x.date2)
FROM table2 x
WHERE x.number = b.number
AND x.date2 < a.date1)
)
Untested:
SELECT t1.date1,
t1.text,
t1.number,
(SELECT a.ref
FROM TABLE_2 a
JOIN (SELECT t.number,
MAX(t.date2) AS max_date
FROM TABLE_2 t
WHERE t.number = t1.number
AND t.date2 <= t1.date1
GROUP BY t.number) b ON b.number = a.number
AND b.max_date = a.date2)
FROM TABLE_1 t1
The issue is the use of t1 in the derived table of the subselect...
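One way around that, sketched here with Python's sqlite3 module on the data from the question, is a correlated scalar subquery that orders by date2 and takes the first row. This is a different technique from the two queries above (ORDER BY ... LIMIT 1 instead of MAX), and I have only run it against SQLite, not MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, date1 TEXT, text TEXT, number INTEGER);
    CREATE TABLE table2 (id INTEGER, number INTEGER, date2 TEXT, ref INTEGER);
    INSERT INTO table1 VALUES
        (1, '2012-12-12', 'hi',   399),
        (2, '2011-11-11', 'so',   399),
        (5, '2010-10-10', 'what', 555),
        (3, '2009-09-09', 'bye',  300),
        (4, '2008-08-08', 'you',  300);
    INSERT INTO table2 VALUES
        (1, 399, '2012-06-06', 40),
        (2, 399, '2011-06-06', 50),
        (5, 555, '2011-03-03', 60);
""")

# For each table1 row, look up the table2 row with the same number and the
# latest date2 that does not exceed date1. No match gives NULL (None).
result = conn.execute("""
    SELECT t1.date1, t1.text, t1.number,
           (SELECT t2.ref
            FROM table2 t2
            WHERE t2.number = t1.number
              AND t2.date2 <= t1.date1
            ORDER BY t2.date2 DESC
            LIMIT 1) AS ref
    FROM table1 t1
    ORDER BY t1.date1 DESC
""").fetchall()

for row in result:
    print(row)
```

Because the subquery sits directly in the SELECT list rather than inside a derived table, it can reference t1.date1 without the scoping problem mentioned above.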