I have a table with dates and users. I need help trying to figure out how to count only the dates that are 5 days after the previous date.
id|users_id|date
--------------------------------
1 | 1 | 2013-08-01 00:00:00
2 | 2 | 2013-08-03 00:00:00
3 | 1 | 2013-08-04 00:00:00
4 | 1 | 2013-08-06 00:00:00
5 | 2 | 2013-08-06 00:00:00
6 | 2 | 2013-08-10 00:00:00
7 | 2 | 2013-08-11 00:00:00
With the following example, I should get 2 for user 1 and 2 for user 2. I tried to do a subquery, but I was unable to pass in the timestamp to compare against. Any help would be much appreciated. Thank you.
Here are some of my example queries.
SELECT
tbl1.users_id,
COUNT(tbl2.date)
FROM table tbl1
LEFT JOIN table tbl2
ON tbl2.users_id = tbl1.users_id
AND tbl2.date > DATE_ADD(tbl1.date, INTERVAL 5 DAY)
GROUP BY tbl1.users_id;
SELECT
tbl1.users_id,
COUNT(tbl2.date)
FROM table tbl1
LEFT JOIN (
SELECT users_id, date
FROM table
WHERE date > DATE_ADD(tbl1.date, INTERVAL 5 DAY)
) tbl2 ON tbl1.users_id = tbl1.users_id
GROUP BY tbl1.users_id;
The last approach obviously doesn't work, since I can't put tbl1's date in the subquery.
Not tested, but something like this maybe:-
SELECT tbl1.users_id, COUNT(tbl2.date)
FROM (SELECT users_id, date, @Counter:=IF(users_id = @PrevId, @Counter + 1, 1) AS SequenceCtr, @PrevId:=users_id
      FROM atable
      CROSS JOIN (SELECT @Counter:=0, @PrevId:=0) Sub1
      ORDER BY users_id, date) AS tbl1
LEFT OUTER JOIN (SELECT users_id, date, @Counter:=IF(users_id = @PrevId, @Counter + 1, 1) AS SequenceCtr, @PrevId:=users_id
      FROM atable
      CROSS JOIN (SELECT @Counter:=0, @PrevId:=0) Sub1
      ORDER BY users_id, date) AS tbl2
ON tbl1.users_id = tbl2.users_id
AND tbl1.SequenceCtr + 1 = tbl2.SequenceCtr
AND tbl2.date > DATE_ADD(tbl1.date, INTERVAL 5 DAY)
GROUP BY tbl1.users_id;
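This uses a couple of subselects that each return the list of dates with a per-user sequence number added. They are then joined as you have done, with the extra condition that the sequence numbers differ by exactly 1, so each date is compared only with the next date for the same user.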
EDIT - had a quick test on SQL fiddle and it seems to work:-
http://sqlfiddle.com/#!2/6c69b/1
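The same idea (compare each date only with the previous date for the same user) can be sketched without session variables by looking up the previous date with a correlated subquery. This is only a sketch under assumptions: it runs on SQLite through Python's sqlite3 module so it needs no server, julianday() stands in for MySQL's DATE_ADD comparison, and count_gaps is a hypothetical helper name. The table name atable is taken from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE atable (id INTEGER PRIMARY KEY, users_id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO atable VALUES (?, ?, ?)",
    [
        (1, 1, "2013-08-01 00:00:00"),
        (2, 2, "2013-08-03 00:00:00"),
        (3, 1, "2013-08-04 00:00:00"),
        (4, 1, "2013-08-06 00:00:00"),
        (5, 2, "2013-08-06 00:00:00"),
        (6, 2, "2013-08-10 00:00:00"),
        (7, 2, "2013-08-11 00:00:00"),
    ],
)


def count_gaps(min_gap_days):
    """Per user, count rows dated more than min_gap_days after that user's previous row."""
    sql = """
        SELECT u.users_id, COUNT(g.id)
        FROM (SELECT DISTINCT users_id FROM atable) AS u
        LEFT JOIN (
            SELECT cur.id, cur.users_id
            FROM atable AS cur
            -- The scalar subquery finds the previous date for the same user.
            -- It is NULL for a user's first row, which drops that row here.
            WHERE julianday(cur.date) - julianday(
                      (SELECT MAX(prev.date)
                       FROM atable AS prev
                       WHERE prev.users_id = cur.users_id
                         AND prev.date < cur.date)
                  ) > ?
        ) AS g ON g.users_id = u.users_id
        GROUP BY u.users_id
        ORDER BY u.users_id
    """
    return conn.execute(sql, (min_gap_days,)).fetchall()


print(count_gaps(5))  # gaps strictly longer than 5 days
print(count_gaps(2))  # gaps strictly longer than 2 days
```

In the sample data no two consecutive dates for a user are more than 5 days apart, so the threshold is left as a parameter to make the behaviour easy to check against other values.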
I have one table with two date columns (Date_open and Date_closed). All I want to do is count occurrences per day, to see how many were opened and closed each day. We look at the last 7 days from today. The problem is that some dates are not present in either of the columns, and I cannot find a way to either link the tables with a subquery (example code 1) or get COALESCE to work (example code 2).
The table looks like that:
+------+------------+-------------+------+
| code | Date_open | Date_closed | Prio |
+------+------------+-------------+------+
| 1 | 2018-01-08 | 2018-01-08 | A |
| 2 | 2018-01-01 | 2018-01-08 | B |
| 3 | 2018-01-06 | 2018-01-07 | C |
| 4 | 2018-01-06 | 2018-01-06 | A |
| 5 | 2018-01-04 | 2018-01-06 | B |
| 6 | 2018-01-03 | 2018-01-01 | C |
| 7 | 2018-01-03 | 2018-01-02 | C |
| 8 | 2018-01-03 | 2018-01-02 | C |
+------+------------+-------------+------+
And the results I want are as follows:
Date       | OpenNo | CloseNo
2018-01-01 | 1      | 1
2018-01-02 |        | 2
2018-01-03 | 3      |
2018-01-04 | 1      |
2018-01-05 |        |
2018-01-06 | 2      | 2
2018-01-07 |        | 1
2018-01-08 | 1      | 2
The first code I tried was:
SELECT *
FROM
(SELECT t1.Date_open,
COUNT(t1.Date_open) AS 'OpenNo'
FROM
Tbl AS t1
GROUP BY t1.Date_open)
AS A
JOIN
(SELECT t2.Date_closed,
COUNT(t2.Date_closed) AS 'CloseNo'
FROM
Tbl AS t2
GROUP BY t2.Date_closed)
AS B ON A.Date_open = B.Date_closed;
This code works as long as there is data for each day.
The second code I tried was:
SELECT
COALESCE (Date_open, Date_closed) AS Date1,
COUNT(Date_closed) AS ClosedNo,
COUNT(Date_open) AS OpenNo
FROM tbl
GROUP BY Date1;
Neither of them works. Any ideas, please?
Below is the code to create tbl.
create table Tbl(
code int(10) primary key,
Date_open DATE not null,
Date_closed DATE not null,
Prio varchar(10));
insert into Tbl values (1,'2018-01-08','2018-01-08' ,'A');
insert into Tbl values (2,'2018-01-01','2018-01-08' ,'B');
insert into Tbl values (3,'2018-01-06','2018-01-07' ,'C');
insert into Tbl values (4,'2018-01-06','2018-01-06' ,'A');
insert into Tbl values (5,'2018-01-04','2018-01-06' ,'B');
insert into Tbl values (6,'2018-01-03','2018-01-01' ,'C');
insert into Tbl values (7,'2018-01-03','2018-01-02' ,'C');
insert into Tbl values (8,'2018-01-03','2018-01-02' ,'C');
You may use a calendar table, and then left join to your current table twice to generate the counts for each date:
SELECT
d.dt,
COALESCE(t1.open_cnt, 0) AS OpenNo,
COALESCE(t2.closed_cnt, 0) AS CloseNo
FROM
(
SELECT '2018-01-01' AS dt UNION ALL
SELECT '2018-01-02' UNION ALL
SELECT '2018-01-03' UNION ALL
SELECT '2018-01-04' UNION ALL
SELECT '2018-01-05' UNION ALL
SELECT '2018-01-06' UNION ALL
SELECT '2018-01-07' UNION ALL
SELECT '2018-01-08'
) d
LEFT JOIN
(
SELECT Date_open, COUNT(*) AS open_cnt
FROM Tbl
GROUP BY Date_open
) t1
ON d.dt = t1.Date_open
LEFT JOIN
(
SELECT Date_closed, COUNT(*) AS closed_cnt
FROM Tbl
GROUP BY Date_closed
) t2
ON d.dt = t2.Date_closed
ORDER BY
    d.dt;
The reason I aggregate the open and closed date counts in separate subqueries is that if we were to just do a straight join across all tables involved, we would have to deal with double counting.
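As a runnable illustration of the calendar-table approach, here is a minimal sketch using the sample rows from the question. It is written for SQLite via Python's sqlite3 module, which is an assumption made only so that it runs without a MySQL server. The recursive CTE named calendar is a stand-in for the hard-coded UNION ALL list of dates, and the variable name daily_counts is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Tbl (code INTEGER PRIMARY KEY, Date_open TEXT NOT NULL, "
    "Date_closed TEXT NOT NULL, Prio TEXT)"
)
conn.executemany(
    "INSERT INTO Tbl VALUES (?, ?, ?, ?)",
    [
        (1, "2018-01-08", "2018-01-08", "A"),
        (2, "2018-01-01", "2018-01-08", "B"),
        (3, "2018-01-06", "2018-01-07", "C"),
        (4, "2018-01-06", "2018-01-06", "A"),
        (5, "2018-01-04", "2018-01-06", "B"),
        (6, "2018-01-03", "2018-01-01", "C"),
        (7, "2018-01-03", "2018-01-02", "C"),
        (8, "2018-01-03", "2018-01-02", "C"),
    ],
)

sql = """
WITH RECURSIVE calendar(dt) AS (
    -- One row per day from 2018-01-01 through 2018-01-08.
    SELECT '2018-01-01'
    UNION ALL
    SELECT date(dt, '+1 day') FROM calendar WHERE dt < '2018-01-08'
)
SELECT c.dt,
       COALESCE(o.open_cnt, 0)    AS OpenNo,
       COALESCE(cl.closed_cnt, 0) AS CloseNo
FROM calendar AS c
-- Each count is aggregated on its own before joining, which avoids double counting.
LEFT JOIN (SELECT Date_open, COUNT(*) AS open_cnt
           FROM Tbl GROUP BY Date_open) AS o
       ON o.Date_open = c.dt
LEFT JOIN (SELECT Date_closed, COUNT(*) AS closed_cnt
           FROM Tbl GROUP BY Date_closed) AS cl
       ON cl.Date_closed = c.dt
ORDER BY c.dt
"""
daily_counts = conn.execute(sql).fetchall()
for row in daily_counts:
    print(row)
```

Days with no opened or closed rows, such as 2018-01-05, still appear because the calendar side of the join supplies every date.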
Edit:
If you wanted to just use the current date and seven days immediately preceding it, then here is a CTE which would do that:
WITH dates AS (
SELECT CURDATE() AS dt UNION ALL
SELECT DATE_SUB(CURDATE(), INTERVAL 1 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(), INTERVAL 2 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(), INTERVAL 3 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(), INTERVAL 4 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(), INTERVAL 5 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(), INTERVAL 6 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(), INTERVAL 7 DAY)
)
You could inline the above into my original query in place of the subquery aliased as d, and it should work.
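COALESCE can be confusing: it returns the first non-NULL value from the list you provide to it.
I don't know if this question requires a super-complex answer.
To get the count of open and closed for each unique date, the following could work. The checks use IS NOT NULL because a comparison with = or != against NULL never evaluates to true. Also keep in mind that both date columns are declared NOT NULL in this table, so COALESCE always returns Date_open and the rows end up grouped by open date only.
SELECT
    COALESCE (Date_open, Date_closed) AS Date1,
    SUM(IF(Date_closed IS NOT NULL,1,0)) AS ClosedNo,
    SUM(IF(Date_open IS NOT NULL,1,0)) AS OpenNo
FROM tbl
GROUP BY Date1;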
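For anyone unsure how NULL tests behave inside a conditional sum, the following small sketch compares the two spellings. It assumes SQLite through Python's sqlite3 module purely for convenience, and the table sample and its column v are hypothetical. CASE is used in place of MySQL's IF() because SQLite versions differ in whether they offer an equivalent function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A comparison against NULL yields NULL (None in Python), never true.
not_equal_null, is_not_null = conn.execute(
    "SELECT 1 != NULL, 1 IS NOT NULL"
).fetchone()

# Hypothetical two-row table: one real value and one NULL.
conn.execute("CREATE TABLE sample (v TEXT)")
conn.executemany("INSERT INTO sample VALUES (?)", [("a",), (None,)])

wrong_count, right_count = conn.execute(
    """
    SELECT SUM(CASE WHEN v != NULL THEN 1 ELSE 0 END),
           SUM(CASE WHEN v IS NOT NULL THEN 1 ELSE 0 END)
    FROM sample
    """
).fetchone()

print(not_equal_null, is_not_null)  # None 1
print(wrong_count, right_count)     # 0 1
```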
Consider the tables:
table
no | date
--------------------------------
1 | 2015-03-17 00:00:00.000
1 | 2015-03-17 00:00:00.000
1 | 2015-03-17 00:00:00.000
2 | 2015-03-01 00:00:00.000
2 | 2016-03-01 00:00:00.000
2 | 2016-03-01 00:00:00.000
What is the most efficient self-join query I can write to return only the first 3 records (no = 1), given the condition that every date for a document must fall before 2016?
For instance, document no.2 will not show at all, because one of its date is > 2016, however document no.1 will show for all 3 records, because all 3 dates are < 2016
I tried the following:
SELECT a.no, a.date
FROM table a
INNER JOIN table b ON b.no = a.no AND b.date < '2016' --pseudocode for date comparison
However, the returned results are
no | date
--------------------------------
1 | 2015-03-17 00:00:00.000
1 | 2015-03-17 00:00:00.000
1 | 2015-03-17 00:00:00.000
2 | 2015-03-01 00:00:00.000
There are a couple of ways of doing it without even using JOIN! Here's one:
select * from tbl
where `no` not in
(
select `no` from tbl
where `date` >= '2016-01-01 00:00:00.000'
)
First you can get the list of document numbers that should be returned, and then use the self-join as below:
select b.*
from table b
join (select no from table group by no having year(max(date)) < 2016) x
on (b.no = x.no);
SELECT t1.* FROM table t1
JOIN
(
SELECT no, max(date) as max_date FROM table
GROUP BY no
HAVING max_date < '2016-01-01'
) t2
ON t1.no = t2.no
Maybe I don't understand something but from your requirements - you don't need self-join.
For instance, document no.2 will not show at all, because one of its date is > 2016, however document no.1 will show for all 3 records, because all 3 dates are < 2016
You need an anti-join.
SELECT A.no, A.date
FROM table A
WHERE
    NOT EXISTS(
        SELECT * FROM table B
        WHERE
            B.no = A.no
            AND B.date >= DATE('2016-01-01')
    )
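To check the anti-join against the sample rows, here is a minimal sketch. It assumes SQLite through Python's sqlite3 module, a hypothetical table name docs (since table itself is a reserved word), and dates stored as ISO-formatted text so that string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "no" is quoted because it is a keyword in some SQL dialects.
conn.execute('CREATE TABLE docs ("no" INTEGER, date TEXT)')
conn.executemany(
    "INSERT INTO docs VALUES (?, ?)",
    [
        (1, "2015-03-17 00:00:00.000"),
        (1, "2015-03-17 00:00:00.000"),
        (1, "2015-03-17 00:00:00.000"),
        (2, "2015-03-01 00:00:00.000"),
        (2, "2016-03-01 00:00:00.000"),
        (2, "2016-03-01 00:00:00.000"),
    ],
)

# Keep a row only if no row for the same document is dated 2016 or later.
sql = """
SELECT a."no", a.date
FROM docs AS a
WHERE NOT EXISTS (
    SELECT 1
    FROM docs AS b
    WHERE b."no" = a."no"
      AND b.date >= '2016-01-01'
)
ORDER BY a."no", a.date
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Document 2 is excluded entirely, including its 2015 row, because at least one of its rows fails the condition.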
I have a table with the following data (merely an example, actual table has 600,000 rows) (aid = access id [primary key] and id = user id [foreign key]):
aid | id | date
332 | 1 | 2016-12-15
331 | 4 | 2016-12-15
330 | 3 | 2016-12-15
329 | 1 | 2016-12-14
328 | 1 | 2016-12-14
327 | 2 | 2016-12-14
326 | 3 | 2016-12-13
325 | 2 | 2016-12-13
324 | 1 | 2016-12-13
323 | 1 | 2016-12-12
322 | 3 | 2016-12-12
321 | 1 | 2016-12-12
Each id is a user's primary key, and every time they access something in my system I log them in this table (with the date in the format shown, and their id). A user can be logged multiple times a day.
I'm looking to: return the total number of times the thing has been accessed in a day and return the total number of NEW users who have accessed the thing in a day, for the last 8 days (something will always be logged each day, so using "LIMIT 8" is fine for getting only the last 8 days).
My SQL currently looks like:
SELECT COUNT(id), COUNT(distinct id), date
FROM table
GROUP BY date
ORDER BY date DESC
LIMIT 8;
That SQL does the first part correctly, but I can't figure out how to get it to return the number of users who have never accessed the thing until that day.
The desired results would be as follows; the one "newuser" is the user with id "4", as they have never accessed the thing before:
COUNT(id) | newusers | date
3 | 1 | 2016-12-15
3 | 0 | 2016-12-14
3 | 0 | 2016-12-13
3 | 0 | 2016-12-12
Sorry if I didn't explain this clear enough.
To get new users you want the first day an id appeared:
select id, min(date)
from t
group by id;
The rest is just a join and group by:
select d.date, cnt, count(dd.id) as newusers
from (select date, count(*) as cnt
from t
group by date
) d left join
(select id, min(date) as mindate
from t
group by id
) dd
on d.date = dd.mindate
group by d.date, d.cnt
order by d.date desc
limit 8;
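The join on each user's first access date can be tried out on the twelve sample rows with the sketch below. It assumes SQLite through Python's sqlite3 module and a hypothetical table name access_log. Because the sample has no history before 2016-12-12, users seen on the earliest days are counted as new there, which differs from the question's desired output based on the full table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE access_log (aid INTEGER PRIMARY KEY, id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO access_log VALUES (?, ?, ?)",
    [
        (332, 1, "2016-12-15"), (331, 4, "2016-12-15"), (330, 3, "2016-12-15"),
        (329, 1, "2016-12-14"), (328, 1, "2016-12-14"), (327, 2, "2016-12-14"),
        (326, 3, "2016-12-13"), (325, 2, "2016-12-13"), (324, 1, "2016-12-13"),
        (323, 1, "2016-12-12"), (322, 3, "2016-12-12"), (321, 1, "2016-12-12"),
    ],
)

sql = """
SELECT d.date, d.cnt, COUNT(f.id) AS newusers
FROM (SELECT date, COUNT(*) AS cnt
      FROM access_log
      GROUP BY date) AS d
-- f holds one row per user with the first day that user appears.
LEFT JOIN (SELECT id, MIN(date) AS mindate
           FROM access_log
           GROUP BY id) AS f
       ON f.mindate = d.date
GROUP BY d.date, d.cnt
ORDER BY d.date DESC
LIMIT 8
"""
per_day = conn.execute(sql).fetchall()
for row in per_day:
    print(row)
```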
To get the number of new users you need to compare them to a set of ids over the past 8 days
My MySQL is a bit rusty, so you might have to correct the syntax.
SELECT COUNT(id)
FROM table
WHERE id NOT IN (
SELECT DISTINCT id
FROM table
WHERE date BETWEEN DATE(DATE_SUB(NOW(), INTERVAL 8 DAY)) AND DATE(DATE_SUB(NOW(), INTERVAL 1 DAY))
)
I'll leave it as a task for you to combine it with your other query ;)
Hi, if your date column in the database is a datetime, date, or other date-representing type, you can do something like this.
To get all users who accessed something in the last 8 days:
Select id, date from table
where date BETWEEN DATE_ADD(NOW(), INTERVAL -9 DAY) AND NOW()
I think, you can do whatever grouping you want on that.
To get new users, you can either go with self join or with sub select
selfjoin:
select t.id, t.date from table as t
LEFT join table as t2
ON t.id = t2.id
AND t.date BETWEEN DATE_ADD(NOW(), INTERVAL -1 DAY) AND NOW()
AND t2.date NOT BETWEEN DATE_ADD(NOW(), INTERVAL -9 DAY) AND NOW()
WHERE t2.id IS NULL
I used a left join to match all accesses from users and then excluded those rows in the WHERE clause. However, self-joins are slow, and even slower with a LEFT JOIN.
subselect:
select id, date from table
where date BETWEEN DATE_ADD(NOW(), INTERVAL -1 DAY) AND NOW()
AND id NOT IN (
SELECT id FROM table
WHERE date BETWEEN DATE_ADD(NOW(), INTERVAL -2 DAY) AND DATE_ADD(NOW(), INTERVAL -1 DAY)
)
I know those BETWEENs with DATE_ADDs are not exactly nice looking, but I hope they will help you more than grouping dates.
I would suggest storing the date with the time for more information, but that is entirely up to the meaning of your data.
I have a table with :
user_id | order_date
---------+------------
12 | 2014-03-23
12 | 2014-01-24
14 | 2014-01-26
16 | 2014-01-23
15 | 2014-03-21
20 | 2013-10-23
13 | 2014-01-25
16 | 2014-03-23
13 | 2014-01-25
14 | 2014-03-22
An active user is someone who has logged in within the last 12 months.
I need the output as:
Period | count of Active user
----------------------------
Oct-2013 - 1
Jan-2014 - 5
Mar-2014 - 10
(The Jan-2014 value includes the 1 record from Oct-2013 and the 4 non-duplicate records for Jan-2014.)
You can use a variable to calculate the running total of active users:
SELECT Period,
       @total:=@total+cnt AS `Count of Active Users`
FROM (
   SELECT CONCAT(MONTHNAME(order_date), '-', YEAR(order_date)) AS Period,
          COUNT(DISTINCT user_id) AS cnt
   FROM mytable
   GROUP BY Period
   ORDER BY YEAR(order_date), MONTH(order_date) ) t,
(SELECT @total:=0) AS var
The subquery returns the number of distinct active users per Month/Year. The outer query uses the @total variable to calculate the running total of the active users' count.
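A running total like this can also be computed without session variables. The sketch below is one possible alternative, not the method used above: it sums the monthly distinct counts with a correlated subquery. It assumes SQLite through Python's sqlite3 module, a hypothetical table name orders, and a sortable YYYY-MM period label instead of the month name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (user_id INTEGER, order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (12, "2014-03-23"), (12, "2014-01-24"), (14, "2014-01-26"),
        (16, "2014-01-23"), (15, "2014-03-21"), (20, "2013-10-23"),
        (13, "2014-01-25"), (16, "2014-03-23"), (13, "2014-01-25"),
        (14, "2014-03-22"),
    ],
)

sql = """
WITH monthly AS (
    -- Distinct users per calendar month.
    SELECT strftime('%Y-%m', order_date) AS period,
           COUNT(DISTINCT user_id)       AS cnt
    FROM orders
    GROUP BY strftime('%Y-%m', order_date)
)
SELECT m.period,
       -- Sum of this month's count and every earlier month's count.
       (SELECT SUM(m2.cnt) FROM monthly AS m2 WHERE m2.period <= m.period) AS running_total
FROM monthly AS m
ORDER BY m.period
"""
running = conn.execute(sql).fetchall()
for row in running:
    print(row)
```

On the sample rows the March figure is 9 rather than 10, since user 13's duplicate January order is counted once.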
I've got two queries that do this. I am not sure which one is faster. Check them against your database:
Query 1:
select per.yyyymm,
(select count(DISTINCT o.user_id) from orders o where o.order_date >=
(per.yyyymm - INTERVAL 1 YEAR) and o.order_date < per.yyyymm + INTERVAL 1 MONTH) as `count`
from
(select DISTINCT LAST_DAY(order_date) + INTERVAL 1 DAY - INTERVAL 1 MONTH as yyyymm
from orders) per
order by per.yyyymm
Results:
| yyyymm | count |
|---------------------------|-------|
| October, 01 2013 00:00:00 | 1 |
| January, 01 2014 00:00:00 | 5 |
| March, 01 2014 00:00:00 | 6 |
Query 2:
select DATE_FORMAT(order_date, '%Y-%m'),
(select count(DISTINCT o.user_id) from orders o where o.order_date >=
(LAST_DAY(o1.order_date) + INTERVAL 1 DAY - INTERVAL 13 MONTH) and
o.order_date <= LAST_DAY(o1.order_date)) as `count`
from orders o1
group by DATE_FORMAT(order_date, '%Y-%m')
Results:
| DATE_FORMAT(order_date, '%Y-%m') | count |
|----------------------------------|-------|
| 2013-10 | 1 |
| 2014-01 | 5 |
| 2014-03 | 6 |
The best thing I could do is this:
SELECT Date, COUNT(*) as ActiveUsers
FROM
(
SELECT DISTINCT user_id, CONCAT(YEAR(order_date), "-", MONTH(order_date)) as Date
FROM `a`
ORDER BY Date
)
AS `b`
GROUP BY Date
The output is the following:
| Date | ActiveUsers |
|---------|-------------|
| 2013-10 | 1 |
| 2014-1 | 4 |
| 2014-3 | 4 |
Now, for every row you need to sum up the number of active users in previous rows.
For example, here is the code in C#.
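int total = 0;
while (reader.Read())
{
    // COUNT(*) may come back as a 64-bit value, so convert rather than cast.
    total += Convert.ToInt32(reader["ActiveUsers"]);
    // Print the running total, not just the current month's count.
    Console.WriteLine("{0} - {1} active users", reader["Date"], total);
}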
By the way, for the March of 2014 the answer is 9 because one row is duplicated.
Try this, but this doesn't handle the last part: The Jan 2014 value - includes Oct -2013
select TO_CHAR(order_dt,'MON-YYYY'), count(distinct User_ID ) cnt from [orders]
where User_ID in
(select User_ID from
(select a.User_ID from [orders] a,
(select a.User_ID,count (a.order_dt) from [orders] a
where a.order_dt > (select max(b.order_dt)-365 from [orders] b where a.User_ID=b.User_ID)
group by a.User_ID
having count(order_dt)>1) b
where a.User_ID=b.User_ID) a
)
group by TO_CHAR(order_dt,'MON-YYYY');
This is what I think you are looking for
SET @cnt = 0;
SELECT Period, @cnt := @cnt + total_active_users AS total_active_users
FROM (
SELECT DATE_FORMAT(order_date, '%b-%Y') AS Period , COUNT( id) AS total_active_users
FROM t
GROUP BY DATE_FORMAT(order_date, '%b-%Y')
ORDER BY order_date
) AS t
This is the output that I get
Period total_active_users
Oct-2013 1
Jan-2014 6
Mar-2014 10
You can also do COUNT(DISTINCT id) to get the unique Ids only
I have a table where one column is the date:
+----------+---------------------+
| id | date |
+----------+---------------------+
| 5 | 2012-12-10 10:12:37 |
+----------+---------------------+
| 4 | 2012-12-10 09:09:55 |
+----------+---------------------+
| 3 | 2012-12-09 21:12:35 |
+----------+---------------------+
| 2 | 2012-12-09 20:15:07 |
+----------+---------------------+
| 1 | 2012-12-09 20:01:42 |
+----------+---------------------+
What I need is to count the rows which are, for example, within 3 hours of each other. In this example I want to join the upper row with the 2nd row, and the 3rd row with the 4th and 5th rows. So my output should be like this:
+----------+---------------------+---------+
| id | date | count |
+----------+---------------------+---------+
| 5 | 2012-12-10 10:12:37 | 2 |
+----------+---------------------+---------+
| 3 | 2012-12-09 21:12:35 | 3 |
+----------+---------------------+---------+
How could I do this?
I think you need a self-join for this:
select t.id, t.date, COUNT(t2.id)
from t left outer join
t t2
on t.date between t2.date - interval 3 hour and t2.date + interval 3 hour
group by t.id, t.date
(This is untested code so it might have a syntax error.)
If you are trying to divide everything into 3-hour intervals, you can do something like this (the alias is named bucket because interval is a reserved word in MySQL):
select max(t.date), max(t.id), count(*)
from (select t.*,
             (date(date)*100 + floor(hour(date)/3)*3) as bucket
      from t
     ) t
group by bucket
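To see what fixed 3-hour buckets do with the sample timestamps, here is a small sketch. It assumes SQLite through Python's sqlite3 module and a hypothetical table name events, and it groups by the calendar day plus the hour divided by 3 rather than building a single numeric key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, date TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        (5, "2012-12-10 10:12:37"),
        (4, "2012-12-10 09:09:55"),
        (3, "2012-12-09 21:12:35"),
        (2, "2012-12-09 20:15:07"),
        (1, "2012-12-09 20:01:42"),
    ],
)

# Bucket = calendar day plus hour // 3, i.e. 00-02, 03-05, ..., 21-23.
sql = """
SELECT MAX(date) AS latest, COUNT(*) AS n
FROM events
GROUP BY date(date), CAST(strftime('%H', date) AS INTEGER) / 3
ORDER BY latest DESC
"""
buckets = conn.execute(sql).fetchall()
for row in buckets:
    print(row)
```

Fixed buckets are not the same as "within 3 hours of each other": the 21:12 row lands in the 21-23 bucket and is separated from the two 20:xx rows, so this gives three groups where the question's expected output has two.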
I am not sure how to do this with MySQL, but I was able to build a set of queries in SQL Server 2005 which provide the intended results. Here is the working sample; it may be overly complex, but that's how I was able to get the desired result:
WITH BaseData AS
(
SELECT 5 AS ID, '2012-12-10 10:12:37' AS Date
UNION ALL
SELECT 4 AS ID, '2012-12-10 09:09:55' AS Date
UNION ALL
SELECT 3 AS ID, '2012-12-09 21:12:35' AS Date
UNION ALL
SELECT 2 AS ID, '2012-12-09 20:15:07' AS Date
UNION ALL
SELECT 1 AS ID, '2012-12-09 20:01:42' AS Date
),
BaseDataWithRowNum AS
(
SELECT ID,DATE, ROW_NUMBER() OVER (ORDER BY Date DESC) AS RowNum
FROM BaseData
),
InterRelatedDates AS
(
SELECT B1.RowNum AS RowNum1,B2.RowNum AS RowNum2
FROM BaseDataWithRowNum B1
INNER JOIN BaseDataWithRowNum B2
ON B1.Date BETWEEN B2.Date AND DATEADD(hh,3,B2.Date)
AND B1.RowNum < B2.RowNum
AND B1.ID != B2.ID
),
InterRelatedDatesWithinMultipleGroups AS
(
SELECT G1.RowNum1,G2.RowNum2
FROM InterRelatedDates G1
LEFT JOIN InterRelatedDates G2
ON G1.RowNum2 = G2.RowNum2
AND G1.RowNum1 != G2.RowNum1
)
SELECT BN.ID,
BN.Date,
CountExcludingOriginalGrouppingRecord +1 AS C
FROM
(
SELECT RowNum1 AS RowNum,COUNT(1) AS CountExcludingOriginalGrouppingRecord
FROM
(
-- If a row was used in only one group then it is ok. use as it is
SELECT D1.RowNum1
FROM InterRelatedDatesWithinMultipleGroups AS D1
WHERE D1.RowNum2 IS NULL
UNION ALL
-- In case a row was selected in two groups, choose the one with higher date
SELECT Min(D1.RowNum1)
FROM InterRelatedDatesWithinMultipleGroups AS D1
WHERE D1.RowNum2 IS NOT NULL
GROUP BY D1.RowNum2
) T
GROUP BY RowNum1
) T2
INNER JOIN BaseDataWithRowNum BN
ON BN.RowNum = T2.RowNum