Records count per day, including 0 values - mysql

I'm trying to write a query that shows the number of visits per day for the last 7 days. The query I came up with works, but it has a limitation I don't know how to get rid of.
Imagine it is August 4th, 2019. Our table visits keeps timestamps of users' visits to a website:
ID | timestamp
1 | 2019-08-03
2 | 2019-08-03
3 | 2019-08-02
4 | 2019-07-31
5 | 2019-07-31
6 | 2019-07-31
7 | 2019-07-31
8 | 2019-07-30
9 | 2019-07-30
10 | 2019-07-28
Objective: get number of visits to a website per day for the last 7 days. So the result should be something like:
DATE | NumberOfVisitis
2019-08-04 | 0
2019-08-03 | 2
2019-08-02 | 1
2019-08-01 | 0
2019-07-31 | 4
2019-07-30 | 2
2019-07-29 | 0
My query includes only dates that are present in the DB (it excludes days with no visits). This makes sense, as the query is driven by the data rather than by the calendar.
SELECT DATE_FORMAT(`timestamp`, "%Y%m/%d") AS Date, COUNT(`id`) AS NumberOfVisitis
FROM `visits`
WHERE `timestamp` >= DATE_ADD(NOW(), INTERVAL -7 DAY)
GROUP BY DAY(`timestamp`)
ORDER BY `timestamp` DESC
Can you please let me know how I can modify my query to include days with no visits in the result?

MySQL lacks anything like Postgres's generate_series so we have to fake it.
Simplest thing to do is to make a table with a bunch of numbers in it. This will be useful for generating lots of things.
create table numbers ( number serial );
insert into numbers () values (), (), (), (), (), (), ();
From that we can generate a list of the last 7 days.
select date_sub(date(now()), interval number-1 day) as date
from numbers
order by number
limit 7
Then using that as a CTE (available from MySQL 8.0; use a subquery on older versions) we left join it with visits. A left join means all dates will be present, even those with no matching visit.
with dates as (
select date_sub(date(now()), interval number-1 day) as date
from numbers
order by number
limit 7
)
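-- count(visits.id) counts matching rows and is 0 when the left join finds no visit
select dates.date, count(visits.id) as NumberOfVisits
from dates
left join visits on dates.date = date(visits.timestamp)
group by dates.date
order by dates.date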
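The same idea (build the calendar first, then attach counts to it) can be cross-checked outside the database. The following is only a Python sketch of the logic, not part of the SQL answer; the function name visits_per_day and the pinned "today" of 2019-08-04 are assumptions made for illustration, and the dates are the sample rows from the question.

```python
from collections import Counter
from datetime import date, timedelta


def visits_per_day(visit_dates, today, days=7):
    """Return (day, count) pairs for the last `days` days, newest first.

    Days without any visit are still present, with a count of 0,
    which mirrors what the LEFT JOIN against a generated date list does.
    """
    counts = Counter(visit_dates)
    calendar = [today - timedelta(days=n) for n in range(days)]
    return [(day, counts.get(day, 0)) for day in calendar]


# Sample rows from the question; "today" is pinned instead of using the clock.
visits = [
    date(2019, 8, 3), date(2019, 8, 3), date(2019, 8, 2),
    date(2019, 7, 31), date(2019, 7, 31), date(2019, 7, 31), date(2019, 7, 31),
    date(2019, 7, 30), date(2019, 7, 30), date(2019, 7, 28),
]
result = visits_per_day(visits, today=date(2019, 8, 4))
for day, count in result:
    print(day, count)
```

The calendar list plays the role of the dates CTE, and counts.get(day, 0) plays the role of count() over the left-joined rows.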

Related

How to add day gaps to mysql query without calendar table

I would like to receive the sum of all requests of the last 10 days grouped by date per day.
If there was no request on a day, the corresponding date should appear with sumrequests = 0.
My current query (today is the date 2020-01-10):
SELECT
count( 0 ) AS sumrequests,
cast( requests.created_at AS date ) AS created
FROM
requests
WHERE
(
requests.created_at
BETWEEN ( curdate() - INTERVAL 10 DAY )
AND ( curdate() + INTERVAL 1 DAY ))
GROUP BY
cast(requests.created_at AS date)
I then receive the following list:
sumrequests | created
--------------------------
3 | 2020-01-05
100 | 2020-01-08
But it should give back:
sumrequests | created
--------------------------
0 | 2020-01-01
0 | 2020-01-02
0 | 2020-01-03
0 | 2020-01-04
3 | 2020-01-05
0 | 2020-01-06
0 | 2020-01-07
100 | 2020-01-08
0 | 2020-01-09
0 | 2020-01-10
How can I get this without an additional calendar table.
Thanks for help!
For just 10 days of data, you can simply enumerate the numbers; using this derived number table, you can generate the corresponding date range, left join it with the table and aggregate.
SELECT
COALESCE(count(r.created_at), 0) AS sumrequests,
CURDATE() - INTERVAL (n.i) DAY AS created
FROM (
select 0 i union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6 union all select 7
union all select 8 union all select 9 union all select 10
) n
LEFT JOIN requests r
ON r.created_at >= CURDATE() - INTERVAL n.i DAY
AND r.created_at < CURDATE() - INTERVAL (n.i - 1) DAY
GROUP BY n.i
ORDER BY n.i DESC
Side notes:
Generally you want to avoid applying functions in join or filter conditions, since that prevents the use of an index; I modified your filters so they do not use CAST().
Since we are left joining, we need to count something that comes from the requests table, hence COUNT(r.created_at) instead of COUNT(0). COUNT(0) would also count the NULL-extended row and report 1 for days with no requests.

SUM timestampdiff of multiple durations per day

I have a table of data from my thermostat.
It records data as follows.
When the heating switches on, I get a timestamp with Status 1, meaning on; Status 0 means the heating switched off. Additionally, with every on/off event it gives me the total number of heatings for that day.
Date | Status | Total_heatings
2019-01-20 10:00:00 | 1 | 1
2019-01-20 10:10:00 | 0 | 1
2019-01-20 14:00:00 | 1 | 2
2019-01-20 14:25:00 | 0 | 2
2019-01-20 18:00:00 | 1 | 3
2019-01-20 18:15:00 | 0 | 3
2019-01-21 01:00:00 | 1 | 1
2019-01-21 01:30:00 | 0 | 1
2019-01-21 06:00:00 | 1 | 2
2019-01-21 06:15:00 | 0 | 2
I'm trying to get the total duration by day. I tried the below script, which gives me the durations for the multiple heating sessions for each day.
When I use SUM(TIMESTAMPDIFF(Minute,Min(Date),MAX(Date))) it throws an error because of wrong usage of grouping.
SELECT
DATE_FORMAT(Date, '%d.%m') AS 'day',
TIMESTAMPDIFF(MINUTE,MIN(Date),MAX(Date)) AS 'Duration'
FROM thermostat
WHERE (Date BETWEEN '2019-01-21 00:00:00' + INTERVAL -7 DAY AND '2019-01-21 00:00:00')
GROUP BY DAY(Date),Total_heatings;
All I would need is to get a SUM by day of these various heating sessions per day.
So the result should have the following:
Day | Duration
20.01 | 50
21.01 | 45
Now I'm stuck: I can't sum all the heating sessions per day to get the total duration for each day.
Thanks a lot for any pointers and help.
This query will work for MySQL versions before 8.0. It uses a self join to find the matching heater-off row for a given heater-on row. Where a matching row doesn't exist, it uses either the end of the day or the current time, whichever is earlier.
SELECT DATE_FORMAT(t1.Date, '%d.%m') AS `day`,
SUM(TIMESTAMPDIFF(MINUTE, t1.Date, COALESCE(t2.Date, LEAST(NOW(), DATE(t1.Date) + INTERVAL 1 DAY)))) AS Duration,
MAX(t1.Total_heatings) AS Total_heatings
FROM thermostat t1
LEFT JOIN thermostat t2 ON t2.Status = 0 AND t2.Total_heatings = t1.Total_heatings AND DATE(t2.Date) = DATE(t1.Date)
WHERE t1.Status = 1 AND DATE(t1.Date) BETWEEN '2019-01-21' - INTERVAL 7 DAY AND '2019-01-21'
GROUP BY `day`
Output:
day Duration Total_heatings
20.01 50 3
21.01 45 2
Demo on dbfiddle
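To sanity-check the expected totals independently of MySQL, the pairing of each "on" row with the following "off" row can be sketched in Python. This is only an illustration under assumptions: the function name heating_minutes_per_day is made up, and unlike the SQL above it simply ignores an "on" event that has no matching "off" instead of capping it at the end of the day.

```python
from collections import defaultdict
from datetime import datetime


def heating_minutes_per_day(events):
    """Sum the minutes between each 'on' (1) event and the next 'off' (0) event.

    `events` is an iterable of (datetime, status) pairs. Durations are
    attributed to the day on which the heater was switched on.
    """
    totals = defaultdict(int)
    switched_on_at = None
    for stamp, status in sorted(events):
        if status == 1:
            switched_on_at = stamp
        elif switched_on_at is not None:
            minutes = int((stamp - switched_on_at).total_seconds() // 60)
            totals[switched_on_at.strftime("%d.%m")] += minutes
            switched_on_at = None
    return dict(totals)


# Sample rows from the question.
rows = [
    ("2019-01-20 10:00:00", 1), ("2019-01-20 10:10:00", 0),
    ("2019-01-20 14:00:00", 1), ("2019-01-20 14:25:00", 0),
    ("2019-01-20 18:00:00", 1), ("2019-01-20 18:15:00", 0),
    ("2019-01-21 01:00:00", 1), ("2019-01-21 01:30:00", 0),
    ("2019-01-21 06:00:00", 1), ("2019-01-21 06:15:00", 0),
]
events = [(datetime.fromisoformat(stamp), status) for stamp, status in rows]
durations = heating_minutes_per_day(events)
print(durations)
```

For the sample rows this gives 50 minutes for 20.01 and 45 minutes for 21.01, the same figures as the query output.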
If you are using MySQL 8, you can use window function LAG to access the previous switch. In the outer query, you can filter on intervals where the previous status was on.
SELECT
DATE_FORMAT(x.date, '%d.%m'),
SUM(TIMESTAMPDIFF(MINUTE, x.last_date, x.date)) duration
FROM (
SELECT
t.*,
LAG(t.date) OVER (PARTITION BY DATE_FORMAT(t.date, '%d.%m') ORDER BY t.date) last_date,
LAG(t.status) OVER (PARTITION BY DATE_FORMAT(t.date, '%d.%m') ORDER BY t.date) last_status
FROM mytable t
) x
WHERE x.last_status = 1
GROUP BY DATE_FORMAT(x.date, '%d.%m')
ORDER BY 1
In this db fiddle, this matches your expected output.
Using window function available in MySQL-8.0 and MariaDB-10.2:
-- ontime is the number of seconds between an "on" row and the next row
select date(from_unixtime(ts)) as `day`, sum(ontime) as `on time`
from (
select ts, status, lead(ts,1,ts) over w - ts as ontime
from (
select unix_timestamp(ts) as ts, status
from t
) x
window w as (order by ts)
) y
where status=1
group by `day`;

Finding date where conditions within 30 days has elapsed

For my website, I have a loyalty program where a customer gets some goodies if they've spent $100 within the last 30 days. A query like below:
SELECT u.username, SUM(total-shipcost) as tot
FROM orders o
LEFT JOIN users u
ON u.userident = o.user
WHERE shipped = 1
AND user = :user
AND date >= DATE(NOW() - INTERVAL 30 DAY)
:user being their user ID. Column 2 of this result gives how much a customer has spent in the last 30 days, if it's over 100, then they get the bonus.
I want to display to the user which day they'll leave the loyalty program. Something like "x days until bonus expires", but how do I do this?
Take today's date, March 16th, and a user's order history:
id | tot | date
-----------------------
84 38 2016-03-05
76 21 2016-02-29
74 49 2016-02-20
61 42 2015-12-28
This user is part of the loyalty program now but leaves it on March 20th. What SQL could I do which returns how many days (4) a user has left on the loyalty program?
If the user then placed another order:
id | tot | date
-----------------------
87 12 2016-03-09
They're still in the loyalty program until the 20th, so the days remaining doesn't change in this instance, but if the total were 50 instead, then they instead leave the program on the 29th (so instead of 4 days it's 13 days remaining). For what it's worth, I care only about 30 days prior to the current date. No consideration for months with 28, 29, 31 days is needed.
Some create table code:
create table users (
userident int,
username varchar(100)
);
insert into users values
(1, 'Bob');
create table orders (
id int,
user int,
shipped int,
date date,
total decimal(6,2),
shipcost decimal(3,2)
);
insert into orders values
(84, 1, 1, '2016-03-05', 40.50, 2.50),
(76, 1, 1, '2016-02-29', 22.00, 1.00),
(74, 1, 1, '2016-02-20', 56.31, 7.31),
(61, 1, 1, '2015-12-28', 43.10, 1.10);
An example output of what I'm looking for is:
userident | username | days_left
--------------------------------
1 Bob 4
This is using March 16th as today for use with DATE(NOW()) to remain consistent with the previous bits of the question.
The following is basically how to do what you want. Note that references to "30 days" are rough estimates and what you may be looking for is "29 days" or "31 days" as works to get the exact date that you want.
Retrieve the list of dates and amounts that are still active, i.e., within the last 30 days (as you did in your example), as a table (I'll call it Active) like the one you showed.
Join that new table (Active) with the original table, where a row from Active is joined to all of the rows of the original table whose date is on or after the Active row's date. Compute a total of the amounts from the original table. The new table would have a Date field from Active and a Total field that is the sum of all the amounts in the joined records from the original table.
Select from the resulting table all records where the Total is at least 100.00 and create a new table with Date and the minimum Total of those records.
Compute 30 days ahead from those dates to find the ending date of their loyalty program.
You would need to take the following steps (per user):
join the orders table with itself to calculate sums for different (bonus) starting dates, for any of the starting dates that are in the last 30 days
select from those records only those starting dates which yield a sum of 100 or more
select from those records only the one with the most recent starting date: this is the start of the bonus period for the selected user.
Here is a query to do that:
SELECT u.userident,
u.username,
MAX(base.date) AS bonus_start,
DATE(MAX(base.date) + INTERVAL 30 DAY) AS bonus_expiry,
30-DATEDIFF(NOW(), MAX(base.date)) AS bonus_days_left
FROM users u
LEFT JOIN (
SELECT o.user,
first.date AS date,
SUM(o.total-o.shipcost) as tot
FROM orders first
INNER JOIN orders o
ON o.user = first.user
AND o.shipped = 1
AND o.date >= first.date
WHERE first.shipped = 1
AND first.date >= DATE(NOW() - INTERVAL 30 DAY)
GROUP BY o.user,
first.date
HAVING SUM(o.total-o.shipcost) >= 100
) AS base
ON base.user = u.userident
GROUP BY u.username,
u.userident
Here is a fiddle.
With this input as orders:
+----+------+---------+------------+-------+----------+
| id | user | shipped | date | total | shipcost |
+----+------+---------+------------+-------+----------+
| 61 | 1 | 1 | 2015-12-28 | 42 | 0 |
| 74 | 1 | 1 | 2016-02-20 | 49 | 0 |
| 76 | 1 | 1 | 2016-02-29 | 21 | 0 |
| 84 | 1 | 1 | 2016-03-05 | 38 | 0 |
| 87 | 1 | 1 | 2016-03-09 | 50 | 0 |
+----+------+---------+------------+-------+----------+
The above query will return this output (when executed on 2016-03-20):
+-----------+----------+-------------+--------------+-----------------+
| userident | username | bonus_start | bonus_expiry | bonus_days_left |
+-----------+----------+-------------+--------------+-----------------+
| 1 | John | 2016-02-29 | 2016-03-30 | 10 |
+-----------+----------+-------------+--------------+-----------------+
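The three steps behind the query can also be traced in plain Python, which makes it easier to check the dates by hand. This is a sketch under assumptions, not the answer's implementation: the function name bonus_window and its parameters are hypothetical, the amounts are already net of shipping, and "today" is pinned to 2016-03-20 to match the output above.

```python
from datetime import date, timedelta


def bonus_window(orders, today, threshold=100, window_days=30):
    """Return (bonus_start, bonus_expiry, days_left) for one user, or None.

    `orders` is a list of (order_date, amount) pairs for shipped orders.
    A candidate start date qualifies if it lies within the window and the
    orders placed on or after it add up to at least `threshold`.
    """
    cutoff = today - timedelta(days=window_days)
    qualifying_starts = [
        start for start, _ in orders
        if start >= cutoff
        and sum(amount for placed, amount in orders if placed >= start) >= threshold
    ]
    if not qualifying_starts:
        return None
    bonus_start = max(qualifying_starts)  # most recent qualifying start
    bonus_expiry = bonus_start + timedelta(days=window_days)
    days_left = window_days - (today - bonus_start).days
    return bonus_start, bonus_expiry, days_left


# The orders from the input table above.
orders = [
    (date(2015, 12, 28), 42),
    (date(2016, 2, 20), 49),
    (date(2016, 2, 29), 21),
    (date(2016, 3, 5), 38),
    (date(2016, 3, 9), 50),
]
result = bonus_window(orders, today=date(2016, 3, 20))
print(result)
```

Taking the most recent qualifying start date matters: once the order on that date drops out of the 30-day window, the remaining orders no longer reach the threshold.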
Simple solution
Seeing how you do your first query, I guessed that when you are at the point where you look for the "expiration date", you already know that the user meets the 100 points over last 30 days. Then you can do this :
SELECT DATE_ADD(MIN(date),INTERVAL 30 DAY)
FROM orders o
WHERE shipped = 1
AND user = :user
AND date >= (DATE(NOW() - INTERVAL 30 DAY))
It takes the earliest order date of a user over the last 30 days and adds 30 days to the result.
But that really is a poor design for what you want to achieve.
You would do better to think further ahead and implement what comes next.
Advanced solution
In order to reproduce all the following solution, I have used the Fiddle that Trincot kindly built, and expanded it to test on more data : 4 users having 4 orders.
SQL Fiddle http://sqlfiddle.com/#!9/668939/1
Step 1 : Design
The following query will return all the users meeting the loyalty program criteria, along with their earliest order date within the last 30 days, the loyalty program expiration date calculated from that earliest date, and the number of days before it expires.
SELECT o.user, u.username, SUM(total-shipcost) as tot, MIN(date) AS mindate,
DATE_ADD(MIN(date),INTERVAL 30 DAY) AS expirationdate,
DATEDIFF(DATE_ADD(MIN(date),INTERVAL 30 DAY), DATE(NOW())) AS daysleft
FROM orders o
LEFT JOIN users u
ON u.userident = o.user
WHERE shipped = 1
AND date >= DATE(NOW() - INTERVAL 30 DAY)
GROUP BY user
HAVING tot >= 100;
Now, create a VIEW with the above query
CREATE VIEW loyalty_program AS
SELECT o.user, u.username, SUM(total-shipcost) as tot, MIN(date) AS mindate,
DATE_ADD(MIN(date),INTERVAL 30 DAY) AS expirationdate,
DATEDIFF(DATE_ADD(MIN(date),INTERVAL 30 DAY), DATE(NOW())) AS daysleft
FROM orders o
LEFT JOIN users u
ON u.userident = o.user
WHERE shipped = 1
AND date >= DATE(NOW() - INTERVAL 30 DAY)
GROUP BY user
HAVING tot >= 100;
It is important to understand that this is only a one-shot action on your database.
Step 2 : Use your new VIEW
Once you have the view, you can get easily, for all users, the "state" of the loyalty program:
SELECT * FROM loyalty_program
user username tot mindate expirationdate daysleft
1 John 153 February, 28 2016 March, 29 2016 9
2 Joe 112 February, 24 2016 March, 25 2016 5
3 Jack 474 February, 23 2016 March, 24 2016 4
4 Averel 115 February, 22 2016 March, 23 2016 3
For a specific user, you can get the date you are looking for like this:
SELECT expirationdate FROM loyalty_program WHERE username='Joe'
You can also request all the users for which the expiration date is today
SELECT user FROM loyalty_program WHERE expirationdate=DATE(NOW())
But there are other easy possibilities that you'll discover after having played with your VIEW.
Conclusion
Make your life easier: learn to use VIEWS !
I am assuming your table looks like this:
user | id | total | date
-------------------------------
12 84 38 2016-03-05
12 76 21 2016-02-29
23 74 49 2016-02-20
23 61 42 2015-12-28
then try this:
SELECT x.user, x.date, x.id, x.cum_sum, d.start_date, DATEDIFF(NOW(), x.date)
FROM (
SELECT a.user, a.id, a.date, a.total,
(SELECT SUM(b.total) FROM order_table b WHERE b.date <= a.date AND a.user = b.user ORDER BY b.user, b.id DESC) AS cum_sum
FROM order_table a
WHERE a.date >= DATE(NOW() - INTERVAL 30 DAY)
ORDER BY a.user, a.id DESC
) AS x
LEFT JOIN (
SELECT c.user, c.date AS start_date, c.id
FROM (
SELECT a.user, a.id, a.date, a.total,
(SELECT SUM(b.total) FROM order_table b WHERE b.date <= a.date AND a.user = b.user ORDER BY b.user, b.id DESC) AS cum_sum
FROM order_table a
WHERE a.date >= DATE(NOW() - INTERVAL 30 DAY)
ORDER BY a.user, a.id DESC
) AS c
WHERE FLOOR(c.cum_sum/100) = MIN(FLOOR(c.cum_sum/100)) AND MOD(c.cum_sum, 100) = MAX(MOD(c.cum_sum, 100))
GROUP BY CONCAT(c.user, "_", c.id)
) AS d ON CONCAT(x.user, "_", x.id) = CONCAT(d.user, "_", d.id)
WHERE x.date = d.start_date;
You will get a table something like this:
user | Date | cum_sum | start_date | Time_left
----------------------------------------------------
12 2016-03-05 423 2016-03-05 24
13 2016-02-29 525 2016-02-29 12
23 2016-02-20 944 2016-02-20 3
29 2015-12-28 154 2015-12-28 4
I have not tested this. What I am trying to do is create a table in descending order of id and user, with a cumulative total column alongside. From that table I have created a second one holding, for each user, the relevant date (i.e. the date from which the date difference is to be calculated). I have left joined these two tables on the condition x.date = d.start_date. I have put both start_date and date in the result to check whether the query is working.
Also, this is not the most efficient way of writing this code, but I have tried to stay as safe as possible by using subqueries, since I did not have the data to test it. Let me know if you run into any errors.

date_add and repeat forever

I got an alert table for users, in which we have to send alerts to users in user defined intervals like 0 ( only once), 3 months, 6 months, 1 year
So I designed a table like this
id | user_id | alert_date | repeat_int
-----+--------------+-------------------------+-------------
12 | 747 | 2013-04-19 00:00:00 | 0
13 | 746 | 2013-03-19 00:00:00 | 1
14 | 745 | 2012-04-19 00:00:00 | 0
15 | 744 | 2013-04-19 00:00:00 | 0
16 | 743 | 2013-05-19 00:00:00 | 0
We are sending alert just a day before "alert_date"
With the following query I can fetch the data
SELECT al.id,
al.user_id,
al.alert_date,
al.repeat_int AS repunit
FROM alerts AS al
WHERE DATE_ADD(alert_date,INTERVAL repeat_int MONTH)=date_add(CURRENT_DATE,INTERVAL 1 DAY)
OR date(al.alert_date)=date_add(CURRENT_DATE,INTERVAL 1 DAY)
It's working fine, but my real problem is that the repeat only works once; we need it to repeat at every interval.
I.e. if the alert date is 2012-03-14 and repeat_int is 0, it needs to fire only once,
but if the alert date is 2012-03-14 and repeat_int is 1, it needs to fire on the 14th of every month from 2012-03-14 onwards,
and if the alert date is 2012-03-14 and repeat_int is 3, it needs to fire on the 14th of every third month, i.e. alert on 2012-03-14, 2012-06-14, 2012-09-14, etc.
Is there any way to do that?
Update
The OP has changed his schema in response to comments, so the query is essentially:
SELECT *
FROM alerts
WHERE CURRENT_DATE + INTERVAL 1 DAY = COALESCE(next_alert_date, alert_date);
This handles "next_alert_date" being NULL on the very first run.
Original answer
For the original schema:
SELECT *
FROM alerts
JOIN (SELECT CURRENT_DATE + INTERVAL 1 DAY AS tomorrow) d
WHERE -- We want to alert if
-- 1. Tomorrow is the alert_date
tomorrow = alert_date
OR
--
-- 2. Tomorrow is "repeat_int" months removed from alert_date, falling on
-- the same day of the month or on the end of the month if the original
-- alert_date day of month is later in the month than is possible for us
-- now. E.g., 2013-01-31 repeated monthly is adjusted to 2013-02-28.
(
PERIOD_DIFF(DATE_FORMAT(tomorrow, '%Y%m'), DATE_FORMAT(alert_date, '%Y%m'))
MOD repeat_int = 0
AND
-- Make sure we are at the same day of the month
(DAYOFMONTH(tomorrow) = DAYOFMONTH(alert_date)
OR
-- Or, if the day of the alert is beyond the last day of our month,
-- that we are at the end of our month.
(DAYOFMONTH(alert_date) > DAYOFMONTH(LAST_DAY(tomorrow))
AND
tomorrow = LAST_DAY(tomorrow)))
);
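The month arithmetic in that WHERE clause is easy to get subtly wrong, so here is a small Python sketch of the same rule that can be tested against a few dates. The function name alert_due_tomorrow is an assumption for illustration, and the sketch adds one guard the SQL does not have: it never fires a repeat before the original alert_date.

```python
import calendar
from datetime import date, timedelta


def alert_due_tomorrow(alert_date, repeat_months, today):
    """Return True if an alert should be sent today for something due tomorrow."""
    tomorrow = today + timedelta(days=1)
    if tomorrow == alert_date:
        return True  # case 1: the original alert date itself
    if repeat_months <= 0 or tomorrow < alert_date:
        return False  # one-shot alert, or (added guard) before the first alert
    months_apart = ((tomorrow.year - alert_date.year) * 12
                    + (tomorrow.month - alert_date.month))
    if months_apart % repeat_months != 0:
        return False
    # Clamp to the end of the month, e.g. the 31st becomes the 28th in February.
    last_day = calendar.monthrange(tomorrow.year, tomorrow.month)[1]
    return tomorrow.day == min(alert_date.day, last_day)


quarterly = alert_due_tomorrow(date(2012, 3, 14), 3, today=date(2012, 6, 13))
off_cycle = alert_due_tomorrow(date(2012, 3, 14), 3, today=date(2012, 5, 13))
clamped = alert_due_tomorrow(date(2013, 1, 31), 1, today=date(2013, 2, 27))
print(quarterly, off_cycle, clamped)
```

months_apart corresponds to PERIOD_DIFF on the '%Y%m' values, and the min() clamp corresponds to the end-of-month branch of the query.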

Showing today's current rank and yesterday's

I have a table with IDs, rank, chart_date, and pageviews. It's based on a cron job that is run nightly and compiles the number of pageviews for that ID.
For instance:
ID | RANK | PAGEVIEWS | CHART_DATE
5 1 100 2012-10-14
9 2 75 2012-10-14
13 3 25 2012-10-14
9 1 123 2012-10-13
5 2 74 2012-10-13
19 3 13 2012-10-13
So I'm grabbing today's chart based on 2012-10-14 and ranking the data by 1-3. But I also want to show the rank where the ID was on the previous date.
For instance, on 2012-10-14 ID 5 was ranked 1 but on 2012-10-13 it was ranked 2.
Can I do this with one query? Or do I have to loop thru the results based on today and do a query for each ID?
Can I do this with one query?
You can, but you need a JOIN between the table with today's date and the table with yesterday's date:
SELECT today.*, yesterday.rank
FROM yourtable AS today
JOIN yourtable AS yesterday
ON (today.id = yesterday.id
AND today.chart_date = date(now())
AND yesterday.chart_date = date(date_sub(now(), interval 1 day))
)
ORDER BY today.rank;
You can even show the difference:
SELECT today.*, yesterday.rank AS yest, yesterday.rank-today.rank AS incr
FROM yourtable AS today
LEFT JOIN yourtable AS yesterday
ON (today.id = yesterday.id
AND yesterday.chart_date = date(date_sub(now(), interval 1 day))
)
WHERE today.chart_date = date(now())
ORDER BY today.rank;
ID | RANK | PAGEVIEWS | CHART_DATE | YEST | INCR
5 1 100 2012-10-14 2 | 1
9 2 75 2012-10-14 1 | -1
13 3 25 2012-10-14 4 | 1
(LEFT JOIN ensures today's data is there even if yesterday's isn't).
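For comparison, the same lookup can be sketched in Python over the sample rows from the question. The function name chart_with_previous_rank is an assumption, and "today" is pinned to 2012-10-14; an ID with no row for yesterday gets None, just as the LEFT JOIN would return NULL.

```python
from datetime import date, timedelta


def chart_with_previous_rank(rows, today):
    """Return (id, rank, yesterday_rank, change) tuples for today's chart."""
    yesterday = today - timedelta(days=1)
    previous = {r["id"]: r["rank"] for r in rows if r["chart_date"] == yesterday}
    todays_rows = sorted(
        (r for r in rows if r["chart_date"] == today),
        key=lambda r: r["rank"],
    )
    chart = []
    for r in todays_rows:
        yest = previous.get(r["id"])  # None when the ID was not charted yesterday
        change = None if yest is None else yest - r["rank"]
        chart.append((r["id"], r["rank"], yest, change))
    return chart


# Sample rows from the question.
rows = [
    {"id": 5, "rank": 1, "pageviews": 100, "chart_date": date(2012, 10, 14)},
    {"id": 9, "rank": 2, "pageviews": 75, "chart_date": date(2012, 10, 14)},
    {"id": 13, "rank": 3, "pageviews": 25, "chart_date": date(2012, 10, 14)},
    {"id": 9, "rank": 1, "pageviews": 123, "chart_date": date(2012, 10, 13)},
    {"id": 5, "rank": 2, "pageviews": 74, "chart_date": date(2012, 10, 13)},
    {"id": 19, "rank": 3, "pageviews": 13, "chart_date": date(2012, 10, 13)},
]
chart = chart_with_previous_rank(rows, today=date(2012, 10, 14))
for entry in chart:
    print(entry)
```

A positive change means the ID moved up the chart since yesterday, matching the sign of yesterday.rank - today.rank in the query.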
Untested but something like this should work:
select today.id, today.rank, yesterday.rank as yesterday_rank
from mytable as today
left join mytable as yesterday
on today.id = yesterday.id
and yesterday.chart_date = today.chart_date - interval 1 day
where today.chart_date = '2012-10-14'
order by today.pageviews desc limit 3