I am trying to find the number of users who joined in each month.
This is my SQL, which I learned from another question.
The part that generates the month numbers is easy to understand, but it is long. Is there a neater way to write the same SQL? Thanks.
SELECT
meses.MONTH,
COUNT(Users.user_ID) AS num_of_user
FROM
(
SELECT
1 AS MONTH
UNION
SELECT
2 AS MONTH
UNION
SELECT
3 AS MONTH
UNION
SELECT
4 AS MONTH
UNION
SELECT
5 AS MONTH
UNION
SELECT
6 AS MONTH
UNION
SELECT
7 AS MONTH
UNION
SELECT
8 AS MONTH
UNION
SELECT
9 AS MONTH
UNION
SELECT
10 AS MONTH
UNION
SELECT
11 AS MONTH
UNION
SELECT
12 AS MONTH
) AS meses
LEFT JOIN
Users
ON
meses.month = MONTH(Users.joint_date) AND YEAR(Users.joint_date) = '2000'
GROUP BY
meses.MONTH
In MySQL 8.0, you can use a recursive query to generate the series.
I would also recommend filtering against literal dates rather than applying date functions to the column being filtered: this is much more efficient, and can take advantage of an index on users(joint_date).
with recursive dates as (
select '2020-01-01' dt
union all select dt + interval 1 month from dates where dt + interval 1 month < '2021-01-01'
)
select d.dt, count(u.user_id) as num_of_users
from dates d
left join users u
on u.joint_date >= d.dt
and u.joint_date < d.dt + interval 1 month
group by d.dt
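To try the recursive-series idea without a MySQL server, here is a minimal sketch using Python's standard sqlite3 module. It is a translation under assumptions, not the answer's exact query: SQLite writes the date arithmetic as date(dt, '+1 month') instead of dt + interval 1 month, dates are stored as ISO text, and the users rows are made up for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (user_id INTEGER, joint_date TEXT)")
# Hypothetical sample rows: two users in January 2020, one in March 2020.
con.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "2020-01-05"), (2, "2020-01-20"), (3, "2020-03-15")],
)

rows = con.execute(
    """
    WITH RECURSIVE dates(dt) AS (
        SELECT '2020-01-01'
        UNION ALL
        SELECT date(dt, '+1 month') FROM dates
        WHERE date(dt, '+1 month') < '2021-01-01'
    )
    SELECT d.dt, COUNT(u.user_id) AS num_of_users
    FROM dates d
    LEFT JOIN users u
           ON u.joint_date >= d.dt                      -- half-open range:
          AND u.joint_date < date(d.dt, '+1 month')     -- [month start, next month start)
    GROUP BY d.dt
    ORDER BY d.dt
    """
).fetchall()

for dt, num_of_users in rows:
    print(dt, num_of_users)
```

Every month appears in the output, including those with a count of 0, because the generated series is on the left side of the join.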
In earlier versions, you do need to enumerate the dates, using union. However I would still recommend the literal date technique. That would look like:
select '2020-01-01' + interval n.n month as dt, count(u.user_id) as num_of_users
from (select 0 n union all select 1 ... union all select 11) n
left join users u
on u.joint_date >= '2020-01-01' + interval n.n month
and u.joint_date < '2020-01-01' + interval (n.n + 1) month
group by n.n
Related
I am looking for a MySQL query that takes an old date and returns the equivalent date in each month of the current year. For example, for the date '2005-01-31' I would like to see the following populated into 12 separate month fields:
Jan - '2021-01-31',
Feb - '2021-02-28',
Mar - '2021-03-31'
I have attempted the following query; however, it only returns the same month as the old date, although it does show the equivalent day in the current year:
Select DATE_FORMAT(DATE_ADD('2005-01-31', INTERVAL (YEAR(CURRENT_DATE()) - YEAR('2005-01-31') ) YEAR), '%Y-%m-%d') `date`;
'2021-01-31'
An example for a few months would be much appreciated and I should be able to adapt for the rest of the calendar year myself.
WITH RECURSIVE
cte AS ( SELECT 0 n
UNION ALL
SELECT n + 1 FROM cte WHERE n < 11 )
SELECT DATE_FORMAT(@date, '2021-%m-%d') + INTERVAL n MONTH `date`
FROM cte
This does not work for me on MySQL 5.x.
On MySQL 5.x, which has no recursive CTEs, the numbers can be enumerated with UNION instead:
SELECT DATE_FORMAT(@date, '2021-%m-%d') + INTERVAL n MONTH `date`
FROM (SELECT 0 n UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION
SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION
SELECT 8 UNION SELECT 9 UNION SELECT 10 UNION SELECT 11 ) cte
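The month-end behaviour the question asks for (January 31 becoming February 28) is the clamping that MySQL's + INTERVAL n MONTH applies on its own. As an illustration of that rule only, here is a small Python sketch; the helper name same_day_in_month is hypothetical, and the year 2021 is hard-coded as in the queries above.

```python
import calendar
from datetime import date


def same_day_in_month(old: date, year: int, month: int) -> date:
    """Move `old` to the given year and month, clamping the day to the month's length."""
    last_day = calendar.monthrange(year, month)[1]  # number of days in that month
    return date(year, month, min(old.day, last_day))


old_date = date(2005, 1, 31)
result = [same_day_in_month(old_date, 2021, m) for m in range(1, 13)]
for d in result:
    print(d.isoformat())
```

Note that this sketch always walks January to December of the target year, whereas the queries above start from the month of the stored date and step forward 11 months from there. The two agree when the stored date falls in January, as in the example.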
I have two tables like this:
Table A
date amount B_id
'2020-1-01' 3000000 1
'2019-8-01' 15012 1
'2019-6-21' 90909 1
'2020-1-15' 84562 1
--------
Table B
id type
1 7
2 5
I have to show the sum of amount up to the last date of each month, for the last 12 months.
The query I have prepared looks like this:
SELECT num2.last_dates,
(SELECT SUM(amount) FROM A
INNER JOIN B ON A.B_id = B.id
WHERE B.type = 7 AND A.date<=num2.last_dates
),
(SELECT SUM(amount) FROM A
INNER JOIN B ON A.B_id = B.id
WHERE B.type = 5 AND A.date<=num2.last_dates)
FROM
(SELECT last_dates
FROM (
SELECT LAST_DAY(CURDATE() - INTERVAL CUSTOM_MONTH MONTH) last_dates
FROM(
SELECT 1 CUSTOM_MONTH UNION
SELECT 0 UNION
SELECT 2 UNION
SELECT 3 UNION
SELECT 4 UNION
SELECT 5 UNION
SELECT 6 UNION
SELECT 7 UNION
SELECT 8 UNION
SELECT 9 UNION
SELECT 10 UNION
SELECT 11 UNION
SELECT 12 )num
) num1
)num2
ORDER BY num2.last_dates
This gives me a result like the following, which is exactly what I need. I need this query to execute faster. Is there a better way to do what I am trying to do?
2019-05-31 33488.69 109.127800
2019-06-30 263.690 1248932.227800
2019-07-31 274.690 131.827800
2019-08-31 627.690 13.687800
2019-09-30 1533.370000 08.347800
2019-10-31 1444.370000 01.327800
2019-11-30 5448.370000 247.227800
2019-12-31 61971.370000 016.990450
2020-01-31 19550.370000 2535.185450
2020-02-29 986.370000 405.123300
2020-03-31 1152.370000 26.793300
2020-04-30 9404.370000 11894.683300
2020-05-31 3404.370000 17894.683300
I'd use conditional aggregation, and pre-aggregate the monthly totals in one pass, instead of doing twenty-six individual passes repeatedly through the same data.
I'd start with something like this:
SELECT CASE WHEN A.date < DATE(NOW()) + INTERVAL -14 MONTH
THEN LAST_DAY( DATE(NOW()) + INTERVAL -14 MONTH )
ELSE LAST_DAY( A.date )
END AS _month_end
, SUM(IF( B.type = 5 , A.amount , NULL)) AS tot_type_5
, SUM(IF( B.type = 7 , A.amount , NULL)) AS tot_type_7
FROM A
JOIN B
ON B.id = A.B_id
WHERE B.type IN (5,7)
GROUP
BY _month_end
(The column amount isn't qualified in the original query, so I'm just guessing which table it is from; adjust as necessary. Best practice is to qualify all column references.)
That gets us the subtotals for each month, in a single pass through A and B.
We can get that query tested and tuned.
Then we can incorporate that as an inline view in an outer query which adds up those monthly totals. (I'd do an outer join, just in case rows are missing, so we don't wind up omitting rows.)
Something like this:
SELECT d.dt + INTERVAL -i.n MONTH + INTERVAL -1 DAY AS last_date
, SUM(IFNULL(t.tot_type_5,0)) AS rt_type_5
, SUM(IFNULL(t.tot_type_7,0)) AS rt_type_7
FROM ( -- first day of next month
SELECT DATE(NOW()) + INTERVAL -DAY(DATE(NOW()))+1 DAY + INTERVAL 1 MONTH AS dt
) d
CROSS
JOIN ( -- thirteen integers, integers 0 thru 12
SELECT 0 AS n
UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8
UNION ALL SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 11 UNION ALL SELECT 12
) i
LEFT
JOIN ( -- totals by month
SELECT CASE WHEN A.date < DATE(NOW()) + INTERVAL -14 MONTH
THEN LAST_DAY( DATE(NOW()) + INTERVAL -14 MONTH )
ELSE LAST_DAY( A.date )
END AS _month_end
, SUM(IF( B.type = 5 , A.amount , NULL)) AS tot_type_5
, SUM(IF( B.type = 7 , A.amount , NULL)) AS tot_type_7
FROM A
JOIN B
ON B.id = A.B_id
WHERE B.type IN (5,7)
GROUP
BY _month_end
) t
ON t._month_end < d.dt + INTERVAL -i.n MONTH
GROUP BY d.dt + INTERVAL -i.n MONTH + INTERVAL -1 DAY
ORDER BY d.dt + INTERVAL -i.n MONTH + INTERVAL -1 DAY DESC
The design is meant to do one swoop through the A JOIN B set. We're expecting to get about 14 rows back. The outer query then does an inequality join, duplicating the oldest months multiple times, so approximately 14 x 13 / 2 = 91 rows, which get collapsed into 13 rows.
The big rock in terms of performance is going to be materializing that inline view query.
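The two-step shape described above (one pass for monthly subtotals, then an inequality join that adds them up) can be tried out with this sketch, which uses Python's sqlite3 module rather than MySQL. The assumptions are loud ones: the rows are invented, the reporting months are fixed instead of being derived from NOW(), SQLite's date() modifiers stand in for LAST_DAY(), and amount is taken from table A.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE B (id INTEGER PRIMARY KEY, type INTEGER);
    CREATE TABLE A (date TEXT, amount INTEGER, B_id INTEGER);
    INSERT INTO B VALUES (1, 7), (2, 5);
    -- Hypothetical rows, not the asker's data.
    INSERT INTO A VALUES
        ('2020-01-10', 100, 1),
        ('2020-01-20',  50, 2),
        ('2020-03-05',  30, 1);
    """
)

rows = con.execute(
    """
    WITH RECURSIVE
    firsts(d) AS (            -- first day of each reporting month (fixed here)
        SELECT '2020-01-01'
        UNION ALL
        SELECT date(d, '+1 month') FROM firsts WHERE d < '2020-04-01'
    ),
    months(month_end) AS (    -- last day of each reporting month
        SELECT date(d, '+1 month', '-1 day') FROM firsts
    ),
    totals AS (               -- one pass over A JOIN B: subtotal per month
        SELECT date(A.date, 'start of month', '+1 month', '-1 day') AS month_end,
               SUM(CASE WHEN B.type = 5 THEN A.amount END) AS tot_type_5,
               SUM(CASE WHEN B.type = 7 THEN A.amount END) AS tot_type_7
        FROM A JOIN B ON B.id = A.B_id
        WHERE B.type IN (5, 7)
        GROUP BY month_end
    )
    SELECT m.month_end,
           SUM(IFNULL(t.tot_type_5, 0)) AS rt_type_5,
           SUM(IFNULL(t.tot_type_7, 0)) AS rt_type_7
    FROM months m
    LEFT JOIN totals t ON t.month_end <= m.month_end   -- every earlier subtotal
    GROUP BY m.month_end
    ORDER BY m.month_end
    """
).fetchall()

for row in rows:
    print(row)
```

The base tables are scanned once, inside totals; the inequality join only touches the handful of monthly subtotal rows.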
This is how I'd probably approach this in MySQL 8 with SUM OVER:
Get the last 12 months.
Use these months to add empty month rows to the original data, as MySQL doesn't support full outer joins.
Get the running totals for all months.
Show only the last twelve months.
The query:
with months (date) as
(
select last_day(current_date - interval 1 month) union all
select last_day(current_date - interval 2 month) union all
select last_day(current_date - interval 3 month) union all
select last_day(current_date - interval 4 month) union all
select last_day(current_date - interval 5 month) union all
select last_day(current_date - interval 6 month) union all
select last_day(current_date - interval 7 month) union all
select last_day(current_date - interval 8 month) union all
select last_day(current_date - interval 9 month) union all
select last_day(current_date - interval 10 month) union all
select last_day(current_date - interval 11 month) union all
select last_day(current_date - interval 12 month)
)
, data (date, amount, type) as
(
select last_day(a.date), a.amount, b.type
from a
join b on b.id = a.b_id
where b.type in (5, 7)
union all
select date, null, null from months
)
select
date,
sum(sum(case when type = 5 then amount end)) over (order by date) as t5,
sum(sum(case when type = 7 then amount end)) over (order by date) as t7
from data
group by date
order by date desc
limit 12;
Demo: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=ddeb3ab3e086bfc182f0503615fba74b
I don't know whether this is faster than your own query or not. Just give it a try. (You could make my query much faster by adding a generated column for last_day(date) to your table and using it. If you need this often, that may be an option.)
You are getting some complicated answers. I think it is easier. Start with knowing we can easily sum for each month:
SELECT SUM(amount) as monthtotal,
type,
MONTH(date) as month,
YEAR(date) as year
FROM A LEFT JOIN B on A.B_id=B.id
GROUP BY type,month,year
From that data, we can use a variable to get running total. Best to do by initializing the variable, but not necessary. We can get the data necessary like this
SET @running := 0;
SELECT (@running := @running + monthtotal) as running, type, LAST_DAY(CONCAT(year,'-',month,'-',1))
FROM
(SELECT SUM(amount) as monthtotal,type,MONTH(date) as month,YEAR(date) as year FROM A LEFT JOIN B on A.B_id=B.id GROUP BY type,month,year) AS totals
ORDER BY year,month
You really need to have a connector that supports multiple statements, or make multiple calls to initialize the variable. Although you can null check the variable and default to 0, you still have an issue if you run the query a second time.
Last thing, if you really want the types to be summed separately:
SET @running5 := 0;
SET @running7 := 0;
SELECT
LAST_DAY(CONCAT(year,'-',month,'-',1)),
(@running5 := @running5 + (CASE WHEN type=5 THEN monthtotal ELSE 0 END)) as running5,
(@running7 := @running7 + (CASE WHEN type=7 THEN monthtotal ELSE 0 END)) as running7
FROM
(SELECT SUM(amount) as monthtotal,type,MONTH(date) as month,YEAR(date) as year FROM A LEFT JOIN B on A.B_id=B.id GROUP BY type,month,year) AS totals
ORDER BY year,month
We still don't show months where there is no data. I'm not sure that is a requirement. But this should only need one pass of table A.
Also, make sure the id on table B is indexed.
I'm writing a query that takes a row value and returns the number of records for that value on each day between two given dates, returning 0 for days with no records.
I've written a query which does this for the past week.
Current Query:
select d.day, count(e.event) as count
from (
select 0 day union all
select 1 union all
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6
) d
left join event e
on e.timestamp >= current_date - interval d.day day
and e.timestamp < current_date - interval (d.day - 1) day
and e.event = ?
group by d.day
The problem is that this only returns results for a fixed number of days. I want to be able to give it two dates (a start date and an end date) and get the record counts for each day, without knowing in advance how many days lie in between.
You could use/create a bona-fide calendar table. Something like this:
SELECT
d.day,
COUNT(e.timestamp) AS cnt
FROM
(
SELECT '2020-01-01' AS day UNION ALL
SELECT '2020-01-02' UNION ALL
...
SELECT '2020-12-31'
) d
LEFT JOIN event e
ON e.timestamp >= d.day AND e.timestamp < DATE_ADD(d.day, INTERVAL 1 DAY)
WHERE
d.day BETWEEN <start_date> AND <end_date>
GROUP BY
d.day;
I have covered only the calendar year 2020, but you may extend to cover whatever range you want.
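If maintaining a calendar table is not an option, the day rows can also be generated on the fly from the two bound dates. The following is a sketch under assumptions, not part of the answer above: it runs on SQLite through Python's sqlite3 module, the function name daily_counts and the sample events are hypothetical, and timestamps are stored as ISO text. On MySQL 8.0 the same shape would use WITH RECURSIVE with + INTERVAL 1 DAY, where the default cte_max_recursion_depth of 1000 caps the number of generated days.

```python
import sqlite3


def daily_counts(con, event_name, start_date, end_date):
    """Return (day, count) for every day from start_date to end_date inclusive."""
    return con.execute(
        """
        WITH RECURSIVE days(day) AS (
            SELECT :start_date
            UNION ALL
            SELECT date(day, '+1 day') FROM days WHERE day < :end_date
        )
        SELECT d.day, COUNT(e.timestamp) AS cnt
        FROM days d
        LEFT JOIN event e
               ON e.timestamp >= d.day
              AND e.timestamp < date(d.day, '+1 day')
              AND e.event = :event_name
        GROUP BY d.day
        ORDER BY d.day
        """,
        {"start_date": start_date, "end_date": end_date, "event_name": event_name},
    ).fetchall()


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE event (event TEXT, timestamp TEXT)")
# Hypothetical sample events.
con.executemany(
    "INSERT INTO event VALUES (?, ?)",
    [
        ("login", "2020-01-01 09:00:00"),
        ("login", "2020-01-03 10:30:00"),
        ("login", "2020-01-03 18:45:00"),
        ("logout", "2020-01-02 12:00:00"),
    ],
)
counts = daily_counts(con, "login", "2020-01-01", "2020-01-04")
for day, cnt in counts:
    print(day, cnt)
```

The event filter sits in the ON clause rather than in WHERE, so days without a matching event keep their row and report a count of 0.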
The following query returns the visitors and pageviews of last 7 days. However, if there are no results (let's say it is a fresh account), nothing is returned.
How to edit this in order to return 0 in days that there are no entries?
SELECT Date(timestamp) AS day,
Count(DISTINCT hash) AS visitors,
Count(*) AS pageviews
FROM behaviour
WHERE company_id = 1
AND timestamp >= Subdate(Curdate(), 7)
GROUP BY day
Assuming that you always have at least one record in the table for each of the last 7 days (regardless of the company_id), then you can use conditional aggregation as follows:
select
date(timestamp) as day,
count(distinct case when company_id = 1 then hash end) as visitors,
sum(company_id = 1) as pageviews
from behaviour
where timestamp >= curdate() - interval 7 day
group by day
Note that I changed your query to use standard date arithmetic, which I find easier to understand than date functions.
Otherwise, you would need to move the condition on the date from the where clause to the aggregate functions:
select
date(timestamp) as day,
count(distinct case when timestamp >= curdate() - interval 7 day and company_id = 1 then hash end) as visitors,
sum(timestamp >= curdate() - interval 7 day and company_id = 1) as pageviews
from behaviour
group by day
If your table is big, this can be expensive so I would not recommend that.
Alternatively, you can generate a derived table of dates and left join it with your original query:
select
curdate() - interval x.n day as day,
count(distinct b.hash) visitors,
count(b.hash) page_views
from (
select 1 n union all select 2 union all select 3 union all select 4
union all select 5 union all select 6 union all select 7
) x
left join behaviour b
on b.company_id = 1
and b.timestamp >= curdate() - interval x.n day
and b.timestamp < curdate() - interval (x.n - 1) day
group by x.n
Use a query that returns all the dates from today minus 7 days to today and left join the table behaviour:
SELECT t.timestamp AS day,
Count(DISTINCT b.hash) AS visitors,
Count(b.timestamp) AS pageviews
FROM (
SELECT Subdate(Curdate(), 7) timestamp UNION ALL SELECT Subdate(Curdate(), 6) UNION ALL
SELECT Subdate(Curdate(), 5) UNION ALL SELECT Subdate(Curdate(), 4) UNION ALL SELECT Subdate(Curdate(), 3) UNION ALL
SELECT Subdate(Curdate(), 2) UNION ALL SELECT Subdate(Curdate(), 1) UNION ALL SELECT Curdate()
) t LEFT JOIN behaviour b
ON Date(b.timestamp) = t.timestamp AND b.company_id = 1
GROUP BY day
Use IFNULL:
IFNULL(expr1, 0)
From the documentation:
If expr1 is not NULL, IFNULL() returns expr1; otherwise it returns expr2. IFNULL() returns a numeric or string value, depending on the context in which it is used.
You can use the following trick:
First, write a query that returns one dummy row: SELECT 1;
Next, use LEFT JOIN to attach the summary row(s) with an always-true condition. The join returns the summary values if data exists, and NULL values otherwise.
Finally, select only what we need from the joined queries and convert NULLs to zeros using the IFNULL function.
SELECT
IFNULL(b.day,0) AS DAY,
IFNULL(b.visitors,0) AS visitors,
IFNULL(b.pageviews,0) AS pageviews
FROM (
SELECT 1
) a
LEFT JOIN (
SELECT DATE(TIMESTAMP) AS DAY,
COUNT(DISTINCT HASH) AS visitors,
COUNT(*) AS pageviews
FROM behaviour
WHERE company_id = 1
AND TIMESTAMP >= SUBDATE(CURDATE(), 7)
GROUP BY DAY
) b ON 1 = 1;
I have a table with sell orders and I want to list the COUNT of sell orders per day, between two dates, without leaving date gaps.
This is what I have currently:
SELECT COUNT(*) as Norders, DATE_FORMAT(date, "%M %e") as sdate
FROM ORDERS
WHERE date <= NOW()
AND date >= NOW() - INTERVAL 1 MONTH
GROUP BY DAY(date)
ORDER BY date ASC;
The result I'm getting is as follows:
6 May 1
14 May 4
1 May 5
8 Jun 2
5 Jun 15
But what I'd like to get is:
6 May 1
0 May 2
0 May 3
14 May 4
1 May 5
0 May 6
0 May 7
0 May 8
.....
0 Jun 1
8 Jun 2
.....
5 Jun 15
Is that possible?
Creating a range of dates on the fly and joining that against your orders table:
SELECT sub1.sdate, COUNT(ORDERS.id) as Norders
FROM
(
SELECT DATE_FORMAT(DATE_SUB(NOW(), INTERVAL units.i + tens.i * 10 + hundreds.i * 100 DAY), "%M %e") as sdate
FROM (SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9)units
CROSS JOIN (SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9)tens
CROSS JOIN (SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9)hundreds
WHERE DATE_SUB(NOW(), INTERVAL units.i + tens.i * 10 + hundreds.i * 100 DAY) BETWEEN DATE_SUB(NOW(), INTERVAL 1 MONTH) AND NOW()
) sub1
LEFT OUTER JOIN ORDERS
ON sub1.sdate = DATE_FORMAT(ORDERS.date, "%M %e")
GROUP BY sub1.sdate
This copes with date ranges of up to 1000 days.
Note that it could be made more efficient easily depending on the type of field you are using for your dates.
EDIT - as requested, to get the count of orders per month:-
SELECT aMonth, COUNT(ORDERS.id) as Norders
FROM
(
SELECT DATE_FORMAT(DATE_SUB(NOW(), INTERVAL months.i MONTH), "%Y%m") as sdate, DATE_FORMAT(DATE_SUB(NOW(), INTERVAL months.i MONTH), "%M") as aMonth
FROM (SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 UNION SELECT 10 UNION SELECT 11)months
WHERE DATE_SUB(NOW(), INTERVAL months.i MONTH) BETWEEN DATE_SUB(NOW(), INTERVAL 12 MONTH) AND NOW()
) sub1
LEFT OUTER JOIN ORDERS
ON sub1.sdate = DATE_FORMAT(ORDERS.date, "%Y%m")
GROUP BY aMonth
You are going to need to generate a virtual (or physical) table, containing every date in the range.
That can be done as follows, using a sequence table.
SELECT mintime + INTERVAL seq.seq DAY AS orderdate
FROM (
SELECT CURDATE() - INTERVAL 1 MONTH AS mintime,
CURDATE() AS maxtime
) AS minmax
JOIN seq_0_to_999999 AS seq ON seq.seq < TIMESTAMPDIFF(DAY,mintime,maxtime)
Then, you join this virtual table to your query, as follows.
SELECT IFNULL(orders.Norders,0) AS Norders, /* show zero instead of null*/
DATE_FORMAT(alldates.orderdate, "%M %e") as sdate
FROM (
SELECT mintime + INTERVAL seq.seq DAY AS orderdate
FROM (
SELECT CURDATE() - INTERVAL 1 MONTH AS mintime,
CURDATE() AS maxtime
) AS minmax
JOIN seq_0_to_999999 AS seq
ON seq.seq < TIMESTAMPDIFF(DAY,mintime,maxtime)
) AS alldates
LEFT JOIN (
SELECT COUNT(*) as Norders, DATE(date) AS orderdate
FROM ORDERS
WHERE date <= NOW()
AND date >= NOW() - INTERVAL 1 MONTH
GROUP BY DATE(date)
) AS orders ON alldates.orderdate = orders.orderdate
ORDER BY alldates.orderdate ASC
Notice that you need the LEFT JOIN so the rows in your output result set will be preserved even if there's no data in your ORDERS table.
Where do you get this sequence table seq_0_to_999999? You can make it like this.
DROP TABLE IF EXISTS seq_0_to_9;
CREATE TABLE seq_0_to_9 AS
SELECT 0 AS seq UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9;
DROP VIEW IF EXISTS seq_0_to_999;
CREATE VIEW seq_0_to_999 AS (
SELECT (a.seq + 10 * (b.seq + 10 * c.seq)) AS seq
FROM seq_0_to_9 a
JOIN seq_0_to_9 b
JOIN seq_0_to_9 c
);
DROP VIEW IF EXISTS seq_0_to_999999;
CREATE VIEW seq_0_to_999999 AS (
SELECT (a.seq + (1000 * b.seq)) AS seq
FROM seq_0_to_999 a
JOIN seq_0_to_999 b
);
You can find an explanation of all this in more detail at http://www.plumislandmedia.net/mysql/filling-missing-data-sequences-cardinal-integers/
If you're using MariaDB version 10+, these sequence tables are built in.
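As a quick check of the digit cross-join trick behind these sequence views, here is a minimal sketch that rebuilds seq_0_to_9 and seq_0_to_999 in an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here; the arithmetic is the same as in the MySQL definitions above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE seq_0_to_9 AS
        SELECT 0 AS seq UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
        UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9;
    -- Three copies of the digits: a = units, b = tens, c = hundreds.
    CREATE VIEW seq_0_to_999 AS
        SELECT a.seq + 10 * (b.seq + 10 * c.seq) AS seq
        FROM seq_0_to_9 a
        CROSS JOIN seq_0_to_9 b
        CROSS JOIN seq_0_to_9 c;
    """
)

count, lowest, highest, distinct = con.execute(
    "SELECT COUNT(*), MIN(seq), MAX(seq), COUNT(DISTINCT seq) FROM seq_0_to_999"
).fetchone()
print(count, lowest, highest, distinct)  # 1000 0 999 1000
```

Each of the 10 x 10 x 10 digit combinations maps to exactly one integer, which is why the view yields every value from 0 to 999 once.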
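First, create a calendar table with one row per date, then left join ORDERS to it. The date filter must be applied to the calendar column; filtering on O.date in the WHERE clause would discard the days that have no orders.
SELECT COUNT(O.date) as Norders, DATE_FORMAT(C.date, "%M %e") as sdate
FROM Calendar C
LEFT JOIN ORDERS O ON C.date = DATE(O.date)
WHERE C.date <= CURDATE() AND C.date >= CURDATE() - INTERVAL 1 MONTH
GROUP BY C.date
ORDER BY C.date ASC;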