I have a table with:
user_id | order_date
---------+------------
12 | 2014-03-23
12 | 2014-01-24
14 | 2014-01-26
16 | 2014-01-23
15 | 2014-03-21
20 | 2013-10-23
13 | 2014-01-25
16 | 2014-03-23
13 | 2014-01-25
14 | 2014-03-22
An active user is someone who has logged in within the last 12 months.
I need the output as:
Period | count of Active user
----------------------------
Oct-2013 - 1
Jan-2014 - 5
Mar-2014 - 10
(The Jan-2014 value includes the 1 record from Oct-2013 plus the 4 non-duplicate records for Jan-2014.)
You can use a variable to calculate the running total of active users:
SELECT Period,
@total:=@total+cnt AS `Count of Active Users`
FROM (
SELECT CONCAT(MONTHNAME(order_date), '-', YEAR(order_date)) AS Period,
COUNT(DISTINCT user_id) AS cnt
FROM mytable
GROUP BY Period
ORDER BY YEAR(order_date), MONTH(order_date) ) t,
(SELECT @total:=0) AS var
The subquery returns the number of distinct active users per month/year. The outer query uses the @total variable to calculate the running total of the active users' count.
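As a rough cross-check of the running-total idea, here is a minimal Python sketch that uses the standard-library sqlite3 module in place of MySQL (an assumption made only so the example is self-contained). It groups the question's sample rows by month and accumulates the distinct-user counts, which is the job the user variable does in the query.

```python
import sqlite3
from itertools import accumulate

# Sample rows from the question: (user_id, order_date).
rows = [
    (12, "2014-03-23"), (12, "2014-01-24"), (14, "2014-01-26"),
    (16, "2014-01-23"), (15, "2014-03-21"), (20, "2013-10-23"),
    (13, "2014-01-25"), (16, "2014-03-23"), (13, "2014-01-25"),
    (14, "2014-03-22"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (user_id INTEGER, order_date TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)

# Distinct users per calendar month, oldest month first.
per_month = conn.execute(
    """
    SELECT strftime('%Y-%m', order_date) AS period,
           COUNT(DISTINCT user_id)       AS cnt
    FROM mytable
    GROUP BY period
    ORDER BY period
    """
).fetchall()

# Running total of the monthly counts.
periods = [period for period, _ in per_month]
running = list(accumulate(cnt for _, cnt in per_month))
result = list(zip(periods, running))
print(result)
```

Note that a running total counts a user again in every month in which they appear, so on this data the last month comes out as 9 even though only 6 distinct users exist.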
I've got two queries that do the job. I am not sure which one is faster, so check them against your database:
Query 1:
select per.yyyymm,
(select count(DISTINCT o.user_id) from orders o where o.order_date >=
(per.yyyymm - INTERVAL 1 YEAR) and o.order_date < per.yyyymm + INTERVAL 1 MONTH) as `count`
from
(select DISTINCT LAST_DAY(order_date) + INTERVAL 1 DAY - INTERVAL 1 MONTH as yyyymm
from orders) per
order by per.yyyymm
Results:
| yyyymm | count |
|---------------------------|-------|
| October, 01 2013 00:00:00 | 1 |
| January, 01 2014 00:00:00 | 5 |
| March, 01 2014 00:00:00 | 6 |
Query 2:
select DATE_FORMAT(order_date, '%Y-%m'),
(select count(DISTINCT o.user_id) from orders o where o.order_date >=
(LAST_DAY(o1.order_date) + INTERVAL 1 DAY - INTERVAL 13 MONTH) and
o.order_date <= LAST_DAY(o1.order_date)) as `count`
from orders o1
group by DATE_FORMAT(order_date, '%Y-%m')
Results:
| DATE_FORMAT(order_date, '%Y-%m') | count |
|----------------------------------|-------|
| 2013-10 | 1 |
| 2014-01 | 5 |
| 2014-03 | 6 |
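The trailing 12-month window used by both queries can be checked outside MySQL. The sketch below is a hedged port to SQLite through Python's sqlite3 module (the date() modifiers are SQLite's, not MySQL's, and the ISO text dates are an assumption), run against the question's sample rows.

```python
import sqlite3

# Sample rows from the question: (user_id, order_date).
rows = [
    (12, "2014-03-23"), (12, "2014-01-24"), (14, "2014-01-26"),
    (16, "2014-01-23"), (15, "2014-03-21"), (20, "2013-10-23"),
    (13, "2014-01-25"), (16, "2014-03-23"), (13, "2014-01-25"),
    (14, "2014-03-22"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (user_id INTEGER, order_date TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)

# For every month that has orders, count the distinct users seen from
# 12 months before the start of that month up to the end of that month.
rolling = conn.execute(
    """
    SELECT per.month_start,
           (SELECT COUNT(DISTINCT o.user_id)
            FROM orders o
            WHERE o.order_date >= date(per.month_start, '-12 months')
              AND o.order_date <  date(per.month_start, '+1 month')) AS active
    FROM (SELECT DISTINCT date(order_date, 'start of month') AS month_start
          FROM orders) per
    ORDER BY per.month_start
    """
).fetchall()
print(rolling)
```

Because each month is counted from scratch with COUNT(DISTINCT ...), a user who ordered in several months inside the window is counted once.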
The best thing I could do is this:
SELECT Date, COUNT(*) as ActiveUsers
FROM
(
SELECT DISTINCT user_id, CONCAT(YEAR(order_date), "-", MONTH(order_date)) as Date
FROM `a`
ORDER BY Date
)
AS `b`
GROUP BY Date
The output is the following:
| Date | ActiveUsers |
|---------|-------------|
| 2013-10 | 1 |
| 2014-1 | 4 |
| 2014-3 | 4 |
Now, for every row you need to add up the number of active users in the previous rows.
For example, here is the code in C#:
int total = 0;
while (reader.Read())
{
// COUNT(*) may come back as a 64-bit value, so convert rather than unbox with a cast.
total += Convert.ToInt32(reader["ActiveUsers"]);
// Print the running total, not the per-month count.
Console.WriteLine("{0} - {1} active users", reader["Date"], total);
}
By the way, for March 2014 this gives 9, not 10, because one row is duplicated.
Try this, but it doesn't handle the last part (the Jan 2014 value including the Oct 2013 record):
select TO_CHAR(order_dt,'MON-YYYY'), count(distinct User_ID ) cnt from [orders]
where User_ID in
(select User_ID from
(select a.User_ID from [orders] a,
(select a.User_ID,count (a.order_dt) from [orders] a
where a.order_dt > (select max(b.order_dt)-365 from [orders] b where a.User_ID=b.User_ID)
group by a.User_ID
having count(order_dt)>1) b
where a.User_ID=b.User_ID) a
)
group by TO_CHAR(order_dt,'MON-YYYY');
This is what I think you are looking for
SET @cnt = 0;
SELECT Period, @cnt := @cnt + total_active_users AS total_active_users
FROM (
SELECT DATE_FORMAT(order_date, '%b-%Y') AS Period , COUNT( id) AS total_active_users
FROM t
GROUP BY DATE_FORMAT(order_date, '%b-%Y')
ORDER BY order_date
) AS t
This is the output that I get
Period total_active_users
Oct-2013 1
Jan-2014 6
Mar-2014 10
You can also do COUNT(DISTINCT id) to get the unique Ids only
I took the members working in the previous year and subtracted the employees relieved in the previous year, then took the previous month's relieving list and subtracted it from the result set. Finally, I added the members newly added in the current month.
I sense that there are a lot of improvements we could make to the current query, but right now I am out of ideas. Can someone kindly help with this?
If I have interpreted your existing query correctly, I suggest the following:
select
mnth.num, count(*)
from (
select 1 AS num union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all
select 7 union all select 8 union all select 9 union all select 10 union all select 11 union all select 12
) mnth
left join (
select
e.emp_id
, case
when e.hired_date < date_format(current_date(), '%Y-01-01') then 1
else month(e.hired_date)
end AS start_month
, case
when es.relieving_date < date_format(current_date(), '%Y-01-01') then 0
when es.relieving_date >= date_format(current_date(), '%Y-01-01') then month(es.relieving_date)
else month(current_date())
end AS end_month
from employee e
left join employee_separation es on e.emp_id = es.emp_id
) emp on mnth.num between emp.start_month and emp.end_month
where mnth.num <= month(current_date())
group by
mnth.num
;
This produced the following result (with current_date() on Nov 21, 2017):
| num | count(*) |
|-----|----------|
| 1 | 6 |
| 2 | 7 |
| 3 | 8 |
| 4 | 9 |
| 5 | 10 |
| 6 | 9 |
| 7 | 10 |
| 8 | 11 |
| 9 | 12 |
| 10 | 13 |
| 11 | 14 |
Depending on data volumes, adding a where clause in the emp subquery may help; this also affects the case expression:
, case
when es.relieving_date >= date_format(current_date(), '%Y-01-01') then month(es.relieving_date)
else month(current_date())
end AS end_month
from employee e
left join employee_separation es on e.emp_id = es.emp_id
where es.relieving_date >= date_format(current_date(), '%Y-01-01')
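The month-range join in this answer can be sketched in plain Python to make the logic easier to follow. The function name active_per_month and the sample employees below are hypothetical, invented only for illustration; the start and end month rules mirror the two case expressions of the full query.

```python
from datetime import date


def active_per_month(employees, today):
    """Count active employees for each month of the current year so far.

    employees: list of (hired_date, relieving_date or None) pairs.
    """
    year_start = date(today.year, 1, 1)
    counts = {}
    for month in range(1, today.month + 1):
        active = 0
        for hired, relieved in employees:
            # Hired before this year: active from January onwards.
            start = 1 if hired < year_start else hired.month
            if relieved is None:
                end = today.month      # still employed
            elif relieved < year_start:
                end = 0                # left before this year started
            else:
                end = relieved.month   # left during this year
            if start <= month <= end:
                active += 1
        counts[month] = active
    return counts


# Hypothetical sample data, not taken from the question.
sample = [
    (date(2016, 5, 1), None),
    (date(2017, 3, 10), None),
    (date(2015, 1, 1), date(2017, 6, 15)),
    (date(2014, 1, 1), date(2016, 12, 31)),
]
counts = active_per_month(sample, date(2017, 11, 21))
print(counts)
```

One difference worth knowing: with a LEFT JOIN, COUNT(*) reports 1 for a month that matches no employee, whereas this sketch (like counting emp.emp_id in SQL) reports 0.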
I think what you need to do is to get all the employees who are already working from the employee table with:
SELECT * FROM employee WHERE hired_date<= CURRENT_DATE;
Then get the list of employees whose relieving date is still in the future using:
SELECT * FROM employee_separation WHERE relieving_date > CURRENT_DATE;
Then join the two results and group by the month and year of the relieving date as shown below:
SELECT DATE_FORMAT(B.relieving_date, "%Y-%M") RELIEVING_DATE, COUNT(*)
NUMBER_OF_ACTIVE_MEMBERS FROM
(SELECT * FROM employee WHERE hired_date <= CURRENT_DATE) A INNER JOIN
(SELECT * FROM employee_separation WHERE relieving_date > CURRENT_DATE) B
ON A.emp_id=B.emp_id
GROUP BY DATE_FORMAT(B.relieving_date , "%Y-%M");
I have the following table
customerID | orderID | orderDate
----------------------------------
1 | 67 | 2015-12-15
1 | 66 | 2015-10-20
1 | 65 | 2015-1-7
2 | 64 | 2014-9-6
2 | 63 | 2014-7-8
3 | 62 | 2015-1-15
I need to identify all the customerIDs that have at least 3 distinct orderIDs within a 12 month period in 2014 and 2015
Hmmm. You could do something like this:
select distinct customerId
from t
where 3 <= (select count(*)
from t t2
where t2.customerId = t.customerId and
t2.orderDate >= t.orderDate and
t2.orderDate < date_add(t.orderDate, interval 12 month)
);
An index on (customerId, orderDate) would help performance. You might also need count(distinct orderID) in the subquery, but that doesn't seem necessary given your sample data.
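To see the correlated-subquery approach work on the sample data, here is a small sketch using Python's sqlite3 module. It assumes the table is called t, that dates are stored as zero-padded ISO text, and it uses SQLite's date() modifier where MySQL would use date_add.

```python
import sqlite3

# Sample rows from the question, with dates zero-padded to ISO format
# (an assumption, so that text comparison orders them correctly).
orders = [
    (1, 67, "2015-12-15"), (1, 66, "2015-10-20"), (1, 65, "2015-01-07"),
    (2, 64, "2014-09-06"), (2, 63, "2014-07-08"), (3, 62, "2015-01-15"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (customerID INTEGER, orderID INTEGER, orderDate TEXT)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", orders)

# A customer qualifies if any of their orders starts a 12-month window
# that contains at least 3 distinct orders.
qualifying = [
    row[0]
    for row in conn.execute(
        """
        SELECT DISTINCT t.customerID
        FROM t
        WHERE 3 <= (SELECT COUNT(DISTINCT t2.orderID)
                    FROM t AS t2
                    WHERE t2.customerID = t.customerID
                      AND t2.orderDate >= t.orderDate
                      AND t2.orderDate <  date(t.orderDate, '+12 months'))
        ORDER BY t.customerID
        """
    )
]
print(qualifying)
```

Only customer 1 has three orders that fall inside a single 12-month window.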
Try this:
SELECT customerID, order_count FROM (SELECT customerID, COUNT(DISTINCT orderID) AS
order_count FROM t WHERE YEAR(orderDate) = 2014 GROUP BY customerID) AS
table_orders WHERE order_count >= 3
(The table name t is an assumption, since the question does not name the table.) You can change the inner WHERE clause to alter the date range; this approach counts orders across the whole of 2014.
Scenario
I have a query developed from this question, where part of the optimisation was to create a MySQL view that is used for generating statistics for users and sales. The problem is that when there is no result for one of the SELECT statements, its row is omitted from the resulting table.
Question
How can I tell MySQL to set a default value (e.g. 0) if no rows are found for any of the SELECT statements?
Code
This is the code for creating the view
CREATE OR REPLACE VIEW user_events AS
SELECT 'Complete profiles' AS type, created_at
FROM users
WHERE completed_registration = 1
UNION ALL SELECT 'Incomplete profiles', created_at
FROM users
WHERE completed_registration = 0 AND verified_email = 1
UNION ALL SELECT 'Unverified profiles', created_at
FROM users
WHERE verified_email = 0
UNION ALL SELECT 'Onsite Teachers', created_at
FROM onsite_teachers
UNION ALL SELECT 'Onsite Teachers hired', created_at
FROM purchases
INNER JOIN purchased_profiles
ON purchased_profiles.purchase_id = purchases.id
AND purchased_profiles.profile_type = 'onsite_teacher'
WHERE purchases.transaction_status = 'completed'
UNION ALL SELECT 'Translators', created_at
FROM translators
UNION ALL SELECT 'Translators hired', created_at
FROM purchases
INNER JOIN purchased_profiles
ON purchased_profiles.purchase_id = purchases.id
AND purchased_profiles.profile_type = 'translator'
WHERE purchases.transaction_status = 'completed'
UNION ALL SELECT 'Interpreters', created_at
FROM interpreters
UNION ALL SELECT 'Interpreters hired', created_at
FROM purchases
INNER JOIN purchased_profiles
ON purchased_profiles.purchase_id = purchases.id
AND purchased_profiles.profile_type = 'interpreter'
WHERE purchases.transaction_status = 'completed';
And this is the code for querying the totals for the last 6 months including the current month.
SELECT
type,
COUNT(CASE
WHEN created_at >= CAST(CURDATE() - INTERVAL (DAYOFMONTH(CURDATE()) - 1) DAY AS DATETIME)
THEN 1
END) AS 0_month_ago,
COUNT(CASE
WHEN created_at BETWEEN CAST(CURDATE() - INTERVAL (DAYOFMONTH(CURDATE()) - 1) DAY - INTERVAL 1 MONTH AS DATETIME)
AND CAST(LAST_DAY(CURDATE() - INTERVAL 1 MONTH) + INTERVAL 1 DAY AS DATETIME)
THEN 1
END) AS 1_month_ago,
COUNT(CASE
WHEN created_at BETWEEN CAST(CURDATE() - INTERVAL (DAYOFMONTH(CURDATE()) - 1) DAY - INTERVAL 2 MONTH AS DATETIME)
AND CAST(LAST_DAY(CURDATE() - INTERVAL 2 MONTH) + INTERVAL 1 DAY AS DATETIME)
THEN 1
END) AS 2_months_ago,
COUNT(CASE
WHEN created_at BETWEEN CAST(CURDATE() - INTERVAL (DAYOFMONTH(CURDATE()) - 1) DAY - INTERVAL 3 MONTH AS DATETIME)
AND CAST(LAST_DAY(CURDATE() - INTERVAL 3 MONTH) + INTERVAL 1 DAY AS DATETIME)
THEN 1
END) AS 3_months_ago,
COUNT(CASE
WHEN created_at BETWEEN CAST(CURDATE() - INTERVAL (DAYOFMONTH(CURDATE()) - 1) DAY - INTERVAL 4 MONTH AS DATETIME)
AND CAST(LAST_DAY(CURDATE() - INTERVAL 4 MONTH) + INTERVAL 1 DAY AS DATETIME)
THEN 1
END) AS 4_months_ago,
COUNT(CASE
WHEN created_at BETWEEN CAST(CURDATE() - INTERVAL (DAYOFMONTH(CURDATE()) - 1) DAY - INTERVAL 5 MONTH AS DATETIME)
AND CAST(LAST_DAY(CURDATE() - INTERVAL 5 MONTH) + INTERVAL 1 DAY AS DATETIME)
THEN 1
END) AS 5_months_ago
FROM
user_events
GROUP BY
type;
Current output
If you look closely, there are no Interpreters hired and Translators hired rows. I want these rows to be present and zeroed out when no matching rows are found.
+=========================+===============+===============+===============+==============+===============+===============+
| type | 0_month_ago | 1_month_ago | 2_months_ago | 3_months_ago | 4_months_ago | 5_months_ago |
+=========================+===============+===============+===============+==============+===============+===============+
| Complete profiles | 7 | 20 | 14 | 25 | 30 | 7 |
+-------------------------+---------------+---------------+---------------+--------------+---------------+---------------+
| Incomplete profiles | 12 | 27 | 56 | 45 | 48 | 23 |
+-------------------------+---------------+---------------+---------------+--------------+---------------+---------------+
| Unverified profiles | 3 | 16 | 23 | 5 | 0 | 9 |
+-------------------------+---------------+---------------+---------------+--------------+---------------+---------------+
| Onsite Teachers | 11 | 36 | 8 | 15 | 46 | 12 |
+-------------------------+---------------+---------------+---------------+--------------+---------------+---------------+
| Onsite Teachers hired | 0 | 0 | 12 | 9 | 3 | 0 |
+-------------------------+---------------+---------------+---------------+--------------+---------------+---------------+
| Interpreters | 4 | 21 | 27 | 46 | 45 | 28 |
+-------------------------+---------------+---------------+---------------+--------------+---------------+---------------+
| Translators | 7 | 20 | 19 | 27 | 19 | 42 |
+-------------------------+---------------+---------------+---------------+--------------+---------------+---------------+
The user_events view is a kind of a log, which looks like
type | created_at
======================+==============
Interpreters hired | 2014-12-12
Interpreters hired | 2014-12-14
Interpreters hired | 2014-12-16
Interpreters hired | 2015-01-02
However, if no interpreter has ever been hired, then there will be no rows with type = 'Interpreters hired'. In that case, the counting query cannot possibly fabricate an Interpreters hired row out of thin air.
A solution is to ensure that an Interpreters hired row appears in the user_events view, no matter what. You could create such fictitious rows with no created_at date. That way, there will always be something to GROUP BY, but not necessarily anything to COUNT().
CREATE OR REPLACE VIEW user_events AS
SELECT 'Complete profiles' AS type, created_at
FROM users
WHERE completed_registration = 1
UNION ALL SELECT 'Incomplete profiles', created_at
FROM users
WHERE completed_registration = 0 AND verified_email = 1
UNION ALL SELECT 'Unverified profiles', created_at
FROM users
WHERE verified_email = 0
UNION ALL SELECT 'Onsite Teachers', created_at
FROM onsite_teachers
UNION ALL SELECT 'Onsite Teachers hired', created_at
FROM purchases
INNER JOIN purchased_profiles
ON purchased_profiles.purchase_id = purchases.id
AND purchased_profiles.profile_type = 'onsite_teacher'
WHERE purchases.transaction_status = 'completed'
UNION ALL SELECT 'Onsite Teachers hired', NULL
UNION ALL SELECT 'Translators', created_at
FROM translators
UNION ALL SELECT 'Translators hired', created_at
FROM purchases
INNER JOIN purchased_profiles
ON purchased_profiles.purchase_id = purchases.id
AND purchased_profiles.profile_type = 'translator'
WHERE purchases.transaction_status = 'completed'
UNION ALL SELECT 'Translators hired', NULL
UNION ALL SELECT 'Interpreters', created_at
FROM interpreters
UNION ALL SELECT 'Interpreters hired', created_at
FROM purchases
INNER JOIN purchased_profiles
ON purchased_profiles.purchase_id = purchases.id
AND purchased_profiles.profile_type = 'interpreter'
WHERE purchases.transaction_status = 'completed'
UNION ALL SELECT 'Interpreters hired', NULL;
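The placeholder-row idea described above can be tried on a cut-down view. The sketch below uses Python's sqlite3 module; the hires table is a hypothetical stand-in for the purchases and purchased_profiles join, and the fixed cutoff date replaces the CURDATE() arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE translators (created_at TEXT);
    CREATE TABLE hires (profile_type TEXT, created_at TEXT);
    INSERT INTO translators VALUES ('2015-01-02'), ('2015-01-05');

    CREATE VIEW user_events AS
    SELECT 'Translators' AS type, created_at FROM translators
    UNION ALL SELECT 'Translators hired', created_at
        FROM hires WHERE profile_type = 'translator'
    -- Placeholder row: keeps the type visible even when nothing was hired.
    UNION ALL SELECT 'Translators hired', NULL;
    """
)

# COUNT() skips NULLs, so the placeholder row contributes a group but no count.
totals = dict(
    conn.execute(
        """
        SELECT type,
               COUNT(CASE WHEN created_at >= '2015-01-01' THEN 1 END) AS recent
        FROM user_events
        GROUP BY type
        """
    ).fetchall()
)
print(totals)
```

The hires table is empty here, yet 'Translators hired' still appears with a count of 0, because a comparison against the NULL date never satisfies the CASE condition.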
I have a table that looks like the one below:
ID HID Date UID
1 1 2012-01-01 1002
2 1 2012-01-24 2005
3 1 2012-02-15 5152
4 2 2012-01-01 6252
5 2 2012-01-19 10356
6 3 2013-01-06 10989
7 3 2013-03-25 25001
8 3 2014-01-14 35798
How can I group by HID, Year, Month with count(UID) and add a cumulative_sum (a running total of the UID count)? The final result should look like this:
HID Year Month Count cumulative_sum
1 2012 01 2 2
1 2012 02 1 3
2 2012 01 2 2
3 2013 01 1 1
3 2013 03 1 2
3 2014 01 1 3
What's the best way to accomplish this using query?
I made assumptions about the original data set. You should be able to adapt this to the revised dataset - although note that the solution using variables (instead of my self-join) is faster...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(ID INT NOT NULL
,Date DATE NOT NULL
,UID INT NOT NULL PRIMARY KEY
);
INSERT INTO my_table VALUES
(1 ,'2012-01-01', 1002),
(1 ,'2012-01-24', 2005),
(1 ,'2012-02-15', 5152),
(2 ,'2012-01-01', 6252),
(2 ,'2012-01-19', 10356),
(3 ,'2013-01-06', 10989),
(3 ,'2013-03-25', 25001),
(3 ,'2014-01-14', 35798);
SELECT a.*
, SUM(b.count) cumulative
FROM
(
SELECT x.id,YEAR(date) year,MONTH(date) month, COUNT(0) count FROM my_table x GROUP BY id,year,month
) a
JOIN
(
SELECT x.id,YEAR(date) year,MONTH(date) month, COUNT(0) count FROM my_table x GROUP BY id,year,month
) b
ON b.id = a.id AND (b.year < a.year OR (b.year = a.year AND b.month <= a.month)
)
GROUP
BY a.id, a.year,a.month;
+----+------+-------+-------+------------+
| id | year | month | count | cumulative |
+----+------+-------+-------+------------+
| 1 | 2012 | 1 | 2 | 2 |
| 1 | 2012 | 2 | 1 | 3 |
| 2 | 2012 | 1 | 2 | 2 |
| 3 | 2013 | 1 | 1 | 1 |
| 3 | 2013 | 3 | 1 | 2 |
| 3 | 2014 | 1 | 1 | 3 |
+----+------+-------+-------+------------+
If you don't mind an extra column in the result, you can simplify (and accelerate) the above, as follows:
SELECT x.*
, @running:= IF(@previous=x.id,@running,0)+x.count cumulative
, @previous:=x.id
FROM
( SELECT x.id,YEAR(date) year,MONTH(date) month, COUNT(0) count FROM my_table x GROUP BY id,year,month ) x
,( SELECT @previous := NULL,@running:=0) vals;
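The "reset the running total whenever the id changes" logic can also be written out explicitly. This is a minimal sketch using Python's sqlite3 module for the monthly grouping (SQLite's strftime stands in for MySQL's YEAR and MONTH) and a plain loop for the per-id cumulative sum.

```python
import sqlite3

# Same sample rows as the CREATE TABLE / INSERT statements above.
rows = [
    (1, "2012-01-01", 1002), (1, "2012-01-24", 2005), (1, "2012-02-15", 5152),
    (2, "2012-01-01", 6252), (2, "2012-01-19", 10356),
    (3, "2013-01-06", 10989), (3, "2013-03-25", 25001), (3, "2014-01-14", 35798),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER, date TEXT, uid INTEGER PRIMARY KEY)"
)
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)

# Row count per id and calendar month, in a deterministic order.
monthly = conn.execute(
    """
    SELECT id,
           CAST(strftime('%Y', date) AS INTEGER) AS year,
           CAST(strftime('%m', date) AS INTEGER) AS month,
           COUNT(*) AS count
    FROM my_table
    GROUP BY id, year, month
    ORDER BY id, year, month
    """
).fetchall()

# Running total that restarts at zero for each new id.
result = []
previous_id, running = None, 0
for hid, year, month, count in monthly:
    if hid != previous_id:
        running = 0
    running += count
    previous_id = hid
    result.append((hid, year, month, count, running))
print(result)
```

The explicit ORDER BY matters: the running total is only meaningful if rows for the same id arrive together and in chronological order.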
The code turns out kind of messy, and it reads as follows:
SELECT
HID,
strftime('%Y', `Date`) AS Year,
strftime('%m', `Date`) AS Month,
COUNT(UID) AS Count,
(SELECT
COUNT(UID)
FROM your_db A
WHERE
A.HID=B.HID
AND
(strftime('%Y', A.`Date`) < strftime('%Y', B.`Date`)
OR
(strftime('%Y', A.`Date`) = strftime('%Y', B.`Date`)
AND
strftime('%m', A.`Date`) <= strftime('%m', B.`Date`)))) AS cumulative_count
FROM your_db B
GROUP BY HID, YEAR, MONTH
Though by using views, it should become much clearer:
CREATE VIEW temp_data AS SELECT
HID,
strftime('%Y', `Date`) as Year,
strftime('%m', `Date`) as Month,
COUNT(UID) as Count
FROM your_db GROUP BY HID, YEAR, MONTH;
Then your statement will read as follows:
SELECT
HID,
Year,
Month,
`Count`,
(SELECT SUM(`Count`)
FROM temp_data A
WHERE
A.HID = B.HID
AND
(A.Year < B.Year
OR
(A.Year = B.Year
AND
A.Month <= B.Month))) AS cumulative_sum
FROM temp_data B;
This is my table named period.
id | year | month
222 | 2014 | 2
345 | 2013 | 5
33 | 2014 | 1
224 | 2014 | 2
I want to get only the ids that have the latest month (2014-02). The result should be 222, 224.
I wrote the following query:
SELECT id, MAX(year*100 + month) FROM period
But it returns the following result:
222| 201402
How can I get my result?
SELECT x.*
FROM period x
JOIN
( SELECT year
, month
FROM period
ORDER
BY year DESC
, month DESC
LIMIT 1
) y
ON y.year = x.year
AND y.month = x.month;
You should use the following query:
SELECT id FROM period where year=(SELECT max(year) from period) and month=(SELECT max(month) from period where year=(SELECT max(year) from period));
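One pitfall with this kind of query is taking MAX(year) and MAX(month) independently: on the sample data that pairs 2014 with month 5 (from the 2013 row), which matches no row at all. The sketch below, using Python's sqlite3 module, compares that with filtering on the asker's year*100 + month key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE period (id INTEGER, year INTEGER, month INTEGER)")
conn.executemany(
    "INSERT INTO period VALUES (?, ?, ?)",
    [(222, 2014, 2), (345, 2013, 5), (33, 2014, 1), (224, 2014, 2)],
)

# Independent maxima: year = 2014 and month = 5, a combination no row has.
independent = [
    row[0]
    for row in conn.execute(
        """
        SELECT id FROM period
        WHERE year  = (SELECT MAX(year)  FROM period)
          AND month = (SELECT MAX(month) FROM period)
        ORDER BY id
        """
    )
]

# Combined key: the latest (year, month) pair, then every row that has it.
combined = [
    row[0]
    for row in conn.execute(
        """
        SELECT id FROM period
        WHERE year * 100 + month = (SELECT MAX(year * 100 + month) FROM period)
        ORDER BY id
        """
    )
]
print(independent, combined)
```

The combined key works because month is always below 100, so year*100 + month sorts the same way as the (year, month) pair.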