MySQL: find active users who logged in once a week

I have a table users and another table logins. Every time a user logs in to the website, we record a row in logins, e.g.:
Users
-----
14 | name1
17 | name2
20 | name3
21 | name4
25 | name5
logins
----
14 | 2015-03-01
14 | 2015-03-07
14 | 2015-03-16
14 | 2015-03-24
14 | 2015-03-30
17 | 2015-03-01
17 | 2015-03-07
17 | 2015-03-16
17 | 2015-03-17
17 | 2015-03-30
20 | 2015-03-01
20 | 2015-03-07
20 | 2015-03-08
20 | 2015-03-16
20 | 2015-03-25
20 | 2015-03-30
If the start date is 2015-03-01 and the end date is 2015-04-01, then users 14 and 20 should be selected, while 17 should not be, since he didn't log in during the week of 03-22 to 03-28. So the result would be:
Result
------
2

First you get, for each week, the list of users who have logged in at least once; then you count the number of users per week:
SELECT LoginYear,LoginWeek,COUNT(*) as NumbUsers
FROM (
SELECT Year(logins.date) as LoginYear, Week(logins.date) as LoginWeek, logins.UserID
FROM logins
WHERE logins.date>='2015-03-01'
GROUP BY LoginYear, LoginWeek, logins.UserID
HAVING COUNT(*)>0
) t
GROUP BY LoginYear,LoginWeek;
Week numbering: MySQL can number weeks in different ways (for example, starting on Sunday or on Monday) via the mode argument: WEEK(date,mode). See the MySQL documentation for WEEK.
Update: to get the number of people who have logged in at least once every week: first we get the users who logged in at least once per week in the subquery weektable. Then we select the users whose week count equals the total number of weeks in that period (thus having been online each week). Finally we count those users.
SELECT COUNT(*)
FROM (
SELECT UserID
FROM (
SELECT Year(logins.date) as LoginYear, Week(logins.date) as LoginWeek, logins.UserID
FROM logins
WHERE logins.date>='2015-03-01'
GROUP BY LoginYear, LoginWeek, logins.UserID
HAVING COUNT(*)>0
) weektable
GROUP BY UserID
HAVING COUNT(*)>=TIMESTAMPDIFF(WEEK,'2015-03-01',NOW())
) subq;
Note 1: I used the date '2015-03-01' as an example, but you can change it or pass it in as a variable.
Note 2: depending on the dates you choose, the week count returned by TIMESTAMPDIFF can be less than the number of weeks found by COUNT(*), since TIMESTAMPDIFF does not count partial weeks. That is why I use >= in the last line: HAVING COUNT(*)>=TIMESTAMPDIFF(WEEK,'2015-03-01',NOW()).
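The mismatch described in Note 2 can be checked with a little date arithmetic. The sketch below is Python rather than SQL and only approximates MySQL's behaviour: full_weeks mimics what TIMESTAMPDIFF(WEEK, ...) gives for two plain dates (complete 7-day spans), and calendar_weeks_touched counts Sunday-based calendar weeks, which is an assumption standing in for WEEK(date) in its default mode. Both function names are made up for this illustration.

```python
from datetime import date, timedelta

def full_weeks(start: date, end: date) -> int:
    """Complete 7-day spans between two dates (what TIMESTAMPDIFF(WEEK, start, end) counts)."""
    return (end - start).days // 7

def calendar_weeks_touched(start: date, end: date) -> int:
    """Number of Sunday-based calendar weeks that contain at least one day in [start, end)."""
    def week_start(d: date) -> date:
        # date.weekday() is Monday=0 ... Sunday=6, so shift to make Sunday the first day.
        return d - timedelta(days=(d.weekday() + 1) % 7)

    last_day = end - timedelta(days=1)
    return (week_start(last_day) - week_start(start)).days // 7 + 1

start, end = date(2015, 3, 1), date(2015, 4, 1)
print(full_weeks(start, end))              # 4
print(calendar_weeks_touched(start, end))  # 5
```

So a user who logs in during every calendar week of March 2015 appears in five week groups while the TIMESTAMPDIFF figure is four, which is why the comparison uses >= rather than =.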

I cannot test it here at the moment, but something like
SELECT COUNT(Users.id) FROM Users JOIN logins ON logins.UserID = Users.id WHERE logins.date>=XXXX AND logins.date<=XXXX GROUP BY Users.id
should work.

Related

Count users who have repeatedly called no sooner than 6 days post first call

I have to count how many times a user has called repeatedly within the next 7 days (the number of days has to be flexible) or more.
The query should only consider records dated at least 7 days before the last date in the table.
My data looks something like this:
call_date user
2017-05-01 100
2017-05-01 500
2017-05-02 200
2017-05-02 300
2017-05-03 300
2017-05-04 100
2017-05-05 400
2017-05-06 500
2017-05-07 600
2017-05-08 200
2017-05-09 700
2017-05-10 500
2017-05-11 400
2017-05-12 300
2017-05-13 100
2017-05-14 200
The desired output of the query is:
call_date user count
2017-05-01 100 2
2017-05-01 500 2
2017-05-02 200 2
2017-05-02 300 2
2017-05-03 300 1
2017-05-04 100 1
2017-05-05 400 2
2017-05-06 500 2
2017-05-07 600 1
Explanation:
When listing the dates, the first contact should be considered (user 100 called on 2017-05-01, 2017-05-04 and 2017-05-13), but only 2017-05-01 is displayed.
For user 100, only records within 7 days should be considered, hence the count for user 100 is 2 for call_date 2017-05-01 (2017-05-01 and 2017-05-04; 2017-05-13 is excluded since it falls out of range).
No records after 2017-05-07 are considered, because that is the date 7 days earlier than the max date, i.e. 2017-05-14.
This query has to run on 25+ million records, hence an optimized query would be an added advantage.
I am quite unsure as to how to nail down this problem; a detailed explanation with the query would be much appreciated.
Assuming this is your table definition (I've changed user to user_id to avoid clashing with a reserved keyword):
CREATE TABLE calls
(
call_date date NOT NULL,
user_id integer NOT NULL
/* no primary key. There *can* be duplicate rows, that could be
changed if call_date were instead call_datetime. Then:
PRIMARY KEY (user_id, call_datetime)
This assumes users cannot make simultaneous calls, nor any faster than
the datetime resolution.
*/
)
;
-- These indexes will help `using index` query plans.
CREATE INDEX idx_calls_user_id_call_date ON calls(user_id, call_date) ;
CREATE INDEX idx_calls_call_date_user_id ON calls(call_date, user_id) ;
... and that we import your data. We can then query the database with:
SELECT
call_date, user_id,
-- Count of the number of calls on `call_date` for `user_id`
count(call_date) AS count_on_date,
-- Count of the number of calls between `call_date` and the next 6 days (including both)
(SELECT count(call_date) FROM calls c1 WHERE c1.user_id = c.user_id AND c1.call_date BETWEEN c.call_date AND c.call_date + interval 6 day) AS count_next_7_days
FROM
calls c
-- The next JOIN is used to retrieve the `reference date`, and do it only once.
-- This restricts the rows to dates between (2017-05-14 - 13 day) = 2017-05-01 and (2017-05-14 - 7 day) = 2017-05-07
JOIN (SELECT max(call_date) AS ref_date FROM calls) AS d ON c.call_date BETWEEN ref_date - interval 13 day AND ref_date - interval 7 day
GROUP BY
call_date, user_id
ORDER BY
call_date, user_id ;
This query will return:
call_date | user_id | count_on_date | count_next_7_days
:--------- | ------: | ------------: | ----------------:
2017-05-01 | 100 | 1 | 2
2017-05-01 | 500 | 1 | 2
2017-05-02 | 200 | 1 | 2
2017-05-02 | 300 | 1 | 2
2017-05-03 | 300 | 1 | 1
2017-05-04 | 100 | 1 | 1
2017-05-05 | 400 | 1 | 2
2017-05-06 | 500 | 1 | 2
2017-05-07 | 600 | 1 | 1
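To make the window logic of the query easier to follow, here is a minimal Python sketch of the same idea, run against the sample rows from the question. It is not part of the original answer; the function name repeat_call_counts and the window_days parameter are made up for this illustration, and the sketch ignores performance entirely.

```python
from datetime import date, timedelta

# Sample rows from the question: (call_date, user_id)
calls = [
    (date(2017, 5, 1), 100), (date(2017, 5, 1), 500), (date(2017, 5, 2), 200),
    (date(2017, 5, 2), 300), (date(2017, 5, 3), 300), (date(2017, 5, 4), 100),
    (date(2017, 5, 5), 400), (date(2017, 5, 6), 500), (date(2017, 5, 7), 600),
    (date(2017, 5, 8), 200), (date(2017, 5, 9), 700), (date(2017, 5, 10), 500),
    (date(2017, 5, 11), 400), (date(2017, 5, 12), 300), (date(2017, 5, 13), 100),
    (date(2017, 5, 14), 200),
]

def repeat_call_counts(rows, window_days=7):
    """For each (call_date, user_id) with a complete window, count that user's calls in it."""
    ref_date = max(d for d, _ in rows)
    # Same bounds as the JOIN condition: ref_date - 13 days .. ref_date - 7 days
    newest_start = ref_date - timedelta(days=window_days)
    oldest_start = ref_date - timedelta(days=2 * window_days - 1)

    dates_by_user = {}
    for d, user in rows:
        dates_by_user.setdefault(user, []).append(d)

    result = []
    for d, user in sorted(set(rows)):
        if not (oldest_start <= d <= newest_start):
            continue
        window_end = d + timedelta(days=window_days - 1)  # call_date + 6 days, inclusive
        count = sum(1 for other in dates_by_user[user] if d <= other <= window_end)
        result.append((d, user, count))
    return result

for row in repeat_call_counts(calls):
    print(*row)
```

On the sample data this yields the same nine rows and the same count_next_7_days values as the table above.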
dbfiddle here
Have you tried the DAYOFWEEK() function? This link should be helpful.

MySQL: Generate decreasing daily work_hours at a constant pace

I have this query to extract total_hours, start_date and end_date:
select proj.start_date, proj.end_date, sum(ifnull(work.hours_estimate,0)) as total_hours
from project_table proj
left outer join project_task work on
work.project_id = proj.id
where proj.id = 3
This query gives me a single row of result:
start_date | end_date | total_hours
----------------------------------------
2017-04-24 | 2017-05-15 | 119
What I want is to generate a daily interval of rows, constantly decreasing the total_hours by a certain amount, say 19 hours, and the day increasing by 1 day.
Expected results:
day | hours_left
------------------------
2017-04-24 | 119
2017-04-25 | 100
2017-04-26 | 81
2017-04-27 | 62
2017-04-28 | 43
2017-04-29 | 24
... and so on and so forth until it reaches 2017-05-15 (of course, no negative for hours_left, just zero if negative)
I can't seem to figure out how to do this.
QUESTIONS:
1.) Is this possible in MySQL?
2.) If this is possible in MySQL, is it efficient/convenient?
If not, I could just do it in the application, as stated in the comments.
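For the application-side route mentioned above, a minimal Python sketch could look like the following. It assumes the single result row (start_date, end_date, total_hours) has already been fetched from MySQL; the function name burndown and the hours_per_day parameter are hypothetical.

```python
from datetime import date, timedelta

def burndown(start, end, total_hours, hours_per_day):
    """One row per day from start to end, subtracting hours_per_day each day, never below zero."""
    rows = []
    day, left = start, total_hours
    while day <= end:
        rows.append((day, max(left, 0)))
        day += timedelta(days=1)
        left -= hours_per_day
    return rows

rows = burndown(date(2017, 4, 24), date(2017, 5, 15), 119, 19)
for day, hours_left in rows[:8]:
    print(day, hours_left)
```

With the values from the question this produces 22 rows; the first six match the expected results listed above, followed by 5 on 2017-04-30 and 0 from 2017-05-01 onwards.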

Count unique values from duplicates

I have the following data in the table.
Uid | comm | status
-------------------
12 23 eve
15 23 eve
20 23 mon
12 23 mon
20 23 eve
17 23 mon
How do I query to get the result below, avoiding duplicates, so that if a uid is counted for "eve" and the same uid also appears under "mon", it is counted only for "eve"?
count | status
-------------------
3 eve
1 mon
Thanks for the help!
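You can use the following query in order to pick each Uid value once. It works here because 'eve' sorts before 'mon', so MIN(status) returns 'eve' whenever a Uid has both: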
SELECT Uid, MIN(status)
FROM mytable
GROUP BY Uid
Output:
Uid MIN(status)
---------------
12 eve
15 eve
17 mon
20 eve
Using the above query you can get the desired result like this:
SELECT status, count(*)
from (
SELECT Uid, MIN(status) AS status
FROM mytable
GROUP BY Uid ) AS t
GROUP BY status
Demo here
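The same two-step idea (keep one status per Uid, then count per status) can be sketched in Python for checking the numbers by hand. This is only an illustration on the sample rows; count_by_preferred_status is a made-up name, and the string comparison relies on 'eve' sorting before 'mon', just as MIN(status) does.

```python
from collections import Counter

# Sample rows from the question: (uid, status)
rows = [(12, "eve"), (15, "eve"), (20, "mon"), (12, "mon"), (20, "eve"), (17, "mon")]

def count_by_preferred_status(rows):
    """Keep the alphabetically smallest status per uid, then count uids per status."""
    chosen = {}
    for uid, status in rows:
        if uid not in chosen or status < chosen[uid]:
            chosen[uid] = status
    return Counter(chosen.values())

counts = count_by_preferred_status(rows)
print(counts["eve"], counts["mon"])  # 3 1
```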

SQL max date related issue

I'm having a bit of an issue with max(date) in SQL.
Basically, the problem is that I have to check whether the latest date entered by id is more than 1 day old, and then return that date.
id | user_id | send_date
8 | 90 | 2016-10-21 14:31:14
10 | 90 | 2016-10-25 09:56:28
11 | 18 | 2016-10-22 09:56:28
12 | 19 | 2016-10-21 09:56:28
13 | 19 | 2016-10-23 09:56:28
13 | 20 | 2016-10-25 09:56:28
This is part of a much longer SQL (just the part that I have a problem with):
SELECT max(h.send_date) as lastSent
FROM history h
WHERE (h.send_date < NOW() - INTERVAL 1 DAY);
Now what happens is that instead of selecting rows where the latest entered date is older than 1 day, I get the latest date that is older than 1 day, even if there's a newer entry in the table.
Does anyone have an idea how to change it so that the SQL only returns the latest date when it's older than 24h and is the newest (per user) in the table (in the example, it would have to return nothing because there's an entry less than 24h old)?
Edited the table example a bit. This is what I need to get as a result (user_ids 90 and 20 get ignored because of 2016-10-25 09:56:28):
18 | 2016-10-22 09:56:28
19 | 2016-10-23 09:56:28
To filter on the result of an aggregate function, you should use HAVING and not WHERE:
SELECT max(h.send_date) as lastSent
FROM history h
having max(h.send_date ) < DATE_SUB(NOW() ,INTERVAL 1 DAY) ;
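The query above looks at the latest date in the whole table, whereas the edited question asks for the latest date per user. As a rough illustration of that per-user variant (in SQL this would presumably mean grouping by user_id before applying the HAVING condition, which is an assumption about what the asker wants), here is a small Python sketch on the sample rows. The function name stale_last_sent is invented, and the "current time" of 2016-10-25 12:00:00 is assumed purely so that the example is reproducible.

```python
from datetime import datetime, timedelta

# Sample rows from the question: (user_id, send_date)
history = [
    (90, datetime(2016, 10, 21, 14, 31, 14)),
    (90, datetime(2016, 10, 25, 9, 56, 28)),
    (18, datetime(2016, 10, 22, 9, 56, 28)),
    (19, datetime(2016, 10, 21, 9, 56, 28)),
    (19, datetime(2016, 10, 23, 9, 56, 28)),
    (20, datetime(2016, 10, 25, 9, 56, 28)),
]

def stale_last_sent(rows, now, max_age=timedelta(days=1)):
    """Latest send_date per user, kept only if that latest date is older than max_age."""
    latest = {}
    for user_id, sent in rows:
        if user_id not in latest or sent > latest[user_id]:
            latest[user_id] = sent
    cutoff = now - max_age
    return {user_id: sent for user_id, sent in latest.items() if sent < cutoff}

now = datetime(2016, 10, 25, 12, 0, 0)  # assumed "current time" for the example
for user_id, sent in sorted(stale_last_sent(history, now).items()):
    print(user_id, sent)
```

With that assumed time, only users 18 and 19 remain, matching the result the asker describes.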

Finding date where conditions within 30 days has elapsed

For my website, I have a loyalty program where a customer gets some goodies if they've spent $100 within the last 30 days. A query like below:
SELECT u.username, SUM(total-shipcost) as tot
FROM orders o
LEFT JOIN users u
ON u.userident = o.user
WHERE shipped = 1
AND user = :user
AND date >= DATE(NOW() - INTERVAL 30 DAY)
:user being their user ID. Column 2 of this result gives how much a customer has spent in the last 30 days; if it's over 100, they get the bonus.
I want to display to the user which day they'll leave the loyalty program. Something like "x days until bonus expires", but how do I do this?
Take today's date, March 16th, and a user's order history:
id | tot | date
-----------------------
84 38 2016-03-05
76 21 2016-02-29
74 49 2016-02-20
61 42 2015-12-28
This user is part of the loyalty program now but leaves it on March 20th. What SQL could I do which returns how many days (4) a user has left on the loyalty program?
If the user then placed another order:
id | tot | date
-----------------------
87 12 2016-03-09
They're still in the loyalty program until the 20th, so the days remaining doesn't change in this instance, but if the total were 50 instead, then they instead leave the program on the 29th (so instead of 4 days it's 13 days remaining). For what it's worth, I care only about 30 days prior to the current date. No consideration for months with 28, 29, 31 days is needed.
Some create table code:
create table users (
userident int,
username varchar(100)
);
insert into users values
(1, 'Bob');
create table orders (
id int,
user int,
shipped int,
date date,
total decimal(6,2),
shipcost decimal(3,2)
);
insert into orders values
(84, 1, 1, '2016-03-05', 40.50, 2.50),
(76, 1, 1, '2016-02-29', 22.00, 1.00),
(74, 1, 1, '2016-02-20', 56.31, 7.31),
(61, 1, 1, '2015-12-28', 43.10, 1.10);
An example output of what I'm looking for is:
userident | username | days_left
--------------------------------
1 Bob 4
This is using March 16th as today for use with DATE(NOW()) to remain consistent with the previous bits of the question.
The following is basically how to do what you want. Note that references to "30 days" are rough estimates; you may need "29 days" or "31 days" to get the exact date that you want.
Retrieve the list of dates and amounts that are still active, i.e., within the last 30 days (as you did in your example), as a table (I'll call it Active) like the one you showed.
Join that new table (Active) with the original table, where a row from Active is joined to all of the rows of the original table using the date fields. Compute a total of the amounts from the original table. The new table would have a Date field from Active and a Total field that is the sum of all the amounts in the joined records from the original table.
Select from the resulting table all records where the Total is greater than 100.00 and create a new table with Date and the minimum Total of those records.
Compute 30 days ahead from those dates to find the ending date of their loyalty program.
You would need to take the following steps (per user):
join the orders table with itself to calculate sums for different (bonus) starting dates, for any of the starting dates that are in the last 30 days
select from those records only those starting dates which yield a sum of 100 or more
select from those records only the one with the most recent starting date: this is the start of the bonus period for the selected user.
Here is a query to do that:
SELECT u.userident,
u.username,
MAX(base.date) AS bonus_start,
DATE(MAX(base.date) + INTERVAL 30 DAY) AS bonus_expiry,
30-DATEDIFF(NOW(), MAX(base.date)) AS bonus_days_left
FROM users u
LEFT JOIN (
SELECT o.user,
first.date AS date,
SUM(o.total-o.shipcost) as tot
FROM orders first
INNER JOIN orders o
ON o.user = first.user
AND o.shipped = 1
AND o.date >= first.date
WHERE first.shipped = 1
AND first.date >= DATE(NOW() - INTERVAL 30 DAY)
GROUP BY o.user,
first.date
HAVING SUM(o.total-o.shipcost) >= 100
) AS base
ON base.user = u.userident
GROUP BY u.username,
u.userident
Here is a fiddle.
With this input as orders:
+----+------+---------+------------+-------+----------+
| id | user | shipped | date | total | shipcost |
+----+------+---------+------------+-------+----------+
| 61 | 1 | 1 | 2015-12-28 | 42 | 0 |
| 74 | 1 | 1 | 2016-02-20 | 49 | 0 |
| 76 | 1 | 1 | 2016-02-29 | 21 | 0 |
| 84 | 1 | 1 | 2016-03-05 | 38 | 0 |
| 87 | 1 | 1 | 2016-03-09 | 50 | 0 |
+----+------+---------+------------+-------+----------+
The above query will return this output (when executed on 2016-03-20):
+-----------+----------+-------------+--------------+-----------------+
| userident | username | bonus_start | bonus_expiry | bonus_days_left |
+-----------+----------+-------------+--------------+-----------------+
| 1 | John | 2016-02-29 | 2016-03-30 | 10 |
+-----------+----------+-------------+--------------+-----------------+
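The three steps behind the query can also be traced in a few lines of Python, which may help when checking the result by hand. This is only a sketch for a single user whose orders are all shipped, using the net amounts from the table above; the function name bonus_window and its parameters are made up for the illustration.

```python
from datetime import date, timedelta

# Orders of one user (all shipped): (order date, total minus shipping cost)
orders = [
    (date(2015, 12, 28), 42),
    (date(2016, 2, 20), 49),
    (date(2016, 2, 29), 21),
    (date(2016, 3, 5), 38),
    (date(2016, 3, 9), 50),
]

def bonus_window(orders, today, threshold=100, period_days=30):
    """Return (bonus_start, bonus_expiry, days_left), or None if the threshold is not met."""
    window_start = today - timedelta(days=period_days)
    # Steps 1 and 2: for every order date in the last 30 days, sum that order and
    # all later ones, and keep the dates whose sum reaches the threshold.
    candidates = [
        start
        for start, _ in orders
        if start >= window_start
        and sum(amount for d, amount in orders if d >= start) >= threshold
    ]
    if not candidates:
        return None
    # Step 3: the most recent qualifying date is the start of the bonus period.
    bonus_start = max(candidates)
    expiry = bonus_start + timedelta(days=period_days)
    return bonus_start, expiry, (expiry - today).days

print(bonus_window(orders, today=date(2016, 3, 20)))
```

Run with 2016-03-20 as today, it returns 2016-02-29 as the start, 2016-03-30 as the expiry and 10 days left, the same values as in the output table above.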
Simple solution
Seeing how you do your first query, I guessed that by the time you look for the "expiration date", you already know that the user has reached 100 over the last 30 days. Then you can do this:
SELECT DATE_ADD(MIN(date),INTERVAL 30 DAY)
FROM orders o
WHERE shipped = 1
AND user = :user
AND date >= (DATE(NOW() - INTERVAL 30 DAY))
It takes the minimum order date of a user over the last 30 days and adds 30 days to the result.
But that really is a poor design for achieving what you want.
You would do better to think further ahead and implement what follows.
Advanced solution
In order to reproduce all the following solution, I have used the Fiddle that Trincot kindly built, and expanded it to test on more data : 4 users having 4 orders.
SQL Fiddle http://sqlfiddle.com/#!9/668939/1
Step 1 : Design
The following query will return all the users meeting the loyalty program criteria, along with their earliest order date within the last 30 days, the loyalty program expiration date calculated from that earliest date, and the number of days before it expires.
SELECT o.user, u.username, SUM(total-shipcost) as tot, MIN(date) AS mindate,
DATE_ADD(MIN(date),INTERVAL 30 DAY) AS expirationdate,
DATEDIFF(DATE_ADD(MIN(date),INTERVAL 30 DAY), DATE(NOW())) AS daysleft
FROM orders o
LEFT JOIN users u
ON u.userident = o.user
WHERE shipped = 1
AND date >= DATE(NOW() - INTERVAL 30 DAY)
GROUP BY user
HAVING tot >= 100;
Now, create a VIEW with the above query
CREATE VIEW loyalty_program AS
SELECT o.user, u.username, SUM(total-shipcost) as tot, MIN(date) AS mindate,
DATE_ADD(MIN(date),INTERVAL 30 DAY) AS expirationdate,
DATEDIFF(DATE_ADD(MIN(date),INTERVAL 30 DAY), DATE(NOW())) AS daysleft
FROM orders o
LEFT JOIN users u
ON u.userident = o.user
WHERE shipped = 1
AND date >= DATE(NOW() - INTERVAL 30 DAY)
GROUP BY user
HAVING tot >= 100;
It is important to understand that creating the view is a one-off action on your database.
Step 2 : Use your new VIEW
Once you have the view, you can easily get the "state" of the loyalty program for all users:
SELECT * FROM loyalty_program
user | username | tot | mindate | expirationdate | daysleft
1 | John | 153 | February, 28 2016 | March, 29 2016 | 9
2 | Joe | 112 | February, 24 2016 | March, 25 2016 | 5
3 | Jack | 474 | February, 23 2016 | March, 24 2016 | 4
4 | Averel | 115 | February, 22 2016 | March, 23 2016 | 3
For a specific user, you can get the date you are looking for like this:
SELECT expirationdate FROM loyalty_program WHERE username='Joe'
You can also request all the users for which the expiration date is today:
SELECT user FROM loyalty_program WHERE expirationdate=DATE(NOW())
But there are other easy possibilities that you'll discover after having played with your VIEW.
Conclusion
Make your life easier: learn to use VIEWs!
I am assuming your table looks like this:
user | id | total | date
-------------------------------
12 84 38 2016-03-05
12 76 21 2016-02-29
23 74 49 2016-02-20
23 61 42 2015-12-28
then try this:
SELECT x.user, x.date, x.id, x.cum_sum, d.start_date, DATEDIFF(NOW(), x.date) from (SELECT a.user, a.id, a.date, a.total,
(SELECT SUM(b.total) FROM order_table b WHERE b.date <= a.date and a.user=b.user ORDER BY b.user, b.id DESC) AS cum_sum FROM order_table a where a.date>=DATE(NOW() - INTERVAL 30 DAY) ORDER BY a.user, a.id DESC) as x
left join
(SELECT c.user, c.date as start_date, c.id from (SELECT a.user, a.id, a.date, a.total,
(SELECT SUM(b.total) FROM order_table b WHERE b.date <= a.date and a.user=b.user ORDER BY b.user, b.id DESC) AS cum_sum FROM order_table a where a.date>=DATE(NOW() - INTERVAL 30 DAY) ORDER BY a.user, a.id DESC) as c WHERE FLOOR(c.cum_sum/100)=MIN(FLOOR(c.cum_sum/100)) and MOD(c.cum_sum,100)=MAX(MOD(c.cum_sum,100)) group by concat(c.user, "_", c.id)) as d on concat(x.user, "_", x.id)=concat(d.user, "_", d.id) where x.date=d.start_date;
You will get a table something like this:
user | Date | cum_sum | start_date | Time_left
----------------------------------------------------
12 2016-03-05 423 2016-03-05 24
13 2016-02-29 525 2016-02-29 12
23 2016-02-20 944 2016-02-20 3
29 2015-12-28 154 2015-12-28 4
I have not tested this. What I am trying to do is create a table in descending order of id and user, with a cumulative total column alongside. From that table I have created another one holding, for each user, the relevant date (i.e. the date from which the date difference is to be calculated). I have left joined these two tables and added the condition that x.date equals that start date. I have put start_date and date in the table to check whether the query is working.
Also, this is not the most optimal way of writing this code, but I have tried to stay as safe as possible by using subqueries, since I did not have the data to test this. Let me know if you run into any errors.