Retention Rate with MySQL - mysql

I am trying to calculate the retention rate with MySQL and started with this query:
SELECT
s_order.ordertime,
DATE_SUB(future_orders.ordertime, INTERVAL 90 DAY),
count(distinct s_order.userID) as active_users,
count(distinct future_orders.userID) as retained_users
FROM s_order
LEFT JOIN s_order as future_orders on
s_order.userID = future_orders.userID
AND s_order.ordertime = DATE_SUB(future_orders.ordertime, INTERVAL 90 DAY);
This does not work: every user is counted as active, so I added DATE_SUB(future_orders.ordertime, INTERVAL 90 DAY) to the select list to see what is going on. However, it returns NULL. Why?
For reference, I looked at this explanation:
https://www.periscopedata.com/blog/how-to-calculate-cohort-retention-in-sql.html
My table has a structure like
s_orders:
ID | userID | ordertime
I would expect a result showing how many distinct users have ordered anything at all and how many of them have ordered again in the last 90 days, which gives the customer retention.
Does anybody know what I am doing wrong in MySQL?

DATE_SUB() returns NULL when the date value is NULL, so that is probably why. Because you are using a LEFT JOIN, the matching future_orders row may not exist, in which case all of its columns are NULL.
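There is a second reason the join rarely matches: the condition s_order.ordertime = DATE_SUB(future_orders.ordertime, INTERVAL 90 DAY) is an exact equality, so it only succeeds when the same user placed another order exactly 90 days later, to the second. A minimal sketch of a range-based join follows. It assumes the s_order(ID, userID, ordertime) layout from the question and reads "retained" as "has another order within the last 90 days", which is only one possible interpretation.

```sql
-- Sketch only: assumes s_order(ID, userID, ordertime) as described in the question.
-- active_users   = every distinct user who has ordered at all
-- retained_users = users with at least one OTHER order in the last 90 days
SELECT
    COUNT(DISTINCT s_order.userID)       AS active_users,
    COUNT(DISTINCT future_orders.userID) AS retained_users
FROM s_order
LEFT JOIN s_order AS future_orders
       ON future_orders.userID = s_order.userID
      AND future_orders.ID <> s_order.ID                       -- a different order row
      AND future_orders.ordertime >= NOW() - INTERVAL 90 DAY;  -- a range, not an equality
```

COUNT(DISTINCT ...) ignores NULL, so users without a matching future_orders row are counted as active but not as retained.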

Related

MySql - Find record by ID_PRODUCT without notes - last 20 days

I have a MySQL database with output like this:
DATE_ADD | ID_PRODUCT | NOTE
This is a timeline of statuses for each id_product. There can be more than 100 duplicate records per id_product in one week. What I need to find: only the records with no new notes in the last 21 days (more than 20 days).
Meaning: nobody has changed the status of the product for a long time.
How do I write a query that finds only the records without a new note in the last 21 days
and shows the notes ordered by date_add descending?
Thanks in advance for your help.
Right now I have written something like this:
Select
N.id_product,
N.date_add,
S.transaction,
N.note
from Sale S
left join note N on N.id_product=S.id_product
Where
(...)
and (N.date_add > DATE_SUB(CURDATE(), INTERVAL 20 DAY)) is null
Group By N.id_product
order by N.date_add desc
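One possible approach, sketched here under assumptions: find the newest note per product first, keep only the products whose newest note is older than the cutoff, and then list their notes. The table name note and the columns id_product, date_add and note come from the query above. The 20-day cutoff mirrors that query and may need to be 21 depending on how the boundary is meant. The elided (...) conditions and the Sale join are left out.

```sql
-- Sketch only: uses the note table and columns named in the question.
SELECT n.id_product, n.date_add, n.note
FROM note AS n
JOIN (
    -- newest note per product, kept only if it is older than the cutoff
    SELECT id_product, MAX(date_add) AS last_note
    FROM note
    GROUP BY id_product
    HAVING MAX(date_add) < CURDATE() - INTERVAL 20 DAY
) AS stale
  ON stale.id_product = n.id_product
ORDER BY n.date_add DESC;
```

The original condition (N.date_add > DATE_SUB(...)) IS NULL is only true for rows where date_add itself is NULL, which is why it does not filter by age.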

Query with three tables, no common column

I've just started a job and my boss wants me to learn MySQL, so please bear with me; I've been learning for only 2 days and I'm not that good at it yet.
So I've been given 3 tables and several tasks to do.
The tables are:
mobile_log_messages_sms
mobile_providers
service_instances
And with them I have to:
Find out how many messages there were in the last 25 days and how
much income they made.
Then I need to group them by day (so per day, excluding hours) and
provider name.
Also I need to ignore all the messages that have an empty string
in the service column.
Also I need to ignore the messages that made 0 income and count only
those that have the column service_enabled = 1.
And then I need to sort the result by date, descending.
The relevant columns in the tables are:
mobile_log_messages_sms:
message_id - used to count the messages
price - used for the income, exclude those with 0
time - date in yyyy/mm/dd hh:mm:ss format
service - exclude all those that have an empty string (or null)
mobile_providers
provider_name - to use to group with
service_instances
enabled - only use if value is 1
I've started with:
SELECT message_id, price, time
FROM mobile_log_messages_sms
WHERE time BETWEEN '2017-02-26 00:00:00'
AND '2017-03-22 00:00:00'
But I need to change the date format and then use the JOIN commands, and I don't know how. I know I need to add more to it, but I'm stumped even at the start. Also, this starting query just lists the messages, but I need the total income (price) per day.
Can anyone at least point me in the right direction, since I'm still a noob? Many thanks in advance, and sorry if I worded something badly; English is not my first language.
Find out how many messages there were in the last 25 days and how much income did they make
1.
SELECT COUNT(message_id), SUM(price)
FROM mobile_log_messages_sms
WHERE CAST(time AS DATE) BETWEEN DATE_SUB(CURRENT_DATE,INTERVAL 25 DAY)
AND CURRENT_DATE;
2.
SELECT CAST(time AS DATE), COUNT(message_id), SUM(price)
FROM mobile_log_messages_sms
WHERE CAST(time AS DATE) BETWEEN DATE_SUB(CURRENT_DATE,INTERVAL 25 DAY)
AND CURRENT_DATE
GROUP BY CAST(time AS DATE);
3.
SELECT CAST(time AS DATE), COUNT(message_id), SUM(price)
FROM mobile_log_messages_sms
WHERE CAST(time AS DATE) BETWEEN DATE_SUB(CURRENT_DATE,INTERVAL 25 DAY)
AND CURRENT_DATE AND service IS NOT NULL AND service <> ''
GROUP BY CAST(time AS DATE);
The rest cannot be done with a join unless the tables share at least one common column to join on.
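If the tables do turn out to have key columns, the remaining tasks could be combined roughly as follows. This is only a sketch: the join columns provider_id and service_instance_id are hypothetical names that do not appear in the question and must be replaced with whatever keys the real schema uses. The other column names are taken from the question.

```sql
-- HYPOTHETICAL join columns: provider_id and service_instance_id are NOT in the
-- question; substitute the real key columns of your schema.
SELECT
    DATE(m.time)        AS msg_day,
    p.provider_name,
    COUNT(m.message_id) AS messages,
    SUM(m.price)        AS income
FROM mobile_log_messages_sms AS m
JOIN mobile_providers  AS p  ON p.provider_id = m.provider_id
JOIN service_instances AS si ON si.service_instance_id = m.service_instance_id
WHERE m.time >= CURRENT_DATE - INTERVAL 25 DAY   -- last 25 days
  AND m.service <> ''                            -- also drops NULL service values
  AND m.price > 0                                -- ignore messages with no income
  AND si.enabled = 1                             -- only enabled services
GROUP BY DATE(m.time), p.provider_name
ORDER BY msg_day DESC;
```

Comparing m.time directly against a computed boundary, instead of wrapping it in CAST or DATE in the WHERE clause, leaves the optimizer free to use an index on time if one exists.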

MYSQL First and last datetime within a day

I have a table with 3 days of data (about 4000 rows). Each of the 3 sets of data comes from a 30-minute session. I want the start and end time of each session.
I currently use this SQL, but it's quite slow (even with only 4000 records). The datetime column is indexed, but I think the index is not used properly because of the conversion from datetime to date.
The table layout is fixed, so I cannot change any part of that. The query takes about 20 seconds to run, and it gets slower every day. Does anyone have some good tips to make it faster?
select distinct
date(a.datetime) datetime,
(select max(b.datetime) from bike b where date(b.datetime) = date(a.datetime)),
(select min(c.datetime) from bike c where date(c.datetime) = date(a.datetime))
from bike a
Maybe I'm missing something, but...
Isn't the result returned by the OP query equivalent to the result from this query:
SELECT DATE(a.datetime) AS datetime
, MAX(a.datetime) AS max_datetime
, MIN(a.datetime) AS min_datetime
FROM bike a
GROUP BY DATE(a.datetime)
Alex, a warning: this is typed "freehand", so it may have some syntax problems. But it shows roughly what I was trying to convey.
select distinct
date(a.datetime) datetime,
(select max(b.datetime) from bike b where b.datetime between date(a.datetime) and (date(a.datetime) + interval 1 day - interval 1 second)),
(select min(c.datetime) from bike c where c.datetime between date(a.datetime) and (date(a.datetime) + interval 1 day - interval 1 second))
from bike a
Instead of comparing date(b.datetime), this compares the actual b.datetime against a range calculated from a.datetime, so the index on datetime can be used. Hopefully this helps you out and does not make things murkier.
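The same range idea can also be written as a half-open interval, which avoids the "minus one second" step and still works if the column stores fractional seconds. A small sketch for a single day, using the bike table from the question; the date literal is only an example value:

```sql
-- Sketch only: '2017-03-01' is an example date, not a value from the question.
SELECT MIN(datetime) AS session_start,
       MAX(datetime) AS session_end
FROM bike
WHERE datetime >= '2017-03-01'                    -- start of the day, inclusive
  AND datetime <  '2017-03-01' + INTERVAL 1 DAY;  -- start of the next day, exclusive
```

Because the column itself is not wrapped in a function, the comparison can be resolved from an index on datetime.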

Mysql maximum rows in a variable timeframe

I'm making a fitness logbook where indoor rowers can log their results.
To make it interesting and motivating I'm implementing an achievement system.
I would like an achievement that is awarded if someone rows more than 90 times within 24 weeks.
Does anybody have some hints on how I can implement this in MySQL?
The MySQL table for the logbook is pretty straightforward: id, userid, date (timestamp), etc. (the rest is omitted because it doesn't really matter).
The gist is that the first row date and the last one can't be more than 24 weeks apart.
I assume from your application that you want the most recent 24 weeks.
In mysql, you do this as:
select lb.userid
from logbook lb
where datediff(now(), lb.date) <= 7*24
group by userid
having count(*) >= 90
If you need it for an arbitrary 24-week period, can you modify the question?
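For an arbitrary 24-week period, one option is a self-join that treats every logbook row as a possible start of a window and counts the same user's rows that fall inside it. This is a sketch that assumes the logbook(id, userid, date) layout from the question and keeps the >= 90 threshold used in the query above (use > 90 if "more than 90" is meant strictly).

```sql
-- Sketch only: assumes logbook(id, userid, date) as described in the question.
SELECT DISTINCT a.userid
FROM logbook AS a
JOIN logbook AS b
  ON b.userid = a.userid
 AND b.date >= a.date                     -- window starts at row a (a counts itself)
 AND b.date <  a.date + INTERVAL 24 WEEK  -- and ends 24 weeks later
GROUP BY a.userid, a.id                   -- one group per candidate window start
HAVING COUNT(*) >= 90;
```

The self-join compares each row with the user's later rows, so it can become expensive on large tables. An index on (userid, date) would likely help.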
Just run an SQL query that counts the number of rows a user has between now and 24 weeks ago. This is a pretty straightforward query to run.
Look at using something with DATEDIFF in MySQL to get the difference between now and 24 weeks ago.
Once you have a script set up to do this, set up a cron job to run it every day or every week to automate the check.
I think you should create a table achievers which you populate with the achievers of each day.
You can set up a recurring event (daily, right before midnight) in which you run a query like this:
delete from achievers;
-- assumes achievers has a userid column
insert into achievers (userid)
select userid
from logbook
where date < current_timestamp and date > current_timestamp - interval 24 week
group by userid
having count(*) >= 90;
For events in mysql: http://dev.mysql.com/doc/refman/5.1/en/events-overview.html
This query lists the users with at least 90 sessions in the last 24 weeks:
select userid, count(id) as sessions from logbook where `date` BETWEEN DATE_SUB( CURDATE( ) ,INTERVAL 168 DAY ) AND CURDATE( ) group by userid having count(id) >= 90

Select rows that are less than 5 minutes old using DATE_SUB

I have a table that is getting hundreds of requests per minute. The issue that I'm having is that I need a way to select only the rows that have been inserted in the past 5 minutes. I am trying this:
SELECT count(id) as count, field1, field2
FROM table
WHERE timestamp > DATE_SUB(NOW(), INTERVAL 5 MINUTE)
ORDER BY timestamp DESC
My issue is that it returns 70k+ results and counting. I am not sure what it is that I am doing wrong, but I would love to get some help on this. In addition, if there were a way to group them by minute to have it look like:
| count | field1 | field2 |
----------------------------
I'd love the help and direction on this, so please let me know your thoughts.
You don't really need DATE_ADD/DATE_SUB; date arithmetic is much simpler:
SELECT COUNT(id), DATE_FORMAT(`timestamp`, '%Y-%m-%d %H:%i')
FROM `table`
WHERE `timestamp` >= CURRENT_TIMESTAMP - INTERVAL 5 MINUTE
GROUP BY 2
ORDER BY 2
The following seems like it would work which is mighty close to what you had:
SELECT
MINUTE(date_field) as `minute`,
count(id) as count
FROM `table`
WHERE date_field > date_sub(now(), interval 5 minute)
GROUP BY MINUTE(date_field)
ORDER BY MINUTE(date_field);
Note the added column to show the minute and the GROUP BY clause that gathers the results into the corresponding minute. Imagine that you had 5 little buckets labeled with the last 5 minutes. Now imagine you toss each row into the bucket that matches its minute. count() will then count the number of entries found in each bucket. That's a quick visualization of how GROUP BY works. http://www.tizag.com/mysqlTutorial/mysqlgroupby.php seems to be a decent writeup on GROUP BY if you need more info.
If you run that and the number of entries in each minute seems too high, you'll want to do some troubleshooting. Try replacing COUNT(id) with MAX(date_field) and MIN(date_field) so you can get an idea what kind of dates it is capturing. If MIN() and MAX() are inside the range, you may have more data written to your database than you realize.
You might also double check that you don't have dates in the future as they would all be > now(). The MIN()/MAX() checks mentioned above should identify that too if it's a problem.
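The troubleshooting checks described above could be combined into one query, sketched here with the same placeholder names (`table`, date_field) used in the answer:

```sql
-- Sketch only: `table` and date_field are the placeholder names from the answer.
SELECT
    MIN(date_field)        AS oldest_row,     -- should be no older than 5 minutes
    MAX(date_field)        AS newest_row,     -- should not be later than NOW()
    COUNT(*)               AS rows_in_window,
    SUM(date_field > NOW()) AS future_rows    -- rows dated in the future
FROM `table`
WHERE date_field > NOW() - INTERVAL 5 MINUTE;
```

In MySQL a comparison evaluates to 1 or 0, so SUM(date_field > NOW()) counts the rows whose timestamp lies in the future.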