MySQL listing all entries within x days of first entry

I have a table orders with the columns id, user_id, created_on and paid_amount. I'm trying to find the entries for each user_id within the first 7 days of their first order. Here's what I have so far:
SELECT user_id, created_on, paid_amount FROM orders WHERE created_on BETWEEN min(created_on) AND DATE_ADD(MIN(created_on), INTERVAL 7 DAY) GROUP BY user_id
I'm guessing that the problem lies in the fact that the BETWEEN condition is applied to a single aggregated value instead of the whole table? How could I fix this?
My ultimate goal is to find out the average amount spent by all users within their first 7 days, but I think I can figure out the rest of the steps myself.

This will give you the records from the first 7 days for each user_id:
SELECT orders.* FROM orders
INNER JOIN (
select user_id, min(created_on) as mindt from orders group by user_id
) t
ON orders.user_id = t.user_id AND orders.created_on <= DATE_ADD(t.mindt, INTERVAL 7 DAY)
ORDER BY user_id, created_on
For the average paid_amount for each user in their first 7 days, use this:
SELECT orders.user_id, avg(paid_amount) FROM orders
INNER JOIN (
select user_id, min(created_on) as mindt from orders group by user_id
) t
ON orders.user_id = t.user_id AND orders.created_on <= DATE_ADD(t.mindt, INTERVAL 7 DAY)
group by orders.user_id
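For the stated end goal (one average across all users), a possible sketch is to total each user's spend in their first 7 days and then average those totals. It assumes the orders table from the question; the aliases first_week, total_paid and avg_first_week_spend are made up for illustration. It also uses a strict `<` upper bound so that an order placed exactly 7 days after the first one falls outside the window, which may or may not be what you want.

```sql
-- Sketch only: average, across users, of each user's total spend
-- during the 7 days that start at their first order.
SELECT AVG(first_week.total_paid) AS avg_first_week_spend
FROM (
    SELECT o.user_id, SUM(o.paid_amount) AS total_paid
    FROM orders o
    INNER JOIN (
        -- first order per user
        SELECT user_id, MIN(created_on) AS mindt
        FROM orders
        GROUP BY user_id
    ) t ON o.user_id = t.user_id
       AND o.created_on < DATE_ADD(t.mindt, INTERVAL 7 DAY)  -- half-open window
    GROUP BY o.user_id
) first_week;
```

Every user has at least their first order inside the window, so no user drops out of the outer average.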

Related

MySQL Get Orders From Last 12 Weeks Monday to Sunday

I have a table that stores each order made by a user, recording the date it was made, the amount and the user id. I am trying to create a query that returns the weekly transactions from Monday to Sunday for the last 12 weeks for a particular user. I am using the following query:
SELECT COUNT(*) AS Orders,
SUM(amount) AS Total,
DATE_FORMAT(transaction_date,'%m/%Y') AS Week
FROM shop_orders
WHERE user_id = 123
AND transaction_date >= now()-interval 3 month
GROUP BY YEAR(transaction_date), WEEKOFYEAR(transaction_date)
ORDER BY DATE_FORMAT(transaction_date,'%m/%Y') ASC
This produces the following result:
This however does not return the weeks where the user has made 0 orders, does not sum the orders from Monday to Sunday and does not return the weeks ordered from 1 to 12. Is there a way to achieve these things?
One way to accomplish this is with a self outer join (in this case I use a right outer join, but a left outer join would work as well).
To start your weeks on Monday, subtract the result of WEEKDAY from your column transaction_date with DATE_SUB, as proposed in the most upvoted answer here.
SELECT
COALESCE(t1.Orders, 0) AS `Orders`,
COALESCE(t1.Total, 0) AS `Total`,
t2.Week AS `Week`
FROM
(
SELECT
COUNT(*) AS `Orders`,
SUM(amount) AS `Total`,
DATE(DATE_SUB(transaction_date, INTERVAL(WEEKDAY(transaction_date)) DAY)) AS `Week`
FROM
shop_orders
WHERE 1=1
AND user_id = 123
AND transaction_date >= NOW() - INTERVAL 12 WEEK
GROUP BY
3
) t1 RIGHT JOIN (
SELECT
DATE(DATE_SUB(transaction_date, INTERVAL(WEEKDAY(transaction_date)) DAY)) AS `Week`
FROM
shop_orders
WHERE
transaction_date >= NOW() - INTERVAL 12 WEEK
GROUP BY
1
ORDER BY
1
) t2 USING (Week)
To return the weeks with no orders, you have to create a table containing all the weeks and outer join against it.
For the ordering, order by the same fields you use in the GROUP BY.
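If you are on MySQL 8.0 or later, the list of weeks does not have to come from a real table. A hedged sketch using a recursive CTE follows; the names weeks, n and week_start are assumptions, and it reuses shop_orders and user_id = 123 from the question. Unlike the self join above, it still returns a row for weeks in which no user placed any order.

```sql
-- Sketch only (requires MySQL 8.0+ for WITH RECURSIVE).
-- Builds the Monday of the current week and of the 11 weeks before it.
WITH RECURSIVE weeks (n, week_start) AS (
    SELECT 0, DATE_SUB(CURDATE(), INTERVAL WEEKDAY(CURDATE()) DAY)
    UNION ALL
    SELECT n + 1, DATE_SUB(week_start, INTERVAL 7 DAY)
    FROM weeks
    WHERE n < 11
)
SELECT w.week_start                 AS week_start,
       COUNT(o.transaction_date)    AS Orders,   -- 0 when nothing matched
       COALESCE(SUM(o.amount), 0)   AS Total
FROM weeks w
LEFT JOIN shop_orders o
       ON o.user_id = 123
      AND o.transaction_date >= w.week_start
      AND o.transaction_date <  DATE_ADD(w.week_start, INTERVAL 7 DAY)
GROUP BY w.week_start
ORDER BY w.week_start;
```

The user filter sits in the ON clause rather than in WHERE so that empty weeks are kept by the LEFT JOIN.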

Return active users in the last 30 days for each day

I have a table, activity that looks like the following:
date | user_id |
Thousands of users and multiple dates and activity for all of them. I want to pull a query that will, for every day in the result, give me the total active users in the last 30 days. The query I have now looks like the following:
select date, count(distinct user_id) from activity where date > date_sub(date, interval 30 day) group by date
This gives me total unique users on only that day; I can't get it to give me the last 30 for each date. Help is appreciated.
To do this you need a list of the dates and join that against the activities.
As such this should do it. A sub query to get the list of dates and then a count of user_id (or you could use COUNT(*) as I presume user_id cannot be null):-
SELECT date_ranges.date, COUNT(user_id)
FROM
(
SELECT DISTINCT date, DATE_ADD(date, INTERVAL -30 DAY) AS date_minus_30
FROM activity
) date_ranges
INNER JOIN activity
ON activity.date BETWEEN date_ranges.date_minus_30 AND date_ranges.date
GROUP BY date_ranges.date
However if there can be multiple records for a user_id on any particular date but you only want the count of unique user_ids on a date you need to count DISTINCT user_id (although note that if a user id occurs on 2 different dates within the 30 day date range they will only be counted once):-
SELECT date_ranges.date, COUNT(DISTINCT user_id)
FROM
(
SELECT DISTINCT date, DATE_ADD(date, INTERVAL -30 DAY) AS date_minus_30
FROM activity
) date_ranges
INNER JOIN activity
ON activity.date BETWEEN date_ranges.date_minus_30 AND date_ranges.date
GROUP BY date_ranges.date
A bit cruder would be to just join the activity table against itself based on the date range and use COUNT(DISTINCT ...) to just eliminate the duplicates:-
SELECT b.date, COUNT(DISTINCT a.user_id)
FROM activity a
INNER JOIN activity b
ON a.date BETWEEN DATE_ADD(b.date, INTERVAL -30 DAY) AND b.date
GROUP BY b.date

Return a zero for a day with no results

I have a query which returns the total number of users who registered on each day. The problem is that if no one registered on a given day, it doesn't return any value; it just skips that day. I would rather it returned zero.
This is my query so far:
SELECT count(*) total FROM users WHERE created_at < NOW() AND created_at >
DATE_SUB(NOW(), INTERVAL 7 DAY) AND owner_id = ? GROUP BY DAY(created_at)
ORDER BY created_at DESC
Edit
I grouped the data so I would get a count for each day. As for the date range, I wanted the total users registered for the previous seven days.
A variation on the theme "build your own 7 day calendar inline":
SELECT D, count(created_at) AS total FROM
(SELECT DATE_SUB(NOW(), INTERVAL D DAY) AS D
FROM
(SELECT 0 as D
UNION SELECT 1
UNION SELECT 2
UNION SELECT 3
UNION SELECT 4
UNION SELECT 5
UNION SELECT 6
) AS D
) AS D
LEFT JOIN users ON date(created_at) = date(D) AND owner_id = ?
-- the owner filter belongs in the join; in WHERE it would drop days that only have other owners' users
GROUP BY D
ORDER BY D DESC
I don't have your table structure at hand, so this will probably need some adjustment. Likewise, I use NOW() as the reference date, but that's easily changed. Anyway, that's the spirit...
See http://sqlfiddle.com/#!2/ab5cf/11 for a live demo.
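If you had a table that held all of your days, you could do a left join from there to your users table. (The query below uses SQL Server's CONVERT(DATE, ...); in MySQL, DATE(U.Created_at) plays the same role.)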
SELECT SUM(CASE WHEN U.Id IS NOT NULL THEN 1 ELSE 0 END)
FROM DimDate D
LEFT JOIN Users U ON CONVERT(DATE,U.Created_at) = D.DateValue
WHERE YourCriteria
GROUP BY YourGroupBy
The tricky bit is that you group by the date field in your data, which might have 'holes' in it, and thus miss records for that date.
A way to solve it is by filling a table with all dates for the past 10 and next 100 years or so, and to (outer)join that to your data. Then you will have one record for each day (or week or whatever) for sure.
I had to do this only for MS SqlServer, so how to fill a date table (or perhaps you can do it dynamically) is for someone else to answer.
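One possible way to fill such a date table in MySQL 8.0 or later is a recursive CTE. This is only a sketch: the table name calendar, the column dt and the date range are assumptions for illustration, while users, created_at and owner_id come from the question.

```sql
-- Hypothetical calendar table; name, column and range are illustrative.
CREATE TABLE calendar (dt DATE PRIMARY KEY);

-- The default recursion limit is 1000 levels, so raise it for a multi-year range.
SET SESSION cte_max_recursion_depth = 5000;

INSERT INTO calendar (dt)
WITH RECURSIVE days (dt) AS (
    SELECT DATE('2020-01-01')
    UNION ALL
    SELECT DATE_ADD(dt, INTERVAL 1 DAY) FROM days WHERE dt < '2029-12-31'
)
SELECT dt FROM days;

-- Daily registration counts for the last 7 days, zeros included.
SELECT c.dt, COUNT(u.created_at) AS total
FROM calendar c
LEFT JOIN users u
       ON DATE(u.created_at) = c.dt
      AND u.owner_id = ?
WHERE c.dt BETWEEN CURDATE() - INTERVAL 6 DAY AND CURDATE()
GROUP BY c.dt
ORDER BY c.dt DESC;
```

COUNT(u.created_at) counts only matched rows, so a day without registrations yields 0 instead of disappearing.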
A bit long winded, but I think this will work...
SELECT count(users.created_at) total FROM
(SELECT DATE_SUB(CURDATE(),INTERVAL 6 DAY) as cdate UNION ALL
SELECT DATE_SUB(CURDATE(),INTERVAL 5 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(),INTERVAL 4 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(),INTERVAL 3 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(),INTERVAL 2 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(),INTERVAL 1 DAY) UNION ALL
SELECT CURDATE()) t1 left join users
ON date(created_at)=t1.cdate AND owner_id = ?
GROUP BY t1.cdate
ORDER BY t1.cdate DESC
It differs from your query slightly in that it works on dates rather than the datetimes your query compares. From your description I have assumed you mean to use whole days, and have therefore used dates.

Mysql subtracting values using row selected from min timestamp, grouping by id

I've been at this for a few hours now to no avail, pulling my hair out.
Edit: I want to calculate the difference in the overall_exp column against the same data from 1 day ago, to find the greatest 'gain' for each user.
Currently I take a row, select a row from 1 day ago based on the first row's timestamp, subtract the overall_exp values of the 2 rows, and order by that result while grouping by user_id.
SQL Fiddle: http://sqlfiddle.com/#!2/501c8
Here is what I currently have; however, the logic is completely wrong, so I'm pulling 0 results:
SELECT rsn, ts.timestamp, @original_ts := SUBDATE( ts.timestamp, INTERVAL 1 DAY), ts.overall_exp, ts.overall_exp - previous.overall_exp AS gained_exp
FROM tracker AS ts
INNER JOIN (
SELECT user_id, MIN( TIMESTAMP ) , overall_exp
FROM tracker
WHERE TIMESTAMP >= @original_ts
GROUP BY user_id
) previous
ON ts.user_id = previous.user_id
JOIN users
ON ts.user_id = users.id
GROUP BY ts.user_id
ORDER BY gained_exp DESC
You can do this with a self-join:
select t.user_id, max(t.overall_exp - tprev.overall_exp)
from tracker t join
tracker tprev
on tprev.user_id = t.user_id and
date(tprev.timestamp) = date(SUBDATE(t.timestamp, INTERVAL 1 DAY))
group by t.user_id
A key here is converting the timestamps to dates, so the comparison is exact.
Try:
select u.*, max(t.overall_exp)-min(t.overall_exp) gain
from users u
left join tracker t
on u.id = t.user_id and
t.`timestamp` >= date_sub(date(now()), interval 1 day) and
t.`timestamp` < date_add(date(now()), interval 1 day)
group by u.id
order by gain desc
SQLFiddle here.

SQL selecting average score over range of dates

I have 3 tables:
doctors (id, name) -> has_many:
patients (id, doctor_id, name) -> has_many:
health_conditions (id, patient_id, note, created_at)
Every day each patient gets a new health condition with a note from 1 to 10, where 10 means good health (full recovery, if you will).
What I want to extract is the following 3 statistics for the last 30 days (month):
- how many patients got better
- how many patients got worse
- how many patients remained the same
These statistics are global, so right now I don't care about statistics per doctor, which I could extract given the right query.
The trick is that the query needs to extract the current health_condition note and compare with the average of past days (this month without today) so one needs to extract today's note and an average of the other days excluding this one.
I don't think the query needs to define who went up/down/same since I can loop and decide that. Just today vs. rest of the month will be sufficient I guess.
Here's what I have so far which obv. doesn't work because it only returns one result due to the limit applied:
SELECT
p.id,
p.name,
hc.latest,
hcc.average
FROM
pacients p
INNER JOIN (
SELECT
id,
pacient_id,
note as LATEST
FROM
health_conditions
GROUP BY pacient_id, id
ORDER BY created_at DESC
LIMIT 1
) hc ON(hc.pacient_id=p.id)
INNER JOIN (
SELECT
id,
pacient_id,
avg(note) AS average
FROM
health_conditions
GROUP BY pacient_id, id
) hcc ON(hcc.pacient_id=p.id AND hcc.id!=hc.id)
WHERE
date_part('epoch',date_trunc('day', hcc.created_at))
BETWEEN
(date_part('epoch',date_trunc('day', hc.created_at)) - (30 * 86400))
AND
date_part('epoch',date_trunc('day', hc.created_at))
The query has all the logic it needs to distinguish between what is latest and average but that limit kills everything. I need that limit to extract the latest result which is used to compare with past results.
Something like this, assuming created_at is of type date:
select p.name,
hc.note as current_note,
av.avg_note
from patients p
join health_conditions hc on hc.patient_id = p.id
join (
select patient_id,
avg(note) as avg_note
from health_conditions hc2
where created_at between current_date - 30 and current_date - 1
group by patient_id
) av on av.patient_id = hc.patient_id
where hc.created_at = current_date;
This is PostgreSQL syntax. I'm not sure if MySQL supports date arithmetic the same way.
Edit:
This should get you the most recent note for each patient, plus the average for the last 30 days:
select p.name,
hc.created_at as last_note_date,
hc.note as current_note,
t.avg_note
from patients p
join health_conditions hc
on hc.patient_id = p.id
and hc.created_at = (select max(created_at)
from health_conditions hc2
where hc2.patient_id = hc.patient_id)
join (
select patient_id,
avg(note) as avg_note
from health_conditions hc3
where created_at between current_date - 30 and current_date - 1
group by patient_id
) t on t.patient_id = hc.patient_id
SELECT SUM(delta < 0) AS worsened,
SUM(delta = 0) AS no_change,
SUM(delta > 0) AS improved
FROM (
SELECT patient_id,
SUM(IF(DATE(created_at) = CURDATE(),note,NULL))
- AVG(IF(DATE(created_at) < CURDATE(),note,NULL)) AS delta
FROM health_conditions
WHERE DATE(created_at) BETWEEN CURDATE() - INTERVAL 1 MONTH AND CURDATE()
GROUP BY patient_id
) t