Calculate momentum as weighted average by recency - mysql

I have a subscriptions table with an associated feed_id and creation timestamp. A feed has N subscriptions.
It's easy enough to show the most popular feeds using a group query to count the number of records with each feed_id. But I want to calculate momentum to show most trending feeds.
A simplified algorithm would be:
momentum of feed_id =
10 * (count of subscriptions with created_at in past day)
+ 5 * (count of subscriptions with created_at from 2-7 days ago)
+ 1 * (count of subscriptions with created_at from 7-28 days ago)
How can something like this be done in a single (My)SQL query instead of doing it with 3 queries and programmatically summing the results?

You can use conditional aggregation for this. MySQL treats booleans as integers, with true being "1", so you can just sum the expression for time.
I am guessing it looks something like this:
select feed_id,
(10 * sum(created_at >= date_sub(now(), interval 1 day)) +
5 * sum(created_at >= date_sub(now(), interval 7 day) and
created_at < date_sub(now(), interval 1 day)) +
1 * sum(created_at >= date_sub(now(), interval 28 day) and
created_at < date_sub(now(), interval 7 day))
) as momentum
from subscriptions
group by feed_id
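To try the conditional aggregation above without a MySQL server, here is a minimal sketch in Python with SQLite, which also evaluates comparisons to 0 or 1. The helper name momentum_by_feed and the fixed now argument are made up for illustration, and the cutoffs are passed in as parameters rather than computed with date_sub(now(), ...), so treat it as an approximation of the MySQL query rather than a drop-in replacement.

```python
import sqlite3
from datetime import datetime, timedelta


def momentum_by_feed(rows, now):
    """rows: iterable of (feed_id, created_at datetime). Returns {feed_id: momentum}."""
    fmt = lambda ts: ts.isoformat(sep=" ", timespec="seconds")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE subscriptions (feed_id INTEGER, created_at TEXT)")
    conn.executemany(
        "INSERT INTO subscriptions VALUES (?, ?)",
        [(feed_id, fmt(ts)) for feed_id, ts in rows],
    )
    # Cutoffs for 1, 7 and 28 days ago; same-format timestamps compare correctly as text.
    params = {f"d{n}": fmt(now - timedelta(days=n)) for n in (1, 7, 28)}
    query = """
        SELECT feed_id,
               10 * SUM(created_at >= :d1)
              + 5 * SUM(created_at >= :d7 AND created_at < :d1)
              + 1 * SUM(created_at >= :d28 AND created_at < :d7) AS momentum
        FROM subscriptions
        GROUP BY feed_id
    """
    result = dict(conn.execute(query, params).fetchall())
    conn.close()
    return result
```

Each comparison contributes 1 when true and 0 when false, so every SUM is a per-bucket count and the weights are applied in one pass over the table.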

SELECT 10*SUM(IF(created_at >= CURDATE(), 1, 0)) +
5*SUM(IF(created_at BETWEEN DATE_SUB(CURDATE(), INTERVAL 7 DAY) AND DATE_SUB(CURDATE(), INTERVAL 1 DAY), 1, 0)) +
1*SUM(IF(created_at BETWEEN DATE_SUB(CURDATE(), INTERVAL 28 DAY) AND DATE_SUB(CURDATE(), INTERVAL 8 DAY), 1, 0))
FROM ...
I'm not 100% sure I've caught the edge conditions (yesterday or 8 days ago) to get exactly the right count. You'll want to test that.
If you're interested in 24-hour periods then just substitute NOW() for CURDATE() and everything will go to DATETIME.

Related

MySQL select complete last month

How to select all data from last month (or 30 days)?
I already found some answers, and most of them give this solution:
SELECT *
FROM gigs
WHERE date > DATE_SUB(CURDATE(), INTERVAL 3 MONTH)
ORDER BY date DESC
But this also gives me dates in the future.
I am only interested in the days from the last month or the last 30 days (not next month and beyond).
Is this what you want?
WHERE date > DATE_SUB(CURDATE(), INTERVAL 1 MONTH) AND date <= CURRENT_DATE
I added a condition so the query filters on date not greater than today. I also modified your code so the date range starts one month ago (you had 3 months).
Try this code:
SELECT * FROM gigs
WHERE date BETWEEN CURDATE() - INTERVAL 30 DAY AND CURDATE()
ORDER BY date DESC
You are asking for two separate things.
The last 30 days is easy.
date between now() - interval 30 day and now()
Data this month is like this:
date between (last_day(Now() - INTERVAL 1 MONTH) + INTERVAL 1 DAY) and last_day(Now())
Data a few months ago is like this:
date between (last_day(Now() - INTERVAL 4 MONTH) + INTERVAL 1 DAY)
and
last_day(Now() - INTERVAL 3 MONTH)
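To make the difference between "the last 30 days" and "a calendar month" concrete, here is a small sketch of the two ranges in Python. The function names are hypothetical, and both ranges are returned as inclusive (first day, last day) pairs, matching how BETWEEN treats its bounds.

```python
from datetime import date, timedelta


def last_30_days(today):
    """Rolling window: 30 days back up to and including today."""
    return today - timedelta(days=30), today


def previous_calendar_month(today):
    """First and last day of the month before the one containing today."""
    first_of_this_month = today.replace(day=1)
    last_of_previous = first_of_this_month - timedelta(days=1)
    return last_of_previous.replace(day=1), last_of_previous
```

For a date in the middle of a month the two ranges overlap only partly, which is why the question needs to pick one of them.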

ORDER BY based on multiple WHERE cases, is this possible?

Events can be a 1 day event or be an on-going event. This means that sometimes events can go for multiple days, weeks, or months.
As it is now, it is possible to sort the query result by END in ascending order (those expiring earlier shows first) or START in ASC (events based on start date). However, in both cases I have limitations that I am trying to reduce as much as I can.
When sorting by END, sometimes events that are ongoing and have already started get pushed to later in the list.
When sorting by START, events that have already started and are ongoing end up taking up the first sections of the list.
Is it possible to chain multiple ORDER BY statements based on logic rather than columns?
For example:
Get events that are expiring within the next 7 days:
SELECT * FROM data WHERE end < NOW() + INTERVAL 7 DAY;
Get events that are still ongoing between 7 days from today and ending within 14 days:
SELECT * FROM data WHERE NOW() + INTERVAL 7 DAY >= start AND end < NOW() + INTERVAL 14 DAY;
Get all remaining events...
SELECT * FROM data WHERE end >= NOW() + INTERVAL 14 DAY;
Basically, is it possible to join these into one query?
SELECT * FROM data
ORDER BY (logic 1), (logic 2), (logic 3);
Alternatively, I did get it working with running 3 separate queries and building up the result array on the server-side, but would like to simplify my code if possible.
Hoping that an end result will always show a list of events that will be expiring within 7 days first, then events that are happening between 7 - 14 days (could be starting or ongoing), then events that are still ongoing or starting after 14 days from today.
Depending on your SQL database, you can do something like this:
SELECT * FROM data
WHERE (end < NOW() + INTERVAL 7 DAY) -- logic 1
or (NOW() + INTERVAL 7 DAY >= start AND end < NOW() + INTERVAL 14 DAY) -- logic 2
or (end >= NOW() + INTERVAL 14 DAY) -- logic 3
order by
case
when (end < NOW() + INTERVAL 7 DAY) then 1
when (NOW() + INTERVAL 7 DAY >= start AND end < NOW() + INTERVAL 14 DAY) then 2
when (end >= NOW() + INTERVAL 14 DAY) then 3
else 4
end asc
;
You may also use union all:
SELECT 1 as sort_order, data.* FROM data
WHERE (end < NOW() + INTERVAL 7 DAY)
union all
SELECT 2 as sort_order, data.* FROM data
WHERE (NOW() + INTERVAL 7 DAY >= start AND end < NOW() + INTERVAL 14 DAY)
union all
SELECT 3 as sort_order, data.* FROM data
where end >= NOW() + INTERVAL 14 DAY
The sort_order is probably not needed, but if your results do not come back in the order of the selects, you can wrap the union in a subquery and sort on it. Your database might also forbid using order by inside a union all.
select *
from (
SELECT 1 as sort_order, data.* FROM data
WHERE (end < NOW() + INTERVAL 7 DAY)
union all
SELECT 2 as sort_order, data.* FROM data
WHERE (NOW() + INTERVAL 7 DAY >= start AND end < NOW() + INTERVAL 14 DAY)
union all
SELECT 3 as sort_order, data.* FROM data
where end >= NOW() + INTERVAL 14 DAY
) as t order by sort_order asc -- and any other key
I would personally go for the union all if possible (it is more readable).
You could try using a UNION where you select the data set and have a column that has the order you want. e.g.
SELECT 1 as orderby, data.* FROM data WHERE end < NOW() + INTERVAL 7 DAY
UNION ALL
SELECT 2, data.* FROM data WHERE NOW() + INTERVAL 7 DAY >= start AND end < NOW() + INTERVAL 14 DAY
UNION ALL
SELECT 3, data.* FROM data WHERE end >= NOW() + INTERVAL 14 DAY
ORDER BY orderby, end
P.S. I would suggest not using SQL keywords such as end for column names in your database; that can sometimes cause issues. enddate would be a better column name.
P.P.S. Avoid doing SELECT *, it is better to explicitly list the columns that you want.
I assume you are using MySQL.
Use case when .. then .. end in the select clause, then order by that column.
select *, case
when end < NOW() + INTERVAL 7 DAY then 1
when NOW() + INTERVAL 7 DAY >= start AND end < NOW() + INTERVAL 14 DAY then 2
else 3 end as priority
from data
order by priority
Also, you can use the case in the order by clause.
Note: I didn't check your business logic, so test it well. This only shows how you can achieve it. Hope it helps.
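For anyone who wants to experiment with the CASE-based ordering, here is a rough sketch that runs it against an in-memory SQLite table from Python. The column names start_date and end_date, the name column, and the helper order_events are assumptions for the example (they also sidestep the keyword issue with end), and the 7 and 14 day cutoffs are passed as parameters instead of NOW() + INTERVAL.

```python
import sqlite3
from datetime import datetime, timedelta


def order_events(events, now):
    """events: iterable of (name, start, end) datetimes. Returns [(name, priority)] in display order."""
    fmt = lambda ts: ts.isoformat(sep=" ", timespec="seconds")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (name TEXT, start_date TEXT, end_date TEXT)")
    conn.executemany(
        "INSERT INTO data VALUES (?, ?, ?)",
        [(name, fmt(start), fmt(end)) for name, start, end in events],
    )
    params = {
        "in7": fmt(now + timedelta(days=7)),
        "in14": fmt(now + timedelta(days=14)),
    }
    # The first matching WHEN wins, so each row gets exactly one priority.
    query = """
        SELECT name,
               CASE
                   WHEN end_date < :in7 THEN 1
                   WHEN start_date <= :in7 AND end_date < :in14 THEN 2
                   ELSE 3
               END AS priority
        FROM data
        ORDER BY priority, end_date
    """
    rows = conn.execute(query, params).fetchall()
    conn.close()
    return rows
```

Unlike the union all variants, a row that satisfies more than one condition shows up only once here, because CASE stops at the first branch that matches.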

MySQL- How to query to get rows between certain years

I am trying to get back rows that are between year ranges, such as from 0-5 years, 5-10 years, 10-15 etc.
So far, I've only been able to produce the 0-5 range, but need some help querying for 5-10 years etc.
SELECT *
FROM users
WHERE start_date >= DATE_SUB(NOW(),INTERVAL 5 YEAR)
I've tried using the BETWEEN function, but could be using it incorrectly. Open to suggestions.
I'm not a fan of hard coding values in for the dates because I don't want to go back every few years and change it.
Assuming start_date is DATE datatype (not DATETIME or TIMESTAMP)
five years ago up to today
WHERE start_date > DATE(NOW()) + INTERVAL -5 YEAR
AND start_date <= DATE(NOW()) + INTERVAL 0 YEAR
ten years ago up to five years ago
WHERE start_date > DATE(NOW()) + INTERVAL -10 YEAR
AND start_date <= DATE(NOW()) + INTERVAL -5 YEAR
You can use BETWEEN.
SELECT *
FROM users
WHERE (start_date BETWEEN DATE_SUB(NOW(), INTERVAL 5 YEAR) AND NOW())
and then for your next interval:
SELECT *
FROM users
WHERE (start_date BETWEEN DATE_SUB(NOW(), INTERVAL 10 YEAR) AND DATE_SUB(NOW(), INTERVAL 5 YEAR))
0 - 5
SELECT * FROM users
WHERE
start_date BETWEEN (CURDATE() - INTERVAL 5 YEAR) AND CURDATE();
5 - 10
SELECT * FROM users
WHERE
start_date BETWEEN (CURDATE() - INTERVAL 10 YEAR) AND (CURDATE() - INTERVAL 5 YEAR);
10 - 15
SELECT * FROM users
WHERE
start_date BETWEEN (CURDATE() - INTERVAL 15 YEAR) AND (CURDATE() - INTERVAL 10 YEAR);
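Since the ranges follow a fixed pattern, the bucket a row belongs to can also be computed instead of written out per range. Below is a minimal sketch of that idea in Python; years_elapsed and bucket_label are hypothetical helpers, not something from the answers above, and they count whole elapsed years.

```python
from datetime import date


def years_elapsed(start, today):
    """Whole years between start and today (anniversary not yet reached counts as one less)."""
    years = today.year - start.year
    if (today.month, today.day) < (start.month, start.day):
        years -= 1
    return years


def bucket_label(start, today, width=5):
    """Label such as '0-5' or '5-10' for the bucket that start falls into."""
    low = (years_elapsed(start, today) // width) * width
    return f"{low}-{low + width}"
```

One difference worth knowing: BETWEEN includes both ends, so a start_date exactly 5 years ago matches both the 0-5 and the 5-10 query above, while this helper puts it in exactly one bucket (5-10).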

MySQL select all dates that are an increment of x days

Is it possible to query for all dates in the future that are an increment of x days?
i.e.
SELECT *
FROM bookings
WHERE date >= CURDATE()
AND
(
date = CURDATE() + INTERVAL 6 DAY
OR date = CURDATE() + INTERVAL 12 DAY
OR date = CURDATE() + INTERVAL 18 DAY
etc.
)
Something like:
SELECT
*
FROM bookings
WHERE
date >= CURDATE()
AND
DATEDIFF(date, CURDATE()) % 6 = 0
Datediff returns the number of days difference, and % 6 says return the remainder when divided by six.
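The same remainder check is easy to verify outside the database. A small sketch in Python, where is_on_increment is a made-up helper and step defaults to the 6 days from the question:

```python
from datetime import date, timedelta


def is_on_increment(d, today, step=6):
    """True when d is today or a whole multiple of step days in the future."""
    delta = (d - today).days
    return delta >= 0 and delta % step == 0
```

Note that a difference of 0 also has remainder 0, so today itself matches; add a check for delta > 0 if only strictly future dates are wanted.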
Yes.
Your logic is flawed, though. You probably meant
SELECT *
FROM table
WHERE
date = CURDATE() + INTERVAL 6 DAY
OR date = CURDATE() + INTERVAL 12 DAY
OR date = CURDATE() + INTERVAL 18 DAY
And don't use table names like "table" and field names like "date" (i.e. reserved words).

select all records created within the hour

startTimestamp < date_sub(curdate(), interval 1 hour)
Will the (sub)query above return all records created within the hour? If not will someone please show me a correct one? The complete query may look as follows:
select * from table where startTimestamp < date_sub(curdate(), interval 1 hour);
Rather than CURDATE(), use NOW() and use >= rather than < since you want timestamps to be greater than the timestamp from one hour ago. CURDATE() returns only the date portion, where NOW() returns both date and time.
startTimestamp >= date_sub(NOW(), interval 1 hour)
For example, in my timezone it is 12:28
SELECT NOW(), date_sub(NOW(), interval 1 hour);
2011-09-13 12:28:53 2011-09-13 11:28:53
All together, what you need is:
select * from table where startTimestamp >= date_sub(NOW(), interval 1 hour);
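To see why CURDATE() gives the wrong cutoff, here is a short sketch in Python that mimics both expressions. The function names are invented for the example, and cutoff_curdate assumes that subtracting an hour from a date starts from midnight of that day.

```python
from datetime import datetime, timedelta


def cutoff_now(now):
    """Equivalent of date_sub(NOW(), interval 1 hour)."""
    return now - timedelta(hours=1)


def cutoff_curdate(now):
    """Equivalent of date_sub(CURDATE(), interval 1 hour): midnight today minus one hour."""
    midnight = datetime.combine(now.date(), datetime.min.time())
    return midnight - timedelta(hours=1)


def created_within_hour(ts, now):
    """True when ts is no older than one hour before now."""
    return ts >= cutoff_now(now)
```

With the timestamp from the example above, the NOW() cutoff is 11:28:53 on the same day, while the CURDATE() cutoff lands at 23:00:00 on the previous day.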