MySQL: Cascade filtering by date intervals - mysql

Can anyone help me with this type of query?
I have:
posts table
comments table
They are linked through comments.post_id = posts.post_id.
A user can filter posts by comments for the past:
1 hour
24 hours
2 days, etc.
If the user chooses to show posts for the past 1 hour, but there are no posts in this period, we need to go step by step:
Select posts for the past 1 hour; if empty, for the past 24 hours; if empty, for the past 2 days; if empty, since inception (without any conditions).
Could anyone please help me build such a query?
UPD
"Filter posts by comments" means sort by comments count.
So the actual goal is the request "Show me posts sorted by the count of comments that have been left in the past XXX hours".
And if "for the past hour" is selected but there are no posts with comments left in the past hour, we need to fetch posts with comments left in the past 24 hours (sorted by comments count), and so on.
Tables structure
Posts:
post_id
title
content
date_added
Comments
comment_id
content
post_id
date_added
So link is posts.post_id = comments.post_id.
I would like to get the following result when a user views the most commented posts for the past hour:
posts.post_id | comments_count | posts.date_added | group
---------------+----------------+------------------+----------------
156 | 8 | 2013-04-02 | hour
154 | 3 | 2013-04-02 | hour
129 | 1 | 2013-03-10 | 24 hours
13 | 14 | 2013-02-18 | 48 hours
138 | 6 | 2013-03-29 | week
137 | 4 | 2013-03-29 | week
161 | 21 | 2013-04-11 | month
6 | 2 | 2013-01-24 | year
103 | 8 | 2013-03-02 | since inception
Results sorted by:
At the top of the list are the 2 posts that have been commented on during the past hour, ordered by comments count.
Next we place posts that have been commented on during the past day.
Next, posts commented on during the past 2 days.
Then posts commented on during the past week, again ordered by comments count.
For the past month
For the past year
At the end of the list we need to place articles that were commented on more than a year ago, also ordered by comments count.
Thanks in advance.

Calculate the date of the most recent comment for each post, then use it to choose which time group applies. You can do this calculation with a subquery:
select p.*, c.*
from posts p join
comments c
on p.post_id = c.post_id join
(select post_id, max(date_added) as date_added
from comments
group by post_id
) cmax
on cmax.post_id = p.post_id
where (case when timestampdiff(minute, cmax.date_added, now()) <= 60
then timestampdiff(minute, c.date_added, now()) <= 60
when timestampdiff(minute, cmax.date_added, now()) <= 60*24
then timestampdiff(minute, c.date_added, now()) <= 60*24
. . .
end)
The syntax for the time comparison depends on whether the values are stored as timestamps or datetimes.

If you want the top 5 posts in the last hour, assuming your date_added fields are timestamps, you can use:
SELECT posts.post_id,
count(comment_id) as comments_count,
posts.date_added, 'hour' as grouptime
FROM posts
INNER JOIN comments
ON posts.post_id = comments.post_id
WHERE TIMESTAMPDIFF(HOUR, comments.date_added, NOW()) = 0
GROUP BY posts.post_id
ORDER BY count(comment_id) DESC
LIMIT 5
If you want all of them, just remove LIMIT. For the last 24 hours:
SELECT posts.post_id,
count(comment_id) as comments_count,
posts.date_added, '24 hours' as grouptime
FROM posts
INNER JOIN comments
ON posts.post_id = comments.post_id
WHERE TIMESTAMPDIFF(HOUR, comments.date_added, NOW()) < 24
GROUP BY posts.post_id
ORDER BY count(comment_id) DESC
LIMIT 5
and so on for different time periods.
If you want to get all of them in one go, use UNION between these queries.
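The step-by-step fallback from the question (past hour, then 24 hours, then 2 days, then everything) can also be handled in application code by running the same query with progressively wider windows. The following is only a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 as a stand-in for MySQL (so the date arithmetic uses SQLite's datetime() rather than TIMESTAMPDIFF), and the names WINDOWS, most_commented and cascade_demo_db are hypothetical.

```python
import sqlite3

# Hypothetical fallback ladder: (label, window in hours); None means no time limit.
WINDOWS = [("hour", 1), ("24 hours", 24), ("48 hours", 48), ("since inception", None)]


def most_commented(conn, now, start_label="hour"):
    """Return (label, rows) for the first window, from start_label on, that has commented posts."""
    labels = [label for label, _ in WINDOWS]
    for label, hours in WINDOWS[labels.index(start_label):]:
        if hours is None:
            where, params = "", ()
        else:
            # SQLite syntax (assumption); in MySQL this would be a NOW() - INTERVAL test.
            where, params = "WHERE c.date_added > datetime(?, ?)", (now, f"-{hours} hours")
        rows = conn.execute(
            f"""
            SELECT p.post_id, COUNT(c.comment_id) AS comments_count
            FROM posts p
            JOIN comments c ON c.post_id = p.post_id
            {where}
            GROUP BY p.post_id
            ORDER BY comments_count DESC, p.post_id
            """,
            params,
        ).fetchall()
        if rows:
            return label, rows
    return WINDOWS[-1][0], []


def cascade_demo_db():
    """Build a tiny in-memory database with made-up rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE posts (post_id INTEGER PRIMARY KEY, title TEXT, date_added TEXT);
        CREATE TABLE comments (comment_id INTEGER PRIMARY KEY, post_id INTEGER, date_added TEXT);
        INSERT INTO posts VALUES (1, 'a', '2013-04-01 00:00:00'), (2, 'b', '2013-04-01 00:00:00');
        INSERT INTO comments VALUES
            (1, 1, '2013-04-02 02:00:00'),
            (2, 2, '2013-04-02 03:00:00'),
            (3, 2, '2013-04-02 04:00:00');
    """)
    return conn
```

Passing the current time in as a parameter instead of calling NOW() inside the query keeps the result reproducible, at the cost of one query per attempted window.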

Related

Select nearest date in the interval

I'm trying to select rows in which 3+ posts fall within an interval of 14 days.
For example:
User | id_post | date
1 | 12 | 2018-01-01
1 | 13 | 2018-01-05
1 | 14 | 2018-01-21
1 | 15 | 2018-01-27
1 | 16 | 2018-01-29
2 | 17 | 2018-01-01
2 | 18 | 2018-01-20
2 | 19 | 2018-02-17
2 | 20 | 2018-03-07
2 | 21 | 2018-04-29
User = OwnerUserId
date = CreationDate
In this case I need to return just User 1, because he has posts that fall within 14 days.
Please help me figure out how to get this. Thank you.
Update: A user should have posts that were published within an interval of 14 days. The overall span can be longer; for example, if the last post is in 2019 but in 2018 there were 3 posts published within 14 days, that's OK.
Now I have the following (data from data.stackexchange, Stack Overflow), which I tried to apply:
select OwnerUserId from Posts as p
where OwnerUserId in (select Users.id from Users WHERE YEAR (Users.CreationDate) >= 2017)
AND YEAR (p.CreationDate) >= 2018
AND p.Tags like '%sql%'
join (select OwnerUserId, CreationDate as startdate, dateadd(day,14,CreationDate) as enddate
from Posts) as r
on p.OwnerUserId = r.OwnerUserId and p.CreationDate between r.startdate and r.enddate
group by p.OwnerUserId, CreationDate
having count(*) >= 3
but it replies
Incorrect syntax near the keyword 'join'.
Incorrect syntax near the keyword 'as'.
I'm a beginner here and in SQL, so I don't exactly know how to combine my previous filter with the current join on dates.
I'll not tell you the solution, but give you some pseudo-code and you figure out how to code it in SQL-
a) You should restrict your data for just 14 days.
b) Now, make groupings by User and find the count of records/lines present (for each User).
c) Now, again do a filter check to find users whose count of records is greater than 3.
Now, tell us which SQL keywords will be used for each points above.
I think something like
select p.user_id
from posts p
join (select user_id, xdate start_date, date_add(xdate, interval 14 day) end_date
from posts) r
on p.user_id = r.user_id and p.xdate between r.start_date and r.end_date
group by p.user_id, r.start_date
having count(*) >= 3
can help. It may not be the best possible solution, but it works.
Check it on SQL Fiddle
If you just want to select a user's posts by id, you may try:
Select id_post, date from yourtable where user = 2 order by id DESC limit 10;
You should have a column called id with auto-increment, so that newer posts get a higher id and a descending sort starts with the most recent post. That column should also be indexed.
If you don't want to use the above method, you can filter by a date range instead, like this:
$date = time() - (3600*24); // 24 hours ago
Select id_post, title from mytable where add_date > 'value of $date'
In both cases you should have an index on the user id.
The second query is what you need, but you should compute the date first and then apply it to the query.
First, I think you mean user 1 not 2.
In MySQL 8+, this is pretty easy. If you want the first such post:
select t.*
from (select t.*,
lead(date, 2) over (partition by user order by date) as next_date2
from t
) t
where next_date2 <= date + interval 14 day;
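To make the LEAD(date, 2) idea concrete: after sorting a user's posts by date, each post is compared with the post two positions later, and if that later post is at most 14 days away, three posts fall inside the window. Below is a small Python sketch of the same logic outside the database; the function name users_with_burst and the SAMPLE list (copied from the question's table) are illustrative assumptions only.

```python
from collections import defaultdict
from datetime import date, timedelta


def users_with_burst(posts, count=3, days=14):
    """posts: iterable of (user, post_date). Return users with `count` posts within `days` days."""
    by_user = defaultdict(list)
    for user, post_date in posts:
        by_user[user].append(post_date)

    window = timedelta(days=days)
    found = set()
    for user, dates in by_user.items():
        dates.sort()
        # Pair each post with the one (count - 1) positions later, like LEAD(date, 2).
        for first, later in zip(dates, dates[count - 1:]):
            if later - first <= window:
                found.add(user)
                break
    return found


# Sample rows from the question: (User, date).
SAMPLE = [
    (1, date(2018, 1, 1)), (1, date(2018, 1, 5)), (1, date(2018, 1, 21)),
    (1, date(2018, 1, 27)), (1, date(2018, 1, 29)),
    (2, date(2018, 1, 1)), (2, date(2018, 1, 20)), (2, date(2018, 2, 17)),
    (2, date(2018, 3, 7)), (2, date(2018, 4, 29)),
]
```

On the sample data only user 1 qualifies, through the posts dated 2018-01-21, 2018-01-27 and 2018-01-29.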

How to determine a query for a specific time interval in MySQL?

I'm running some crontabs which trigger R scripts where I load Google Analytics data for a specific time interval. Usually it's the interval:
Today - 1 to Today - 14 days, which corresponds to the following statement:
subset(mydata, date >= Sys.Date()-14 & date <= Sys.Date()-1)
I would like to add a MySQL query to that R script in order to get some data which uses the same time interval. My tables have the following form:
`pictures` `music` `likes`
id date_of_upload id pictures_id id pictures_id
1 2012-01-16 50 1283 287 12
2 2012-02-17 25 736 2366 39
... ... ... ... ... ...
6000 2016-01-23
My query has the following form where I would like to meet the upper time interval:
SELECT
COUNT(p.id) AS pictures,
COUNT(m.id) AS songs,
COUNT(l.id) AS likes,
CAST(p.date_of_upload AS DATE) AS Posted
FROM pictures p
LEFT JOIN
music m ON p.id = m.pictures_id
LEFT JOIN
likes l ON p.id = l.pictures_id
WHERE p.date_of_upload > DATE_ADD(CURRENT_DATE(), INTERVAL - 14 DAY)
But that doesn't seem to be the right implementation for the time interval.
The required output may look as following:
posted songs likes picture
2016-01-23 20 30 3
2016-01-22 10 8 1
2016-01-21
...
2016-01-07
I think the simplest solution is to use COUNT(DISTINCT):
SELECT COUNT(DISTINCT p.id) AS pictures,
COUNT(DISTINCT m.id) AS songs,
COUNT(DISTINCT l.id) AS likes,
CAST(p.date_of_upload AS DATE) AS Posted
FROM pictures p LEFT JOIN
music m
ON p.id = m.pictures_id LEFT JOIN
likes l
ON p.id = l.pictures_id
WHERE p.date_of_upload > DATE_ADD(CURRENT_DATE(), INTERVAL - 14 DAY)
GROUP BY CAST(p.date_of_upload AS DATE)
The problem is probably that you are getting a Cartesian product between the music and likes tables: a separate row for each combination of picture, song, and like.
COUNT(DISTINCT) is the easiest way, but it is inefficient when the joins produce a large number of rows.
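The row multiplication is easy to reproduce on a tiny data set. The sketch below is an illustration only: it uses Python's built-in sqlite3 rather than MySQL, with made-up rows (one picture, two songs, three likes), and the function name count_demo is hypothetical.

```python
import sqlite3


def count_demo():
    """Return (COUNT songs, COUNT DISTINCT songs, COUNT likes, COUNT DISTINCT likes)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE pictures (id INTEGER PRIMARY KEY, date_of_upload TEXT);
        CREATE TABLE music (id INTEGER PRIMARY KEY, pictures_id INTEGER);
        CREATE TABLE likes (id INTEGER PRIMARY KEY, pictures_id INTEGER);
        INSERT INTO pictures VALUES (1, '2016-01-23');
        INSERT INTO music VALUES (10, 1), (11, 1);
        INSERT INTO likes VALUES (20, 1), (21, 1), (22, 1);
    """)
    # Two songs joined with three likes yield 2 * 3 = 6 rows for the single picture.
    return conn.execute("""
        SELECT COUNT(m.id), COUNT(DISTINCT m.id), COUNT(l.id), COUNT(DISTINCT l.id)
        FROM pictures p
        LEFT JOIN music m ON m.pictures_id = p.id
        LEFT JOIN likes l ON l.pictures_id = p.id
        GROUP BY p.date_of_upload
    """).fetchone()
```

Here the plain counts both come out as 6, while the DISTINCT counts give the real 2 songs and 3 likes.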

Mysql Unique records, where multiple records exist

I am struggling with a Mysql call and was hoping to borrow your expertise.
I believe that what I want may only be possible using two selects and I have not yet done one of these and am struggling to wrap my head around this.
I have a table like so:
+------------------+----------------------+-------------------------------+
| username | acctstarttime | acctstoptime |
+------------------+----------------------+-------------------------------+
| bill | 22.04.2014 | 23.04.2014 |
+------------------+----------------------+-------------------------------+
| steve | 16.09.2014 | |
+------------------+----------------------+-------------------------------+
| fred | 12.08.2014 | |
+------------------+----------------------+-------------------------------+
| bill | 24.04.2014 | |
+------------------+----------------------+-------------------------------+
I wish to select only unique records from the username column, i.e. I only want one record for bill, and I need the one with the most recent start_date, provided it wasn't in the last three months (end_date is not important to me here); otherwise I do not want any data. In summary, I just need anyone whose most recent start date is over 3 months old.
The command I am using currently is:
SELECT DISTINCT(username), ra.acctstarttime AS 'Last IP', ra.acctstoptime
FROM radacct AS ra
WHERE ra.acctstarttime < DATE_SUB(now(), interval 3 month)
GROUP BY ra.username
ORDER BY ra.acctstarttime DESC
However, this simply gives me details about the date_start for that particular customer where they had a start date over 3 months ago.
I have tried a few other combinations of this and have tried a command with a double select, but I'm currently hitting brick walls. Any help or a push in the right direction would be much appreciated.
Update
I have created the following:
http://sqlfiddle.com/#!2/f47b2/1
Effectively I should only see 1 row when the query is as it should be. This would be the row for bill. As he is the only one that does not have a start date within the last three months. The result I would expect to see is the following:
24 bill April, 11 2014 12:11:40+0000 (null)
As this is the latest start date for bill, but this start date is not within the last three months. Hopefully this will help clarify. Many thanks for your help thus far.
http://sqlfiddle.com/#!2/f47b2/14
This is another example. If the acctstartdate for bill would show as the April entry, then I could add my where clause for the last three months and this would give me my desired result.
SQLFiddle
http://sqlfiddle.com/#!2/444432/9 (MySQL 5.5)
I am looking at the question in 2 ways based on the current text:
I only want one record for bill and I need the one with most recent start_date, providing they were in the last three months (end_date is not important to me here) else I do not want any data
Structure
create table test
(
username varchar(20),
date_start date
);
Data
Username date_start
--------- -----------
bill 2014-09-25
bill 2014-09-22
bill 2014-05-26
andy 2014-05-26
tim 2014-09-25
tim 2014-05-26
What we want
Username date_start
--------- -----------
bill 2014-09-25
tim 2014-09-25
Query
select *
from test a
inner join
(
select username, max(date_start) as max_date_start
from test
where date_start > date_sub(now(), interval 3 month)
group by username
) b
on
a.username = b.username
and a.date_start = b.max_date_start
where
date_start > date_sub(now(), interval 3 month)
Explanation
For the most recent last 3 months, let's get maximum start date for each user. To limit the records to the latest 3 months we use where date_start > date_sub(now(), interval 3 month) and to find the maximum start date for each user we use group by username.
We, then, join main data with this small subset based on user and max date to get the desired result.
Another angle
If we desire to NOT look at the latest 3 months and instead find the most recent date for each user, we would be looking at this kind of data:
What we want
Username date_start
--------- -----------
bill 2014-05-26
tim 2014-05-26
andy 2014-05-26
Query
select *
from test a
inner join
(
select username, max(date_start) as max_date_start
from test
where date_start < date_sub(now(), interval 3 month)
group by username
) b
on
a.username = b.username
and a.date_start = b.max_date_start
where
date_start < date_sub(now(), interval 3 month)
Hopefully you can change these queries to your liking.
EDIT
Based on your good explanation, here's the query
SQLFiddle: http://sqlfiddle.com/#!2/f47b2/17
select *
from activity a
-- find max dates for users among records older than 3 months
inner join
(
select username, max(acctstarttime) as max_date_start
from activity
where acctstarttime < date_sub(now(), interval 3 month)
group by username
) b
on
a.username = b.username
and a.acctstarttime = b.max_date_start
-- find usernames who have data in the recent three months
left join
(
select username, count(*)
from activity
where acctstarttime >= date_sub(now(), interval 3 month)
group by username
) c
on
a.username = c.username
where
acctstarttime < date_sub(now(), interval 3 month)
-- choose users who DONT have data from recent 3 months
and c.username is null
Let me know if you would like me to add explanation
Try this:
select t.*
from radacct t
join (
select ra.username, max(ra.acctstarttime) as acctstarttime
from radacct as ra
WHERE ra.acctstarttime < DATE_SUB(now(), interval 3 month)
group by ra.username
) s on t.username = s.username and t.acctstarttime = s.acctstarttime
SQLFiddle
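A different formulation of the same requirement, not taken from the answers above, is to group by username and keep only the groups whose latest start date is older than the cutoff (GROUP BY with HAVING MAX(...)). The sketch below assumes ISO-formatted dates and uses Python's built-in sqlite3 in place of MySQL; the function names inactive_users and radacct_demo_db and the fixed "now" value are hypothetical.

```python
import sqlite3


def inactive_users(conn, now):
    """Users whose most recent acctstarttime is more than 3 months before `now`."""
    return conn.execute("""
        SELECT username, MAX(acctstarttime) AS last_start
        FROM radacct
        GROUP BY username
        HAVING MAX(acctstarttime) < date(?, '-3 months')
        ORDER BY username
    """, (now,)).fetchall()


def radacct_demo_db():
    """Rows from the question's table, rewritten as ISO dates (an assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE radacct (username TEXT, acctstarttime TEXT, acctstoptime TEXT);
        INSERT INTO radacct VALUES
            ('bill',  '2014-04-22', '2014-04-23'),
            ('steve', '2014-09-16', NULL),
            ('fred',  '2014-08-12', NULL),
            ('bill',  '2014-04-24', NULL);
    """)
    return conn
```

With `now` set to 2014-10-01 only bill is returned, with 2014-04-24 as his latest start date. Fetching the other columns of that row would still need a join back to the table, as in the queries above.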

MySQL query help with grouping and adding

I have a table called user_logins which tracks user logins into the system. It has three columns, login_id, user_id, and login_time
login_id(INT) | user_id(INT) | login_time(TIMESTAMP)
------------------------------------------------------
1 | 4 | 2010-8-14 08:54:36
1 | 9 | 2010-8-16 08:56:36
1 | 9 | 2010-8-16 08:59:19
1 | 3 | 2010-8-16 09:00:24
1 | 1 | 2010-8-16 09:01:24
I am looking to write a query that will determine the number of unique logins for each day that has a login, and only for the past 30 days from the current date. The output should look like this:
logins(INT) | login_date(DATE)
---------------------------
1 | 2010-8-14
3 | 2010-8-16
In the result table, 2010-8-16 only has 3 because user_id 9 logged in twice that day, and his logging into the system only counts as 1 login for that day. I am only looking for unique logins for a particular day. Remember, I only want the past 30 days, so it's like a snapshot of the last month of user logins for a system.
I have attempted to create the query with little success. What I have so far is this:
SELECT
DATE(login_time) as login_date,
COUNT(login_time) as logins
FROM
user_logins
WHERE
login_time > (SELECT DATE(SUBDATE(NOW())-1)) FROM DUAL)
AND
login_time < LAST_DAY(NOW())
GROUP BY FLOOR(login_time/86400)
I know this is wrong: it returns all logins only starting from the beginning of the current month and doesn't group them correctly. Some direction on how to do this would be greatly appreciated. Thank you.
You need to use COUNT(DISTINCT ...):
SELECT
DATE(login_time) AS login_date,
COUNT(DISTINCT user_id) AS logins
FROM user_logins
WHERE login_time > NOW() - interval 30 day
GROUP BY DATE(login_time)
I was a little unsure what you wanted for your WHERE clause because your question seems to contradict itself. You may need to modify the WHERE clause depending on what you want.
As Mark suggests you can use COUNT(DISTINCT...
Alternatively:
SELECT login_day, COUNT(*)
FROM (
SELECT DATE_FORMAT(login_time, '%D %M %Y') AS login_day,
user_id
FROM user_logins
WHERE login_time>DATE_SUB(NOW(), INTERVAL 1 MONTH)
GROUP BY DATE_FORMAT(login_time, '%D %M %Y'),
user_id
) AS daily
GROUP BY login_day
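As a quick check of the COUNT(DISTINCT ...) approach against the sample data from the question, here is a small sketch using Python's built-in sqlite3 instead of MySQL. The function names, the extra old row, and the fixed "now" value are assumptions for illustration; the login_id values are made unique and the dates are written in ISO format.

```python
import sqlite3


def daily_unique_logins(conn, now):
    """Count distinct users per calendar day over the 30 days before `now`."""
    return conn.execute("""
        SELECT date(login_time) AS login_date, COUNT(DISTINCT user_id) AS logins
        FROM user_logins
        WHERE login_time > datetime(?, '-30 days')
        GROUP BY date(login_time)
        ORDER BY login_date
    """, (now,)).fetchall()


def logins_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE user_logins (login_id INTEGER PRIMARY KEY, user_id INTEGER, login_time TEXT);
        INSERT INTO user_logins VALUES
            (1, 4, '2010-08-14 08:54:36'),
            (2, 9, '2010-08-16 08:56:36'),
            (3, 9, '2010-08-16 08:59:19'),
            (4, 3, '2010-08-16 09:00:24'),
            (5, 1, '2010-08-16 09:01:24'),
            (6, 4, '2010-06-01 10:00:00');  -- older than 30 days, should be ignored
    """)
    return conn
```

User 9 logs in twice on 2010-08-16 but is counted once, so that day reports 3 logins, matching the output requested in the question.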

How to select a column value that corresponds to a row returned by a MySQL aggregate function?

I have a table like
date user_id page_id
2010-06-19 16:00:00 1 4
2010-06-19 16:00:00 3 4
2010-06-20 07:10:00 1 1
2010-06-20 12:00:10 1 2
2010-06-20 12:00:10 1 3
2010-06-20 13:05:00 2 1
2010-06-20 14:10:00 3 1
2010-06-21 17:00:00 2 1
I want to write a query that will return the last page_id for those users who haven't visited in the last day.
So, I can find who hasn't visited in the last day with:
SELECT user_id, MAX(page_id)
FROM page_views GROUP BY user_id
HAVING MAX(date) < DATE_SUB(NOW(), INTERVAL 1 DAY);
However, how can I find the last viewed page_id for these users? i.e. I want to know which page_id corresponds to the value in the same row as MAX(date). In the case where there are multiple page views per date, I can just select the MAX(page_id).
The expected output from above should be (if NOW() returns 2010-06-21 18:00:00):
user_id page_id
1 3
3 1
user_id 1 last visited over a day ago
at 2010-06-20 12:00:10, and the
MAX(page_id) was 3.
user_id 2 last
visited less than a day ago, so they
are ignored.
user_id 3 last visited
over a day ago, and their most recent
page_id was 1.
How can I achieve this? I need to use only SQL. I'm using a MySQL derivative that requires all columns in the SELECT clause to be declared in the GROUP BY clause (it's a little more standards compliant).
Thanks.
I could see different approaches.
For example:
select a.user_id, max(a.page_id) as page_id
from page_views a
inner join (SELECT user_id, MAX(date) as date
FROM page_views GROUP BY user_id
HAVING MAX(date) < DATE_SUB(NOW(), INTERVAL 1 DAY) ) b on a.user_id = b.user_id
and a.date = b.date
group by a.user_id
This could be implemented more efficiently in MS SQL or Oracle with window functions.
Another idea:
select a.user_id, a.page_id
from page_views a
where date < DATE_SUB(NOW(), INTERVAL 1 DAY)
and not exists(select 1 from page_views b
where a.user_id = b.user_id and b.date > a.date)
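The join-back approach can be checked against the question's sample rows and expected output. The sketch below is an illustration under assumptions: it runs on Python's built-in sqlite3 rather than MySQL, takes "now" as a parameter (2010-06-21 18:00:00 in the question), and resolves ties on the last date with MAX(page_id) as the asker allows. The function names are hypothetical.

```python
import sqlite3


def last_pages_of_inactive(conn, now):
    """Last viewed page_id for users with no page view in the day before `now`."""
    return conn.execute("""
        SELECT v.user_id, MAX(v.page_id) AS page_id
        FROM page_views v
        JOIN (
            SELECT user_id, MAX(date) AS last_date
            FROM page_views
            GROUP BY user_id
            HAVING MAX(date) < datetime(?, '-1 days')
        ) latest ON latest.user_id = v.user_id AND latest.last_date = v.date
        GROUP BY v.user_id
        ORDER BY v.user_id
    """, (now,)).fetchall()


def page_views_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE page_views (date TEXT, user_id INTEGER, page_id INTEGER);
        INSERT INTO page_views VALUES
            ('2010-06-19 16:00:00', 1, 4),
            ('2010-06-19 16:00:00', 3, 4),
            ('2010-06-20 07:10:00', 1, 1),
            ('2010-06-20 12:00:10', 1, 2),
            ('2010-06-20 12:00:10', 1, 3),
            ('2010-06-20 13:05:00', 2, 1),
            ('2010-06-20 14:10:00', 3, 1),
            ('2010-06-21 17:00:00', 2, 1);
    """)
    return conn
```

Every selected column is either grouped or aggregated, which fits the question's constraint that non-aggregated SELECT columns must appear in GROUP BY.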