Optimizing MySQL Query With Complicated Predicate - mysql

Thanks for trying to help. I have a table called posts where there is a row for each post. A post has a post_date (in unix timestamp format) and an author_id.
I am trying to get the number of unique authors that posted 5 or more times within a month grouped by month and year.
The query I have now for unique authors by month and year (without the filter for 5 or more posts within that month) is:
select
month(from_unixtime(p.post_date)) as month,
year(from_unixtime(p.post_date)) as year,
count(distinct p.author_id)
from posts p
group by month,year
order by year desc,month desc
Can you please help me add a filter that will only count authors that had 5 or more posts in that given month?
UPDATE
The following query works but is extremely slow, even when adding predicates for just this month and year. Can someone think of a better way to do it?
select
month(from_unixtime(p.post_date)) as month,
year(from_unixtime(p.post_date)) as year,
count(distinct p.author_id)
from posts p
where (
select count(*) from posts p2 where p2.author_id=p.author_id
and month(from_unixtime(p.post_date))=month(from_unixtime(p2.post_date))
and year(from_unixtime(p.post_date))=year(from_unixtime(p2.post_date))
)>=5
group by month,year order by year desc,month desc

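You can use HAVING with a count of 5 or more. Grouping by author as well as by month and year returns one row per author for each month in which that author posted at least 5 times:
select p.author_id,
month(from_unixtime(p.post_date)) as month,
year(from_unixtime(p.post_date)) as year,
count(*) as c
from posts p
group by p.author_id, month, year
having c >= 5
order by year desc, month desc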
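To get the number of such authors per month, which is what the question asks for, the grouped query can be wrapped in an outer query. This is only a sketch and has not been run against the asker's data: the aliases post_month, post_year, q and authors_with_5_plus are made up for illustration, and the commented-out range filter uses example dates and assumes an index on post_date exists.

```sql
-- Inner query: one row per (author, month, year) with at least 5 posts.
-- Outer query: count those rows per month, i.e. the qualifying authors.
select q.post_month, q.post_year, count(*) as authors_with_5_plus
from (
    select p.author_id,
           month(from_unixtime(p.post_date)) as post_month,
           year(from_unixtime(p.post_date)) as post_year
    from posts p
    -- Optional: restrict to one month with a range on the raw column,
    -- so that an index on post_date (assumed) can be used.
    -- The dates below are example values only:
    -- where p.post_date >= unix_timestamp('2020-01-01')
    --   and p.post_date <  unix_timestamp('2020-02-01')
    group by p.author_id, post_month, post_year
    having count(*) >= 5
) q
group by q.post_month, q.post_year
order by q.post_year desc, q.post_month desc;
```

Unlike the correlated subquery in the question, which is evaluated for every row of posts, this form aggregates the table once and then counts the grouped rows.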

Related

mysql GROUP BY but do not count with values that are already included in previous row

I have this SELECT, but how can I avoid counting a user_id again if it performed an action in 2017 and also in 2020? If a user_id appears in 2017, it should not be counted in 2020. Is it possible to write this in MySQL? I just need each group to contain only user_ids that do not exist in the other groups. Thank you for any help :)
SELECT YEAR(l.datetime_created) AS year,
COUNT(l.user_id) as count_of_users
FROM users_locations AS l
GROUP BY month;
Perhaps you are looking for COUNT DISTINCT?
Just add distinct.
SELECT YEAR(l.datetime_created) AS year,
-- add distinct term
COUNT(distinct(l.user_id)) as count_of_users
FROM users_locations AS l
GROUP BY YEAR(l.datetime_created);
If you want by month:
SELECT YEAR(l.datetime_created) AS year,
MONTH(l.datetime_created) AS month,
-- add distinct term
COUNT(distinct(l.user_id)) as count_of_users
FROM users_locations AS l
GROUP BY YEAR(l.datetime_created), MONTH(l.datetime_created)
order by YEAR(l.datetime_created), MONTH(l.datetime_created);
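COUNT(DISTINCT ...) removes duplicates within a year, but a user active in both 2017 and 2020 is still counted in both groups. If the goal is to count each user only in the earliest year they appear, one possible approach is to compute that first year per user and then group on it. A sketch under that assumption; the alias first_year and the derived table name f are illustrative:

```sql
-- Inner query: the first year in which each user has a row.
-- Outer query: how many users first appeared in each year.
SELECT f.first_year AS year,
       COUNT(*) AS count_of_users
FROM (
    SELECT l.user_id,
           MIN(YEAR(l.datetime_created)) AS first_year
    FROM users_locations AS l
    GROUP BY l.user_id
) AS f
GROUP BY f.first_year
ORDER BY f.first_year;
```

Each user_id contributes to exactly one row of the result, so the per-year counts add up to the total number of distinct users.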

Subquery returns more than 1 row, how can I solve?

Question: Show the category of competitions that have always been hosted in the same country during May 2010. What is wrong with my query?
select Category
from competition
where Date >= '2010-01-01' and Date <= '2010-12-31'
group by Country, Category
having count(*) = (select count(*)
from competition
where Date >= '2010-01-01' and Date <= '2010-12-31'
group by Category)
You don't need two queries. Just use one query that checks that the count of countries is 1.
select category, count(DISTINCT country) AS country_count
from competition
where Date BETWEEN '2010-05-01' and '2010-05-31'
group by Category
HAVING country_count = 1
I also corrected the dates to be just May, not the whole year 2010.
Remove the GROUP BY from the subquery in your HAVING clause; that is what makes it return more than one row. If you give me an example dataset and the output you want, I can help you more.
I'd try something like this to start with:
SELECT COUNTRY
, CATEGORY
, COUNT(COUNTRY)
FROM COMPETITION
WHERE DATE BETWEEN '2010-05-01' AND '2010-05-31'
GROUP BY COUNTRY, CATEGORY
ORDER BY CATEGORY DESC, COUNT(COUNTRY) DESC
;
Your original query's date limits cover the whole year of 2010, but you specified you only wanted May 2010. If the Date column is a date or datetime type, you may need to cast the string to the appropriate datatype, depending on the database; MySQL converts such string literals implicitly when comparing them with a date column.
Your question asked "always hosted by one country" - do you know that a competition is only going to be hosted by one country during that particular month? If you do, you're pretty much done. If you don't, however, then you need to clarify what your criteria really are.

Get number of orders made for each hour in a certain year

In an e-commerce website, I have a database that contains the following fields:
id
date_purchased
...
The field date_purchased has the following format: 2018-02-14 16:27:37 (year-month-day hours:minutes:seconds)
I would like to get, for the year 2018 for example, ordered by ASC, the number of orders made for each hour.
I can't figure out how to order by a certain year, and count the number of orders made each hour of that year.
Something like :
SELECT count(*)
FROM table
WHERE (
SELECT *
FROM table
GROUP BY DATEPART(hour, [date_purchased]) ASC
)
GROUP BY year(date_purchased) ASC
I think the following would fulfill your requirements in MySQL:
SELECT HOUR(date_purchased) hour_purchased, count(*) hour_Count
FROM yourtable
WHERE YEAR(date_purchased) = 2018
GROUP BY HOUR(date_purchased)
ORDER BY hour_count DESC
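A possible variant of the same query, offered as a sketch: it filters the year with a range on the raw column instead of YEAR(date_purchased), which lets MySQL use an index on date_purchased if one exists (an assumption), and it sorts by hour ascending as the question requested. The placeholder table name yourtable is kept from the answer above.

```sql
-- Orders per hour of day (0-23) for 2018, sorted by hour.
SELECT HOUR(date_purchased) AS hour_purchased,
       COUNT(*) AS hour_count
FROM yourtable
WHERE date_purchased >= '2018-01-01'
  AND date_purchased <  '2019-01-01'  -- half-open range covers all of 2018
GROUP BY HOUR(date_purchased)
ORDER BY hour_purchased ASC;
```

This returns at most 24 rows, one per hour of the day that has at least one order. If a row for every hour of every day is wanted instead, DATE(date_purchased) would have to be added to the SELECT and GROUP BY lists.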

MySQL query - latest month/year

My SQL isn't the greatest, obviously, but what I'm trying to do is get the latest date in a database by finding the maximum year and month in an entry. Right now I have:
select max(Month), max(Year) from posts where postID = 25;
...but that results in the latest month and the latest year, though they're not part of the same entry. How can I make sure month and year are from one entry and not separate?
SELECT Month, Year FROM posts WHERE postID = 25 ORDER BY Year DESC, Month DESC LIMIT 1

MySQL Aggregate function in other aggregate function

I have a table of posts, like (id int, date datetime).
How can I select the average number of posts per day for each month with one SQL query?
Thank you!
This should do it for you:
select month, avg(posts_per_day)
from (select day(date), month(date) as month, count(*) as posts_per_day
from posts group by 1,2) x
group by 1
Explanation: Because you are doing an aggregate on an aggregate, there is no getting around doing a query on a query:
The inner query calculates the number per day and captures the month.
The outer query averages this count, grouping by month.
You can get the number of posts per month like this:
SELECT COUNT(*) AS num_posts_per_month FROM posts GROUP BY YEAR(date), MONTH(date);
Now we need the number of days in a month, which DAY(LAST_DAY(...)) returns:
SELECT YEAR(date) AS year, MONTH(date) AS month,
COUNT(*) / DAY(LAST_DAY(MIN(date))) AS avg_over_month
FROM posts GROUP BY YEAR(date), MONTH(date);
This will get the average number of posts per day during the calendar month of the post. That is, averages during the current month will continue to rise until the end of the month. If you want real averages during the current month, you have to put in a conditional to get the true number of elapsed days.
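The conditional mentioned above could look roughly like the following. This is a sketch, assuming the table is named posts with a date column as in the question; the aliases post_year, post_month and avg_per_day are illustrative.

```sql
-- Average posts per day for each month.
-- For the current month, divide by the days elapsed so far;
-- for earlier months, divide by the full number of days in the month.
SELECT YEAR(date)  AS post_year,
       MONTH(date) AS post_month,
       COUNT(*) / IF(LAST_DAY(MIN(date)) >= CURDATE(),
                     DAY(CURDATE()),            -- days elapsed this month
                     DAY(LAST_DAY(MIN(date))))  -- days in a finished month
           AS avg_per_day
FROM posts
GROUP BY YEAR(date), MONTH(date);
```

MIN(date) is used only to pick some date inside each group, so that LAST_DAY can be applied to an aggregated value. Unlike the nested query earlier, this divides by calendar days, so days without any posts pull the average down.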