Get top item for each year - mysql

I have a table with some records. Using MySQL, I can get a result grouped by period (year) and user, ordered by period and then by number of species in descending order.
SELECT YEAR(entry_date) AS period, uid AS user, COUNT(DISTINCT pid) AS species
FROM records
WHERE YEAR(entry_date)<YEAR(CURDATE())
GROUP BY period, uid
ORDER by period, species DESC
Please see attached picture of the result. But what if I only want to get the TOP USER (and number of species) for EACH year (the red marked rows)? How can I achieve that?
I can handle this later in my PHP code, but it would be nice to have it sorted out already in the MySQL query.
Thanks for your help!

If you are running MySQL 8.0, you can use RANK() to rank records in years partitions by their count of species, and then filter on the top record per group:
SELECT *
FROM (
    SELECT
        YEAR(entry_date) AS period,
        uid AS user,
        COUNT(DISTINCT pid) AS species,
        RANK() OVER(PARTITION BY YEAR(entry_date) ORDER BY COUNT(DISTINCT pid) DESC) rn
    FROM records
    WHERE entry_date < DATE_FORMAT(CURRENT_DATE, '%Y-01-01')
    GROUP BY period, uid
) t
WHERE rn = 1
ORDER by period
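This preserves top ties, if any. Note that this uses an index-friendly filter on the dates in the WHERE clause: comparing entry_date directly, rather than wrapping it in YEAR(), lets MySQL use an index on that column.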
In earlier versions, an equivalent option is to filter with a HAVING clause and a correlated subquery:
SELECT
    YEAR(entry_date) AS period,
    uid AS user,
    COUNT(DISTINCT pid) AS species
FROM records r
WHERE entry_date < DATE_FORMAT(CURRENT_DATE, '%Y-01-01')
GROUP BY period, uid
HAVING COUNT(DISTINCT pid) = (
    SELECT COUNT(DISTINCT r1.pid) species1
    FROM records r1
    WHERE YEAR(r1.entry_date) = period
    GROUP BY r1.uid
    ORDER BY species1 DESC
    LIMIT 1
)
ORDER by period
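To see how RANK() keeps ties per year, here is a minimal runnable sketch. It is not the original setup: it assumes SQLite (via Python's sqlite3, version 3.25 or later for window functions) as a stand-in for MySQL 8.0, so strftime replaces YEAR() and DATE_FORMAT(), the counting is done in a CTE before ranking, and the sample rows and the function name top_users_per_year are made up for illustration.

```python
import sqlite3

# Hypothetical sample rows: (entry_date, uid, pid)
ROWS = [
    ("2018-03-01", 1, 10), ("2018-04-01", 1, 11), ("2018-05-01", 2, 10),
    ("2019-01-15", 1, 10), ("2019-06-01", 1, 12),
    ("2019-02-15", 2, 11), ("2019-07-01", 2, 13),
]

def top_users_per_year(rows):
    """Return (period, user, species) for the top user(s) of each past year."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (entry_date TEXT, uid INTEGER, pid INTEGER)")
    conn.executemany("INSERT INTO records VALUES (?, ?, ?)", rows)
    query = """
        WITH counts AS (
            -- distinct species per user and year, excluding the current year
            SELECT CAST(strftime('%Y', entry_date) AS INTEGER) AS period,
                   uid AS user,
                   COUNT(DISTINCT pid) AS species
            FROM records
            WHERE entry_date < strftime('%Y-01-01', 'now')
            GROUP BY strftime('%Y', entry_date), uid
        )
        SELECT period, user, species
        FROM (
            SELECT counts.*,
                   RANK() OVER (PARTITION BY period ORDER BY species DESC) AS rn
            FROM counts
        ) t
        WHERE rn = 1          -- RANK gives every tied leader rank 1
        ORDER BY period, user
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

With the sample rows, user 1 wins 2018 alone, while users 1 and 2 tie in 2019 with two species each, so both 2019 rows come back.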

Related

Display count of column excluding min and max values

I want to count how many times each activity occurs in the table (FRIENDS) below. Then, I want to print the activities whose number of occurrences is neither the maximum nor the minimum of all counts.
***ID/Name/Activity***
1/James/Horse Riding
2/Eric/Eating
3/Sean/Eating
4/John/Horse Riding
5/Chris/Eating
6/Jessica/Playing
Ex:
Horse Riding occurs 140 times
Playing occurs 170 times
Eating occurs 120 times
Walking occurs 150 times
Running occurs 200 times
The max occurrence here is Running, occurring 200 times, and the minimum occurrence here is Eating, occurring 120 times.
Therefore, I want to display
Horse Riding
Playing
Walking
In no particular order.
This is the code I have so far, but I keep getting a syntax error. When I don't get a syntax error, I get an "Every derived table must have its own alias" error. I am new to SQL, so I appreciate any advice I can get.
SELECT ACTIVITY, count(ACTIVITY) as Occurences FROM FRIENDS,
(SELECT MAX(Occur) AS Ma,MIN(Occur) AS Mi FROM (SELECT ACTIVITY, count(ACTIVITY) as Occur
FROM FRIENDS GROUP by City)) as T
GROUP BY City HAVING Occurences!=T.Ma AND Occurences!=T.Mi ORDER BY Occurences DESC
In MySQL 8.0, you can do this with aggregation and window functions:
select *
from (
    select activity, count(*) cnt,
        rank() over(order by count(*)) rn_asc,
        rank() over(order by count(*) desc) rn_desc
    from mytable
    group by activity
) t
where rn_asc > 1 and rn_desc > 1
The subquery counts the occurrences of each activity and ranks them in both ascending and descending order. All that is left to do is exclude the top and bottom records. If there are ties at the top (or at the bottom), the query evicts all of them.
In earlier versions, an option is a having clause:
select activity, count(*) cnt
from mytable t
group by activity
having count(*) > (select count(*) from mytable group by activity order by count(*) limit 1)
    and count(*) < (select count(*) from mytable group by activity order by count(*) desc limit 1)
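The HAVING variant can be tried out with a small sketch like the one below. It assumes SQLite (through Python's sqlite3) instead of MySQL, a table named friends modeled on the question, and a hypothetical helper middle_activities; the counts are invented.

```python
import sqlite3

def middle_activities(activities):
    """Return (activity, count) pairs whose count is neither the min nor the max."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE friends (id INTEGER PRIMARY KEY, name TEXT, activity TEXT)"
    )
    conn.executemany(
        "INSERT INTO friends (name, activity) VALUES (?, ?)",
        [(f"person{i}", activity) for i, activity in enumerate(activities)],
    )
    query = """
        SELECT activity, COUNT(*) AS cnt
        FROM friends
        GROUP BY activity
        -- first scalar subquery: smallest group size; second: largest group size
        HAVING COUNT(*) > (SELECT COUNT(*) FROM friends
                           GROUP BY activity ORDER BY COUNT(*) LIMIT 1)
           AND COUNT(*) < (SELECT COUNT(*) FROM friends
                           GROUP BY activity ORDER BY COUNT(*) DESC LIMIT 1)
        ORDER BY activity
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Invented data: Eating and Sleeping tie for the minimum, Running is the maximum.
SAMPLE = (["Running"] * 5 + ["Walking"] * 3 + ["Playing"] * 4
          + ["Eating"] + ["Sleeping"])
```

Calling middle_activities(SAMPLE) leaves only Playing and Walking: both activities tied at the minimum are dropped, which matches the behaviour of the window-function version.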

How to get the number of users who have had one transaction per day?

Here is my query:
select count(1) from
(select count(1) num, user_id from pos_transactions pt
where date(created_at) <= '2020-6-21'
group by user_id
having num = 1) x
It gives me the number of users who have had 1 transaction until 2020-6-21. Now I also want to group it by date(created_at). I mean, I want to get a list of dates (such as 2020-6-21, 2020-6-22, etc.) plus the number of users who have had 1 transaction on that date.
Any idea how I can do that?
EDIT: The result of the query above is correct; the issue is that it is manual right now. I mean, I have to increase 2020-6-21 by hand. I want to make it automatic. In other words, I want a list of all dates (from 2020-6-21 until now), each with the number of users who have had 1 transaction up to that date.
If you want the number of users who had one transaction on each day, then you need to aggregate by the date as well:
select dte, count(*)
from (select date(created_at) as dte, user_id
      from pos_transactions pt
      where date(created_at) <= '2020-6-21'
      group by dte, user_id
      having count(*) = 1
     ) du
group by dte;
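As a rough check of the two-level aggregation, the sketch below runs the same shape of query on SQLite through Python's sqlite3 (an assumption; the question is about MySQL). The rows, the omitted date filter, and the function name single_transaction_users_per_day are illustrative only.

```python
import sqlite3

# Invented rows: (user_id, created_at)
TRANSACTIONS = [
    (1, "2020-06-21 09:00:00"), (1, "2020-06-21 10:00:00"),
    (2, "2020-06-21 11:00:00"),
    (1, "2020-06-22 08:00:00"), (2, "2020-06-22 12:30:00"),
    (3, "2020-06-22 13:00:00"), (3, "2020-06-22 14:00:00"),
]

def single_transaction_users_per_day(rows):
    """Per day, count the users who made exactly one transaction that day."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pos_transactions (user_id INTEGER, created_at TEXT)")
    conn.executemany("INSERT INTO pos_transactions VALUES (?, ?)", rows)
    query = """
        SELECT dte, COUNT(*) AS users
        FROM (
            -- inner level: one row per (day, user) with exactly one transaction
            SELECT date(created_at) AS dte, user_id
            FROM pos_transactions
            GROUP BY date(created_at), user_id
            HAVING COUNT(*) = 1
        ) du
        GROUP BY dte          -- outer level: count those users per day
        ORDER BY dte
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

On the sample rows this yields one user for 2020-06-21 and two users for 2020-06-22. Days on which no user has exactly one transaction do not appear in the result at all.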

PHP/MYSQL Displaying hiscores with a group by working but not as intended

The output is grouped by username and the correct score is there, but it won't grab the other data from the same row.
I've tried adding proof to the GROUP BY as well, but it's not working. Grouping by proof just adds another record to the output, showing two hiscores when I really only want one per user.
SELECT MAX(total) AS total, sUsername, proof, approved
FROM userrankings
WHERE category = 0
GROUP BY sUsername
ORDER BY total DESC
This is the output:
Rank: 1
User: Test User
Score: 2414
Proof: html site 1
The score is correct, but the proof should be HTML 2, because the 2nd entry has the highest score, not the first. With GROUP BY sUsername, the query grabs the very first entry rather than the entry I need it to display.
I understand that, for each user, you want to pull out the record that has the highest total.
If you are running MySQL 8.0, you can use the window function ROW_NUMBER() to rank the records of each user in a subquery, and do the filtering in the outer query:
SELECT total, sUsername, proof, approved
FROM (
    SELECT
        total,
        sUsername,
        proof,
        approved,
        ROW_NUMBER() OVER(PARTITION BY sUsername ORDER BY total DESC) rn
    FROM userrankings
    WHERE category = 0
) x WHERE rn = 1
ORDER BY total DESC
On previous versions of MySQL, one solution is to use a correlated subquery with a NOT EXISTS condition to ensure that only the relevant records are displayed, like:
SELECT total, sUsername, proof, approved
FROM userrankings u
WHERE
    category = 0
    AND NOT EXISTS (
        SELECT 1
        FROM userrankings u1
        WHERE
            u1.category = 0
            AND u1.sUsername = u.sUsername
            AND u1.total > u.total
    )
ORDER BY total DESC
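The NOT EXISTS pattern can be exercised with a short sketch such as this one. It assumes SQLite through Python's sqlite3 rather than MySQL, guesses the column types of userrankings, and uses invented rows; best_row_per_user is a hypothetical helper name.

```python
import sqlite3

# Invented rows: (sUsername, total, proof, approved, category)
RANKINGS = [
    ("Test User", 1200, "html site 1", 1, 0),
    ("Test User", 2414, "html site 2", 1, 0),
    ("Other User", 900, "html site 3", 1, 0),
    ("Other User", 5000, "html site 4", 1, 1),  # other category, ignored
]

def best_row_per_user(rows):
    """Return each user's full highest-scoring row within category 0."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE userrankings "
        "(sUsername TEXT, total INTEGER, proof TEXT, approved INTEGER, category INTEGER)"
    )
    conn.executemany("INSERT INTO userrankings VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        SELECT total, sUsername, proof, approved
        FROM userrankings u
        WHERE category = 0
          AND NOT EXISTS (
              -- keep a row only if the same user has no higher total
              SELECT 1
              FROM userrankings u1
              WHERE u1.category = 0
                AND u1.sUsername = u.sUsername
                AND u1.total > u.total
          )
        ORDER BY total DESC
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

For the sample rows, Test User comes back with score 2414 and proof "html site 2". Unlike the ROW_NUMBER() version, this variant returns several rows for a user whose highest total appears more than once.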
You need to GROUP_CONCAT all the proofs:
SELECT MAX(total) AS total, sUsername, GROUP_CONCAT(proof ORDER BY total DESC), approved
FROM userrankings
WHERE category = 0
GROUP BY sUsername
ORDER BY total DESC
Because GROUP_CONCAT orders the proofs by total DESC, the first proof in the concatenated list is the one you are looking for.

MySQL limit 5 per month

I am trying to show the 'top 5' per month by worked hours.
I have the following query:
SELECT
concat(m.firstname, " ",m.lastname) AS name,
SEC_TO_TIME(SUM(TIME_TO_SEC(TIMEDIFF(pl.end_activity,pl.start_activity)))) AS activity,
month(start_activity) AS month,
year(start_activity) AS year
FROM
log AS pl
INNER JOIN
employee AS m
ON
m.employee = pl.employee
GROUP BY
name,
year,
month
ORDER BY
year,
month,
activity
I tried: limit 0,5 but it gives me only the first 5 records overall. How can I show the top 5 records for each month?
In MySQL 8.0.2 and above, we can use window functions. The ROW_NUMBER() window function assigns row numbers within a partition built from the concatenated year and month. Rows within each partition are ordered by activity in descending order.
We can then use this result set as a derived table and keep only row numbers up to 5. This gives us 5 rows per month, those with the highest activity values.
SELECT dt.*
FROM
(
    SELECT
        concat(m.firstname, " ",m.lastname) AS name,
        SEC_TO_TIME(SUM(TIME_TO_SEC(TIMEDIFF(pl.end_activity,pl.start_activity)))) AS activity,
        month(start_activity) AS month,
        year(start_activity) AS year,
        ROW_NUMBER() OVER (PARTITION BY CONCAT(year(start_activity), month(start_activity))
                           ORDER BY SEC_TO_TIME(SUM(TIME_TO_SEC(TIMEDIFF(pl.end_activity,pl.start_activity)))) DESC) AS row_no
    FROM
        log AS pl
    INNER JOIN
        employee AS m
    ON
        m.employee = pl.employee
    GROUP BY
        name,
        year,
        month
) AS dt
WHERE dt.row_no <= 5
ORDER BY
    dt.year,
    dt.month,
    dt.activity
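A simplified, runnable sketch of the same top-N-per-month idea follows. It makes several assumptions that differ from the query above: SQLite (3.25 or later, through Python's sqlite3) instead of MySQL, a single log table with the employee name stored directly, worked time kept as plain seconds instead of SEC_TO_TIME, and partitioning by the year and month columns separately rather than by their concatenation. The rows and the name top_workers_per_month are made up.

```python
import sqlite3

# Invented rows: (employee, start_activity, end_activity)
LOG = [
    ("ann", "2020-01-06 09:00:00", "2020-01-06 17:00:00"),  # 8 h
    ("bob", "2020-01-06 09:00:00", "2020-01-06 12:00:00"),  # 3 h
    ("bob", "2020-01-08 09:00:00", "2020-01-08 10:00:00"),  # 1 h
    ("cat", "2020-01-07 09:00:00", "2020-01-07 14:00:00"),  # 5 h
    ("ann", "2020-02-03 09:00:00", "2020-02-03 11:00:00"),  # 2 h
    ("bob", "2020-02-03 09:00:00", "2020-02-03 15:00:00"),  # 6 h
]

def top_workers_per_month(rows, n):
    """Return the n employees with the most worked seconds in each month."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE log (employee TEXT, start_activity TEXT, end_activity TEXT)"
    )
    conn.executemany("INSERT INTO log VALUES (?, ?, ?)", rows)
    query = """
        WITH monthly AS (
            -- total worked seconds per employee and month
            SELECT employee AS name,
                   strftime('%Y', start_activity) AS yr,
                   strftime('%m', start_activity) AS mon,
                   SUM(CAST(strftime('%s', end_activity) AS INTEGER)
                       - CAST(strftime('%s', start_activity) AS INTEGER)) AS seconds
            FROM log
            GROUP BY employee, strftime('%Y', start_activity),
                     strftime('%m', start_activity)
        )
        SELECT name, yr, mon, seconds
        FROM (
            SELECT monthly.*,
                   ROW_NUMBER() OVER (PARTITION BY yr, mon
                                      ORDER BY seconds DESC, name) AS row_no
            FROM monthly
        ) dt
        WHERE row_no <= ?
        ORDER BY yr, mon, seconds DESC
    """
    result = conn.execute(query, (n,)).fetchall()
    conn.close()
    return result
```

With n = 2, January keeps ann and cat (bob's 4 hours rank third) and February keeps bob and ann. ROW_NUMBER() cuts ties arbitrarily unless a tiebreaker such as name is added to the ORDER BY, as done here.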

Select all rows with the same aggregated value

There is a task: develop a fragment of the Web site that provides work with one table.
Attributes of the table:
Day of the week,
Time of the beginning of the lesson,
Subject name,
Number of the audience,
Full name of the teacher.
We need to write a query that determines the day of the week with the largest number of entries; if several days share the maximum, all of them should be output. I wrote the query as follows:
SELECT COUNT (*) cnt, day
FROM schedule
GROUP BY day
ORDER BY cnt DESC
LIMIT 1;
But if there are several identical maxima, only one is displayed. How can I write a query that returns them all?
You can use your query as a subquery in the HAVING clause, e.g.:
SELECT day, count(*) as cnt
FROM schedule
GROUP BY day
HAVING count(*) = (
    SELECT count(*) as cnt
    FROM schedule
    GROUP BY day
    ORDER BY cnt DESC
    LIMIT 1
)
ORDER BY day
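To confirm that this returns every day tied for the maximum, here is a minimal sketch. It assumes SQLite via Python's sqlite3 in place of MySQL and a schedule table reduced to two columns; the data and the function name busiest_days are invented.

```python
import sqlite3

def busiest_days(days):
    """Return (day, count) for every day that has the maximum number of entries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE schedule (day TEXT, subject TEXT)")
    conn.executemany(
        "INSERT INTO schedule VALUES (?, ?)",
        [(day, f"subject{i}") for i, day in enumerate(days)],
    )
    query = """
        SELECT day, COUNT(*) AS cnt
        FROM schedule
        GROUP BY day
        -- the scalar subquery is the largest per-day count
        HAVING COUNT(*) = (
            SELECT COUNT(*) AS cnt
            FROM schedule
            GROUP BY day
            ORDER BY cnt DESC
            LIMIT 1
        )
        ORDER BY day
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Invented data: Monday and Tuesday tie with three lessons each.
DAYS = ["Monday"] * 3 + ["Tuesday"] * 3 + ["Friday"]
```

busiest_days(DAYS) returns both Monday and Tuesday with a count of 3, whereas the original ORDER BY ... LIMIT 1 query would return only one of them.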