Display count of column excluding min and max values - mysql

I want to count how many times each activity occurs in the table (FRIENDS) below. Then I want to print the activities whose number of occurrences is neither the maximum nor the minimum of all the counts.
***ID/Name/Activity***
1/James/Horse Riding
2/Eric/Eating
3/Sean/Eating
4/John/Horse Riding
5/Chris/Eating
6/Jessica/Playing
Ex:
Horse Riding occurs 140 times
Playing occurs 170 times
Eating occurs 120 times
Walking occurs 150 times
Running occurs 200 times
The max occurrence here is Running, occurring 200 times, and the minimum occurrence here is Eating, occurring 120 times.
Therefore, I want to display
Horse Riding
Playing
Walking
In no particular order.
This is the code I have so far, but I keep getting a syntax error. When I don't get a syntax error, I get an "Every derived table must have its own alias" error. I am new to SQL, so I appreciate any advice I can get.
SELECT ACTIVITY, count(ACTIVITY) as Occurences FROM FRIENDS,
(SELECT MAX(Occur) AS Ma,MIN(Occur) AS Mi FROM (SELECT ACTIVITY, count(ACTIVITY) as Occur
FROM FRIENDS GROUP by City)) as T
GROUP BY City HAVING Occurences!=T.Ma AND Occurences!=T.Mi ORDER BY Occurences DESC

In MySQL 8.0, you can do this with aggregation and window functions:
select *
from (
select activity, count(*) cnt,
rank() over(order by count(*)) rn_asc,
rank() over(order by count(*) desc) rn_desc
from mytable
group by activity
) t
where rn_asc > 1 and rn_desc > 1
The subquery counts the occurrences of each activity and ranks them in both ascending and descending order. All that is left to do is to exclude the top and bottom records. If several activities tie for the top (or the bottom) count, the query excludes all of them.
In earlier versions, an option is a having clause:
select activity, count(*) cnt
from mytable t
group by activity
having count(*) > (select count(*) from mytable group by activity order by count(*) limit 1)
and count(*) < (select count(*) from mytable group by activity order by count(*) desc limit 1)
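To try the pre-8.0 HAVING query without a MySQL server, here is a minimal sketch that runs it through Python's built-in sqlite3 module. The table name friends follows the question; the sample rows and the helper name middle_activities are made up for illustration, and SQLite only stands in for MySQL on the assumption that this particular query behaves the same in both.

```python
import sqlite3

def middle_activities(rows):
    """Return (activity, count) pairs whose count is neither the min nor the max."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE friends (id INTEGER PRIMARY KEY, name TEXT, activity TEXT)"
    )
    con.executemany("INSERT INTO friends (name, activity) VALUES (?, ?)", rows)
    query = """
        SELECT activity, COUNT(*) AS cnt
        FROM friends
        GROUP BY activity
        HAVING COUNT(*) > (SELECT COUNT(*) FROM friends GROUP BY activity
                           ORDER BY COUNT(*) LIMIT 1)
           AND COUNT(*) < (SELECT COUNT(*) FROM friends GROUP BY activity
                           ORDER BY COUNT(*) DESC LIMIT 1)
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Hypothetical data, scaled down from the question's example counts.
counts = {"Eating": 1, "Walking": 2, "Horse Riding": 3, "Playing": 4, "Running": 5}
sample = [(f"user{i}", activity) for activity, n in counts.items() for i in range(n)]

print(sorted(middle_activities(sample)))
```

With this data, Eating (the minimum) and Running (the maximum) drop out and the three activities in between remain.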

Querying for type breakdown based on COUNT results

I have the following table <state_table> that tracks entries per minute of an id and its state:
minute id type
------ -- ----
1 A solid
1 A solid
1 A solid
1 A liquid
1 B solid
1 B solid
1 B liquid
.... 1000+ rows ...
1 ZZX liquid
1 ZZZ liquid
2 A solid
2 A solid
2 A liquid
With the following query, I can get the top 1000 pairs based on occurrence:
With TempIds AS (
SELECT
state_table.minute as minute,
state_table.id as id,
COUNT(*) AS count
FROM
state_table
GROUP BY 1,2
) SELECT
TempIds.minute,
TempIds.id,
TempIds.count
FROM
TempIds
ORDER BY 3 DESC
LIMIT 1000
;
e.g.
minute id count
------ -- ----
2 B 1002
3 A 990
1 C 800
3 B 798
How can I modify my query to get the type of the id?
For example, there are 1002 <minute=2,id=B> rows. Is there a way to get there are 402 solids and 600 liquids?
minute id count type
------ -- ---- -----
2 B 402 solid
2 B 600 liquid
3 A 330 solid
3 A 660 liquid
The only way I can think of is a fairly complex nested query:
With TempTop AS (
With TempIds AS (
SELECT
state_table.minute as minute,
state_table.id as id,
COUNT(*) AS count
FROM
state_table
GROUP BY 1,2
) SELECT
TempIds.minute as minute,
TempIds.id as id,
TempIds.count
FROM
TempIds
ORDER BY 3 DESC
LIMIT 1000
) SELECT
state_table.minute,
state_table.id,
state_table.type,
COUNT(*)
FROM
state_table, TempTop
WHERE
state_table.minute = TempTop.minute
AND state_table.id = TempTop.id
GROUP BY 1,2,3
;
Is there a simpler way to make this query?
Goal:
For the Top 1000 most frequent pairs, get the breakdown of the type.
Your query is
SELECT minute, id, COUNT(*)
FROM state_table
GROUP BY minute, id
ORDER BY COUNT(*) DESC
LIMIT 1000;
and in this process you lose the types, because you want the 1000 top minute/id pairs, so you cannot simply group by minute, id, type instead.
If this is only about the types 'solid' and 'liquid', you can apply conditional aggregation to get the separate counts alongside the total:
SELECT
minute, id,
COUNT(*) AS total,
SUM(type = 'solid') AS solid,
SUM(type = 'liquid') AS liquid
FROM state_table
GROUP BY minute, id
ORDER BY COUNT(*) DESC
LIMIT 1000;
Summing up the boolean expressions works in MySQL, because true equals 1 and false equals 0 there.
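As a quick way to check the conditional aggregation, the following sketch runs the same query through Python's sqlite3 module. SQLite also evaluates a comparison to 1 or 0, so it is used here as a stand-in for MySQL; the sample rows, the helper name type_breakdown and the top_n parameter are assumptions for illustration only.

```python
import sqlite3

def type_breakdown(rows, top_n):
    """Return (minute, id, total, solid, liquid) for the top_n most frequent pairs."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE state_table (minute INTEGER, id TEXT, type TEXT)")
    con.executemany("INSERT INTO state_table VALUES (?, ?, ?)", rows)
    query = """
        SELECT minute, id,
               COUNT(*) AS total,
               SUM(type = 'solid') AS solid,    -- comparison yields 1 or 0
               SUM(type = 'liquid') AS liquid
        FROM state_table
        GROUP BY minute, id
        ORDER BY COUNT(*) DESC
        LIMIT ?
    """
    result = con.execute(query, (top_n,)).fetchall()
    con.close()
    return result

# Hypothetical rows: pair (1, A) has 4 entries, (1, B) has 3, (2, A) has 1.
sample = (
    [(1, "A", "solid")] * 3 + [(1, "A", "liquid")]
    + [(1, "B", "solid")] * 2 + [(1, "B", "liquid")]
    + [(2, "A", "solid")]
)

print(type_breakdown(sample, 2))
```

The sample deliberately has no tie at the cut-off, so the two rows returned are deterministic.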
The problem with the above queries is ties, by the way. If two pairs share the 1000th-highest count, the query picks one of them arbitrarily, instead of treating both the same by showing either 999 or 1001 pairs. So I'd probably rewrite the queries using DENSE_RANK in order to handle ties properly.
For the more complicated case where the types are unknown at the point of writing the query, you need rows instead of columns, just as already shown in your request. In that case you really need to group by minute, id, type first. The easiest way to get the total counts is with SUM OVER then. Then rank the pairs as mentioned with DENSE_RANK and keep the top 1000.
SELECT minute, id, type, cnt
FROM
(
SELECT
minute, id, type, cnt,
DENSE_RANK() OVER (ORDER BY total DESC) AS rnk
FROM
(
SELECT
minute, id, type,
COUNT(*) AS cnt,
SUM(COUNT(*)) OVER(PARTITION BY minute, id) AS total
FROM state_table
GROUP BY minute, id, type
) counted
) ranked
WHERE rnk <= 1000
ORDER BY rnk, minute, id, type;
This can get you more than 1000 minute/id pairs in case of ties. You can reduce this with RANK instead of DENSE_RANK. If these two approaches still don't get the number you want, you may have to count minute/id pairs separately in a subquery instead:
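SELECT minute, id, type, COUNT(*)
FROM state_table
WHERE (minute, id) IN
(
-- MySQL does not allow LIMIT directly inside an IN subquery,
-- so the limited query is wrapped in a derived table.
SELECT minute, id
FROM
(
SELECT minute, id
FROM state_table
GROUP BY minute, id
ORDER BY COUNT(*) DESC
LIMIT 1000
) top_pairs
)
GROUP BY minute, id, type
ORDER BY
SUM(COUNT(*)) OVER (PARTITION BY minute, id) DESC,
minute, id, type;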
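To make the tie behaviour concrete, here is a small sketch comparing a plain LIMIT with DENSE_RANK when two pairs share the cut-off count. It is simplified to rank minute/id pairs only (no type breakdown), runs on Python's sqlite3 module as a stand-in for MySQL 8.0 (window functions need SQLite 3.25 or later), and the data and the helper name top_pairs are hypothetical.

```python
import sqlite3

# Hypothetical rows: totals are (1, A) = 3, (1, B) = 2, (2, A) = 2, (2, B) = 1,
# so (1, B) and (2, A) tie for second place.
ROWS = (
    [(1, "A", "solid")] * 3
    + [(1, "B", "liquid")] * 2
    + [(2, "A", "solid"), (2, "A", "liquid")]
    + [(2, "B", "solid")]
)

def top_pairs(n, use_dense_rank):
    """Return the top-n (minute, id) pairs, with or without tie handling."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE state_table (minute INTEGER, id TEXT, type TEXT)")
    con.executemany("INSERT INTO state_table VALUES (?, ?, ?)", ROWS)
    if use_dense_rank:
        # Every pair whose total is among the n highest distinct totals is kept.
        query = """
            SELECT minute, id
            FROM (
                SELECT minute, id,
                       DENSE_RANK() OVER (ORDER BY total DESC) AS rnk
                FROM (
                    SELECT minute, id, COUNT(*) AS total
                    FROM state_table
                    GROUP BY minute, id
                ) counted
            ) ranked
            WHERE rnk <= ?
            ORDER BY minute, id
        """
    else:
        # Exactly n pairs are kept; which tied pair survives is arbitrary.
        query = """
            SELECT minute, id
            FROM state_table
            GROUP BY minute, id
            ORDER BY COUNT(*) DESC
            LIMIT ?
        """
    result = con.execute(query, (n,)).fetchall()
    con.close()
    return result

print(top_pairs(2, use_dense_rank=True))   # three pairs: both tied pairs are kept
print(top_pairs(2, use_dense_rank=False))  # two pairs: one tied pair is dropped
```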
The best way I can think of still uses two CTEs, but not nested ones.
You have three tasks to achieve:
1. The first CTE counts the records per minute/id pair.
2. The second CTE uses the row_number() window function to assign row numbers in decreasing order of count.
3. The final select joins the state table with the second CTE, with a where clause specifying rw_num <= 1000, and groups by id, type and minute.
With TempIdsCount AS (
SELECT
state_table.minute as minute,
state_table.id as id,
COUNT(*) rw_cnt
FROM
state_table
GROUP BY
1,
2
), TempIdsRowNum AS (
SELECT
minute,
id,
Row_number() over (order by rw_cnt desc) rw_num
FROM
TempIdsCount
)
SELECT
st.minute,
st.id,
st.type,
Count(*) cnt
FROM
state_table st
Join TempIdsRowNum trn on st.minute = trn.minute
and st.id = trn.id
Where
rw_num <= 1000
Group by
st.minute,
st.id,
st.type

How to get the number of users who have had one transaction per day?

Here is my query:
select count(1) from
(select count(1) num, user_id from pos_transactions pt
where date(created_at) <= '2020-6-21'
group by user_id
having num = 1) x
It gives me the number of users who have had 1 transaction until 2020-6-21. Now I also want to group it by date(created_at). I mean, I want to get a list of dates (such as 2020-6-21, 2020-6-22, etc.) plus the number of users who have had 1 transaction on that date (day).
Any idea how can I do that?
EDIT: The result of the query above is correct; the issue is that it's manual right now. I mean, I have to increase 2020-6-21 by hand. I want to make it automatic. In other words, I want a list of all dates (from 2020-6-21 until now), each with the number of users who have had 1 transaction until that date.
If you want the number of users who had one transaction on each day, then you need to aggregate by the date as well:
select dte, count(*)
from (select date(created_at) as dte, user_id
from pos_transactions pt
where date(created_at) <= '2020-6-21'
group by dte, user_id
having count(*) = 1
) du
group by dte;
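A minimal sketch of the per-day aggregation, run on Python's sqlite3 module so it can be tried without a MySQL server. The sample transactions and the helper name single_transaction_users_per_day are hypothetical, and the date filter from the query above is left out to keep the example short.

```python
import sqlite3

def single_transaction_users_per_day(rows):
    """Return (day, number of users with exactly one transaction on that day)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE pos_transactions (user_id INTEGER, created_at TEXT)")
    con.executemany("INSERT INTO pos_transactions VALUES (?, ?)", rows)
    query = """
        SELECT dte, COUNT(*) AS users
        FROM (
            -- one row per (day, user) that has exactly one transaction
            SELECT date(created_at) AS dte, user_id
            FROM pos_transactions
            GROUP BY date(created_at), user_id
            HAVING COUNT(*) = 1
        ) du
        GROUP BY dte
        ORDER BY dte
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Hypothetical data: user 1 has two transactions on 2020-06-21, so only
# user 2 counts for that day; users 1 and 3 each have one on 2020-06-20.
sample = [
    (1, "2020-06-20 10:00:00"),
    (3, "2020-06-20 08:00:00"),
    (1, "2020-06-21 09:00:00"),
    (1, "2020-06-21 18:00:00"),
    (2, "2020-06-21 12:00:00"),
]

print(single_transaction_users_per_day(sample))
```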

Get top item for each year

I have a datatable with some records. Using mysql I am able to get a result grouped by a specific period (year) and users and ordered (in descending order) by number of species.
SELECT YEAR(entry_date) AS period, uid AS user, COUNT(DISTINCT pid) AS species
FROM records
WHERE YEAR(entry_date)<YEAR(CURDATE())
GROUP BY period, uid
ORDER by period, species DESC
Please see the attached picture of the result. But what if I only want to get the TOP USER (and number of species) for EACH year (the red marked rows)? How can I achieve that?
I am able to handle this later in my PHP code, but it would be nice to have this sorted out already in the MySQL query.
Thanks for your help!
If you are running MySQL 8.0, you can use RANK() to rank records within each year's partition by their count of species, and then filter on the top record per group:
SELECT *
FROM (
SELECT
YEAR(entry_date) AS period,
uid AS user,
COUNT(DISTINCT pid) AS species,
RANK() OVER(PARTITION BY YEAR(entry_date) ORDER BY COUNT(DISTINCT pid) DESC) rn
FROM records
WHERE entry_date < DATE_FORMAT(CURRENT_DATE, '%Y-01-01')
GROUP BY period, uid
) t
WHERE rn = 1
ORDER by period
This preserves top ties, if any. Note that this uses an index-friendly filter on the dates in the WHERE clause.
In earlier versions, an equivalent option is to filter with a HAVING clause and a correlated subquery:
SELECT
YEAR(entry_date) AS period,
uid AS user,
COUNT(DISTINCT pid) AS species
FROM records r
WHERE entry_date < DATE_FORMAT(CURRENT_DATE, '%Y-01-01')
GROUP BY period, uid
HAVING COUNT(DISTINCT pid) = (
SELECT COUNT(DISTINCT r1.pid) species1
FROM records r1
WHERE YEAR(r1.entry_date) = period
GROUP BY r1.uid
ORDER BY species1 DESC
LIMIT 1
)
ORDER by period
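To see how RANK() keeps tied top users, here is a sketch that runs an adapted version of the window-function query on Python's sqlite3 module (SQLite 3.25 or later). Several things are assumptions made for the port: strftime('%Y', ...) replaces MySQL's YEAR(), the counting is moved into an inner derived table, the "before the current year" filter is dropped, and the sample rows and the helper name top_users_per_year are invented.

```python
import sqlite3

def top_users_per_year(rows):
    """Return (period, user, species) for the top-ranked user(s) of each year."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE records (uid INTEGER, pid INTEGER, entry_date TEXT)")
    con.executemany("INSERT INTO records VALUES (?, ?, ?)", rows)
    query = """
        SELECT period, user, species
        FROM (
            SELECT period, user, species,
                   RANK() OVER (PARTITION BY period ORDER BY species DESC) AS rn
            FROM (
                SELECT strftime('%Y', entry_date) AS period,
                       uid AS user,
                       COUNT(DISTINCT pid) AS species
                FROM records
                GROUP BY strftime('%Y', entry_date), uid
            ) counted
        ) ranked
        WHERE rn = 1
        ORDER BY period, user
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Hypothetical data: in 2018 user 1 has two distinct species and user 2 has one;
# in 2019 both users have one species each, so they tie for the top spot.
sample = [
    (1, 10, "2018-03-01"),
    (1, 10, "2018-04-01"),
    (1, 11, "2018-05-01"),
    (2, 10, "2018-06-01"),
    (1, 10, "2019-01-15"),
    (2, 11, "2019-02-20"),
]

print(top_users_per_year(sample))
```

Both 2019 users come back because RANK() assigns 1 to every row tied for first place; ROW_NUMBER() would keep only one of them.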

PHP/MYSQL Displaying hiscores with a group by working but not as intended

The output is displayed by username and the correct score is there, but it won't grab the other data from the same row.
I've tried adding proof to the GROUP BY as well, but it's not working. Grouping by proof just adds another record to the output, showing two hiscores when I really only want one per user.
SELECT MAX(total) AS total, sUsername, proof, approved
FROM userrankings
WHERE category = 0
GROUP BY sUsername
ORDER BY total DESC
This is the output:
Rank: 1
User: Test User
Score: 2414
Proof: html site 1
The score matches the database, but the proof should be HTML 2, because the 2nd entry has the highest score, not the first. With GROUP BY sUsername, the query grabs the very first entry rather than the entry I need it to display.
I understand that, for each user, you want to pull out the record that has the highest total.
If you are running MySQL 8.0, you can use window function ROW_NUMBER() to rank the records of each user in a subquery, and do the filtering in the outer query:
SELECT total, sUsername, proof, approved
FROM (
SELECT
total,
sUsername,
proof,
approved,
ROW_NUMBER() OVER(PARTITION BY sUsername ORDER BY total DESC ) rn
FROM userrankings
WHERE category = 0
) x WHERE rn = 1
ORDER BY total DESC
On previous versions of MySQL, one solution is to use a correlated subquery with a NOT EXISTS condition to ensure that only the relevant records are displayed, like:
SELECT total, sUsername, proof, approved
FROM userrankings u
WHERE
category = 0
AND NOT EXISTS (
SELECT 1
FROM userrankings u1
WHERE
u1.category = 0
AND u1.sUsername = u.sUsername
AND u1.total > u.total
)
ORDER BY total DESC
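The NOT EXISTS variant can be checked with a short sketch on Python's sqlite3 module, which accepts this query unchanged. The column types, the sample rows and the helper name best_rows are assumptions for illustration.

```python
import sqlite3

def best_rows(rows):
    """Return each user's highest-scoring row in category 0, with its proof."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE userrankings ("
        "sUsername TEXT, total INTEGER, proof TEXT, approved INTEGER, category INTEGER)"
    )
    con.executemany("INSERT INTO userrankings VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        SELECT total, sUsername, proof, approved
        FROM userrankings u
        WHERE category = 0
          AND NOT EXISTS (
              -- no other row of the same user and category has a higher total
              SELECT 1
              FROM userrankings u1
              WHERE u1.category = 0
                AND u1.sUsername = u.sUsername
                AND u1.total > u.total
          )
        ORDER BY total DESC
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Hypothetical rows; the last one is in another category and must be ignored.
sample = [
    ("Test User", 1200, "html site 1", 1, 0),
    ("Test User", 2414, "html site 2", 1, 0),
    ("Other User", 900, "html site 3", 1, 0),
    ("Other User", 5000, "html site 4", 1, 1),
]

print(best_rows(sample))
```

One difference from the ROW_NUMBER() query: if a user has two rows tied at the highest total, NOT EXISTS returns both, while ROW_NUMBER() keeps only one.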
You need to GROUP_CONCAT all the proofs:
SELECT MAX(total) AS total, sUsername, GROUP_CONCAT(proof ORDER BY total DESC), approved
FROM userrankings
WHERE category = 0
GROUP BY sUsername
ORDER BY total DESC
Because of the ORDER BY total DESC inside GROUP_CONCAT, the first proof in the concatenated list should be your result.

Select all rows with the same aggregated value

There is a task: develop a fragment of the Web site that provides work with one table.
Attributes of the table:
Day of the week,
Time of the beginning of the lesson,
Subject name,
Number of the audience,
Full name of the teacher.
We need to make a query: determine the day of the week with the largest number of entries; if there is more than one maximum (i.e., several days share it), then output them all. I did the query as follows:
SELECT COUNT(*) cnt, day
FROM schedule
GROUP BY day
ORDER BY cnt DESC
LIMIT 1;
But if there are several identical maxima, then only one is displayed. How to write a query which returns them all?
You can use your query as a subquery in the HAVING clause, e.g.:
SELECT day, count(*) as cnt
FROM schedule
GROUP BY day
HAVING count(*) = (
SELECT count(*) as cnt
FROM schedule
GROUP BY day
ORDER BY cnt DESC
LIMIT 1
)
ORDER BY day
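Here is a small sketch of the HAVING approach on Python's sqlite3 module, with two days sharing the maximum. The reduced two-column schedule table, the sample rows and the helper name busiest_days are assumptions for illustration.

```python
import sqlite3

def busiest_days(rows):
    """Return every (day, count) whose count equals the maximum count."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE schedule (day TEXT, subject TEXT)")
    con.executemany("INSERT INTO schedule VALUES (?, ?)", rows)
    query = """
        SELECT day, COUNT(*) AS cnt
        FROM schedule
        GROUP BY day
        HAVING COUNT(*) = (
            -- the original single-row query, reused to find the maximum
            SELECT COUNT(*) AS cnt
            FROM schedule
            GROUP BY day
            ORDER BY cnt DESC
            LIMIT 1
        )
        ORDER BY day
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Hypothetical timetable: Monday and Tuesday both have two lessons.
sample = [
    ("Monday", "Math"),
    ("Monday", "Physics"),
    ("Tuesday", "History"),
    ("Tuesday", "Biology"),
    ("Wednesday", "Art"),
]

print(busiest_days(sample))
```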