MySQL Aggregate Row Result Count - mysql

Is there a simple way in MySQL to return the number of aggregate result rows?
For example:
SELECT `name`, SUM(`points`)
FROM `goals`
GROUP BY `name`
HAVING SUM(`points`) > 10
If I wanted the number of unique names in that result, how could I get it?
For example, if a return data set is:
Player1 | 11
Player2 | 15
Player3 | 17
Is there a way to return the number of results, which would be three (3)?

Here's one option using a subquery:
SELECT COUNT(*)
FROM (
SELECT `name`, SUM(`points`)
FROM `goals`
GROUP BY `name`
HAVING SUM(`points`) > 10
) t

select count(*) as num_of_records
from
(
SELECT name
FROM goals
GROUP BY name
HAVING SUM(points) > 10
) tmp
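The subquery approach can be checked end to end with a small script. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL (the SQL used here is portable between the two); the helper name count_groups_over and the sample rows are made up for illustration.

```python
import sqlite3


def count_groups_over(conn, threshold):
    """Count the names whose total points exceed `threshold`."""
    sql = """
        SELECT COUNT(*)
        FROM (
            SELECT name
            FROM goals
            GROUP BY name
            HAVING SUM(points) > ?
        ) AS t
    """
    return conn.execute(sql, (threshold,)).fetchone()[0]


# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE goals (name TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO goals VALUES (?, ?)",
    [
        ("Player1", 6), ("Player1", 5),   # total 11
        ("Player2", 15),                  # total 15
        ("Player3", 10), ("Player3", 7),  # total 17
        ("Player4", 4),                   # total 4, removed by HAVING
    ],
)

print(count_groups_over(conn, 10))  # 3
```

The inner query produces one row per qualifying name, so the outer COUNT(*) is simply the number of groups that survived the HAVING filter.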

Related

return max value for each group but when there is a tie, return one with lower id in MySQL

I am using MySQL 8.0
My table looks like this:
group user_id score
A 1 33
B 2 22
A 3 22
B 4 22
I want it to return
group user_id score
A 1 33
B 2 22
Note that although both users in group B have the same score, user_id=2 is the final winner because he/she has the lower user_id.
How can I improve the query below?
SELECT group, user_id, max(score)
from table
Thanks in advance!
@Ambleu you are on the right track using MAX(), but you need to combine it with MIN(), and use a subquery to get the MAX(score), like this:
SELECT `mt`.`group`,
MIN(`mt`.`user_id`) AS `user_id`,
`mt`.`score`
FROM `myTable` AS `mt`
JOIN (SELECT `group`,
MAX(`score`) AS `score`
FROM `myTable`
GROUP BY `group`) AS `der` ON `der`.`group` = `mt`.`group`
AND `der`.`score` = `mt`.`score`
GROUP BY `mt`.`group`, `mt`.`score`
Here are your tables and the solution query mocked up on db-fiddle.
If this doesn't get you what you need please let me know and I'll try to assist further.
In MySQL 8.0, I would recommend window functions:
select grp, user_id, score
from (
select t.*,
row_number() over(partition by grp order by score desc, user_id) rn
from mytable t
) t
where rn = 1
Alternatively, you can use a correlated subquery for filtering:
select t.*
from mytable t
where user_id = (
select t1.user_id
from mytable t1
where t1.grp = t.grp
order by t1.score desc, t1.user_id limit 1
)
The second query would take advantage of an index on (grp, score desc, user_id).
Side note: group is a language keyword, hence a poor choice for a column name. I renamed it to grp in the queries.
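To see the tie-break in action, here is a minimal sketch that runs the window-function query against the sample rows from the question. It assumes Python's built-in sqlite3 module as a stand-in for MySQL 8.0 (ROW_NUMBER() needs SQLite 3.25 or newer); the constant name WINNERS_SQL is made up, and the column is called grp as in the answer above.

```python
import sqlite3

# One row per group: highest score first, lowest user_id as the tie-breaker.
WINNERS_SQL = """
    SELECT grp, user_id, score
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (
                   PARTITION BY grp
                   ORDER BY score DESC, user_id
               ) AS rn
        FROM mytable AS t
    ) AS ranked
    WHERE rn = 1
    ORDER BY grp
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (grp TEXT, user_id INTEGER, score INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("A", 1, 33), ("B", 2, 22), ("A", 3, 22), ("B", 4, 22)],
)

winners = conn.execute(WINNERS_SQL).fetchall()
print(winners)  # [('A', 1, 33), ('B', 2, 22)]
```

Group B has two users tied at 22, and the secondary sort key user_id puts user 2 first, so that row gets rn = 1.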

MySQL update table only with the highest values of another table

For a game site.
All games are recorded if the player's score is greater than his old score
Table of all players (over 10,000 players)
CREATE TABLE games (
PlayerID INT UNSIGNED,
Date TIMESTAMP(12),
Score BIGINT UNSIGNED DEFAULT 0,
#...other data
);
Once a month, I update the table of best records, and afterwards I erase all games.
Table of best players (top 50)
CREATE TABLE best (
#...same as games, without final other data
PlayerID INT UNSIGNED,
Date TIMESTAMP(12),
Score BIGINT UNSIGNED DEFAULT 0
);
So I add the 50 best players of the table games in to the table best:
INSERT INTO best (PlayerID, Date, Score)
SELECT PlayerID, Date, Score FROM games ORDER BY Score DESC LIMIT 50;
And after (and this is where I have a problem) I try to keep in best only the best 50. At this point best contains 100 lines.
What I have to do:
Do not store several times the same player PlayerID.
Delete the worst Score for this player.
And at the end, leaving only the top 50.
->
+----------+---------+
| PlayerID | Score |
+----------+---------+
| 25 | 20000 | New
| 25 | 25000 | Old best
| 40 | 10000 | Old best
| 57 | 80000 | New best
| 57 | 45000 | Old
| 80 | 35000 | New best
+----------+---------+
I have to retain in the end only 50 lines (the ones with "best" in my example).
I tried many things, but I have not succeeded in achieving the expected result.
I am using PHP, so if it is possible to do it simply with intermediate storage in an array, that's fine too.
The speed is not a priority because it is an operation that is done only once a month.
The following SQL returns the top 50 scores:
SELECT `PlayerId`, max(`Score`) MaxScore
FROM (
SELECT `PlayerId`, `Date`, `Score` FROM games
UNION
SELECT `PlayerId`, `Date`, `Score` FROM best
) t
GROUP BY `PlayerId`
ORDER BY `MaxScore` DESC
LIMIT 50
You can use the result to overwrite the table best. For this you also need the corresponding Date field, which is missing so far. The next SQL will also return a maxDate field which corresponds to the highscore.
SELECT t2.`PlayerId`, max(t2.`Date`) maxDate, top.`MaxScore`
FROM
(
SELECT `PlayerId`, max(`Score`) MaxScore
FROM (
SELECT `PlayerId`, `Date`, `Score` FROM games
UNION
SELECT `PlayerId`, `Date`, `Score` FROM best
) t1
GROUP BY `PlayerId`
ORDER BY `MaxScore` DESC
LIMIT 50
) top
LEFT JOIN (
SELECT `PlayerId`, `Date`, `Score` FROM games
UNION
SELECT `PlayerId`, `Date`, `Score` FROM best
) t2 ON t2.`PlayerId` = top.`PlayerId` AND t2.`Score` = top.`MaxScore`
GROUP BY t2.`PlayerId`
ORDER BY top.`MaxScore` DESC
To transfer the new top 50 highscores into the best table you can use a temporary table like tmp_best. Insert the top scores into the empty table tmp_best with (you have to insert your select query from above):
INSERT INTO tmp_best (`PlayerId`, `Date`, `Score`)
SELECT ...
After this the best table can be emptied and then you can copy the rows from tmp_best into best.
Here is an alternative solution with simpler SQL. The difference from the solution above is the use of a temporary table tmp_all at the beginning for the unified data. Before using the following SQL you have to create tmp_all, which can be a copy of the structure of games or best.
DELETE FROM tmp_all;
INSERT INTO tmp_all
SELECT `PlayerId`, `Date`, `Score` FROM games
UNION
SELECT `PlayerId`, `Date`, `Score` FROM best
;
DELETE FROM best;
INSERT INTO best (`PlayerId`, `Date`, `Score`)
SELECT t2.`PlayerId`, max(t2.`Date`) maxDate, top.`MaxScore`
FROM
(
SELECT `PlayerId`, max(`Score`) MaxScore
FROM tmp_all t1
GROUP BY `PlayerId`
ORDER BY `MaxScore` DESC
LIMIT 50
) top
LEFT JOIN tmp_all t2 ON t2.`PlayerId` = top.`PlayerId` AND t2.`Score` = top.`MaxScore`
GROUP BY t2.`PlayerId`
ORDER BY top.`MaxScore` DESC
;
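(SELECT PlayerID, Date, Score FROM games ORDER BY Score DESC LIMIT 50)
UNION
SELECT PlayerID, Date, Score FROM best
This gives you the candidates for the all-time best 50 players. Then, as suggested by @ethrbunny, erase the best table and populate it again from the above query. You can use a TEMPORARY TABLE to hold the intermediate result.
Note that UNION only removes identical rows: a player who appears with two different scores still shows up twice, so keep only the highest score per PlayerID before trimming to 50.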
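The merge step (combine games and best, keep each player's highest score, trim to the top N) can be tried out on the example rows from the question. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL; the CTE names, the helper top_players, and the dates are made up for illustration, and the limit is 3 instead of 50 so the effect is visible on six rows.

```python
import sqlite3

TOP_N_SQL = """
    WITH all_scores AS (
        SELECT PlayerID, Date, Score FROM games
        UNION
        SELECT PlayerID, Date, Score FROM best
    ),
    top_scores AS (
        -- one row per player: the highest score, trimmed to the top N
        SELECT PlayerID, MAX(Score) AS MaxScore
        FROM all_scores
        GROUP BY PlayerID
        ORDER BY MaxScore DESC
        LIMIT ?
    )
    -- join back to recover the date that belongs to the high score
    SELECT a.PlayerID, MAX(a.Date) AS Date, ts.MaxScore
    FROM top_scores AS ts
    JOIN all_scores AS a
      ON a.PlayerID = ts.PlayerID AND a.Score = ts.MaxScore
    GROUP BY a.PlayerID, ts.MaxScore
    ORDER BY ts.MaxScore DESC
"""


def top_players(conn, n):
    """Return (PlayerID, Date, Score) for the n best players overall."""
    return conn.execute(TOP_N_SQL, (n,)).fetchall()


conn = sqlite3.connect(":memory:")
for table in ("games", "best"):
    conn.execute(f"CREATE TABLE {table} (PlayerID INTEGER, Date TEXT, Score INTEGER)")

# Scores follow the example in the question; the dates are hypothetical.
conn.executemany("INSERT INTO games VALUES (?, ?, ?)", [
    (25, "2024-02-01", 20000), (57, "2024-02-03", 80000), (80, "2024-02-07", 35000),
])
conn.executemany("INSERT INTO best VALUES (?, ?, ?)", [
    (25, "2024-01-10", 25000), (40, "2024-01-12", 10000), (57, "2024-01-20", 45000),
])

print(top_players(conn, 3))
# [(57, '2024-02-03', 80000), (80, '2024-02-07', 35000), (25, '2024-01-10', 25000)]
```

Player 25 keeps the old best of 25000 and player 57 keeps the new best of 80000, which matches the rows marked "best" in the question.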

SELECT COUNT(*) for unique pairs of IDs

I have a table like the following, named matches:
match_id ( AUTO INCREMENT )
user_id ( INT 11 )
opponent_id ( INT 11 )
date ( TIMESTAMP )
What I have to do is to SELECT the count of the rows where user_id and opponent_id are a unique pair. The goal is to see the count of total matches started between different users.
So if we have:
user_id = 10 and opponent_id = 11
user_id = 20 and opponent_id = 22
user_id = 10 and opponent_id = 11
user_id = 11 and opponent_id = 10
The result of the query should be 2.
In fact we only have 2 matches that have been started by a pair of different users. Matches 1, 3 and 4 count as the same match, because they were played by the same pair of user IDs.
Can anyone help me with this?
I have done similar queries but never on pairs of IDs, always on a single ID.
FancyPants' answer is correct, but I prefer to use DISTINCT when no aggregate function is used:
SELECT COUNT(DISTINCT
LEAST(user_id, opponent_id),
GREATEST(user_id, opponent_id)
)
FROM yourtable;
is sufficient.
SELECT COUNT(*) AS nr_of_matches FROM (
SELECT
LEAST(user_id, opponent_id) AS pl1,
GREATEST(user_id, opponent_id) AS pl2
FROM yourtable
GROUP BY pl1, pl2
) sq
see it working in an sqlfiddle
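The idea behind both answers is to normalize each pair so the smaller ID always comes first, then count distinct pairs. Here is a minimal sketch on the sample rows from the question, assuming Python's built-in sqlite3 module as a stand-in for MySQL. SQLite has no LEAST/GREATEST and no multi-column COUNT(DISTINCT ...), so this sketch substitutes SQLite's two-argument MIN()/MAX() and a SELECT DISTINCT subquery; the helper name count_unique_pairs is made up.

```python
import sqlite3


def count_unique_pairs(conn):
    """Count matches, treating (a, b) and (b, a) as the same pair."""
    sql = """
        SELECT COUNT(*)
        FROM (
            -- MIN(a, b) / MAX(a, b) play the role of MySQL's LEAST / GREATEST
            SELECT DISTINCT
                   MIN(user_id, opponent_id) AS pl1,
                   MAX(user_id, opponent_id) AS pl2
            FROM matches
        ) AS sq
    """
    return conn.execute(sql).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE matches ("
    "match_id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "user_id INTEGER, opponent_id INTEGER)"
)
conn.executemany(
    "INSERT INTO matches (user_id, opponent_id) VALUES (?, ?)",
    [(10, 11), (20, 22), (10, 11), (11, 10)],
)

print(count_unique_pairs(conn))  # 2
```

Rows 1, 3 and 4 all normalize to (10, 11), so only two distinct pairs remain.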

Fetch 2nd Highest value from MySQL DB with GROUP BY

I have a table tbl_patient and I want to fetch the last 2 visits of each patient in order to compare whether the patient's condition is improving or degrading.
tbl_patient
id | patient_ID | visit_ID | patient_result
1 | 1 | 1 | 5
2 | 2 | 1 | 6
3 | 2 | 3 | 7
4 | 1 | 2 | 3
5 | 2 | 3 | 2
6 | 1 | 3 | 9
I tried the query below to fetch the last visit of each patient as,
SELECT MAX(id), patient_result FROM `tbl_patient` GROUP BY `patient_ID`
Now I want to fetch the 2nd last visit of each patient, but the query below gives me an error
(#1242 - Subquery returns more than 1 row)
SELECT id, patient_result FROM `tbl_patient` WHERE id <(SELECT MAX(id) FROM `tbl_patient` GROUP BY `patient_ID`) GROUP BY `patient_ID`
Where am I going wrong?
select p1.patient_id, p2.maxid id1, max(p1.id) id2
from tbl_patient p1
join (select patient_id, max(id) maxid
from tbl_patient
group by patient_id) p2
on p1.patient_id = p2.patient_id and p1.id < p2.maxid
group by p1.patient_id
id1 is the ID of the last visit, id2 is the ID of the 2nd to last visit.
Your first query doesn't get the last visits, since it gives results 5 and 6 instead of 2 and 9.
You can try this query:
SELECT patient_ID,visit_ID,patient_result
FROM tbl_patient
where id in (
select max(id)
from tbl_patient
GROUP BY patient_ID)
union
SELECT patient_ID,visit_ID,patient_result
FROM tbl_patient
where id in (
select max(id)
from tbl_patient
where id not in (
select max(id)
from tbl_patient
GROUP BY patient_ID)
GROUP BY patient_ID)
order by 1,2
SELECT id, patient_result FROM `tbl_patient` t1
JOIN (SELECT MAX(id) as max, patient_ID FROM `tbl_patient` GROUP BY `patient_ID`) t2
ON t1.patient_ID = t2.patient_ID
WHERE id <max GROUP BY t1.`patient_ID`
There are a couple of approaches to getting the specified resultset returned in a single SQL statement.
Unfortunately, most of those approaches yield rather unwieldy statements.
The more elegant-looking statements tend to come with poor (or unbearable) performance when dealing with large sets. And the statements that tend to perform better are less elegant.
Three of the most common approaches make use of:
correlated subquery
inequality join (nearly a Cartesian product)
two passes over the data
Here's an approach that uses two passes over the data, using MySQL user variables, which basically emulates the analytic RANK() OVER(PARTITION ...) function available in other DBMS:
SELECT t.id
, t.patient_id
, t.visit_id
, t.patient_result
FROM (
SELECT p.id
, p.patient_id
, p.visit_id
, p.patient_result
, @rn := if(@prev_patient_id = patient_id, @rn + 1, 1) AS rn
, @prev_patient_id := patient_id AS prev_patient_id
FROM tbl_patients p
JOIN (SELECT @rn := 0, @prev_patient_id := NULL) i
ORDER BY p.patient_id DESC, p.id DESC
) t
WHERE t.rn <= 2
Note that this involves an inline view, which means there's going to be a pass over all the data in the table to create a "derived table". Then, the outer query will run against the derived table. So, this is essentially two passes over the data.
This query can be tweaked a bit to improve performance, by eliminating the duplicated value of the patient_id column returned by the inline view. But I show it as above, so we can better understand what is happening.
This approach can be rather expensive on large sets, but is generally MUCH more efficient than some of the other approaches.
Note also that this query will return a row for a patient_id even if only one id value exists for that patient; it does not restrict the return to just those patients that have at least two rows.
It's also possible to get an equivalent resultset with a correlated subquery:
SELECT t.id
, t.patient_id
, t.visit_id
, t.patient_result
FROM tbl_patients t
WHERE ( SELECT COUNT(1) AS cnt
FROM tbl_patients p
WHERE p.patient_id = t.patient_id
AND p.id >= t.id
) <= 2
ORDER BY t.patient_id ASC, t.id ASC
Note that this is making use of a "dependent subquery", which basically means that for each row returned from t, MySQL is effectively running another query against the database. So, this will tend to be very expensive (in terms of elapsed time) on large sets.
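The correlated-subquery version is easy to verify on the six rows from the question. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL (the SQL is portable); it uses the table name tbl_patient from the question, and the helper name last_two_visits is made up.

```python
import sqlite3


def last_two_visits(conn):
    """Return the two most recent rows (by id) for every patient."""
    sql = """
        SELECT t.id, t.patient_id, t.visit_id, t.patient_result
        FROM tbl_patient AS t
        WHERE (
            -- how many rows of the same patient are at least as recent as t
            SELECT COUNT(1)
            FROM tbl_patient AS p
            WHERE p.patient_id = t.patient_id
              AND p.id >= t.id
        ) <= 2
        ORDER BY t.patient_id ASC, t.id ASC
    """
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_patient ("
    "id INTEGER PRIMARY KEY, patient_id INTEGER, "
    "visit_id INTEGER, patient_result INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl_patient VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 5), (2, 2, 1, 6), (3, 2, 3, 7),
     (4, 1, 2, 3), (5, 2, 3, 2), (6, 1, 3, 9)],
)

for row in last_two_visits(conn):
    print(row)
# (4, 1, 2, 3)
# (6, 1, 3, 9)
# (3, 2, 3, 7)
# (5, 2, 3, 2)
```

For each row the subquery counts itself plus any later rows of the same patient, so a count of 1 marks the last visit and a count of 2 marks the one before it.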
As another approach, if there are relatively few id values for each patient, you might be able to get by with an inequality join:
SELECT t.id
, t.patient_id
, t.visit_id
, t.patient_result
FROM tbl_patients t
LEFT
JOIN tbl_patients p
ON p.patient_id = t.patient_id
AND t.id < p.id
GROUP
BY t.id
, t.patient_id
, t.visit_id
, t.patient_result
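-- count only the matched later rows; COUNT(1) would also count the NULL row the LEFT JOIN produces for the latest visit
HAVING COUNT(p.id) < 2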
Note that this will create a nearly Cartesian product for each patient. For a limited number of id values for each patient, this won't be too bad. But if a patient has hundreds of id values, the intermediate result can be huge, on the order of O(n**2).
Try this..
SELECT id, patient_result FROM tbl_patient AS tp WHERE id < ((SELECT MAX(id) FROM tbl_patient AS tp_max WHERE tp_max.patient_ID = tp.patient_ID) - 1) GROUP BY patient_ID
Why not use simply...
GROUP BY `patient_ID` DESC LIMIT 2
... and do the rest in the next step?

Getting counts of records within bands

I have a table of data which contains numbers from 0 to 100.
I would like to write a query that gets counts of records in the bands 0 to 10, 11 to 20, ... and 91 to 100.
Is this possible?
Many thanks for any help.
Dave
Assuming your table looks something like this...
CREATE TABLE `test1` (
`ts` BIGINT(20) DEFAULT NULL
) ENGINE=INNODB;
...you could tackle this with a mathematical approach:
SELECT FLOOR(GREATEST(T.ts - 1, 0) / 10) AS "tt",
COUNT(*)
FROM test1 AS T
GROUP BY tt;
A subquery per band would do the job for you:
SELECT
(SELECT COUNT(brands) FROM data_table where brands BETWEEN 0 and 10 ) as '0-10',
...
(SELECT COUNT(brands) FROM data_table where brands BETWEEN 91 and 100 ) as '91-100'
SELECT COUNT(*)
FROM (
SELECT * FROM (SELECT * FROM `table` WHERE val >= 'lowerlimit') AS lower_filtered
WHERE val <= 'upperlimit'
) AS band
This should give you the results:
SELECT MIN(`id`) `id_from`, MAX(`id`) as `id_to`, COUNT(1) `count_id`
FROM `session`
GROUP BY (FLOOR(IF(id>0, id-1, id) / 10));
Please feel free to change the table and column names as per your schema.
Hope this helps.
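The grouping trick from the last answer can be checked against the bands asked for in the question (0 to 10, 11 to 20, ..., 91 to 100). This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL; the table readings, the column val and the helper band_counts are made-up names. It relies on SQLite's integer division, so in MySQL the same expression would need DIV or FLOOR(... / 10).

```python
import sqlite3


def band_counts(conn):
    """Return (band, lowest value, highest value, count) per band of ten."""
    sql = """
        SELECT (CASE WHEN val > 0 THEN val - 1 ELSE val END) / 10 AS band,
               MIN(val) AS val_from,
               MAX(val) AS val_to,
               COUNT(*) AS cnt
        FROM readings
        GROUP BY band
        ORDER BY band
    """
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (val INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?)",
    [(0,), (5,), (10,), (11,), (20,), (21,), (95,), (100,)],
)

for row in band_counts(conn):
    print(row)
# (0, 0, 10, 3)
# (1, 11, 20, 2)
# (2, 21, 21, 1)
# (9, 95, 100, 2)
```

Subtracting 1 before dividing moves the multiples of ten (10, 20, ..., 100) into the lower band, and the CASE keeps 0 in the first band instead of producing a negative band number.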