I have a database of baseball plays with a PlayerID and a TypeID (the kind of play: double, strike out, etc). The data looks something like this:
+----------+--------+
| playerid | typeid |
+----------+--------+
|        2 |      4 |
|        2 |      4 |
|        2 |      7 |
|        3 |      7 |
|        3 |      7 |
|        3 |      7 |
|        3 |     26 |
|        3 |      7 |
+----------+--------+
I'm trying to find which players had the most of each kind of play. E.g. Jim (PlayerID 3) had the most strike outs (TypeID 7) and Bob (PlayerID 2) had the most home runs (TypeID 4) which should result in the following table:
+----------+--------+----------------+
| playerid | typeid | max(playcount) |
+----------+--------+----------------+
|        2 |      4 |             12 |
|        3 |      7 |              9 |
|        3 |     26 |              1 |
+----------+--------+----------------+
My best attempt so far is to run:
SELECT playerid,typeid,MAX(playcount) FROM
(
SELECT playerid,typeid,COUNT(*) playcount FROM plays GROUP BY playerid,typeid
) AS t GROUP BY typeid;
Which returns the proper maximums of each type, but the associated PlayerIDs are all wrong and I can't figure out why. I'm sure I'm missing something simple (or making this overly complicated) but can't figure it out. Any ideas?
In MySQL, this group-wise maximum is sadly not as simple as you want it to be.
Here's a way to do it using a method similar to what is suggested in ROW_NUMBER() in MySQL:
SELECT a.*
FROM (
SELECT playerid
,typeid
,COUNT(*) playcount
FROM plays
GROUP BY playerid,typeid
) a
LEFT JOIN
(
SELECT playerid
,typeid
,COUNT(*) playcount
FROM plays
GROUP BY playerid,typeid
) b
ON a.typeid = b.typeid
AND a.playcount < b.playcount
WHERE b.playerid IS NULL
You also have to put the playerid column in the GROUP BY clause.
The rest is fine.
SELECT playerid,typeid,MAX(playcount) FROM
(
SELECT playerid,typeid,COUNT(*) playcount FROM plays GROUP BY playerid,typeid
) AS t GROUP BY playerid,typeid;
Would this work?
SELECT
playertypecounts.*
FROM
(SELECT
playerid,
typeid,
COUNT(*) as playcount
FROM plays
GROUP BY playerid, typeid) playertypecounts
INNER JOIN
(SELECT
typeid,
MAX(playcount) as maxplaycount
FROM
(SELECT
playerid,
typeid,
COUNT(*) as playcount
FROM plays
GROUP BY playerid, typeid) playcounts
GROUP BY typeid) maxplaycounts
ON playertypecounts.typeid = maxplaycounts.typeid
AND playertypecounts.playcount = maxplaycounts.maxplaycount
This part of the query block returns the maximum playcount for each typeid:
(SELECT
typeid,
MAX(playcount) as maxplaycount
FROM
(SELECT
playerid,
typeid,
COUNT(*) as playcount
FROM plays
GROUP BY playerid, typeid) playcounts
GROUP BY typeid) maxplaycounts
Then it's inner-joined to all the typeid/playcounts in order to filter those counts where the player(s) have the maximum counts for any given typeid.
See SQLFiddle example.
Having said all that, I actually prefer @KarlKieninger's answer since it's more elegant.
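The join-to-the-maximum idea above can also be written with common table expressions so the per-player count is stated only once. The following is a minimal sketch, not the answerer's code: it assumes MySQL 8.0 or later for the WITH clause, and it runs the SQL through Python's built-in sqlite3 module purely so it can be tried without a server. The function name top_players is made up for illustration; the data is the eight sample rows from the question.

```python
import sqlite3

# The eight sample plays from the question: (playerid, typeid).
PLAYS = [(2, 4), (2, 4), (2, 7), (3, 7), (3, 7), (3, 7), (3, 26), (3, 7)]

QUERY = """
WITH playcounts AS (
    -- plays per (player, type), computed once
    SELECT playerid, typeid, COUNT(*) AS playcount
    FROM plays
    GROUP BY playerid, typeid
),
maxplaycounts AS (
    -- best count per play type
    SELECT typeid, MAX(playcount) AS maxplaycount
    FROM playcounts
    GROUP BY typeid
)
SELECT c.playerid, c.typeid, c.playcount
FROM playcounts AS c
JOIN maxplaycounts AS m
  ON m.typeid = c.typeid
 AND m.maxplaycount = c.playcount
ORDER BY c.typeid, c.playerid
"""


def top_players(rows):
    """Return (playerid, typeid, playcount) for the leader(s) of each play type."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE plays (playerid INTEGER, typeid INTEGER)")
    conn.executemany("INSERT INTO plays VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


leaders = top_players(PLAYS)
print(leaders)
```

Players tied for the lead in a play type are all returned, just as with the inner-join query above.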
My task is to find all subjects, by their id, that have the fewest (but at least one) lowest passing grades in the database (the lowest passing grade being 6). I've managed to write a solution with three queries; however, my task is to write it as a single query in MySQL. Thank you in advance.
-- 1. single query "solution"
SELECT subject_id FROM (SELECT subject_id, COUNT(*) AS six_count
FROM exams WHERE grade = 6
GROUP BY subject_id) AS sixes
WHERE subject_id = (SELECT MIN(six_count) FROM sixes);
-- 2. multiple queries solution
CREATE TABLE sixes AS (SELECT subject_id, COUNT(*) AS six_count
FROM exams WHERE grade = 6
GROUP BY subject_id);
SELECT subject_id FROM sixes
WHERE subject_id = (SELECT MIN(six_count) FROM sixes);
DROP TABLE sixes;
EDIT:
Exams table example:
| subject_id | student_id | exam_year | exam_mark | grade | exam_date |
| 1 | 20100022| 2011 | 'apr' | 10 | 2011-04-11 |
| 2 | 20100055| 2011 | 'oct' | 6 | 2011-10-04 |
| 3 | 20110030| 2011 | 'jan1' | 7 | 2011-01-26 |
| 5 | 20110055| 2011 | 'jan2' | 6 | 2011-02-13 |
| 5 | 20110001| 2011 | 'jun1' | 8 | 2011-06-23 |
This should do the trick. The subquery finds the lowest number of sixes that any subject has. The main query selects all subjects with that number. The trick is in ORDER BY count(*) LIMIT 1, which makes the subquery return only the lowest count.
SELECT
subject_id,
count(*) as six_count
FROM exams
WHERE grade = 6
GROUP BY subject_id
HAVING count(*) =
( SELECT count(*)
FROM exams
WHERE grade = 6
GROUP BY subject_id
ORDER BY count(*)
LIMIT 1
)
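As a quick check of the HAVING pattern, here is a small sketch run through Python's built-in sqlite3 module. It is a variant rather than the query above: it takes MIN over the per-subject counts instead of ORDER BY ... LIMIT 1, which should be equivalent. The first five rows mirror the question's sample (subject_id and grade only); the last two rows and the function name are made up for illustration.

```python
import sqlite3

# (subject_id, grade). The first five rows mirror the question's sample;
# the last two are hypothetical so that one subject has more sixes.
EXAMS = [(1, 10), (2, 6), (3, 7), (5, 6), (5, 8), (3, 6), (3, 6)]

QUERY = """
SELECT subject_id, COUNT(*) AS six_count
FROM exams
WHERE grade = 6
GROUP BY subject_id
HAVING COUNT(*) = (
    -- smallest per-subject number of sixes
    SELECT MIN(six_count)
    FROM (
        SELECT COUNT(*) AS six_count
        FROM exams
        WHERE grade = 6
        GROUP BY subject_id
    ) AS per_subject
)
ORDER BY subject_id
"""


def subjects_with_fewest_sixes(rows):
    """Return (subject_id, six_count) for every subject tied for the fewest sixes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE exams (subject_id INTEGER, grade INTEGER)")
    conn.executemany("INSERT INTO exams VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


fewest = subjects_with_fewest_sixes(EXAMS)
print(fewest)
```

Subjects with no sixes at all never appear, because the WHERE clause removes them before grouping; that matches the "at least one" requirement.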
This pattern should do the trick; the names are generalized.
SELECT subjectID
FROM TEST_DATA
WHERE grade = 6
GROUP
BY SubjectID
HAVING COUNT(1) =
( SELECT count(1) AS minCount
FROM TEST_DATA
WHERE grade = 6
GROUP
BY subjectID
ORDER
BY minCount
LIMIT 1
);
I've got a table like this:
ID | Winner | Loser | WinningCaster | LosingCaster
0 | Player A | Player B | Warcaster A | Warcaster B
1 | Player A | Player B | Warcaster C | Warcaster A
2 | Player C | Player D | Warcaster A | Warcaster B
etc..
With various values for Player, and Warcaster.
WinningCaster and LosingCaster draw from a finite list of names. I want a query that finds which name occurs most often across both columns, both overall and restricted to a particular player.
I.e., Player A should return Warcaster A with 2, and an overall query should return Warcaster A with 3.
So far I've only been able to get the most frequent name from either column, not from both, with the following:
SELECT
ID, Winner, Loser, CasterWinner, Count(CasterWinner) AS Occ
FROM
`Games`
GROUP BY
CasterWinner
ORDER BY
Occ DESC
LIMIT 1
Use union all:
select caster, count(*)
from ((select casterwinner as caster from games
) union all
(select casterloser from games
)
) c
group by caster
order by count(*) desc
limit 1;
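The UNION ALL query covers the overall case. A sketch of how the per-player case might be added follows, run through Python's built-in sqlite3 module. It assumes the column names from the table shown in the question (Winner, Loser, WinningCaster, LosingCaster) and assumes that "Player A" means every game Player A took part in, since that is the reading that gives Warcaster A a count of 2. The function name and the alphabetical tie-break are illustrative choices.

```python
import sqlite3

# The three sample games from the question.
GAMES = [
    (0, "Player A", "Player B", "Warcaster A", "Warcaster B"),
    (1, "Player A", "Player B", "Warcaster C", "Warcaster A"),
    (2, "Player C", "Player D", "Warcaster A", "Warcaster B"),
]

QUERY = """
SELECT caster, COUNT(*) AS occurrences
FROM (
    SELECT WinningCaster AS caster, Winner, Loser FROM Games
    UNION ALL
    SELECT LosingCaster, Winner, Loser FROM Games
) AS c
-- a NULL :player means "no filter"
WHERE :player IS NULL OR Winner = :player OR Loser = :player
GROUP BY caster
ORDER BY occurrences DESC, caster
LIMIT 1
"""


def most_used_caster(games, player=None):
    """Return (caster, occurrences) across both caster columns, or None if no games match."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Games (ID INTEGER, Winner TEXT, Loser TEXT, "
        "WinningCaster TEXT, LosingCaster TEXT)"
    )
    conn.executemany("INSERT INTO Games VALUES (?, ?, ?, ?, ?)", games)
    row = conn.execute(QUERY, {"player": player}).fetchone()
    conn.close()
    return row


overall = most_used_caster(GAMES)
for_player_a = most_used_caster(GAMES, "Player A")
print(overall, for_player_a)
```

UNION ALL rather than UNION matters here: UNION would drop duplicate rows and undercount.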
Sorry if my title is confusing. I am building an auction system and am having difficulty getting a user's winning items.
For example, I have a table with these columns:
id, product_id, user_id, status, is_winner, info, bidding_price, bidding_date
here's my sql fiddle:
http://sqlfiddle.com/#!9/7097d/1
I want to get every item that a user has already won, so I need to identify whether they were the last to bid on that item.
I need to filter by user_id.
If I run a query like this:
SELECT MAX(product_id) AS product_id FROM auction_product_bidding
WHERE user_id = 3;
it returns only product_id 12 and misses product_id 9, even though user_id 3 also placed the last bid on product 9.
Can you help me? I hope you get my point. Thanks, and sorry if my question is a little confusing.
According to your question, it seems product 11 is also one you want. Try this query:
SELECT apd.product_id
FROM auction_product_bidding apd
JOIN (
SELECT MAX(bidding_date) AS bidding_date, product_id
FROM auction_product_bidding
GROUP BY product_id
) t
ON apd.product_id = t.product_id
AND apd.bidding_date = t.bidding_date
WHERE apd.user_id = 3;
Check Demo Here
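To see the join against the latest bid per product in action without the fiddle, here is a small sketch run through Python's built-in sqlite3 module. The bid rows are hypothetical (only product_id, user_id, and bidding_date are modeled), the function name is made up, and timestamps are stored as ISO text, which sorts chronologically; in MySQL the column would presumably be a DATETIME.

```python
import sqlite3

# Hypothetical bids: (id, product_id, user_id, bidding_date).
BIDS = [
    (1, 9, 2, "2016-08-01 10:00:00"),
    (2, 9, 3, "2016-08-02 16:31:23"),
    (3, 11, 3, "2016-08-02 12:04:16"),
    (4, 12, 1, "2016-08-09 08:00:00"),
    (5, 12, 3, "2016-08-10 09:20:01"),
    (6, 13, 3, "2016-08-03 09:00:00"),
    (7, 13, 4, "2016-08-04 09:00:00"),
]

QUERY = """
SELECT apd.product_id
FROM auction_product_bidding AS apd
JOIN (
    -- latest bid time per product
    SELECT product_id, MAX(bidding_date) AS bidding_date
    FROM auction_product_bidding
    GROUP BY product_id
) AS t
  ON apd.product_id = t.product_id
 AND apd.bidding_date = t.bidding_date
WHERE apd.user_id = ?
ORDER BY apd.product_id
"""


def products_won(bids, user_id):
    """Return the product ids on which user_id placed the most recent bid."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE auction_product_bidding ("
        "id INTEGER PRIMARY KEY, product_id INTEGER, "
        "user_id INTEGER, bidding_date TEXT)"
    )
    conn.executemany("INSERT INTO auction_product_bidding VALUES (?, ?, ?, ?)", bids)
    rows = conn.execute(QUERY, (user_id,)).fetchall()
    conn.close()
    return [r[0] for r in rows]


won_by_3 = products_won(BIDS, 3)
print(won_by_3)
```

User 3 bid on product 13 as well but was outbid later, so 13 is not reported. If two users share the latest timestamp on a product, both would be reported as winners, so a tie-break (for example on id) may be needed.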
select id,product_id,user_id,status,is_winner,info,bidding_price,bidding_date,rank
from
( SELECT apb.*,
greatest(@rank:=if(product_id=@prodGrp,@rank+1,1),-1) as rank,
@prodGrp:=product_id as dummy
FROM auction_product_bidding apb
cross join (select @prodGrp:=-1,@rank:=0) xParams
order by product_id,bidding_date DESC
) xDerived
where user_id=3 and rank=1;
That user won products 9, 11, and 12:
+----+------------+---------+--------+-----------+------+---------------+---------------------+------+
| id | product_id | user_id | status | is_winner | info | bidding_price | bidding_date | rank |
+----+------------+---------+--------+-----------+------+---------------+---------------------+------+
| 60 | 9 | 3 | | 0 | | 75000.00 | 2016-08-02 16:31:23 | 1 |
| 59 | 11 | 3 | | 0 | | 15000.00 | 2016-08-02 12:04:16 | 1 |
| 68 | 12 | 3 | | 0 | | 18000.00 | 2016-08-10 09:20:01 | 1 |
+----+------------+---------+--------+-----------+------+---------------+---------------------+------+
SELECT product_id FROM auction_product_bidding where bidding_price= any
(select max(bidding_price) from auction_product_bidding group by product_id)
and user_id='3';
select * from
(select product_id,user_id,max(bidding_price) from
(select * from auction_product_bidding order by bidding_price desc) a
group by product_id) b
where user_id=3;
Answer:
product_id user_id max(bidding_price)
9 3 75000
11 3 15000
12 3 18000
An idea could be to sort the table descending by date and select every distinct pair of product_id and user_id. Something like
SELECT DISTINCT product_id, user_id FROM (
SELECT * FROM auction_product_bidding ORDER BY bidding_date DESC
) AS sorted
You want every product on which user 3 placed the last bid, is that right?
I have a rather complicated problem. Let me first explain what I am doing right now:
I have a table named feedback in which I store grades against course IDs. The table looks like this:
+-------+-------+-------+-------+-----------+--------------
| id | cid | grade |g_point| workload | easiness
+-------+-------+-------+-------+-----------+--------------
| 1 | 10 | A+ | 1 | 5 | 4
| 2 | 10 | A+ | 1 | 2 | 4
| 3 | 10 | B | 3 | 3 | 3
| 4 | 11 | B+ | 2 | 2 | 3
| 5 | 11 | A+ | 1 | 5 | 4
| 6 | 12 | B | 3 | 3 | 3
| 7 | 11 | B+ | 2 | 7 | 8
| 8 | 11 | A+ | 1 | 1 | 2
g_point has just specific values for the grades, thus I can use these values to show the user courses sorted by grades.
Okay, my first task is to print out the grade of each course. The grade is the one that occurs most often for the course. For example, from this table we can see that the result for cid = 10 will be A+, because it appears twice. This is simple, and I have already implemented this query, which I include at the end.
The main problem is a course like cid = 11, where two different grades occur equally often. In that situation the client asks me to take the average of workload and easiness for each of these grades, and whichever grade has the greater average should be shown. The average would be computed like this:
(average of all workload values of the grade for the course
+ average of all easiness values of the grade for the course)
/ 2
In this example cid = 11 has four entries, with two grades occurring equally often:
B+ grade average
avg workload: (2 + 7)/2 = 4.5 = x
avg easiness: (3 + 8)/2 = 5.5 = y
answer: (x + y)/2 = 5
A+ grade average
avg workload: (5 + 1)/2 = 3 = x
avg easiness: (4 + 2)/2 = 3 = y
answer: (x + y)/2 = 3
so the grade should be B+.
This is the query which I am running to get the max occurrence grade
SELECT
f3.coursecodeID cid,
f3.grade_point p,
f3.grade g
FROM (
SELECT
coursecodeID,
MAX(mode_qty) mode_qty
FROM (
SELECT
coursecodeID,
COUNT(grade_point) mode_qty
FROM feedback
GROUP BY
coursecodeID, grade_point
) f1
GROUP BY coursecodeID
) f2
INNER JOIN (
SELECT
coursecodeID,
grade_point,
grade,
COUNT(grade_point) mode_qty
FROM feedback
GROUP BY
coursecodeID, grade_point
) f3
ON
f2.coursecodeID = f3.coursecodeID AND
f2.mode_qty = f3.mode_qty
GROUP BY f3.coursecodeID
ORDER BY f3.grade_point
Here is SQL Fiddle.
I added a table Courses with the list of all course IDs, to make the main idea of the query easier to see. Most likely you have it in the real database. If not, you can generate it on the fly from feedback by grouping by cid.
For each cid we need to find the grade. Group feedback by cid, grade to get a list of all grades for the cid. We need to pick only one grade per cid, so we use LIMIT 1. To determine which grade to pick, we order them: first by occurrence (a simple COUNT), second by the average score. Finally, if several grades have the same occurrence and the same average score, pick the grade with the smallest g_point. You can adjust the rules by tweaking the ORDER BY clause.
SELECT
courses.cid
,(
SELECT feedback.grade
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGrade
FROM courses
ORDER BY courses.cid
result set
cid CourseGrade
10 A+
11 B+
12 B
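The ordering rules can be checked against the question's eight feedback rows with a short sketch run through Python's built-in sqlite3 module. Two things differ from the query above and are assumptions of this sketch: the list of courses is derived on the fly with SELECT DISTINCT cid instead of a separate courses table, and the final tie-break uses MIN(g_point) so that every ORDER BY term is an aggregate. The function name is made up.

```python
import sqlite3

# The question's sample rows: (id, cid, grade, g_point, workload, easiness).
FEEDBACK = [
    (1, 10, "A+", 1, 5, 4),
    (2, 10, "A+", 1, 2, 4),
    (3, 10, "B", 3, 3, 3),
    (4, 11, "B+", 2, 2, 3),
    (5, 11, "A+", 1, 5, 4),
    (6, 12, "B", 3, 3, 3),
    (7, 11, "B+", 2, 7, 8),
    (8, 11, "A+", 1, 1, 2),
]

QUERY = """
SELECT c.cid,
       (SELECT f.grade
        FROM feedback AS f
        WHERE f.cid = c.cid
        GROUP BY f.grade
        ORDER BY COUNT(*) DESC,                                  -- most frequent first
                 (AVG(f.workload) + AVG(f.easiness)) / 2 DESC,   -- then best average
                 MIN(f.g_point)                                  -- then smallest g_point
        LIMIT 1) AS course_grade
FROM (SELECT DISTINCT cid FROM feedback) AS c
ORDER BY c.cid
"""


def course_grades(rows):
    """Return (cid, grade) with one chosen grade per course."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE feedback (id INTEGER, cid INTEGER, grade TEXT, "
        "g_point INTEGER, workload INTEGER, easiness INTEGER)"
    )
    conn.executemany("INSERT INTO feedback VALUES (?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


grades = course_grades(FEEDBACK)
print(grades)
```

For cid 11 both grades occur twice, so the average decides: B+ scores (4.5 + 5.5)/2 = 5 against 3 for A+.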
UPDATE
MySQL didn't have lateral joins until version 8.0.14, so one possible way to get the second column, g_point, is to repeat the correlated subquery. SQL Fiddle
SELECT
courses.cid
,(
SELECT feedback.grade
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGrade
,(
SELECT feedback.g_point
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGPoint
FROM courses
ORDER BY CourseGPoint
result set
cid CourseGrade CourseGPoint
10 A+ 1
11 B+ 2
12 B 3
Update 2 Added average score into ORDER BY SQL Fiddle
SELECT
courses.cid
,(
SELECT feedback.grade
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGrade
,(
SELECT feedback.g_point
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGPoint
,(
SELECT (AVG(workload) + AVG(easiness))/2
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS AvgScore
FROM courses
ORDER BY CourseGPoint, AvgScore DESC
result
cid CourseGrade CourseGPoint AvgScore
10 A+ 1 3.75
11 B+ 2 5
12 B 3 3
If I understood correctly, you need an inner select to find the averages and an outer select to find the maximum of those averages:
select cid, grade, max(average)/2 from (
select cid, grade, avg(workload + easiness) as average
from feedback
group by cid, grade
) x group by cid, grade
This solution has been tested on your data using SQL Fiddle at this link.
If you change the previous query to
select cid, max(average)/2 from (
select cid, grade, avg(workload + easiness) as average
from feedback
group by cid, grade
) x group by cid
You will find the max average for each cid.
As mentioned in the comments, you have to choose which strategy to use if more than one grade meets the max average. For example, if you have
+-------+-------+-------+-------+-----------+--------------
| id | cid | grade |g_point| workload | easiness
+-------+-------+-------+-------+-----------+--------------
| 1 | 10 | A+ | 1 | 5 | 4
| 2 | 10 | A+ | 1 | 2 | 4
| 3 | 10 | B | 3 | 3 | 3
| 4 | 11 | B+ | 2 | 2 | 3
| 5 | 11 | A+ | 1 | 5 | 4
| 9 | 11 | C | 1 | 3 | 6
you will have grades A+ and C both satisfying the maximum average of 4.5 for cid 11.
I have a table like this:
Table: p
+----+------+
| id | w_id |
+----+------+
|  5 |    8 |
|  5 |   10 |
|  5 |    8 |
|  5 |   10 |
|  5 |    8 |
|  6 |    5 |
|  6 |    8 |
|  6 |   10 |
|  6 |   10 |
|  7 |    8 |
|  7 |   10 |
+----+------+
What is the best SQL to get the following result? :
+----+----------------+
| id | most_used_w_id |
+----+----------------+
|  5 |              8 |
|  6 |             10 |
|  7 |              8 |
+----+----------------+
In other words, to get, per id, the most frequent related w_id.
Note that on the example above, id 7 is related to 8 once and to 10 once.
So, either (7, 8) or (7, 10) will do as result. If it is not possible to
pick up one, then both (7, 8) and (7, 10) on result set will be ok.
I have come up with something like:
select counters2.p_id as id, counters2.w_id as most_used_w_id
from (
select p.id as p_id,
w_id,
count(w_id) as count_of_w_ids
from p
group by id, w_id
) as counters2
join (
select p_id, max(count_of_w_ids) as max_counter_for_w_ids
from (
select p.id as p_id,
w_id,
count(w_id) as count_of_w_ids
from p
group by id, w_id
) as counters
group by p_id
) as p_max
on p_max.p_id = counters2.p_id
and p_max.max_counter_for_w_ids = counters2.count_of_w_ids
;
but I am not sure at all whether this is the best way to do it. And I had to repeat the same sub-query two times.
Any better solution?
Try using user-defined variables:
select id,w_id
FROM
( select T.*,
if(@id<>id,1,0) as row,
@id:=id FROM
(
select id,W_id, Count(*) as cnt FROM p Group by ID,W_id
) as T,(SELECT @id:=0) as T1
ORDER BY id,cnt DESC
) as T2
WHERE Row=1
SQLFiddle demo
Formal SQL
In fact, your solution is correct in terms of standard SQL. Why? Because you have to join values from the original data to the grouped data, so your query cannot be simplified. MySQL allows mixing non-grouped columns with aggregate functions, but the result is unreliable, so I would not recommend relying on that behavior.
MySQL
Since you're using MySQL, you can use variables. I'm not a big fan of them, but for your case they may be used to simplify things:
SELECT
c.*,
IF(@id!=id, @i:=1, @i:=@i+1) AS num,
@id:=id AS gid
FROM
(SELECT id, w_id, COUNT(w_id) AS w_count
FROM t
GROUP BY id, w_id
ORDER BY id DESC, w_count DESC) AS c
CROSS JOIN (SELECT @i:=-1, @id:=-1) AS init
HAVING
num=1;
So for your data result will look like:
+------+------+---------+------+------+
| id | w_id | w_count | num | gid |
+------+------+---------+------+------+
| 7 | 8 | 1 | 1 | 7 |
| 6 | 10 | 2 | 1 | 6 |
| 5 | 8 | 3 | 1 | 5 |
+------+------+---------+------+------+
Thus, you've found each id and its corresponding w_id. The idea is to enumerate the rows, relying on the fact that they are ordered in the subquery, so we need only the first row per id (it represents the data with the highest count).
This could be replaced with a single GROUP BY id, but again, the server is free to choose any row in that case (it works because it happens to take the first row, but the documentation guarantees nothing of the sort in the general case).
One nice thing about this approach is that you can select, for example, the 2nd or 3rd most frequent; it's very flexible.
Performance
To increase performance, you can create an index on (id, w_id); it will be used for ordering and grouping the records. Variables and HAVING, however, force a row-by-row scan of the set derived by the inner GROUP BY. That isn't as bad as a full scan of the original data, but it is still a drawback of doing this with variables. On the other hand, doing it with a JOIN and subquery as in your query won't be much different, because a temporary table is created for the subquery result set too.
But to be certain, you'll have to test. And keep in mind that you already have a valid solution, which, by the way, isn't tied to DBMS-specific features and is good in terms of standard SQL.
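On MySQL 8.0 or later, window functions offer another way to number the rows per id without user variables, including picking the 2nd or 3rd most frequent value. The following is a sketch of that swapped-in technique, not part of the answer above. It runs the SQL through Python's built-in sqlite3 module and assumes the bundled SQLite is 3.25 or newer (the first version with window functions). The function name and the tie-break on w_id are illustrative choices.

```python
import sqlite3

# The question's sample table p: (id, w_id).
P_ROWS = [
    (5, 8), (5, 10), (5, 8), (5, 10), (5, 8),
    (6, 5), (6, 8), (6, 10), (6, 10),
    (7, 8), (7, 10),
]

QUERY = """
WITH counts AS (
    SELECT id, w_id, COUNT(*) AS cnt
    FROM p
    GROUP BY id, w_id
),
ranked AS (
    -- number the w_ids of each id from most to least frequent;
    -- ties are broken by the smaller w_id so the result is deterministic
    SELECT id, w_id, cnt,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY cnt DESC, w_id) AS rn
    FROM counts
)
SELECT id, w_id, cnt
FROM ranked
WHERE rn = ?
ORDER BY id
"""


def nth_most_used(rows, n=1):
    """Return (id, w_id, cnt) for the n-th most frequent w_id of each id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE p (id INTEGER, w_id INTEGER)")
    conn.executemany("INSERT INTO p VALUES (?, ?)", rows)
    result = conn.execute(QUERY, (n,)).fetchall()
    conn.close()
    return result


most_used = nth_most_used(P_ROWS)
second_most_used = nth_most_used(P_ROWS, 2)
print(most_used, second_most_used)
```

Replacing ROW_NUMBER() with RANK() would keep both (7, 8) and (7, 10) for n = 1, which the question says is also acceptable.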
Try this query
select p_id, ccc , w_id from
(
select p.id as p_id,
w_id, count(w_id) ccc
from p
group by id,w_id order by id,ccc desc) xxx
group by p_id having max(ccc)
Here is the SQLFiddle link.
You can also use this code if you do not want to rely on the first record of non-grouping columns
select p_id, ccc , w_id from
(
select p.id as p_id,
w_id, count(w_id) ccc
from p
group by id,w_id order by id,ccc desc) xxx
group by p_id having ccc=max(ccc);