I'm currently learning MySQL and am working on a query that displays the top 5 and bottom 5 categories, ranked by the number of groups in each category, by joining 2 tables. What I have meets the requirements, but I want to display it more cleanly. I got this to work using a UNION, but I was wondering whether I could show the results as four columns instead: 2 columns for the top 5 categories and 2 for the bottom 5.
Current query:
SELECT *
FROM (SELECT
category_name,
count(category_name) AS NumOfGroups
From
category c
JOIN
grp g ON c.category_id=g.category_id
GROUP BY category_name
order by NumOfGroups desc
LIMIT 5) most
UNION
SELECT *
FROM (SELECT
category_name,
count(category_name) AS NumOfGroups
From
category c
JOIN
grp g ON c.category_id=g.category_id
GROUP BY category_name
ORDER BY NumOfGroups ASC
LIMIT 5) Least;
This displays:
category NumOfGroups
Tech 911
Food & Drink 790
Photography 320
Outdoors & Adventure 218
Games 166
Singles 4
Fitness 15
Paranormal 16
Fashion & Beauty 26
Movements & Politics 32
Can I take this one step further to display a result like below?
Would I have to transpose?
Desired result:
category NumOfGroups category NumOfGroups
Tech 911 Singles 4
Food & Drink 790 Fitness 15
Photography 320 Paranormal 16
Outdoors & Adventure 218 Fashion & Beauty 26
Games 166 Movements & Politics 32
Create a CTE where you use ROW_NUMBER() window function twice to rank the rows based on the value of NumOfGroups and then do a self join:
WITH cte AS (
SELECT c.category_name, COUNT(*) NumOfGroups,
ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC) rn_most,
ROW_NUMBER() OVER (ORDER BY COUNT(*)) rn_least
FROM category c JOIN grp g
ON c.category_id = g.category_id
GROUP BY c.category_name
)
SELECT c1.category_name category_most, c1.NumOfGroups NumOfGroups_most,
c2.category_name category_least, c2.NumOfGroups NumOfGroups_least
FROM cte c1 INNER JOIN cte c2
ON c2.rn_least = c1.rn_most
WHERE c1.rn_most <= 5
ORDER BY c2.rn_least
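To try the idea without a MySQL server, here is a minimal sketch that runs the same technique against an in-memory SQLite database from Python. The sample categories and counts are made up, SQLite stands in for MySQL (an assumption; it needs SQLite 3.25+ for window functions), and the counting and ranking are split into two CTEs, which is a small deviation from the query above.

```python
import sqlite3

# Hypothetical sample data: category name -> number of groups to create.
SAMPLE = {"Tech": 6, "Food": 5, "Photo": 4, "Games": 3, "Singles": 1, "Fitness": 2}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (category_id INTEGER PRIMARY KEY, category_name TEXT);
    CREATE TABLE grp (grp_id INTEGER PRIMARY KEY, category_id INTEGER);
""")
for cat_id, (name, n_groups) in enumerate(SAMPLE.items(), start=1):
    conn.execute("INSERT INTO category VALUES (?, ?)", (cat_id, name))
    conn.executemany("INSERT INTO grp (category_id) VALUES (?)", [(cat_id,)] * n_groups)

TOP_N = 2  # the question uses 5; 2 keeps the sample small

QUERY = """
WITH counts AS (
    SELECT c.category_name, COUNT(*) AS NumOfGroups
    FROM category c JOIN grp g ON c.category_id = g.category_id
    GROUP BY c.category_name
),
ranked AS (
    -- Rank every category twice: from the top and from the bottom.
    SELECT category_name, NumOfGroups,
           ROW_NUMBER() OVER (ORDER BY NumOfGroups DESC) AS rn_most,
           ROW_NUMBER() OVER (ORDER BY NumOfGroups ASC)  AS rn_least
    FROM counts
)
-- Self join: the k-th largest row sits next to the k-th smallest row.
SELECT m.category_name, m.NumOfGroups, l.category_name, l.NumOfGroups
FROM ranked m JOIN ranked l ON l.rn_least = m.rn_most
WHERE m.rn_most <= ?
ORDER BY m.rn_most
"""

rows = conn.execute(QUERY, (TOP_N,)).fetchall()
for row in rows:
    print(row)
```

Ties in NumOfGroups would make the row numbers arbitrary among the tied categories, so a tiebreaker such as the category name in each ORDER BY may be worth adding.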
IMO, this is best done at the application level rather than in your database queries. Using each tool as it's designed results in cleaner solutions. However, if you really need to do this in MySQL, you can generate row numbers in each of your subqueries and join on them to make a unified result.
SET @row := 0;
SET @row2 := 0;
SELECT most.category_name, most.numberOfGroups, least.category_name, least.numberOfGroups
FROM (
SELECT *, @row := @row + 1 AS rownum
FROM (
SELECT
category_name,
COUNT(*) numberOfGroups
FROM category c
JOIN grp g ON c.category_id = g.category_id
GROUP BY category_name
ORDER BY numberOfGroups DESC
LIMIT 5
) temp
) most
LEFT JOIN (
SELECT *, @row2 := @row2 + 1 AS rownum
FROM (
SELECT
category_name,
COUNT(*) numberOfGroups
FROM category c
JOIN grp g ON c.category_id = g.category_id
GROUP BY category_name
ORDER BY numberOfGroups ASC
LIMIT 5
) temp
) least
ON most.rownum = least.rownum;
There's still a caveat: because this is a LEFT JOIN starting from "most", the "most" subquery must always return at least as many rows as "least", or the extra "least" rows will be clipped. As long as both always return 5 rows (which seems very likely in your case), you'll be safe.
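For the application-level route, a minimal sketch in Python could look like the following. It assumes the two result sets have already been fetched by two separate queries; the function name `side_by_side` is hypothetical, and the rows are the ones shown in the question.

```python
from itertools import zip_longest

def side_by_side(most, least):
    """Pair the i-th row of `most` with the i-th row of `least`.

    zip_longest pads the shorter side with (None, None), so no rows are
    clipped when the two lists differ in length.
    """
    return [m + l for m, l in zip_longest(most, least, fillvalue=(None, None))]

# Rows as they appear in the question's output.
most = [("Tech", 911), ("Food & Drink", 790), ("Photography", 320),
        ("Outdoors & Adventure", 218), ("Games", 166)]
least = [("Singles", 4), ("Fitness", 15), ("Paranormal", 16),
         ("Fashion & Beauty", 26), ("Movements & Politics", 32)]

for row in side_by_side(most, least):
    print(row)
```

Padding with `zip_longest` sidesteps the clipping caveat, since neither side has to be the longer one.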
I want to get the id of the lowest points from each team (the team field).
My query works, but I need to make sure it is good enough for a large table.
I need simplification and optimization.
Query:
SELECT T.id from teams as T
INNER JOIN (
SELECT MIN(T1.points) AS P FROM teams AS T1
GROUP BY T1.team LIMIT 5
) TJOIN ON T.points IN (TJOIN.P)
GROUP BY T.team
ORDER BY T.points ASC LIMIT 5
Table teams
id team (foreign_key) points (indexed)
1 a 100
2 a 101
3 b 106
4 c 105
5 c 102
Result
id
1
5
3
I believe the query you are looking for is:
SELECT MIN(T.id) AS id
FROM teams AS T
INNER JOIN (
SELECT team, MIN(points) AS min_points
FROM teams
GROUP BY team
) TJOIN
ON T.team = TJOIN.team
AND T.points = TJOIN.min_points
GROUP BY T.team, TJOIN.min_points
ORDER BY TJOIN.min_points ASC
LIMIT 5
You need to join based on both the column being grouped by and the min value. Consider the result of your query if multiple teams had a score of 100.
Another way of doing this is to use ROW_NUMBER():
SELECT id
FROM (
SELECT id, points, ROW_NUMBER() OVER (PARTITION BY team ORDER BY points ASC, id ASC) rn
FROM teams
) t
WHERE rn = 1
ORDER BY points ASC
LIMIT 5
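Both approaches can be checked against the sample table from the question. This is a rough sketch that uses an in-memory SQLite database from Python as a stand-in for MySQL (an assumption; window functions need SQLite 3.25+). The join variant matches on team and minimum points together and applies LIMIT only to the outer query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teams (id INTEGER PRIMARY KEY, team TEXT, points INTEGER)")
# Sample rows taken from the question.
conn.executemany(
    "INSERT INTO teams VALUES (?, ?, ?)",
    [(1, "a", 100), (2, "a", 101), (3, "b", 106), (4, "c", 105), (5, "c", 102)],
)

# Variant 1: join each row to its team's minimum, on team AND points.
JOIN_QUERY = """
SELECT MIN(t.id) AS id
FROM teams AS t
JOIN (SELECT team, MIN(points) AS min_points FROM teams GROUP BY team) AS m
  ON t.team = m.team AND t.points = m.min_points
GROUP BY t.team, m.min_points
ORDER BY m.min_points
LIMIT 5
"""

# Variant 2: number the rows inside each team and keep the first one.
WINDOW_QUERY = """
SELECT id
FROM (
    SELECT id, points,
           ROW_NUMBER() OVER (PARTITION BY team ORDER BY points ASC, id ASC) AS rn
    FROM teams
) AS ranked
WHERE rn = 1
ORDER BY points
LIMIT 5
"""

join_ids = [row[0] for row in conn.execute(JOIN_QUERY)]
window_ids = [row[0] for row in conn.execute(WINDOW_QUERY)]
print(join_ids, window_ids)
```

On this data both variants return the ids in the order shown in the question's expected result.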
SO,
The problem
My question is: how do I join a table in MySQL with itself in reverse order? Suppose I have:
id name
1 First
2 Second
5 Third
6 Fourth
7 Fifth
8 Sixth
9 Seventh
13 Eight
14 Nine
15 Tenth
-and now I want to create a query, which will return joined records in reverse order:
left_id name right_id name
1 First 15 Tenth
2 Second 14 Nine
5 Third 13 Eight
6 Fourth 9 Seventh
7 Fifth 8 Sixth
8 Sixth 7 Fifth
9 Seventh 6 Fourth
13 Eight 5 Third
14 Nine 2 Second
15 Tenth 1 First
My approach
I have now this query:
SELECT
l.id AS left_id,
l.name,
(SELECT COUNT(1) FROM sequences WHERE id<=left_id) AS left_order,
r.id AS right_id,
r.name,
(SELECT COUNT(1) FROM sequences WHERE id<=right_id) AS right_order
FROM
sequences AS l
LEFT JOIN
sequences AS r ON 1
HAVING
left_order+right_order=(1+(SELECT COUNT(1) FROM sequences));
-see this fiddle for sample structure & code.
Some background
There's no use case for that. I was doing that in application before. Now it's mostly curiosity if there's a way to do that in SQL - that's why I'm seeking not just 'any solution' (like mine) - but as simple as possible solution. Source table will always be small (<10.000 records) - so performance is not a thing to care, I think.
The question
Can my query be simplified somehow? Also, it's important not to use variables. Order could be included in result (like in my fiddle) - but that's not mandatory.
The only thing I can think of to improve is:
SELECT
l.id AS left_id,
l.name ln,
(SELECT COUNT(1) FROM sequences WHERE id<=left_id) AS left_order,
r.id AS right_id,
r.name rn,
(SELECT COUNT(1) FROM sequences WHERE id>=right_id) AS right_order
FROM
sequences AS l
LEFT JOIN
sequences AS r ON 1
HAVING
left_order=right_order;
There are 2 changes that should make this a little bit faster:
1) Calculating right order in reverse order in the first place
2) avoid using SELECT COUNT in the last line.
Edit: I aliased the name columns as ln and rn because I couldn't see the columns in the fiddle.
Without the SQL standard RANK() OVER(...), you have to compute the ordering yourself as you discovered.
The RANK() of a row is simply 1 + the COUNT() of all better-ranked rows. (DENSE_RANK(), for comparison, is 1 + the COUNT() of all DISTINCT better ranks.) While RANK() can be computed as a scalar subquery in your SELECT projection — as, e.g., you have done with SELECT (SELECT COUNT(1) ...), ... — I tend to prefer joins:
SELECT lft.id AS "left_id", lft.name AS "left_name",
rgt.id AS "right_id", rgt.name AS "right_name"
FROM ( SELECT s.id, s.name, COUNT(1) AS "rank" -- Left ranking
FROM sequences s
LEFT JOIN sequences d ON s.id <= d.id
GROUP BY 1, 2) lft
INNER JOIN ( SELECT s.id, s.name, COUNT(1) AS "rank" -- Right ranking
FROM sequences s
LEFT JOIN sequences d ON s.id >= d.id
GROUP BY 1, 2) rgt
ON lft.rank = rgt.rank
ORDER BY lft.id ASC;
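The "rank is 1 + the count of better-ranked rows" idea can be sketched without any window functions or variables. The example below is an illustration under assumptions: it uses an in-memory SQLite database from Python rather than MySQL, and it puts the two counts directly in the join condition (a row with k smaller ids is paired with the row that has k larger ids). The rows are the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sequences (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO sequences VALUES (?, ?)",
    [(1, "First"), (2, "Second"), (5, "Third"), (6, "Fourth"), (7, "Fifth"),
     (8, "Sixth"), (9, "Seventh"), (13, "Eight"), (14, "Nine"), (15, "Tenth")],
)

QUERY = """
SELECT l.id, l.name, r.id, r.name
FROM sequences AS l
JOIN sequences AS r
  -- ascending rank of l minus 1 = descending rank of r minus 1
  ON (SELECT COUNT(*) FROM sequences WHERE id < l.id)
   = (SELECT COUNT(*) FROM sequences WHERE id > r.id)
ORDER BY l.id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

This still evaluates a correlated count per candidate pair, so it is only reasonable for small tables like the one described.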
SET @rank1 = 0;
SET @rank2 = 0;
SELECT *
FROM (SELECT *, @rank1 := @rank1 + 1 AS rownum FROM sequences ORDER BY id ASC) t1
INNER JOIN (SELECT *, @rank2 := @rank2 + 1 AS rownum FROM sequences ORDER BY id DESC) t2
ON t1.rownum = t2.rownum
For some reason SQL Fiddle shows only 3 columns for this; I'm not sure whether my query is bad.
I'm running contests on my website. Every contest can have multiple entries. I want to find out whether the MAX value of votes within a contest occurs more than once.
The table is as follows:
contest_id entry_id votes
1 1 50
1 2 34
1 3 50
2 4 20
2 5 55
3 6 53
I just need the query to show me that contest 1 has a duplicate MAX value without additional information.
I tried this but didn't work:
SELECT MAX(votes) from contest group by contest_id having count(votes) > 1
SELECT a.contest_ID
FROM contest a
INNER JOIN
(
SELECT contest_id, MAX(votes) totalVotes
FROM contest
GROUP BY contest_id
) b ON a.contest_ID = b.contest_ID AND
a.votes = b.totalvotes
GROUP BY a.contest_ID
HAVING COUNT(*) >= 2
SQLFiddle Demo
This finds the max votes value per contest and counts the entries with that number of votes.
It then displays contest with more than one hit.
SELECT contest_id
FROM contest
WHERE votes=(
SELECT MAX(votes) FROM contest c WHERE c.contest_id=contest.contest_id
)
GROUP BY contest_id
HAVING COUNT(*) > 1;
SQLfiddle for testing.
You could do it by first selecting the maximum number of votes for each contest ID in a subquery, and then joining against the results (demo on SQLFiddle):
SELECT contest_id, votes
FROM contest
JOIN (
SELECT contest_id, MAX(votes) AS votes
FROM contest GROUP BY contest_id
) AS foo USING (contest_id, votes)
GROUP BY contest_id
HAVING COUNT(*) > 1
The nice thing about doing it like this is that it's an independent subquery, so MySQL only needs to run it once.
Ps. Yes, this is basically identical to JW's answer, but I figured I'd leave it up anyway to show the slightly different syntax I used for the join.
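To see why the original attempt falls short and the join works, here is a small sketch using the table from the question. It runs on an in-memory SQLite database from Python as a stand-in for MySQL (an assumption); the alias `max_votes` is my own naming.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contest (contest_id INTEGER, entry_id INTEGER, votes INTEGER)")
# Sample rows taken from the question.
conn.executemany(
    "INSERT INTO contest VALUES (?, ?, ?)",
    [(1, 1, 50), (1, 2, 34), (1, 3, 50), (2, 4, 20), (2, 5, 55), (3, 6, 53)],
)

# The attempt from the question: COUNT(votes) counts every entry in the
# contest, not just the entries that share the maximum.
ATTEMPT = """
SELECT MAX(votes) FROM contest GROUP BY contest_id HAVING COUNT(votes) > 1
"""

# Join each entry to its contest's maximum, then count the matches.
DUPLICATE_MAX = """
SELECT a.contest_id
FROM contest AS a
JOIN (SELECT contest_id, MAX(votes) AS max_votes
      FROM contest GROUP BY contest_id) AS b
  ON a.contest_id = b.contest_id AND a.votes = b.max_votes
GROUP BY a.contest_id
HAVING COUNT(*) >= 2
"""

attempt_rows = sorted(conn.execute(ATTEMPT).fetchall())
duplicate_rows = conn.execute(DUPLICATE_MAX).fetchall()
print(attempt_rows)    # maxima of every contest with more than one entry
print(duplicate_rows)  # contests whose maximum occurs at least twice
```

The attempt also reports contest 2's maximum, because contest 2 has two entries even though only one of them holds the top vote count.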
I have three tables here, that I'm trying to do a tricky combined query on.
Table 1(teams) has Teams in it:
id name
------------
150 LA Lakers
151 Boston Celtics
152 NY Knicks
Table 2(scores) has scores in it:
id teamid week score
---------------------------
1 150 5 75
2 151 5 95
3 152 5 112
Table 3(tickets) has tickets in it
id teamids week
---------------------
1 150,152,154 5
2 151,154,155 5
I have two queries that I'm trying to write
Rather than trying to sum these each time I query the tickets, I've added a weekly_score field to the ticket. The idea is that any time a new score is entered for a team, I could take that team's id, get all tickets that have that team/week combo, and update them all based on the sum of their team scores.
I've tried the following to get the results I'm looking for (before I try to update them):
SELECT t.id, t.teamids, (
SELECT SUM( s1.score )
FROM scores s1
WHERE s1.teamid
IN (
t.teamids
)
AND s1.week =11
) AS score
FROM tickets t
WHERE t.week =11
AND (t.teamids LIKE "150,%" OR t.teamids LIKE "%,150")
Not only is the query slow, but it also seems to not return the sum of the scores, it just returns the first score in the list.
Any help is greatly appreciated.
If you are going to match with LIKE, you'll need to account for the column holding only one team id. You'll also need to use LIKE in your SELECT subquery, with the stored list on the left and the pattern on the right.
SELECT t.id, t.teamids, (
SELECT SUM( s1.score )
FROM scores s1
WHERE
(t.teamids = s1.teamid
OR t.teamids LIKE CONCAT(s1.teamid, ",%")
OR t.teamids LIKE CONCAT("%,", s1.teamid)
OR t.teamids LIKE CONCAT("%,", s1.teamid, ",%")
)
AND s1.week =11
) AS score
FROM tickets t
WHERE t.week =11
AND (t.teamids = "150" OR t.teamids LIKE "150,%" OR t.teamids LIKE "%,150" OR t.teamids LIKE "%,150,%")
Do you need the SUM function here? Doesn't the scores table already hold the score? And by the way, avoid subqueries; try a LEFT JOIN instead (LEFT OUTER JOIN is a synonym).
SELECT t.id, t.name, t1.score, t2.teamids
FROM teams t
LEFT JOIN scores t1 ON t.id = t1.teamid AND t1.week = 11
LEFT JOIN tickets t2 ON t2.week = 11
WHERE t2.week = 11 AND t2.teamids LIKE "%150%"
Not tested.
Well, it's not the most elegant query ever, but it should work:
SELECT
tickets.id,
tickets.teamids,
sum(score)
FROM
tickets left join scores
on concat(',', tickets.teamids, ',') like concat('%,', scores.teamid, ',%')
WHERE tickets.week = 11 and concat(',', tickets.teamids, ',') like '%,150,%'
GROUP BY tickets.id, tickets.teamids
or also this:
SELECT
tickets.id,
tickets.teamids,
sum(score)
FROM
tickets left join scores
on FIND_IN_SET(scores.teamid, tickets.teamids)>0
WHERE tickets.week = 11 and FIND_IN_SET('150', tickets.teamids)>0
GROUP BY tickets.id, tickets.teamids
(see this question and the answers for more information).
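The comma-wrapping trick from the first query can be tried out with the sketch below. It makes several assumptions: SQLite (driven from Python) stands in for MySQL, so `||` replaces CONCAT and FIND_IN_SET is not available; the join additionally matches on week, as the question's subquery did; and ticket 3 with team 1500 is a hypothetical row added only to show the false match that a bare '%150%' pattern produces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (id INTEGER PRIMARY KEY, teamid INTEGER, week INTEGER, score INTEGER);
    CREATE TABLE tickets (id INTEGER PRIMARY KEY, teamids TEXT, week INTEGER);
    -- Rows from the question.
    INSERT INTO scores VALUES (1, 150, 5, 75), (2, 151, 5, 95), (3, 152, 5, 112);
    INSERT INTO tickets VALUES (1, '150,152,154', 5), (2, '151,154,155', 5);
    -- Hypothetical extra ticket: contains 1500, not 150.
    INSERT INTO tickets VALUES (3, '1500,151', 5);
""")

# Wrapping both the list and the id in commas makes every id a whole token.
WRAPPED = """
SELECT t.id, t.teamids, SUM(s.score)
FROM tickets AS t
LEFT JOIN scores AS s
  ON s.week = t.week
 AND (',' || t.teamids || ',') LIKE ('%,' || s.teamid || ',%')
WHERE t.week = ?
  AND (',' || t.teamids || ',') LIKE ('%,' || ? || ',%')
GROUP BY t.id, t.teamids
"""

# A bare substring pattern also matches ids that merely contain "150".
NAIVE = "SELECT id FROM tickets WHERE teamids LIKE '%150%' ORDER BY id"

wrapped_rows = conn.execute(WRAPPED, (5, "150")).fetchall()
naive_ids = [row[0] for row in conn.execute(NAIVE)]
print(wrapped_rows)
print(naive_ids)
```

Team 154 has no score row in the sample, so it simply contributes nothing to the sum for ticket 1.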
Given a table (daily_sales) with say 100k rows of the following data/columns:
id rep sales date
1 a 123 12/15/2011
2 b 153 12/15/2011
3 a 11 12/14/2011
4 a 300 12/13/2011
5 a 120 12/12/2011
6 b 161 11/15/2011
7 a 3 11/14/2011
8 c 13 11/14/2011
9 c 44 11/13/2011
What would be the most efficient way to write a report (completely in SQL) showing the two most recent entries (rep, sales, date) for each name, so the output would be:
a 123 12/15/2011
a 11 12/14/2011
b 153 12/15/2011
b 161 11/15/2011
c 13 11/14/2011
c 44 11/13/2011
Thanks!
FYI, your example uses mostly reserved words, which makes it horrid for us to program against. If you've got the real table columns, give those to us. This is PostgreSQL:
select name,value, max(date)
from the_table_name_you_neglect_to_give_us
group by 1,2
That'll give you a list of first name,value,max(date)...though I gotta ask why give us a column called value if it doesn't change in the example?
Let's say you do have an id column...we'll be consistent with your scheme and call it 'ID'...
select b.id from
(select name,value, max(date) date
from the_table_name_you_neglect_to_give_us
group by 1,2) a
inner join the_table_name_you_neglect_to_give_us b on a.name=b.name and a.value=b.value and a.date = b.date
This gives a list of all ID's that are the max...put it together:
select name,value, max(date)
from the_table_name_you_neglect_to_give_us
group by 1,2
union all
select name,value, max(date)
from the_table_name_you_neglect_to_give_us
where id not in
(select b.id from
(select name,value, max(date) date
from the_table_name_you_neglect_to_give_us
group by 1,2) a
inner join the_table_name_you_neglect_to_give_us b on a.name=b.name and a.value=b.value and a.date = b.date)
Hoping my syntax is right...should be close at any rate. I'd put a bracket around that entire thing then select * from (above query) order by name...gives you the order you want.
For MySQL, as explained in @Quassnoi's blog, create an index on (name, date) and use this:
SELECT t.*
FROM (
SELECT name,
COALESCE(
(
SELECT date
FROM tableX ti
WHERE ti.name = dto.name
ORDER BY
ti.name, ti.date DESC
LIMIT 1
OFFSET 1 -- this is set to 2-1
), CAST('1000-01-01' AS DATE)) AS mdate
FROM (
SELECT DISTINCT name
FROM tableX dt
) dto
) tg
, tableX t
WHERE t.name >= tg.name
AND t.name <= tg.name
AND t.date >= tg.mdate
If I understand what you mean.. Then this MIGHT be helpful:
SELECT main.name, main.value, main.date
FROM tablename AS main
LEFT OUTER JOIN tablename AS ctr
ON main.name = ctr.name
AND main.date <= ctr.date
GROUP BY main.name, main.date
HAVING COUNT(*) <= 2
ORDER BY main.name ASC, main.date DESC
I know the SQL is shorter than the other posts, but just give it a try first..
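The self-join counting idea can be checked against the question's data with the sketch below. Assumptions: an in-memory SQLite database from Python stands in for the real server, the table uses the question's column names (rep, sales, date), and the dates are stored as ISO strings so that they compare correctly as text. Each row is joined to the rows of the same rep that are at least as recent, and a row is kept when that count is 2 or less.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (id INTEGER PRIMARY KEY, rep TEXT, sales INTEGER, date TEXT)")
# Rows from the question, with dates rewritten as YYYY-MM-DD.
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?, ?, ?)",
    [(1, "a", 123, "2011-12-15"), (2, "b", 153, "2011-12-15"),
     (3, "a", 11, "2011-12-14"), (4, "a", 300, "2011-12-13"),
     (5, "a", 120, "2011-12-12"), (6, "b", 161, "2011-11-15"),
     (7, "a", 3, "2011-11-14"), (8, "c", 13, "2011-11-14"),
     (9, "c", 44, "2011-11-13")],
)

QUERY = """
SELECT cur.rep, cur.sales, cur.date
FROM daily_sales AS cur
JOIN daily_sales AS newer
  ON newer.rep = cur.rep AND newer.date >= cur.date
GROUP BY cur.id, cur.rep, cur.sales, cur.date
HAVING COUNT(*) <= 2          -- the row itself plus at most one newer row
ORDER BY cur.rep ASC, cur.date DESC
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

If a rep can have several entries on the same date, the count no longer identifies exactly two rows, so a tiebreaker such as the id would be needed in the join condition.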