The problem
My question is: how can I join a MySQL table with itself in reverse order? Suppose I have:
id name
1 First
2 Second
5 Third
6 Fourth
7 Fifth
8 Sixth
9 Seventh
13 Eight
14 Nine
15 Tenth
Now I want to create a query that returns the records joined in reverse order:
left_id name right_id name
1 First 15 Tenth
2 Second 14 Nine
5 Third 13 Eight
6 Fourth 9 Seventh
7 Fifth 8 Sixth
8 Sixth 7 Fifth
9 Seventh 6 Fourth
13 Eight 5 Third
14 Nine 2 Second
15 Tenth 1 First
My approach
I currently have this query:
SELECT
l.id AS left_id,
l.name,
(SELECT COUNT(1) FROM sequences WHERE id<=left_id) AS left_order,
r.id AS right_id,
r.name,
(SELECT COUNT(1) FROM sequences WHERE id<=right_id) AS right_order
FROM
sequences AS l
LEFT JOIN
sequences AS r ON 1
HAVING
left_order+right_order=(1+(SELECT COUNT(1) FROM sequences));
See this fiddle for the sample structure and code.
Some background
There's no real use case for this. I used to do it in the application. Now it's mostly curiosity about whether there's a way to do it in SQL, which is why I'm seeking not just any solution (like mine) but the simplest possible one. The source table will always be small (<10,000 records), so I don't think performance is a concern.
The question
Can my query be simplified somehow? Also, it's important not to use variables. Order could be included in result (like in my fiddle) - but that's not mandatory.
The only thing I can think of to improve is:
SELECT
l.id AS left_id,
l.name ln,
(SELECT COUNT(1) FROM sequences WHERE id<=left_id) AS left_order,
r.id AS right_id,
r.name rn,
(SELECT COUNT(1) FROM sequences WHERE id>=right_id) AS right_order
FROM
sequences AS l
LEFT JOIN
sequences AS r ON 1
HAVING
left_order=right_order;
There are 2 changes that should make this a little faster:
1) Computing the right-hand order in reverse (counting rows with id>=right_id) in the first place.
2) Avoiding the extra SELECT COUNT in the HAVING clause.
Edit: I aliased the name columns as ln and rn because I couldn't see both columns in the fiddle.
Without the SQL standard RANK() OVER(...), you have to compute the ordering yourself as you discovered.
The RANK() of a row is simply 1 + the COUNT() of all better-ranked rows. (DENSE_RANK(), for comparison, is 1 + the COUNT() of all DISTINCT better ranks.) While RANK() can be computed as a scalar subquery in your SELECT projection — as, e.g., you have done with SELECT (SELECT COUNT(1) ...), ... — I tend to prefer joins:
SELECT lft.id AS "left_id", lft.name AS "left_name",
rgt.id AS "right_id", rgt.name AS "right_name"
FROM ( SELECT s.id, s.name, COUNT(1) AS "rank" -- Left ranking
FROM sequences s
LEFT JOIN sequences d ON s.id <= d.id
GROUP BY 1, 2) lft
INNER JOIN ( SELECT s.id, s.name, COUNT(1) AS "rank" -- Right ranking
FROM sequences s
LEFT JOIN sequences d ON s.id >= d.id
GROUP BY 1, 2) rgt
ON lft.rank = rgt.rank
ORDER BY lft.id ASC;
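As a rough check of the counting idea above, here is a minimal sketch that runs a count-based ranking join against the sample data. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL, and the names asc_rank, desc_rank and pairs are illustrative assumptions, not part of the original query.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sequences (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO sequences (id, name) VALUES (?, ?)",
    [(1, "First"), (2, "Second"), (5, "Third"), (6, "Fourth"), (7, "Fifth"),
     (8, "Sixth"), (9, "Seventh"), (13, "Eight"), (14, "Nine"), (15, "Tenth")],
)

# asc_rank  = number of rows whose id is <= this row's id (1 for the smallest id)
# desc_rank = number of rows whose id is >= this row's id (1 for the largest id)
query = """
SELECT lft.id, lft.name, rgt.id, rgt.name
FROM (SELECT s.id, s.name, COUNT(*) AS asc_rank
      FROM sequences s
      JOIN sequences d ON d.id <= s.id
      GROUP BY s.id, s.name) AS lft
JOIN (SELECT s.id, s.name, COUNT(*) AS desc_rank
      FROM sequences s
      JOIN sequences d ON d.id >= s.id
      GROUP BY s.id, s.name) AS rgt
  ON lft.asc_rank = rgt.desc_rank
ORDER BY lft.id
"""
pairs = conn.execute(query).fetchall()
for row in pairs:
    print(row)
```

Each row pairs the k-th smallest id with the k-th largest one, which is the reversed join the question asks for. Because every row is compared with every other row, this approach is quadratic, which should be acceptable for the small table described.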
SET @rank1=0;
SET @rank2=0;
SELECT *
FROM (SELECT *, @rank1 := @rank1 + 1 AS row_number FROM sequences ORDER BY ID ASC) t1
INNER JOIN (SELECT *, @rank2 := @rank2 + 1 AS row_number FROM sequences ORDER BY ID DESC) t2
on t1.row_number = t2.row_number;
For some reason SQL Fiddle shows only 3 columns for this; I'm not sure whether my query is at fault.
Related
In the following, I am querying the same table 2 times. The second query is a nested query inside left join but queries the same table. The only difference is the addition of the aggregation function count, the result of which is used by the outer query. Is there a better way to approach this?
select sm.student_id, sm.marks, smarks.d as d_marks from student_marks as sm
left join(
select m.student_id, count(distinct m.marks) as d from student_marks as m group by m.student_id
) as smarks on smarks.student_id = sm.student_id;
Is it possible to do this in a single query without using a left join?
Yes, there is an alternative approach using window functions. There's no way of doing COUNT(DISTINCT ...) in a window function, but you can get the same result by using DENSE_RANK() twice, once sorting the column you want a distinct count of ascending and once descending, then adding the two ranks together and subtracting one:
SELECT sm.student_id,
sm.marks,
DENSE_RANK() OVER(PARTITION BY sm.student_id ORDER BY sm.marks DESC) +
DENSE_RANK() OVER(PARTITION BY sm.student_id ORDER BY sm.marks ASC) - 1 AS d_marks
FROM student_marks AS sm
N.B. this is not guaranteed to perform any better just because you are referencing a table one fewer times.
To explain the DENSE_RANK() trick, consider a simple data set:
marks  dense_rank ASC  dense_rank DESC
1      1               3
1      1               3
2      2               2
3      3               1
The two ranks added together will always be one more than the number of distinct values in the set (i.e. 1+3, 2+2, and 3+1 all equal 4), so we just need to take one off the result. This gives us the distinct count of items in the set without actually using COUNT(DISTINCT ...), which isn't allowed in a window function (as noted in the restrictions).
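To make the arithmetic concrete, the sketch below compares the two-DENSE_RANK() expression with a plain COUNT(DISTINCT ...) per student. It runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), which is an assumption made only so the example is runnable; the sample rows and the names via_ranks and via_count are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student_marks (student_id INTEGER, marks INTEGER)")
conn.executemany(
    "INSERT INTO student_marks VALUES (?, ?)",
    [(1, 1), (1, 1), (1, 2), (1, 3),  # student 1: three distinct marks
     (2, 5), (2, 5)],                 # student 2: one distinct mark
)

# Distinct count derived from the ascending and descending dense ranks.
via_ranks = conn.execute("""
SELECT DISTINCT student_id, d_marks
FROM (SELECT sm.student_id,
             DENSE_RANK() OVER (PARTITION BY sm.student_id ORDER BY sm.marks DESC)
           + DENSE_RANK() OVER (PARTITION BY sm.student_id ORDER BY sm.marks ASC)
           - 1 AS d_marks
      FROM student_marks AS sm) AS t
ORDER BY student_id
""").fetchall()

# Reference result using an ordinary grouped COUNT(DISTINCT ...).
via_count = conn.execute("""
SELECT student_id, COUNT(DISTINCT marks)
FROM student_marks
GROUP BY student_id
ORDER BY student_id
""").fetchall()

print(via_ranks, via_count)
```

Both queries should agree as long as marks contains no NULLs.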
ADDENDUM
If marks is nullable (which I had assumed it would not be) and you don't want NULL rows included in the count, then, as noted in the comments, this wouldn't quite work as it is. DENSE_RANK() treats NULL as one more distinct value, so you'd need to subtract one from the total whenever a NULL is present, which can be done by appending:
- MAX(CASE WHEN sm.marks IS NULL THEN 1 ELSE 0 END) OVER(PARTITION BY sm.student_id)
I'm currently learning MySQL and am working on a query that displays the top 5 and bottom 5 categories and groups by joining 2 tables. What I have meets the requirements but I want to display it more cleanly. I've got this to display by using a union but was wondering if I could show the results as four columns instead for a cleaner look. 2 columns related to the top 5 and 2 related to the bottom five categories determined by the number of groups in each category.
Current query:
SELECT*
FROM(SELECT
category_name,
count(category_name) AS NumOfGroups
From
category c
JOIN
grp g ON c.category_id=g.category_id
GROUP BY category_name
order by NumOfGroups desc
LIMIT 5) most
UNION
SELECT *
FROM (SELECT
category_name,
count(category_name) AS NumOfGroups
From
category c
JOIN
grp g ON c.category_id=g.category_id
GROUP BY category_name
ORDER BY NumOfGroups ASC
LIMIT 5) Least;
This displays:
category NumOfGroups
Tech 911
Food & Drink 790
Photography 320
Outdoors & Adventure 218
Games 166
Singles 4
Fitness 15
Paranormal 16
Fashion & Beauty 26
Movements & Politics 32
Can I take this one step further to display a result like below?
Would I have to transpose?
Desired result:
category NumOfGroups category NumOfGroups
Tech 911 Singles 4
Food & Drink 790 Fitness 15
Photography 320 Paranormal 16
Outdoors & Adventure 218 Fashion & Beauty 26
Games 166 Movements & Politics 32
Create a CTE where you use ROW_NUMBER() window function twice to rank the rows based on the value of NumOfGroups and then do a self join:
WITH cte AS (
SELECT c.category_name, COUNT(*) NumOfGroups,
ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC) rn_most,
ROW_NUMBER() OVER (ORDER BY COUNT(*)) rn_least
FROM category c JOIN grp g
ON c.category_id = g.category_id
GROUP BY c.category_name
)
SELECT c1.category_name category_most, c1.NumOfGroups NumOfGroups_most,
c2.category_name category_least, c2.NumOfGroups NumOfGroups_least
FROM cte c1 INNER JOIN cte c2
ON c2.rn_least = c1.rn_most
WHERE c1.rn_most <= 5
ORDER BY c2.rn_least
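The same idea can be tried on a tiny data set. The following is only a sketch: it runs on SQLite through Python's sqlite3 module rather than MySQL, the group counts are made up, top_n is a hypothetical parameter, and the counting and numbering are split into two CTEs (counts and ranked) purely to keep each step simple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE grp (group_id INTEGER PRIMARY KEY, category_id INTEGER);
INSERT INTO category VALUES
  (1, 'Tech'), (2, 'Food & Drink'), (3, 'Games'), (4, 'Singles');
""")
# Hypothetical group counts: Tech 4, Food & Drink 3, Games 2, Singles 1.
conn.executemany(
    "INSERT INTO grp (category_id) VALUES (?)",
    [(1,)] * 4 + [(2,)] * 3 + [(3,)] * 2 + [(4,)],
)

top_n = 2  # the question uses 5; 2 keeps the sample small
query = """
WITH counts AS (
  SELECT c.category_name, COUNT(*) AS NumOfGroups
  FROM category c JOIN grp g ON c.category_id = g.category_id
  GROUP BY c.category_name
),
ranked AS (
  SELECT category_name, NumOfGroups,
         ROW_NUMBER() OVER (ORDER BY NumOfGroups DESC) AS rn_most,
         ROW_NUMBER() OVER (ORDER BY NumOfGroups ASC) AS rn_least
  FROM counts
)
SELECT c1.category_name, c1.NumOfGroups, c2.category_name, c2.NumOfGroups
FROM ranked c1 JOIN ranked c2 ON c2.rn_least = c1.rn_most
WHERE c1.rn_most <= ?
ORDER BY c1.rn_most
"""
rows = conn.execute(query, (top_n,)).fetchall()
for row in rows:
    print(row)
```

With tied counts the row numbers are assigned in an unspecified order, so a tie-breaking column in each ORDER BY would be needed for repeatable results.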
IMO, this is best done at the application level rather than in your database queries. Using each tool as it's designed results in cleaner solutions. However, if you really need to do this in MySQL, you can generate row numbers in each of your subqueries and join them to make a unified result.
set @row:=0;
set @row2:=0;
SELECT most.category_name,most.numberOfGroups,least.category_name,least.numberOfGroups
FROM (
SELECT *,@row := @row + 1 as rownum
FROM (
SELECT
category_name,
count(*) numberOfGroups
FROM category c
JOIN grp g ON c.category_id=g.category_id
GROUP by category_name
ORDER BY numberOfGroups DESC
LIMIT 5
) temp
) most
LEFT JOIN (
SELECT *,@row2 := @row2 + 1 as rownum
FROM (
SELECT
category_name,
count(*) numberOfGroups
FROM category c
JOIN grp g ON c.category_id=g.category_id
GROUP by category_name
ORDER BY numberOfGroups ASC
LIMIT 5
) temp
) least
ON most.rownum=least.rownum;
There's still a caveat: because this is a LEFT JOIN from "most" to "least", the "most" subquery must always return at least as many rows as "least", or the extra "least" rows will be clipped. As long as both always return 5 rows though (as appears very likely in your case), you'll be safe.
I've been trying to join two tables while showing only a limited number (2) of results from the joined table. Unfortunately I haven't been able to obtain the correct results. These are my tables:
Destinations
id name
------------
1 Bahamas
2 Caribbean
3 Barbados
Sailings
id name destination
---------------------------------
1 Adventure 1
2 For Kids 2
3 All Inclusive 3
4 Seniors 1
5 Singles 2
6 Disney 1
7 Adults 2
This is the query I've tried:
SELECT
d.name as Destination,
s.name as Sailing
FROM destinations d
JOIN sailings s
ON s.destination = d.id
LIMIT 2
But this gives me only 2 rows in total due to the limit:
Destination Sailing
-------------------------
Bahamas Adventure
Caribbean For Kids
SAMPLE: SQL FIDDLE
I would like LIMIT 2 to be applied only to the joined table sailings
Expected Results:
Destination Sailing
-------------------------
Bahamas Adventure
Bahamas Seniors
Caribbean Singles
Caribbean For Kids
Can someone please point me in the right direction?
Try:
select d.name as destination,tmp.name as sailing from (
SELECT
id,
name,
destination
FROM
(
SELECT
id,
name,
destination,
@rn := IF(@p = destination, @rn + 1, 1) AS rn,
@p := destination
FROM sailings
JOIN (SELECT @p := NULL, @rn := 0) AS vars
ORDER BY destination
) AS T1
WHERE rn <= 2
)tmp
JOIN (SELECT * FROM destinations limit 0,2) d
ON(tmp.destination=d.id)
I made two derived tables and joined them.
Your problem is that you want to take the two highest (or lowest) members of a group, for each group in the table. In this case, you want the first two sailings for each destination group.
The canonical way to handle this query in a database that supports analytic functions is to use ROW_NUMBER(). MySQL only gained window functions in version 8.0, so on older versions we can simulate it using session variables:
SET @row_number = 0;
SET @destination = NULL;
SELECT
t.Destination,
t.Sailing
FROM
(
SELECT
@row_number:=CASE WHEN @destination = Destination
THEN @row_number + 1 ELSE 1 END AS rn,
@destination:=Destination AS Destination,
Sailing,
id
FROM
(
SELECT s.id AS id, d.name AS Destination, s.name AS Sailing
FROM destinations d
INNER JOIN sailings s
ON s.destination = d.id
) t
ORDER BY
Destination,
id
) t
WHERE t.rn <= 2
ORDER BY
t.Destination,
t.rn;
Note that Barbados appears as a single row, because in your sample data it only has one sailing. If you also want to restrict the result to destinations having two or more sailings, this can also be done.
Demo here: Rextester
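For comparison, where window functions are available (MySQL 8.0 and later), the per-group row number can be written directly. The sketch below assumes SQLite 3.25+ through Python's sqlite3 module as a stand-in engine and loads the sample tables from the question; the alias ranked and the variable top_two are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE destinations (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sailings (id INTEGER PRIMARY KEY, name TEXT, destination INTEGER);
INSERT INTO destinations VALUES (1, 'Bahamas'), (2, 'Caribbean'), (3, 'Barbados');
INSERT INTO sailings VALUES
  (1, 'Adventure', 1), (2, 'For Kids', 2), (3, 'All Inclusive', 3),
  (4, 'Seniors', 1), (5, 'Singles', 2), (6, 'Disney', 1), (7, 'Adults', 2);
""")

# Number the sailings within each destination by sailing id, then keep the first two.
query = """
SELECT Destination, Sailing
FROM (SELECT d.name AS Destination, s.name AS Sailing,
             ROW_NUMBER() OVER (PARTITION BY d.id ORDER BY s.id) AS rn
      FROM destinations d
      JOIN sailings s ON s.destination = d.id) AS ranked
WHERE rn <= 2
ORDER BY Destination, rn
"""
top_two = conn.execute(query).fetchall()
for row in top_two:
    print(row)
```

Barbados still shows up with its single sailing, matching the note above about the session-variable version.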
Can you try this?
SELECT
d.name as Destination,
s.name as Sailing
FROM sailings s
JOIN (SELECT * from destinations LIMIT 2) d
ON s.destination = d.id
(You say you want to limit the sailings table, but I think you might want the limit on the destinations table, based on your expected output; you can adjust as necessary)
I'm about to throw in the towel with this.
Preface: I want to make this work with any N, but for the sake of simplicity, I'll set N to be 3.
I've got a query (MySQL, specifically) that needs to pull in data from a table and sort based on top 3 values from that table and after that fallback to other sort criteria.
So basically I've got something like this:
SELECT maintable.id
FROM
tbl1 AS maintable
LEFT JOIN
tbl2 AS othertable
ON
maintable.id = othertable.id
ORDER BY
othertable.timestamp DESC,
maintable.timestamp DESC
Which is all basic textbook stuff. But the issue is I need the first ORDER BY clause to only get the three biggest values in othertable.timestamp and then fallback on maintable.timestamp.
Also, doing a LIMIT 3 subquery to othertable and join it is a no go as this needs to work with an arbitrary number of WHERE conditions applied to maintable.
I was almost able to make it work with a user variable based approach like this, but it fails since it doesn't take into account ordering, so it'll take the FIRST three othertable values it finds:
ORDER BY
(
IF(othertable.timestamp IS NULL, 0,
IF(
(@rank:=@rank+1) > 3, null, othertable.timestamp
)
)
) DESC
(with a @rank:=0 preceding the statement)
So... any tips on this? I'm losing my mind with the problem. Another parameter I have for this is that since I'm only altering an existing (vastly complicated) query, I can't do a wrapping outer query. Also, as noted, I'm on MySQL so any solutions using the ROW_NUMBER function are unfortunately out of reach.
Thanks to all in advance.
EDIT. Here's some sample data with timestamps dumbed down to simpler integers to illustrate what I need:
maintable
id timestamp
1 100
2 200
3 300
4 400
5 500
6 600
othertable
id timestamp
4 250
5 350
3 550
1 700
=>
1
3
5
6
4
2
And if for whatever reason we add WHERE NOT maintable.id = 5 to the query, here's what we should get:
1
3
4
6
2
...because now 4 is among the top 3 values in othertable referring to this set.
So as you see, the row with id 4 from othertable is not included in the ordering as it's the fourth in descending order of timestamp values, thus it falls back into getting ordered by the basic timestamp.
The real-world need for this: I've got content in "maintable", and "othertable" is basically a marker for featured content with a timestamp of "featured date". I've got a view where I'm supposed to float the last 3 featured items to the top and have the rest of the list be a reverse chronological list.
Maybe something like this.
SELECT
id
FROM
(SELECT
maintable.id,
CASE WHEN othertable.timestamp IS NULL THEN
0
ELSE
@i := @i + 1
END AS num,
othertable.timestamp as othertimestamp,
maintable.timestamp as maintimestamp
FROM
tbl1 AS maintable
CROSS JOIN (select @i := 0) i
LEFT JOIN tbl2 AS othertable
ON maintable.id = othertable.id
ORDER BY
othertable.timestamp DESC) t
ORDER BY
CASE WHEN num > 0 AND num <= 3 THEN
othertimestamp
ELSE
maintimestamp
END DESC
Modified answer:
select ilv.* from
(select sq.*, @i:=@i+1 rn from
(select @i := 0) i
CROSS JOIN
(select m.*, o.id o_id, o.timestamp o_t
from maintable m
left join othertable o
on m.id = o.id
where 1=1
order by o.timestamp desc) sq
) ilv
order by case when o_t is not null and rn <=3 then rn else 4 end,
timestamp desc
SQLFiddle here.
Amend where 1=1 condition inside subquery sq to match required complex selection conditions, and add appropriate limit criteria after the final order by for paging requirements.
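The ordering rule can be checked against the sample data from the question with a small sketch. It is written for SQLite 3.25+ via Python's sqlite3 module and uses ROW_NUMBER() in place of the user variable, so it is an illustration of the ordering logic rather than a drop-in for older MySQL; ordered_ids, where and n are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE maintable (id INTEGER PRIMARY KEY, timestamp INTEGER);
CREATE TABLE othertable (id INTEGER PRIMARY KEY, timestamp INTEGER);
INSERT INTO maintable VALUES (1,100),(2,200),(3,300),(4,400),(5,500),(6,600);
INSERT INTO othertable VALUES (4,250),(5,350),(3,550),(1,700);
""")

def ordered_ids(where="1=1", n=3):
    """Return ids with the n most recently featured rows first.

    `where` is spliced into the SQL only to mimic the arbitrary filters
    from the question; never do this with untrusted input.
    """
    query = f"""
    SELECT id
    FROM (SELECT m.id, m.timestamp AS m_ts, o.timestamp AS o_ts,
                 -- featured rows are numbered first, newest featured date first
                 ROW_NUMBER() OVER (ORDER BY o.timestamp IS NULL,
                                             o.timestamp DESC) AS rn
          FROM maintable m
          LEFT JOIN othertable o ON o.id = m.id
          WHERE {where}) AS ranked
    ORDER BY CASE WHEN o_ts IS NOT NULL AND rn <= ? THEN rn ELSE ? END,
             m_ts DESC
    """
    return [row[0] for row in conn.execute(query, (n, n + 1))]

print(ordered_ids())
print(ordered_ids("m.id <> 5"))
```

Because the row numbers are computed after the WHERE filter, excluding id 5 lets id 4 move up into the featured top three, as the question requires.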
Can you use a union query as below?
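(SELECT id,timestamp,1 AS isFeatured FROM tbl2 ORDER BY timestamp DESC LIMIT 3)
UNION ALL
-- the extra derived table (top3) is needed because MySQL rejects LIMIT directly inside an IN subquery
(SELECT id,timestamp,2 AS isFeatured FROM tbl1 WHERE id NOT IN (SELECT id FROM (SELECT id FROM tbl2 ORDER BY timestamp DESC LIMIT 3) AS top3))
ORDER BY isFeatured,timestamp DESC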
This might be somewhat redundant, but it is semantically closer to the question you are asking. This would also allow you to parameterize the number of featured results you want to return.
I'm currently working on an assignment which requires me to find the average number of resources per module. The current table looks like this:
ResourceID ModulID
1 1
2 7
3 2
4 4
5 1
6 1
So basically, I'm trying to figure out how to get the average number of resources. The only relevant test data here is for module 1, which has 3 different resources connected to it. But I need to display all of the results.
This is my code:
select avg(a.ress) GjSnitt, modulID
from
(select count(ressursID) as ress
from ressursertiloppgave
group by modulID) as a, ressursertiloppgave r
group by modulID;
Obviously it isn't working, but I'm currently at a loss as to what to change. I would really appreciate any input you have.
This is the query you are executing, written in a slightly less obtuse syntax.
SELECT
avg(a.ress) as GjSnitt
, modulID
FROM
(SELECT COUNT(ressursID) as ress
FROM ressursertiloppgave
GROUP BY modulID) as a
CROSS JOIN ressursertiloppgave r -- cross joins are very rarely what you want
GROUP BY modulID;
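You are cross joining the derived table (4 rows, one per module) with the base table (6 rows), making (4 x 6 =) 24 rows in total and condensing this down to 4. Because every module is paired with the counts of all modules, each group ends up with the same overall average, so the outcome is wrong.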
This is why you should never use implicit joins.
Rewrite the query to:
SELECT AVG(a.rcount) FROM
(select count(*) as rcount
FROM ressursertiloppgave r
GROUP BY r.ModulID) a
If you want the individual rowcount and the average at the bottom do:
SELECT r1.ModulID, count(*) as rcount
FROM ressursertiloppgave r1
GROUP BY r1.ModulID
UNION ALL
SELECT 'avg = ', AVG(a.rcount) FROM
(select count(*) as rcount
FROM ressursertiloppgave r2
GROUP BY r2.ModulID) a
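A quick way to sanity-check the numbers is to run the grouped count and the average of those counts on the sample rows. This sketch uses SQLite through Python's sqlite3 module as an assumed stand-in for MySQL, with the table and column names taken from the question; per_module and average are illustrative variable names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ressursertiloppgave (ressursID INTEGER PRIMARY KEY, modulID INTEGER)"
)
conn.executemany(
    "INSERT INTO ressursertiloppgave VALUES (?, ?)",
    [(1, 1), (2, 7), (3, 2), (4, 4), (5, 1), (6, 1)],
)

# Number of resources per module.
per_module = conn.execute("""
SELECT modulID, COUNT(*) AS rcount
FROM ressursertiloppgave
GROUP BY modulID
ORDER BY modulID
""").fetchall()

# Average of those per-module counts: aggregate once, then average the result.
average = conn.execute("""
SELECT AVG(a.rcount)
FROM (SELECT COUNT(*) AS rcount
      FROM ressursertiloppgave
      GROUP BY modulID) AS a
""").fetchone()[0]

print(per_module, average)
```

Six resources spread over four modules give an average of 1.5 resources per module.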
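I found the solution:
SELECT AVG(counter)
FROM
(
SELECT COUNT(<column to count>) AS counter
FROM <table>
GROUP BY <grouping column>
) AS counter
Note that the alias counter is given twice: once to the COUNT column inside the inner SELECT, and once to the derived table at the end of the inner SELECT. The GROUP BY is what produces one count per group for AVG to work on.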