Greatest n-per-group With Multiple Joins

Greatest n-per-group With Multiple Joins - mysql

Evening,
I am trying to get an output of rows that are limited to n per group in MySQL. I can get it to work without joins, but with it I am just shy. I've pasted a dump of the relevant tables here:
http://pastebin.com/6F0v1jhZ
The query I am using is:
SELECT
title, catRef, RowNum, pCat, tog
FROM
(
SELECT
title, catRef,
#num := IF(#prevCat=catRef,#num+1,1) AS RowNum,
#prevCat AS tog,
#prevCat := catRef AS pCat
FROM (select #prevCat:=null) AS initvars
CROSS JOIN
(
SELECT p.title, oi.catRef
FROM resources p
INNER JOIN placesRel v ON (p.resId = v.refId)
INNER JOIN catRel oi ON (p.resId = oi.refId)
WHERE p.status = 'live' AND v.type = 'res' AND oi.type = 'res'
) AS T
) AS U
WHERE RowNum <= 5
ORDER BY catRef
I just can't get the row count to go up. Or any other solution would be greatly appreciated.
I'm looking for a result like this:
title catRef RowNum
Title1 1 1
Title2 1 2
Title3 1 3
Title4 2 1
Title5 2 2
Title6 3 1
At the moment, the RowNum column is always 1.

This works:
SET #num := 1, #prevCat := 0;
SELECT title, start, end, type, description, linkOut, outType, catRef, row_number
FROM (
SELECT title, start, end, type, description, linkOut, outType, catRef,
#num := if(#prevCat = catRef, #num + 1, 1) as row_number,
#prevCat AS tog,
#prevCat := catRef AS dummy
FROM (
SELECT title, start, end, resources.type, description, linkOut, outType, catRef
FROM resources LEFT JOIN placesRel ON placesRel.refId = resId LEFT JOIN catRel ON catRel.refId = resId
WHERE status = 'live' AND placesRel.type = 'res' AND catRel.type = 'res'
ORDER BY catRef
) AS w
) AS x WHERE x.row_number <= 4;
You need to put your joined query in a sub-query and order it by the column you want to group by. Use it's parent query to add row numbers. Then, the top-level query glues it all together.
If you don't put your joined query in it's own sub-query, the results won't be ordered as you wish, but instead will come out in the order they are in the database. This means the data is not grouped, so row numbers will no be applied to ordered rows.

Related

MySQL SUM TOP 2 RECORDS FOR EACH CATEGORY

I have a table for students scores, am trying to sum top 2 marks for all student for a particular category.I have search for similar post but have not gotten correct answer
I have tried summing the marks but am only getting result for two students instead of all students and it does not give me correct value.
SELECT SUM(marks) as totalmarks,stdid
FROM (( select marks,stdid
from finalresult
where `subjectcategory` = 1
AND `classId`='3' AND `year`='2018'
AND `term`='2' AND `type`='23'
order by marks desc
LIMIT 2 ))t1
GROUP BY stdid

An auxiliary subquery might be used for iteration
SELECT
stdid, marks
FROM
(
SELECT stdid, marks,
#rn := IF(#iter = stdid, #rn + 1, 1) AS rn,
#iter := stdid
FROM finalresult
JOIN (SELECT #iter := NULL, #rn := 0) AS q_iter
WHERE `subjectcategory` = 1
AND `classId`='3'
AND `year`='2018'
AND `term`='2'
AND `type`='23'
ORDER BY stdid, marks DESC
) AS T1
WHERE rn <= 2
this solution ignores ties and takes only two for each Student ID.
Demo

In MySQL 8+, you would do:
SELECT stdid, SUM(marks) as totalmarks
FROM (SELECT fr.*,
ROW_NUMBER() OVER (PARTITION BY stdid ORDER BY marks DESC) as seqnm
FROM finalresult fr
WHERE subjectcategory = 1 AND
classId = 3 AND
year = 2018 AND
term = 2 AND
type = 23
) fr
WHERE seqnum <= 2
GROUP BY stdid;
Note that I removed the single quotes. Things that look like numbers probably are. And you should not mix type -- put the quotes back if the values really are stored as strings.
In earlier versions, probably the simplest method is to use variables, but you have to be very careful about them. MySQL does not guarantee the order of evaluation of variables in SELECT, so you cannot assign a variable in one expression and use it in another.
A complicated expression solves this. Also, it is best to sort in a subquery (the latest versions of MySQL 5+ require this):
SELECT stdid, SUM(marks) as totalmarks
FROM (SELECT fr.*,
(#rn := IF(#s = stdid, #rn + 1,
IF(#s := stdid, 1, 1)
)
) as seqnum
FROM (SELECT fr.*
FROM finalresult fr
WHERE subjectcategory = 1 AND
classId = 3 AND
year = 2018 AND
term = 2 AND
type = 23
ORDER BY stdid, marks DESC
) fr CROSS JOIN
(SELECT #s = '', #rn := 0) params
WHERE seqnum <= 2
GROUP BY stdid;

How to rank mysql with groupby

Guy I am trying to ranking some data from my database, and I notice that it's going very wrong when I put the group by clause;
SET #rank=0;
SELECT #rank:=#rank+1 AS RankSemGenero
,a.nome AS Artista
,f.nome AS Musica
,SUM(rnk.total) AS Tocadas
,rnk.mes AS Mes
,rnk.dia AS Dia
,current_timestamp() AS Criado_Em_Sem_Genero
,23 AS RankComGenero
,current_timestamp() AS Criado_Em_Com_Genero
/*,CASE rnk.categoria
WHEN 1 then 'AM'
WHEN 2 then 'FM'
WHEN 3 then 'Web'
WHEN 4 then 'Comunitaria'
END AS Categoria_Radio*/
,'Todas' AS TipoEmissora
,5 AS Relevancia_Emissora
,'Nacional' AS Local
,5 AS Relevancia_Local
,1 AS fl_ativo
FROM rnk201901 rnk
LEFT JOIN artistas a ON rnk.artista = a.id
LEFT JOIN fonogramas f ON rnk.fonograma = f.id
WHERE rnk.dia = 10
-- AND rnk.fonograma = 35876
-- GROUP BY rnk.fonograma
ORDER BY rnk.total DESC;
This code above bringing the information on the right way 1 until ....
But if I change the GROUP BY line, I am receiving something like: 1700 instead of 1.
GROUP BY rnk.fonograma
Any idea how to handle this group by counting 1 by 1?
Thanks!!

You need to use a subquery, when using variables with group by:
select (#rank := #rank + 1) as rank, t.*
from (<your aggregation query here with order by>) t cross join
(select #rank := 0) params;

Ordering records by rand() in sql that has multiple union of 2 tables

I am making multiple unions on the same tables
however i need to order the records of the second table by rand()
keeping in mind that I DO NOT want to have duplicate records since Iam using order by rand()
Example:
news table has the following data: (test1,test2,test3)
ads table has the following data: (ads1,ads2,ads3)
The result should be like this:
news are sorted by id
ads are sorted by rand() : which means ads2 may comes in the top of the list, and maybe ads1 comes in the top of the list and so on..
This is my sql statement:
(select news.title
from news
order by news.id desc limit 6) union
(select
advertisements.title
from advertisements
order rand() limit 1,1)
union
(select
news.title,
from news
order by news.id desc limit 6,6)
union
(select
advertisements.title
from advertisements
order by rand() limit 2,1)

Near as I can tell, you seem to want 6 news articles followed by an advertising one, and then repeated again. This is not what your query does, but I'm guessing that is the intention in using union.
I would suggest enumerating the values and then doing the sort outside:
select title
from ((select n.title, #rn := #rn + 1, 'n' as which, id
from news n cross join (select #rn := 0) params
order by n.id desc
limit 12
)
union all
(select a.title, (#rna := #rna + 1) as rn, 'a', NULL
from advertisements a cross join (select #rna := 0) params
order rand()
limit 2
)
) na
order by (case when which = 'n' and rn <= 6 then 1
when which = 'a' and rn = 1 then 2
when which = 'n' and rn <= 12 then 3
when which = 'a' and rn = 1 then 4
end),
id desc;

Top 20 percent by id - MySQL

I am using a modified version of a query similiar to another question here:Convert SQL Server query to MySQL
Select *
from
(
SELECT tbl.*, #counter := #counter +1 counter
FROM (select #counter:=0) initvar, tbl
Where client_id = 55
ORDER BY ordcolumn
) X
where counter >= (80/100 * #counter);
ORDER BY ordcolumn
tbl.* contains the field 'client_id' and I am attempting to get the top 20% of the records for each client_id in a single statement. Right now if I feed it a single client_id in the where statement it gives me the correct results, however if I feed it multiple client_id's it simply takes the top 20% of the combined recordset instead of doing each client_id individually.
I'm aware of how to do this in most databases, but the logic in MySQL is eluding me. I get the feeling it involves some ranking and partitioning.
Sample data is pretty straight forward.
Client_id rate
1 1
1 2
1 3
(etc to rate = 100)
2 1
2 2
2 3
(etc to rate = 100)
Actual values aren't that clean, but it works.
As an added bonus...there is also a date field associated to these records and 1 to 100 exists for this client for multiple dates. I need to grab the top 20% of records for each client_id, year(date),month(date)

You need to do the enumeration for each client:
SELECT *
FROM (SELECT tbl.*, #counter := #counter +1 counter
(#rn := if(#c = client_id, #rn + 1,
if(#c := client_id, 1, 1)
)
)
FROM (select #c := -1, #rn := 0) initvar CROSS JOIN tbl
ORDER BY client_id, ordcolumn
) t cross join
(SELECT client_id, COUNT(*) as cnt
FROM tbl
GROUP BY client_id
) tt
where rn >= (80/100 * tt.cnt);
ORDER BY ordcolumn;

Using Gordon's answer as a starting point, I think this might be closer to what you need.
SELECT t.*
, (#counter := #counter+1) AS overallRow
, (#clientRow := if(#prevClient = t.client_id, #clientRow + 1,
if(#prevClient := t.client_id, 1, 1) -- This just updates #prevClient without creating an extra field, though it makes it a little harder to read
)
) AS clientRow
-- Alteratively (for everything done in clientRow)
, #clientRow := if(#prevClient = t.client_id, #clientRow + 1, 1) AS clientRow
, #prevClient := t.client_id AS extraField
-- This may be more reliable as well; I not sure if the order
-- of evaluation of IF(,,) is reliable enough to guarantee
-- no side effects in the non-"alternatively" clientRow calculation.
FROM tbl AS t
INNER JOIN (
SELECT client_id, COUNT(*) AS c
FROM tbl
GROUP BY client_id
) AS cc ON tbl.client_id = cc.client_id
INNER JOIN (select #prevClient := -1, #clientRow := 0) AS initvar ON 1 = 1
WHERE t.client_id = 55
HAVING clientRow * 5 < cc.c -- You can use a HAVING without a GROUP BY in MySQL
-- (note that clientRow is derived, so you cannot use it in the `WHERE`)
ORDER BY t.client_id, t.ordcolumn
;

Finding the most rented car of each year in SQL

I have tables as follows:
Car(id, make, ....)
Deal(id,datetime,car_id,....)
I want to write a query that would return a year, and a car make for the cars that have the most deals (ie the most deal ids) and the number of deals for that car make.
I started out,
SELECT YEAR(D.datetime) AS the_year, C.make, COUNT(D.id) AS num
FROM Deal D, Car C
WHERE D.car_id=C.id
GROUP BY the_year
Unfortunately, this has returned the year and the total number of deals.
So I am thinking to create this within another table and then call MAX(tbl.num), but I am confused on the syntax.
Can somebody help me out please?

This is an interesting problem. What you are looking for is specifically called the "mode" in statistics. In MySQL, you would get this by using variables or the group_conat()/substring_index()` trick. I'll show the latter:
SELECT the_year,
substring_index(group_concat(cd.make order by num desc), ',', 1) as the_mark
FROM (SELECT YEAR(D.datetime) AS the_year, C.make, COUNT(D.id) AS num
FROM Deal D JOIN
Car C
ON D.car_id = C.id
GROUP BY the_year, c.make
) cd
GROUP BY the_year;
EDIT:
The version using variables:
SELECT the_year,
substring_index(group_concat(cd.make order by num desc), ',', 1) as the_mark
FROM (SELECT YEAR(D.datetime) AS the_year, C.make, COUNT(D.id) AS num,
#rn := if(#year = YEAR(D.datetime), #rn + 1, 1) as rn,
#year := YEAR(D.datetime)
FROM Deal D JOIN
Car C
ON D.car_id = C.id CROSS JOIN
(SELECT #year := 0, #rn := 0) vars
GROUP BY the_year, c.make
ORDER BY the_year, num DESC
) cd
WHERE rn = 1;

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

Greatest n-per-group With Multiple Joins - mysql

Related

MySQL SUM TOP 2 RECORDS FOR EACH CATEGORY

How to rank mysql with groupby

Ordering records by rand() in sql that has multiple union of 2 tables

Top 20 percent by id - MySQL

Finding the most rented car of each year in SQL

Categories

Resources