How to limit results from a SQL subquery or join

Let's imagine I have two tables in MySQL, one called Vehicle and the other called Passenger.
If I want a complete list of all Vehicles and their passengers then I can do something like this:
SELECT *
FROM Vehicle v
LEFT
JOIN Passenger p
ON p.VehicleID = v.VehicleID
LIMIT 0,100
The problem is this: let's imagine that my vehicles are buses, where the first has 50 passengers, the second has 40, and the third has 30. The LIMIT 100 on the above query would give me only a partial list of the passengers on the third bus (10 of the 30).
Is there a way to create such a query that won't split the results from the joined table?
Or, alternatively, can you apply LIMITs separately to the different tables, so that I could ask for a limit of 10 vehicles and a limit of 50 passengers per vehicle?
Logically something like this:
SELECT * FROM Vehicle (LEFT JOIN Passenger ON Passenger.VehicleID = Vehicle.VehicleID LIMIT 0,50) LIMIT 0, 10
I was wondering if this could be achieved using some kind of subquery? Maybe something like:
SELECT *, (SELECT * FROM Passenger WHERE Passenger.VehicleID = Vehicle.VehicleID LIMIT 0,50) FROM Vehicle LIMIT 0, 10
But this doesn't work (The subquery is only allowed to return a single row).
Thanks in advance.

In MySQL, the easiest way to do what you want is to use variables to enumerate the rows:
SELECT *
FROM (SELECT v.*, (@rnv := @rnv + 1) AS seqnum_v
      FROM Vehicle v CROSS JOIN
           (SELECT @rnv := 0) params
     ) v LEFT JOIN
     (SELECT p.*,
             (@rnp := if(@v = VehicleId, @rnp + 1,
                         if(@v := VehicleId, 1, 1)
                        )
             ) AS seqnum_p
      FROM Passenger p CROSS JOIN
           (SELECT @v := -1, @rnp := 0) params
     ) p
     ON p.VehicleID = v.VehicleID AND
        p.seqnum_p <= 50  -- filtered in the ON clause so vehicles without passengers are kept
WHERE v.seqnum_v <= 10;
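On MySQL 8.0 or later the same idea can be written with ROW_NUMBER() instead of user variables. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module (SQLite 3.25+ for window functions) as a stand-in for MySQL, the PassengerID column and the sample rows are made up, and the limits are shrunk to 3 vehicles and 2 passengers per vehicle so the effect is visible.

```python
import sqlite3

# SQLite stands in for MySQL 8.0+ here. Table names follow the question;
# PassengerID and the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Vehicle (VehicleID INTEGER PRIMARY KEY);
    CREATE TABLE Passenger (PassengerID INTEGER PRIMARY KEY, VehicleID INTEGER);
    INSERT INTO Vehicle VALUES (1), (2), (3), (4);
    INSERT INTO Passenger (VehicleID) VALUES (1), (1), (1), (2);
""")

MAX_VEHICLES = 3    # the question asks for 10
MAX_PASSENGERS = 2  # the question asks for 50

rows = conn.execute("""
    SELECT v.VehicleID, p.PassengerID
    FROM (SELECT VehicleID,
                 ROW_NUMBER() OVER (ORDER BY VehicleID) AS seqnum_v
          FROM Vehicle) v
    LEFT JOIN (SELECT PassengerID, VehicleID,
                      ROW_NUMBER() OVER (PARTITION BY VehicleID
                                         ORDER BY PassengerID) AS seqnum_p
               FROM Passenger) p
           ON p.VehicleID = v.VehicleID AND p.seqnum_p <= ?
    WHERE v.seqnum_v <= ?
    ORDER BY v.VehicleID, p.PassengerID
""", (MAX_PASSENGERS, MAX_VEHICLES)).fetchall()

for row in rows:
    print(row)
```

Vehicle 1 is cut off after two passengers, vehicle 3 still appears even though it has none, and vehicle 4 falls outside the vehicle limit.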

Related

Double Aggregate Function Mysql

I want to take the maximum value from a series of returned values but I can't figure out a simple way to do it. My query returns all rows so 1/2 way there. I can filter it down with PHP but I'd like to do it all in SQL. I tried with a max subquery but that returned all results still.
DDL:
create table matrix(
count int(4),
date date,
product int(4)
);
create table products(
id int(4),
section int(4)
);
DML:
select max(magic_count), section, id
from (
select sum(count) as magic_count, p.section, p.id
from matrix as m
join products as p on m.product = p.id
group by m.product
) as faketable
group by id, section
Demo with my current try.
Only ids 1 and 3 should be returned from the sample data because they have the highest cumulative count for each of the sections.
Here's a second SQL fiddle that demonstrates the same issue.

Here you go:
select a.id,
a.section,
a.magic_count
from (
select p.id,
p.section,
magic_count
from (
select m.product, sum(count) as magic_count
from matrix m
group by m.product
) sm
join products p on sm.product = p.id
) a
left join (
select p.id,
p.section,
magic_count
from (
select m.product, sum(count) as magic_count
from matrix m
group by m.product
) sm
join products p on sm.product = p.id
) b on a.section = b.section and a.magic_count < b.magic_count
where b.id is null
See a simplified example (and other methods) in the manual entry "The Rows Holding the Group-wise Maximum of a Certain Column".
See it working live here.

Here is a solution that avoids the self-join used in the other answer, and in my testing it performs better:
select @rn := 1, @sectionLag := 0;
select id, section, count from (
    select id,
           case when @sectionLag = section then @rn := @rn + 1 else @rn := 1 end rn,
           @sectionLag := section,
           section,
           count
    from (
        select id, section, sum(count) count
        from matrix m
        join products p on m.product = p.id
        group by id, section
    ) a order by section, count desc
) a where rn = 1
The variables at the beginning are used to imitate the window functions LAG and ROW_NUMBER, which are available in MySQL 8.0 or higher (if you are using such a version, let me know and I will add a solution that uses window functions).
DEMO
Another demo, where you can compare the performance of my query and the other one. It contains ~20K rows, and my query tends to be almost twice as fast.
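For readers on MySQL 8.0 or later, a window-function version of the same group-wise maximum might look like the sketch below. It is not the answerer's code: it runs against Python's built-in sqlite3 module (SQLite 3.25+) as a stand-in for MySQL, and the sample rows are invented rather than taken from the linked fiddles.

```python
import sqlite3

# SQLite stands in for MySQL 8.0+; the schema follows the question's DDL,
# the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE matrix ("count" INTEGER, "date" TEXT, product INTEGER);
    CREATE TABLE products (id INTEGER, section INTEGER);
    INSERT INTO products VALUES (1, 1), (2, 1), (3, 2), (4, 2);
    INSERT INTO matrix VALUES
        (5, '2020-01-01', 1), (6, '2020-01-02', 1),
        (4, '2020-01-01', 2),
        (9, '2020-01-01', 3),
        (3, '2020-01-01', 4), (2, '2020-01-02', 4);
""")

best_per_section = conn.execute("""
    SELECT id, section, magic_count
    FROM (SELECT id, section, magic_count,
                 -- number the products inside each section, biggest total first
                 ROW_NUMBER() OVER (PARTITION BY section
                                    ORDER BY magic_count DESC) AS rn
          FROM (SELECT p.id, p.section, SUM(m."count") AS magic_count
                FROM matrix m
                JOIN products p ON m.product = p.id
                GROUP BY p.id, p.section) totals) ranked
    WHERE rn = 1
    ORDER BY section
""").fetchall()

print(best_per_section)
```

With this data, products 1 and 3 come back as the top product of sections 1 and 2. ROW_NUMBER() picks a single row per section even when totals tie; RANK() would keep all tied rows instead.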

insert / update records from one table to another table, no clear join

I have a list of SKUs in one table that I need to assign to product IDs in another table, the same way one would in Excel: by copying records from a column of SKUs and pasting them next to a column of product IDs, starting at the first row. I'd like to do this with an update query or some other method.
table1: tmp_pid
fields: pid, sku
This is where I have a random number of pid records. The sku field is empty. I'm trying to fill it with data from the next table.
table2: tmp_sku
fields: sku, used
This is where I keep a very long list of unique sku's and whether they have been used.
I tried this query but it does not work ([Err] 1054 - Unknown column 'tmp_sku.sku' in 'IN/ALL/ANY subquery')
UPDATE tmp_pid
SET tmp_pid.sku = tmp_sku.sku
WHERE tmp_sku.sku IN (SELECT sku FROM tmp_sku WHERE used = NO )
Table1 can have 20 or 1000 pid records, Table2 has 10000 unused sku's. I only need to copy the needed sku's next to the 20-1000 pid records in Table1. I know there is no connecting key between the two, but I am limited to this structure.

If I understand correctly, you want to get this result:
select p.*, s.sku
from (select p.*, (@rnp := @rnp + 1) as n
      from tmp_pid p cross join (select @rnp := 0) params
      order by pid
     ) p join
     (select s.*, (@rns := @rns + 1) as n
      from tmp_sku s cross join (select @rns := 0) params
      where used = 'NO'
      order by sku
     ) s
     on p.n = s.n;
If so, you can adapt this to an update:
update tmp_pid p join
       (select p.*, (@rnp := @rnp + 1) as n
        from tmp_pid p cross join (select @rnp := 0) params
        order by pid
       ) pp
       on p.pid = pp.pid join
       (select s.*, (@rns := @rns + 1) as n
        from tmp_sku s cross join (select @rns := 0) params
        where used = 'NO'  -- same filter as in the select above
        order by sku
       ) s
       on pp.n = s.n
    set p.sku = s.sku;
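The pairing-by-position idea can be tried out in isolation. The sketch below is an illustration only: it uses Python's built-in sqlite3 module (SQLite 3.25+) rather than MySQL, replaces the user variables with ROW_NUMBER(), assumes that `used` holds the strings 'YES' and 'NO', and uses made-up rows.

```python
import sqlite3

# SQLite stands in for MySQL 8.0+; table and column names follow the question,
# the 'YES'/'NO' encoding of `used` and the rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tmp_pid (pid INTEGER PRIMARY KEY, sku TEXT);
    CREATE TABLE tmp_sku (sku TEXT PRIMARY KEY, used TEXT);
    INSERT INTO tmp_pid (pid) VALUES (10), (20), (30);
    INSERT INTO tmp_sku VALUES ('A1', 'YES'), ('A2', 'NO'), ('A3', 'NO'),
                               ('A4', 'NO'), ('A5', 'NO');
""")

# Number the pids and the unused skus independently, then give each pid
# the sku that has the same position n.
conn.execute("""
    UPDATE tmp_pid
    SET sku = (SELECT s.sku
               FROM (SELECT pid, ROW_NUMBER() OVER (ORDER BY pid) AS n
                     FROM tmp_pid) p
               JOIN (SELECT sku, ROW_NUMBER() OVER (ORDER BY sku) AS n
                     FROM tmp_sku
                     WHERE used = 'NO') s
                 ON p.n = s.n
               WHERE p.pid = tmp_pid.pid)
""")

assigned = conn.execute("SELECT pid, sku FROM tmp_pid ORDER BY pid").fetchall()
print(assigned)
```

The already-used 'A1' is skipped, and the surplus 'A5' is simply left unassigned because there is no fourth pid to pair it with.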

MySQL SUM top n values for several columns and group

I have a MySQL table containing player points for several categories (p1, p2, etc.) and a player id (pid).
I have a query that counts SUM of points for each category, puts them as aliases and groups them by player id (pid).
SELECT *,
SUM(p1) as p1,
SUM(p2) as p2,
SUM(p3) as p3,
SUM(p4) as p4,
SUM(p6) as p6,
SUM(p13) as p13,
SUM(p14) as p14,
SUM(p15) as p15,
SUM(p16) as p16,
SUM(p17) as p17,
SUM(p18) as p18,
SUM(p19) as p19,
SUM(p20) as p20,
SUM(p21) as p21
FROM results GROUP BY pid
Further on, I run a while loop and update another table with these alias values.
Now I need to sum only the top 5 or top 12 values (depending on the category) for each group. I don't know where to start. I found similar questions, but none of them addresses putting the value in an alias, so that I don't have to change the rest of the code.
Can someone help me and write an example query for at least two categories, so that I can understand the principle of doing this right?
Thank you in advance!

Since we need to sum the top n records, we can use something like this:
SELECT pid, sum(p1)
FROM (SELECT p.*,
             (@pn := if(@p = pid, @pn + 1,
                        if(@p := pid, 1, 1)
                       )
             ) as seqnum
      FROM player p CROSS JOIN
           (SELECT @p := 0, @pn := 0) as p1
      ORDER BY pid, p1 DESC
     ) p
WHERE seqnum <= 1
GROUP BY pid;
Here, we can modify seqnum <= 1 condition as per the number of records needed. E.g. if we want 5 records then we need to write seqnum <= 5.
Please note that this will only calculate Top n sum for a particular field. If we want multiple fields then we may need to repeat the query.
Here is the SQL Fiddle example to play around with.

Building on the answer by @DarshanMehta, you can repeat the subquery once per category. Note that the variable names in each subquery need to be different.
Something like this, assuming you have a table of players:
SELECT players.pid,
       suba1.p1sum,
       suba2.p2sum
FROM players
LEFT OUTER JOIN
(
    SELECT pid, SUM(p1) AS p1sum
    FROM (SELECT r.pid,
                 r.p1,
                 @p1n := if(@p1 = pid, @p1n + 1, 1) AS seqnum,
                 @p1 := pid
          FROM results r
          CROSS JOIN (SELECT @p1 := 0, @p1n := 0) as p1
          ORDER BY r.pid, r.p1 DESC
         ) sub1
    WHERE seqnum <= 5
    GROUP BY pid
) suba1
ON players.pid = suba1.pid
LEFT OUTER JOIN
(
    SELECT pid, SUM(p2) AS p2sum
    FROM (SELECT r.pid,
                 r.p2,
                 @p2n := if(@p2 = pid, @p2n + 1, 1) AS seqnum,
                 @p2 := pid
          FROM results r
          CROSS JOIN (SELECT @p2 := 0, @p2n := 0) as p2
          ORDER BY r.pid, r.p2 DESC
         ) sub1
    WHERE seqnum <= 5
    GROUP BY pid
) suba2
ON players.pid = suba2.pid
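If window functions are available (MySQL 8.0 or later), the per-category ranking can be done in a single pass, with one ROW_NUMBER() per column and a conditional SUM on top. This is a hedged sketch, not part of the answers above: it runs on Python's built-in sqlite3 module (SQLite 3.25+) as a stand-in, only covers p1 and p2, and uses invented rows and small limits (top 2 for p1, top 1 for p2) in place of 5 and 12.

```python
import sqlite3

# SQLite stands in for MySQL 8.0+; `results`, `pid`, `p1`, `p2` follow the
# question, the rows and the limits are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE results (pid INTEGER, p1 INTEGER, p2 INTEGER);
    INSERT INTO results VALUES
        (1, 10, 1), (1, 20, 2), (1, 30, 3),
        (2, 5, 7), (2, 1, 9);
""")

TOP_P1 = 2  # e.g. 5 in the question
TOP_P2 = 1  # e.g. 12 in the question

top_sums = conn.execute("""
    SELECT pid,
           SUM(CASE WHEN rn1 <= ? THEN p1 END) AS p1,
           SUM(CASE WHEN rn2 <= ? THEN p2 END) AS p2
    FROM (SELECT pid, p1, p2,
                 -- one independent ranking per category
                 ROW_NUMBER() OVER (PARTITION BY pid ORDER BY p1 DESC) AS rn1,
                 ROW_NUMBER() OVER (PARTITION BY pid ORDER BY p2 DESC) AS rn2
          FROM results) ranked
    GROUP BY pid
    ORDER BY pid
""", (TOP_P1, TOP_P2)).fetchall()

print(top_sums)
```

Because the sums keep the aliases p1 and p2, code that reads those column names from the result would not need to change. Further categories would each need their own ROW_NUMBER() column and their own conditional SUM.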

You can build a table with all that SUM information, and use this one:
SELECT * from newTable ORDER BY p1 DESC LIMIT 5;
and you can catch all info that you want, by changing the field p1 and LIMIT 5

Generic greatest N per group query is too slow

The following query takes 18 minutes to complete. How can I optimize it to execute faster?
Basically, for every citizen, my query joins the rows from the citizens_static and citizens_dynamic tables where the update_id_to column is highest.
INSERT INTO latest_tmp (...)
SELECT cs1.*, cd1.*
FROM citizens c
JOIN citizens_static cs1 ON c.id = cs1.citizen_id
JOIN citizens_dynamic cd1 ON c.id = cd1.citizen_id
JOIN (
SELECT citizen_id, MAX(update_id_to) AS update_id_to
FROM citizens_static
GROUP BY citizen_id
) AS cs2 ON c.id = cs2.citizen_id AND cs1.update_id_to = cs2.update_id_to
JOIN (
SELECT citizen_id, MAX(update_id_to) AS update_id_to
FROM citizens_dynamic
GROUP BY citizen_id
) cd2 ON c.id = cd2.citizen_id AND cd1.update_id_to = cd2.update_id_to;
The latest_tmp table is a MyISAM table with indexes disabled during import. Disabling them improved execution time from 20 minutes to 18 minutes, so that is not the biggest problem.
I also benchmarked the LEFT JOIN approach with WHERE t2.column IS NULL. It takes several hours, compared to the INNER JOIN approach which I'm using.
Explain query output below. It seems to be using indexes.
citizens_dynamic and citizens_static have primary key on citizen_id,update_id_to and secondary key named "id" on update_id_to,citizen_id columns.

Could you explain, in English, what you want?
Then see Groupwise Max, and edit the following as needed:
SELECT
    province, n, city, population
FROM
    ( SELECT @prev := '', @n := 0 ) init
JOIN
    ( SELECT @n := if(province != @prev, 1, @n + 1) AS n,
             @prev := province,
             province, city, population
      FROM Canada
      ORDER BY
          province,
          population DESC
    ) x
WHERE n <= 3
ORDER BY province, n;
Regardless of the ASC/DESC on the inner ORDER BY, there will be a full table scan and a 'filesort'.

I'm not familiar enough with MySQL to be able to predict whether this will run any better, but I would suggest giving this a try:
SELECT cs1.*, cd1.*
FROM citizens c
JOIN citizens_static cs1 ON c.id = cs1.citizen_id
    AND NOT EXISTS ( SELECT *
                     FROM citizens_static cs2
                     WHERE cs2.citizen_id = cs1.citizen_id
                       AND cs2.update_id_to > cs1.update_id_to )
JOIN citizens_dynamic cd1 ON c.id = cd1.citizen_id
    AND NOT EXISTS ( SELECT *
                     FROM citizens_dynamic cd2
                     WHERE cd2.citizen_id = cd1.citizen_id
                       AND cd2.update_id_to > cd1.update_id_to )
PS: Please comment with the running time (if it returns within the hour), so that I might learn whether (not) to propose this construction in the future.
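Another way to phrase "the row with the highest update_id_to per citizen" is ROW_NUMBER(), which MySQL offers from 8.0. The sketch below only demonstrates that the query returns the intended rows; it says nothing about speed on the asker's data. It uses Python's built-in sqlite3 module (SQLite 3.25+) as a stand-in, and the `name` and `city` columns and all rows are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL 8.0+; table names and the citizen_id /
# update_id_to columns follow the question, `name` and `city` are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE citizens (id INTEGER PRIMARY KEY);
    CREATE TABLE citizens_static (citizen_id INTEGER, update_id_to INTEGER, name TEXT,
                                  PRIMARY KEY (citizen_id, update_id_to));
    CREATE TABLE citizens_dynamic (citizen_id INTEGER, update_id_to INTEGER, city TEXT,
                                   PRIMARY KEY (citizen_id, update_id_to));
    INSERT INTO citizens VALUES (1), (2);
    INSERT INTO citizens_static VALUES (1, 1, 'Ann'), (1, 3, 'Anne'), (2, 2, 'Bob');
    INSERT INTO citizens_dynamic VALUES (1, 2, 'Oslo'), (1, 5, 'Bergen'), (2, 4, 'Rome');
""")

latest = conn.execute("""
    SELECT c.id, cs.name, cs.update_id_to, cd.city, cd.update_id_to
    FROM citizens c
    JOIN (SELECT citizen_id, name, update_id_to,
                 ROW_NUMBER() OVER (PARTITION BY citizen_id
                                    ORDER BY update_id_to DESC) AS rn
          FROM citizens_static) cs
      ON cs.citizen_id = c.id AND cs.rn = 1
    JOIN (SELECT citizen_id, city, update_id_to,
                 ROW_NUMBER() OVER (PARTITION BY citizen_id
                                    ORDER BY update_id_to DESC) AS rn
          FROM citizens_dynamic) cd
      ON cd.citizen_id = c.id AND cd.rn = 1
    ORDER BY c.id
""").fetchall()

print(latest)
```

Each citizen comes back once, paired with its newest static row and its newest dynamic row.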

MYSQL Deleting all records over 15 for Each GROUP

At the end of this process I need to have a maximum of 15 records for each type in a table.
My (hypothetical) table "stickorder" has 3 columns: StickColor, OrderNumber, PrimeryKey. (OrderNumber and PrimeryKey are unique.)
I can only handle 15 orders for each stick color, so I need to delete all the extra orders. (They will be processed another day and are in a master table, so I don't need them in this table.)
I have tried some similar solutions on this site, but nothing seems to work. This is the closest:
INSERT INTO stickorder2
(select posts_ordered.*
 from (
     select
         stickorder.*,
         @row:=if(@last_order=stickorder.OrderNumber, @row+1, 1) as row,
         @last_orders:=stickorder.OrderNumber
     from
         stickorder inner join
         (select OrderNumber from
             (select distinct OrderNumber
              from stickorder
              order by OrderNumber) limit_orders
         ) limit_orders
         on stickorder.OrderNumber = limit_orders.OrderNumber,
         (select @last_order:=0, @row:=0) r
 ) posts_ordered
 where row<=15);

When using insert, you should always list the columns. Alternatively, you might really want CREATE TABLE AS.
Then, there are lots of other issues with your query. For instance, you say you want a limit on the number for each color, and yet you have no reference to StickColor in your query. I think you want something more along these lines:
INSERT INTO stickorder2(col1, . . . col2)
    select so.*
    from (select so.*,
                 @row := if(@lastcolor = so.StickColor, @row + 1,
                            if(@lastcolor := so.StickColor, 1, 1)
                           ) as row
          from stickorder so cross join
               (select @lastcolor := 0, @row := 0) vars
          order by so.StickColor
         ) so
    where row <= 15;
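Since the title asks for a delete, here is a sketch of removing the surplus rows in place instead of copying the keepers to a second table. It is an illustration under assumptions, not a tested MySQL statement: it runs on Python's built-in sqlite3 module (SQLite 3.25+), keeps the rows with the lowest OrderNumber per color (the question does not say which 15 should survive), and uses a limit of 3 with invented rows.

```python
import sqlite3

# SQLite stands in for MySQL 8.0+; column names (including the question's
# spelling "PrimeryKey") follow the question, the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE stickorder (PrimeryKey INTEGER PRIMARY KEY,
                             StickColor TEXT,
                             OrderNumber INTEGER UNIQUE)
""")
conn.executemany(
    "INSERT INTO stickorder (StickColor, OrderNumber) VALUES (?, ?)",
    [("red", 101), ("red", 102), ("red", 103), ("red", 104),
     ("blue", 201), ("blue", 202)],
)

MAX_PER_COLOR = 3  # 15 in the question

# Rank the orders inside each color and delete everything past the limit.
# Assumption: the earliest OrderNumbers are the ones to keep.
conn.execute("""
    DELETE FROM stickorder
    WHERE PrimeryKey IN (
        SELECT PrimeryKey
        FROM (SELECT PrimeryKey,
                     ROW_NUMBER() OVER (PARTITION BY StickColor
                                        ORDER BY OrderNumber) AS rn
              FROM stickorder) ranked
        WHERE rn > ?)
""", (MAX_PER_COLOR,))

remaining = conn.execute(
    "SELECT StickColor, OrderNumber FROM stickorder ORDER BY OrderNumber"
).fetchall()
print(remaining)
```

The fourth red order is removed, and blue is untouched because it never exceeded the limit.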
