Let's say there are millions of records in my_table.
Here is my query to extract rows with a specific name from list:
SELECT * FROM my_table WHERE Name IN ('name1','name2','name3','name4')
How do I limit the returned result per name1, name2, etc?
The following query would limit the whole result (to 100).
SELECT * FROM my_table WHERE Name IN ('name1','name2','name3','name4') LIMIT 100
I need to limit to 100 for each name.
This is a bit of a pain in MySQL, but the best method is probably variables:
select t.*
from (select t.*,
(#rn := if(#n = name, #rn + 1,
if(#n := name, 1, 1)
)
) as rn
from my_table t cross join
(select #n := '', #rn := 0) params
order by name
) t
where rn <= 100;
If you want to limit this to a subset of the names, then add the where clause to the subquery.
Note: If you want to pick certain rows -- such as the oldest or newest or biggest or tallest -- just add a second key to the order by in the subquery.
Try
SELECT * FROM my_table WHERE Name IN ('name1','name2','name3','name4') FETCH FIRST 100 ROWS ONLY
Related
Is it possible to create a counter in mysql/mariadb in one single SELECT-statement. I've tried the following but it returns only the value 1 in the first column:
SELECT #rownr := IF(ISNULL(#rownr),0,#rownr)+1 AS rowNumber, * FROM table_x LIMIT 0,10
If I run the statement more often in the same mysql-instance it starts counting from the last number. So the second time it starts at 2, the third time at 12. This means that the variable is created but seems to be only available for modification when it was instantiated before the SQL statement was issued.
It is possible, but a bit tricky. First, you need to declare the variable outside of the select clause (in a separate set assignment, or in a derived table). Also, it is safer to sort the rows in a subquery first, and then compute the variable.
I would recommend:
set #rn := 0;
select t.*, #rn := #rn + 1 rowNumber
from (select t.* from mytable t order by id limit 10) t
Note that I added an order by clause to the inner query, otherwise it is undefined in which sequence rows will be ordered (I assumed id).
Alternatively, you can declare the variable in a derived table:
select t.*, #rn := #rn + 1 rowNumber
from (select t.* from mytable t order by id limit 10) t
cross join (select #rn := 0) x
Finally: if you are running MySQL 8.0, just use row_number():
select t.*, row_number() over(order by id) rn
from mytable t
order by id
limit 10;
You don't have an order by, so the ordering is indeterminate. But you can initialize the parameter in the statement itself:
SELECT #rownr := (#rownr + 1) AS rowNumber, x.*
FROM table_x x.CROSS JOIN
(SELECT #rownr := 0) params
LIMIT 0, 10;
If you want a particular ordering, you should use an order by in a subquery.
Also note that starting in MySQL 8, variable assignments in SELECT are deprecated. You should be using window functions (row_number()) in more recent versions.
So I have a bit of an unusual request. I'm working with a table with billions of rows.
The table has a column 'id' which is not unique, and has a column 'data'
What I want to do is run a count on the number of rows grouped by the 'id', but limit the counting to only 150 entries. I only need to know if there are 150 rows by any given id.
This is in an effort to optimize the query and performance.
It doesn't have to be a count. I only need to know if a given id has 150 entries, without have MySQL continue counting entries during the query. If that makes sense.
I know how to count, and I know how to group, and I know how to do both, but the count will come back with a number in the millions which is wasted processing time and the query needs to run on hundred of thousands of ids.
You can't really optimize performance for this -- I don't think.
select id, (count(*) >= 150)
from t
group by id;
If you happen to have a separate table with one row per id and an index on t(id), then this might be faster:
select ids.id,
((select count(*)
from t
where t.id = ids.id
) >= 150
)
from ids;
Unfortunately, MySQL does not support double nesting for correlated subqueries, so this is not possible:
select ids.id,
((select count(*)
from (select 1
from t
where t.id = ids.id
limit 150
) t
) >= 150
)
from ids;
If so, this might be faster.
EDIT:
If you have an index on id and only want ids that have 150 or more, then variables might be faster:
select id,
(#rn := if(#id = id, #rn + 1,
if(#id := id, 1, 1)
)
) as rn
from (select id
from t
order by id
) t cross join
(select #id := 0, #rn := 0) params
having rn = 150;
The thinking here is that using the index to order the table, materializing, and scanning again is probably faster than group by. I don't think row_number() would have the same performance characteristics.
EDIT II:
A slight variation on the above can be used to get all ids with a flag:
select id, (max(id) = 150)
from (select id,
(#rn := if(#id = id, #rn + 1,
if(#id := id, 1, 1)
)
) as rn
from (select id
from t
order by id
) t cross join
(select #id := 0, #rn := 0) params
having rn in (1, 150)
) t
group by id;
EDIT III:
Ahh! If you have a separate table of ids, then this might be the best approach:
select ids.id,
(select id
from t
where t.id = ids.id
limit 1 offset 149
) is not null
from ids;
This will fetch the 150th row from the index. If it not there, then no row is returned.
I don't think that this is possible. You will have to scan the entire table to know which ids have at least 150 entries.
So:
select id
from mytable
group by id
having count(*) >= 150
With an index on id, this should be as efficient as it can be.
I have a table like this:
01-Jul-17 100
02-Jul-17 100
03-Jul-17 300
04-Jul-17 300
05-Jul-17 500
06-Jul-17 500
07-Jul-17 300
08-Jul-17 400
09-Jul-17 100
10-Jul-17 100
What I want to output is (in this order) by eliminating the continuous duplicates but not all duplicates:
100
300
500
300
400
100
I cannot select Distinct, as it will eliminate the second instances of 300, 100. Is there a way to achieve this result in MySQL?
Thanks!
You want to get the previous value. If the dates really have no gaps or duplicates, just do:
select t.*
from t left join
t tprev
on t.col1 = date_add(tprev.col1, interval 1 day)
where tprev.col2 is null or tprev.col2 <> t.col2;
EDIT:
If the dates don't meet these conditions, then you can use variables:
select t.*
from (select t.*,
(#rn := if(#v = col2, #rn + 1,
if(#v := col2, 1, 1)
)
) as rn
from t cross join
(select #v := 0, #rn := 0) params
order by t.col1
) t
where rn = 1;
Note that MySQL does not guarantee the order of evaluation of expressions in the SELECT. So variables should not be assigned in one expression and then used in another -- they should be assigned in a single expression.
One way to handle this problem is by using session variables to track the changes of the values as ordered by your date column. In the query below, we keep track of the value, ordered by date, and assign a row number to each group of identical value. Then, only the first value in each group is retained. Note that this approach is robust to any number of duplicates. It is also robust with respect to there being gaps in your dates, so long as each record can be ordered by date.
SET #rn = 1;
SET #val = NULL;
SELECT t.val
FROM
(
SELECT
#rn:=CASE WHEN #val = val THEN #rn+1 ELSE 1 END rn,
#val:=val AS val,
dt
FROM yourTable
ORDER BY dt
) t
WHERE t.rn = 1
ORDER BY t.dt;
Output:
Demo here:
Rextester
You can make use of lag and lead functions.
select y from (select y , lag(y,1,0) over (order by x) as prev_y from t1) where y <> prev_y;
Suppose that I have a database which contains the following columns:
VehicleID|timestamp|lat|lon|
I may have multiple times the same VehicleId but with a different timestamp. Thus VehicleId,Timestamp is the primary key.
Now I would like to have as a result the last N measurements per VehicleId or the first N measurements per vehicleId.
How I am able to list the last N tuples according to an ordering column (e.g. in our case timestamp) per VehicleId?
Example:
|VehicleId|Timestamp|
1|1
1|2
1|3
2|1
2|2
2|3
5|5
5|6
5|7
In MySQL, this is most easily done using variables:
select t.*
from (select t.*,
(#rn := if(#v = vehicle, #rn + 1,
if(#v := vehicle, 1, 1)
)
) as rn
from table t cross join
(select #v := -1, #rn := 0) params
order by VehicleId, timestamp desc
) t
where rn <= 3;
I have the table with data:
And for this table I need to create pegination by productId column. I know about LIMIT N,M, but it works with rows and not with groups. For examle for my table with pegination = 2 I expect to retrieve all 9 records with productId = 1 and 2 (the number of groups is 2).
So how to create pegination by numbers of groups ?
I will be very thankfull for answers with example.
One way to do pagination by groups is to assign a product sequence to the query. Using variables, this requires a subquery:
select t.*
from (select t.*,
(#rn := if(#p = productid, #rn + 1,
if(#rn := productid, 1, 1)
)
) as rn
from table t cross join
(select #rn := 0, #p := -1) vars
order by t.productid
) t
where rn between X and Y;
With an index on t(productid), you can also do this with a subquery. The condition can then go in a having clause:
select t.*,
(select count(distinct productid)
from t t2
where t2.productid <= t.productid)
) as pno
from t
having pno between X and Y;
Try this:
select * from
(select * from <your table> where <your condition> group by <with your group>)
LIMIT number;