How to select last N Rows without using an Index - mysql

I have a query that contains several conditions to extract data from a table of 5 million rows. A composite index has been built that partially covers some of these conditions, to the extent that I am not able to cover the sorting with an index as well:
SELECT columns FROM Table WHERE conditions='conditions' ORDER BY id DESC LIMIT N;
The id itself is an auto-increment column. The above query can be very slow (4-5s) as filesort is being used. By removing the ORDER BY clause, I am able to speed up the query by up to 4 times. However the data extracted will be mostly old data.
Since post-processing can be carried out to sort the extracted data, I am more interested in extracting data from roughly the latest N rows from the resultset. My question is, is there a way to do something like this:
SELECT columns FROM Table WHERE conditions='conditions' LIMIT -N;
I do not really need a sort, and I know there is a very high likelihood that the bottom N rows contain newer data.

Here you go. Keep in mind that there should be no problem in using ORDER BY with any indexed columns, including id.
SET @seq:=0;
SELECT `id`
FROM (
SELECT @seq := @seq +1 AS `seq` , `id`
FROM `Table`
WHERE `condition` = 'whatever'
)t1
WHERE t1.seq
BETWEEN (
(
SELECT COUNT( * )
FROM `Table`
WHERE `condition` = 'whatever'
) -49
)
AND (
SELECT COUNT( * )
FROM `Table`
WHERE `condition` = 'whatever'
);
You can replace the "-49" with an expression like: -1 * ($quantity_desired -1);
Also check out this answer as it might help you:
https://stackoverflow.com/a/725439/631764
And here's another one:
https://stackoverflow.com/a/1441164/631764

Grab the last "few" rows using a between:
SELECT columns
FROM Table
WHERE conditions = 'conditions'
AND id between (select max(id) from table) - 50 AND (select max(id) from table)
ORDER BY id
DESC LIMIT N;
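This example restricts the search to the last 50 id values, so the id index will be used efficiently. The other conditions and the ordering are then applied to only about 50 rows. If fewer than N of those rows match the conditions, you get fewer than N rows back, so widen the range as needed. Should work a treat.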
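A minimal sketch of the id-window idea above, using Python's sqlite3 module as a stand-in for MySQL so it can be run anywhere. The `events` table, its `status` column, and the `latest_matching` helper are made-up names for illustration, not part of the original question.

```python
import sqlite3

def latest_matching(conn, status, n, window):
    """Return up to n newest rows with the given status,
    looking only at the last `window` id values."""
    sql = """
        SELECT id, status
        FROM events
        WHERE id > (SELECT MAX(id) FROM events) - ?
          AND status = ?
        ORDER BY id DESC
        LIMIT ?
    """
    return conn.execute(sql, (window, status, n)).fetchall()

# Hypothetical data: ids 1..1000, every third row has status 'ok'.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY AUTOINCREMENT, status TEXT)"
)
conn.executemany(
    "INSERT INTO events (status) VALUES (?)",
    [("ok" if i % 3 == 0 else "fail",) for i in range(1, 1001)],
)

rows = latest_matching(conn, "ok", 5, 50)
print(rows)
```

The sort only ever touches the rows inside the window. The trade-off is that a window that is too narrow returns fewer than N rows, so the caller may need to retry with a larger one.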

Related

run a query if a value on the last row is correct

I'd like to run a query only if a value in the last row is correct. In my example, if the value in ColumnA is 1 on the last row, then I want to run MyQuery. But if the value is not 1, stop there and do nothing.
I've tried with CASE and COUNT(*), and also with IF EXISTS, but I keep getting errors:
SELECT CASE WHEN ((SELECT COUNT(*) FROM
(Select a.* from table as a order by a.index desc limit 1) as b
where b.ColumnA = 1)) > 0 )
THEN (MyQuery)
END
I've also tried with IF EXISTS, but it doesn't work either:
if exists Select b.* from (Select a.* from table as a order by a.index desc limit 1) where b.ColumnA = 1
begin
(MyQuery)
end
Can you point out what is wrong in those queries, or maybe there's a better way to achieve this?
EDIT: This query will be run in a trigger after each insert into that table. The goal is to avoid running MyQuery on rows that don't require it. MyQuery is a bit slow and most rows don't require it to run.
I think we can rephrase your logic here to make it work as you want:
WITH cte AS (
SELECT ColumnA, ROW_NUMBER() OVER (ORDER BY `index` DESC) rn
FROM yourTable
)
(your query here)
WHERE (SELECT ColumnA FROM cte WHERE rn = 1) = 1;
The WHERE clause above would return either true or false, and would apply to all records in the potential result set from your query. That is, if the ColumnA value from the "last" record were 1, then you would get back the entire result set, otherwise it would be empty set.
Assuming your version of MariaDB supports neither ROW_NUMBER nor CTEs, then use:
(your query here)
WHERE (SELECT ColumnA FROM yourTable ORDER BY `index` DESC LIMIT 1) = 1;
It depends on what your query is.
INSERT ...
SELECT ... WHERE ... -- this could lead to zero rows being inserted
DELETE ...
WHERE NOT EXISTS ( SELECT ... ) -- this could lead to zero rows being deleted
UPDATE t1 JOIN t2 ... -- the JOIN may cause no rows to be updated
Note:
(Select a.* from table as a order by a.index desc limit 1) as b
where b.ColumnA = 1)) > 0 )
can be simplified (and sped up) to
( ( SELECT ColumnA FROM table ORDER BY `index` DESC LIMIT 1 ) = 1 )
Note that that is a true/false "expression", so it can be used in various places.
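To illustrate the true/false expression used as a gate, here is a small sketch run against SQLite through Python's sqlite3 module rather than MariaDB. The `readings` table, its `flag` and `payload` columns, and the helper name are assumptions made up for the example; the SELECT stands in for "your query here".

```python
import sqlite3

def rows_if_last_flag_set(conn):
    """Run the inner query only when the newest row has flag = 1;
    otherwise the WHERE expression is false and nothing is returned."""
    sql = """
        SELECT id, payload
        FROM readings
        WHERE (SELECT flag FROM readings ORDER BY id DESC LIMIT 1) = 1
        ORDER BY id
    """
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (id INTEGER PRIMARY KEY, flag INTEGER, payload TEXT)"
)
conn.executemany(
    "INSERT INTO readings (flag, payload) VALUES (?, ?)",
    [(1, "a"), (0, "b"), (1, "c")],
)

rows = rows_if_last_flag_set(conn)  # newest row has flag = 1
print(rows)
```

Once a row with flag 0 is appended, the same call returns an empty list, because the scalar subquery is evaluated against the new last row.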

How to use ORDER BY inside UNION

I want to use ORDER BY on every UNION ALL queries, but I can't figure out the right syntax. This is what I want:
(
SELECT id, user_id, other_id, name
FROM tablename
WHERE user_id = 123 AND user_in IN (...)
ORDER BY name
)
UNION ALL
(
SELECT id, user_id, other_id, name
FROM tablename
WHERE user_id = 456 AND user_id NOT IN (...)
ORDER BY name
)
EDIT:
Just to be clear: I need two ordered lists like this, not one:
1
2
3
1
2
3
4
5
Thank you very much!
Something like this should work in MySQL:
SELECT a.*
FROM (
SELECT ... FROM ... ORDER BY ...
) a
UNION ALL
SELECT b.*
FROM (
SELECT ... FROM ... ORDER BY ...
) b
This returns the rows in the order we'd like, i.e. MySQL seems to honor the ORDER BY clauses inside the inline views.
But, without an ORDER BY clause on the outermost query, the order that the rows are returned is not guaranteed.
If we need the rows returned in a particular sequence, we can include an ORDER BY on the outermost query. In a lot of use cases, we can just use an ORDER BY on the outermost query to satisfy the results.
But when we have a use case where we need all the rows from the first query returned before all the rows from the second query, one option is to include an extra discriminator column in each of the queries. For example, add ,'a' AS src in the first query, ,'b' AS src to the second query.
Then the outermost query could include ORDER BY src, name, to guarantee the sequence of the results.
FOLLOWUP
In your original query, the ORDER BY clauses in the individual queries are discarded by the optimizer; since there is no ORDER BY applied to the outer query, MySQL is free to return the rows in whatever order it wants.
The "trick" in the query in my answer (above) depends on behavior that may be specific to some versions of MySQL.
Test case:
populate tables
CREATE TABLE foo2 (id INT PRIMARY KEY, role VARCHAR(20)) ENGINE=InnoDB;
CREATE TABLE foo3 (id INT PRIMARY KEY, role VARCHAR(20)) ENGINE=InnoDB;
INSERT INTO foo2 (id, role) VALUES
(1,'sam'),(2,'frodo'),(3,'aragorn'),(4,'pippin'),(5,'gandalf');
INSERT INTO foo3 (id, role) VALUES
(1,'gimli'),(2,'boromir'),(3,'elron'),(4,'merry'),(5,'legolas');
query
SELECT a.*
FROM ( SELECT s.id, s.role
FROM foo2 s
ORDER BY s.role
) a
UNION ALL
SELECT b.*
FROM ( SELECT t.id, t.role
FROM foo3 t
ORDER BY t.role
) b
resultset returned
id role
------ ---------
3 aragorn
2 frodo
5 gandalf
4 pippin
1 sam
2 boromir
3 elron
1 gimli
5 legolas
4 merry
The rows from foo2 are returned "in order", followed by the rows from foo3, again, "in order".
Note (again) that this behavior is NOT guaranteed. The behavior we observe is a side effect of how MySQL processes inline views (derived tables), and it may be different in versions after 5.5.
If you need the rows returned in a particular order, then specify an ORDER BY clause for the outermost query. And that ordering will apply to the entire resultset.
As I mentioned earlier, if I needed the rows from the first query first, followed by the second query, I would include a "discriminator" column in each query, and then include the "discriminator" column in the ORDER BY clause. I would also do away with the inline views, and do something like this:
SELECT s.id, s.role, 's' AS src
FROM foo2 s
UNION ALL
SELECT t.id, t.role, 't' AS src
FROM foo3 t
ORDER BY src, role
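The discriminator-column query can be tried out with the foo2/foo3 test data from this answer. This sketch runs it on SQLite via Python's sqlite3 module, which is an assumption made for convenience; the SQL itself is the same as above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo2 (id INT PRIMARY KEY, role VARCHAR(20))")
conn.execute("CREATE TABLE foo3 (id INT PRIMARY KEY, role VARCHAR(20))")
conn.executemany(
    "INSERT INTO foo2 (id, role) VALUES (?, ?)",
    [(1, "sam"), (2, "frodo"), (3, "aragorn"), (4, "pippin"), (5, "gandalf")],
)
conn.executemany(
    "INSERT INTO foo3 (id, role) VALUES (?, ?)",
    [(1, "gimli"), (2, "boromir"), (3, "elron"), (4, "merry"), (5, "legolas")],
)

# A single ORDER BY on the outermost query; src keeps the two lists apart.
sql = """
    SELECT s.id, s.role, 's' AS src FROM foo2 s
    UNION ALL
    SELECT t.id, t.role, 't' AS src FROM foo3 t
    ORDER BY src, role
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Because the ordering is stated on the outer query, the two sorted lists come back one after the other without relying on how inline views happen to be processed.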
Don't use ORDER BY in an individual SELECT statement inside a UNION, unless you're using LIMIT with it.
The MySQL docs on UNION explain why (emphasis mine):
To apply ORDER BY or LIMIT to an individual SELECT, place the clause
inside the parentheses that enclose the SELECT:
(SELECT a FROM t1 WHERE a=10 AND B=1 ORDER BY a LIMIT 10) UNION
(SELECT a FROM t2 WHERE a=11 AND B=2 ORDER BY a LIMIT 10);
However, use of ORDER BY for individual SELECT statements implies
nothing about the order in which the rows appear in the final result
because UNION by default produces an unordered set of rows. Therefore,
the use of ORDER BY in this context is typically in conjunction with
LIMIT, so that it is used to determine the subset of the selected rows
to retrieve for the SELECT, even though it does not necessarily affect
the order of those rows in the final UNION result. If ORDER BY appears
without LIMIT in a SELECT, it is optimized away because it will have
no effect anyway.
To use an ORDER BY or LIMIT clause to sort or limit the entire UNION
result, parenthesize the individual SELECT statements and place the
ORDER BY or LIMIT after the last one. The following example uses both
clauses:
(SELECT a FROM t1 WHERE a=10 AND B=1)
UNION
(SELECT a FROM t2 WHERE a=11 AND B=2)
ORDER BY a LIMIT 10;
It seems like an ORDER BY clause like the following will get you what you want:
ORDER BY user_id, name
You just use one ORDER BY at the very end.
The Union turns two selects into one logical select. The order-by applies to the entire set, not to each part.
Don't use any parens either. Just:
SELECT 1 as Origin, blah blah FROM foo WHERE x
UNION ALL
SELECT 2 as Origin, blah blah FROM foo WHERE y
ORDER BY Origin, z
(SELECT id, user_id, other_id, name
FROM tablename
WHERE user_id = 123
AND user_in IN (...))
UNION ALL
(SELECT id, user_id, other_id, name
FROM tablename
WHERE user_id = 456
AND user_id NOT IN (...))
ORDER BY name
You can also simplify this query:
SELECT id, user_id, other_id, name
FROM tablename
WHERE (user_id = 123 AND user_in IN (...))
OR (user_id = 456 AND user_id NOT IN (...))

Correct way to delete 'older' rows in MySQL

My table has a TIME field.
I want to keep only 5 newest rows.
Can I delete the old rows without using SELECT?
I think logic should be something like this:
DELETE FROM tbl WHERE row_num > 5 ORDER BY TIME
How can I implement this in MySQL whitout using SELECT to get list of TIME values?
Without a proper ORDER BY clause, an SQL result set has to be considered unordered.
You have to provide a column that explicitly stores the sequence of your rows. This could be a timestamp or the auto_increment column of your table.
Please keep in mind you could have concurrent access to your table as well. What should be the expected behavior if someone else is inserting while you are deleting? As far as I can tell, this could lead to a situation where you keep only the "5 latest rows" + "those inserted by the other transaction".
If you have a time column for that purpose in your table, plus a PRIMARY KEY (or some other UNIQUE NOT NULL column), you could write:
DELETE tbl FROM tbl LEFT JOIN (SELECT * FROM tbl ORDER BY tm DESC LIMIT 5) AS k
ON (tbl.pk) = (k.pk)
WHERE k.tm IS NULL;
If you have a composite primary key (a,b), you could write:
DELETE tbl FROM tbl LEFT JOIN (SELECT * FROM tbl ORDER BY tm DESC LIMIT 5) AS k
ON (tbl.a,tbl.b) = (k.a,k.b)
WHERE k.tm IS NULL;
DELETE FROM TBL
WHERE ROW_NUM = (SELECT ROW_NUM FROM TBL LIMIT 6, 99999)
ORDER BY TIME DESC;
This is meant to delete the records after the first few (the 7th, 8th, 9th, and so on),
because LIMIT 6, 99999 skips the first 6 rows and covers up to the next 99999.
Maybe this would be an alternative:
DELETE FROM tbl
WHERE primary_key NOT IN (SELECT primary_key
FROM tbl
ORDER BY time
DESC LIMIT 5)
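Here is a rough, runnable sketch of the "keep only the newest 5" delete, using SQLite through Python's sqlite3 module. SQLite accepts the NOT IN form directly. MySQL, as far as I know, rejects LIMIT inside an IN subquery and also a subquery on the table being deleted from, so there the derived-table LEFT JOIN form is the one to reach for. The column names `pk` and `tm` and the `keep_newest` helper are assumptions for the example.

```python
import sqlite3

def keep_newest(conn, n):
    """Delete every row except the n rows with the largest tm.
    Returns the number of rows deleted."""
    cur = conn.execute(
        "DELETE FROM tbl WHERE pk NOT IN "
        "(SELECT pk FROM tbl ORDER BY tm DESC LIMIT ?)",
        (n,),
    )
    conn.commit()
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (pk INTEGER PRIMARY KEY, tm INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO tbl (pk, tm) VALUES (?, ?)",
    [(1, 300), (2, 100), (3, 800), (4, 200),
     (5, 700), (6, 500), (7, 600), (8, 400)],
)

deleted = keep_newest(conn, 5)
remaining = [r[0] for r in conn.execute("SELECT pk FROM tbl ORDER BY pk")]
print(deleted, remaining)
```

If several rows share the same tm value at the cut-off, which of them survives is arbitrary; adding the key as a second ORDER BY column would make that deterministic.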
If you want to exclude the top 5 rows, use something like:
DELETE FROM table WHERE primary_key IN
(SELECT primary_key FROM table LIMIT 1000000 OFFSET 5)
1000000 can be any very large number.

select 2nd row of every ID in mysql

I have a table :
ID | time
1 | 300
1 | 100
1 | 200
2 | 200
2 | 500
I want to get 2nd row for every ID
I know that I can get 1st row as
select ID,time from T group by ID;
But I don't know about how to get 2nd row for every ID.
I know about limit and offset clause in mysql, but can't figure out how to use them here.
How can I do it ?
EDIT : Actually, time is not ordered. I forgot to specify that. I have made an edit in the table.
I just have an idea of how to do it, but I couldn't get it to work; maybe you can fix it. Any suggestion to correct my query is appreciated.
First, this selects the first row of each id:
SELECT min(id) id
FROM TableName t2
group by id
Then select the min(id) values that are not in the first query, which gives the second row,
like this:
SELECT min(id) id ,time
FROM TableName
WHERE id NOT IN (
SELECT min(id) id
FROM TableName
GROUP BY id
)
GROUP BY id
As I said, it's just a suggestion; it returns 0 rows for me. If you fix it, let me know so I can edit my post to make it helpful.
here a demo
SELECT ID, MAX(time) time
FROM
(
select ID, Time
from TableName a
where
(
select count(*)
from TableName as f
where f.ID = a.ID and f.time <= a.time
) <= 2
) s
GROUP BY ID
SQLFiddle Demo
SELECT x.*
FROM test x
JOIN test y
ON y.id = x.id
AND y.time >= x.time
GROUP
BY id,time
HAVING COUNT(*) = n;
Note that any ID with fewer than n rows will be omitted.
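A small sketch of the counting approach, run on the sample data from the question with SQLite via Python's sqlite3 module. It assumes that "2nd row" means the second-smallest time within each ID and that times are distinct within an ID; the table name `t` and the `nth_by_time` helper are made up for the example.

```python
import sqlite3

def nth_by_time(conn, n):
    """Return (id, time) for the row that has exactly n rows of the same id
    with time <= its own, i.e. the n-th smallest time per id."""
    sql = """
        SELECT a.id, a.time
        FROM t AS a
        WHERE (SELECT COUNT(*)
               FROM t AS f
               WHERE f.id = a.id AND f.time <= a.time) = ?
        ORDER BY a.id
    """
    return conn.execute(sql, (n,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, time INTEGER)")
conn.executemany(
    "INSERT INTO t (id, time) VALUES (?, ?)",
    [(1, 300), (1, 100), (1, 200), (2, 200), (2, 500)],
)

second = nth_by_time(conn, 2)
print(second)
```

IDs with fewer than n rows simply drop out, as with the HAVING COUNT(*) = n variant.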
You cannot do this with the tables that you have. You could make a valiant attempt with:
select id, time
from (select id, time
from t
group by t
) t
where not exists (select 1 from t t2 where t2.id = t.id and t2.time = t.time)
group by id
That is, attempt to filter out the first row.
The reason this is not possible is that tables are inherently unordered, so there is no real definition of "second" in your tables. This gives the SQL engine the opportunity to rearrange the rows as it sees fit during processing -- which can result in great performance gains.
Even the construct that you are using:
select id, time
from t
group by id
is not guaranteed to return time from the first row. This is a (mis)feature of MySQL called Hidden Columns. It is really only intended for the case where all the values are the same. I will admit that in practice it seems to get the value from the first row, but you cannot guarantee that.
Probably your best solution is to select the data into a new table that has an auto-incrementing column:
create table newtable (
autoid int auto_increment,
id int,
time int
);
insert into newtable(id, time)
select id, time from t;
In practice, this will probably keep the same order as the original table, and you can then use the autoid to get the second row. I want to emphasize, though, the "in practice". There is no guarantee that the values are in the correct order, but they probably will be.

retrieving multiple random rows from MySQL query result set - without using order by rand()

I have a query which aims to retrieve a random row from a result set. I do not want to use ORDER BY Rand() as it seems to be rather inefficient.
My method is as follows:
generate a single random number between [0,1)
give each row of the result query a unique 'rank' number. i.e. give the first row a value 1, second row a value 2, and so forth
use the random number to get a number between 1 and the number of rows in the result
return the row where rank == the number generated from the random number
example query:
SELECT * FROM(
(SELECT @rand := RAND(), @rank := 0) r1
CROSS JOIN
(SELECT (@rank:=@rank+1) as num, A.id FROM
A JOIN B
ON A.id = B.id
WHERE B.number = 42
) r2
)
WHERE num = FLOOR(1 + @rand * @rank) LIMIT 1
This works for retrieving one row, but I instead want 10 random rows. Changing LIMIT 1 to LIMIT 10 doesn't work, because if num + 10 > number of rows the query doesn't return 10 rows.
The only solution I can think of is to either generate 10 random numbers in the SQL query, check that they are all different from each other, and have several WHERE num = random_number_1 lines. Alternatively, I could call the query 10 times, checking that the rows selected are unique. I wouldn't know how to do the former, and the latter seems rather inefficient. Unless there is likely to be some wonderful cache that would make running the same query extremely fast?
Does anyone have any ideas? thank you
You could try the following:
select sq2.c1
from ( select *
from (select @count := 0) sq0
cross join
(select t1.c1, @count := @count+1
from t t1
join t t2
using(c1)
where t2.c2 = 42
) sq1
) sq2
-- use a probability to pick random rows
where if(@count <= 5, 1, floor(1 + rand() * (@count-1))) <= ceiling(log(pow(@count,2)))+1
limit 5;
The results will be random unless the result set is smaller (or the same size as) the limit. If this is a problem, you can wrap the whole thing:
select sq3.* from ( select ... limit 5 ) sq3
order by rand();
This will only randomize the small number of output rows (at most 5) which is efficient.
Of course, you can always use a temporary table:
create temporary table rset (row_key int auto_increment, key(row_key)) engine=myisam
as ( select .... where c2 = 42 );
set @count := (select count(*) from rset);
select rset.c1
from rset
where row_key in ( (floor(1 + rand() * (@count-1))),
(floor(1 + rand() * (@count-1))),
(floor(1 + rand() * (@count-1))),
(floor(1 + rand() * (@count-1))),
(floor(1 + rand() * (@count-1))) );
drop table rset;
If you want to guarantee that you get five unique rows, then you can use a second temporary table:
create temporary table row_keys ( row_key int not null primary key );
-- do this successfully five times; if you get a unique key error, try again
insert into row_keys values (floor(1 + rand() * (@count-1)));
select rset.c1
from rset
join row_keys
using(row_key);
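As a variation on the temporary-table approach, the distinct random keys can be drawn on the client side instead of retrying inserts. This is only a sketch under assumptions: it uses SQLite through Python's sqlite3 module instead of MySQL, the tables A and B are filled with made-up data, and `sample_rows` is a hypothetical helper. `random.sample` guarantees the chosen keys are all different.

```python
import random
import sqlite3

def sample_rows(conn, number, k, rng=random):
    """Number the matching rows 1..count in a temp table, then fetch the
    rows for up to k distinct random row keys."""
    conn.execute("DROP TABLE IF EXISTS temp.rset")
    conn.execute(
        "CREATE TEMP TABLE rset (row_key INTEGER PRIMARY KEY, id INTEGER)"
    )
    conn.execute(
        "INSERT INTO rset (id) "
        "SELECT A.id FROM A JOIN B ON A.id = B.id WHERE B.number = ?",
        (number,),
    )
    (count,) = conn.execute("SELECT COUNT(*) FROM rset").fetchone()
    if count == 0:
        return []
    # Distinct keys in 1..count; fewer than k if the result set is small.
    keys = rng.sample(range(1, count + 1), min(k, count))
    placeholders = ",".join("?" * len(keys))
    rows = conn.execute(
        f"SELECT id FROM rset WHERE row_key IN ({placeholders})", keys
    ).fetchall()
    return [r[0] for r in rows]

# Hypothetical data: 100 ids, the even ones have number = 42.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE B (id INTEGER PRIMARY KEY, number INTEGER)")
conn.executemany("INSERT INTO A (id) VALUES (?)", [(i,) for i in range(1, 101)])
conn.executemany(
    "INSERT INTO B (id, number) VALUES (?, ?)",
    [(i, 42 if i % 2 == 0 else 7) for i in range(1, 101)],
)

picked = sample_rows(conn, 42, 10)
print(picked)
```

The matching rows are still materialised once, as in the temporary-table answer, but only the sampled rows are read back, and no retry loop is needed to get unique rows.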