MYSQL like query does not show full table scan in explain

MYSQL like query does not show full table scan in explain - mysql

I have a query running in Mysql like so (obfuscated names)
explain
select this_.id as id1_0_,
this_.column1 as column1,
this_.column2 as column2,
this_.column3 as column3,
this_.column4 as column4,
this_.column5 as column5,
from
tablename this_
where
this_.column1 like '/blah%'
and this_.column2 = 'a9b51a14-4338-94f7-f23dbf9d539e'
and this_.column3 <> 'DUH'
and this_.column4=0
and this_.column5 like '%somename%'
order by this_.created desc
limit 20
Edit: column1 has a BTREE index, column2, column3, column 4, column5, created all have HASH indexes.
Table has one foreign key which is selected in the select clause but not the WHERE clause.
I'm told and read that the
like %somename%
will result in a full table scan. However, when i ran the explain, the output of the explain is
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE this_ ref somecolumnandinexnames 111 const 30414 Using where
The explain output looks exactly the same if I take away the like clause.
Based on this, we decided to put the query in production only to discover that in practice the query with like took way longer to execute (some seconds compared to some millis without the like).
Is there an explanation as to why the explain didn't warn me beforehand about this?
Edit: Observations
Taking away the order by makes the query go fast again even with the LIKE still in there.
Splitting into a subquery with the like in the outer query as mentioned below in the post actually works!
As #Uueerdo says, moving the rest of the conditions into a subquery actually speeds up performance! I'm therefore tempted to conclude that one of the things that could be happening is that the WHERE clause with the like is executed before the other conditions leading to a large resultset. As #Uueerdo says, moving the rest of the conditions into a subquery actually speeds up performance! I'm therefore tempted to conclude that one of the things that could be happening is that the WHERE clause with the like is executed before the other conditions leading to a large resultset. However, I still don't have an explanation on why removing the order by speeds up performance. The query selects all of 10 rows so the order by should be quite fast.
Is there a way I can see the order in which MYSQL evaluates the query. I think I remember seeing some sort of a graphical representation in MS SQL Server explain plans once. Don't remember if it was quite the same.

Even if it does not require a table scan it can still be expensive; what is likely happening is MySQL is using other conditions in your where for initial candidate row selection, and then reducing those results with the rest of the conditions.
If there are a large number of candidates, and/or column5 values are long, that condition could take some time to evaluate. Keep in mind the LIMIT occurs after the WHERE, so that does not reduce the amount of work needed.
You might see some improvement if you put most of the query in a subquery, and filter it's results by the like '%somename%' condition in an outer query.
Something like:
SELECT * FROM (
SELECT t.id as id1_0_
, t.column1, t.column2, t.column3, t.column4, t.column5
, t.created
FROM tablename AS t
WHERE t.column2 = 'a9b51a14-4338-94f7-f23dbf9d539e'
AND t.column3 <> 'DUH'
AND t.column4=0
AND t.column1 like '/blah%'
) AS subQ
WHERE subQ.column5 like '%somename%'
ORDER BY subQ.created DESC
LIMIT 20

If you have an index that includes all of the columns being read (in the SELECT or WHERE clause), MySQL will be able to read all of these values out of the index without scanning the table.
Also note that all primary key columns will be in the index as well.
In that case, even though it isn't scanning the full table, it will be scanning every row in the index in order to handle the LIKE '%somename%' query, so it's not going to be especially more efficient.

Related

query timeout because of one always valid primary key condition in select mysql

I have a 40M record table having id as the primary key. I execute a select statement as follows:
select * from messages where (some condition) order by id desc limit 20;
It is ok and the query executes in a reasonable time. But when I add an always valid condition as follows, It takes a huge time.
select * from messages where id > 0 and (some condition) order by id desc limit 20;
I guess it is a bug and makes MySQL search from the top side of the table instead of the bottom side. If there is any other justification or optimization it would be great a help.
p.s. with a high probability, the results are found in the last 10% records of my table.
p.p.s. the some condition is like where col1 = x1 and col2 = x2 where col1 and col2 are indexed.

MySQL has to choose whether to use an index to process the WHERE clause, or use an index to control ORDER BY ... LIMIT ....
In the first query, the WHERE clause can't make effective use of an index, so it prefers to use the primary key index to optimize scanning in order by ORDER BY. In this case it stops when it finds 20 results that satisfy the WHERE condition.
In the second query, the id > 0 condition in the WHERE clause can make use of the index, so it prefers to use that instead of using the index for ORDER BY. In this case, it has to find all the results that match the WHERE condition, and then sort them by id.
I wouldn't really call this a bug, as there's no specification of precisely how a query should be optimized. It's not always easy for the query planner to determine the best way to make use of indexes. Using the index to filter the rows using WHERE id > x could be better if there aren't many rows that match that condition.

A query like this
select *
from messages
where col1 = x1
and col2 = x2
order by id desc
limit 20;
is best handled by a 'composite' index with the tests for '=' first:
INDEX(col1, col2, id)
Try it, I think it will be faster than either of the queries you are working with.

It’s not a bug. You are searching through a 40 million table where your where clause doesn’t have an index. Add an index on the column in your where clause and you will see substantial improvement.

SQL gets slow on a simple query with ORDER BY

I have problem with MySQL ORDER BY, it slows down query and I really don't know why, my query was a little more complex so I simplified it to a light query with no joins, but it stills works really slow.
Query:
SELECT
W.`oid`
FROM
`z_web_dok` AS W
WHERE
W.`sent_eRacun` = 1 AND W.`status` IN(8, 9) AND W.`Drzava` = 'BiH'
ORDER BY W.`oid` ASC
LIMIT 0, 10
The table has 946,566 rows, with memory taking 500 MB, those fields I selecting are all indexed as follow:
oid - INT PRIMARY KEY AUTOINCREMENT
status - INT INDEXED
sent_eRacun - TINYINT INDEXED
Drzava - VARCHAR(3) INDEXED
I am posting screenshoots of explain query first:
The next is the query executed to database:
And this is speed after I remove ORDER BY.
I have also tried sorting with DATETIME field which is also indexed, but I get same slow query as with ordering with primary key, this started from today, usually it was fast and light always.
What can cause something like this?

The kind of query you use here calls for a composite covering index. This one should handle your query very well.
CREATE INDEX someName ON z_web_dok (Drzava, sent_eRacun, status, oid);
Why does this work? You're looking for equality matches on the first three columns, and sorting on the fourth column. The query planner will use this index to satisfy the entire query. It can random-access the index to find the first row matching your query, then scan through the index in order to get the rows it needs.
Pro tip: Indexes on single columns are generally harmful to performance unless they happen to match the requirements of particular queries in your application, or are used for primary or foreign keys. You generally choose your indexes to match your most active, or your slowest, queries. Edit You asked whether it's better to create specific indexes for each query in your application. The answer is yes.

There may be an even faster way. (Or it may not be any faster.)
The IN(8, 9) gets in the way of easily handling the WHERE..ORDER BY..LIMIT completely efficiently. The possible solution is to treat that as OR, then convert to UNION and do some tricks with the LIMIT, especially if you might also be using OFFSET.
( SELECT ... WHERE .. = 8 AND ... ORDER BY oid LIMIT 10 )
UNION ALL
( SELECT ... WHERE .. = 9 AND ... ORDER BY oid LIMIT 10 )
ORDER BY oid LIMIT 10
This will allow the covering index described by OJones to be fully used in each of the subqueries. Furthermore, each will provide up to 10 rows without any temp table or filesort. Then the outer part will sort up to 20 rows and deliver the 'correct' 10.
For OFFSET, see http://mysql.rjweb.org/doc.php/index_cookbook_mysql#or

select count() taking considerably longer than select for same "where" clause?

I am finding a select count(*) is taking considerably longer than select * for the queries with the same where clause.
The table in question has about 2.2 million records (call it detailtable). It has a foreign key field linking to another table (maintable).
This query takes about 10-15 seconds:
select count(*) from detailtable where maintableid = 999
But this takes a second or less:
select * from detailtable where maintableid = 999
UPDATE - It was asked to specify the number of records involved. It is 150.
UPDATE 2 Here is information when the EXPLAIN keyword is used.
For the SELECT COUNT(*), The EXTRA column reports:
Using where; Using index
KEY and POSSIBLE KEYS both have the foreign key constraint as their value.
For the SELECT * query, everything is the same except EXTRA just says:
Using Where
UPDATE 3 Tried OPTIMIZE TABLE and it still does not make a difference.

For sure
select count(*)
should be faster than
select *
count(*), count(field), count(primary key), count(any) are all the same.
Your explain clearly stateas that the optimizer somehow uses the index for count(*) and not for the other making the foreign key the main reason for the delay.
Eliminate the foreign key.

Try
select count(PRIKEYFIELD) from detailtable where maintableid = 999
count(*) will get all data from the table, then count the rows meaning it has more work to do.
Using the primary key field means it's using its index, and should run faster.

Thread Necro!
Crazy idea... In some cases, depending on the query planner and the table size, etc, etc., it is possible for using an index to actually be slower than not using one. So if you get your count without using an index, in some cases, it could actually be faster.
Try this:
SELECT count(*)
FROM detailtable
USING INDEX ()
WHERE maintableid = 999

SELECT count(*)
with that syntax alone is no problem, you can do that to any table.
The main issue on your scenario is the proper use of INDEX and applying [WHERE] clause on your search.
Try to reconfigure your index if you have the chance.
If the table is too big, yes it may take time. Try to check MyISAM locking article.

As the table is 2.2 million records, count can take time. As technically, MySQL should find the records and then count them. This is an extra operation that becomes significant with millions of records. The only way to make it faster is to cache the result in another table and update it behind the scenes.

Or simply Try
SELECT count(1) FROM table_name WHERE _condition;
SELECT count('x') FROM table_name WHERE _condition;

Best way to use indexes on large mysql like query

This mysql query is runned on a large (about 200 000 records, 41 columns) myisam table :
select t1.* from table t1 where 1 and t1.inactive = '0' and (t1.code like '%searchtext%' or t1.name like '%searchtext%' or t1.ext like '%searchtext%' ) order by t1.id desc LIMIT 0, 15
id is the primary index.
I tried adding a multiple column index on all 3 searched (like) columns. works ok but results are served on a auto filled ajax table on a website and the 2 seond return delay is a bit too slow.
I also tried adding seperate indexes on all 3 columns and a fulltext index on all 3 columns without significant improvement.
What would be the best way to optimize this type of query? I would like to achieve under 1 sec performance, is it doable?

The best thing you can do is implement paging. No matter what you do, that IO cost is going to be huge. If you only return one page of records, 10/25/ or whatever that will help a lot.
As for the index, you need to check the plan to see if your index is actually being used. A full text index might help but that depends on how many rows you return and what you pass in. Using parameters such as % really drain performance. You can still use an index if it ends with % but not starts with %. If you put % on both sides of the text you are searching for, indexes can't help too much.

You can create a full-text index that covers the three columns: code, name, and ext. Then perform a full-text query using the MATCH() AGAINST () function:
select t1.*
from table t1
where match(code, name, ext) against ('searchtext')
order by t1.id desc
limit 0, 15
If you omit the ORDER BY clause the rows are sorted by default using the MATCH function result relevance value. For more information read the Full-Text Search Functions documentation.
As #Vulcronos notes, the query optimizer is not able to use the index when the LIKE operator is used with an expression that starts with a wildcard %.

Removing 'using filesort' from query

I have the following query:
SELECT *
FROM shop_user_member_spots
WHERE delete_flag = 0
ORDER BY date_spotted desc
LIMIT 10
When run, it takes a few minutes. The table is around 2.5 million rows.
Here is the table (not designed by me but I am able to make some changes):
And finally, here are the indexes (again, not made by me!):
I've been attempting to get this query running fast for hours now, to no avail.
Here is the output of EXPLAIN:
Any help / articles are much appreciated.

Based on your query, it seems the index you would want would be on (delete_flag, date_spotted). You have an index that has the two columns, but the id column is in between them, which would make the index unhelpful in sorting based on date_spotted. Now whether mysql will use the index based on Zohaib's answer I can't say (sorry, I work most often with SQL Server).

The problem that I see in the explain plan is that the index on spotted date is not being used, insted filesort mechanism is being used to sort (as far as index on delete flag is concerned, we actually gain performance benefit of index if the column on which index is being created contains unique values)
the mysql documentation says
Index will not used for order by clause if
The key used to fetch the rows is not the same as the one used in the ORDER BY:
SELECT * FROM t1 WHERE key2=constant ORDER BY key1;
http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html
I guess same is the case here. Although you can try using Force Index

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008