MySQL not using Index (despite FORCE INDEX) - mysql

I need to do a live search using PHP and jQuery to select cities and countries from the two tables cities (almost 3M rows) and countries (few hundred rows).
For a short moment I considered using a MyISAM table for cities because InnoDB does not support FULLTEXT search, but I decided against it (frequent table crashes, all other tables are InnoDB, and from MySQL 5.6 onward InnoDB supports FULLTEXT indexes too).
Right now I still use MySQL 5.1. Most city names are a single word, or two or three words at most, and for a name like "New York" most people will not search for "York" if they mean "New York". So I just put an index on the city_real column (which is a varchar).
The following query (I tried it in different versions: without any JOIN, without ORDER BY, with USE INDEX and even with FORCE INDEX; I have tried LIKE instead equal (=) but another post said = was faster and if the wildcard is only at the end, it is OK to use it) always shows "Using where; Using filesort" in EXPLAIN. The average time for the query is about 4 seconds, which is too slow for a live search (the user types in a text box and sees suggestions of cities and countries).
Live search (jQuery ajax) searches if the user typed at least 3 characters...
SELECT ci.id, ci.city_real, co.country_name FROM cities ci LEFT JOIN countries co ON(ci.country_id=co.country_id) WHERE city_real='cit%' ORDER BY population DESC LIMIT 5
There is a PRIMARY on ci.id and an INDEX on ci.city_real. Any ideas why MySQL does not use the index? Or how I could speed up the query? Or where else I should/should not set an INDEX?
Thank you very much in advance for your help!
Here's the explain output
id  select_type  table  type    possible_keys  key        key_len  ref                    rows  Extra
1   SIMPLE       ci     range   city_real      city_real  768      NULL                   1250  Using where; Using filesort
1   SIMPLE       co     eq_ref  PRIMARY        PRIMARY    6        fibsi_1.ci.country_id  1

You should use WHERE city_real LIKE 'cit%', not WHERE city_real='cit%'.
"I have tried LIKE instead equal (=) but another post said = was faster and if the wildcard is only at the end, it is OK to use it"
This is wrong. The = operator doesn't support wildcards, so = 'cit%' only matches a city literally named "cit%" and will give you the wrong results.
"Or how I could speed up the query?"
Make sure you have an index on country_id in both tables. Post the output of EXPLAIN SELECT ... if you need further help.
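As a minimal sketch of the two points above, the query from the question could be rewritten with LIKE as follows. The index name idx_cities_country_id is made up for illustration, and the ALTER TABLE is only needed if cities.country_id is not indexed yet (the EXPLAIN output in the question suggests the lookup into countries already uses its primary key).

```sql
-- Prefix search: LIKE treats % as a wildcard, = compares it literally.
SELECT ci.id, ci.city_real, co.country_name
FROM cities ci
LEFT JOIN countries co ON ci.country_id = co.country_id
WHERE ci.city_real LIKE 'cit%'
ORDER BY population DESC
LIMIT 5;

-- Assumption: hypothetical index name; skip this if the column is already indexed.
ALTER TABLE cities ADD INDEX idx_cities_country_id (country_id);
```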

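The query does use the index, as shown in the key column of the EXPLAIN output. "Using filesort" comes from the ORDER BY: the city_real index cannot return rows ordered by population, so MySQL sorts the matching rows in an extra pass. "Using where" just means the WHERE condition is applied to the rows that are read; it does not indicate a problem by itself.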

Related

SQL query: Speed up for huge tables

We have a table with about 25,000,000 rows called 'events' having the following schema:
TABLE events
- campaign_id : int(10)
- city : varchar(60)
- country_code : varchar(2)
The following query takes VERY long (> 2000 seconds):
SELECT COUNT(*) AS counted_events, country_code
FROM events
WHERE campaign_id IN (597)
GROUP BY city, country_code
ORDER BY counted_events
We found out that it's because of the GROUP BY part.
There is already an index idx_campaign_id_city_country_code on (campaign_id, city, country_code) which is used.
Maybe someone can suggest a good solution to speed it up?
Update:
EXPLAIN shows that, out of many possible indexes, MySQL uses 'idx_campaign_id_city_country_code'; for rows it shows '471304' and for Extra it shows 'Using where; Using temporary; Using filesort'.
Here is the whole result of EXPLAIN:
id: '1'
select_type: 'SIMPLE'
table: 'events'
type: 'ref'
possible_keys: 'index_campaign,idx_campaignid_paid,idx_city_country_code,idx_city_country_code_campaign_id,idx_cid,idx_campaign_id_city_country_code'
key: 'idx_campaign_id_city_country_code'
key_len: '4'
ref: 'const'
rows: '471304'
Extra: 'Using where; Using temporary; Using filesort'
UPDATE:
Ok, I think it has been solved:
Looking at the query pasted here again, I realized that I forgot to mention one more column in the SELECT, called 'country_name'. With country_name included the query was very slow; I'll just leave it out, and now the performance of the query is absolutely OK.
Sorry for that mistake!
Thank you for all your helpful comments, I'll upvote all the good answers! There were some really helpful suggestions that we will probably also apply (like changing types etc.).
Without seeing what EXPLAIN says this is a long shot, but anyway:
- make an index on (city, country_code)
- see if there's a way to use partitioning; your table is getting rather large
- if country_code is always 2 characters, change it to CHAR(2)
- change the numeric indexed columns to UNSIGNED INT
- post the entire EXPLAIN output
- don't use IN(); better use:
WHERE campaign_id = 597
OR campaign_id = 231
OR ....
AFAIK IN() is very slow.
Update: as nik0lias commented, IN() is faster than chaining OR conditions.
Some ideas:
- Given the nature and size of the table, it would be a great candidate for partitioning by country. This way the events of every country would be stored in a different physical table, even though it behaves as one big virtual table.
- Is country_code a string? Maybe you have a country_id that would be easier to sort. (It may force you to create or change indexes.)
- Are you really using the city in the GROUP BY?
- Partitioning, especially by country, will not help.
- column IN (const-list) is not slow; it is in fact a case with special optimization.
The problem is that MySQL doesn't use the index for sorting. I cannot say why, because it should. Could be a bug.
The best strategy to execute this query is to scan the sub-tree of the index where campaign_id=597. Since the index is then sorted by city, country_code, no extra sorting is needed and rows can be counted while scanning.
So the indexes are already optimal for this query. MySQL is just not using them correctly.
Update: I got more information offline. It seems this is not a database problem at all:
- The schema is not normalized. The table contains not only country_code, but also country_name (this should be in an extra table).
- The real query contains country_name in the select list. Since that column is not part of the index, MySQL cannot use an index-only scan.
As soon as country_name is dropped from the select list, the query reverts to an index-only scan ("Using index" in EXPLAIN output) and is blazingly fast.
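If the country name is still needed in the result, one possible approach is to aggregate on the indexed columns first and look up the name afterwards. The sketch below assumes a hypothetical lookup table countries(country_code, country_name), which is not part of the original schema.

```sql
-- Inner query: needs only campaign_id, city and country_code,
-- all of which are in idx_campaign_id_city_country_code.
-- Outer query: joins the (much smaller) aggregated result to the lookup table.
SELECT e.counted_events, e.country_code, c.country_name
FROM (
    SELECT COUNT(*) AS counted_events, city, country_code
    FROM events
    WHERE campaign_id = 597
    GROUP BY city, country_code
) AS e
JOIN countries AS c ON c.country_code = e.country_code  -- hypothetical table
ORDER BY e.counted_events;
```

Running EXPLAIN on the inner SELECT alone is a quick way to check whether "Using index" shows up for it.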

Mysql query takes long time

Hello, I have a table with 500k records and the following columns:
id, id_route, id_point, lat, lng, distance, status
I want to select the id_routes that lie within a radius of a point I define.
That's no problem:
SELECT id_route
FROM route_path
WHERE (((lat < 48.7210 + 2.0869) AND
(lat > 48.7210 - 2.0869)) AND
((lng < 21.2578 + 2.0869) AND
(lng > 21.2578 - 2.0869)))
GROUP BY id_route
But according to phpMyAdmin it takes 0.2 s. That is too much, since I am going to build a much larger query and this is just the beginning.
I also have an index on id_route.
The primary key is id, and the storage engine is MyISAM.
EXPLAIN of SELECT:
id  select_type  table       type  possible_keys  key   key_len  ref   rows    Extra
1   SIMPLE       route_path  ALL   NULL           NULL  NULL     NULL  506902  Using where; Using temporary; Using filesort
How can I reduce the time? I think 500k records is not so many that it should take this long. Thanks
If a query takes a long time and you have set up the indexes properly, then you need a more powerful server to compute the query quickly!
A 2-dimensional search is inherently slow. The tools won't tell you how to improve this particular query.
You seem to have no usable index on this table. You should at least try INDEX(lat). That will limit the effort to a stripe of about 4 degrees (in your example). This probably includes thousands of rows. Most of them are then eliminated by checking lng, but not until after fetching all of those thousands.
So, you are tempted to try INDEX(lat, lng) only to find that it ignores lng. And perhaps it runs slower because the index is bigger.
INDEX(lat, lng, id) and using a subquery to find the ids, then doing a self-join back to the table to do the rest of the work is perhaps the simplest semi-straightforward solution. This is slightly beneficial because that is a "covering index" for the subquery, and, although you scan thousands of rows in the index, you don't have to fetch many rows in the data.
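A rough sketch of that subquery plus self-join, using the bounding box from the question. The index name idx_lat_lng_id and the alias box are made up for illustration.

```sql
-- Assumption: hypothetical index name.
ALTER TABLE route_path ADD INDEX idx_lat_lng_id (lat, lng, id);

SELECT rp.id_route
FROM (
    -- lat, lng and id are all in the index, so this part can be
    -- answered from the index without fetching table rows.
    SELECT id
    FROM route_path
    WHERE lat > 48.7210 - 2.0869 AND lat < 48.7210 + 2.0869
      AND lng > 21.2578 - 2.0869 AND lng < 21.2578 + 2.0869
) AS box
JOIN route_path AS rp ON rp.id = box.id  -- self-join back by primary key
GROUP BY rp.id_route;
```

The self-join pays off mainly when the outer query needs columns that are not in the index; only the rows that survive the bounding-box filter are fetched from the table.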
Can it be made faster? Yes. However, the complexity is beyond the space available here. See Find the nearest 10 pizza parlors. It involves InnoDB (to get index clustering), PARTITIONs (as crude 2D indexing) and modifications to the original data (to turn lat/lng into integers for PARTITION keys).
See the following tools to learn how to improve MySQL performance:
MySQL Query Analyzer
MySQL performance tools

Best way to use indexes on large mysql like query

This MySQL query is run on a large (about 200,000 records, 41 columns) MyISAM table:
select t1.*
from table t1
where 1
and t1.inactive = '0'
and (t1.code like '%searchtext%' or t1.name like '%searchtext%' or t1.ext like '%searchtext%')
order by t1.id desc
LIMIT 0, 15
id is the primary key.
I tried adding a multiple-column index on all 3 searched (LIKE) columns. It works OK, but the results feed an auto-filled AJAX table on a website, and the 2-second delay is a bit too slow.
I also tried adding separate indexes on all 3 columns and a fulltext index on all 3 columns, without significant improvement.
What would be the best way to optimize this type of query? I would like to get under 1 second; is it doable?
The best thing you can do is implement paging. No matter what you do, that IO cost is going to be huge. If you only return one page of records (10, 25 or whatever), that will help a lot.
As for the index, you need to check the plan to see if your index is actually being used. A full-text index might help, but that depends on how many rows you return and what you pass in. Wildcards such as % really drain performance. An index can still be used if the pattern ends with % but not if it starts with %. If you put % on both sides of the text you are searching for, indexes can't help much.
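To check the point about leading and trailing wildcards, the two patterns can be compared with EXPLAIN. This is only a sketch: the index name idx_name is hypothetical, and `table` is the placeholder name from the question (quoted with backticks because TABLE is a reserved word).

```sql
-- Assumption: hypothetical single-column index on name.
ALTER TABLE `table` ADD INDEX idx_name (name);

-- Trailing wildcard only: the pattern has a fixed prefix,
-- so the index on name can be used for a range scan.
EXPLAIN SELECT t1.id FROM `table` t1 WHERE t1.name LIKE 'searchtext%';

-- Leading wildcard: no prefix to search on, so expect a scan.
EXPLAIN SELECT t1.id FROM `table` t1 WHERE t1.name LIKE '%searchtext%';
```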
You can create a full-text index that covers the three columns: code, name, and ext. Then perform a full-text query using the MATCH() AGAINST () function:
select t1.*
from table t1
where match(code, name, ext) against ('searchtext')
order by t1.id desc
limit 0, 15
If you omit the ORDER BY clause the rows are sorted by default using the MATCH function result relevance value. For more information read the Full-Text Search Functions documentation.
As @Vulcronos notes, the query optimizer is not able to use the index when the LIKE operator is used with a pattern that starts with the wildcard %.
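For completeness, here is a sketch of how the full-text index for the MATCH() query above could be created. The index name ft_code_name_ext is made up, and `table` is again the placeholder name from the question. The column list in MATCH() has to be the same as the column list of the FULLTEXT index.

```sql
-- Assumption: hypothetical index name.
ALTER TABLE `table` ADD FULLTEXT INDEX ft_code_name_ext (code, name, ext);

-- Boolean mode with a trailing * matches words that start with the search text.
SELECT t1.*
FROM `table` t1
WHERE MATCH(code, name, ext) AGAINST ('searchtext*' IN BOOLEAN MODE)
  AND t1.inactive = '0'
ORDER BY t1.id DESC
LIMIT 0, 15;
```

Note that full-text search matches whole words (or word prefixes with *), not arbitrary substrings, so it is not an exact replacement for LIKE '%searchtext%'. On MyISAM, words shorter than ft_min_word_len (4 by default) are not indexed at all.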

MySQL join query not using Indexes?

I have this query:
SELECT
COUNT(*) AS `numrows`
FROM (`tbl_A`)
JOIN `tbl_B` ON `tbl_A`.`B_id` = `tbl_B`.`id`
WHERE
`tbl_B`.`boolean_value` <> 1;
I added three indexes, on tbl_A.B_id, tbl_B.id and tbl_B.boolean_value, but MySQL still says it doesn't use indexes (in the queries-not-using-indexes log) and it examines the whole tables to retrieve the result.
I need to know what I should do to optimize this query.
EDIT:
Explain output:
id  select_type  table  type  possible_keys          key   key_len  ref       rows  Extra
1   SIMPLE       tbl_B  ALL   PRIMARY,boolean_value  NULL  NULL     NULL      5049  Using where
1   SIMPLE       tbl_A  ref   B_id                   B_id  9        tbl_B.id  9     Using where; Using index
The EXPLAIN shows that an index (B_id) is used to make the join to tbl_A, but no index is used to filter tbl_B on the boolean value.
An index was available but the engine chose not to use it. Possible reasons:
- Maybe 5049 rows is not a big deal, and the engine estimated that filtering out something like 10% of the rows via the index would be no faster than doing it without the index.
- Booleans take only 3 values: 1, 0 or NULL. So the cardinality of the index will always be very low (3 at most). Low-cardinality indexes are usually ignored by the query optimizer, which is usually right in assuming such an index won't help much.
It would be interesting to see if the optimizer behaves the same way when you have a 50/50 distribution of true and false values for this boolean, or when you have just a few false values.
Usually, boolean columns are useful only in indexes containing multiple columns, so that if your queries use all the columns of the index (in WHERE or ORDER BY), the optimizer will consider the index worthwhile.
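As an illustration of the multi-column idea, the query in the question only needs boolean_value and id from tbl_B, so an index containing both might let MySQL read tbl_B from the index alone. This is a sketch, not a guaranteed improvement: the index name idx_bool_id is made up, and whether the optimizer picks the index depends on the data distribution.

```sql
-- Assumption: hypothetical index name.
ALTER TABLE tbl_B ADD INDEX idx_bool_id (boolean_value, id);

-- Re-check the plan; look at the key and Extra columns for tbl_B.
EXPLAIN
SELECT COUNT(*) AS numrows
FROM tbl_A
JOIN tbl_B ON tbl_A.B_id = tbl_B.id
WHERE tbl_B.boolean_value <> 1;
```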
Note that indexes slow down your writes and take extra space, so do not add useless indexes. Using log-queries-not-using-indexes is a good thing, but you should weigh that log information against the slow query log. If the query is fast, it's not a problem.
If boolean_value really is a boolean, indexing it is not such a good idea; the index wouldn't be effective.

MySQL Indices and Order By Clause

Say I have a simple query like this:
SELECT * FROM topics ORDER BY last_post_id DESC
According to MySQL's ORDER BY Optimization docs, an index on last_post_id should be utilized. But EXPLAINing the query shows the contrary:
id  select_type  table       type  possible_keys  key   key_len  ref   rows  Extra
1   SIMPLE       topic_info  ALL   NULL           NULL  NULL     NULL  13    Using filesort
Why is my index not being utilized?
Do you really only have 13 rows? The database may be deciding that simply sorting them is quicker than going via the index.
A basic principle of index optimization is to use representative data for testing. A few rows in a table has no predictive value about how either the index or the optimizer will work in real life.
If you really have only a few records, then the indexing will provide no effective benefit.
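One way to test this on representative data is to compare the plan with a LIMIT and with the index forced. This is a sketch only: the index name idx_last_post_id is hypothetical, and the ALTER TABLE should be skipped if an index on last_post_id already exists.

```sql
-- Assumption: hypothetical index name; skip if the index already exists.
ALTER TABLE topics ADD INDEX idx_last_post_id (last_post_id);

-- With a LIMIT, reading in index order can avoid sorting the whole table.
EXPLAIN SELECT * FROM topics ORDER BY last_post_id DESC LIMIT 10;

-- Forcing the index shows what the plan looks like when it is used.
EXPLAIN SELECT * FROM topics FORCE INDEX (idx_last_post_id)
ORDER BY last_post_id DESC;
```

If "Using filesort" disappears from the Extra column in either plan, the index is being used for the ordering.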
With no WHERE clause, it's expected that the query has to examine all 13 rows, so type ALL and rows 13 are not surprising in themselves. The sort is reported in the Extra column: "Using filesort" means MySQL sorts the rows in an extra pass instead of reading them in index order.
For a SELECT * without a LIMIT, the optimizer can decide that a full scan plus a sort is cheaper than walking the index and looking up every row, so don't get too caught up in this particular EXPLAIN result.
Try selecting the specific columns in order as they are in the table. MySQL indexes don't hold when the order is changed.