I have the following problem when running a MySQL query:
The query is very slow, and when I use EXPLAIN the key is NULL even though possible_keys are available and the order is correct. I also tried adding independent indexes for each column, but the key was still NULL.
You can see the table, the indexes, and the MySQL EXPLAIN output here: https://snag.gy/vcChl6.jpg
The optimizer has likely just decided that there is no reason to use the index.
Since you are using SELECT *, if the optimizer used the index it would have to take the primary key from the index and then go back and look up all the remaining columns in the clustered index. That is referred to as a double lookup, and it is generally bad for performance. Because there are so few records in this table, the optimizer likely decided that a full table scan is cheaper and gets your result faster.
In short, this is expected behavior.
If you want to SELECT just some columns, add them to the t1 index and then SELECT only the columns you need with that same WHERE clause. The index alone can then satisfy the query (a covering index), so it should be used. As your table grows in size, the optimizer may also start using the index for the original query, once it estimates that the double lookup is cheaper than the full table scan.
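A minimal sketch of that approach, assuming the composite index t1 (id_project, id_lang) from the screenshot; the table name and the extra column title are hypothetical, since the real schema is only shown in the linked image:

-- extend the composite index so it also contains the columns you select
ALTER TABLE your_table DROP INDEX t1;
ALTER TABLE your_table ADD INDEX t1 (id_project, id_lang, title);

-- every referenced column is in the index, so no lookup into the clustered
-- index is needed (EXPLAIN shows "Using index")
SELECT id_project, id_lang, title
FROM your_table
WHERE id_project = 1 AND id_lang = 2;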
A guess: Most rows are of that 'project' and that 'lang'.
The Optimizer does not understand that fact, so it takes the index that is obviously the best:
(id_project, id_lang)
This one would be equally good: (id_lang, id_project).
No fair... The EXPLAIN mentions indexes named id_project and id_lang (not useful), but the list of indexes shows a composite index t1(id_project, id_lang) (useful).
Then, as Willem suggests, it has to bounce between the index and the table. Normally (that is, when it has adequate statistics), the Optimizer will say "Oh, more than ~20% of the table is being referenced; let's ignore any index."
Things you can do:
Get rid of that index (a sketch follows this list).
Change * to a list of just the columns you need. In particular, if you avoid the 3 TEXT columns, two optimizations kick in. Alternatively, any TEXT column that will never be longer than 255 characters can be changed to VARCHAR(255).
Use some other filtering, ordering, limiting, etc. If this is a web application, do you really want to get ~534 rows?
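One reading of the first point, together with the VARCHAR(255) alternative from the second, as a hedged sketch (the index name comes from the EXPLAIN output; the table name and the description column are hypothetical):

-- the composite t1 (id_project, id_lang) already handles lookups on id_project alone,
-- so a single-column id_project index is redundant
ALTER TABLE your_table DROP INDEX id_project;

-- a TEXT column that never exceeds 255 characters can become VARCHAR(255)
ALTER TABLE your_table MODIFY description VARCHAR(255);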
Related
I would like to make a system which allows searching user messages by a specific user.
Assume the following table:
CREATE TABLE messages (
    user_id INT,
    message NVARCHAR(500)
);
So what kind of index should I use here if I want to search for all messages from user 1 containing the word 'foo'?
1. A simple, non-unique index on user_id
It will narrow the search to the specific user's messages and then do a full scan for the specific word.
2. A FULLTEXT index on message
This will find the matching messages from all users and then filter them by ID, which seems very inefficient with a large number of users.
3. A compound index on both user_id and message
A fulltext index tree would be built for each user separately, so they could be searched individually. During the query the system would filter messages by ID and then perform the text search on the remaining rows in the index.
AFAIK the last one is impossible, so I assume I should use the first option. Will it perform better with a few thousand users?
And if each user has ~100 messages, will the full iteration not cost much in resources?
Perhaps I could include the username in the message and use BOOLEAN full-text search mode, but I think that would be slower than using the indexed user_id.
@Alden Quimby's answer is correct as far as it goes, but there is more to the story: MySQL will only try to choose the optimal index, and its ability to make that determination is limited by the way fulltext indexes interact with the optimizer.
What actually happens is this:
If the specified user_id exists in either 0 or 1 matching rows in the table, the optimizer will realize this and will choose user_id as the index for that query. Fast execution.
Otherwise, the optimizer will choose the fulltext index, filtering every row matched by the fulltext index to eliminate rows not containing a user_id that matches the WHERE clause. Not quite as fast.
So it's not truly the "optimum" path. It's more like fulltext, with a nice optimization to avoid the fulltext search under the one condition that we know we have almost nothing of interest in the table.
The reason this breaks down is that a fulltext index doesn't give any meaningful statistics back to the optimizer. It just says "yeah, I think that query should probably only require me to check 1 row" ... which, of course, pleases the optimizer greatly, so the fulltext index wins the bid for lowest cost, unless the index with the integer value also comes in comparably low or lower.
Still, that doesn't mean I wouldn't try it this way first.
There's another option, which works best with fulltext queries IN BOOLEAN MODE: create another column, populate it with something like CONCAT('user_id_', user_id), and then declare a 2-column fulltext index.
filter_string VARCHAR(48) # populated with CONCAT('user_id_',user_id);
....
FULLTEXT KEY (message,filter_string)
Then specify everything in the query.
SELECT ...
WHERE user_id = 500 AND
MATCH (message,filter_string) AGAINST ('+kittens +puppies +user_id_500' IN BOOLEAN MODE);
Now the fulltext index will be responsible for matching only those rows where kittens, puppies, and "user_id_500" all appear in the combined fulltext index of the two columns, but you'd still want the integer filter there too, to make sure the final results are constrained in spite of any random appearance of "user_id_500" in the message text.
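Pulling it together, a self-contained sketch of this approach; the id column, the index names, and the sample data are assumptions, and the application (or a trigger) would keep filter_string in sync with user_id:

CREATE TABLE messages (
    id            INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    user_id       INT NOT NULL,
    message       VARCHAR(500),
    filter_string VARCHAR(48),             -- filled with CONCAT('user_id_', user_id)
    KEY (user_id),
    FULLTEXT KEY (message, filter_string)
) ENGINE=MyISAM;                           -- InnoDB also supports FULLTEXT from MySQL 5.6 on

INSERT INTO messages (user_id, message, filter_string)
VALUES (500, 'kittens and puppies', CONCAT('user_id_', 500));

SELECT *
FROM messages
WHERE user_id = 500
  AND MATCH (message, filter_string)
      AGAINST ('+kittens +puppies +user_id_500' IN BOOLEAN MODE);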
You should add a fulltext index on message and a regular index on user_id, and use the query:
SELECT *
FROM messages
WHERE MATCH(message) AGAINST(#search_query)
AND user_id = #user_id;
You're right that you can't do option 3. But rather than trying to pick between 1 and 2, let MySQL do the work for you. MySQL will only use one of the two indexes, and will do a linear scan to complete the second filter, but it will estimate the effectiveness of each index and choose the optimal one.
Note: only do this if you can afford the overhead of two indexes (slower insert/update/delete). Also, if you know that each user will only have a few messages, then yes it might make sense to use a simple index and do a regex in the application layer or something like that.
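For reference, a sketch of the two indexes this answer recommends, assuming the messages table from the question (the index names are arbitrary; FULLTEXT needs MyISAM, or InnoDB from MySQL 5.6 on):

ALTER TABLE messages ADD INDEX idx_user_id (user_id);
ALTER TABLE messages ADD FULLTEXT INDEX ft_message (message);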
Turn on the "Optimizer trace" and look for "considered_execution_plans". I contend that the Optimizer will always pick the FULLTEXT index, even when some other index might be better. This may be because evaluating MATCH is quite costly when it is not pre-computed, as it is when the FT index is built.
More on Optimizer Trace: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#optimizer_trace (Earlier in that doc are my tips on FULLTEXT.)
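A minimal sketch of using the trace on MySQL 5.6+, assuming the messages table above:

SET optimizer_trace = 'enabled=on';

SELECT * FROM messages
WHERE MATCH(message) AGAINST('foo')
  AND user_id = 1;

-- the trace is JSON; search it for "considered_execution_plans"
SELECT trace FROM information_schema.OPTIMIZER_TRACE;

SET optimizer_trace = 'enabled=off';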
I have a fairly simple mysql query which contains a few inner join and then a where clause. I have created indexes for all the columns that are used in the joins as well as the primary keys. I also have a where clause which contains an IN operator. When only 5 or less ids are passed into the IN clause the query optimizer uses one of my indexes to run the query in a reasonable amount of time. When I use explain I see that the type is range and key is PRIMARY. My issue is that if I use more than 5 ids in the IN clause, the optimizer ignores all the available indexes and query runs extremely slow. When I use explain I see that the type is ALL and the key is NULL.
Could someone please shed some light on what is happening here and how I could fix it?
Thanks
In addition to the "primary key" indexes on the tables that optimize the JOINs, you should also have an index based on the criteria you commonly apply in the WHERE clause. More information on the columns in your query would be needed, but you should have an index covering your WHERE criteria too.
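As a hedged sketch (the real table and column names are not shown in the question), that means something along these lines:

-- hypothetical names: index whatever the WHERE clause actually filters on
ALTER TABLE main_table ADD INDEX idx_where_criteria (status, created_at);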
You could also try using MySQL index hints, which let you specify which index should be used during query execution.
Examples:
SELECT * FROM table1 USE INDEX (col1_index,col2_index)
WHERE col1=1 AND col2=2 AND col3=3;
-
SELECT * FROM table1 IGNORE INDEX (col3_index)
WHERE col1=1 AND col2=2 AND col3=3;
More information here:
MySQL Index Hints
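For completeness, MySQL also supports FORCE INDEX, the strongest of the hints: a table scan is then assumed to be very expensive, so the named index is used whenever possible. A sketch in the same style as the examples above:

SELECT * FROM table1 FORCE INDEX (col1_index)
WHERE col1=1 AND col2=2 AND col3=3;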
Found this while checking up on a similar problem I am having. Thought my findings might help anyone else with a similar issue in future.
I have a MyISAM table with about 30 rows (it contains common typos of similar words for a search where both the possible original typo and the alternative word may be valid spellings; the table will slowly grow in size). For me the cutoff is that with 4 items in the IN clause the index is used, but with 5 items in the IN clause the index is ignored (note I haven't tried alternative words, so the actual items in the IN clause might be a factor). So this is similar to the OP, but with a different number of items.
USE INDEX would not work; the index would still be ignored. FORCE INDEX does work, although I would prefer to avoid specifying indexes (just in case someone deletes the index).
For some testing I padded the table out with an extra 1000 random unique rows, and the query would then use the relevant index even with 80 items in the IN clause.
So it seems MySQL decides whether to use the index based on the number of items in the IN clause relative to the number of rows in the table (with some other factors probably at play too).
Say I had a query like the one below:
SELECT
name,category,address,city,state
FROM
table
WHERE
MATCH(name,subcategory,category,tag1) AGAINST('education')
AND
city='Oakland'
AND
state='CA'
LIMIT
0, 10;
...and I had a fulltext index on (name, subcategory, category, tag1) and a composite index on (city, state); is this good enough for this query? I'm just wondering if something extra is needed when mixing additional ANDs with the MATCH/AGAINST that uses the fulltext index.
Edit: What I am trying to understand is, what happens with the additional columns that are within the query but are not indexed in the chosen index (the fulltext index), the above example being city and state. How does MySQL now find the matching rows for these since it can't use two indexes (or can it?) - so, basically, I'm trying to understand how MySQL goes about finding the data optimally for the columns NOT in the chosen fulltext index and if there is anything I can or should do to optimize the query.
If I understand your question, you know that MATCH AGAINST uses your FULLTEXT index, and you're wondering how MySQL goes about applying the rest of the WHERE clause (i.e. does it do a table scan or an indexed lookup).
Here's what I'm assuming about your table: it has a PRIMARY KEY on some id column and the FULLTEXT index.
So first off, MySQL will never use the FULLTEXT index for the city/state WHERE clause. Why? Because FULLTEXT indexes only apply with MATCH AGAINST. See here in the paragraph after the first set of bullets (not the Table of Contents bullets).
EDIT: In your case, assuming your table doesn't only have like 10 rows, MySQL will apply the FULLTEXT index for your MATCH AGAINST, then do a tablescan on those results to apply the city/state WHERE.
So what if you add a BTREE index onto city and state?
CREATE INDEX city__state ON table (city(10),state(2)) USING BTREE;
Well, MySQL can only use one index for this query since it's a simple select. It will either use the FULLTEXT or the BTREE. Note that when I say one index, I mean one index definition, not one column in a multi-part index. Anyway, this then begs the question: which one does it use?
That depends on the table analysis. MySQL will attempt to estimate (based on table stats from the last OPTIMIZE TABLE) which index will prune the most records. If the city/state WHERE gets you down to 10 records while the MATCH AGAINST only gets you down to 100, then MySQL will use the city__state index first for the city/state WHERE and then do a tablescan for the MATCH AGAINST.
On the other hand, if the MATCH_AGAINST gets you down to 10 records while the city/state WHERE gets you down to only a 1000, then MySQL will apply the FULLTEXT index first and tablescan for city and state.
The bottom line is the cardinality of your index. Essentially, how unique are the values that will go into your index? If every record in your table has city set to Oakland, then it's not a very unique key and so having city = 'Oakland' doesn't really reduce the number of records all that much for you. In that case, we say your city__state index has a low cardinality.
Consequently if 90% of the words in your FULLTEXT index are "John", then that doesn't really help you much either for the exact same reasons.
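If you want to see what the optimizer currently thinks the cardinality is, SHOW INDEX reports it. A quick sketch; the backticks are needed because the question's table is literally named table, and ANALYZE TABLE refreshes the statistics if they look stale:

SHOW INDEX FROM `table`;    -- the Cardinality column is the estimated number of distinct values
ANALYZE TABLE `table`;      -- recompute the key distribution statistics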
If you can afford the space and the UPDATE/DELETE/INSERT overhead, I would recommend adding the BTREE index and letting MySQL decide which index he wants to use. In my experience, he usually does a very good job of picking the right one.
I hope that answers your question.
EDIT: On a side note, make sure you pick the right prefix length for your BTREE index (in my example I picked the first 10 characters of city). This obviously has a huge impact on cardinality: if you picked city(1), you'll get a lower cardinality than with city(10).
EDIT2: MySQL's query plan (estimation) for which index prunes the most records is what you see in EXPLAIN.
I think you can easily determine which index gets used by using EXPLAIN on your query. Please check the accepted answer for this question, which provides some good resources on how to interpret the output of EXPLAIN.
How does MySQL now find the matching rows for these since it can't use two indexes
Yes it can: Can MySQL use multiple indexes for a single query? Also, you should read the documentation: How MySQL Uses Indexes
I had a similar task some time ago, and I noticed that MySQL can use either the FULLTEXT index or any other index/indexes in one query, but not both; I wasn't able to mix FULLTEXT with any other index. Any selection with a fulltext search will work this way:
select a subset using the FULLTEXT search
select records matching the other criteria from that subset ('Using where')
So you can use either the fulltext index or any other index (I wasn't able to get both indexes used with FORCE INDEX or anything else).
I suggest trying both ways, using the fulltext index and using the other index (i.e. on the City and State columns), and comparing the results; they may vary depending on the actual content of your database.
In my case I found that forcing the regular (non-fulltext) index in such a query gave better performance, since I had a very large number of rows (about 300,000) and the non-fulltext criteria matched only about 1,000 of them.
I was using MySQL 5.5.24
I know I need to have a primary key set, and to set anything that should be unique as a unique key, but what is an INDEX and how do I use them?
What are the benefits? Pros and cons? I notice I can either use them or not; when should I?
Short answer:
Indexes speed up SELECTs and slow down INSERTs.
Usually it's better to have indexes, because they speed up selects more than they slow down inserts.
On an UPDATE, an index can speed things way up if an indexed field is used in the WHERE clause, and slow things down if you update one of the indexed fields.
How do you know when to use an index
Add EXPLAIN in front of your SELECT statement.
Like so:
EXPLAIN SELECT * FROM table1
WHERE unindexedfield1 > unindexedfield2
ORDER BY unindexedfield3
This will show you how much work MySQL will have to do on each of the unindexed fields.
Using that info, you can decide whether it is worthwhile to add indexes.
EXPLAIN can also tell you if it is better to drop an index:
EXPLAIN SELECT * FROM table1
WHERE indexedfield1 > indexedfield2
ORDER BY indexedfield3
If very few rows are selected, or if MySQL decides to ignore the index (it does that from time to time), then you might as well drop the index, because it is slowing down your inserts without speeding up your selects.
Then again, it might also be that your SELECT statement is not clever enough.
(Sorry for the complexity of the answer; I was trying to keep it simple, but failed.)
Link:
MySQL indexes - what are the best practices?
Pros:
Faster lookups. This is all about reducing the number of disk I/Os. Instead of scanning the entire table for the results, you reduce the number of disk I/Os (page fetches) by using index structures such as B-trees or hash indexes to get to your data faster.
Cons:
Slower writes (potentially). Not only do you have to write your data to your tables, you also have to write to your indexes. This may cause the system to restructure the index (hash index, B-tree, etc.), which can be computationally expensive.
Takes up more disk space, naturally, since you are storing more data.
The easiest way to think about an index is to think about a dictionary. It has words and it has definitions corresponding to those words. The dictionary has an index on "word" because when you go to a dictionary you want to look up a word quickly, then get its definition. A dictionary usually contains just one index - an index by word.
A database is analogous. When you have a bunch of data in the database, you will have certain ways that you want to get it out. Let's say you have a User table and you often look up a user by the FirstName column. Since this is an operation that you are doing often in your application, you should consider using an index on this column. That will create a structure in the database that is sorted, if you will, by that column, so that looking up something by first name is like looking up a word in a dictionary. If you didn't have this index you might need to look at ALL rows before you determine which ones have a specific FirstName. By adding an index, you have made this fast.
So why not put an index on all columns and make them all fast? Like everything, there is a trade off. Every time you insert a row into the table User, the database will need to perform its magic and sort everything on your indexed column. This can be expensive.
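A minimal sketch of that example, assuming the hypothetical User table and FirstName column above:

CREATE INDEX idx_firstname ON User (FirstName);

-- this lookup can now use the index instead of examining every row
SELECT * FROM User WHERE FirstName = 'Alice';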
You don't have to have a primary key. Indexes (of any type) are used to speed up queries and, at least with the InnoDB engine, enforce foreign key constraints. Whether you use a unique or plain (non-unique) index depends on whether you want to allow duplicate values in the key.
This is a general database concept, you might use external resources to read about it, like http://beginner-sql-tutorial.com/sql-index.htm or http://en.wikipedia.org/wiki/Index_(database)
An index allows MySQL to find data quicker. You use indexes on columns that you'll be using in WHERE clauses. For example, if you have a column named score and want to find everything with WHERE score > 5, by default MySQL will need to scan the WHOLE table to find those scores. However, if you use a BTREE index, finding the rows that meet that condition will happen a LOT faster.
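A quick sketch of that example; the table name scores is hypothetical (BTREE is the default index type for InnoDB and MyISAM):

ALTER TABLE scores ADD INDEX idx_score (score);

SELECT * FROM scores WHERE score > 5;    -- can now be satisfied with a range scan on idx_score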
Indices have a price: disk and memory space. If it's a very big table, your index will grow rather large.
Think of it this way: what are the biggest benefits of having an index in a book? It's much the same thing. You have a slightly larger book, yet you're able to quickly look things up. When you create an index on a column, you're saying you want to be able to reference it in a where clause to look it up quickly.