This question MAY have been asked before, but I can't for the life of me find the answer.
In order to avoid
SELECT * FROM student WHERE name LIKE '%searchphrase%' ORDER BY score
which, as I understand it, will never use an index and will always use a filesort, there's the option of using a FULLTEXT index.
The question: How can I order by score without a filesort if I perform a fulltext search?
Result rows will come out in whatever order they're in in the FULLTEXT index which certainly isn't the order required by ORDER BY score, so the fulltext matches need to be sorted for ORDER BY in a separate step, and this is what filesort does.
The only alternative execution plan would be to retrieve rows in score order and then apply the fulltext match row by row, which defeats any fulltext-specific optimizations.
What may make sense in your case is to have a combined index on (score, name) and stick with LIKE if your search expression matches a large part of the student rows. In this case you'd get an index scan in score order, and the LIKE expression can be evaluated on the index entries. So you're getting a full index scan instead of a full table scan, and no extra sort is needed as the index entries are already ordered by score.
But if the number of matching rows is rather small compared to the total number of rows in the table, doing a fulltext index lookup first, followed by a filesort, will be the better plan.
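As a rough sketch of the two plans described above (the index names ft_student_name and idx_score_name are made up for illustration, and the optimizer is free to choose differently on your data):

```sql
-- Plan 1: fulltext lookup first, then a filesort over the (hopefully few) matches.
-- The index name is hypothetical.
ALTER TABLE student ADD FULLTEXT INDEX ft_student_name (name);

SELECT *
FROM student
WHERE MATCH (name) AGAINST ('searchphrase')
ORDER BY score;

-- Plan 2: scan a combined index in score order and evaluate LIKE on the index entries.
-- The index name is hypothetical.
ALTER TABLE student ADD INDEX idx_score_name (score, name);

SELECT score, name
FROM student
WHERE name LIKE '%searchphrase%'
ORDER BY score;
```

Selecting only score and name keeps Plan 2 on the index alone; with SELECT * the server also has to fetch each matching row. Note that MATCH ... AGAINST matches whole words rather than arbitrary substrings, so the two queries are not strictly equivalent. Running EXPLAIN on both shows which plan is actually chosen.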
Related
select id, col1,col2,col3,seq
from `table`
order by seq asc
I have already created an index on seq, but I found that the query doesn't use the index and uses a filesort instead. Because col1 may hold large values, I don't want to create a covering index on this table. Is there any way to optimize this SQL, the table, or the index? Thanks.
The SQL query optimizer apparently estimated the cost of using the index and concluded that it would be better to just do a table-scan and use a filesort of the result.
There is overhead to using a non-covering index. It reads the index in sorted order, but then has to dereference the primary key to get the other columns not covered by the index.
By analogy, this is like reading a book by looking up every word in alphabetical order in the index at the back, and then flipping back to the respective pages of the book, one word at a time. Time-consuming, but it's one way of reading the book in order by keyword.
That said, a filesort has overhead too. The query engine has to collect matching rows, and sort them manually, potentially using temporary files. This is expensive if the number of rows is large. You haven't described the size of your table in number of rows.
If the table you are testing has a small number of rows, the optimizer might have reasoned that it would be quick enough to do the filesort, so it would be unnecessary to read the rows by the index.
Testing with a larger table might give you different results from the optimizer.
The query optimizer makes the right choice in the majority of cases. But it's not infallible. If you think forcing it to use the index is better in this case, you can use the FORCE INDEX hint to make it believe that a table-scan is prohibitively expensive. Then if the index is usable at all, it'll prefer the index.
select id, col1,col2,col3,seq
from `table` FORCE INDEX(seq)
order by seq asc
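To see what the optimizer actually decided, you can compare the two plans with EXPLAIN. A minimal sketch, assuming the index on seq is itself named seq (as the hint above implies):

```sql
-- Plan chosen by the optimizer on its own
EXPLAIN
SELECT id, col1, col2, col3, seq
FROM `table`
ORDER BY seq ASC;

-- Plan when the index is forced (index name seq is an assumption)
EXPLAIN
SELECT id, col1, col2, col3, seq
FROM `table` FORCE INDEX (seq)
ORDER BY seq ASC;
```

If the Extra column reports "Using filesort", the rows are sorted in a separate step. If the key column shows seq without that note, the rows are read in index order instead. Timing both variants on realistic data is the only reliable way to tell which one is faster.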
I have a query like this:
SELECT * FROM staffs
WHERE MATCH (staff_name) AGAINST ('johnny')
ORDER BY staff_city ASC
This is just an example; I want to ask which index I should use here. For MATCH() and AGAINST() there is a FULLTEXT index on the staff_name column, and that's fine. But the query also has an ORDER BY on the staff_city column. The FULLTEXT search is fast, but once the matched results have to be ordered, the query is slower. Which index do I need there?
MySQL almost never uses two indexes for the same table in a single SELECT. The Optimizer picks from among the indexes you have, and it usually picks the best one for the query.
For this query, only the FULLTEXT index you have will be used, regardless of the other indexes the table has.
The other index might be useful for some other query.
More: Assuming you care only about rows with the word 'johnny' in it, change:
AGAINST ('johnny')
-->
AGAINST ('+johnny' IN BOOLEAN MODE)
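Put together, the query would look roughly like this (a sketch; it assumes the FULLTEXT index on staff_name already exists):

```sql
-- Boolean-mode search: the leading + requires the word to be present.
SELECT *
FROM staffs
WHERE MATCH (staff_name) AGAINST ('+johnny' IN BOOLEAN MODE)
ORDER BY staff_city ASC;
```

The sort on staff_city is still done as a separate step after the fulltext lookup. Prefixing the statement with EXPLAIN should confirm which index is used.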
Suppose you have a table with the following columns:
id
date
col1
I would like to be able to query this table with a specific id and date, and also order by another column. For example,
SELECT * FROM TABLE WHERE id = ? AND date > ? ORDER BY col1 DESC
According to this range documentation, an index will stop being used after it hits the > operator. But according to this order by documentation, an index can only be used to optimize the order by clause if it is ordering by the last column in the index. Is it possible to get an indexed lookup on every part of this query, or can you only get 2 of the 3? Can I do any better than index (id, date)?
Plan A: INDEX(id, date) -- works best when it filters out a lot of rows, making the subsequent "filesort" not very costly.
Plan B: INDEX(col1), which may work best if very few rows are filtered by the WHERE clause. This avoids the filesort, but is not necessarily faster than the other choices here.
Plan C: INDEX(id, date, col1) -- This is a "covering" index if the query does not reference any other fields. The potential advantage here is to look only at the index, and not have to touch the data. If it applies, Plan C is better than Plan A.
You have not provided enough information to say which of these INDEXes will work best. I suggest you add C and B if "covering" applies; otherwise add A and B. Then see which index the Optimizer picks. (There is still a chance that the Optimizer will not pick 'right'.)
(These three indexes are what my Index blog recommends.)
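For illustration, the plans could be set up as follows. The table name t, the index names, and the literal values are placeholders, not taken from the question:

```sql
-- Plan B plus Plan C. Use Plan A's (id, `date`) instead of Plan C if the
-- query needs columns that are not in the index.
ALTER TABLE t
  ADD INDEX idx_col1 (col1),                      -- Plan B
  ADD INDEX idx_id_date_col1 (id, `date`, col1);  -- Plan C

-- Selecting only indexed columns lets Plan C act as a covering index.
EXPLAIN
SELECT id, `date`, col1
FROM t
WHERE id = 42 AND `date` > '2020-01-01'
ORDER BY col1 DESC;
```

The key column of the EXPLAIN output shows which index the Optimizer picked for your data.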
SELECT * FROM messages_messages WHERE (from_user_id=? AND to_user_id=?) OR (from_user_id=? AND to_user_id=?) ORDER BY created_at DESC
I have another query, which is this:
SELECT COUNT(*) FROM messages_messages WHERE from_user_id=? AND to_user_id=? AND read_at IS NULL
I want to index both of these queries, but I don't want to create 2 separate indexes.
Right now, I'm using 2 indexes:
[from_user_id, to_user_id, created_at]
[from_user_id, to_user_id, read_at]
I was wondering if I could do this with one index instead of 2?
These are the only 2 queries I have for this table.
The docs explain fairly completely how MySQL uses indices. In particular, its optimizer can use any left prefix of a multi-column index. Therefore, you could drop either of your two existing indices, and the other would be eligible for use in both queries, though it would be more selective / useful for one than for the other.
In principle, it could be more beneficial to keep your first index, provided that the created_at column was indexed in descending order. In practice, MySQL versions before 8.0 let you specify DESC for an index column but ignore it and build the index in ascending order. Therefore, on those versions, having created_at in your index probably doesn't help very much.
No, you need both indexes for these two queries if you want to optimize fully.
Once you reach the column used for either sorting or range comparison (IS [NOT] NULL counts as a range predicate for this purpose), you don't get any benefit from putting more columns in the index. In other words, your index can have:
Some columns that are used in equality predicates
One column that is used either in a range predicate, or to avoid a filesort -- but not both.
Extra columns used in neither searching nor sorting, but only for the sake of a covering index.
So you cannot make a four-column index that serves both queries.
The only way you can reduce this to one index, as @JohnBollinger says, is to make an index that optimizes for one query, and use a left prefix of that index for the second query. But that won't work as well.
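If the two indexes from the question do not exist yet, they could be declared and checked like this. The index names and user ids are made up for the example:

```sql
ALTER TABLE messages_messages
  ADD INDEX idx_pair_created (from_user_id, to_user_id, created_at),
  ADD INDEX idx_pair_read (from_user_id, to_user_id, read_at);

-- Conversation query: equality on both user ids, then ordering by created_at.
EXPLAIN
SELECT * FROM messages_messages
WHERE (from_user_id = 1 AND to_user_id = 2)
   OR (from_user_id = 2 AND to_user_id = 1)
ORDER BY created_at DESC;

-- Unread count: equality on both user ids, then the IS NULL test on read_at.
EXPLAIN
SELECT COUNT(*) FROM messages_messages
WHERE from_user_id = 1 AND to_user_id = 2 AND read_at IS NULL;
```

Because the first query combines two user pairs with OR, a filesort may still show up in its plan even with the first index in place.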
I have a SELECT statement which I would like to optimize. The MySQL ORDER BY optimization documentation says that in some cases the index cannot be used to optimize the ORDER BY. Specifically, the point:
You use ORDER BY on nonconsecutive parts of a key
SELECT * FROM t1 WHERE key2=constant ORDER BY key_part2;
makes me think that this could be the case. I'm using the following indexes:
UNIQUE KEY `met_value_index1` (`RTU_NB`,`DATETIME`,`MP_NB`),
KEY `met_value_index` (`DATETIME`,`RTU_NB`)
With the following SQL statement:
SELECT * FROM met_value
WHERE rtu_nb=constant
AND mp_nb=constant
AND datetime BETWEEN constant AND constant
ORDER BY mp_nb, datetime
Would it be enough to delete the index met_value_index1 and recreate it with the new ordering RTU_NB, MP_NB, DATETIME?
Do I have to include RTU_NB into the ORDER BY clause?
Outcome: I have tried what @meriton suggested and added the index met_value_index2. The SELECT completed after 1.2 seconds; previously it completed after 5.06 seconds. The following doesn't belong to the question, but as a side note: after some other tries I switched the engine from MyISAM to InnoDB – with rtu_nb, mp_nb, datetime as primary key – and the statement completed after 0.13 seconds!
I don't get your query. If a row must match mp_nb = constant to be returned, all rows returned will have the same mp_nb, so including mp_nb in the order by clause has no effect. I recommend you use the semantically equivalent statement:
SELECT * FROM met_value
WHERE rtu_nb=constant
AND mp_nb=constant
AND datetime BETWEEN constant AND constant
ORDER BY datetime
to avoid needlessly confusing the query optimizer.
Now, to your question: A database can implement an order by clause without sorting if it knows that the underlying access will return the rows in proper order. In the case of indexes, that means that an index can assist with sorting if the rows matched by the where clause appear in the index in the order requested by the order by clause.
That is the case here, so the database could actually do an index range scan over met_value_index1 for the rows where rtu_nb=constant AND datetime BETWEEN constant AND constant, and then check whether mp_nb=constant for each of these rows, but that would amount to checking far more rows than necessary if mp_nb=constant has high selectivity. Put differently, an index is most useful if the matching rows are contiguous in the index, because that means the index range scan will only touch rows that actually need to be returned.
The following index will therefore be more helpful for this query:
UNIQUE KEY `met_value_index2` (`RTU_NB`,`MP_NB`, `DATETIME`),
as all matching rows will be right next to each other in the index, and the rows appear in the index in the order the order by clause requests. I cannot say whether the query optimizer is smart enough to get that, so you should check the execution plan.
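A sketch of adding that index and checking the plan. The literal values are invented, and numeric rtu_nb and mp_nb columns are an assumption:

```sql
ALTER TABLE met_value
  ADD UNIQUE KEY met_value_index2 (RTU_NB, MP_NB, `DATETIME`);

-- Placeholder constants; substitute values that exist in your data.
EXPLAIN
SELECT * FROM met_value
WHERE rtu_nb = 1
  AND mp_nb = 2
  AND `datetime` BETWEEN '2011-01-01 00:00:00' AND '2011-01-31 23:59:59'
ORDER BY `datetime`;
```

If the plan shows met_value_index2 in the key column and no "Using filesort" in Extra, the index is serving both the filter and the ordering.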
I do not think it will use any index for the ORDER BY. But you should look at the execution plan.
The columns tested for equality in the WHERE clause must be the leading columns of the index, followed by the range column; the order in which the conditions are written in the WHERE clause does not matter. So with your current query you need one index with the fields in the order rtu_nb, mp_nb, datetime.