How to optimize a MySQL/MyISAM full text search with many results - mysql

I have a MySQL MyISAM table with a full-text index on the keywords column and 20 million rows. It works well when I search for rare words, for example:
SELECT count(*) FROM books WHERE MATCH(keywords) AGAINST ('+DUCK' IN BOOLEAN MODE)
(0.005s, 2k results)
But when I search for a more common term it is much slower:
SELECT count(*) FROM books WHERE MATCH(keywords) AGAINST ('+YES' IN BOOLEAN MODE)
(5s, 2 million results)
It makes sense because the last one returns many more rows, but then how can I pre-filter the rows before the text search? This doesn't work:
SELECT count(*) FROM books WHERE date > "2019-09-23" AND MATCH(keywords) AGAINST ('+YES' IN BOOLEAN MODE)
(5s, 0 results)

MyISAM's (and maybe InnoDB's) FULLTEXT will always do the MATCH first, then any other clauses. Hence, adding that extra filter does not help with speed.
Think of it this way: a FULLTEXT index is built to test the entire table for the MATCH clause. It cannot take any pre-filtering into account before it goes to work. So you are stuck with the full-text lookup first, followed by filtering of its results on the other conditions, without the benefit of any other index.
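One way to check this on your own data is to look at the execution plan. The following is a minimal sketch that assumes the books table from the question; the exact output depends on the MySQL version and on which indexes exist.

```sql
-- Show the plan for the date-filtered full-text query.
-- If the FULLTEXT index is chosen, the `type` column reads `fulltext`
-- and the `key` column names the full-text index. The date condition is
-- then applied to the matched rows rather than used to narrow the search.
EXPLAIN
SELECT COUNT(*)
FROM books
WHERE date > '2019-09-23'
  AND MATCH(keywords) AGAINST ('+YES' IN BOOLEAN MODE);
```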

Related

Index for MATCH and ORDER BY

I have a query like this:
SELECT * FROM staffs
WHERE MATCH (staff_name) AGAINST ('johnny')
ORDER BY staff_city ASC
This is just an example; I want to ask which index I should use here. For the MATCH() and AGAINST() there is a FULLTEXT index on the staff_name column, which is fine. But the query also has an ORDER BY on the staff_city column. The FULLTEXT lookup is fast, but once the matched results have to be ordered, the search is slower. Which index do I need there?
MySQL almost never uses two indexes on the same table in a single SELECT. The Optimizer picks from among the indexes you have, and it usually picks the best one for the query.
For this query, only the FULLTEXT index you have will be used, regardless of the other indexes the table has.
The other index might be useful for some other query.
More: Assuming you care only about rows with the word 'johnny' in it, change:
AGAINST ('johnny')
-->
AGAINST ('+johnny' IN BOOLEAN MODE)
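Put together, the rewritten query might look like the sketch below. It uses only the table and columns from the question. The sort on staff_city still runs after the full-text lookup, on whatever rows matched, so it gets cheaper only when fewer rows match.

```sql
-- Boolean-mode search for rows that must contain 'johnny',
-- then sort the matched rows by city.
SELECT *
FROM staffs
WHERE MATCH(staff_name) AGAINST ('+johnny' IN BOOLEAN MODE)
ORDER BY staff_city ASC;
```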

mysql match() against() OR match() against() SLOW

I have a table with quite a lot of rows (~2M).
When I search it like this
SELECT * FROM product WHERE
MATCH(name) AGAINST('+some' '+word' IN BOOLEAN MODE)
it works like a charm and finds what I need in less than 0.5s.
But when I search for 2 sets of words like this
SELECT * FROM product WHERE
MATCH(name) AGAINST('+some' '-word' IN BOOLEAN MODE)
OR
MATCH(name) AGAINST('+something' '-other' IN BOOLEAN MODE)
the search sometimes takes over a minute.
I would expect it to be 2 times slower (it's 2 searches), maybe a bit more (the results still have to be compared and duplicates removed, but with only a few results that should not take long), but not this much slower. After adding the OR it is slower than LIKE "%...%" OR LIKE "%...%".
Can anyone tell me what I am doing wrong?
Unfortunately for you, fulltext indexes have some limitations, and not being able to properly merge the results of two independent fulltext searches is one of them:
The Index Merge optimization algorithm has the following known limitations:
[...]
Index Merge is not applicable to full-text indexes.
Fortunately for you, fulltext searches can be more complex, so you can merge your searches yourself. Your second query can be written as a single search using:
SELECT * FROM product WHERE
MATCH(name) AGAINST('(+something -other) (+some -word)' IN BOOLEAN MODE)
This defines two search sets and is ok if either of the two (...) matches - which is an or.
Alternatively, you can use a union instead of an or, which allows MySQL to actually run two independent fulltext searches and then combine the two results, which is basically what you are thinking of:
SELECT * FROM product WHERE
MATCH(name) AGAINST('+some -word' IN BOOLEAN MODE)
UNION
SELECT * FROM product WHERE
MATCH(name) AGAINST('+something -other' IN BOOLEAN MODE)
This also works for more complicated situations, e.g. merging searches on different columns, but it will not be as easy if you want to combine the searches with something other than OR.
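The case of merging searches on different columns, mentioned above, can be sketched the same way. This example assumes a hypothetical description column on product with its own FULLTEXT index; neither appears in the question.

```sql
-- Assumed for illustration: product.description exists and has its own
-- FULLTEXT index, separate from the one on product.name.
SELECT * FROM product
WHERE MATCH(name) AGAINST ('+some -word' IN BOOLEAN MODE)
UNION
SELECT * FROM product
WHERE MATCH(description) AGAINST ('+something -other' IN BOOLEAN MODE);
```

UNION removes duplicate rows, so a product that matches both searches is returned once. UNION ALL would skip that de-duplication step but could return such a row twice.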

Mysql : Match Against query

Because the table is very large, I want to use MATCH ... AGAINST in place of LIKE. My column has a FULLTEXT index.
What is the MATCH ... AGAINST equivalent of this query?
The MySQL query is:
select count(*) from keywords where sb_keyword like 'a%'
Is this exactly what the query is?
select count(*) from keywords where sb_keyword like 'a%'
That should benefit from INDEX(sb_keyword). A FULLTEXT index is not practical for this query, either as it stands or using WHERE MATCH(sb_keyword) AGAINST('+a*' IN BOOLEAN MODE).
It will take time to walk through all the values starting with a to count them. The index I suggested helps because an index is (usually) smaller than the entire dataset due to having fewer 'columns'.
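A minimal sketch of that suggestion follows. It assumes sb_keyword is a VARCHAR column (a TEXT column would need a prefix length in the index definition), and idx_sb_keyword is a hypothetical index name.

```sql
-- Add an ordinary B-tree index on the column (index name is made up here).
ALTER TABLE keywords ADD INDEX idx_sb_keyword (sb_keyword);

-- A pattern with a fixed prefix such as 'a%' can be resolved as a range
-- scan over that index instead of a scan over the full rows.
SELECT COUNT(*) FROM keywords WHERE sb_keyword LIKE 'a%';
```

Running EXPLAIN on the SELECT should show whether the new index is actually picked.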

Do multiple indexes get used in queries with subqueries?

My current query is a full text search, on a particular user's records. For this table, I have a FULLTEXT index over compColumn and a bTree over userID.
SELECT K.* FROM k_table AS K WHERE K.userID=2 AND (MATCH (K.compColumn) AGAINST ('+gatsby' IN BOOLEAN MODE));
From what I can tell, only one index gets used, and the WHOLE table is searched for the fulltext result, as opposed to just doing a fulltext search over user 2's records.
I was wondering how to set the above up having the user ID as a subquery, from which the fulltext search is then made, and if that would use the two indexes?
Thanks for your time and help.
To make sub queries you put a query where you would logically put a table:
SELECT k2.*
FROM (
    SELECT K.*
    FROM k_table AS K
    WHERE K.userID = 2
) AS k2
WHERE MATCH (k2.compColumn) AGAINST ('+gatsby' IN BOOLEAN MODE);

Best way to use indexes on large mysql like query

This MySQL query is run on a large (about 200 000 records, 41 columns) MyISAM table:
select t1.* from table t1 where 1 and t1.inactive = '0' and (t1.code like '%searchtext%' or t1.name like '%searchtext%' or t1.ext like '%searchtext%' ) order by t1.id desc LIMIT 0, 15
id is the primary index.
I tried adding a multiple-column index on all 3 searched (LIKE) columns. It works OK, but the results are served to an auto-filled AJAX table on a website, and the 2-second return delay is a bit too slow.
I also tried adding separate indexes on all 3 columns and a fulltext index on all 3 columns, without significant improvement.
What would be the best way to optimize this type of query? I would like to achieve under 1 sec performance, is it doable?
The best thing you can do is implement paging. No matter what you do, the IO cost is going to be huge. If you return only one page of records (10, 25, or whatever the page size is), that will help a lot.
As for the index, you need to check the plan to see if your index is actually being used. A full text index might help, but that depends on how many rows you return and what you pass in. Wildcards such as % really drain performance. An index can still be used if the pattern ends with % but not if it starts with %. If you put % on both sides of the text you are searching for, indexes can't help much.
You can create a full-text index that covers the three columns: code, name, and ext. Then perform a full-text query using the MATCH() AGAINST () function:
select t1.*
from table t1
where match(code, name, ext) against ('searchtext')
order by t1.id desc
limit 0, 15
If you omit the ORDER BY clause the rows are sorted by default using the MATCH function result relevance value. For more information read the Full-Text Search Functions documentation.
As @Vulcronos notes, the query optimizer is not able to use the index when the LIKE operator is used with an expression that starts with a wildcard %.
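Creating the three-column full-text index described above could look like the sketch below. The names my_table and ft_code_name_ext are placeholders, since the question does not give the real table name. The column list in MATCH() has to be the same as the column list of the FULLTEXT index.

```sql
-- Placeholder names: my_table, ft_code_name_ext.
ALTER TABLE my_table ADD FULLTEXT INDEX ft_code_name_ext (code, name, ext);

-- Full-text search matches whole words, not arbitrary substrings.
-- In boolean mode a trailing * matches words that start with the term;
-- there is no equivalent of a leading % wildcard.
SELECT t1.*
FROM my_table t1
WHERE t1.inactive = '0'
  AND MATCH(t1.code, t1.name, t1.ext) AGAINST ('searchtext*' IN BOOLEAN MODE)
ORDER BY t1.id DESC
LIMIT 0, 15;
```

Because of the whole-word behavior, this is not a drop-in replacement for LIKE '%searchtext%': rows where the search text appears only in the middle of a word will no longer be found.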