MySQL fulltext or exact value search is very slow

I have a table of 9.5 million rows and I need to perform both a fulltext and an exact-value search over the same column.
Although there are two indexes over this column, one BTREE and one FULLTEXT, the database engine uses neither and goes through all 9.5M rows.
select * from mytable
where match(document) against ('+111/05257' in boolean mode)
or document = '111/05257';
-- very slow, takes ~ 9 seconds
-- possible keys: both
-- used key: none :(
If I use only one type of search, queries run fast.
select * from mytable where document = '111/05257';
-- very fast, around 80 ms
-- used key: btree
select * from mytable where match(document) against ('+111/05257' in boolean mode)
-- very fast, around 100 ms
-- used key: fulltext
Given the poorly structured data in the document column, ranging from '1/XA' through '5778292019' to 'S:NXA/0001/XA2019/111/05257', I need to use both exact and partial (fulltext) search over this column.
Wildcard searches ('%111/05257%') also perform terribly over btree index.
Any idea how to solve this?
Thank you all

Queries involving OR are notoriously hard to optimize. A common solution is to change them into two queries, and UNION the results:
select * from mytable
where match(document) against ('+111/05257' in boolean mode)
UNION
select * from mytable
where document = '111/05257';
Each of the respective queries should be free to use a different index. The UNION will eliminate any rows in common from the two results.
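To illustrate the rewrite, here is a minimal sketch using Python's sqlite3 as a stand-in for MySQL (sqlite has no MATCH ... AGAINST, so a LIKE substring match plays the part of the fulltext branch; the point is only the OR-to-UNION shape and the duplicate elimination):

```python
import sqlite3

# In-memory stand-in: sqlite3 instead of MySQL, and a LIKE substring match
# in place of MATCH ... AGAINST, which sqlite does not provide.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, document TEXT)")
conn.executemany("INSERT INTO mytable (document) VALUES (?)",
                 [("111/05257",), ("S:NXA/0001/XA2019/111/05257",), ("1/XA",)])

# The OR query rewritten as a UNION: each branch is free to use its own
# index, and UNION (unlike UNION ALL) removes rows matched by both branches.
rows = conn.execute("""
    SELECT id, document FROM mytable WHERE document LIKE '%111/05257%'
    UNION
    SELECT id, document FROM mytable WHERE document = '111/05257'
    ORDER BY id
""").fetchall()
print(rows)  # the exact match appears once, not twice
```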

Related

SQL gets slow on a simple query with ORDER BY

I have a problem with MySQL ORDER BY: it slows down the query and I really don't know why. My query was a little more complex, so I simplified it to a light query with no joins, but it still runs really slowly.
Query:
SELECT
W.`oid`
FROM
`z_web_dok` AS W
WHERE
W.`sent_eRacun` = 1 AND W.`status` IN(8, 9) AND W.`Drzava` = 'BiH'
ORDER BY W.`oid` ASC
LIMIT 0, 10
The table has 946,566 rows and takes about 500 MB; the fields I'm selecting on are all indexed as follows:
oid - INT PRIMARY KEY AUTO_INCREMENT
status - INT INDEXED
sent_eRacun - TINYINT INDEXED
Drzava - VARCHAR(3) INDEXED
I am posting screenshots of the EXPLAIN of the query first:
Next is the query executed against the database:
And this is the speed after I remove ORDER BY.
I have also tried sorting by a DATETIME field, which is also indexed, but I get the same slow query as when ordering by the primary key. This started today; it was always fast and light before.
What can cause something like this?
The kind of query you use here calls for a composite covering index. This one should handle your query very well.
CREATE INDEX someName ON z_web_dok (Drzava, sent_eRacun, status, oid);
Why does this work? You're looking for equality matches on the first three columns, and sorting on the fourth column. The query planner will use this index to satisfy the entire query. It can random-access the index to find the first row matching your query, then scan through the index in order to get the rows it needs.
Pro tip: Indexes on single columns are generally harmful to performance unless they happen to match the requirements of particular queries in your application, or are used for primary or foreign keys. You generally choose your indexes to match your most active, or your slowest, queries. Edit: You asked whether it's better to create specific indexes for each query in your application. The answer is yes.
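As a sketch of why the composite index works (using Python's sqlite3 as a stand-in for MySQL; the column names follow the question, and EXPLAIN QUERY PLAN is sqlite's analogue of MySQL's EXPLAIN):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE z_web_dok (
    oid INTEGER PRIMARY KEY,
    status INTEGER, sent_eRacun INTEGER, Drzava TEXT)""")
# Composite index: the equality columns first, then the ORDER BY column.
conn.execute("""CREATE INDEX someName
                ON z_web_dok (Drzava, sent_eRacun, status, oid)""")

# The planner resolves the whole query from the index: random-access to
# the first matching entry, then scan the index in order.
plan = conn.execute("""EXPLAIN QUERY PLAN
    SELECT oid FROM z_web_dok
    WHERE sent_eRacun = 1 AND status IN (8, 9) AND Drzava = 'BiH'
    ORDER BY oid LIMIT 10""").fetchall()
for row in plan:
    print(row[3])  # the plan names the composite index, not a full scan
```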
There may be an even faster way. (Or it may not be any faster.)
The IN(8, 9) gets in the way of handling the WHERE .. ORDER BY .. LIMIT completely efficiently. A possible solution is to treat the IN as an OR, convert it to a UNION, and do some tricks with the LIMIT, especially if you might also be using OFFSET.
( SELECT ... WHERE .. = 8 AND ... ORDER BY oid LIMIT 10 )
UNION ALL
( SELECT ... WHERE .. = 9 AND ... ORDER BY oid LIMIT 10 )
ORDER BY oid LIMIT 10
This will allow the covering index described by OJones to be fully used in each of the subqueries. Furthermore, each will provide up to 10 rows without any temp table or filesort. Then the outer part will sort up to 20 rows and deliver the 'correct' 10.
For OFFSET, see http://mysql.rjweb.org/doc.php/index_cookbook_mysql#or
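The shape of that rewrite can be sketched with Python's sqlite3 (which does not accept parenthesized UNION members, so each LIMITed branch becomes a derived table; MySQL takes the parenthesized form directly):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE z_web_dok (oid INTEGER PRIMARY KEY, status INTEGER)")
# Odd oids get status 8, even oids status 9.
conn.executemany("INSERT INTO z_web_dok VALUES (?, ?)",
                 [(i, 8 if i % 2 else 9) for i in range(1, 101)])

# Each branch fetches its own first 10 rows in oid order; the outer
# query sorts at most 20 rows and delivers the correct first page.
rows = conn.execute("""
    SELECT oid FROM (SELECT oid FROM z_web_dok
                     WHERE status = 8 ORDER BY oid LIMIT 10)
    UNION ALL
    SELECT oid FROM (SELECT oid FROM z_web_dok
                     WHERE status = 9 ORDER BY oid LIMIT 10)
    ORDER BY oid LIMIT 10
""").fetchall()
print([r[0] for r in rows])  # oids 1..10 -- the correct first page
```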

speed up slow count in a 2 millions rows table

I have a MySQL table which has 2 million rows.
The size is 600 MB.
This query takes 2 seconds.
I don't know how to speed it up. The table is already in MyISAM format.
I don't know if I have reached the limit of how slow a SELECT COUNT can be.
SELECT COUNT(video) FROM yvideos use index (PRIMARY) WHERE rate>='70' AND tags LIKE '%;car;%'
Thanks all
Yes, it can be optimised.
Firstly, you are doing a full scan with LIKE, because MySQL cannot use an index when the pattern has a variable left part (it's possible for ';car;%', but not for '%;car;%').
Secondly, MySQL (in most cases) doesn't use more than one index for a SELECT, so if you have two separate indexes for rate and tags, only one will be used.
So to deal with these things I advise you to:
1. use a fulltext index for the tags column,
2. replace the one query with two separate queries and "glue" the results with an INNER JOIN (equivalent to WHERE ... AND ... in this case).
So in the end:
SELECT t1.* FROM
(SELECT * FROM yvideos WHERE rate >= 60) t1
INNER JOIN
(SELECT * FROM yvideos WHERE MATCH (tags) AGAINST ('+car +russia -usa' IN BOOLEAN MODE)) t2
USING (id);
Live example on SQLFiddle.
Execute EXPLAIN for this query and take a look at the plan. Now there is no full scan; all filtering is done using indexes.
For more information about boolean fulltext searches you can read the documentation.
BTW, fulltext indexes are supported in both InnoDB and MyISAM now, so you can decide about an engine.
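As a rough sketch of this "two derived tables glued with a join" pattern (using Python's sqlite3 as a stand-in for MySQL, a plain LIKE in place of MATCH ... AGAINST since sqlite has no MySQL fulltext, a minimal assumed yvideos schema, and the question's rate >= 70 threshold):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE yvideos
    (id INTEGER PRIMARY KEY, video TEXT, rate INTEGER, tags TEXT)""")
conn.executemany("INSERT INTO yvideos VALUES (?, ?, ?, ?)", [
    (1, "a", 90, ";car;russia;"),
    (2, "b", 50, ";car;"),   # rate too low
    (3, "c", 80, ";usa;"),   # wrong tag
])

# Each condition lives in its own derived table; the INNER JOIN on id
# keeps only rows that satisfy both, like WHERE ... AND ... would.
rows = conn.execute("""
    SELECT id FROM
      (SELECT id FROM yvideos WHERE rate >= 70) t1
    INNER JOIN
      (SELECT id FROM yvideos WHERE tags LIKE '%;car;%') t2
    USING (id)
""").fetchall()
print(rows)  # only id 1 satisfies both conditions
```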

LIKE % or AGAINST for FULLTEXT search?

I was trying to find a very fast and efficient approach to fetch records using a keyword search.
Our MySQL table MASTER contains 30,000 rows and has 4 fields.
ID
title (FULLTEXT)
short_descr (FULLTEXT)
long_descr (FULLTEXT)
Can anyone suggest which one is more efficient?
LIKE %
MYSQL's AGAINST
It would be nice if someone could write a SQL query for the keywords
Weight Loss Secrets
SELECT id FROM MASTER
WHERE (title LIKE '%Weight Loss Secrets%' OR
short_descr LIKE '%Weight Loss Secrets%' OR
long_descr LIKE '%Weight Loss Secrets%')
Thanks in advance
The FULLTEXT index should be faster; maybe it's a good idea to add all columns into one fulltext index.
ALTER TABLE MASTER
ADD FULLTEXT INDEX `FullTextSearch`
(`title` ASC, `short_descr` ASC, `long_descr` ASC);
Then execute using IN BOOLEAN MODE
SELECT id FROM MASTER WHERE
MATCH (title, short_descr, long_descr)
AGAINST ('+Weight +Loss +Secrets' IN BOOLEAN MODE);
This will find rows that contains all 3 keywords.
However, this won't give you an exact match; the keywords just need to be present in the same row.
If you also want an exact match you could do it like this, but it's a bit hacky and would only work if your table doesn't get too big.
SELECT id FROM
(
SELECT id, CONCAT(title,' ',short_descr,' ',long_descr) AS SearchField
FROM MASTER WHERE
MATCH (title, short_descr, long_descr)
AGAINST ('+Weight +Loss +Secrets' IN BOOLEAN MODE)
) result WHERE SearchField LIKE '%Weight Loss Secrets%'
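The "fulltext prefilter, then exact-phrase refinement" idea can be sketched with Python's sqlite3 (no MySQL fulltext there, so three substring LIKEs emulate the '+Weight +Loss +Secrets' boolean match, and || replaces CONCAT):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE MASTER
    (id INTEGER PRIMARY KEY, title TEXT, short_descr TEXT, long_descr TEXT)""")
conn.executemany("INSERT INTO MASTER VALUES (?, ?, ?, ?)", [
    (1, "Loss Weight", "some Secrets", "x"),          # all 3 words, no phrase
    (2, "Weight Loss Secrets revealed", "guide", "y"),  # exact phrase present
])

# Inner derived table builds the concatenated SearchField; the outer
# query first requires all three keywords (the boolean-mode stand-in),
# then refines to the exact phrase.
rows = conn.execute("""
    SELECT id FROM (
        SELECT id,
               title || ' ' || short_descr || ' ' || long_descr AS SearchField
        FROM MASTER
    ) result
    WHERE SearchField LIKE '%Weight%' AND SearchField LIKE '%Loss%'
      AND SearchField LIKE '%Secrets%'
      AND SearchField LIKE '%Weight Loss Secrets%'
""").fetchall()
print(rows)  # only the row containing the exact phrase
```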

How to search for a text? (MySQL)

I have this table:
bussId | nameEn | keywords
500 | name1 name2 | keyword1 keyword2
I want to return bussId 500 if the user searches for keyword1, keyword2, name2, or name1.
So I would use this query: SELECT * FROM business WHERE nameEn LIKE '%searched_word%'.
But this query doesn't use the index nameEn or keywords, according to Comparison of B-Tree and Hash Indexes "The index also can be used for LIKE comparisons if the argument to LIKE is a constant string that does not start with a wildcard character".
I have this solution, I want to create another table and insert all the single words:
bussId | word
500 | name1
500 | name2
500 | keyword1
500 | keyword2
Then I will search for the bussId using this query:
SELECT bussId FROM <new_table> WHERE word LIKE 'searched_word%';
That way I can be sure that MySQL will use the index and it will be faster, but this table will contain about 20 million rows!
Is there another solution?
You have to use a fulltext index, with MyISAM or, from MySQL 5.6 onwards, InnoDB:
mysql> ALTER TABLE business ADD FULLTEXT(nameEn, keywords);
And here is your request:
mysql> SELECT * FROM business
-> WHERE MATCH (nameEn, keywords) AGAINST ('searched_word');
Did you try the INSTR() or LOCATE() functions? There is a SO discussion comparing them with LIKE; they may prove better than a leading % wildcard. They still run full table scans, though, and I'm not sure how the MySQL query optimizer handles indexes with string functions.
SELECT * FROM business WHERE Instr(nameEN, 'search_word') > 0
OR
SELECT * FROM business WHERE Locate('search_word', nameEN) > 0
Also, there may be other areas of optimization. Check whether other potential indexes are available on the business table, explicitly declare specific columns instead of the asterisk (*) if not all columns are being used, and consider parsing the nameEN and keywords columns by spaces so each column retains one value (with the potential to transpose), then use an implicit (WHERE) or explicit (JOIN) join. This might even be a table-design issue, given the challenge of storing multiple values in a single field.
With newer versions of MySQL you don't need to make the engine MyISAM; InnoDB also supports FULLTEXT indexes (I've tested this on 5.6.15; it is supported from version >= 5.6.4).
So if your server version is higher than 5.6.4, you just need to add a FULLTEXT index to your table and select with MATCH(...) AGAINST(...), as in the example below:
CREATE FULLTEXT INDEX idx ON business (nameEn);
SELECT * FROM business
WHERE match(nameEn)against('+searched_word' IN BOOLEAN MODE);
Use the statement below in MySQL; note the parentheses, since AND binds tighter than OR and without them the bussID = 500 condition would apply only to the last LIKE:
SELECT * FROM business WHERE ((nameEn LIKE 'searched_word%' OR nameEn LIKE '%searched_word%') OR (keywords LIKE 'searched_word%' OR keywords LIKE '%searched_word%')) AND bussID = 500;
This should work.
20 million records is quite a lot, and a mapping table with a varchar column would allocate up to the maximum allowed characters for each row, plus 32 bits for the integer column.
What if you could just create a table like (id int, crc int) and store only the text's CRC32 value? The comparison is case-sensitive, so you need to convert to uppercase (or lowercase) while populating the data, and do the same when comparing.
I agree with the full-text approach but to save space and use the advantage of indexing, you can try something like below.
Create Temporary TABLE t (id INT, crc INT);
Insert Into t
Select 500, CRC32(UPPER('name1'))
Union Select 500, CRC32(UPPER('name2'))
Union Select 500, CRC32(UPPER('keyword1'))
Union Select 500, CRC32(UPPER('keyword2'));
Select * From t Where crc = CRC32(UPPER('keyword2'));
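A sketch of the CRC32 approach in Python (zlib.crc32 computes the same CRC-32 as MySQL's CRC32(); sqlite3 stands in for MySQL). One caveat worth noting: CRC32 is not collision-free, so a real lookup should re-check the original string after the hash match:

```python
import sqlite3
import zlib

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, crc INTEGER)")

def crc(word: str) -> int:
    # Normalise case before hashing, since CRC32 is case-sensitive.
    return zlib.crc32(word.upper().encode("utf-8"))

# Store only the 4-byte CRC of each word instead of the full varchar.
for word in ("name1", "name2", "keyword1", "keyword2"):
    conn.execute("INSERT INTO t VALUES (?, ?)", (500, crc(word)))

# Lookups normalise case the same way before hashing.
rows = conn.execute("SELECT id FROM t WHERE crc = ?",
                    (crc("Keyword2"),)).fetchall()
print(rows)  # [(500,)] -- matched despite different input case
```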

Best way to use indexes on large mysql like query

This MySQL query is run on a large (about 200,000 records, 41 columns) MyISAM table:
select t1.* from table t1 where 1 and t1.inactive = '0' and (t1.code like '%searchtext%' or t1.name like '%searchtext%' or t1.ext like '%searchtext%') order by t1.id desc LIMIT 0, 15
id is the primary index.
I tried adding a multiple-column index on all 3 searched (LIKE) columns. It works OK, but the results are served in an auto-filled AJAX table on a website and the 2-second return delay is a bit too slow.
I also tried adding separate indexes on all 3 columns and a fulltext index on all 3 columns, without significant improvement.
What would be the best way to optimize this type of query? I would like to achieve under 1 second; is that doable?
The best thing you can do is implement paging. No matter what you do, the IO cost is going to be huge. If you only return one page of records (10, 25, or whatever), that will help a lot.
As for the index, you need to check the plan to see if your index is actually being used. A fulltext index might help, but that depends on how many rows you return and what you pass in. Patterns with % really drain performance. You can still use an index if the pattern ends with % but does not start with it; if you put % on both sides of the text you are searching for, indexes can't help much.
You can create a full-text index that covers the three columns: code, name, and ext. Then perform a full-text query using the MATCH() AGAINST () function:
select t1.*
from table t1
where match(code, name, ext) against ('searchtext')
order by t1.id desc
limit 0, 15
If you omit the ORDER BY clause the rows are sorted by default using the MATCH function result relevance value. For more information read the Full-Text Search Functions documentation.
As @Vulcronos notes, the query optimizer is not able to use the index when the LIKE operator is used with an expression that starts with the wildcard %.
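That planner behavior can be observed directly with Python's sqlite3 as a stand-in (PRAGMA case_sensitive_like = ON lets sqlite use an ordinary index for prefix LIKE, mirroring MySQL's BTREE behavior):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, code TEXT)")
conn.execute("CREATE INDEX idx_code ON t1 (code)")
# Allow sqlite's LIKE optimization, analogous to MySQL's prefix matching.
conn.execute("PRAGMA case_sensitive_like = ON")

def detail(sql):
    # Concatenate the EXPLAIN QUERY PLAN detail strings for a query.
    return " ".join(r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + sql))

d1 = detail("SELECT id FROM t1 WHERE code LIKE 'search%'")   # index range scan
d2 = detail("SELECT id FROM t1 WHERE code LIKE '%search%'")  # full scan
print(d1)
print(d2)
```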