i have a Mysql table which has 2 millions rows.
The size is 600Mb.
this query take 2 seconds.
I don't know how to speed it up. The table is already in a Myisam format.
I don't know if i reached the limit of the slowness of a select count.
SELECT COUNT(video) FROM yvideos use index (PRIMARY) WHERE rate>='70' AND tags LIKE '%;car;%'
Thanks all
Yes, it can be optimised.
Firstly, you are doing a full scan with LIKE, because MySQL can not use an index with variable left part (it's possible for ';car;%', but not for '%;car;%').
Secondly, MySQL (in most cases) doesn't use more than one index for a SELECT, so if you have two separate indexes for rate and tags, only one will be used.
So to deal with these things I advice to:
1. use a fulltext index for tags column,
2. replace one query with two separate queries and "glue" result with INNER JOIN (equals to WHERE ... AND ... in this case).
So in the end:
SELECT t1.* FROM
(SELECT * FROM yvideos WHERE rate >= 60) t1
INNER JOIN
(SELECT * FROM yvideos WHERE MATCH (tags) AGAINST ('+car +russia -usa' IN BOOLEAN MODE)) t2
USING (id);
Live example on SQLFiddle.
Execute EXPLAIN for this query and take a look at a plan. Now there is no full scan, all filtering are done using indexes.
For more information about boolean fulltext searches you can read a documetation.
BTW, fulltext indexes are supported in both InnoDB and MyISAM now, so you can decide about an engine.
Related
I have this table of 9.5 million rows and I need to perform both fulltext and exact value search over the same column.
Altho there are 2 indexes over this column, one BTREE, one FULLTEXT, database engine doesn't use any and goes thru all 9.5M rows.
select * from mytable
where match(document) against ('+111/05257' in boolean mode)
or document = '111/05257';
-- very slow, takes ~ 9 seconds
-- possible keys: both
-- used key: none :(
If I use only one type of search, queries run fast.
select * from mytable where document = '111/05257';
-- very fast, around 80 ms
-- used key: btree
select * from mytable where match(document) against ('+111/05257' in boolean mode)
-- very fast, around 100 ms
-- used key: fulltext
Given poorly structured data at document column, ranging from '1/XA' thru '5778292019' to 'S:NXA/0001/XA2019/111/05257', I need to use both exact and partial (fulltext) search over this column.
Wildcard searches ('%111/05257%') also perform terribly over btree index.
Any idea how to solve this?
Thank you all
Queries involving OR are notoriously hard to optimize. A common solution is to change them into two queries, and UNION the results:
select * from mytable
where match(document) against ('+111/05257' in boolean mode)
UNION
select * from mytable
where document = '111/05257';
Each of the respective queries should be free to use a different index. The UNION will eliminate any rows in common from the two results.
I have problem with MySQL ORDER BY, it slows down query and I really don't know why, my query was a little more complex so I simplified it to a light query with no joins, but it stills works really slow.
Query:
SELECT
W.`oid`
FROM
`z_web_dok` AS W
WHERE
W.`sent_eRacun` = 1 AND W.`status` IN(8, 9) AND W.`Drzava` = 'BiH'
ORDER BY W.`oid` ASC
LIMIT 0, 10
The table has 946,566 rows, with memory taking 500 MB, those fields I selecting are all indexed as follow:
oid - INT PRIMARY KEY AUTOINCREMENT
status - INT INDEXED
sent_eRacun - TINYINT INDEXED
Drzava - VARCHAR(3) INDEXED
I am posting screenshoots of explain query first:
The next is the query executed to database:
And this is speed after I remove ORDER BY.
I have also tried sorting with DATETIME field which is also indexed, but I get same slow query as with ordering with primary key, this started from today, usually it was fast and light always.
What can cause something like this?
The kind of query you use here calls for a composite covering index. This one should handle your query very well.
CREATE INDEX someName ON z_web_dok (Drzava, sent_eRacun, status, oid);
Why does this work? You're looking for equality matches on the first three columns, and sorting on the fourth column. The query planner will use this index to satisfy the entire query. It can random-access the index to find the first row matching your query, then scan through the index in order to get the rows it needs.
Pro tip: Indexes on single columns are generally harmful to performance unless they happen to match the requirements of particular queries in your application, or are used for primary or foreign keys. You generally choose your indexes to match your most active, or your slowest, queries. Edit You asked whether it's better to create specific indexes for each query in your application. The answer is yes.
There may be an even faster way. (Or it may not be any faster.)
The IN(8, 9) gets in the way of easily handling the WHERE..ORDER BY..LIMIT completely efficiently. The possible solution is to treat that as OR, then convert to UNION and do some tricks with the LIMIT, especially if you might also be using OFFSET.
( SELECT ... WHERE .. = 8 AND ... ORDER BY oid LIMIT 10 )
UNION ALL
( SELECT ... WHERE .. = 9 AND ... ORDER BY oid LIMIT 10 )
ORDER BY oid LIMIT 10
This will allow the covering index described by OJones to be fully used in each of the subqueries. Furthermore, each will provide up to 10 rows without any temp table or filesort. Then the outer part will sort up to 20 rows and deliver the 'correct' 10.
For OFFSET, see http://mysql.rjweb.org/doc.php/index_cookbook_mysql#or
I was running a query of this kind of query:
SELECT
-- fields
FROM
table1 JOIN table2 ON (table1.c1 = table.c1 OR table1.c2 = table2.c2)
WHERE
-- conditions
But the OR made it very slow so i split it into 2 queries:
SELECT
-- fields
FROM
table1 JOIN table2 ON table1.c1 = table.c1
WHERE
-- conditions
UNION
SELECT
-- fields
FROM
table1 JOIN table2 ON table1.c2 = table.c2
WHERE
-- conditions
Which works much better but now i am going though the tables twice so i was wondering if there was any further optimizations for instance getting set of entries that satisfies the condition (table1.c1 = table.c1 OR table1.c2 = table2.c2) and then query on it. That would bring me back to the first thing i was doing but maybe there is another solution i don't have in mind. So is there anything more to do with it or is it already optimal?
Splitting the query into two separate ones is usually better in MySQL since it rarely uses "Index OR" operation (Index Merge in MySQL lingo).
There are few items I would concentrate for further optimization, all related to indexing:
1. Filter the rows faster
The predicate in the WHERE clause should be optimized to retrieve the fewer number of rows. And, they should be analized in terms of selectivity to create indexes that can produce the data with the fewest filtering as possible (less reads).
2. Join access
Retrieving related rows should be optimized as well. According to selectivity you need to decide which table is more selective and use it as a driving table, and consider the other one as the nested loop table. Now, for the latter, you should create an index that will retrieve rows in an optimal way.
3. Covering Indexes
Last but not least, if your query is still slow, there's one more thing you can do: use covering indexes. That is, expand your indexes to include all the rows from the driving and/or secondary tables in them. This way the InnoDB engine won't need to read two indexes per table, but a single one.
Test
SELECT
-- fields
FROM
table1 JOIN table2 ON table1.c1 = table2.c1
WHERE
-- conditions
UNION ALL
SELECT
-- fields
FROM
table1 JOIN table2 ON table1.c2 = table2.c2
WHERE
-- conditions
/* add one more condition which eliminates the rows selected by 1st subquery */
AND table1.c1 != table2.c1
Copied from the comments:
Nico Haase > What do you mean by "test"?
OP shows query patterns only. So I cannot predict does the technique is effective or not, and I suggest OP to test my variant on his structure and data array.
Nico Haase > what you've changed
I have added one more condition to 2nd subquery - see added comment in the code.
Nico Haase > and why?
This replaces UNION DISTINCT with UNION ALL and eliminates combined rowset sorting for duplicates remove.
I have a query like this:
( SELECT * FROM mytable WHERE author_id = ? AND seen IS NULL )
UNION
( SELECT * FROM mytable WHERE author_id = ? AND date_time > ? )
Also I have these two indexes:
(author_id, seen)
(author_id, date_time)
I read somewhere:
A query can generally only use one index per table when process the WHERE clause
As you see in my query, there is two separated WHERE clause. So I want to know, "only one index per table" means my query can use just one of those two indexes or it can use one of those indexes for each subquery and both indexes are useful?
In other word, is this sentence true?
"always one of those index will be used, and the other one is useless"
That statement about only using one index is no longer true about MySQL. For instance, it implements the index merge optimization which can take advantage of two indexes for some where clauses that have or. Here is a description in the documentation.
You should try this form of your query and see if it uses index mer:
SELECT *
FROM mytable
WHERE author_id = ? AND (seen IS NULL OR date_time > ? );
This should be more efficient than the union version, because it does not incur the overhead of removing duplicates.
Also, depending on the distribution of your data, the above query with an index on mytable(author_id, date_time, seen) might work as well or better than your version.
UNION combines results of subqueries. Each subquery will be executed independent of others and then results will be merged. So, in this case WHERE limits are applied to each subquery and not to all united result.
In answer to your question: yes, each subquery can use some index.
There are cases when the database engine can use more indexes for one select statement, however when filtering one set of rows really it not possible. If you want to use indexing on two columns then build one index on both columns instead of two indexes.
Every single subquery or part of composite query is itself a query can be evaluated as single query for performance and index access .. you can also force the use of different index for eahc query .. In your case you are using union and these are two separated query .. united in a resulting query
. you can have a brief guide how mysql ue index .. acccessing at this guide
http://dev.mysql.com/doc/refman/5.7/en/mysql-indexes.html
This mysql query is runned on a large (about 200 000 records, 41 columns) myisam table :
select t1.* from table t1 where 1 and t1.inactive = '0' and (t1.code like '%searchtext%' or t1.name like '%searchtext%' or t1.ext like '%searchtext%' ) order by t1.id desc LIMIT 0, 15
id is the primary index.
I tried adding a multiple column index on all 3 searched (like) columns. works ok but results are served on a auto filled ajax table on a website and the 2 seond return delay is a bit too slow.
I also tried adding seperate indexes on all 3 columns and a fulltext index on all 3 columns without significant improvement.
What would be the best way to optimize this type of query? I would like to achieve under 1 sec performance, is it doable?
The best thing you can do is implement paging. No matter what you do, that IO cost is going to be huge. If you only return one page of records, 10/25/ or whatever that will help a lot.
As for the index, you need to check the plan to see if your index is actually being used. A full text index might help but that depends on how many rows you return and what you pass in. Using parameters such as % really drain performance. You can still use an index if it ends with % but not starts with %. If you put % on both sides of the text you are searching for, indexes can't help too much.
You can create a full-text index that covers the three columns: code, name, and ext. Then perform a full-text query using the MATCH() AGAINST () function:
select t1.*
from table t1
where match(code, name, ext) against ('searchtext')
order by t1.id desc
limit 0, 15
If you omit the ORDER BY clause the rows are sorted by default using the MATCH function result relevance value. For more information read the Full-Text Search Functions documentation.
As #Vulcronos notes, the query optimizer is not able to use the index when the LIKE operator is used with an expression that starts with a wildcard %.