Due to BigData I want to use Match against in place of like. My Column is FULL INDEXED.
What is the alternate of this Query, in Match against.
MySQL Query is:
select count(*) from keywords where sb_keyword like 'a%'
Is this exactly what the query is?
select count(*) from keywords where sb_keyword like 'a%'
That should benefit from INDEX(sb_keyword). A FULLTEXT index is not practical for this query, either as it stands or using WHERE MATCH(sb_keyword) AGAINST(+a* IN BOOLEAN MODE).
It will take time to walk through all the values starting with a to count them. The index I suggested helps because and index is (usually) smaller then the entire dataset due to having fewer 'columns'.
Related
I have a query like this:
SELECT * FROM staffs
WHERE MATCH staff_name AGAINST ('johnny')
ORDER BY staff_city ASC
Just an example, I want to ask which Index should I use here. For the MATCH() and AGAINST() there is FULLTEXT index on column staff_name, that's okay. But in the query there is also ORDER BY on the staff_city column. The FULLTEXT works fast, but when it comes to ordering the matched results, the search is slower. What INDEX should need there?
MySQL can never (almost never) use two indexes in a single SELECT. The Optimizer picks from among the indexes you have, and it usually picks the best for the query.
For this query, only the FULLTEXT index you have will be used, regardless of the other indexes the table has.
The other index might be useful for some other query.
More: Assuming you care only about rows with the word 'johnny' in it, change:
AGAINST ('johnny')
-->
AGAINST ('+johnny' IN BOOLEAN MODE)
I have a MySQL query where I want to use a MATCH AGAINST on the result of a WITH clause, like this:
WITH (select stuff from somewhere)
WHERE MATCH(somefield) AGAINST(' +"term_a" +"term_b"' IN BOOLEAN MODE)
However, that generates an error [HY000][1214] The used table type doesn't support FULLTEXT indexes
Is there a way to use MATCH against the results of a WITH clause? The MATCH/AGAINST does work against the tables themselves if I unroll the WITH, so that's not a problem. It's only a problem when I use the WITH.
I have a query like this:
( SELECT * FROM mytable WHERE author_id = ? AND seen IS NULL )
UNION
( SELECT * FROM mytable WHERE author_id = ? AND date_time > ? )
Also I have these two indexes:
(author_id, seen)
(author_id, date_time)
I read somewhere:
A query can generally only use one index per table when process the WHERE clause
As you see in my query, there is two separated WHERE clause. So I want to know, "only one index per table" means my query can use just one of those two indexes or it can use one of those indexes for each subquery and both indexes are useful?
In other word, is this sentence true?
"always one of those index will be used, and the other one is useless"
That statement about only using one index is no longer true about MySQL. For instance, it implements the index merge optimization which can take advantage of two indexes for some where clauses that have or. Here is a description in the documentation.
You should try this form of your query and see if it uses index mer:
SELECT *
FROM mytable
WHERE author_id = ? AND (seen IS NULL OR date_time > ? );
This should be more efficient than the union version, because it does not incur the overhead of removing duplicates.
Also, depending on the distribution of your data, the above query with an index on mytable(author_id, date_time, seen) might work as well or better than your version.
UNION combines results of subqueries. Each subquery will be executed independent of others and then results will be merged. So, in this case WHERE limits are applied to each subquery and not to all united result.
In answer to your question: yes, each subquery can use some index.
There are cases when the database engine can use more indexes for one select statement, however when filtering one set of rows really it not possible. If you want to use indexing on two columns then build one index on both columns instead of two indexes.
Every single subquery or part of composite query is itself a query can be evaluated as single query for performance and index access .. you can also force the use of different index for eahc query .. In your case you are using union and these are two separated query .. united in a resulting query
. you can have a brief guide how mysql ue index .. acccessing at this guide
http://dev.mysql.com/doc/refman/5.7/en/mysql-indexes.html
This mysql query is runned on a large (about 200 000 records, 41 columns) myisam table :
select t1.* from table t1 where 1 and t1.inactive = '0' and (t1.code like '%searchtext%' or t1.name like '%searchtext%' or t1.ext like '%searchtext%' ) order by t1.id desc LIMIT 0, 15
id is the primary index.
I tried adding a multiple column index on all 3 searched (like) columns. works ok but results are served on a auto filled ajax table on a website and the 2 seond return delay is a bit too slow.
I also tried adding seperate indexes on all 3 columns and a fulltext index on all 3 columns without significant improvement.
What would be the best way to optimize this type of query? I would like to achieve under 1 sec performance, is it doable?
The best thing you can do is implement paging. No matter what you do, that IO cost is going to be huge. If you only return one page of records, 10/25/ or whatever that will help a lot.
As for the index, you need to check the plan to see if your index is actually being used. A full text index might help but that depends on how many rows you return and what you pass in. Using parameters such as % really drain performance. You can still use an index if it ends with % but not starts with %. If you put % on both sides of the text you are searching for, indexes can't help too much.
You can create a full-text index that covers the three columns: code, name, and ext. Then perform a full-text query using the MATCH() AGAINST () function:
select t1.*
from table t1
where match(code, name, ext) against ('searchtext')
order by t1.id desc
limit 0, 15
If you omit the ORDER BY clause the rows are sorted by default using the MATCH function result relevance value. For more information read the Full-Text Search Functions documentation.
As #Vulcronos notes, the query optimizer is not able to use the index when the LIKE operator is used with an expression that starts with a wildcard %.
I started looking into Index(es) in depth for the first time and started analyzing our db beginning from the users table for the first time. I searched SO to find a similar question but was not able to frame my search well, I guess.
I was going through a particular concept and this first observation left me wondering - The difference in these Explain(s) [Difference : First query is using 'a%' while the second query is using 'ab%']
[Total number of rows in users table = 9193]:
1) explain select * from users where email_address like 'a%';
(Actually matching columns = 1240)
2) explain select * from users where email_address like 'ab%';
(Actually matching columns = 109)
The index looks like this :
My question:
Why is the index totally ignored in the first query? Does mySql think that it is a better idea not to use the index in the case 1? If yes, why?
If the probability, based statistics mysql collects on distribution of the values, is above a certain ratio of the total rows (typically 1/11 of the total), mysql deems it more efficient to simply scan the whole table reading the disks pages in sequentially, rather than use the index jumping around the disk pages in random order.
You could try your luck with this query, which may use the index:
where email_address between 'a' and 'az'
Although doing the full scan may actually be faster.
This is not a direct answer to your question but I still want to point it out (in case you already don't know):
Try:
explain select email_address from users where email_address like 'a%';
explain select email_address from users where email_address like 'ab%';
MySQL would now use indexes in both the queries above since the columns of interest are directly available from the index.
Probably in the case where you do a "select *", index access is more costly since the optmizer has to go through the index records, find the row ids and then go back to the table to retrieve other column values.
But in the query above where you only do a "select email_address", the optmizer knows all the information desired is available right from the index and hence it would use the index irrespective of the 30% rule.
Experts, please correct me if I am wrong.