I am running the following query, and however I change it, it still takes almost 5 seconds to run, which is completely unacceptable...
The query:
SELECT cat1, cat2, cat3, PRid, title, genre, artist, author, actors, imageURL,
lowprice, highprice, prodcatID, description
from products
where title like '%' AND imageURL <> '' AND cat1 = 'Clothing and accessories'
order by userrating desc
limit 500
I've tried taking out the "like '%'", taking out the "imageURL <> ''", but it's still the same. I've tried returning only 1 column, still the same.
I have indexes on almost every column in the table, certainly all the columns mentioned in the query.
This is basically for a category listing. If I do a fulltext search for something in the title column which has a fulltext index, it takes less than a second.
Should I add another fulltext index to column cat1 and change the query focus to "match against" on that column?
Am I expecting too much?
The table has just short of 3 million rows.
You said you had an index on every column. Do you have a composite index such as this?
alter table products add index (cat1, userrating)
If you don't, give it a try. Run that query and let me know if it runs faster.
Also, I assume you're actually setting some kind of filter instead of the % on the title field, right?
You should rather have cat1 as an integer than a string across these 3 million rows. You must also index correctly. If indexing every column only ever improved things, the system would do it by default.
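A rough sketch of what that could look like (the categories table, the cat1_id column, and the index name are assumptions, not your actual schema):
-- hypothetical lookup table for the category strings
CREATE TABLE categories (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL UNIQUE
);
INSERT INTO categories (name) SELECT DISTINCT cat1 FROM products;
-- integer column replacing the cat1 string, indexed together with userrating
ALTER TABLE products ADD COLUMN cat1_id INT UNSIGNED,
                     ADD INDEX idx_cat1_rating (cat1_id, userrating);
UPDATE products p JOIN categories c ON c.name = p.cat1 SET p.cat1_id = c.id;
The query would then filter on cat1_id instead of comparing a string on every row.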
Apart from that, title LIKE '%' doesn't do anything. I guess you use it for searching, so that it becomes title LIKE 'search%'.
Do you use any sort of framework to fetch this? Getting 500 rows with a lot of columns can exhaust the system if your framework saves them all to a large array. It's probably not the case, but:
Try running an ordinary $query = mysql_query() and while($row = mysql_fetch_object($query)).
I suggest adding an index on the columns queried: title, imageURL and cat1.
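A sketch of such an index (the index name and prefix lengths are assumptions; long string columns need a prefix length, and putting the equality-tested cat1 first lets MySQL actually use it):
ALTER TABLE products ADD INDEX idx_cat_img_title (cat1, imageURL(64), title(64));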
Second improvement: use the MySQL query cache; it will dramatically improve the speed.
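Note the query cache only exists on MySQL versions before 8.0 (it was removed in 8.0). There you can check whether it is enabled, and hint a statement into it, like this:
SHOW VARIABLES LIKE 'query_cache%';
SELECT SQL_CACHE cat1, title FROM products WHERE cat1 = 'Clothing and accessories';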
Last improvement: if your query is always like that and only the values change, use prepared statements.
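A minimal sketch with MySQL's server-side prepared statements (the trimmed column list is just for brevity):
PREPARE stmt FROM 'SELECT PRid, title FROM products WHERE cat1 = ? ORDER BY userrating DESC LIMIT 500';
SET @cat = 'Clothing and accessories';
EXECUTE stmt USING @cat;
DEALLOCATE PREPARE stmt;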
Well, I am quite sure that a % as the first character in a LIKE clause gives you a full table scan for that column (in your case that full scan won't actually be executed, because you already have restricting clauses ANDed in).
Besides that, try adding an index on the cat1 column. Also, try adding other criteria to your query in order to reduce the size of your working data set (the number of rows that match your query, without the LIMIT clause) - it might simply be too big.
I am using a MySQL database with a table named license_csv, which has 18 million records and uses the MyISAM engine. When I count records without a WHERE condition, the count takes less than 1 second, but when I apply the WHERE condition below it takes 2 minutes. I have added indexes on the two columns lic_state and lic_city, but it still takes 2 minutes. Can anyone please help me with what I need to do for this query? Here is my query:
SELECT COUNT(*) as total
from license_csv WHERE
lic_state like '%ca%' AND lic_city LIKE '%fresno%'
The problem with your query is that you are using the % wildcard on both sides of the string. This defeats any existing index. Because indexes are ordered, MySQL cannot use them for this type of condition, since there is nothing to tell it where to start searching.
Possible options:
if possible, switch to equality instead of LIKE; you could then take advantage of an index on (lic_state, lic_city) - see the sketch after this list
else, removing the wildcard on the left side (lic_state like 'ca%') could help
switch to FULL TEXT search
store the list of state codes and cities in another table (which should be much smaller than the licenses table), and have a column in the licenses table referencing its primary key; then JOIN that table in the search. With this option, the range of records to search when applying the search condition will be much smaller
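A sketch of the first option (the index name is an assumption, and the exact state/city values must match what is actually stored):
ALTER TABLE license_csv ADD INDEX idx_state_city (lic_state, lic_city);
SELECT COUNT(*) AS total
FROM license_csv
WHERE lic_state = 'ca' AND lic_city = 'fresno';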
This MySQL query is run on a large MyISAM table (about 200 000 records, 41 columns):
select t1.* from table t1 where 1 and t1.inactive = '0' and (t1.code like '%searchtext%' or t1.name like '%searchtext%' or t1.ext like '%searchtext%' ) order by t1.id desc LIMIT 0, 15
id is the primary index.
I tried adding a multiple-column index on all 3 searched (LIKE) columns. It works OK, but the results are served to an auto-filled AJAX table on a website, and the 2 second return delay is a bit too slow.
I also tried adding separate indexes on all 3 columns and a fulltext index on all 3 columns, without significant improvement.
What would be the best way to optimize this type of query? I would like to achieve under 1 sec performance, is it doable?
The best thing you can do is implement paging. No matter what you do, that IO cost is going to be huge. If you only return one page of records - 10, 25, or whatever - that will help a lot.
As for the index, you need to check the query plan to see if your index is actually being used. A full-text index might help, but that depends on how many rows you return and what you pass in. Wildcards like % really drain performance: an index can still be used if the pattern ends with % but not if it starts with %, and with % on both sides of the search text, indexes can't help much.
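For instance, prefixing the query from the question with EXPLAIN shows which index, if any, MySQL picks (the placeholder table name is backquoted here because TABLE is a reserved word):
EXPLAIN SELECT t1.*
FROM `table` t1
WHERE t1.inactive = '0'
  AND (t1.code LIKE '%searchtext%' OR t1.name LIKE '%searchtext%' OR t1.ext LIKE '%searchtext%')
ORDER BY t1.id DESC
LIMIT 0, 15;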
You can create a full-text index that covers the three columns: code, name, and ext. Then perform a full-text query using the MATCH() ... AGAINST() function:
select t1.*
from table t1
where match(code, name, ext) against ('searchtext')
order by t1.id desc
limit 0, 15
If you omit the ORDER BY clause, the rows are sorted by default on the relevance value returned by the MATCH function. For more information, read the Full-Text Search Functions documentation.
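For reference, creating the full-text index used above might look like this (the index name is an assumption, and the placeholder table name is backquoted because TABLE is a reserved word):
ALTER TABLE `table` ADD FULLTEXT INDEX ft_code_name_ext (code, name, ext);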
As @Vulcronos notes, the query optimizer is not able to use the index when the LIKE operator is used with an expression that starts with the wildcard %.
I started looking into indexes in depth for the first time and began analyzing our db, starting with the users table. I searched SO to find a similar question but was not able to frame my search well, I guess.
I was going through a particular concept, and this first observation left me wondering about the difference between these EXPLAINs [the difference: the first query uses 'a%' while the second query uses 'ab%']
[Total number of rows in users table = 9193]:
1) explain select * from users where email_address like 'a%';
(Actual matching rows = 1240)
2) explain select * from users where email_address like 'ab%';
(Actual matching rows = 109)
The index (shown as a screenshot in the original post) is on the email_address column.
My question:
Why is the index totally ignored in the first query? Does MySQL think it is a better idea not to use the index in case 1? If yes, why?
If the estimated share of matching rows, based on the statistics MySQL collects about the distribution of values, is above a certain ratio of the total rows (typically around 1/11 of the total), MySQL deems it more efficient to simply scan the whole table, reading the disk pages sequentially, rather than use the index and jump around the disk pages in random order.
You could try your luck with this query, which may use the index:
where email_address >= 'a' and email_address < 'b'
Although doing the full scan may actually be faster.
This is not a direct answer to your question, but I still want to point it out (in case you don't already know):
Try:
explain select email_address from users where email_address like 'a%';
explain select email_address from users where email_address like 'ab%';
MySQL would now use indexes in both the queries above since the columns of interest are directly available from the index.
Probably in the case where you do a "select *", index access is more costly, since the optimizer has to go through the index records, find the row ids, and then go back to the table to retrieve the other column values.
But in the queries above where you only do a "select email_address", the optimizer knows all the desired information is available right from the index, and hence it will use the index irrespective of the 30% rule.
Experts, please correct me if I am wrong.
I'm hosted at GoDaddy and they are giving me a fit about a query that I have, which is understandable. I am wondering if there is a better way to rewrite this query to make it resource friendly:
SELECT count(*) as count
FROM `aaa_db`
WHERE `callerid` LIKE '16602158641' AND `clientid` = '41'
I have over 1 million rows, with duplicates, so I wrote a small script to output the duplicates and delete them rather than changing tables etc.
Please help.
Having separate indexes on clientid and callerid causes MySQL to use just one index, or in some cases, attempt an index merge (MySQL 5.0+). Neither of these is as efficient as a multi-column index.
Creating a multi-column index on both the callerid and clientid columns will relieve CPU and disk IO resources (no table scan), though it will increase both disk storage and RAM usage. I guess you could give it a shot and see if GoDaddy prefers that over the other.
The first thing I would try is to ensure you have an index on the clientid column, then rewrite your query to filter on clientid first; this should remove rows from consideration, speeding up your query. Also, as Marcus stated, adding a multi-column index will be faster, but remember to add it as (clientid, callerid), as MySQL reads indexes from left to right.
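A sketch of that index (the index name is an assumption), followed by the rewritten query:
ALTER TABLE aaa_db ADD INDEX idx_client_caller (clientid, callerid);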
SELECT count(*) as count
FROM aaa_db
WHERE clientid = 41 and callerid LIKE '16602158641';
Notice I removed the quotes from the clientid value, as it should be an int datatype; if it is not, converting it to an int datatype should also be faster. Also, I am not sure why you are doing a LIKE without the wildcard operator; if you could change that to an =, it will also help.
I like Marcus' answer, but I would first try a single index on just callerid, because with values like 16602158641, how many matching rows can there be out of a million? Not very many, so the single index could match or exceed the performance of the double index.
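If you want to try that, the single-column index might look like this (the index name is an assumption):
ALTER TABLE aaa_db ADD INDEX idx_callerid (callerid);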
Also, remove LIKE and use =:
SELECT count(*) as count
FROM aaa_db
WHERE callerid = '16602158641' -- equals instead of like
AND clientid = '41'
I am trying to include a limit in a MySQL SELECT query.
My database is structured in such a way that if a name is found in column one, then at most 5000 records with the same name can follow it.
Example:
mark
..mark repeated 5000 times
john
anna
..millions of other names
So in this table it would be more efficient to find the first 'mark' and continue searching a maximum of 5000 rows down from that one.
Is it possible to do something like this?
Just make a btree index on the name column:
CREATE INDEX name ON your_table(name) USING BTREE
and MySQL will silently do exactly what you want each time it looks for a name.
Try with:
SELECT name
FROM table
ORDER BY (name = 'mark') DESC
LIMIT 5000
Basically, you sort 'mark' first, then the rest follow, and the limit cuts the result off.
It's actually quite difficult to understand your desired output... but I think this might be heading in the right direction:
(SELECT name
FROM table
WHERE name = 'mark'
LIMIT 5000)
UNION ALL
(SELECT name
FROM table
WHERE name != 'mark'
ORDER BY name)
This will first get up to 5000 records with the name 'mark', then get the remainder - you can add a limit to the second query if required. Note the UNION ALL: a plain UNION would de-duplicate the repeated names.
For performance you should ensure that the columns used by ORDER BY and WHERE are indexed accordingly
If you make sure that the column is properly indexed, MySQL will take care of the optimisation for you.
Edit:
Thinking about it, I figured that this answer is only useful if I specify how to do that. User nobody beat me to the punch: CREATE INDEX name ON your_table(name) USING BTREE
This is exactly what database indexes are designed to do; this is what they are for. MySQL will use the index itself to optimise the search.