I had MySQL EXPLAIN the following query:
SELECT carid, pic0, bio, url, site, applet
FROM cronjob_reloaded
WHERE carid LIKE '%bmw%'
   OR carid LIKE '%mer%'
   OR age BETWEEN '5' AND '10'
   OR category IN ('used')
ORDER BY CASE
    WHEN carid LIKE '%bmw%' OR carid LIKE '%mer%' THEN 1
    WHEN age BETWEEN '5' AND '10' THEN 2
    ELSE 3
END
And here is the explain result:
EXPLAIN SELECT carid, pic0, bio, url, site, applet
FROM cronjob_reloaded
WHERE carid LIKE '%bmw%'
   OR carid LIKE '%mer%'
   OR carid IS NOT NULL AND age BETWEEN '5' AND '10'
What I do not understand is this:
Why is the key NULL?
Can I make this query faster? It takes 0.0035 sec - is this slow or fast for a 1,000-row table?
In my table carid is the primary key of the table.
MySQL did not find any indexes to use for the query.
The speed of the query depends on your CPU and, for so few rows, also on available RAM, system load, and disk speed. You can use BENCHMARK() to run the query many times and time it with higher precision (e.g., execute it 100,000 times and divide the total time by 100,000).
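For illustration, here is a rough sketch of that timing trick. BENCHMARK() repeats a scalar expression, so wrapping the query in a single-value subquery is an assumption on my part, not something the original answer spelled out:

-- Rough sketch: BENCHMARK() repeats a scalar expression N times.
-- Wrapping the query in a one-value subquery is an assumption here;
-- the subquery must return at most one row and one column.
SELECT BENCHMARK(100000,
    (SELECT carid
     FROM cronjob_reloaded
     WHERE carid LIKE '%bmw%'
     LIMIT 1));
-- Divide the reported elapsed time by 100000 for the per-run cost.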
As for the indexing issue: your WHERE clause involves carid, age, and category (and, indirectly, performerid). You ought to index on category first (since you test it with a direct match), then age, and finally carid.
CREATE INDEX test_index ON cronjob_reloaded ( category, age, carid );
This brings together most of the information that MySQL needs for the WHERE phase of the query in a single index operation.
Adding performerid may speed this up, or not, depending on several factors. I'd start without it and maybe test later on.
Update: the original query seems to have changed, and no performerid appears anymore.
Finally, 1000 rows usually take so little time to scan that MySQL may decide not to use the index at all, since loading everything and letting the WHERE clause filter the rows directly is often faster.
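As a sketch, assuming the index above has been created, you can re-run EXPLAIN to see whether MySQL picks it up; on a table this small, key may well remain NULL:

-- Sketch: re-check the plan after creating test_index; MySQL may
-- still prefer a full scan on a 1000-row table.
EXPLAIN SELECT carid, pic0, bio, url, site, applet
FROM cronjob_reloaded
WHERE category IN ('used')
   OR age BETWEEN '5' AND '10'
   OR carid LIKE '%bmw%'
   OR carid LIKE '%mer%';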
As per the docs:
"If key is NULL, MySQL found no index to use for executing the query more efficiently."
See the official MySQL documentation on EXPLAIN output.
Edit:
Here are some links about indexes:
How MySQL indexes work - SO
How to create an index
Hope this helps!
I have a pagination query which does range index scan on a large table:
create table t_dummy (
  id int not null auto_increment,
  field1 varchar(255) not null,
  updated_ts timestamp null default null,
  primary key (id),
  key idx_name (updated_ts)
);
The query looks like this:
select * from t_dummy a
where a.field1 = 'VALUE'
  and (a.updated_ts > 'some time' or (a.updated_ts = 'some time' and a.id > x))
order by a.updated_ts, a.id
limit 100
The EXPLAIN plan shows a large cost, with the rows value being very high; however, it is using all the right indexes and the execution seems fast. Can someone please tell me whether this means the query is inefficient?
EXPLAIN can be misleading. It can report a high value for rows, despite the fact that MySQL optimizes LIMIT queries to stop once enough rows have been found that satisfy your requested LIMIT (100 in your case).
The problem is, at the time the query does the EXPLAIN, it doesn't necessarily know how many rows it will have to examine to find at least 100 rows that satisfy the conditions in your WHERE clause.
So you can usually ignore the rows field of the EXPLAIN when you have a LIMIT query. It probably won't really have to examine so many rows.
If the execution is fast enough, don't worry about it. If it is not, consider a (field1,updated_ts) index and/or changing your query to
and a.updated_ts >= 'some time' and (a.updated_ts > 'some time' or a.id > x)
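A sketch of both suggestions together; the index name and the fully assembled query are my own illustration of the idea, not from the original answer:

-- Sketch: composite index so the equality on field1 comes first,
-- then the range on updated_ts (index name is illustrative).
ALTER TABLE t_dummy ADD INDEX idx_field1_updated_ts (field1, updated_ts);

-- The rewritten predicate gives the optimizer a single range start
-- on updated_ts, while preserving the tie-break on id.
SELECT *
FROM t_dummy a
WHERE a.field1 = 'VALUE'
  AND a.updated_ts >= 'some time'
  AND (a.updated_ts > 'some time' OR a.id > x)
ORDER BY a.updated_ts, a.id
LIMIT 100;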
As Bill says, Explain cannot be trusted to take LIMIT into account.
The following will confirm that the query is touching only 100 rows:
FLUSH STATUS;
SELECT ...;
SHOW SESSION STATUS LIKE 'Handler%';
The Handler_read% values will probably add up to about 100. There will probably be no Handler_write% values -- they would indicate the creation of a temp table.
A tip: If you use LIMIT 101, you get the 100 rows to show, plus an indication of whether there are more rows. This, with very low cost, avoids having a [Next] button that sometimes brings up a blank page.
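Applied to the pagination query above, the idea looks roughly like this (a sketch using the question's t_dummy schema):

-- Sketch: fetch one row more than the page size; if 101 rows come
-- back, show 100 and enable the [Next] button.
SELECT *
FROM t_dummy a
WHERE a.field1 = 'VALUE'
ORDER BY a.updated_ts, a.id
LIMIT 101;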
My tips on the topic: http://mysql.rjweb.org/doc.php/pagination
I have a MySQL table with nearly 4,000,000 rows containing income transactions of more than 100,000 employees.
There are three columns relevant in it, which are:
Employee ID [VARCHAR and INDEX] (not unique since one employee gets more than one income);
Type of Income [also VARCHAR and INDEX]
Value of the Income [Decimal; 10,2]
What I was looking to do seems to be very simple to me. I wanted to sum all the income occurrences grouping by each employee, filtering by one type.
For that, I was using the following code:
SELECT
SUM(`value`) AS `SumofValue`,
`type`,
`EmployeeID`
FROM
`Revenue`
GROUP BY `EmployeeID`
HAVING `type` = 'X'
And the result was supposed to be something like this:
SUM          TYPE   EMPLOYEE ID
R$   250,00  X      250000008377
R$ 5.000,00  X      250000004321
R$ 3.200,00  X      250000005432
R$ 1.600,00  X      250000008765
....
However, this is taking a long time. I decided to use the LIMIT command to limit the results to just 1,000 rows, and it works, but if I want to do it for the whole table, it would take approximately 1 hour according to my projections. This seems to be way too much time for something that does not look so demanding to me (but I'm assuming I'm probably wrong). Not only that, but this is just the first step of an even more complex query that I intend to run in the future, in which I will also group by Employer ID, alongside Employee ID (one person can get income from more than one employer).
Is there any way to optimize this? Is there anything wrong with my code? Is there any secret path to increase the speed of this operation? Should I index the column of the value of the income as well? If this is a MySQL limitation, is there any option that could handle this better?
I would really appreciate any help.
Thanks in advance.
DISCLOSURE: This is an open government database. All this data is lawfully open to the public.
First, phrase the query using WHERE, rather than HAVING -- filter before doing the aggregation:
SELECT SUM(`value`) AS `SumofValue`,
MAX(type) as type,
EmployeeID
FROM Revenue r
WHERE `type` = 'X'
GROUP BY EmployeeID;
Next, try using this index: (type, EmployeeID, value). At the very least, this is a covering index for the query. MySQL (depending on the version) might be smart enough to use it for the aggregation as well.
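A sketch of that index as DDL (the index name is my own illustration):

-- Sketch: covering index for the rewritten query; name is illustrative.
CREATE INDEX idx_revenue_type_emp_value
    ON Revenue (`type`, EmployeeID, `value`);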
As per your schema, why are you using the VARCHAR datatype for Employee ID and Type?
You can create a reference table for Type (1 → X, 2 → Y, ...), so the transaction table stores an integer reference for the type.
Just create a dummy table like the one below and execute the same query that was taking an hour; you should see a major change in the execution plan as well.
CREATE TABLE test_transaction
(
Employee_ID BIGINT,
Type SMALLINT,
Income DECIMAL(10,2)
)
Create separate indexes on the Employee_ID and Type columns.
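For illustration, those indexes might look like this (the index names are assumptions, not from the original answer):

-- Sketch: separate single-column indexes on the dummy table.
CREATE INDEX idx_tt_employee_id ON test_transaction (Employee_ID);
CREATE INDEX idx_tt_type ON test_transaction (Type);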
I am having a problem with the following task using MySQL. I have a table Records(id, enterprise, department, status), where id is the primary key, enterprise and department are foreign keys, and status is an integer value (0 = CREATED, 1 = APPROVED, 2 = REJECTED).
Now, the application usually needs to filter by a concrete enterprise, department, and status:
SELECT * FROM Records WHERE status = 0 AND enterprise = 11 AND department = 21
ORDER BY id desc LIMIT 0,10;
The ORDER BY is required, since I have to provide the user with the most recent records. For this query I have created an index (enterprise, department, status), and everything works fine. However, for some privileged users the status should be omitted:
SELECT * FROM Records WHERE enterprise = 11 AND department = 21
ORDER BY id desc LIMIT 0,10;
This obviously breaks the index - it's still good for filtering, but not for sorting. So, what should I do? I don't want to create a separate index (enterprise, department), so what if I modify the query like this:
SELECT * FROM Records WHERE enterprise = 11 AND department = 21
AND status IN (0,1,2)
ORDER BY id desc LIMIT 0,10;
MySQL definitely does use the index now, since it's provided with values of status, but how quick will the sorting by primary key be? Will it take the 10 most recent values for each status and then merge them, or will it first merge the ids for all statuses together and only then take the first ten (in which case, I guess, it's going to be much slower)?
All of the queries will benefit from one composite index:
INDEX(enterprise, department, status, id)
enterprise and department can be swapped, but keep the rest of the columns in that order.
The first query will use that index for both the WHERE and the ORDER BY, thereby being able to find the 10 rows without scanning the table or doing a sort.
The second query is missing status, so my index is less than perfect. This would be better:
INDEX(enterprise, department, id)
At that point, it works like above. (Note: If the table is InnoDB, then this 3-column index is identical to your 2-column INDEX(enterprise, department) -- the PK is silently included.)
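As a sketch, both suggested indexes as DDL (the index names are my own illustration):

-- Sketch: the two composite indexes discussed above.
ALTER TABLE Records
  ADD INDEX idx_ent_dept_status_id (enterprise, department, status, id),
  ADD INDEX idx_ent_dept_id (enterprise, department, id);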
The third query gets dicier because of the IN. Still, my 4-column index will be nearly the best. It will use the first 3 columns, but it won't be able to use id for the ORDER BY, and it won't be able to consume the LIMIT. Hence the EXPLAIN will say Using temporary and/or Using filesort. Don't worry; performance should still be nice.
My second index is not as good for the third query.
See my Index Cookbook.
"How quick will sorting by id be"? That depends on two things.
Whether the sort can be avoided (see above);
How many rows in the query without the LIMIT;
Whether you are selecting TEXT columns.
I was careful to say whether the INDEX is used all the way through the ORDER BY, in which case there is no sort and the LIMIT is folded in. Otherwise, all the rows (after filtering) are written to a temp table, sorted, and then 10 rows are peeled off.
The "temp table" I just mentioned is necessary for various complex queries, such as those with subqueries, GROUP BY, ORDER BY. (As I have already hinted, sometimes the temp table can be avoided.) Anyway, the temp table comes in 2 flavors: MEMORY and MyISAM. MEMORY is favorable because it is faster. However, TEXT (and several other things) prevent its use.
If MEMORY is used then Using filesort is a misnomer -- the sort is really an in-memory sort, hence quite fast. For 10 rows (or even 100) the time taken is insignificant.
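To see which flavor you got, one hedged check uses session status counters, the same pattern as the Handler% check earlier:

FLUSH STATUS;
SELECT ...;  -- your query
SHOW SESSION STATUS LIKE 'Created_tmp%';
-- Created_tmp_disk_tables > 0 means the temp table spilled to disk
-- (MyISAM); otherwise it stayed in MEMORY.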
We have a table with about 25,000,000 rows called 'events' having the following schema:
TABLE events
- campaign_id : int(10)
- city : varchar(60)
- country_code : varchar(2)
The following query takes VERY long (> 2000 seconds):
SELECT COUNT(*) AS counted_events, country_code
FROM events
WHERE campaign_id IN (597)
GROUP BY city, country_code
ORDER BY counted_events
We found out that it's because of the GROUP BY part.
There is already an index idx_campaign_id_city_country_code on (campaign_id, city, country_code) which is used.
Maybe someone can suggest a good solution to speed it up?
Update:
EXPLAIN shows that out of many possible indexes MySQL uses this one: 'idx_campaign_id_city_country_code'; for rows it shows '471304', and for Extra it shows 'Using where; Using temporary; Using filesort'.
Here is the whole result of EXPLAIN:
id: '1'
select_type: 'SIMPLE'
table: 'events'
type: 'ref'
possible_keys: 'index_campaign,idx_campaignid_paid,idx_city_country_code,idx_city_country_code_campaign_id,idx_cid,idx_campaign_id_city_country_code'
key: 'idx_campaign_id_city_country_code'
key_len: '4'
ref: 'const'
rows: '471304'
Extra: 'Using where; Using temporary; Using filesort'
UPDATE:
Ok, I think it has been solved:
Looking at the pasted query here again, I realized that I forgot to mention that there was one more column in the SELECT called 'country_name'. The query was very slow with country_name included, but I'll just leave it out, and now the performance of the query is absolutely OK.
Sorry for that mistake!
So thank you for all your helpful comments; I'll upvote all the good answers! There were some really helpful suggestions that I will probably also apply (like changing types, etc.).
Without seeing what EXPLAIN says, this is a long shot, but anyway:
make an index on (city,country_code)
see if there's a way to use partitioning, your table is getting rather huge
if country code is always 2 chars, change it to CHAR (see the sketch after this list)
change indexed numeric columns to UNSIGNED INT
post entire EXPLAIN output
don't use IN() - better use:
WHERE campaign_id = 597
OR campaign_id = 231
OR ....
AFAIK IN() is very slow.
Update: as nik0lias commented, IN() is actually faster than chaining OR conditions.
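A sketch of the CHAR / UNSIGNED suggestions from the list above (column types are assumed from the question's schema; verify against your actual table and any foreign keys before running):

-- Sketch: schema tweaks from the list above; illustrative only.
ALTER TABLE events
  MODIFY country_code CHAR(2),
  MODIFY campaign_id INT UNSIGNED;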
Some ideas:
Given the nature and size of the table, it would be a great candidate for partitioning by country. This way the events of every country would be stored in a different physical table, even though it behaves as one big virtual table.
Is country code a string? Maybe you have a country_id that would be easier to sort on. (It may force you to create or change indexes.)
Are you really using the city in the GROUP BY?
Partitioning - especially by country - will not help.
column IN (const-list) is not slow; it is in fact a case with a special optimization.
The problem is that MySQL doesn't use the index for sorting. I cannot say why, because it should. It could be a bug.
The best strategy to execute this query is to scan the sub-tree of the index where campaign_id = 597. Since the index is then sorted by city, country_code, no extra sorting is needed and rows can be counted while scanning.
So the indexes are already optimal for this query. MySQL is just not using them correctly.
I'm getting more information offline. It seems this is not a database problem at all, but rather:
the schema is not normalized. The table contains not only country_code but also country_name (which should be in a separate table);
the real query contains country_name in the select list. Since that column is not in the index, MySQL cannot use an index-only scan.
As soon as country_name is dropped from the select list, the query reverts to an index-only scan ("Using index" in the EXPLAIN output) and is blazingly fast.
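For illustration, the fast variant is simply the original query with country_name removed; with the (campaign_id, city, country_code) index in place, EXPLAIN's Extra column should then include "Using index":

EXPLAIN
SELECT COUNT(*) AS counted_events, country_code
FROM events
WHERE campaign_id IN (597)
GROUP BY city, country_code
ORDER BY counted_events;
-- Extra should now show "Using index" (an index-only scan).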
I am running the following query and however I change it, it still takes almost 5 seconds to run which is completely unacceptable...
The query:
SELECT cat1, cat2, cat3, PRid, title, genre, artist, author, actors, imageURL,
lowprice, highprice, prodcatID, description
from products
where title like '%' AND imageURL <> '' AND cat1 = 'Clothing and accessories'
order by userrating desc
limit 500
I've tried taking out the LIKE '%', taking out the imageURL <> '', but it's still the same. I've tried returning only 1 column; still the same.
I have indexes on almost every column in the table, certainly all the columns mentioned in the query.
This is basically for a category listing. If I do a fulltext search for something in the title column which has a fulltext index, it takes less than a second.
Should I add another fulltext index to column cat1 and change the query focus to "match against" on that column?
Am I expecting too much?
The table has just short of 3 million rows.
You said you had an index on every column. Do you have an index such as this one?
alter table products add index (cat1, userrating)
If you don't, give it a try. Run the query and let me know if it runs faster.
Also, I assume you're actually setting some kind of filter instead of the % on the title field, right?
You should rather have cat1 as an integer than a string in these 3 million rows. You must also index correctly; if indexing every column only ever improved things, the system would do it by default.
Apart from that, title LIKE '%' doesn't do anything. I guess you use it for searching, so that it becomes title LIKE 'search%'.
Do you use any sort of framework to fetch this? Getting 500 rows with a lot of columns can exhaust the system if your framework saves them all into a large array. That may well not be the case, but:
Try running an ordinary $query = mysql_query() and while($row = mysql_fetch_object($query)).
I suggest adding an index on the queried columns: title, imageURL, and cat1.
Second improvement: use the MySQL query cache; it will dramatically improve the speed.
Last improvement: if your query is always like this and only the values change, use prepared statements.
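A sketch of the prepared-statement idea using the question's query (the statement name and parameter variable are illustrative):

-- Sketch: server-side prepared statement; only the cat1 value changes
-- between executions.
PREPARE category_listing FROM
  'SELECT cat1, cat2, cat3, PRid, title
   FROM products
   WHERE imageURL <> '''' AND cat1 = ?
   ORDER BY userrating DESC
   LIMIT 500';
SET @cat = 'Clothing and accessories';
EXECUTE category_listing USING @cat;
DEALLOCATE PREPARE category_listing;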
Well, I am quite sure that a % as the first char in a LIKE clause gives you a full table scan for that column (in your case that full scan won't be executed, because you already have restricting clauses ANDed in).
Besides that, try adding an index on the cat1 column. Also, try adding other criteria to your query to reduce the size of your working data set (the number of rows that match your query without the LIMIT clause), which might also be too big.