I have the EXPLAIN result below. How do I calculate the total number of examined rows? Please explain in detail. (This is my first question, so if I have made any mistakes, please correct me. I will be very grateful.)
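+----+-------------+--------+---------------+---------+---------+
| id | select_type | type   | possible_keys | key_len | rows    |
+----+-------------+--------+---------------+---------+---------+
| 1  | PRIMARY     | ALL    |               |         | 1423656 |
| 1  | PRIMARY     | eq_ref | PRIMARY       | 8       | 1       |
| 1  | PRIMARY     | ref    |               | 152     | 1       |
| 1  | PRIMARY     | ALL    |               |         | 138     |
| 1  | PRIMARY     | ALL    |               |         | 1388    |
| 1  | PRIMARY     | ALL    |               |         | 1564    |
| 3  | DERIVED     | ALL    |               |         | 1684    |
| 3  | DERIVED     | eq_ref | PRIMARY       | 8       | 1       |
| 2  | DERIVED     | ALL    |               |         | 141     |
+----+-------------+--------+---------------+---------+---------+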
From the manual: https://dev.mysql.com/doc/refman/5.7/en/explain-output.html
rows (JSON name: rows)
The rows column indicates the number of rows MySQL believes it must
examine to execute the query.
For InnoDB tables, this number is an estimate, and may not always be
exact.
You have a very high estimate of 1.4 million rows for one of your tables, but the possible_keys column is empty for it. That means this is a table that is desperately crying out to be indexed.
A large number of rows to be examined means just that: MySQL needs to read all those rows to give you your result.
If you had posted your tables and your query, we could have helped you figure out what those indexes ought to be.
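To address the "how to calculate" part directly: the MySQL manual notes that multiplying the rows values of the joined tables gives a rough indication of how many row combinations a join has to examine. A minimal sketch, assuming the six id = 1 rows above form a single nested-loop join and that the DERIVED rows (ids 2 and 3) are materialized separately beforehand:

```sql
-- Rough estimate only: product of the rows estimates for the six id = 1 tables.
-- These are optimizer estimates, so the result is an order-of-magnitude guide.
SELECT 1423656 * 1 * 1 * 138 * 1388 * 1564 AS estimated_row_combinations;

-- The derived tables are estimated on their own:
SELECT 1684 * 1 AS derived3_estimate, 141 AS derived2_estimate;
```

The first product is on the order of 10^14, which is why the unindexed tables (type ALL) dominate the cost.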
Related
I have a datehelper table with every YYYY-MM-DD as DATE between the years 2000 and 2100. To this I'm joining a subquery for all unit transactions. unit.end is a DATETIME so my subquery simplifies it to DATE and uses that to join to the datehelper table.
In 5.6 this query takes a couple of seconds to run over a massive number of transactions. It derives a table that is auto-keyed on the DATE(unit.end) from the subquery and uses that key to join everything else fairly quickly.
In 5.7, it takes 600+ seconds and I can't get it to derive a table or follow the much better execution plan that 5.6 used. Is there a flag I need to set or some way to prefer the old execution plan?
Here's the query:
EXPLAIN SELECT datehelper.id AS date, MONTH(datehelper.id)-1 AS month, DATE_FORMAT(datehelper.id,'%d')-1 AS day,
    IFNULL(SUM(a.total),0) AS total, IFNULL(SUM(a.tax),0) AS tax, IFNULL(SUM(a.notax),0) AS notax
FROM datehelper
LEFT JOIN
    (SELECT
        DATE(unit.end) AS endDate,
        getFinalPrice(unit.id) AS total, tax, getFinalPrice(unit.id)-tax AS notax
    FROM unit
    INNER JOIN products ON products.id=unit.productID
    INNER JOIN prodtypes FORCE INDEX(primary) ON prodtypes.id=products.prodtypeID
    WHERE franchiseID='1' AND void=0 AND checkout=1
        AND end BETWEEN '2020-01-01' AND DATE_ADD('2020-01-01', INTERVAL 1 YEAR)
        AND products.prodtypeID NOT IN (1,10)
    ) AS a ON a.endDate=datehelper.id
WHERE datehelper.id BETWEEN '2020-01-01' AND '2020-12-31'
GROUP BY datehelper.id ORDER BY datehelper.id;
5.6 result (much faster):
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY datehelper range PRIMARY PRIMARY 3 NULL 365 Using where; Using index
1 PRIMARY <derived2> ref <auto_key0> <auto_key0> 4 datehelper.id 10 NULL
2 DERIVED prodtypes index PRIMARY PRIMARY 4 NULL 10 Using where; Using index
2 DERIVED products ref PRIMARY,prodtypeID prodtypeID 4 prodtypes.id 9 Using index
2 DERIVED unit ref productID,end,void,franchiseID productID 9 products.id 2622 Using where
5.7 result (much slower, no auto key found):
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE datehelper NULL range PRIMARY PRIMARY 3 NULL 366 100.00 Using where; Using index
1 SIMPLE unit NULL ref productID,end,void,franchiseID franchiseID 4 const 181727 100.00 Using where
1 SIMPLE products NULL eq_ref PRIMARY,prodtypeID PRIMARY 8 barkops3.unit.productID 1 100.00 Using where
1 SIMPLE prodtypes NULL eq_ref PRIMARY PRIMARY 4 barkops3.products.prodtypeID 1 100.00 Using index
I found the problem. It was the optimizer_switch 'derived_merge' flag which is new to 5.7.
https://dev.mysql.com/doc/refman/5.7/en/derived-table-optimization.html
When this flag is on (the default in 5.7), the optimizer merges an eligible derived table into the outer query block instead of materializing it. In this case, the merged plan was enormously more costly than joining a materialized table on an auto_key.
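For reference, a minimal sketch of how the flag can be inspected and switched off for one session. It assumes you are allowed to change session variables and want to leave the global setting untouched:

```sql
-- Show the current optimizer flags; look for derived_merge=on.
SELECT @@optimizer_switch;

-- Turn derived-table merging off for the current session only.
SET SESSION optimizer_switch = 'derived_merge=off';

-- Re-run the EXPLAIN from the question here. With merging off, the plan
-- should show a <derived2> table again rather than a single SIMPLE block.

-- Restore the flag afterwards if other queries benefit from merging.
SET SESSION optimizer_switch = 'derived_merge=on';
```

Changing it per session keeps the 5.7 behaviour for every other query on the server.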
There is a table tb_tag_article defined like this:
CREATE TABLE `tb_tag_article` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`tag_id` int(16) NOT NULL,
`article_id` int(16) NOT NULL,
PRIMARY KEY (`id`),
KEY `key_tag_id_article_id` (`tag_id`,`article_id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=365944 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
When I run this query, the result is 5120.
SELECT count(*) FROM tb_tag_article WHERE tag_id = 43
But when I explain the same query like this
EXPLAIN SELECT count(*) FROM tb_tag_article WHERE tag_id = 43
the number of examined rows is 13634.
+------+-------------+----------------+------+-----------------------+-----------------------+---------+-------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+----------------+------+-----------------------+-----------------------+---------+-------+-------+-------------+
| 1 | SIMPLE | tb_tag_article | ref | key_tag_id_article_id | key_tag_id_article_id | 4 | const | 13634 | Using index |
+------+-------------+----------------+------+-----------------------+-----------------------+---------+-------+-------+-------------+
The query uses the index, but the number of examined rows is greater than the actual row count.
What's the problem?
Q: What's the problem?
A: It doesn't look like there's any problem.
The value for the "rows" column in the EXPLAIN output is an estimate, not an exact number.
Ref: http://dev.mysql.com/doc/refman/5.5/en/explain-output.html
To evaluate the "cost" of each possible access path, the optimizer only needs estimates, so that it can compare the efficiency of a range scan on an index with a full scan of all rows in the table. The optimizer doesn't need "exact" counts of the total number of rows in the table, or of the number of rows that will satisfy a predicate.
For this simple query, there are only a couple of possible plans that MySQL will consider.
And that estimate of 13634 isn't that far off from the exact count of 5120 rows. It's off by a factor of about 2.7, but MySQL is coming up with the right execution plan: using the index, rather than checking every row in the table.
There is no problem.
From MySQL reference (http://dev.mysql.com/doc/refman/5.0/en/explain-output.html#explain_rows):
The rows column indicates the number of rows MySQL believes it must
examine to execute the query.
For InnoDB tables, this number is an estimate, and may not always be
exact.
It might also be that because it has to analyze the index it takes into account the number of records in the index plus the number of records in the table. But that's just a hypothesis.
Also, it looks like there was a bug in MySQL 5.1 with Explain Row estimation causing the number to be wildly off: http://bugs.mysql.com/bug.php?id=53761 Depending on your version, this might explain some of the oddities.
The main takeaway from the documentation seems to be to treat the EXPLAIN rows column with a grain of salt.
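If you want to see how far the estimate is from reality, a minimal sketch along these lines may help. It assumes the table being queried is tb_tag_article, as the EXPLAIN output suggests, and refreshing statistics may or may not change the estimate:

```sql
-- Recompute the index statistics the optimizer relies on.
ANALYZE TABLE tb_tag_article;

-- The Cardinality column shown here is itself an estimate for InnoDB.
SHOW INDEX FROM tb_tag_article;

-- Compare the optimizer's estimate with the exact count.
EXPLAIN SELECT COUNT(*) FROM tb_tag_article WHERE tag_id = 43;
SELECT COUNT(*) FROM tb_tag_article WHERE tag_id = 43;
```

As long as the plan still uses key_tag_id_article_id, a gap between the two numbers is not by itself a reason to change anything.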
Below is my query to get 20 rows with genre_id 1.
EXPLAIN SELECT * FROM (`content`)
WHERE `genre_id` = '1'
AND `category` = 1
LIMIT 20
I have a total of 654 rows in the content table with genre_id 1, and I have an index on genre_id. The query above limits the result to 20 records, which works fine, but EXPLAIN shows 654 under rows. I tried adding an index on category, with the same result, and then removed AND category = 1, but the rows count stayed the same:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE content ref genre_id genre_id 4 const 654 Using where
Here I found the answer:
LIMIT is not taken into account while estimating number of rows. Even
if you have LIMIT which restricts how many rows will be examined MySQL
will still print full number.
But in the comments another reply was posted:
LIMIT is now taken into account when estimating number of rows. I’m
not sure which version addressed this, but in 5.1.30, EXPLAIN
accurately takes LIMIT into account.
I am using MySQL 5.5.16 with InnoDB, and contrary to the comment above, EXPLAIN still does not take LIMIT into account. So my question is: does MySQL go through all 654 rows to return 20 rows even though I have set a limit? Thanks
Reply from Rick James at MySQL
Is LIMIT taken into account when MySQL estimates the number of rows in EXPLAIN?
No. (5.7 with JSON may be a different matter.)
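The rows column is only an estimate, so it does not tell you how many rows the statement really read. One way to check, sketched here on the assumption that you have the privilege needed for FLUSH STATUS and that nothing else runs in the same session in between:

```sql
-- Reset this session's status counters.
FLUSH STATUS;

-- Run the real query, not the EXPLAIN.
SELECT * FROM `content`
WHERE `genre_id` = '1'
  AND `category` = 1
LIMIT 20;

-- The Handler_read_* counters show how many index and row reads were done.
SHOW SESSION STATUS LIKE 'Handler_read%';
```

Without an ORDER BY, MySQL generally stops reading once it has sent the 20 rows, so these counters would be expected to stay well below 654 unless few rows match category = 1.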
I have been going through my slow queries and doing what I can to properly optimize each one. I ran across this one, which I have been stuck on.
EXPLAIN SELECT pID FROM ds_products WHERE pLevel >0
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE ds_products ALL pLevel NULL NULL NULL 45939 Using where
I have indexed pLevel [tinyint(1)], but the query is not using it and doing a full table scan.
Here is the row count of this table for each value of pLevel:
pLevel count
0 34040
1 3078
2 7143
3 865
4 478
5 279
6 56
If I run the query for a specific value of pLevel, it does use the index:
EXPLAIN SELECT pID FROM ds_products WHERE pLevel =6
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE ds_products ref pLevel pLevel 1 const 1265
I've tried pLevel>=1 and pLevel<=6... but it still does a full scan
I've tried (pLevel=1 or pLevel=2 or pLevel=3 or pLevel=4 or pLevel=5 or pLevel=6) .... but it still does a full table scan.
Try using MySQL GROUP BY.
SELECT pLevel, COUNT(*)
FROM ds_products
GROUP BY pLevel
Edit:
This MySQL documentation article may be useful to you: How to Avoid Table Scans
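Going by the counts in the question, pLevel > 0 matches 11,899 of the 45,939 rows (about 26%), which may be why the optimizer judges a full scan to be cheaper than the index. A hedged sketch for comparing alternatives follows; the index name idx_pLevel_pID is hypothetical, and whether either variant is actually faster has to be measured on the real data:

```sql
-- See what the plan looks like when the existing pLevel index is forced.
EXPLAIN SELECT pID FROM ds_products FORCE INDEX (pLevel) WHERE pLevel > 0;

-- Hypothetical covering index (illustrative name) so that the query can be
-- answered from the index alone, without touching the table rows.
ALTER TABLE ds_products ADD INDEX idx_pLevel_pID (pLevel, pID);

-- Check whether the optimizer now picks the new index ("Using index" in Extra).
EXPLAIN SELECT pID FROM ds_products WHERE pLevel > 0;
```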
SELECT count, item, itemid
FROM items
ORDER BY count DESC
LIMIT 20
Takes 0.0011 seconds
Explain:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE items index NULL count 4 NULL 20
I have indexes on itemid(primary key) and count (INDEX)
Does anyone have suggestions for how this could be better accomplished?
It seems like your long_query_time variable/setting is extremely short. The default is 10 seconds, but if your query is taking 0.0011 seconds, it obviously shouldn't be logged with the default setting. Try increasing it to something reasonable for your setup (1 second+ probably) and see if this still happens.
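A short sketch of how that setting can be checked and raised, assuming you have the privileges to change global variables. The second variable is not mentioned in the original answer and is included only as another setting worth looking at:

```sql
-- Current slow-query threshold, in seconds.
SHOW VARIABLES LIKE 'long_query_time';

-- If this is ON, some queries can be logged regardless of how fast they run.
SHOW VARIABLES LIKE 'log_queries_not_using_indexes';

-- Raise the threshold to 1 second; this applies to new connections.
SET GLOBAL long_query_time = 1;
```

To keep the change across restarts, the same value would also need to go into the server configuration file.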