LEFT OUTER JOIN with MySQL analyses all rows

My tables are simple:
mysql> desc muralentry ;
+-----------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+------------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| user_src_id | int(11) | NO | MUL | NULL | |
| content | longtext | NO | | NULL | |
+-----------------+------------------+------+-----+---------+----------------+
mysql> desc muralentry_user ;
+-----------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+------------------+------+-----+---------+----------------+
| muralentry_id | int(11) | NO | PRI | NULL | auto_increment |
| userinfo_id | int(11) | NO | MUL | NULL | |
+-----------------+------------------+------+-----+---------+----------------+
I'm running the following query:
SELECT DISTINCT *
FROM muralentry
LEFT OUTER JOIN muralentry_user ON (muralentry.id = muralentry_user.muralentry_id)
WHERE user_src_id = 1
The explain:
+----+-------------+----------------------------+------+-------------------------------------------------------+-------------------------------------+---------+------------------------------------------+------+-----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------------+------+-------------------------------------------------------+-------------------------------------+---------+------------------------------------------+------+-----------------+
| 1 | SIMPLE | muralentry | ref | muralentry_99bd10ae | muralentry_99bd10ae | 4 | const | 686 | Using temporary |
| 1 | SIMPLE | muralentry_user | ref | muralentry_id,muralentry_user_bcd7114e | muralentry_user_bcd7114e | 4 | muralentry.id | 15 | |
+----+-------------+----------------------------+------+-------------------------------------------------------+-------------------------------------+---------+------------------------------------------+------+-----------------+
A good result (for me :D).
But when I add another condition to the WHERE clause:
SELECT DISTINCT *
FROM muralentry
LEFT OUTER JOIN muralentry_user ON (muralentry.id = muralentry_user.muralentry_id)
WHERE user_src_id = 1 OR userinfo_id = 1;
The explain:
+----+-------------+----------------------------+------+-------------------------------------------------------+-------------------------------------+---------+------------------------------------------+---------+-----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------------+------+-------------------------------------------------------+-------------------------------------+---------+------------------------------------------+---------+-----------------+
| 1 | SIMPLE | muralentry | ALL | muralentry_99bd10ae | NULL | NULL | NULL | 1140932 | Using temporary |
| 1 | SIMPLE | muralentry_user | ref | muralentry_id,muralentry_user_bcd7114e | muralentry_user_bcd7114e | 4 | muralentry.id | 15 | Using where |
+----+-------------+----------------------------+------+-------------------------------------------------------+-------------------------------------+---------+------------------------------------------+---------+-----------------+
Wow... the result is a LOT worse: muralentry is now read with a full table scan (type ALL, about 1.1 million rows).
How can I "fix" this?
Should I create an index for this, or rewrite my query?
I'm expecting the following result: the 'muralentry' rows where the user is 'user_src_id' AND the 'muralentry_user' rows where the user is 'userinfo_id'.
-- edit --
I edited the question because I wrote AND where I actually wanted OR. Sorry for that!
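One rewrite that is often tried for an OR spanning two tables is to split it into two queries combined with UNION, so that each branch can be resolved through its own index. The following is only a sketch and has not been run against this schema. It assumes muralentry_user.userinfo_id is indexed (the MUL key in the desc output suggests it is); the aliases m and mu are introduced here for illustration.

```sql
-- Branch 1: entries written by user 1, with or without linked users.
SELECT m.*, mu.*
FROM muralentry AS m
LEFT OUTER JOIN muralentry_user AS mu ON m.id = mu.muralentry_id
WHERE m.user_src_id = 1

UNION  -- UNION (not UNION ALL) removes duplicate rows, like SELECT DISTINCT

-- Branch 2: entries where user 1 appears in muralentry_user.
-- A NULL userinfo_id can never equal 1, so an INNER JOIN is enough here.
SELECT m.*, mu.*
FROM muralentry AS m
INNER JOIN muralentry_user AS mu ON m.id = mu.muralentry_id
WHERE mu.userinfo_id = 1;
```

Running EXPLAIN on each branch separately should show whether both of them now avoid the full scan of muralentry.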

Related

Query is really slow when dealing with millions of records. Need help to optimise

I'm running this query
mysql> explain SELECT
recipients.id
FROM
recipients
JOIN recipient_contact_details ON recipient_contact_details.recipient_id = recipients.id
JOIN recipient_contact_preferences ON recipient_contact_preferences.recipient_id = recipients.id
LEFT JOIN recipient_has_recipient_tags ON recipient_has_recipient_tags.recipient_id = recipients.id
LEFT JOIN recipient_tags ON recipient_tags.id = recipient_has_recipient_tags.recipient_tag_id
LEFT JOIN recipient_tag_groups ON recipient_tag_groups.id = recipient_tags.recipient_tag_group_id
INNER JOIN location ON location.id = recipients.location_id
WHERE
1 = 1
AND FLOOR(
DATEDIFF(NOW(), recipients.dob) / 365
) > 15
AND recipients.`join_date` < '2016-02-27 16:35:46'
AND recipients.`last_attendance` > '2016-02-18 16:35:46'
AND location.deleted_at IS NULL
AND recipient_contact_details.type = 1
AND recipient_contact_details.value != '';
(I apologise for the length!) It should return more than 900k rows from a recipients table of more than 2.7m records. It does, but it takes around 25-30 seconds to run.
After running an explain I can see:
+----+-------------+-------------------------------+--------+------------------------------------------------------------------+------------------------------------------------------------------+---------+---------------------------------------------------------+-------+-----------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------------------------+--------+------------------------------------------------------------------+------------------------------------------------------------------+---------+---------------------------------------------------------+-------+-----------------------------------------------------------------+
| 1 | SIMPLE | location | ALL | PRIMARY,location_id_index | NULL | NULL | NULL | 156 | Using where |
| 1 | SIMPLE | recipients | ref | PRIMARY,recipients_location_id_index | recipients_location_id_index | 5 | homestead.location.id | 17918 | Using index condition; Using where |
| 1 | SIMPLE | recipient_contact_preferences | ref | recipient_contact_preferences_recipient_id_index | recipient_contact_preferences_recipient_id_index | 4 | homestead.recipients.id | 1 | Using where; Using index |
| 1 | SIMPLE | recipient_has_recipient_tags | ref | recipient_has_recipient_tags_recipient_id_recipient_tag_id_index | recipient_has_recipient_tags_recipient_id_recipient_tag_id_index | 4 | homestead.recipients.id | 2 | Using where; Using index |
| 1 | SIMPLE | recipient_contact_details | ref | recipient_contact_details_recipient_id_index | recipient_contact_details_recipient_id_index | 4 | homestead.recipients.id | 2 | Using index condition; Using where |
| 1 | SIMPLE | recipient_tags | eq_ref | PRIMARY | PRIMARY | 4 | homestead.recipient_has_recipient_tags.recipient_tag_id | 1 | Using where |
| 1 | SIMPLE | recipient_tag_groups | index | PRIMARY | PRIMARY | 4 | NULL | 2 | Using where; Using index; Using join buffer (Block Nested Loop) |
+----+-------------+-------------------------------+--------+------------------------------------------------------------------+------------------------------------------------------------------+---------+---------------------------------------------------------+-------+-----------------------------------------------------------------+
7 rows in set (0.00 sec)
As you can see, I've already added what I think are the relevant indexes to the various tables. The location table is
mysql> desc location;
+------------------+------------------+------+-----+---------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------------+------------------+------+-----+---------------------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| created_at | timestamp | NO | | 0000-00-00 00:00:00 | |
| updated_at | timestamp | NO | | 0000-00-00 00:00:00 | |
| name | varchar(255) | NO | | NULL | |
| deleted_at | timestamp | YES | | NULL | |
| org_website | varchar(255) | NO | | NULL | |
| from_name | varchar(255) | NO | | NULL | |
| reply_to_address | varchar(255) | NO | | NULL | |
| logo_path | varchar(255) | NO | | NULL | |
| colour | varchar(255) | NO | | NULL | |
| street_address | varchar(255) | NO | | NULL | |
| city | varchar(255) | NO | | NULL | |
| region | varchar(255) | NO | | NULL | |
| postcode | varchar(255) | NO | | NULL | |
| country | varchar(255) | NO | | NULL | |
| privacy_url | varchar(255) | NO | | NULL | |
| remote_id | bigint(20) | NO | MUL | 0 | |
+------------------+------------------+------+-----+---------------------+----------------+
17 rows in set (0.00 sec)
I'm quite new to optimising queries for such a large result set. I can see that the location table is having issues, but I'm unsure as to what to change to make a difference. Any help is greatly appreciated.

Please create an index on recipients.dob
CREATE INDEX idx_recipients_dob ON recipients(dob);
and rewrite this:
AND FLOOR(
DATEDIFF(NOW(), recipients.dob) / 365
) > 15
to this:
AND recipients.dob < NOW() - INTERVAL 15 YEAR
I think this might already solve all your problems.
The rewrite is necessary because MySQL can't use an index on a column when the query wraps that column in a calculation. It is also easier to read and more accurate (the division by 365 ignores leap years).
And these joins
LEFT JOIN recipient_has_recipient_tags ON recipient_has_recipient_tags.recipient_id = recipients.id
LEFT JOIN recipient_tags ON recipient_tags.id = recipient_has_recipient_tags.recipient_tag_id
LEFT JOIN recipient_tag_groups ON recipient_tag_groups.id = recipient_tags.recipient_tag_group_id
are not necessary when you don't use these tables anyway.
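Putting the suggestions together, the query might look roughly like the sketch below. It is untested and rests on two assumptions. First, FLOOR(...) > 15 is only true once the quotient reaches 16, so the sketch uses INTERVAL 16 YEAR with <= to stay close to the original filter; the INTERVAL 15 YEAR form is the right one if "older than 15" is what is wanted. Second, dropping the three LEFT JOINs changes the result if a recipient has several tags, because those joins repeat the recipient once per tag.

```sql
-- Index suggested in the answer, so the dob predicate can use a range scan.
CREATE INDEX idx_recipients_dob ON recipients (dob);

SELECT recipients.id
FROM recipients
JOIN recipient_contact_details
  ON recipient_contact_details.recipient_id = recipients.id
JOIN recipient_contact_preferences
  ON recipient_contact_preferences.recipient_id = recipients.id
JOIN location
  ON location.id = recipients.location_id
WHERE recipients.dob <= NOW() - INTERVAL 16 YEAR   -- bare column, no function around it
  AND recipients.join_date < '2016-02-27 16:35:46'
  AND recipients.last_attendance > '2016-02-18 16:35:46'
  AND location.deleted_at IS NULL
  AND recipient_contact_details.type = 1
  AND recipient_contact_details.value != '';
```

Whether the optimizer actually picks the new index depends on how selective the dob range is, so it is worth comparing the EXPLAIN output before and after.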

Poor performance in OR statement MySQL

I have two tables with the following structure:
Table counts.
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| bin1 | varchar(20) | NO | MUL | NULL | |
| bin2 | varchar(20) | NO | MUL | NULL | |
| count | float(6,2) | NO | | NULL | |
+-------+-------------+------+-----+---------+----------------+
Table coordinates.
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| bin | varchar(20) | NO | PRI | NULL | |
| chr | varchar(20) | NO | MUL | NULL | |
| start | int(11) | NO | MUL | NULL | |
| end | int(11) | NO | MUL | NULL | |
+-------+-------------+------+-----+---------+-------+
First, I want to get every coordinates.bin that matches my conditions, using:
describe select bin from coordinates where chr="chr1" AND (start between 1 AND 1000000000 OR end between 1 AND 1000000000);
+------+-------------+-------------+------+-----------------------------------+-------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------------+------+-----------------------------------+-------+---------+-------+------+--------------------------+
| 1 | SIMPLE | coordinates | ref | chr,chr_2,chr_3,start,start_2,end | chr_2 | 22 | const | 4929 | Using where; Using index |
+------+-------------+-------------+------+-----------------------------------+-------+---------+-------+------+--------------------------+
and it works fine. Then I want to use those coordinates.bin values to filter the counts table, matching them against counts.bin1 and counts.bin2. I tried different queries without success:
describe select * from counts inner join (select bin from coordinates where chr="chr1" AND (start between 1 AND 1000000000 OR end between 1 AND 1000000000)) as subquery on bin=bin1 or bin=bin2;
+------+-------------+-------------+------+-------------------------------------------+-------+---------+-------+----------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------------+------+-------------------------------------------+-------+---------+-------+----------+------------------------------------------------+
| 1 | SIMPLE | coordinates | ref | PRIMARY,chr,chr_2,chr_3,start,start_2,end | chr_2 | 22 | const | 4929 | Using where; Using index |
| 1 | SIMPLE | counts | ALL | bin1,bin2 | NULL | NULL | NULL | 30763816 | Range checked for each record (index map: 0x6) |
+------+-------------+-------------+------+-------------------------------------------+-------+---------+-------+----------+------------------------------------------------+
describe select * from coordinates inner join counts on (counts.bin1=coordinates.bin or counts.bin2=coordinates.bin) where coordinates.chr="chr1" AND (coordinates.start between 1 AND 1000000000 OR coordinates.end between 1 AND 1000000000);
+------+-------------+-------------+------+-------------------------------------------+-------+---------+-------+----------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------------+------+-------------------------------------------+-------+---------+-------+----------+------------------------------------------------+
| 1 | SIMPLE | coordinates | ref | PRIMARY,chr,chr_2,chr_3,start,start_2,end | chr_2 | 22 | const | 4929 | Using where; Using index |
| 1 | SIMPLE | counts | ALL | bin1,bin2 | NULL | NULL | NULL | 30763816 | Range checked for each record (index map: 0x6) |
+------+-------------+-------------+------+-------------------------------------------+-------+---------+-------+----------+------------------------------------------------+
but they are really slow. No key is used on counts: both plans show a full scan of about 30 million rows.
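A possible way around the OR in the join condition is to write one join per column and combine the two with UNION, so that the bin1 index and the bin2 index can each serve one branch. This is a sketch under the assumption that bin1 and bin2 are indexed separately (the MUL keys suggest so); it has not been timed against this data.

```sql
-- Branch 1: rows of counts matched through bin1.
SELECT coordinates.*, counts.*
FROM coordinates
INNER JOIN counts ON counts.bin1 = coordinates.bin
WHERE coordinates.chr = 'chr1'
  AND (coordinates.start BETWEEN 1 AND 1000000000
       OR coordinates.end BETWEEN 1 AND 1000000000)

UNION  -- removes the duplicate produced when bin1 = bin2 for the same row

-- Branch 2: rows of counts matched through bin2.
SELECT coordinates.*, counts.*
FROM coordinates
INNER JOIN counts ON counts.bin2 = coordinates.bin
WHERE coordinates.chr = 'chr1'
  AND (coordinates.start BETWEEN 1 AND 1000000000
       OR coordinates.end BETWEEN 1 AND 1000000000);
```

If the de-duplication step turns out to be expensive on a large result, UNION ALL with an extra condition such as counts.bin1 <> counts.bin2 in the second branch is a variation to consider.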

mysql primary key returning less results than compound composite index

I have inherited a database schema which has some design issues.
Note that there are another 9 keys on the table which I haven't listed below. The keys in question look like
+-------+------------+----------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| users | 0 | PRIMARY | 1 | userid | A | 604 | NULL | NULL | | BTREE | | |
| users | 1 | userid_2 | 1 | userid | A | 604 | NULL | NULL | | BTREE | | |
| users | 1 | userid_2 | 2 | age | A | 604 | NULL | NULL | YES | BTREE | | |
| users | 1 | userid_2 | 3 | image | A | 604 | 255 | NULL | YES | BTREE | | |
| users | 1 | userid_2 | 4 | gender | A | 604 | NULL | NULL | YES | BTREE | | |
| users | 1 | userid_2 | 5 | last_login | A | 604 | NULL | NULL | YES | BTREE | | |
| users | 1 | userid_2 | 6 | latitude | A | 604 | NULL | NULL | YES | BTREE | | |
| users | 1 | userid_2 | 7 | longitude | A | 604 | NULL | NULL | YES | BTREE | | |
+-------+------------+----------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
In a table with the following fields.
+--------------------------------+---------------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------------------------+---------------------+------+-----+-------------------+----------------+
| userid | int(11) | NO | PRI | NULL | auto_increment |
| age | int(11) | YES | | NULL | |
| image | varchar(500) | YES | | | |
| gender | varchar(10) | YES | | NULL | |
| last_login | timestamp | YES | MUL | NULL | |
| latitude | varchar(20) | YES | MUL | NULL | |
| longitude | varchar(20) | YES | | NULL | |
+--------------------------------+---------------------+------+-----+-------------------+----------------+
When I run an EXPLAIN statement and force it to use userid_2, it examines 522 rows
describe SELECT userid, age FROM users USE INDEX(userid_2) WHERE `userid` >=100 and age >27 limit 10 ;
+----+-------------+-------+-------+---------------+----------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+----------+---------+------+------+--------------------------+
| 1 | SIMPLE | users | index | userid_2 | userid_2 | 941 | NULL | 522 | Using where; Using index |
+----+-------------+-------+-------+---------------+----------+---------+------+------+--------------------------+
1 row in set (0.02 sec)
If I don't force it to use the index, it just uses the primary key, which consists only of userid, and examines only 261 rows
mysql> describe SELECT userid, age FROM users WHERE userid >=100 and age >27 limit 10 ;
+----+-------------+-------+-------+--------------------------------------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+--------------------------------------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | users | range | PRIMARY,users_user_ids_key,userid,userid_2 | PRIMARY | 4 | NULL | 261 | Using where |
+----+-------------+-------+-------+--------------------------------------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)
Questions
Why is it examining more rows when it uses the composite index?
Why isn't the query using the userid_2 index if it's not specified in the query?

That row count is only an estimate based on indexed value distribution.
You have two options:
Execute ANALYZE TABLE mytable to recalculate distributions and then re-try the describe
Don't worry about stuff that doesn't matter... rows is just an estimate anyway
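For the first option, a minimal sequence on the users table from the question could look like this. Nothing here is specific to this schema beyond the table and index names already shown above.

```sql
-- Recalculate the key distribution statistics the optimizer relies on.
ANALYZE TABLE users;

-- Then compare the two plans again; the rows column is still an estimate.
DESCRIBE SELECT userid, age FROM users
WHERE userid >= 100 AND age > 27 LIMIT 10;

DESCRIBE SELECT userid, age FROM users USE INDEX (userid_2)
WHERE userid >= 100 AND age > 27 LIMIT 10;
```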

2 identical MySQL queries give 2 different 'explain' outputs: why?

I found some very weird MySQL behaviour: when I run a specific query twice, the EXPLAIN output for this query is different the second time:
query = SELECT `twstats_twwordstrend`.`id`, `twstats_twwordstrend`.`created`, `twstats_twwordstrend`.`freq`, `twstats_twwordstrend`.`word_id` FROM `twstats_twwordstrend` INNER JOIN `twstats_twwords` ON (`twstats_twwordstrend`.`word_id` = `twstats_twwords`.`id`) WHERE (`twstats_twwords`.`name` = '#ladygaga' AND `twstats_twwordstrend`.`created` > '2011-01-28 01:30:19' );
First execution of the query, followed by EXPLAIN:
mysql> EXPLAIN SELECT `twstats_twwordstrend`.`id`, `twstats_twwordstrend`.`created`, `twstats_twwordstrend`.`freq`, `twstats_twwordstrend`.`word_id` FROM `twstats_twwordstrend` INNER JOIN `twstats_twwords` ON (`twstats_twwordstrend`.`word_id` = `twstats_twwords`.`id`) WHERE (`twstats_twwords`.`name` = '#ladygaga' AND `twstats_twwordstrend`.`created` > '2011-01-28 01:30:19' );
+----+-------------+----------------------+--------+-------------------------------+---------+---------+-------------------------------------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------+--------+-------------------------------+---------+---------+-------------------------------------------+---------+-------------+
| 1 | SIMPLE | twstats_twwordstrend | ALL | twstats_twwordstrend_4b95d890 | NULL | NULL | NULL | 4877401 | Using where |
| 1 | SIMPLE | twstats_twwords | eq_ref | PRIMARY | PRIMARY | 4 | statweestics.twstats_twwordstrend.word_id | 1 | Using where |
+----+-------------+----------------------+--------+-------------------------------+---------+---------+-------------------------------------------+---------+-------------+
2 rows in set (0.00 sec)
Second execution of the query, followed by EXPLAIN:
mysql> EXPLAIN SELECT `twstats_twwordstrend`.`id`, `twstats_twwordstrend`.`created`, `twstats_twwordstrend`.`freq`, `twstats_twwordstrend`.`word_id` FROM `twstats_twwordstrend` INNER JOIN `twstats_twwords` ON (`twstats_twwordstrend`.`word_id` = `twstats_twwords`.`id`) WHERE (`twstats_twwords`.`name` = '#ladygaga' AND `twstats_twwordstrend`.`created` > '2011-01-28 01:30:19' );
+----+-------------+----------------------+------+-------------------------------+-------------------------------+---------+---------------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------+------+-------------------------------+-------------------------------+---------+---------------------------------+--------+-------------+
| 1 | SIMPLE | twstats_twwords | ALL | PRIMARY | NULL | NULL | NULL | 222994 | Using where |
| 1 | SIMPLE | twstats_twwordstrend | ref | twstats_twwordstrend_4b95d890 | twstats_twwordstrend_4b95d890 | 4 | statweestics.twstats_twwords.id | 15 | Using where |
+----+-------------+----------------------+------+-------------------------------+-------------------------------+---------+---------------------------------+--------+-------------+
2 rows in set (0.00 sec)
mysql> describe twstats_twwords;
+---------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| created | datetime | NO | | NULL | |
| name | varchar(140) | NO | | NULL | |
+---------+--------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)
mysql> describe twstats_twwordstrend;
+---------+----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+----------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| created | datetime | NO | | NULL | |
| freq | double | NO | | NULL | |
| word_id | int(11) | NO | MUL | NULL | |
+---------+----------+------+-----+---------+----------------+
4 rows in set (0.00 sec)
How is this possible?

Look at the rows column. The engine was able to gather more statistics -- so the next time it will try to use the better plan.
Happy coding.
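Both plans above start with a full scan (type ALL) on one of the two tables, and the describe output shows no index on twstats_twwords.name. As a hedged suggestion only, indexes along the following lines might give the optimizer a cheaper starting point; the index names are made up for this example and the effect has not been measured on this data.

```sql
-- Lets the lookup name = '#ladygaga' use an index instead of a full scan.
CREATE INDEX idx_twwords_name ON twstats_twwords (name);

-- Covers the join column and the date range filter in one index.
CREATE INDEX idx_twwordstrend_word_created
    ON twstats_twwordstrend (word_id, created);

-- Refresh statistics so the next EXPLAIN reflects the new indexes.
ANALYZE TABLE twstats_twwords, twstats_twwordstrend;
```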

MySQL select specific cols slower than select *

My MySQL is not strong, so please forgive any rookie mistakes. Short version:
SELECT locId,count,avg FROM destAgg_geo is significantly slower than SELECT * from destAgg_geo
prtt.destAgg is a table keyed on dst_ip (PRIMARY)
mysql> describe prtt.destAgg;
+---------+------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------+------------------+------+-----+---------+-------+
| dst_ip | int(10) unsigned | NO | PRI | 0 | |
| total | float unsigned | YES | | NULL | |
| avg | float unsigned | YES | | NULL | |
| sqtotal | float unsigned | YES | | NULL | |
| sqavg | float unsigned | YES | | NULL | |
| count | int(10) unsigned | YES | | NULL | |
+---------+------------------+------+-----+---------+-------+
geoip.blocks is a table keyed on both startIpNum and endIpNum (PRIMARY)
mysql> describe geoip.blocks;
+------------+------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+------------------+------+-----+---------+-------+
| startIpNum | int(10) unsigned | NO | MUL | NULL | |
| endIpNum | int(10) unsigned | NO | | NULL | |
| locId | int(10) unsigned | NO | | NULL | |
+------------+------------------+------+-----+---------+-------+
destAgg_geo is a view:
CREATE VIEW destAgg_geo AS SELECT * FROM destAgg JOIN geoip.blocks
ON destAgg.dst_ip BETWEEN geoip.blocks.startIpNum AND geoip.blocks.endIpNum;
Here's the optimization plan for select *:
mysql> explain select * from destAgg_geo;
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| 1 | SIMPLE | blocks | ALL | start_end | NULL | NULL | NULL | 3486646 | |
| 1 | SIMPLE | destAgg | ALL | PRIMARY | NULL | NULL | NULL | 101893 | Range checked for each record (index map: 0x1) |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
Here's the optimization plan for select with specific columns:
mysql> explain select locId,count,avg from destAgg_geo;
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| 1 | SIMPLE | destAgg | ALL | PRIMARY | NULL | NULL | NULL | 101893 | |
| 1 | SIMPLE | blocks | ALL | start_end | NULL | NULL | NULL | 3486646 | Range checked for each record (index map: 0x1) |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
Here's the optimization plan for every column from destAgg and just the locId column from geoip.blocks:
mysql> explain select dst_ip,total,avg,sqtotal,sqavg,count,locId from destAgg_geo;
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| 1 | SIMPLE | blocks | ALL | start_end | NULL | NULL | NULL | 3486646 | |
| 1 | SIMPLE | destAgg | ALL | PRIMARY | NULL | NULL | NULL | 101893 | Range checked for each record (index map: 0x1) |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
Remove any column except dst_ip and the range check flips to blocks:
mysql> explain select dst_ip,avg,sqtotal,sqavg,count,locId from destAgg_geo;
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| 1 | SIMPLE | destAgg | ALL | PRIMARY | NULL | NULL | NULL | 101893 | |
| 1 | SIMPLE | blocks | ALL | start_end | NULL | NULL | NULL | 3486646 | Range checked for each record (index map: 0x1) |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
which is then much slower. What's going on here?
(Yes, I could just use the * query results and process from there, but I would like to know what's happening and why)
EDIT -- EXPLAIN on the VIEW query:
mysql> explain SELECT * FROM destAgg JOIN geoip.blocks ON destAgg.dst_ip BETWEEN geoip.blocks.startIpNum AND geoip.blocks.endIpNum;
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| 1 | SIMPLE | blocks | ALL | start_end | NULL | NULL | NULL | 3486646 | |
| 1 | SIMPLE | destAgg | ALL | PRIMARY | NULL | NULL | NULL | 101893 | Range checked for each record (index map: 0x1) |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+

MySQL can tell you if you run EXPLAIN on both queries.
The first query with the columns doesn't include any key columns, so my guess is it has to do a TABLE SCAN.
The second query with the "SELECT *" includes the primary key, so it can use the index.

The range filter is applied last, so the problem is that the query optimizer is choosing to join the larger table first in one case, and the smaller table first in another. Perhaps someone with more knowledge of the optimizer can tell us why it's joining the tables in a different order for each.
I think the real goal here should be to try to get the JOIN to use an index, so the order of the join wouldn't matter so much.
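Two sketches follow, neither tested against these tables. The first uses MySQL's STRAIGHT_JOIN modifier to force the join order of the faster plan (blocks first, then destAgg), bypassing the view. The second tries to make the range join index-friendly; it assumes the IP ranges in geoip.blocks do not overlap and that startIpNum is unique and is the leading column of the start_end index. The aliases b, b2 and d are introduced here for illustration.

```sql
-- Sketch 1: read blocks first, then probe destAgg, as in the SELECT * plan.
SELECT STRAIGHT_JOIN b.locId, d.`count`, d.`avg`
FROM geoip.blocks AS b
JOIN destAgg AS d
  ON d.dst_ip BETWEEN b.startIpNum AND b.endIpNum;

-- Sketch 2: for each destination IP, find the single candidate block
-- (the greatest startIpNum not above the IP), then check its upper bound.
SELECT d.dst_ip, d.`count`, d.`avg`, b.locId
FROM destAgg AS d
JOIN geoip.blocks AS b
  ON b.startIpNum = (SELECT MAX(b2.startIpNum)
                     FROM geoip.blocks AS b2
                     WHERE b2.startIpNum <= d.dst_ip)
 AND d.dst_ip <= b.endIpNum;
```

Comparing EXPLAIN for both against the view queries would show whether either one actually changes the plan on a given server version.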

I would try putting a composite index on locId,count,avg and see if that improves the speed.
