Poor performance with an OR condition in MySQL

I have two tables with the following structure:
Table counts.
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| bin1 | varchar(20) | NO | MUL | NULL | |
| bin2 | varchar(20) | NO | MUL | NULL | |
| count | float(6,2) | NO | | NULL | |
+-------+-------------+------+-----+---------+----------------+
Table coordinates.
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| bin | varchar(20) | NO | PRI | NULL | |
| chr | varchar(20) | NO | MUL | NULL | |
| start | int(11) | NO | MUL | NULL | |
| end | int(11) | NO | MUL | NULL | |
+-------+-------------+------+-----+---------+-------+
First, I want to get all coordinates.bin values that match my conditions, using:
describe select bin from coordinates where chr="chr1" AND (start between 1 AND 1000000000 OR end between 1 AND 1000000000);
+------+-------------+-------------+------+-----------------------------------+-------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------------+------+-----------------------------------+-------+---------+-------+------+--------------------------+
| 1 | SIMPLE | coordinates | ref | chr,chr_2,chr_3,start,start_2,end | chr_2 | 22 | const | 4929 | Using where; Using index |
+------+-------------+-------------+------+-----------------------------------+-------+---------+-------+------+--------------------------+
This works fine. Next, I want to use those coordinates.bin values to filter the counts table, matching them against counts.bin1 or counts.bin2. I tried different queries without success:
describe select * from counts inner join (select bin from coordinates where chr="chr1" AND (start between 1 AND 1000000000 OR end between 1 AND 1000000000)) as subquery on bin=bin1 or bin=bin2;
+------+-------------+-------------+------+-------------------------------------------+-------+---------+-------+----------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------------+------+-------------------------------------------+-------+---------+-------+----------+------------------------------------------------+
| 1 | SIMPLE | coordinates | ref | PRIMARY,chr,chr_2,chr_3,start,start_2,end | chr_2 | 22 | const | 4929 | Using where; Using index |
| 1 | SIMPLE | counts | ALL | bin1,bin2 | NULL | NULL | NULL | 30763816 | Range checked for each record (index map: 0x6) |
+------+-------------+-------------+------+-------------------------------------------+-------+---------+-------+----------+------------------------------------------------+
describe select * from coordinates inner join counts on (counts.bin1=coordinates.bin or counts.bin2=coordinates.bin) where coordinates.chr="chr1" AND (coordinates.start between 1 AND 1000000000 OR coordinates.end between 1 AND 1000000000);
+------+-------------+-------------+------+-------------------------------------------+-------+---------+-------+----------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------------+------+-------------------------------------------+-------+---------+-------+----------+------------------------------------------------+
| 1 | SIMPLE | coordinates | ref | PRIMARY,chr,chr_2,chr_3,start,start_2,end | chr_2 | 22 | const | 4929 | Using where; Using index |
| 1 | SIMPLE | counts | ALL | bin1,bin2 | NULL | NULL | NULL | 30763816 | Range checked for each record (index map: 0x6) |
+------+-------------+-------------+------+-------------------------------------------+-------+---------+-------+----------+------------------------------------------------+
but both are really slow. EXPLAIN reports no key for counts (type ALL, roughly 30 million rows).
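One common workaround, sketched here under assumptions and not tested against this data: MySQL often cannot combine the bin1 and bin2 indexes for an OR join condition, so the join can be split into two branches that each match on one column and are then merged with UNION. The aliases c and co are introduced only for illustration. Because counts.id and coordinates.bin are primary keys, UNION should drop only the pairs that satisfy both branches, which matches what the OR join returns.

```sql
-- Branch 1: rows of counts whose bin1 matches a selected coordinate.
SELECT c.*, co.*
FROM counts AS c
JOIN coordinates AS co ON co.bin = c.bin1
WHERE co.chr = 'chr1'
  AND (co.`start` BETWEEN 1 AND 1000000000
       OR co.`end` BETWEEN 1 AND 1000000000)
UNION
-- Branch 2: rows of counts whose bin2 matches a selected coordinate.
SELECT c.*, co.*
FROM counts AS c
JOIN coordinates AS co ON co.bin = c.bin2
WHERE co.chr = 'chr1'
  AND (co.`start` BETWEEN 1 AND 1000000000
       OR co.`end` BETWEEN 1 AND 1000000000);
```

Running DESCRIBE on each branch separately should show whether counts is now read through the bin1 or bin2 index instead of a full scan.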

Related

LEFT OUTER JOIN with mySQL analyse all rows

my tables are simple:
mysql> desc muralentry ;
+-----------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+------------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| user_src_id | int(11) | NO | MUL | NULL | |
| content | longtext | NO | | NULL | |
+-----------------+------------------+------+-----+---------+----------------+
mysql> desc muralentry_user ;
+-----------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+------------------+------+-----+---------+----------------+
| muralentry_id | int(11) | NO | PRI | NULL | auto_increment |
| userinfo_id | int(11) | NO | MUL | NULL | |
+-----------------+------------------+------+-----+---------+----------------+
I'm running the following query:
SELECT DISTINCT *
FROM muralentry
LEFT OUTER JOIN muralentry_user ON (muralentry.id = muralentry_user.muralentry_id)
WHERE user_src_id = 1
The explain:
+----+-------------+----------------------------+------+-------------------------------------------------------+-------------------------------------+---------+------------------------------------------+------+-----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------------+------+-------------------------------------------------------+-------------------------------------+---------+------------------------------------------+------+-----------------+
| 1 | SIMPLE | muralentry | ref | muralentry_99bd10ae | muralentry_99bd10ae | 4 | const | 686 | Using temporary |
| 1 | SIMPLE | muralentry_user | ref | muralentry_id,muralentry_user_bcd7114e | muralentry_user_bcd7114e | 4 | muralentry.id | 15 | |
+----+-------------+----------------------------+------+-------------------------------------------------------+-------------------------------------+---------+------------------------------------------+------+-----------------+
A good result (for me :D).
But when I add another condition to the WHERE clause:
SELECT DISTINCT *
FROM muralentry
LEFT OUTER JOIN muralentry_user ON (muralentry.id = muralentry_user.muralentry_id)
WHERE user_src_id = 1 OR userinfo_id = 1;
The explain:
+----+-------------+----------------------------+------+-------------------------------------------------------+-------------------------------------+---------+------------------------------------------+---------+-----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------------+------+-------------------------------------------------------+-------------------------------------+---------+------------------------------------------+---------+-----------------+
| 1 | SIMPLE | muralentry | ALL | muralentry_99bd10ae | NULL | NULL | NULL | 1140932 | Using temporary |
| 1 | SIMPLE | muralentry_user | ref | muralentry_id,muralentry_user_bcd7114e | muralentry_user_bcd7114e | 4 | muralentry.id | 15 | Using where |
+----+-------------+----------------------------+------+-------------------------------------------------------+-------------------------------------+---------+------------------------------------------+---------+-----------------+
Wow... the result is a lot worse.
How can I fix this?
Should I create an index for this, or rewrite my query?
I'm expecting the following result: 'muralentry' rows where the user is 'user_src_id' AND the 'muralentry_user' rows where he is 'userinfo_id'.
-- edit --
I edited the question because I wrote AND when I actually wanted OR. Sorry for that!
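A possible rewrite, given only as a sketch: an OR that spans columns from both tables tends to prevent MySQL from using either index, so each condition can be given its own SELECT and the results merged with UNION, which also removes duplicates as DISTINCT did. The aliases m and mu are illustrative. The second branch uses an inner join because userinfo_id = 1 can only be true for rows that actually matched in muralentry_user.

```sql
-- Entries written by user 1, with their recipients if any.
SELECT m.*, mu.*
FROM muralentry AS m
LEFT OUTER JOIN muralentry_user AS mu ON m.id = mu.muralentry_id
WHERE m.user_src_id = 1
UNION
-- Entries where user 1 appears as a recipient.
SELECT m.*, mu.*
FROM muralentry AS m
INNER JOIN muralentry_user AS mu ON m.id = mu.muralentry_id
WHERE mu.userinfo_id = 1;
```

EXPLAIN on each branch should confirm whether the user_src_id index and the userinfo_id index are used, rather than the full scan of muralentry seen above.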

MySQL optimization on join tables with range criteria

I want to join two tables by matching a single position in one table against a range (represented by two columns) in the other.
However, the query is too slow: it takes about 20 minutes.
I have tried adding indexes to the tables and changing the query,
but the performance is still poor.
So I am asking how to speed up the join.
The query is as follows:
mysql> SELECT `inVar`.chrom, `inVar`.pos, `openChrom_K562`.score
-> FROM `inVar`
-> LEFT JOIN `openChrom_K562`
-> ON (
-> `inVar`.chrom=`openChrom_K562`.chrom AND
-> `inVar`.pos BETWEEN `openChrom_K562`.chromStart AND `openChrom_K562`.chromEnd
-> );
inVar and openChrom_K562 are the tables I used.
inVar stores the single position in each row.
openChrom_K562 stores the range information indicated by chromStart and chromEnd.
inVar contains 57902 rows and openChrom_K562 contains 137373 rows.
Fields in the tables:
mysql> DESCRIBE inVar;
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| chrom | varchar(31) | NO | PRI | NULL | |
| pos | int(10) | NO | PRI | NULL | |
+-------+-------------+------+-----+---------+-------+
mysql> DESCRIBE openChrom_K562;
+------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+-------+
| chrom | varchar(31) | NO | MUL | NULL | |
| chromStart | int(10) | NO | MUL | NULL | |
| chromEnd | int(10) | NO | | NULL | |
| score | int(10) | NO | | NULL | |
+------------+-------------+------+-----+---------+-------+
Indexes built on the tables:
mysql> SHOW INDEX FROM inVar;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| inVar | 0 | PRIMARY | 1 | chrom | A | NULL | NULL | NULL | | BTREE | |
| inVar | 0 | PRIMARY | 2 | pos | A | 57902 | NULL | NULL | | BTREE | |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
mysql> SHOW INDEX FROM openChrom_K562;
+----------------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| openChrom_K562 | 1 | start_end | 1 | chromStart | A | 137373 | NULL | NULL | | BTREE | |
| openChrom_K562 | 1 | start_end | 2 | chromEnd | A | 137373 | NULL | NULL | | BTREE | |
| openChrom_K562 | 1 | chrom_only | 1 | chrom | A | 22 | NULL | NULL | | BTREE | |
| openChrom_K562 | 1 | chrom_start | 1 | chrom | A | 22 | NULL | NULL | | BTREE | |
| openChrom_K562 | 1 | chrom_start | 2 | chromStart | A | 137373 | NULL | NULL | | BTREE | |
| openChrom_K562 | 1 | chrom_end | 1 | chrom | A | 22 | NULL | NULL | | BTREE | |
| openChrom_K562 | 1 | chrom_end | 2 | chromEnd | A | 137373 | NULL | NULL | | BTREE | |
+----------------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
Execution plan in MySQL:
mysql> EXPLAIN SELECT `inVar`.chrom, `inVar`.pos, score FROM `inVar` LEFT JOIN `openChrom_K562` ON ( inVar.chrom=openChrom_K562.chrom AND `inVar`.pos BETWEEN chromStart AND chromEnd );
+----+-------------+----------------+-------+--------------------------------------------+------------+---------+-----------------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+-------+--------------------------------------------+------------+---------+-----------------+-------+-------------+
| 1 | SIMPLE | inVar | index | NULL | PRIMARY | 37 | NULL | 57902 | Using index |
| 1 | SIMPLE | openChrom_K562 | ref | start_end,chrom_only,chrom_start,chrom_end | chrom_only | 33 | tmp.inVar.chrom | 5973 | |
+----+-------------+----------------+-------+--------------------------------------------+------------+---------+-----------------+-------+-------------+
It seems the optimizer only matches chrom between the two tables and then compares the positions by brute force.
Is there any way to optimize this further, such as indexing on the position?
(This is my first time posting a question; sorry for the poor quality.)
chrom_only is likely a poor index choice for your join, as chrom has only 22 distinct values (see the Cardinality column).
If I have interpreted this correctly, the query should be faster using start_end:
SELECT `inVar`.chrom, `inVar`.pos, `openChrom_K562`.score
FROM `inVar`
LEFT JOIN `openChrom_K562`
USE INDEX (`start_end`)
ON (
`inVar`.chrom=`openChrom_K562`.chrom AND
`inVar`.pos BETWEEN `openChrom_K562`.chromStart AND `openChrom_K562`.chromEnd
)
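To check which index actually helps, the hinted query can be run through EXPLAIN once per candidate index. This is only a sketch: start_end and chrom_start are the index names from the SHOW INDEX output above, and whether either one beats chrom_only has to be confirmed by timing the real query.

```sql
-- Plan when the optimizer is steered toward start_end.
EXPLAIN
SELECT `inVar`.chrom, `inVar`.pos, `openChrom_K562`.score
FROM `inVar`
LEFT JOIN `openChrom_K562` USE INDEX (`start_end`)
  ON `inVar`.chrom = `openChrom_K562`.chrom
 AND `inVar`.pos BETWEEN `openChrom_K562`.chromStart AND `openChrom_K562`.chromEnd;

-- Same query steered toward chrom_start, for comparison.
EXPLAIN
SELECT `inVar`.chrom, `inVar`.pos, `openChrom_K562`.score
FROM `inVar`
LEFT JOIN `openChrom_K562` USE INDEX (`chrom_start`)
  ON `inVar`.chrom = `openChrom_K562`.chrom
 AND `inVar`.pos BETWEEN `openChrom_K562`.chromStart AND `openChrom_K562`.chromEnd;
```

Comparing the key and rows columns of the two plans gives a first indication. USE INDEX is only a hint, so the optimizer may still fall back to a table scan if it judges the named index unhelpful.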

mysql primary key returning less results than compound composite index

I have inherited a database schema which has some design issues.
Note that there are another 9 keys on the table which I haven't listed below. The keys in question look like this:
+-------+------------+----------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| users | 0 | PRIMARY | 1 | userid | A | 604 | NULL | NULL | | BTREE | | |
| users | 1 | userid_2 | 1 | userid | A | 604 | NULL | NULL | | BTREE | | |
| users | 1 | userid_2 | 2 | age | A | 604 | NULL | NULL | YES | BTREE | | |
| users | 1 | userid_2 | 3 | image | A | 604 | 255 | NULL | YES | BTREE | | |
| users | 1 | userid_2 | 4 | gender | A | 604 | NULL | NULL | YES | BTREE | | |
| users | 1 | userid_2 | 5 | last_login | A | 604 | NULL | NULL | YES | BTREE | | |
| users | 1 | userid_2 | 6 | latitude | A | 604 | NULL | NULL | YES | BTREE | | |
| users | 1 | userid_2 | 7 | longitude | A | 604 | NULL | NULL | YES | BTREE | | |
+-------+------------+----------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
The table has the following fields:
+--------------------------------+---------------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------------------------+---------------------+------+-----+-------------------+----------------+
| userid | int(11) | NO | PRI | NULL | auto_increment |
| age | int(11) | YES | | NULL | |
| image | varchar(500) | YES | | | |
| gender | varchar(10) | YES | | NULL | |
| last_login | timestamp | YES | MUL | NULL | |
| latitude | varchar(20) | YES | MUL | NULL | |
| longitude | varchar(20) | YES | | NULL | |
+--------------------------------+---------------------+------+-----+-------------------+----------------+
When I run an EXPLAIN and force the query to use userid_2, it reports 522 rows:
describe SELECT userid, age FROM users USE INDEX(userid_2) WHERE `userid` >=100 and age >27 limit 10 ;
+----+-------------+-------+-------+---------------+----------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+----------+---------+------+------+--------------------------+
| 1 | SIMPLE | users | index | userid_2 | userid_2 | 941 | NULL | 522 | Using where; Using index |
+----+-------------+-------+-------+---------------+----------+---------+------+------+--------------------------+
1 row in set (0.02 sec)
If I don't force the index, it just uses the primary key, which consists only of userid, and reports only 261 rows:
mysql> describe SELECT userid, age FROM users WHERE userid >=100 and age >27 limit 10 ;
+----+-------------+-------+-------+--------------------------------------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+--------------------------------------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | users | range | PRIMARY,users_user_ids_key,userid,userid_2 | PRIMARY | 4 | NULL | 261 | Using where |
+----+-------------+-------+-------+--------------------------------------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)
Questions
Why is it examining more rows when it uses the composite index?
Why isn't the query using the userid_2 index if it's not specified in the query?
That row count is only an estimate based on the distribution of indexed values.
You have two options:
Execute ANALYZE TABLE mytable to recalculate the distributions, then retry the DESCRIBE.
Don't worry about things that don't matter; rows is just an estimate anyway.
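As a concrete version of the first option, a minimal sketch using the users table and the two queries from the question:

```sql
-- Refresh the index statistics that the optimizer bases its estimates on.
ANALYZE TABLE users;

-- Re-check both plans; the rows column may change after the refresh.
DESCRIBE SELECT userid, age FROM users
WHERE userid >= 100 AND age > 27 LIMIT 10;

DESCRIBE SELECT userid, age FROM users USE INDEX (userid_2)
WHERE userid >= 100 AND age > 27 LIMIT 10;
```

Even after ANALYZE TABLE the two figures need not agree, since each is an estimate for a different access path.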

Cannot figure out efficient SQL for 3-table INNER JOIN (MySQL) - "customers also bought" functionality

I'm trying to add the typical "customers who bought 'x' also bought 'y'" functionality to my website. Here is the table structure:
Table: qb_invoice
+--------------------------------+------------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------------------------+------------------+------+-----+-------------------+----------------+
| qbsql_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| TxnID | varchar(40) | YES | MUL | NULL | |
| Customer_ListID | varchar(40) | YES | MUL | NULL | |
| Customer_FullName | varchar(255) | YES | | NULL | |
+--------------------------------+------------------+------+-----+-------------------+----------------+
Table: qb_invoice_invoiceline
+-------------------------+------------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------------+------------------+------+-----+-------------------+----------------+
| qbsql_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| Invoice_TxnID | varchar(40) | YES | MUL | NULL | |
| Item_ListID | varchar(40) | YES | MUL | NULL | |
| Item_FullName | varchar(255) | YES | | NULL | |
+-------------------------+------------------+------+-----+-------------------+----------------+
Table: qb_customer
+-------------------------------------+------------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------------------------+------------------+------+-----+-------------------+----------------+
| qbsql_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| ListID | varchar(40) | YES | MUL | NULL | |
| Name | varchar(41) | YES | MUL | NULL | |
+-------------------------------------+------------------+------+-----+-------------------+----------------+
Given an Item_ListID I'd like a fast, efficient query to return a list of Item_ListID's along with a COUNT of the number of customers that ordered each item in the list, where all customers have in common the initially supplied Item_ListID.
Right now I have the following SQL that works, but is very slow:
SELECT qb_invoice_invoiceline.Item_FullName, count(*) as 'nummy'
FROM qb_invoice_invoiceline
WHERE qb_invoice_invoiceline.Invoice_TxnID =
ANY (SELECT qb_invoice.TxnID
FROM qb_invoice
INNER JOIN qb_customer ON qb_invoice.Customer_ListID = qb_customer.ListID
INNER JOIN qb_invoice_invoiceline ON qb_invoice.TxnID = qb_invoice_invoiceline.Invoice_TxnID
WHERE qb_invoice_invoiceline.Item_ListID = '1360000-57')
GROUP BY qb_invoice_invoiceline.Item_ListID
ORDER BY nummy DESC
I appreciate your help!
Here is the 'explain' output:
+----+--------------------+------------------------+-------+---------------------------+-------------+---------+-----------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------------------+-------+---------------------------+-------------+---------+-----------------------------------------+-------+----------------------------------------------+
| 1 | PRIMARY | qb_invoice_invoiceline | index | NULL | Item_ListID | 123 | NULL | 19690 | Using where; Using temporary; Using filesort |
| 2 | DEPENDENT SUBQUERY | qb_invoice_invoiceline | ref | Invoice_TxnID,Item_ListID | Item_ListID | 123 | const | 8 | Using where |
| 2 | DEPENDENT SUBQUERY | qb_invoice | ref | Customer_ListID,TxnID | TxnID | 123 | func | 206 | Using where |
| 2 | DEPENDENT SUBQUERY | qb_customer | ref | ListID | ListID | 123 | devdb.qb_invoice.Customer_ListID | 18 | Using where; Using index |
+----+--------------------+------------------------+-------+---------------------------+-------------+---------+-----------------------------------------+-------+----------------------------------------------+
Your query may be slow if there are no indexes available on the varchar fields that you are joining on. Can you give details on the indexes that are present on these tables?
I think that the query would benefit from indexes on qb_invoice.TxnID and qb_customer.ListID, and on qb_invoice_invoiceline.Item_ListID.
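Separately from indexing, the DEPENDENT SUBQUERY rows in the EXPLAIN output suggest the subquery is re-evaluated for each outer row. A possible rewrite, offered as a sketch only, computes the matching invoices once in a derived table and joins back to the invoice lines. The aliases il, i, c, l and matching are illustrative, and grouping on Item_FullName as well assumes each Item_ListID has a single full name.

```sql
SELECT l.Item_FullName, COUNT(*) AS nummy
FROM (
    -- Invoices (with a known customer) that contain the given item.
    SELECT DISTINCT i.TxnID
    FROM qb_invoice_invoiceline AS il
    INNER JOIN qb_invoice AS i ON i.TxnID = il.Invoice_TxnID
    INNER JOIN qb_customer AS c ON c.ListID = i.Customer_ListID
    WHERE il.Item_ListID = '1360000-57'
) AS matching
-- All lines on those invoices, counted per item.
INNER JOIN qb_invoice_invoiceline AS l ON l.Invoice_TxnID = matching.TxnID
GROUP BY l.Item_ListID, l.Item_FullName
ORDER BY nummy DESC;
```

Like the original query, this counts invoice lines rather than distinct customers, so the given item itself also appears in the result.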

MySQL select specific cols slower than select *

My MySQL is not strong, so please forgive any rookie mistakes. Short version:
SELECT locId,count,avg FROM destAgg_geo is significantly slower than SELECT * from destAgg_geo
prtt.destAgg is a table keyed on dst_ip (PRIMARY)
mysql> describe prtt.destAgg;
+---------+------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------+------------------+------+-----+---------+-------+
| dst_ip | int(10) unsigned | NO | PRI | 0 | |
| total | float unsigned | YES | | NULL | |
| avg | float unsigned | YES | | NULL | |
| sqtotal | float unsigned | YES | | NULL | |
| sqavg | float unsigned | YES | | NULL | |
| count | int(10) unsigned | YES | | NULL | |
+---------+------------------+------+-----+---------+-------+
geoip.blocks is a table keyed on both startIpNum and endIpNum (PRIMARY)
mysql> describe geoip.blocks;
+------------+------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+------------------+------+-----+---------+-------+
| startIpNum | int(10) unsigned | NO | MUL | NULL | |
| endIpNum | int(10) unsigned | NO | | NULL | |
| locId | int(10) unsigned | NO | | NULL | |
+------------+------------------+------+-----+---------+-------+
destAgg_geo is a view:
CREATE VIEW destAgg_geo AS SELECT * FROM destAgg JOIN geoip.blocks
ON destAgg.dst_ip BETWEEN geoip.blocks.startIpNum AND geoip.blocks.endIpNum;
Here's the optimization plan for select *:
mysql> explain select * from destAgg_geo;
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| 1 | SIMPLE | blocks | ALL | start_end | NULL | NULL | NULL | 3486646 | |
| 1 | SIMPLE | destAgg | ALL | PRIMARY | NULL | NULL | NULL | 101893 | Range checked for each record (index map: 0x1) |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
Here's the optimization plan for select with specific columns:
mysql> explain select locId,count,avg from destAgg_geo;
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| 1 | SIMPLE | destAgg | ALL | PRIMARY | NULL | NULL | NULL | 101893 | |
| 1 | SIMPLE | blocks | ALL | start_end | NULL | NULL | NULL | 3486646 | Range checked for each record (index map: 0x1) |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
Here's the optimization plan for every column from destAgg and just the locId column from geoip.blocks:
mysql> explain select dst_ip,total,avg,sqtotal,sqavg,count,locId from destAgg_geo;
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| 1 | SIMPLE | blocks | ALL | start_end | NULL | NULL | NULL | 3486646 | |
| 1 | SIMPLE | destAgg | ALL | PRIMARY | NULL | NULL | NULL | 101893 | Range checked for each record (index map: 0x1) |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
Remove any column except dst_ip and the range check flips to blocks:
mysql> explain select dst_ip,avg,sqtotal,sqavg,count,locId from destAgg_geo;
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| 1 | SIMPLE | destAgg | ALL | PRIMARY | NULL | NULL | NULL | 101893 | |
| 1 | SIMPLE | blocks | ALL | start_end | NULL | NULL | NULL | 3486646 | Range checked for each record (index map: 0x1) |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
which is then much slower. What's going on here?
(Yes, I could just use the * query results and process from there, but I would like to know what's happening and why)
EDIT -- EXPLAIN on the VIEW query:
mysql> explain SELECT * FROM destAgg JOIN geoip.blocks ON destAgg.dst_ip BETWEEN geoip.blocks.startIpNum AND geoip.blocks.endIpNum;
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
| 1 | SIMPLE | blocks | ALL | start_end | NULL | NULL | NULL | 3486646 | |
| 1 | SIMPLE | destAgg | ALL | PRIMARY | NULL | NULL | NULL | 101893 | Range checked for each record (index map: 0x1) |
+----+-------------+---------+------+---------------+------+---------+------+---------+------------------------------------------------+
MySQL can tell you if you run EXPLAIN on both queries.
The first query with the columns doesn't include any key columns, so my guess is it has to do a TABLE SCAN.
The second query with the "SELECT *" includes the primary key, so it can use the index.
The range filter is applied last, so the problem is that the query optimizer is choosing to join the larger table first in one case, and the smaller table first in another. Perhaps someone with more knowledge of the optimizer can tell us why it's joining the tables in a different order for each.
I think the real goal here should be to try to get the JOIN to use an index, so the order of the join wouldn't matter so much.
I would try putting a composite index on locId,count,avg and see if that doesn't improve speed.
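If the difference really comes down to join order, one way to test that is to bypass the view and pin the order with STRAIGHT_JOIN, which makes MySQL read the left table first. The sketch below reproduces the order seen in the faster SELECT * plan (blocks, then destAgg). The aliases b and d are illustrative, and whether this is actually faster for the three-column query has to be checked with EXPLAIN and timing.

```sql
SELECT b.locId, d.`count`, d.`avg`
FROM geoip.blocks AS b
STRAIGHT_JOIN prtt.destAgg AS d
  ON d.dst_ip BETWEEN b.startIpNum AND b.endIpNum;
```

With this order the range check per record runs against the PRIMARY key on destAgg.dst_ip, as in the SELECT * plan above.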