Optimizing query in the MySQL slow-query log - mysql

Our database is set up so that we have a credentials table that hold multiple different types of credentials (logins and the like). There's also a credential_pairs table that associates some of these types together (for instance, a user may have a password and security token).
In an attempt to see if a pair match, there is the following query:
SELECT DISTINCT cp.credential_id FROM credential_pairs AS cp
INNER JOIN credentials AS c1 ON (cp.primary_credential_id = c1.credential_id)
INNER JOIN credentials AS c2 ON (cp.secondary_credential_id = c2.credential_id)
WHERE c1.data = AES_ENCRYPT('Some Value 1', 'encryption key')
AND c2.data = AES_ENCRYPT('Some Value 2', 'encryption key');
This query works fine and gives us exactly what we need. HOWEVER, it is constantly showing in the slow query log (possibly due to lack of indexes?). When I ask MySQL to "explain" the query it gives me:
+----+-------------+-------+------+--------------------------------------------------------+---------------------+---------+-------+-------+--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+--------------------------------------------------------+---------------------+---------+-------+-------+--------------------------------+
| 1 | SIMPLE | c1 | ref | credential_id_UNIQUE,credential_id,ix_credentials_data | ix_credentials_data | 22 | const | 1 | Using where; Using temporary |
| 1 | SIMPLE | c2 | ref | credential_id_UNIQUE,credential_id,ix_credentials_data | ix_credentials_data | 22 | const | 1 | Using where |
| 1 | SIMPLE | cp | ALL | NULL | NULL | NULL | NULL | 69197 | Using where; Using join buffer |
+----+-------------+-------+------+--------------------------------------------------------+---------------------+---------+-------+-------+--------------------------------+
I have a feeling that last entry (where it shows 69197 rows) is probably the problem, but I am FAR from a DBA... help?
credentials table:
CREATE TABLE `credentials` (
`hidden_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`credential_id` varchar(255) NOT NULL,
`data` blob NOT NULL,
`credential_status` varchar(100) NOT NULL,
`insert_date` datetime NOT NULL,
`insert_user` int(10) unsigned NOT NULL,
`update_date` datetime DEFAULT NULL,
`update_user` int(10) unsigned DEFAULT NULL,
`delete_date` datetime DEFAULT NULL,
`delete_user` int(10) unsigned DEFAULT NULL,
`is_deleted` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`hidden_id`,`credential_id`),
UNIQUE KEY `credential_id_UNIQUE` (`credential_id`),
KEY `credential_id` (`credential_id`),
KEY `data` (`data`(10)),
KEY `credential_status` (`credential_status`(10))
) ENGINE=InnoDB AUTO_INCREMENT=1572 DEFAULT CHARSET=utf8;
credential_pairs Table:
CREATE TABLE `credential_pairs` (
`hidden_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`credential_id` varchar(255) NOT NULL,
`primary_credential_id` varchar(255) NOT NULL,
`secondary_credential_id` varchar(255) NOT NULL,
`is_deleted` tinyint(1) DEFAULT NULL,
PRIMARY KEY (`hidden_id`,`credential_id`),
KEY `primary_credential_id` (`primary_credential_id`(10)),
KEY `secondary_credential_id` (`secondary_credential_id`(10))
) ENGINE=InnoDB AUTO_INCREMENT=500 DEFAULT CHARSET=latin1;
credentials Indexes:
+-------------+------------+----------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------------+------------+----------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
| credentials | 0 | PRIMARY | 1 | hidden_id | A | 186235 | NULL | NULL | | BTREE | |
| credentials | 0 | PRIMARY | 2 | credential_id | A | 186235 | NULL | NULL | | BTREE | |
| credentials | 0 | credential_id_UNIQUE | 1 | credential_id | A | 186235 | NULL | NULL | | BTREE | |
| credentials | 1 | credential_id | 1 | credential_id | A | 186235 | NULL | NULL | | BTREE | |
| credentials | 1 | ix_credentials_data | 1 | data | A | 186235 | 20 | NULL | | BTREE | |
+-------------+------------+----------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
credential_pair Indexes:
+------------------+------------+---------------------------------------------+--------------+-------------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+------------------+------------+---------------------------------------------+--------------+-------------------------+-----------+-------------+----------+--------+------+------------+---------+
| credential_pairs | 0 | PRIMARY | 1 | hidden_id | A | 69224 | NULL | NULL | | BTREE | |
| credential_pairs | 0 | PRIMARY | 2 | credential_id | A | 69224 | NULL | NULL | | BTREE | |
| credential_pairs | 1 | ix_credential_pairs_credential_id | 1 | credential_id | A | 69224 | 36 | NULL | | BTREE | |
| credential_pairs | 1 | ix_credential_pairs_primary_credential_id | 1 | primary_credential_id | A | 69224 | 36 | NULL | | BTREE | |
| credential_pairs | 1 | ix_credential_pairs_secondary_credential_id | 1 | secondary_credential_id | A | 69224 | 36 | NULL | | BTREE | |
+------------------+------------+---------------------------------------------+--------------+-------------------------+-----------+-------------+----------+--------+------+------------+---------+
UPDATE NOTES:
AFAICT: The DISTINCT was superfluous... nothing really needed it, so I dropped it. In an attempt to follow Fabrizio's advice to get a where on the credential_pairs lookup I then altered the statement to read as:
SELECT credential_id
FROM credential_pairs cp
WHERE cp.primary_credential_id = (SELECT credential_id FROM credentials WHERE data = AES_ENCRYPT('value 1','enc_key')) AND
cp.secondary_credential_id = (SELECT credential_id FROM credentials WHERE data = AES_ENCRYPT('value 2','enc_key'))
And.... nothing. The statement takes just as long and the explain looks pretty much the same. So, I added an index to the primary and secondary columns with:
ALTER TABLE credential_pairs ADD INDEX `idx_credential_pairs__primary_and_secondary`(`primary_credential_id`, `secondary_credential_id`);
And... nothing.
+----+-------------+-------------+-------+---------------------+---------------------------------------------+---------+------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+-------+---------------------+---------------------------------------------+---------+------+-------+--------------------------+
| 1 | PRIMARY | cp | index | NULL | idx_credential_pairs__primary_and_secondary | 514 | NULL | 69217 | Using where; Using index |
| 3 | SUBQUERY | credentials | ref | ix_credentials_data | ix_credentials_data | 22 | | 1 | Using where |
| 2 | SUBQUERY | credentials | ref | ix_credentials_data | ix_credentials_data | 22 | | 1 | Using where |
+----+-------------+-------------+-------+---------------------+---------------------------------------------+---------+------+-------+--------------------------+
It says it's using the index, but it still looks like it's table scanning. So, I added a joint key (as per a'r's comment below) with:
ALTER TABLE credential_pairs ADD KEY (primary_credential_id, secondary_credential_id);
And... same result as with the index (are these functionally the same?).

The DISTINCT is what is generating the "Use temporary", you usually want to avoid those when possible
Plus you are scanning the whole credential_pair table as you do not have any conditions against it so no indexes are used and the whole table is returned before applying the WHERE
hope it makes sense
EDIT/ADD
Try by starting from a different table, if I understand correctly, you have Table A, a Table B and a Table AB and you are starting the select from AB, try to start it from A
I haven't tested this, but you could try:
SELECT cp.credential_id
FROM credentials AS c1
LEFT JOIN credential_pairs AS cp ON (c1.credential_id = cp.primary_credential_id)
LEFT JOIN credentials AS c2 ON (cp.secondary_credential_id = c2.credential_id)
WHERE
c1.data = AES_ENCRYPT('Some Value 1', 'encryption key')
AND c2.data = AES_ENCRYPT('Some Value 2', 'encryption key');
I have had luck in the past by moving select tables around

Related

why mysql explain show using index prefer bigint column than int column

Recently ,i got on a strange question,my test table structure:
CREATE TABLE `index_test` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`a` varchar(64) NOT NULL DEFAULT '',
`card_no` bigint(20) NOT NULL,
`card_no2` bigint(20) NOT NULL,
`optype` int(11) NOT NULL,
`optype2` int(11) NOT NULL,
`create_time` datetime NOT NULL DEFAULT '2000-01-01 00:00:00',
`_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `idx_a` (`a`),
KEY `idx_card_no` (`card_no`),
KEY `idx_card_no2` (`card_no2`),
KEY `idx_optype` (`optype`),
KEY `idx_optype2` (`optype2`)
) ENGINE=InnoDB AUTO_INCREMENT=10000 DEFAULT CHARSET=utf8;
5 major columns,a varchar,cardno and cardno2 are bigint,optype and optype2 are int,
as my experience,mysql index prefer select high cardinality、small data type and non null columns,but when i run explain query statements,a few problems occurred,here is my init data procedure
DELIMITER ;;
CREATE DEFINER=`xx`#`%` PROCEDURE `simple_insert`( )
BEGIN
DECLARE counter BIGINT DEFAULT 0;
my_loop: LOOP
SET counter=counter+1;
IF counter=10000 THEN
LEAVE my_loop;
END IF;
INSERT INTO `index_test` (`a`,`card_no`,`card_no2`,`optype`,`optype2`, `create_time`) VALUES (replace(uuid(), '-', ''),counter,counter%180, counter,counter%180,current_timestamp);
END LOOP my_loop;
END;;
DELIMITER ;
insert 10,000 row data,first i execute the statistics query
select * from information_schema.statistics where table_schema = 'test' and table_name = 'index_test';
output
+---------------+--------------+------------+------------+--------------+--------------+--------------+-------------+-----------+-------------+----------+--------+----------+------------+---------+---------------+
| TABLE_CATALOG | TABLE_SCHEMA | TABLE_NAME | NON_UNIQUE | INDEX_SCHEMA | INDEX_NAME | SEQ_IN_INDEX | COLUMN_NAME | COLLATION | CARDINALITY | SUB_PART | PACKED | NULLABLE | INDEX_TYPE | COMMENT | INDEX_COMMENT |
+---------------+--------------+------------+------------+--------------+--------------+--------------+-------------+-----------+-------------+----------+--------+----------+------------+---------+---------------+
| def | test | index_test | 0 | test | PRIMARY | 1 | id | A | 10089 | NULL | NULL | | BTREE | | |
| def | test | index_test | 1 | test | idx_a | 1 | a | A | 9999 | NULL | NULL | | BTREE | | |
| def | test | index_test | 1 | test | idx_card_no | 1 | card_no | A | 9999 | NULL | NULL | | BTREE | | |
| def | test | index_test | 1 | test | idx_card_no2 | 1 | card_no2 | A | 180 | NULL | NULL | | BTREE | | |
| def | test | index_test | 1 | test | idx_optype | 1 | optype | A | 9999 | NULL | NULL | | BTREE | | |
| def | test | index_test | 1 | test | idx_optype2 | 1 | optype2 | A | 180 | NULL | NULL | | BTREE | | |
+---------------+--------------+------------+------------+--------------+--------------+--------------+-------------+-----------+-------------+----------+--------+----------+------------+---------+---------------+
step 2:
explain select * from index_test where optype=9600 and a= 'e095af180f4911ea8d907036bd142a99';
output:
+----+-------------+------------+------------+------+------------------+-------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+------------------+-------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | index_test | NULL | ref | idx_a,idx_optype | idx_a | 194 | const | 1 | 5.00 | Using where |
+----+-------------+------------+------------+------+------------------+-------+---------+-------+------+----------+-------------+
as my experience ,varchar(64) space is bigger than int,so use int column is ok
step3:
explain select * from index_test where optype=9600 and card_no = 9600;
output
+----+-------------+------------+------------+------+------------------------+-------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+------------------------+-------------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | index_test | NULL | ref | idx_card_no,idx_optype | idx_card_no | 8 | const | 1 | 5.00 | Using where |
+----+-------------+------------+------------+------+------------------------+-------------+---------+-------+------+----------+-------------+
so ,the question is why mysql query optimizer prefer use bigint column than int column,any one can help me or give some offcinal document links about this question ,thanks。
by the way,my test environment is macos(10.14.6) x64 and mysql server version is 5.7.26
I don't think INT vs BIGINT is the issue. First, let me mention better indexes:
For
where optype=9600 and a= 'e095af180f4911ea8d907036bd142a99'
Either of these "composite" indexes would be optimal, and better than what you have:
INDEX(optype, a)
INDEX(a, optype)
For
where optype=9600
and card_no = 9600
and a= 'e095af180f4911ea8d907036bd142a99'
any index starting with those 3 columns is optimal; any 2 would be "good", and single-column indexes would be poor, but better than no index.
The optimizer may be making probes to see which of your 3 poor indexes is best.
I can't explain why it did not list a as a "Possible key".

Optimizing join on derived table - EXPLAIN different on local and server

I have the following ugly query, which runs okay but not great, on my local machine (1.4 secs, running v5.7). On the server I'm using, which is running an older version of MySQL (v5.5), the query just hangs. It seems to get caught on "Copying to tmp table":
SELECT
SQL_CALC_FOUND_ROWS
DISTINCT p.parcel_number,
p.street_number,
p.street_name,
p.site_address_city_state,
p.number_of_units,
p.number_of_stories,
p.bedrooms,
p.bathrooms,
p.lot_area_sqft,
p.cost_per_sq_ft,
p.year_built,
p.sales_date,
p.sales_price,
p.id
FROM (
SELECT APN, property_case_detail_id FROM property_inspection AS pi
GROUP BY APN, property_case_detail_id
HAVING
COUNT(IF(status='Resolved Date', 1, NULL)) = 0
) as open_cases
JOIN property AS p
ON p.parcel_number = open_cases.APN
LIMIT 0, 1000;
mysql> show processlist;
+-------+-------------+-----------+--------------+---------+------+----------------------+------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-------+-------------+-----------+--------------+---------+------+----------------------+------------------------------------------------------------------------------------------------------+
| 21120 | headsupcity | localhost | lead_housing | Query | 21 | Copying to tmp table | SELECT
SQL_CALC_FOUND_ROWS
DISTINCT p.parcel_number,
p.street_numbe |
| 21121 | headsupcity | localhost | lead_housing | Query | 0 | NULL | show processlist |
+-------+-------------+-----------+--------------+---------+------+----------------------+------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
Explains are different on my local machine and on the server, and I'm assuming the only reason my query runs at all on my local machine, is because of the key that is automatically created on the derived table:
Explain (local):
+----+-------------+------------+------------+------+---------------+-------------+---------+------------------------------+---------+----------+---------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+---------------+-------------+---------+------------------------------+---------+----------+---------------------------------+
| 1 | PRIMARY | p | NULL | ALL | NULL | NULL | NULL | NULL | 40319 | 100.00 | Using temporary |
| 1 | PRIMARY | <derived2> | NULL | ref | <auto_key0> | <auto_key0> | 8 | lead_housing.p.parcel_number | 40 | 100.00 | NULL |
| 2 | DERIVED | pi | NULL | ALL | NULL | NULL | NULL | NULL | 1623978 | 100.00 | Using temporary; Using filesort |
+----+-------------+------------+------------+------+---------------+-------------+---------+------------------------------+---------+----------+---------------------------------+
Explain (server):
+----+-------------+------------+------+---------------+------+---------+------+---------+------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+---------------+------+---------+------+---------+------------------------------------------+
| 1 | PRIMARY | p | ALL | NULL | NULL | NULL | NULL | 41369 | Using temporary |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 122948 | Using where; Distinct; Using join buffer |
| 2 | DERIVED | pi | ALL | NULL | NULL | NULL | NULL | 1718586 | Using temporary; Using filesort |
+----+-------------+------------+------+---------------+------+---------+------+---------+------------------------------------------+
Schemas:
mysql> explain property_inspection;
+-------------------------+--------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------------+--------------+------+-----+-------------------+-----------------------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| lblCaseNo | int(11) | NO | MUL | NULL | |
| APN | bigint(10) | NO | MUL | NULL | |
| date | varchar(50) | NO | | NULL | |
| status | varchar(500) | NO | | NULL | |
| property_case_detail_id | int(11) | YES | MUL | NULL | |
| case_type_id | int(11) | YES | MUL | NULL | |
| date_modified | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| update_status | tinyint(1) | YES | | 1 | |
| created_date | datetime | NO | | NULL | |
+-------------------------+--------------+------+-----+-------------------+-----------------------------+
10 rows in set (0.02 sec)
mysql> explain property; (not all columns, but you get the gist)
+----------------------------+--------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+----------------------------+--------------+------+-----+-------------------+-----------------------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| parcel_number | bigint(10) | NO | | 0 | |
| date_modified | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| created_date | datetime | NO | | NULL | |
+----------------------------+--------------+------+-----+-------------------+-----------------------------+
Variables that might be relevant:
tmp_table_size: 16777216
innodb_buffer_pool_size: 8589934592
Any ideas on how to optimize this, and any idea why the explains are so different?
Since this is where the Optimizers are quite different, let's try to optimize
SELECT APN, property_case_detail_id FROM property_inspection AS pi
GROUP BY APN, property_case_detail_id
HAVING
COUNT(IF(status='Resolved Date', 1, NULL)) = 0
) as open_cases
Give this a try:
SELECT ...
FROM property AS p
WHERE NOT EXISTS ( SELECT 1 FROM property_inspection
WHERE status = 'Resolved Date'
AND p.parcel_number = APN )
ORDER BY ??? -- without this, the `LIMIT` is unpredictable
LIMIT 0, 1000;
or...
SELECT ...
FROM property AS p
LEFT JOIN property_inspection AS pi ON p.parcel_number = pi.APN
WHERE pi.status = 'Resolved Date'
AND pi.APN IS NULL
ORDER BY ??? -- without this, the `LIMIT` is unpredictable
LIMIT 0, 1000;
Index:
property_inspection: INDEX(status, parcel_number) -- in either order
MySQL 5.5 and 5.7 are quite different and the later has better optimizer so there is no surprise that explain plans are different.
You'd better provide SHOW CREATE TABLE property; and SHOW CREATE TABLE property_inspection; outputs as it will show indexes that are on your tables.
Your sub-query is the issue.
- Server tries to process 1.6M rows with no index and grouping everything.
- Having is quite expensive operation so you'd better avoid it, expecially in sub-queries.
- Grouping in this case is bad idea. You do not need the aggregation/counting. You need to check if the 'Resolved Date' status is just exists
Based on the information provided I'd recommend:
- Alter table property_inspection to reduce length of status column.
- Add index on the column. Use covering index (APN, property_case_detail_id, status) if possible (in this columns order).
- Change query to something like this:
SELECT
SQL_CALC_FOUND_ROWS
DISTINCT p.parcel_number,
...
p.id
FROM
property_inspection AS `pi1`
INNER JOIN property AS p ON (
p.parcel_number = `pi1`.APN
)
LEFT JOIN (
SELECT
`pi2`.property_case_detail_id
, `pi2`. APN
FROM
property_inspection AS `pi2`
WHERE
`status` = 'Resolved Date'
) AS exclude ON (
exclude.APN = `pi1`.APN
AND exclude.property_case_detail_id = `pi1`.property_case_detail_id
)
WHERE
exclude.APN IS NULL
LIMIT
0, 1000;

How do i query from a Join Table?

New to SQL, Assuming this is the right way to create a JOIN Table between my two main entities, do I have to hardcode insert data for the join table? How do I query from my join table? I'm using SQL Fiddle so I'm not sure if the links are being produced correctly to my foreign keys.
CREATE TABLE Organization(
`Organization_id` int NOT NULL,
PRIMARY KEY(`Organization_id`)
);
CREATE TABLE QuestionBank(
`Question_id` int NOT NULL,
`Question_text` VARCHAR(255) NOT NULL,
PRIMARY KEY(`Question_id`)
);
CREATE TABLE OrganizationQuestion(
`OrganizationQuestion_id` int NOT NULL,
`Question_id` int NOT NULL,
`Organization_id` int NOT NULL,
PRIMARY KEY(`OrganizationQuestion_id`),
FOREIGN KEY(`Question_id`) REFERENCES QuestionBank(`Question_id`),
FOREIGN KEY(`Organization_id`) REFERENCES Organization(`Organization_id`)
);
INSERT INTO Organization(`Organization_id`) VALUES(1);
INSERT INTO QuestionBank(`Question_id`, `Question_text`) VALUES(1, 'How did he perform?');
INSERT INTO OrganizationQuestion(`OrganizationQuestion_id`, `Question_id`, `Organization_id`)
VALUES(1, 1, 1);
This is your join:
select oq.OrganizationQuestion_id,
oq.Question_id,
oq.Organization_id,
o.Organization_id,
qb.Question_id,
qb.Question_text
from OrganizationQuestion oq
join Organization o
on o.Organization_id = oq.Organization_id
join QuestionBank qb
on qb.Question_id = oq.Question_id
+-------------------------+-------------+-----------------+-----------------+-------------+---------------------+
| OrganizationQuestion_id | Question_id | Organization_id | Organization_id | Question_id | Question_text |
+-------------------------+-------------+-----------------+-----------------+-------------+---------------------+
| 1 | 1 | 1 | 1 | 1 | How did he perform? |
+-------------------------+-------------+-----------------+-----------------+-------------+---------------------+
Which is not terribly interesting because you made almost everything a 1.
Output with EXPLAIN:
+----+-------------+-------+--------+-----------------------------+---------+---------+---------------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-----------------------------+---------+---------+---------------------------------+------+-------------+
| 1 | SIMPLE | oq | ALL | Question_id,Organization_id | NULL | NULL | NULL | 1 | NULL |
| 1 | SIMPLE | o | eq_ref | PRIMARY | PRIMARY | 4 | so_gibberish.oq.Organization_id | 1 | Using index |
| 1 | SIMPLE | qb | eq_ref | PRIMARY | PRIMARY | 4 | so_gibberish.oq.Question_id | 1 | NULL |
+----+-------------+-------+--------+-----------------------------+---------+---------+---------------------------------+------+-------------+
Remember that KEYS (Indexes) are not used in queries with small tables. It takes longer to use the KEYS than just a table scan.
To view indexes on a table:
mysql> show indexes from OrganizationQuestion;
+----------------------+------------+-----------------+--------------+-------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------------+------------+-----------------+--------------+-------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| organizationquestion | 0 | PRIMARY | 1 | OrganizationQuestion_id | A | 1 | NULL | NULL | | BTREE | | |
| organizationquestion | 1 | Question_id | 1 | Question_id | A | 1 | NULL | NULL | | BTREE | | |
| organizationquestion | 1 | Organization_id | 1 | Organization_id | A | 1 | NULL | NULL | | BTREE | | |
+----------------------+------------+-----------------+--------------+-------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
See the MySQL Manual pages for SHOW INDEX and EXPLAIN
Edit 2 completely different question
To disallow content during an INSERT (say, 2 FK id's in one table being the same such a Mail 'sender' and 'recipient')
See the following Answer for generating a
signal sqlstate '45000';

MySQL break up left table in LEFT JOIN query for performance enhancement

I have the following MySQL query:
SELECT pool.username
FROM pool
LEFT JOIN sent ON pool.username = sent.username
AND sent.campid = 'YA1LGfh9'
WHERE sent.username IS NULL
AND pool.gender = 'f'
AND (`location` = 'united states' OR `location` = 'us' OR `location` = 'usa');
The problem is that the pool table contains millions of rows and this query takes over 12 minutes to complete. I realize that in this query, the entire left table (pool) is being scanned. The pool table has an auto incremented id row.
I would like to split this query into multiple queries so that rather than scanning the entire pool table I scan 1000 rows at a time and in the next query I would pick up where I left off (1000-2000,2000-3000) and so on using the id column to keep track.
How can I specify this in my query? Please show examples if you know the answer. Thank you.
here are my indexes if it helps:
mysql> show index from main.pool;
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| pool | 0 | PRIMARY | 1 | id | A | 9275039 | NULL | NULL | | BTREE | |
| pool | 1 | username | 1 | username | A | 9275039 | NULL | NULL | | BTREE | |
| pool | 1 | source | 1 | source | A | 1 | NULL | NULL | | BTREE | |
| pool | 1 | location | 1 | location | A | 38168 | NULL | NULL | | BTREE | |
| pool | 1 | pdex | 1 | gender | A | 2 | NULL | NULL | | BTREE | |
| pool | 1 | pdex | 2 | username | A | 9275039 | NULL | NULL | | BTREE | |
| pool | 1 | pdex | 3 | id | A | 9275039 | NULL | NULL | | BTREE | |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
8 rows in set (0.00 sec)
mysql> show index from main.sent;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| sent | 0 | PRIMARY | 1 | primary_key | A | 351 | NULL | NULL | | BTREE | |
| sent | 1 | username | 1 | username | A | 175 | NULL | NULL | | BTREE | |
| sent | 1 | sdex | 1 | campid | A | 7 | NULL | NULL | | BTREE | |
| sent | 1 | sdex | 2 | username | A | 351 | NULL | NULL | | BTREE | |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
and here is the explain for my query:
----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+-------+---------+--------------------------------------+
| 1 | SIMPLE | pool | ref | location,pdex | pdex | 5 | const | 6084332 | Using where |
| 1 | SIMPLE | sent | index | sdex | sdex | 309 | NULL | 351 | Using where; Using index; Not exists |
+----+-------------+-------+-------+---------------+------+---------+-------+---------+--------------------------------------+
here is the structure of the pool table:
| pool | CREATE TABLE `pool` (
`id` int(20) NOT NULL AUTO_INCREMENT,
`username` varchar(50) CHARACTER SET utf8 NOT NULL,
`source` varchar(10) CHARACTER SET utf8 NOT NULL,
`gender` varchar(1) CHARACTER SET utf8 NOT NULL,
`location` varchar(50) CHARACTER SET utf8 NOT NULL,
PRIMARY KEY (`id`),
KEY `username` (`username`),
KEY `source` (`source`),
KEY `location` (`location`),
KEY `pdex` (`gender`,`username`,`id`)
) ENGINE=MyISAM AUTO_INCREMENT=9327026 DEFAULT CHARSET=latin1 |
here is the structure of the sent table:
| sent | CREATE TABLE `sent` (
`primary_key` int(50) NOT NULL AUTO_INCREMENT,
`username` varchar(50) NOT NULL,
`from` varchar(50) NOT NULL,
`campid` varchar(255) NOT NULL,
`timestamp` int(20) NOT NULL,
PRIMARY KEY (`primary_key`),
KEY `username` (`username`),
KEY `sdex` (`campid`,`username`)
) ENGINE=MyISAM AUTO_INCREMENT=352 DEFAULT CHARSET=latin1 |
This produces a syntax error but this WHERE clause in the beginning is what im after:
SELECT pool.username
FROM pool
WHERE id < 1000
LEFT JOIN sent ON pool.username = sent.username
AND sent.campid = 'YA1LGfh9'
WHERE sent.username IS NULL
AND pool.gender = 'f'
AND (location = 'united states' OR location = 'us' OR location = 'usa');
Splitting your query does not sound like the right approach.
The better way would be to fetch some records from your existing query, send your messages and then continue fetching.
Your query could benefit from another compound index on
pool( location, gender, username )
This should allow to run your complete query from sdex and your new index.
If you really want to split the query, an easy approach could be to
SELECT MIN(id), MAX(id) FROM pool
and then loop from min to max in steps of 1000, and add id >= r AND id < r+1000 to your query.
This could return 0 rows if you have gaps, but it would never return more than 1000 rows at once. A different compound index on pool including (id, location, gender and maybe username) could help for this query.
Looks like its using pool.location
Could try adding an index on gender, might no be a lot of help though.
Rationalising location to a country code in your data, and indexing that would probably be useful.
But the first index to add looks to be campid to me, that could make a serious dent in the number of records it has to test.

MySQL index slowing down query

MySQL Server version: 5.0.95
Tables All: InnoDB
I am having an issue with a MySQL db query. Basically I am finding that if I index a particular varchar(50) field tag.name, my queries take longer (x10) than not indexing the field. I am trying to speed this query up, however my efforts seem to be counter productive.
The culprit line and field seems to be:
WHERE `t`.`name` IN ('news','home')
I have noticed that if i query the tag table directly without a join using the same criteria and with the name field indexed, i do not have the issue.. It actually works faster as expected.
EXAMPLE Query **
SELECT `a`.*, `u`.`pen_name`
FROM `tag_link` `tl`
INNER JOIN `tag` `t`
ON `t`.`tag_id` = `tl`.`tag_id`
INNER JOIN `article` `a`
ON `a`.`article_id` = `tl`.`link_id`
INNER JOIN `user` `u`
ON `a`.`user_id` = `u`.`user_id`
WHERE `t`.`name` IN ('news','home')
AND `tl`.`type` = 'article'
AND `a`.`featured` = 'featured'
GROUP BY `article_id`
LIMIT 0 , 5
EXPLAIN with index **
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+--------------------------+---------+---------+-------------------+------+-----------------------------------------------------------+
| 1 | SIMPLE | t | range | PRIMARY,name | name | 152 | NULL | 4 | Using where; Using index; Using temporary; Using filesort |
| 1 | SIMPLE | tl | ref | tag_id,link_id,link_id_2 | tag_id | 4 | portal.t.tag_id | 10 | Using where |
| 1 | SIMPLE | a | eq_ref | PRIMARY,fk_article_user1 | PRIMARY | 4 | portal.tl.link_id | 1 | Using where |
| 1 | SIMPLE | u | eq_ref | PRIMARY | PRIMARY | 4 | portal.a.user_id | 1 | |
+----+-------------+-------+--------+--------------------------+---------+---------+-------------------+------+-----------------------------------------------------------+
EXPLAIN without index **
+----+-------------+-------+--------+--------------------------+---------+---------+---------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+--------------------------+---------+---------+---------------------+------+-------------+
| 1 | SIMPLE | a | index | PRIMARY,fk_article_user1 | PRIMARY | 4 | NULL | 8742 | Using where |
| 1 | SIMPLE | u | eq_ref | PRIMARY | PRIMARY | 4 | portal.a.user_id | 1 | |
| 1 | SIMPLE | tl | ref | tag_id,link_id,link_id_2 | link_id | 4 | portal.a.article_id | 3 | Using where |
| 1 | SIMPLE | t | eq_ref | PRIMARY | PRIMARY | 4 | portal.tl.tag_id | 1 | Using where |
+----+-------------+-------+--------+--------------------------+---------+---------+---------------------+------+-------------+
TABLE CREATE
CREATE TABLE `tag` (
`tag_id` int(11) NOT NULL auto_increment,
`name` varchar(50) NOT NULL,
`type` enum('layout','image') NOT NULL,
`create_dttm` datetime default NULL,
PRIMARY KEY (`tag_id`)
) ENGINE=InnoDB AUTO_INCREMENT=43077 DEFAULT CHARSET=utf8
INDEXS
SHOW INDEX FROM tag_link;
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| tag_link | 0 | PRIMARY | 1 | tag_link_id | A | 42023 | NULL | NULL | | BTREE | |
| tag_link | 1 | tag_id | 1 | tag_id | A | 10505 | NULL | NULL | | BTREE | |
| tag_link | 1 | link_id | 1 | link_id | A | 14007 | NULL | NULL | | BTREE | |
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
SHOW INDEX FROM article;
+---------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| article | 0 | PRIMARY | 1 | article_id | A | 5723 | NULL | NULL | | BTREE | |
| article | 1 | fk_article_user1 | 1 | user_id | A | 1 | NULL | NULL | | BTREE | |
| article | 1 | create_dttm | 1 | create_dttm | A | 5723 | NULL | NULL | YES | BTREE | |
+---------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
Final Solution
It seems that MySQL is just sorted the data incorrectly. In the end it turned out faster to look at the tag table as a sub query returning the ids.
It seems that article_id is the primary key for the article table.
Since you're grouping by article_id, MySQL needs to return the records in order by that column, in order to perform the GROUP BY.
You can see that without the index, it scans all records in the article table, but they're at least in order by article_id, so no later sort is required. The LIMIT optimization can be applied here, since it's already in order, it can just stop after it gets five rows.
In the query with the index on tag.name, instead of scanning the entire articles table, it utilizes the index, but against the tag table, and starts there. Unfortunately, when doing this, the records must later be sorted by article.article_id in order to complete the GROUP BY clause. The LIMIT optimization can't be applied since it must return the entire result set, then order it, in order to get the first 5 rows.
In this case, MySQL just guesses wrongly.
Without the LIMIT clause, I'm guessing that using the index is faster, which is maybe what MySQL was guessing.
How big are your tables?
I noticed in the first explain you have a "Using temporary; Using filesort" which is bad. Your query is likely being dumped to disc which makes it way slower than in memory queries.
Also try to avoid using "select *" and instead query the minimum fields needed.