I have a projects table, and each project has multiple categories assigned. The category mapping is stored in the project_category table. I want to list all recent projects that are not expired. Here are the schema, indexes and query.
Schema
Create table projects (
project_id Bigint UNSIGNED NOT NULL AUTO_INCREMENT,
project_title Varchar(300) NOT NULL,
date_added Datetime NOT NULL,
is_expired Bit(1) NOT NULL DEFAULT false,
Primary Key (project_id)) ENGINE = InnoDB;
Create table project_category (
project_category_id Int UNSIGNED NOT NULL AUTO_INCREMENT,
cat_id Int UNSIGNED NOT NULL,
project_id Bigint UNSIGNED NOT NULL,
Primary Key (project_category_id)) ENGINE = InnoDB;
Indexes
CREATE INDEX project_listing ON projects (is_expired, date_added);
Create INDEX category_mapping_IDX ON project_category (project_id,cat_id);
Query
mysql> EXPLAIN
SELECT P.project_id
FROM projects P
INNER JOIN project_category C USING (project_id)
WHERE P.is_expired=false
AND C.cat_id=17
ORDER BY P.date_added DESC LIMIT 27840,10;
+----+-------------+-------+--------+--------------------------------------------+---------+---------+-------------------------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+--------------------------------------------+---------+---------+-------------------------+--------+---------------------------------+
| 1 | SIMPLE | C | ref | project_id,cat_id,category_mapping_IDX | cat_id | 4 | const | 185088 | Using temporary; Using filesort |
| 1 | SIMPLE | P | eq_ref | PRIMARY,is_expired_INX,project_listing_IDX | PRIMARY | 8 | freelancer.C.project_id | 1 | Using where |
+----+-------------+-------+--------+--------------------------------------------+---------+---------+-------------------------+--------+---------------------------------+
I am wondering why MySQL isn't using the index on project_category, and why it is doing a full sort?
I also tried the following query just to avoid file sorting, but it is not working either.
mysql> EXPLAIN
SELECT P.project_id
FROM projects P,
(
SELECT P.project_id
FROM projects P
INNER JOIN project_category C USING (project_id)
WHERE C.cat_id=17
) F
WHERE F.project_id=P.project_id
AND P.is_expired=FALSE
LIMIT 10;
+----+-------------+------------+--------+--------------------------------------------+---------+---------+-------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+--------------------------------------------+---------+---------+-------------------------+--------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 110920 | |
| 1 | PRIMARY | P | eq_ref | PRIMARY,is_expired_INX,project_listing_IDX | PRIMARY | 8 | F.project_id | 1 | Using where |
| 2 | DERIVED | C | ref | project_id,cat_id,category_mapping_IDX | cat_id | 4 | | 185088 | |
| 2 | DERIVED | P | eq_ref | PRIMARY | PRIMARY | 8 | freelancer.C.project_id | 1 | Using index |
+----+-------------+------------+--------+--------------------------------------------+---------+---------+-------------------------+--------+-------------+
Your problem is here:
Create INDEX category_mapping_IDX ON project_category (project_id,cat_id);
This index is not useful when you filter on a single cat_id, because cat_id is not the leftmost column of the index. Think of the index as a concatenated string sorted by project_id first, and you can see why it cannot be used to look up a cat_id on its own. Swap the order:
Create INDEX category_mapping_IDX ON project_category (cat_id, project_id);
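To make the column-order point concrete, a composite index can be modelled very loosely as a sorted list of tuples. The sketch below is in Python rather than SQL, and the sample rows and the helper names rows_for_cat and rows_for_cat_wrong_order are made up for illustration. It is only an analogy for the leftmost-prefix rule, not a description of how InnoDB stores its B-trees.

```python
from bisect import bisect_left

# Hypothetical (project_id, cat_id) rows from project_category.
rows = [(1, 17), (1, 3), (2, 17), (3, 5), (4, 17), (4, 9)]

# Model each composite index as a sorted list of key tuples.
idx_project_cat = sorted((p, c) for p, c in rows)  # (project_id, cat_id)
idx_cat_project = sorted((c, p) for p, c in rows)  # (cat_id, project_id)


def rows_for_cat(cat_id):
    """Range lookup on (cat_id, project_id): two binary searches, one slice."""
    lo = bisect_left(idx_cat_project, (cat_id,))
    hi = bisect_left(idx_cat_project, (cat_id + 1,))
    return [p for _, p in idx_cat_project[lo:hi]]


def rows_for_cat_wrong_order(cat_id):
    """On (project_id, cat_id) the matches are scattered: check every entry."""
    return [p for p, c in idx_project_cat if c == cat_id]


print(rows_for_cat(17))
print(rows_for_cat_wrong_order(17))
```

Both helpers return the same project ids, but only the first can jump straight to the matching range; the second has to visit every index entry, which is what the swapped column order avoids.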
CREATE TABLE `app_user` (
`uid` int NOT NULL,
`uname` varchar(20) NOT NULL,
`upwd` varchar(20) DEFAULT NULL,
PRIMARY KEY (`uid`),
UNIQUE KEY `uname` (`uname`)
) ENGINE=InnoDB
This is the table definition I used for testing; I inserted a million records into it. When I use the following SQL to count the rows, it takes 20 seconds to execute.
select count(*) from app_user;
+----+-------------+----------+------------+-------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+-------+---------------+------+---------+------+--------+----------+-------------+
| 1 | SIMPLE | app_user | NULL | index | NULL | uid | 4 | NULL | 996948 | 100.00 | Using index |
+----+-------------+----------+------------+-------+---------------+------+---------+------+--------+----------+-------------+
In this case, every record's uid is greater than 0, so I can replace the first statement with this one:
select count(*) from app_user where uid > 0; -- In this case, all uid > 0
+----+-------------+----------+------------+-------+---------------+---------+---------+------+--------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+--------+----------+--------------------------+
| 1 | SIMPLE | app_user | NULL | range | PRIMARY,uid | PRIMARY | 4 | NULL | 498474 | 100.00 | Using where; Using index |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+--------+----------+--------------------------+
It takes just 500 milliseconds. Why is the second statement so much faster?
MySQL version: 5.7.23
Engine: InnoDB
I created an application that monitors network devices from around the world with ICMP echo request packets. It pings devices at a regular interval and stores the results in a MySQL table.
I have a query that fetches the latest 100 up/down events for a given device, but it takes ~38 seconds to execute, which is way too long. I'm trying to optimize the query but I'm kind of lost.
The query:
select
c.id as clusterId,
c.name as cluster,
m.id as machineId,
m.label as machine,
h.id as pingResultId,
h.timePinged as `timestamp`,
h.status
from pinger_history h
join pinger_history_updown ud on ud.pingResultId = h.id
join pinger_machine_ip_addresses i on h.machineIpId = i.id
join pinger_machines m on i.machineId = m.id
join pinger_clusters c on m.clusterId = c.id
where h.deviceId = ?
order by h.id desc
limit 100
Explain query output:
+----+-------------+-------+------------+--------+------------------------------+---------+---------+---------------------------+--------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+------------------------------+---------+---------+---------------------------+--------+----------+----------------------------------------------+
| 1 | SIMPLE | ud | NULL | index | PRIMARY | PRIMARY | 4 | NULL | 111239 | 100.00 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | h | NULL | eq_ref | PRIMARY,deviceId,machineIpId | PRIMARY | 4 | dashboard.ud.pingResultId | 1 | 5.00 | Using where |
| 1 | SIMPLE | i | NULL | eq_ref | PRIMARY,machineId | PRIMARY | 4 | dashboard.h.machineIpId | 1 | 100.00 | NULL |
| 1 | SIMPLE | m | NULL | eq_ref | PRIMARY,clusterId | PRIMARY | 4 | dashboard.i.machineId | 1 | 100.00 | Using where |
| 1 | SIMPLE | c | NULL | eq_ref | PRIMARY | PRIMARY | 4 | dashboard.m.clusterId | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+------------------------------+---------+---------+---------------------------+--------+----------+----------------------------------------------+
The pinger_history table consists of around 483,750,000 rows and pinger_history_updown around 115,520 rows. The other tables are small in comparison (less than 300 rows).
If anyone has experience in optimizing queries or debugging bottlenecks then all help will be greatly appreciated.
Edit:
I added the missing order by h.id desc to the query and I made pinger_history the first table in the query.
Here are the create table queries for pinger_history and pinger_history_updown:
pinger_history:
mysql> show create table pinger_history;
+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| pinger_history | CREATE TABLE `pinger_history` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`deviceId` int(10) unsigned NOT NULL,
`machineIpId` int(10) unsigned NOT NULL,
`minRoundTripTime` decimal(6,1) unsigned DEFAULT NULL,
`maxRoundTripTime` decimal(6,1) unsigned DEFAULT NULL,
`averageRoundTripTime` decimal(6,1) unsigned DEFAULT NULL,
`packetLossRatio` decimal(3,2) unsigned DEFAULT NULL,
`timePinged` datetime NOT NULL,
`status` enum('Up','Unstable','Down') DEFAULT NULL,
`firstOppositeStatusPingResultId` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `deviceId` (`deviceId`),
KEY `machineIpId` (`machineIpId`),
KEY `timePinged` (`timePinged`),
KEY `firstOppositeStatusPingResultId` (`firstOppositeStatusPingResultId`),
CONSTRAINT `pinger_history_ibfk_2` FOREIGN KEY (`machineIpId`) REFERENCES `pinger_machine_ip_addresses` (`id`),
CONSTRAINT `pinger_history_ibfk_4` FOREIGN KEY (`deviceId`) REFERENCES `pinger_devices` (`id`) ON DELETE CASCADE,
CONSTRAINT `pinger_history_ibfk_5` FOREIGN KEY (`firstOppositeStatusPingResultId`) REFERENCES `pinger_history` (`id`) ON DELETE SET NULL
) ENGINE=InnoDB AUTO_INCREMENT=483833283 DEFAULT CHARSET=utf8mb4 |
+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
pinger_history_updown:
+-----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| pinger_history_updown | CREATE TABLE `pinger_history_updown` (
`pingResultId` int(10) unsigned NOT NULL,
`notified` tinyint(1) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`pingResultId`),
CONSTRAINT `pinger_history_updown_ibfk_1` FOREIGN KEY (`pingResultId`) REFERENCES `pinger_history` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 |
+-----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Edit 2:
Here is the output of show index for pinger_history:
mysql> show index from pinger_history;
+----------------+------------+---------------------------------+--------------+---------------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------+------------+---------------------------------+--------------+---------------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| pinger_history | 0 | PRIMARY | 1 | id | A | 443760800 | NULL | NULL | | BTREE | | |
| pinger_history | 1 | deviceId | 1 | deviceId | A | 288388 | NULL | NULL | | BTREE | | |
| pinger_history | 1 | machineIpId | 1 | machineIpId | A | 71598 | NULL | NULL | | BTREE | | |
| pinger_history | 1 | timePinged | 1 | timePinged | A | 38041236 | NULL | NULL | | BTREE | | |
| pinger_history | 1 | firstOppositeStatusPingResultId | 1 | firstOppositeStatusPingResultId | A | 8973 | NULL | NULL | YES | BTREE | | |
+----------------+------------+---------------------------------+--------------+---------------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
Edit 3:
Here is the explain output when I add straight_join:
Note that the query takes almost 2 minutes with straight_join but around 36 seconds without.
+----+-------------+-------+------------+--------+------------------------------+----------+---------+-------------------------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+------------------------------+----------+---------+-------------------------+--------+----------+-------------+
| 1 | SIMPLE | h | NULL | ref | PRIMARY,deviceId,machineIpId | deviceId | 4 | const | 344062 | 100.00 | Using where |
| 1 | SIMPLE | ud | NULL | eq_ref | PRIMARY | PRIMARY | 4 | dashboard.h.id | 1 | 100.00 | Using index |
| 1 | SIMPLE | i | NULL | eq_ref | PRIMARY,machineId | PRIMARY | 4 | dashboard.h.machineIpId | 1 | 100.00 | NULL |
| 1 | SIMPLE | m | NULL | eq_ref | PRIMARY,clusterId | PRIMARY | 4 | dashboard.i.machineId | 1 | 100.00 | Using where |
| 1 | SIMPLE | c | NULL | eq_ref | PRIMARY | PRIMARY | 4 | dashboard.m.clusterId | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+------------------------------+----------+---------+-------------------------+--------+----------+-------------+
Create indexes on the following columns:
pinger_history_updown.pingResultId, pinger_history.id, pinger_history.deviceId,
pinger_history.machineIpId, pinger_machine_ip_addresses.id, pinger_machines.id,
pinger_machines.clusterId, pinger_clusters.id
Indexing will reduce the time taken to fetch the data.
You wrote that your query fetches the latest 100 events for device, but there is no ORDER BY clause in your SQL.
Add ORDER BY h.id DESC to your query and create a composite index on the (deviceId, id) fields.
I would add the STRAIGHT_JOIN keyword and put pinger_history in the first position. Then add an index on pinger_history by deviceId to optimize the WHERE clause. Your other tables probably already have an index on their respective ID keys and should be fine. STRAIGHT_JOIN tells MySQL to join the tables in the order they are written rather than choosing its own order.
select STRAIGHT_JOIN
c.id as clusterId,
c.name as cluster,
m.id as machineId,
m.label as machine,
h.id as pingResultId,
h.timePinged as `timestamp`,
h.status
from
pinger_history h
join pinger_history_updown ud
on h.id = ud.pingResultId
join pinger_machine_ip_addresses i
on h.machineIpId = i.id
join pinger_machines m
on i.machineId = m.id
join pinger_clusters c
on m.clusterId = c.id
where
h.deviceId = ?
order by
h.id desc
limit 100
Since you DO want the most recent records, I would definitely have an index on your pinger_history table on (deviceId, id): replace your existing key on deviceId alone with one on (deviceId, id).
This way, the WHERE clause is optimized FIRST to find the rows for the device. With id in the second position of the index, the ORDER BY can use it to return the most recent rows first.
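As a rough illustration of why a (deviceId, id) index can serve both the WHERE and the ORDER BY ... DESC LIMIT, here is a small Python sketch that treats the index as a sorted list. The sample entries and the helper name latest_ids are hypothetical, and this is an analogy rather than a description of MySQL internals.

```python
from bisect import bisect_left

# Hypothetical (deviceId, id) index entries, kept sorted like a B-tree.
index = sorted([(7, 10), (7, 42), (8, 11), (7, 99), (9, 5), (7, 63), (8, 70)])


def latest_ids(device_id, limit):
    """Return the newest `limit` ids for one device without sorting anything."""
    start = bisect_left(index, (device_id,))    # first entry for the device
    end = bisect_left(index, (device_id + 1,))  # one past its last entry
    first_wanted = max(start, end - limit)
    # Reading the slice backwards yields ids in descending order.
    return [row_id for _, row_id in reversed(index[first_wanted:end])]


print(latest_ids(7, 3))
```

All entries for one device sit next to each other and are already ordered by id, so the newest rows can be read off the end of that range and the scan can stop after the limit is reached.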
Plan A: Get rid of pinger_history_updown and move notified into pinger_history. Perhaps augment status to indicate "CameUp" and "WentDown". Pro: That will make the query much faster since it will be able to use INDEX(deviceId). Con: It makes pinger_history a little bigger; adding columns to a huge table will take time.
Plan B: Add deviceId to pinger_history_updown and have INDEX(deviceID, pingResultId). Pro: Much faster query. Con: Redundant data (deviceid) is frowned on.
Plan C: Add an index hint to force the execution to start with pinger_history. Con: "What helps today may hurt tomorrow." (STRAIGHT_JOIN was tested and found to be slower.)
Plan D: See if ANALYZE TABLE for each table will help. Pro: Quick and cheap. Con: May not help.
Plan E: Change to ORDER BY deviceId DESC, id DESC. Pro: Cheap and easy to try. Con: May not help.
Plan F: In pinger_history, change
PRIMARY KEY (`id`),
KEY `deviceId` (`deviceId`),
to
PRIMARY KEY(deviceId, id),
KEY(id)
This will make the desired rows "clustered" much better. Pros: Much faster. Con: ALTER TABLE will take a long time for that huge table.
Plan G: Assume it is an explode-implode problem and move the LIMIT into a derived table:
select c.id as clusterId, c.name as cluster, m.id as machineId,
m.label as machine, h2.id as pingResultId, h2.timePinged as `timestamp`,
h2.status
FROM
( -- "derived table"
SELECT ud1.pingResultId
FROM pinger_history_updown AS ud1
JOIN pinger_history AS h1 ON ud1.pingResultId = h1.id
WHERE h1.deviceId = ?
ORDER BY ud1.pingResultId DESC -- newest first, so LIMIT keeps the latest 100
LIMIT 100 -- only needed here
) AS ud2
JOIN pinger_history AS h2 ON ud2.pingResultId = h2.id
join pinger_machine_ip_addresses i ON h2.machineIpId = i.id
join pinger_machines m ON i.machineId = m.id
join pinger_clusters c ON m.clusterId = c.id
order by h2.id desc -- Yes, this is repeated
Pro: May make better use of 'covering' INDEX(deviceId), especially if merged with Plan B.
Summary: Start with D and E.
I have a database which consists of three tables, with the following structure:
restaurant table: restaurant_id, location_id, rating. Example: 1325, 77, 4.5
restaurant_name table: restaurant_id, language, name. Example: 1325, 'en', 'Pizza Express'
location_name table: location_id, language, name. Example: 77, 'en', 'New York'
I would like to get the restaurant info in English, sorted by location name and restaurant name, and use the LIMIT clause to paginate the result. So my SQL is:
SELECT ln.name, rn.name
FROM restaurant r
INNER JOIN location_name ln
ON r.location_id = ln.location_id
AND ln.language = 'en'
INNER JOIN restaurant_name rn
ON r.restaurant_id = rn.restaurant_id
AND rn.language = 'en'
ORDER BY ln.name, rn.name
LIMIT 0, 50
This is terribly slow, so I refined my SQL with a deferred JOIN, which makes things a lot faster (from over 10 seconds to 2 seconds):
SELECT ln.name, rn.name
FROM restaurant r
INNER JOIN (
SELECT r.restaurant_id
FROM restaurant r
INNER JOIN location_name ln
ON r.location_id = ln.location_id
AND ln.language = 'en'
INNER JOIN restaurant_name rn
ON r.restaurant_id = rn.restaurant_id
AND rn.language = 'en'
ORDER BY ln.name, rn.name
LIMIT 0, 50
) r1
ON r.restaurant_id = r1.restaurant_id
INNER JOIN location_name ln
ON r.location_id = ln.location_id
AND ln.language = 'en'
INNER JOIN restaurant_name rn
ON r.restaurant_id = rn.restaurant_id
AND rn.language = 'en'
ORDER BY ln.name, rn.name
Unfortunately, 2 seconds is still not acceptable to the user, so I checked the EXPLAIN output of my query. The slow part appears to be the ORDER BY clause, for which I see "Using temporary; Using filesort". I checked the official reference manual on ORDER BY optimization and came across this statement:
In some cases, MySQL cannot use indexes to resolve the ORDER BY,
although it may still use indexes to find the rows that match the
WHERE clause. Examples:
The query joins many tables, and the columns in the ORDER BY are not
all from the first nonconstant table that is used to retrieve rows.
(This is the first table in the EXPLAIN output that does not have a
const join type.)
So in my case, given that the two columns I'm ordering by are from the nonconstant joined tables, an index cannot be used. My question is: is there any other approach I can take to speed things up, or is what I've done so far already the best I can achieve?
Thanks in advance for your help!
EDIT 1
Below is the EXPLAIN output with the ORDER BY clause:
+----+-------------+------------+--------+--------------------------+-----------------------+---------+--------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+--------------------------+-----------------------+---------+--------------------------------+------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 50 | |
| 1 | PRIMARY | rn | ref | idx_restaurant_name_1 | idx_restaurant_name_1 | 1538 | r1.restaurant_id,const,const | 1 | Using where |
| 1 | PRIMARY | r | eq_ref | PRIMARY,idx_restaurant_1 | PRIMARY | 4 | r1.restaurant_id | 1 | |
| 1 | PRIMARY | ln | ref | idx_location_name_1 | idx_location_name_1 | 1538 | test.r.location_id,const,const | 1 | Using where |
| 2 | DERIVED | rn | ALL | idx_restaurant_name_1 | NULL | NULL | NULL | 8484 | Using where; Using temporary; Using filesort |
| 2 | DERIVED | r | eq_ref | PRIMARY,idx_restaurant_1 | PRIMARY | 4 | test.rn.restaurant_id | 1 | |
| 2 | DERIVED | ln | ref | idx_location_name_1 | idx_location_name_1 | 1538 | test.r.location_id | 1 | Using where |
+----+-------------+------------+--------+--------------------------+-----------------------+---------+--------------------------------+------+----------------------------------------------+
Below is the EXPLAIN output without the ORDER BY clause:
+----+-------------+------------+--------+--------------------------+-----------------------+---------+--------------------------------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+--------------------------+-----------------------+---------+--------------------------------+------+--------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 50 | |
| 1 | PRIMARY | rn | ref | idx_restaurant_name_1 | idx_restaurant_name_1 | 1538 | r1.restaurant_id,const,const | 1 | Using where |
| 1 | PRIMARY | r | eq_ref | PRIMARY,idx_restaurant_1 | PRIMARY | 4 | r1.restaurant_id | 1 | |
| 1 | PRIMARY | ln | ref | idx_location_name_1 | idx_location_name_1 | 1538 | test.r.location_id,const,const | 1 | Using where |
| 2 | DERIVED | rn | index | idx_restaurant_name_1 | idx_restaurant_name_1 | 1538 | NULL | 8484 | Using where; Using index |
| 2 | DERIVED | r | eq_ref | PRIMARY,idx_restaurant_1 | PRIMARY | 4 | test.rn.restaurant_id | 1 | |
| 2 | DERIVED | ln | ref | idx_location_name_1 | idx_location_name_1 | 1538 | test.r.location_id | 1 | Using where; Using index |
+----+-------------+------------+--------+--------------------------+-----------------------+---------+--------------------------------+------+--------------------------+
EDIT 2
Below is the DDL of the tables. I built them only to illustrate this problem; the real tables have many more columns.
CREATE TABLE restaurant (
restaurant_id INT NOT NULL AUTO_INCREMENT,
location_id INT NOT NULL,
rating INT NOT NULL,
PRIMARY KEY (restaurant_id),
INDEX idx_restaurant_1 (location_id)
);
CREATE TABLE restaurant_name (
restaurant_id INT NOT NULL,
language VARCHAR(255) NOT NULL,
name VARCHAR(255) NOT NULL,
INDEX idx_restaurant_name_1 (restaurant_id, language),
INDEX idx_restaurant_name_2 (name)
);
CREATE TABLE location_name (
location_id INT NOT NULL,
language VARCHAR(255) NOT NULL,
name VARCHAR(255) NOT NULL,
INDEX idx_location_name_1 (location_id, language),
INDEX idx_location_name_2 (name)
);
Based on the EXPLAIN numbers, there could be about 170 "pages" of restaurants (8484/50)? I suggest that that is impractical for paging through. I strongly recommend you rethink the UI. In doing so, the performance problem you state will probably vanish.
For example, the UI could be 2 steps instead of 170 to get to the restaurants in Zimbabwe. Step 1, pick a country. (OK, that might be page 5 of the countries.) Step 2, view the list of restaurants in that country; it would be only a few pages to flip through. Much better for the user; much better for the database.
Addenda
In order to optimize the pagination, get the paginated list of rows from a single table (so that you can 'remember where you left off'). Then join the language table(s) to look up the translations. Note that this only looks up one page's worth of translations, not thousands.
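One way to "remember where you left off" is keyset (seek) pagination: instead of an OFFSET, the next page is requested relative to the last row already shown. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows and the helper name next_page are made up, and for brevity it orders by restaurant name only, which is a simplification of the query in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE restaurant_name (restaurant_id INTEGER, language TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO restaurant_name VALUES (?, ?, ?)",
    [(1, "en", "Alpha"), (2, "en", "Bravo"), (3, "en", "Bravo"),
     (4, "en", "Charlie"), (5, "en", "Delta"), (5, "fr", "Delta FR")],
)


def next_page(last_name, last_id, page_size):
    """Fetch the rows that sort strictly after the last row already shown."""
    return conn.execute(
        """
        SELECT name, restaurant_id
        FROM restaurant_name
        WHERE language = 'en'
          AND (name > ? OR (name = ? AND restaurant_id > ?))
        ORDER BY name, restaurant_id
        LIMIT ?
        """,
        (last_name, last_name, last_id, page_size),
    ).fetchall()


page1 = next_page("", 0, 2)                        # first page: start before everything
page2 = next_page(page1[-1][0], page1[-1][1], 2)   # continue after the last row seen
print(page1, page2)
```

The restaurant_id tie-breaker keeps the order stable when two restaurants share a name. With an index on (language, name, restaurant_id), each page would be a short range read instead of a scan over all the skipped rows.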
Have found an inefficient query in our system. content holds versions of slides, and this is supposed to select the highest version of a slide by id.
SELECT `content`.*
FROM (`content`)
JOIN (
SELECT max(version) as `version` from `content`
WHERE `slide_id` = '16901'
group by `slide_id`
) c ON `c`.`version` = `content`.`version`;
EXPLAIN
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------+------+----------+--------------------------+
| 1 | PRIMARY | <derived2> | NULL | system | NULL | NULL | NULL | NULL | 1 | 100.00 | NULL |
| 1 | PRIMARY | content | NULL | ref | PRIMARY,version | PRIMARY | 8 | const | 9703 | 100.00 | NULL |
| 2 | DERIVED | content | NULL | ref | PRIMARY,fk_content_slides_idx,thumbnail_asset_id,version,slide_id | fk_content_slides_idx | 8 | const | 1 | 100.00 | Using where; Using index |
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------+------+----------+--------------------------+
One big issue is that it returns almost all the slides in the system as the outer query does not filter by slide id. After adding that I get...
SELECT `content`.*
FROM (`content`)
JOIN (
SELECT max(version) as `version` from `content`
WHERE `slide_id` = '16901' group by `slide_id`
) c ON `c`.`version` = `content`.`version`
WHERE `slide_id` = '16901';
EXPLAIN
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------------+------+----------+--------------------------+
| 1 | PRIMARY | <derived2> | NULL | system | NULL | NULL | NULL | NULL | 1 | 100.00 | NULL |
| 1 | PRIMARY | content | NULL | const | PRIMARY,fk_content_slides_idx,version,slide_id | PRIMARY | 16 | const,const | 1 | 100.00 | NULL |
| 2 | DERIVED | content | NULL | ref | PRIMARY,fk_content_slides_idx,thumbnail_asset_id,version,slide_id | fk_content_slides_idx | 8 | const | 1 | 100.00 | Using where; Using index |
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------------+------+----------+--------------------------+
That correctly reduces the number of rows to one, but doesn't really speed things up.
There are indexes on version, slide_id and a unique key on version AND slide_id.
Is there anything else I can do to speed this up?
Use ORDER BY version DESC LIMIT 1 instead of MAX?
MySQL seems to take an index (version, slide_id) to join the tables. You should get a better result with
SELECT `content`.*
FROM `content`
FORCE INDEX FOR JOIN (fk_content_slides_idx)
join (
SELECT `slide_id`, max(version) as `version` from `content`
WHERE `slide_id` = '16901' group by `slide_id`
) c ON `c`.`slide_id` = `content`.`slide_id` and `c`.`version` = `content`.`version`
You need an index that has slide_id as its first column. I just guessed that this is fk_content_slides_idx; if not, use another one.
The FORCE INDEX FOR JOIN (fk_content_slides_idx) part is only there to enforce it. You should check whether MySQL picks that index by itself without forcing (it should).
You might get a slightly better result with an index on (slide_id, version). Whether you see a difference depends on the amount of data (e.g. the number of versions per id). You should not pile up indexes, and you already have a lot on this table, but you can try it.
Just a suggestion: I think you should drop the GROUP BY slide_id, because you are filtering on a single slide_id (16901):
SELECT `content`.*
FROM (`content`)
JOIN (
SELECT max(version) as `version` from `content`
WHERE `slide_id` = '16901'
) c ON `c`.`version` = `content`.`version`
WHERE `slide_id` = '16901';
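For comparison, the LIMIT 1 idea mentioned in this thread can be checked against the MAX-based join on a toy table. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the table definition and rows are hypothetical and much simpler than the real content table. It only shows that the two forms return the same row and says nothing about their relative speed on MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for the real `content` table.
conn.execute(
    "CREATE TABLE content (slide_id INTEGER, version INTEGER, body TEXT, "
    "PRIMARY KEY (slide_id, version))"
)
conn.executemany(
    "INSERT INTO content VALUES (?, ?, ?)",
    [(16901, 1, "draft"), (16901, 2, "review"),
     (16901, 3, "final"), (16902, 7, "other")],
)

# Highest version via a MAX() subquery joined back to the table.
via_max = conn.execute(
    """
    SELECT content.* FROM content
    JOIN (SELECT MAX(version) AS version FROM content WHERE slide_id = ?) c
      ON c.version = content.version
    WHERE content.slide_id = ?
    """,
    (16901, 16901),
).fetchall()

# Same row via ORDER BY ... DESC LIMIT 1, with no self-join.
via_limit = conn.execute(
    "SELECT * FROM content WHERE slide_id = ? ORDER BY version DESC LIMIT 1",
    (16901,),
).fetchall()

print(via_max, via_limit)
```

Whether the LIMIT 1 form is actually faster on the real table would still need to be confirmed with EXPLAIN; with an index that starts with slide_id and then version, it can in principle be answered by reading a single index entry.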
I am creating two tables as follows:
create table temp_test2 (
date_id int(11) NOT NULL DEFAULT '0',
`date` date NOT NULL,
`day` int(11) NOT NULL,
PRIMARY KEY (date_id)
);
create table temp_test1 (
date_id int(11) NOT NULL DEFAULT '0',
`date` date NOT NULL,
`day` int(11) NOT NULL,
PRIMARY KEY (date_id)
);
explain select * from temp_test as t inner join temp_test2 as t2 on (t2.date_id =t.date_id) limit 3;
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------------+
| 1 | SIMPLE | t | ALL | date_id | NULL | NULL | NULL | 4 | NULL |
| 1 | SIMPLE | t2 | ALL | date_id | NULL | NULL | NULL | 4 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------------+
Why is the date_id key not used for either table? When I add date_id = <value> to the ON condition, the key is used:
explain select * from temp_test as t inner join temp_test2 as t2 on (t2.date_id =t.date_id and t.date_id =1) limit 3;
+----+-------------+-------+-------+-------------------------------------+---------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+-------------------------------------+---------+---------+-------+------+-------+
| 1 | SIMPLE | t | const | PRIMARY,date_id,date_id_2,date_id_3 | PRIMARY | 4 | const | 1 | NULL |
| 1 | SIMPLE | t2 | ref | date_id,date_id_2,date_id_3 | date_id | 4 | const | 1 | NULL |
+----+-------------+-------+-------+-------------------------------------+---------+---------+-------+------+-------+
I also tried unique, composite primary and composite keys, but none of them is used either.
Can anyone explain why this is so?
Because your tables contain a very small number of records, the optimiser decides that it is not worth using the index. A table scan will do just as well.
Also, you selected all fields (SELECT *), so even if it used the index to execute the JOIN, a row lookup would still be required to get the full contents.
The query would be more likely to use the index if:
you selected only the date_id field
there were more than 4 rows in temp_test