I have a table with items:
CREATE TABLE `ost_content` (
`uid` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
`type` enum('media','serial','season','series') NOT NULL,
`alias` varchar(200) NOT NULL,
`views` mediumint(7) NOT NULL DEFAULT '0',
`ratings_count` enum('0','1','2','4','5') NOT NULL DEFAULT '0',
`ratings_sum` mediumint(5) NOT NULL DEFAULT '0',
`upload_date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`conversion_status` enum('converting','error','success','announcement') NOT NULL DEFAULT 'converting',
PRIMARY KEY (`uid`),
UNIQUE KEY `idx_uid_type` (`uid`,`type`),
KEY `idx_type` (`type`),
KEY `idx_upload_date DESC` (`upload_date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
And a table that connects items with categories:
CREATE TABLE `ost_categories2media` (
`categories2media_id` mediumint(6) unsigned NOT NULL AUTO_INCREMENT,
`categories2media_category_id` smallint(5) unsigned NOT NULL,
`categories2media_uid` mediumint(8) unsigned NOT NULL,
PRIMARY KEY (`categories2media_id`),
KEY `categories2media_media_id` (`categories2media_uid`),
KEY `categories2media_category_id` (`categories2media_category_id`)
) ENGINE=InnoDB AUTO_INCREMENT=501114 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Then I execute this query:
SELECT
c1.uid,
c1.alias,
c1.type,
c1.views,
c1.upload_date,
c1.ratings_sum,
c1.ratings_count,
c1.conversion_status
FROM
ost_content c1
LEFT JOIN ost_categories2media c2m ON c2m.categories2media_uid = c1.uid
WHERE
c2m.categories2media_category_id = '53'
AND c1.conversion_status IN ('success', 'announcement')
AND c1.type IN ('serial', 'media')
ORDER BY
c1.upload_date DESC
LIMIT 16, 16
It runs slowly; the lookup on categories2media_category_id examines many rows:
+----+-------------+-------+--------+--------------------------------------------------------+------------------------------+---------+---------------------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+--------------------------------------------------------+------------------------------+---------+---------------------------------+-------+----------------------------------------------+
| 1 | SIMPLE | c2m | ref | categories2media_media_id,categories2media_category_id | categories2media_category_id | 2 | const | 32076 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | c1 | eq_ref | PRIMARY,idx_uid_type,idx_type | PRIMARY | 3 | uakino.c2m.categories2media_uid | 1 | Using where |
+----+-------------+-------+--------+--------------------------------------------------------+------------------------------+---------+---------------------------------+-------+----------------------------------------------+
How can I optimize or rewrite this query?
MySQL indexes are like cooks: too many of them aren't very useful, because MySQL generally uses only one index per table in a query. Let's look at ost_categories2media.
That's three separate indexes on three columns. You are better off with two indexes like this:
PRIMARY KEY (`categories2media_id`),
KEY `categories2media_media_id` (`categories2media_uid`,`categories2media_category_id`)
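Now MySQL no longer has to decide between an index on categories2media_uid and one on categories2media_category_id; it has an index that covers both! Column order still matters: with categories2media_uid first, this index serves lookups by uid, while a filter on categories2media_category_id alone cannot seek into it.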
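A minimal sketch of that change as a single ALTER TABLE, assuming the index names shown in the question's CREATE TABLE and that nothing else relies on the two single-column indexes:

```sql
-- Replace the two single-column indexes with one composite index.
-- Index names are taken from the CREATE TABLE in the question.
ALTER TABLE ost_categories2media
  DROP INDEX categories2media_media_id,
  DROP INDEX categories2media_category_id,
  ADD INDEX categories2media_media_id (categories2media_uid, categories2media_category_id);
```

Running EXPLAIN on the original query afterwards should show whether the optimizer actually picks the new index.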
Looking at your ost_content table we see
PRIMARY KEY (`uid`),
UNIQUE KEY `idx_uid_type` (`uid`,`type`),
KEY `idx_type` (`type`),
KEY `idx_upload_date DESC` (`upload_date`)
Some of these indexes are a bit redundant. Any query that filters on the uid field can use the PK, while any query that filters on type can use idx_type; that means idx_uid_type is there just to enforce uniqueness. But we can make it more useful like this:
PRIMARY KEY (`uid`),
UNIQUE KEY `idx_uid_type` (`type`,`uid`),
KEY `idx_upload_date DESC` (`upload_date`)
We've got rid of one index! That ought to make your writes a little faster, since there is one less index to maintain. You still have an index on upload_date that isn't used in this particular query. So how about a composite index for that?
PRIMARY KEY (`uid`),
UNIQUE KEY `idx_uid_type` (`type`,`uid`),
KEY `idx_upload_date DESC` (`uid`,`upload_date`)
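Put together, the index changes described above might be applied roughly like this. It is only a sketch: the index names come from the question's CREATE TABLE, and it assumes no other query depends on the dropped definitions.

```sql
-- The name `idx_upload_date DESC` contains a space, so it must be quoted.
ALTER TABLE ost_content
  DROP INDEX idx_uid_type,
  DROP INDEX idx_type,
  DROP INDEX `idx_upload_date DESC`,
  ADD UNIQUE KEY idx_uid_type (`type`, uid),
  ADD KEY `idx_upload_date DESC` (uid, upload_date);
```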
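First, the LEFT JOIN is not necessary: the WHERE condition on c2m.categories2media_category_id discards any NULL-extended rows, so the query already behaves as an inner join. So, you can write the query as: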
SELECT c.*
FROM ost_content c JOIN
ost_categories2media c2m
ON c2m.categories2media_uid = c.uid
WHERE c2m.categories2media_category_id = '53' AND
c.conversion_status IN ('success', 'announcement') AND
c.type IN ('serial', 'media')
ORDER BY c.upload_date DESC
LIMIT 16, 16;
Unfortunately, your conditions on the content table are not simple = conditions. If they were, an index on ost_content(conversion_status, type, uid) would be recommended. This might still be the better option.
Another option is to go the other way: An index on ost_categories2media(categories2media_category_id, categories2media_uid).
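The two index options just described could be created as follows. The index names here are hypothetical; only the column lists come from the text above.

```sql
-- Option 1: filter columns first, then the join column.
CREATE INDEX idx_status_type_uid
  ON ost_content (conversion_status, `type`, uid);

-- Option 2: drive the join from the category side.
CREATE INDEX idx_category_uid
  ON ost_categories2media (categories2media_category_id, categories2media_uid);
```

It is probably worth creating one at a time and comparing the EXPLAIN output, since the optimizer may prefer different join orders with each.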
You might find that the first composite index and this query work best:
SELECT c.*
FROM ((SELECT c.*
FROM ost_content c JOIN
ost_categories2media c2m
ON c2m.categories2media_uid = c.uid
WHERE c2m.categories2media_category_id = '53' AND
c.conversion_status = 'success' AND
c.type IN ('serial', 'media')
) UNION ALL
(SELECT c.*
FROM ost_content c JOIN
ost_categories2media c2m
ON c2m.categories2media_uid = c.uid
WHERE c2m.categories2media_category_id = '53' AND
c.conversion_status = 'announcement' AND
c.type IN ('serial', 'media')
)
) c
ORDER BY c.upload_date DESC
LIMIT 16, 16;
This looks more complicated, but each subquery can take advantage of the index, so it might have improved performance.
Related
I have a problem with query speed. My question is similar to this one, but I can't find a solution. EXPLAIN says that MySQL is using: Using where; Using index; Using temporary; Using filesort
Slow query:
select
distinct(`books`.`id`)
from `books`
join `books_genres` on `books_genres`.`book_id` = `books`.`id`
where
`books`.`is_status` = 'active' and `books`.`master_book` = 'true'
and `books_genres`.`genre_id` in(380,381,384,385,1359)
order by
`books`.`livelib_read_num` DESC, `books`.`id` DESC
limit 0,25
#25 rows (0.319 s)
But if I remove the ORDER BY clause from the query, it is really fast:
select sql_no_cache
distinct(`books`.`id`)
from `books`
join `books_genres` on `books_genres`.`book_id` = `books`.`id`
where
`books`.`is_status` = 'active' and `books`.`master_book` = 'true'
and `books_genres`.`genre_id` in(380,381,384,385,1359)
limit 0,25
#25 rows (0.005 s)
Explain:
+------+-------------+--------------+--------+---------------------------------------------------------------------------------------------------------------------+------------------+---------+--------------------------------+--------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+--------------+--------+---------------------------------------------------------------------------------------------------------------------+------------------+---------+--------------------------------+--------+-----------------------------------------------------------+
| 1 | SIMPLE | books_genres | range | book_id,categorie_id,book_id2,genre_id_book_id | genre_id_book_id | 10 | NULL | 194890 | Using where; Using index; Using temporary; Using filesort |
| 1 | SIMPLE | books | eq_ref | PRIMARY,is_status,master_book,is_status_master_book,is_status_master_book_indexed,is_status_donor_no_ru_master_book | PRIMARY | 4 | knigogid3.books_genres.book_id | 1 | Using where |
+------+-------------+--------------+--------+---------------------------------------------------------------------------------------------------------------------+------------------+---------+--------------------------------+--------+-----------------------------------------------------------+
2 rows in set (0.00 sec)
My tables:
CREATE TABLE `books_genres` (
`book_id` int(11) DEFAULT NULL,
`genre_id` int(11) DEFAULT NULL,
`sort` tinyint(4) DEFAULT NULL,
UNIQUE KEY `book_id` (`book_id`,`genre_id`),
KEY `categorie_id` (`genre_id`),
KEY `sort` (`sort`),
KEY `book_id2` (`book_id`),
KEY `genre_id_book_id` (`genre_id`,`book_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `books` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`is_status` enum('active','parser','incorrect','extremist','delete','fulldeteled') NOT NULL DEFAULT 'active',
`livelib_book_id` int(11) DEFAULT NULL,
`master_book` enum('true','false') DEFAULT 'true',
PRIMARY KEY (`id`),
KEY `is_status` (`is_status`),
KEY `master_book` (`master_book`),
KEY `livelib_book_id` (`livelib_book_id`),
KEY `livelib_read_num` (`livelib_read_num`),
KEY `is_status_master_book` (`is_status`,`master_book`),
KEY `livelib_book_id_master_book` (`livelib_book_id`,`master_book`),
KEY `is_status_master_book_indexed` (`is_status`,`master_book`,`indexed`),
KEY `is_status_donor_no_ru_master_book` (`is_status`,`donor`,`no_ru`,`master_book`),
KEY `livelib_url_master_book_is_status` (`livelib_url`,`master_book`,`is_status`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Problems with books_genres.
It has no PRIMARY KEY.
All columns are nullable. Will you ever insert a row with any NULLs?
Recommend (after saying NOT NULL on all columns):
PRIMARY KEY(`book_id`,`genre_id`)
INDEX(genre_id, book_id, sort)
and remove all the rest.
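As a rough sketch, the recommendation for books_genres could be applied in one statement. It assumes the table currently holds no NULLs and no duplicate (book_id, genre_id) pairs; otherwise the ALTER will fail. The name of the new secondary index is made up for illustration.

```sql
ALTER TABLE books_genres
  MODIFY book_id  int(11)    NOT NULL,
  MODIFY genre_id int(11)    NOT NULL,
  MODIFY sort     tinyint(4) NOT NULL,
  -- drop every existing index listed in the question
  DROP INDEX book_id,
  DROP INDEX categorie_id,
  DROP INDEX sort,
  DROP INDEX book_id2,
  DROP INDEX genre_id_book_id,
  -- the two indexes recommended above
  ADD PRIMARY KEY (book_id, genre_id),
  ADD INDEX genre_id_book_id_sort (genre_id, book_id, sort);
```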
I don't see livelib_read_num in the table???
In the other table, remove any indexes that are the exact prefix of some other index.
These might help with speed. (Again, filter out prefix indexes that are redundant.) (These are "covering" indexes, which helps a little.)
books: INDEX(is_status, master_book, livelib_read_num, id)
books: INDEX(livelib_read_num, id, is_status, master_book)
The second index may cause the Optimizer to give preference to ORDER BY. (That is a risky optimization, since it might have to scan the entire index without finding 25 relevant rows.)
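A sketch of what this could look like for the books table. The new index names are hypothetical, and it assumes the livelib_read_num column exists even though it is missing from the posted definition.

```sql
ALTER TABLE books
  -- exact prefixes of longer indexes listed in the question
  DROP INDEX is_status,              -- prefix of is_status_master_book
  DROP INDEX is_status_master_book,  -- prefix of is_status_master_book_indexed
  DROP INDEX livelib_book_id,        -- prefix of livelib_book_id_master_book
  -- filter columns first, then the ORDER BY columns
  ADD INDEX idx_status_master_read_id (is_status, master_book, livelib_read_num, id),
  -- ORDER BY columns first; the riskier of the two suggestions
  ADD INDEX idx_read_id_status_master (livelib_read_num, id, is_status, master_book);
```

Adding the two new indexes one at a time makes it easier to see in EXPLAIN which of them the optimizer prefers.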
SELECT sql_no_cache
`books`.`id`
FROM
`books`
use index(books_idx_is_stat_master_livelib_id)
WHERE
(
1 = 1
AND `books`.`is_status` = 'active'
AND `books`.`master_book` = 'true'
)
AND (
EXISTS (
SELECT
1
FROM
`books_genres`
WHERE
(
`books_genres`.`book_id` = `books`.`id`
)
AND (
`books_genres`.`genre_id` IN (
380, 381, 384, 385, 1359
)
)
)
)
ORDER BY
`books`.`livelib_read_num` DESC,
`books`.`id` DESC
LIMIT 0, 25;
25 rows in set (0.07 sec)
I have a query with a JOIN on three tables that is taking a very long time to run. I created an index on one of my tables for the foreign key (user_shared_url_id) and two columns (event_result, enabled) in the WHERE clause, so it's an index of three columns total. There seems to be no difference from when I simply use an index on the foreign key (user_shared_url_id). The other two tables are using single-column indexes. My users table has about 20,000 rows, but the other two tables are quite large, with ~20 million rows. I can't get a query that takes less than a minute or so to finish. Can anyone think of any potential optimizations I can make to speed this up? Are there other indexes, or improvements to my custom index, that I can work with?
The tables:
CREATE TABLE `users` (
`user_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`roles` varchar(500) DEFAULT NULL,
`first_name` varchar(200) DEFAULT NULL,
`last_name` varchar(100) DEFAULT NULL,
`org_id` int(11) unsigned NOT NULL,
`user_email` varchar(100) NOT NULL,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`user_id`),
KEY `org_id` (`org_id`),
KEY `status` (`status`),
KEY `org_id_user_id` (`org_id`,`user_id`)
) ENGINE=MyISAM AUTO_INCREMENT=162524 DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC
CREATE TABLE `user_shared_urls` (
`user_id` int(11) unsigned NOT NULL,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`user_shared_url_id` int(11) NOT NULL AUTO_INCREMENT,
`target_url` text,
PRIMARY KEY (`user_shared_url_id`),
KEY `user_id` (`user_id`),
KEY `user_id_usu_id` (`user_id`,`user_shared_url_id`)
) ENGINE=InnoDB AUTO_INCREMENT=62449105 DEFAULT CHARSET=utf8;
CREATE TABLE `user_share_events` (
`user_share_event_id` int(11) NOT NULL AUTO_INCREMENT,
`event_result` tinyint(1) unsigned DEFAULT NULL,
`user_shared_url_id` int(11) NOT NULL,
`enabled` tinyint(1) NOT NULL DEFAULT '1',
PRIMARY KEY (`user_share_event_id`),
KEY `user_shared_url_id` (`user_shared_url_id`),
KEY `usuid_enabled_result` (`user_shared_url_id`,`enabled`,`event_result`)
) ENGINE=InnoDB AUTO_INCREMENT=35067339 DEFAULT CHARSET=utf8;
My indexes:
CREATE INDEX org_id_user_id ON users(org_id, user_id);
CREATE INDEX user_id_usu_id ON user_shared_urls(user_id, user_shared_url_id);
CREATE INDEX usuid_enabled_result ON user_share_events(user_shared_url_id,enabled,event_result);
My query:
SELECT
users.user_id,
users.user_email "user_email",
users.roles "role",
CONCAT(users.first_name, ' ', users.last_name) "name",
usus.target_url
FROM
users
JOIN user_shared_urls usus ON usus.user_id = users.user_id
JOIN user_share_events uses ON usus.user_shared_url_id = uses.user_shared_url_id
WHERE
users.org_id = 1523
AND
uses.enabled = '1'
AND
uses.event_result = 1
Explain output of the above query:
+----+-------------+-------+------+----------------------------------------------------------------------------------+--------------------+---------+--------------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------------------------------------------------------------------------+--------------------+---------+--------------------------------+------+-------------+
| 1 | SIMPLE | users | ref | PRIMARY,org_id,org_id_user_id | org_id | 4 | const | 1235 | NULL |
| 1 | SIMPLE | usus | ref | PRIMARY,user_id,user_id_usu_id | user_id_usu_id | 4 | luster.users.user_id | 213 | NULL |
| 1 | SIMPLE | uses | ref | user_shared_url_id,user_and_service,result_service_occurred,usuid_enabled_result | user_shared_url_id | 4 | luster.usus.user_shared_url_id | 1 | Using where |
+----+-------------+-------+------+----------------------------------------------------------------------------------+--------------------+---------+--------------------------------+------+-------------+
3 rows in set (0.00 sec)
(Please use SHOW CREATE TABLE; it is more descriptive than DESCRIBE.)
Change that index you added to
INDEX(user_shared_url_id, -- = and used for the JOIN
enabled, -- =
event_result) -- Last (not an = test)
The order of columns in an INDEX is important. Start with the columns that are tested for = (or IS NULL).
Then remove the FORCE INDEX and run the EXPLAIN again.
Are these tables in a 1:many relationship? Tell us which way.
Another comment: If event_result really has only two values (true/false) and you are using NULL for false, then change the query from
uses.event_result IS NOT NULL
to
uses.event_result = 1
The point is that the Optimizer likes to optimize =, but sees NOT NULL as being any of 256 possible values; very far from =. With this query change, your index should work. And even be picked without using FORCE.
For this query:
SELECT u.user_id, u.user_email, u.roles "role",
CONCAT(u.first_name, ' ', u.last_name) "name",
usu.target_url
FROM user_shared_urls usu JOIN
users u
ON usu.user_id = u.user_id JOIN
user_share_events usev
ON usu.user_shared_url_id = usev.user_shared_url_id
WHERE u.org_id = 1010 AND
usev.event_result IS NOT NULL AND
usev.enabled = 1;
Probably the best indexes are:
users(org_id, user_id)
user_shared_urls(user_id, user_shared_url_id)
user_share_events(user_shared_url_id, enabled, event_result)
This assumes that the filtering on org_id is more selective than the other filters.
I have a slow query. Without the GROUP BY it is fast (0.1-0.3 seconds), but with the (required) GROUP BY it takes around 10-15 seconds.
The query joins two tables, events (near 50 million rows) and events_locations (5 million rows).
Query:
SELECT `e`.`id` AS `event_id`,`e`.`time_stamp` AS `time_stamp`,`el`.`latitude` AS `latitude`,`el`.`longitude` AS `longitude`,
`el`.`time_span` AS `extra`,`e`.`entity_id` AS `asset_name`, `el`.`other_id` AS `geozone_id`,
`el`.`group_alias` AS `group_alias`,`e`.`event_type_id` AS `event_type_id`,
`e`.`entity_type_id`AS `entity_type_id`, el.some_id
FROM events e
INNER JOIN events_locations el ON el.event_id = e.id
WHERE 1=1
AND el.other_id = '1'
AND time_stamp >= '2018-01-01'
AND time_stamp <= '2019-06-02'
GROUP BY `e`.`event_type_id` , `el`.`some_id` , `el`.`group_alias`;
Table events:
CREATE TABLE `events` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`event_type_id` int(11) NOT NULL,
`entity_type_id` int(11) NOT NULL,
`entity_id` varchar(64) NOT NULL,
`alias` varchar(64) NOT NULL,
`time_stamp` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `entity_id` (`entity_id`),
KEY `event_type_idx` (`event_type_id`),
KEY `idx_events_time_stamp` (`time_stamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Table events_locations
CREATE TABLE `events_locations` (
`event_id` bigint(20) NOT NULL,
`latitude` double NOT NULL,
`longitude` double NOT NULL,
`some_id` bigint(20) DEFAULT NULL,
`other_id` bigint(20) DEFAULT NULL,
`time_span` bigint(20) DEFAULT NULL,
`group_alias` varchar(64) NOT NULL,
KEY `some_id_idx` (`some_id`),
KEY `idx_events_group_alias` (`group_alias`),
KEY `idx_event_id` (`event_id`),
CONSTRAINT `fk_event_id` FOREIGN KEY (`event_id`) REFERENCES `events` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The explain:
+----+-------------+-------+--------+---------------------------------+---------+---------+-------------------------------------------+----------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------------------------+---------+---------+-------------------------------------------+----------+------------------------------------------------+
| 1 | SIMPLE | ea | ALL | 'idx_event_id' | NULL | NULL | NULL | 5152834 | 'Using where; Using temporary; Using filesort' |
| 1 | SIMPLE | e | eq_ref | 'PRIMARY,idx_events_time_stamp' | PRIMARY | '8' | 'name.ea.event_id' | 1 | |
+----+-------------+-------+--------+---------------------------------+---------+---------+-------------------------------------------+----------+------------------------------------------------+
2 rows in set (0.08 sec)
From the doc:
Temporary tables can be created under conditions such as these:
If there is an ORDER BY clause and a different GROUP BY clause, or if the ORDER BY or GROUP BY contains columns from tables other than the first table in the join queue, a temporary table is created.
DISTINCT combined with ORDER BY may require a temporary table.
If you use the SQL_SMALL_RESULT option, MySQL uses an in-memory temporary table, unless the query also contains elements (described later) that require on-disk storage.
I already tried:
Creating an index on (el.some_id, el.group_alias)
Decreasing the varchar size to 20
Increasing sort_buffer_size and read_rnd_buffer_size
Any suggestions for performance tuning would be much appreciated!
In your case the events table has an index on time_stamp. So before joining the two tables, first select the required records from the events table for the specific date range, with only the columns you need. Then join events_locations on event_id.
Use MySQL's EXPLAIN keyword to check how the query accesses your tables. It will tell you roughly how many rows are scanned before the required records are selected.
The number of rows scanned also affects query execution time. Use the logic below to reduce the number of rows that are scanned.
SELECT
`e`.`event_id` AS `event_id`,
`e`.`time_stamp` AS `time_stamp`,
`el`.`latitude` AS `latitude`,
`el`.`longitude` AS `longitude`,
`el`.`time_span` AS `extra`,
`e`.`asset_name` AS `asset_name`,
`el`.`other_id` AS `geozone_id`,
`el`.`group_alias` AS `group_alias`,
`e`.`event_type_id` AS `event_type_id`,
`e`.`entity_type_id` AS `entity_type_id`,
`el`.`some_id` as `some_id`
FROM
(select
`id` AS `event_id`,
`time_stamp` AS `time_stamp`,
`entity_id` AS `asset_name`,
`event_type_id` AS `event_type_id`,
`entity_type_id` AS `entity_type_id`
from
`events`
WHERE
time_stamp >= '2018-01-01'
AND time_stamp <= '2019-06-02'
) AS `e`
JOIN `events_locations` `el` ON `e`.`event_id` = `el`.`event_id`
WHERE
`el`.`other_id` = '1'
GROUP BY
`e`.`event_type_id` ,
`el`.`some_id` ,
`el`.`group_alias`;
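To compare how many rows each formulation scans, EXPLAIN can be prefixed to the statement. The following is a trimmed-down sketch that selects only the grouping columns; the table and column names are those from the question.

```sql
EXPLAIN
SELECT e.event_type_id, el.some_id, el.group_alias
FROM (SELECT id AS event_id, event_type_id
      FROM events
      WHERE time_stamp >= '2018-01-01'
        AND time_stamp <= '2019-06-02') AS e
JOIN events_locations el ON e.event_id = el.event_id
WHERE el.other_id = '1'
GROUP BY e.event_type_id, el.some_id, el.group_alias;
```

The rows column is only an estimate, but a large drop compared with the original plan suggests the date filter is being applied before the join.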
The relationship between these tables is 1:1, so I asked myself why a GROUP BY was required, and I found some duplicated rows: 200 in 50000 rows. So somehow my system is inserting duplicates, and someone added that GROUP BY (years ago) instead of tracking down the bug.
So, I will mark this as solved, more or less...
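For anyone hitting the same thing, duplicates of this kind can be listed with a query along these lines. It is a sketch that assumes the duplicates share the same event_id in events_locations, which a 1:1 relationship would otherwise forbid.

```sql
-- List event_ids that appear more than once in events_locations.
SELECT event_id, COUNT(*) AS copies
FROM events_locations
GROUP BY event_id
HAVING COUNT(*) > 1
ORDER BY copies DESC;
```

Once the duplicates are cleaned up, a UNIQUE index on event_id would stop new ones from being inserted.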
I have a query generated by Entity Framework, that looks like this:
SELECT
`Extent1`.`Id`,
`Extent1`.`Name`,
`Extent1`.`ExpireAfterUTC`,
`Extent1`.`FileId`,
`Extent1`.`FileHash`,
`Extent1`.`PasswordHash`,
`Extent1`.`Size`,
`Extent1`.`TimeStamp`,
`Extent1`.`TimeStampOffset`
FROM `files` AS `Extent1` INNER JOIN `containers` AS `Extent2` ON `Extent1`.`ContainerId` = `Extent2`.`Id`
ORDER BY
`Extent1`.`Id` ASC LIMIT 0,10
It runs painfully slow.
I have indexes on files.Id (PK), files.ContainerId(FK), containers.Id(PK) and I don't understand why mysql seems to be doing a full sort before returning the required records, even though there already is an index on the Id column.
Furthermore, this data is displayed in a grid that supports filters, sorts and pagination, so good use of the indexes is essential.
Here are the table definitions:
CREATE TABLE `files` (
`Id` int(11) NOT NULL AUTO_INCREMENT,
`FileId` varchar(100) NOT NULL,
`ContainerId` int(11) NOT NULL,
`ContainerGuid` binary(16) NOT NULL,
`Guid` binary(16) NOT NULL,
`Name` varchar(1000) NOT NULL,
`ExpireAfterUTC` datetime DEFAULT NULL,
`PasswordHash` binary(32) DEFAULT NULL,
`FileHash` tinyblob NOT NULL,
`Size` bigint(20) NOT NULL,
`TimeStamp` double NOT NULL,
`TimeStampOffset` double NOT NULL,
`FilePostId` int(11) NOT NULL,
`FilePostGuid` binary(16) NOT NULL,
`AttributeId` int(11) NOT NULL,
PRIMARY KEY (`Id`),
UNIQUE KEY `FileId_UNIQUE` (`FileId`),
KEY `Files_ContainerId_FK` (`ContainerId`),
KEY `Files_AttributeId_FK` (`AttributeId`),
KEY `Files_FileId_index` (`FileId`),
KEY `Files_FilePostId_index` (`FilePostId`),
KEY `Files_Guid_index` (`Guid`),
CONSTRAINT `Files_AttributeId_FK` FOREIGN KEY (`AttributeId`) REFERENCES `attributes` (`Id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `Files_ContainerId_FK` FOREIGN KEY (`ContainerId`) REFERENCES `containers` (`Id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `Files_FilePostsId_FK` FOREIGN KEY (`FilePostId`) REFERENCES `fileposts` (`Id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=977942 DEFAULT CHARSET=utf8;
CREATE TABLE `containers` (
`Id` int(11) NOT NULL AUTO_INCREMENT,
`Name` varchar(255) NOT NULL,
`Guid` binary(16) NOT NULL,
`AesKey` binary(32) NOT NULL,
`FileCount` int(10) unsigned NOT NULL DEFAULT '0',
`Size` bigint(20) unsigned NOT NULL,
PRIMARY KEY (`Id`),
KEY `Containers_Guid_index` (`Guid`),
KEY `Containers_Name_index` (`Name`)
) ENGINE=InnoDB AUTO_INCREMENT=76 DEFAULT CHARSET=utf8;
You will notice there are some other relationships in the files table, which I have left out just to simplify the query without affecting the observed behavior.
Here is also an output from EXPLAIN EXTENDED:
+----+-------------+---------+-------+----------------------+-----------------------+---------+----------------------------------+-------+----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+-------+----------------------+-----------------------+---------+----------------------------------+-------+----------+----------------------------------------------+
| 1 | SIMPLE | Extent2 | index | PRIMARY | Containers_Guid_index | 16 | NULL | 9 | 100.00 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | Extent1 | ref | Files_ContainerId_FK | Files_ContainerId_FK | 4 | netachmentgeneraltest.Extent2.Id | 73850 | 100.00 | |
+----+-------------+---------+-------+----------------------+-----------------------+---------+----------------------------------+-------+----------+----------------------------------------------+
Files table has ~900000 records (and counting) and containers has 9.
This issue only occurs when ORDER BY is present.
Also, I can't do much in terms of modifying the query because it is generated by Entity Framework. I did as much as I could with the LINQ query in order to simplify it (at first it had some horrible sub queries which executed even slower).
Query hints (as in force index) are not a solution here either, because EF does not support such features.
I am mostly hoping to find some database level optimizations to do.
For those who didn't spot the tags, the database in question is MySql.
MySQL only uses one index per table. Right now, it's preferring to use the foreign key index so the join is efficient, but that means that the sort is not using an index.
Try creating a compound index on files(ContainerId, Id).
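Reading that suggestion as a compound index on ContainerId followed by the primary key Id (an assumption, since the column name in the suggestion is ambiguous), it might look like this. The index name is hypothetical.

```sql
CREATE INDEX Files_ContainerId_Id_index ON files (ContainerId, Id);
```

Note that InnoDB secondary indexes already carry the primary key columns, so this may not change the plan compared with the existing Files_ContainerId_FK index.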
This is essentially your query:
SELECT e1.*
FROM `files` e1 INNER JOIN
`containers` e2
ON e1.`ContainerId` = e2.`Id`
ORDER BY e1.`Id` ASC
LIMIT 0, 10;
You can try an index on files(id, ContainerId). This might inspire MySQL to use the composite index, focused on the order by.
It would probably be more likely if the query were phrased as:
SELECT e1.*
FROM `files` e1
WHERE EXISTS (SELECT 1 FROM containers e2 WHERE e1.`ContainerId` = e2.`Id`)
ORDER BY e1.`Id` ASC
LIMIT 0, 10;
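A sketch of creating the files(id, ContainerId) index mentioned above and checking whether the plan changes. The index name is made up for illustration.

```sql
CREATE INDEX Files_Id_ContainerId_index ON files (Id, ContainerId);

-- Re-check the plan for the query as Entity Framework generates it.
EXPLAIN
SELECT e1.*
FROM files e1
INNER JOIN containers e2 ON e1.ContainerId = e2.Id
ORDER BY e1.Id ASC
LIMIT 0, 10;
```

If "Using temporary; Using filesort" disappears from the Extra column, the optimizer is reading files in Id order instead of sorting.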
There is one way that does work to use the indexes. However, it depends on something in MySQL that is not documented to work (although it does in practice). The following will read the data in order, but it incurs the overhead of materializing the subquery -- but not for a sort:
SELECT e1.*
FROM (SELECT e1.*
FROM files e1
ORDER BY e1.id ASC
) e1
WHERE EXISTS (SELECT 1 FROM containers e2 WHERE e1.`ContainerId` = e2.`Id`)
LIMIT 0, 10;
I'm having problems with a query optimization. The following query takes more than 30 seconds to get the expected result.
SELECT tbl_history.buffet_q_rating, tbl_history.cod_stock, tbl_history.bqqq_change_month, stocks.ticker, countries.country, stocks.company
FROM tbl_history
INNER JOIN stocks ON tbl_history.cod_stock = stocks.cod_stock
INNER JOIN exchange ON stocks.cod_exchange = exchange.cod_exchange
INNER JOIN countries ON exchange.cod_country = countries.cod_country
WHERE exchange.cod_country =125
AND DATE = '2011-07-25'
AND bqqq_change_month IS NOT NULL
AND buffet_q_rating IS NOT NULL
ORDER BY bqqq_change_month DESC
LIMIT 10
The tables are:
CREATE TABLE IF NOT EXISTS `tbl_history` (
`cod_stock` int(11) NOT NULL DEFAULT '0',
`date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`price` decimal(11,3) DEFAULT NULL,
`buffet_q_rating` decimal(11,4) DEFAULT NULL,
`bqqq_change_day` decimal(11,2) DEFAULT NULL,
`bqqq_change_month` decimal(11,2) DEFAULT NULL,
(...)
PRIMARY KEY (`cod_stock`,`date`),
KEY `cod_stock` (`cod_stock`),
KEY `buf_rating` (`buffet_q_rating`),
KEY `data` (`date`),
KEY `bqqq_change_month` (`bqqq_change_month`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE IF NOT EXISTS `stocks` (
`cod_stock` int(11) NOT NULL AUTO_INCREMENT,
`cod_exchange` int(11) DEFAULT NULL,
PRIMARY KEY (`cod_stock`),
KEY `exchangestocks` (`cod_exchange`),
KEY `codstock` (`cod_stock`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=0 ;
CREATE TABLE IF NOT EXISTS `exchange` (
`cod_exchange` int(11) NOT NULL AUTO_INCREMENT,
`exchange` varchar(100) DEFAULT NULL,
`cod_country` int(11) DEFAULT NULL,
PRIMARY KEY (`cod_exchange`),
KEY `countriesexchange` (`cod_country`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=0 ;
CREATE TABLE IF NOT EXISTS `countries` (
`cod_country` int(11) NOT NULL AUTO_INCREMENT,
`country` varchar(100) DEFAULT NULL,
`initial_amount` double DEFAULT NULL,
PRIMARY KEY (`cod_country`),
KEY `codcountry` (`cod_country`),
KEY `country` (`country`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=0 ;
The first table has more than 20 million rows, the second has 40k, and the others have just a few rows (maybe 100).
The problem seems to be the ORDER BY, but I have no idea how to optimize it.
I already tried some things I found searching Google/Stack Overflow, but I was unable to get good results.
Can someone give me some advice?
EDIT:
Forgot the EXPLAIN result:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
| 1 | SIMPLE | countries | const | PRIMARY,codcountry | PRIMARY | 4 | const | 1 | Using temporary; Using filesort |
| 1 | SIMPLE | exchange | ref | PRIMARY,countriesexchange | countriesexchange | 5 | const | 15 | Using where |
| 1 | SIMPLE | stocks | ref | PRIMARY,exchangestocks,codstock | exchangestocks | 5 | databaseName.exchange.cod_exchange | 661 | Using where |
| 1 | SIMPLE | tbl_history | eq_ref | PRIMARY,cod_stock,buf_rating,data,bqqq_change_mont... | PRIMARY | 12 | v.stocks.cod_stock,const | 1 | Using where |
UPDATE
this is the new EXPLAIN I got:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
| 1 | SIMPLE | tbl_history | range | monthstats | monthstats | 14 | NULL | 80053 | Using where; Using index |
| 1 | SIMPLE | countries | ref | country | country | 4 | const | 1 | Using index |
| 1 | SIMPLE | exchange | ref | PRIMARY,cod_country,countryexchange | countryexchange | 5 | const | 5 | Using where; Using index |
| 1 | SIMPLE | stocks | ref | info4stats | info4stats | 9 | databaseName.exchange.cod_exchange,databaseName.stock_... | 1 | Using where; Using index |
I would start with the country records for 125 and work in reverse. Using STRAIGHT_JOIN forces MySQL to join the tables in the order they are written.
I would also have an index on your tbl_history table on cod_stock and DATE(date), so the query can efficiently match the join condition on the date portion of the date/time field.
SELECT STRAIGHT_JOIN
th.buffet_q_rating,
th.cod_stock,
th.bqqq_change_month,
s.ticker,
c.country,
s.company
FROM
exchange e
join countries c
on e.cod_country = c.cod_country
join stocks s
on e.cod_exchange = s.cod_exchange
join tbl_history th
on s.cod_stock = th.cod_stock
AND th.`Date` = '2011-07-25'
AND th.bqqq_change_month IS NOT NULL
AND th.buffet_q_rating IS NOT NULL
WHERE
e.Cod_Country = 125
ORDER BY
th.bqqq_change_month DESC
LIMIT 10
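MySQL versions before 8.0.13 cannot index an expression such as DATE(date) directly. An alternative sketch, swapping in a different technique from the one described above and assuming the date column may carry a time part, is a half-open range that the existing PRIMARY KEY (cod_stock, date) can serve:

```sql
SELECT STRAIGHT_JOIN
       th.buffet_q_rating, th.cod_stock, th.bqqq_change_month, c.country
FROM exchange e
JOIN countries c ON e.cod_country = c.cod_country
JOIN stocks s ON e.cod_exchange = s.cod_exchange
JOIN tbl_history th
  ON s.cod_stock = th.cod_stock
 -- half-open range: every timestamp on 2011-07-25
 AND th.`date` >= '2011-07-25'
 AND th.`date` <  '2011-07-26'
 AND th.bqqq_change_month IS NOT NULL
 AND th.buffet_q_rating IS NOT NULL
WHERE e.cod_country = 125
ORDER BY th.bqqq_change_month DESC
LIMIT 10;
```

If every row is stored at midnight, the plain equality in the query above matches the same rows and the range is not needed.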
If you want to limit the result, why do you do it after joining all the tables?
Try to reduce the size of the big tables first (with LIMIT or WHERE) before joining them with the other tables.
But you have to be sure that your original query and your modified query mean the same thing.
Update (Sample) :
select
tbl_user.user_id,
tbl_group.group_name
from
tbl_grp_user
inner join
(
select
tbl_user.user_id,
tbl_user.user_name
from
tbl_user
limit
5
) as tbl_user
on
tbl_user.user_id = tbl_grp_user.user_id
inner join
(
select
group_id,
group_name
from
tbl_group
where
tbl_group.group_id > 5
) as tbl_group
on
tbl_group.group_id = tbl_grp_user.group_id
Hopefully the query above will give you a hint.