Optimise comparing data in two big MySQL tables

How can I optimise a query that finds all records which:
have activation_request.date_confirmed not null
and
have no related string value in the other table (activation_request.email = user.username should not match any record)?
I tried:
SELECT email
FROM activation_request l
LEFT JOIN user r ON r.username = l.email
WHERE l.date_confirmed is not null
AND r.username IS NULL
and
SELECT email
FROM activation_request
WHERE date_confirmed is not null
AND NOT EXISTS (SELECT 1
FROM user
WHERE user.username = activation_request.email
)
but both tables have xxx.xxx.xxx records, so after those queries ran all night I unfortunately still had no results.
Create statements:
CREATE TABLE `activation_request` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`version` bigint(20) NOT NULL,
`date_confirmed` datetime DEFAULT NULL,
`email` varchar(255) NOT NULL,
(...)
PRIMARY KEY (`id`),
KEY `emailIdx` (`email`),
KEY `reminderSentIdx` (`date_reminder_sent`),
KEY `idx_resent_needed` (`date_reminder_sent`,`date_confirmed`),
) ENGINE=InnoDB AUTO_INCREMENT=103011867 DEFAULT CHARSET=utf8;
CREATE TABLE `user` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`version` bigint(20) NOT NULL,
`username` varchar(255) NOT NULL,
(...)
PRIMARY KEY (`id`),
UNIQUE KEY `Q52plW9W7TJWZcLj00K3FmuhwMSw4F7vmxJGyjxz5iiINVR9fXyacEoq4rHppb` (`username`),
) ENGINE=InnoDB AUTO_INCREMENT=431400048 DEFAULT CHARSET=latin1;
Explain for LEFT JOIN:
[[id:1, select_type:SIMPLE, table:l, type:ALL, possible_keys:null, key:null, key_len:null, ref:null, rows:49148965, Extra:Using where],
[id:1, select_type:SIMPLE, table:r, type:index, possible_keys:null, key:Q52plW9W7TJWZcLj00K3FmuhwMSw4F7vmxJGyjxz5iiINVR9fXyacEoq4rHppb, key_len:257, ref:null, rows:266045508, Extra:Using where; Not exists; Using index; Using join buffer (Block Nested Loop)]]
After adding indexes on the staging db (with slightly less data, but the same structure), the query has now been running for ~24h and there are still no results:
$ show processlist;
| Id | User | Host | db | Command | Time | State | Info
| 64 | root | localhost | staging_db | Query | 110072 | Sending data | SELECT ar.email FROM activation_request ar WHERE ar.date_confirmed is not null AND NOT EXISTS (SELE |
Mysql version:
$ select version();
5.6.16-1~exp1
All other connections in the list are Sleep, so there is no other query running that could be disturbing/locking rows.

For this query:
SELECT ar.email
FROM activation_request ar
WHERE ar.date_confirmed is not null AND
NOT EXISTS (SELECT 1
FROM user u
WHERE u.username = ar.email
)
I would recommend indexes on activation_request(date_confirmed, email) and user(username).
Unless you have a really humongous amount of data, though, your problem may be that tables are locked.
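For reference, the missing piece is just the composite index; here is a minimal sketch, assuming the column names from the CREATE statements above (the index name idx_confirmed_email is made up, and user.username is already covered by the existing UNIQUE KEY, so no extra index is needed on that side):
-- serves the date_confirmed IS NOT NULL filter and supplies email for the anti-join
CREATE INDEX idx_confirmed_email
    ON activation_request (date_confirmed, email);
On tables of this size the index build itself can take a while, so it is worth running during a quiet period.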

Related

Need help optimizing sql JOIN query and indexes on large tables

I have a query with a JOIN on three tables that is taking a very long time to run. I created an index on one of my tables for the foreign key (user_shared_url_id) and two columns (event_result, enabled) in the WHERE clause, so it's an index of three columns total. There seems to be no difference from when I simply use an index on the foreign key (user_shared_url_id). The other two tables are using single-column indexes. My users table has about 20,000 rows, but the other two tables are quite large, with ~20 million rows. I can't get the query to finish in less than a minute or so. Can anyone think of any potential optimizations I can make to speed this up? Are there other indexes or improvements to my custom index that I can work with?
The tables:
CREATE TABLE `users` (
`user_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`roles` varchar(500) DEFAULT NULL,
`first_name` varchar(200) DEFAULT NULL,
`last_name` varchar(100) DEFAULT NULL,
`org_id` int(11) unsigned NOT NULL,
`user_email` varchar(100) NOT NULL,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`user_id`),
KEY `org_id` (`org_id`),
KEY `status` (`status`),
KEY `org_id_user_id` (`org_id`,`user_id`)
) ENGINE=MyISAM AUTO_INCREMENT=162524 DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC
CREATE TABLE `user_shared_urls` (
`user_id` int(11) unsigned NOT NULL,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`user_shared_url_id` int(11) NOT NULL AUTO_INCREMENT,
`target_url` text,
PRIMARY KEY (`user_shared_url_id`),
KEY `user_id` (`user_id`),
KEY `user_id_usu_id` (`user_id`,`user_shared_url_id`)
) ENGINE=InnoDB AUTO_INCREMENT=62449105 DEFAULT CHARSET=utf8
CREATE TABLE `user_share_events` (
`user_share_event_id` int(11) NOT NULL AUTO_INCREMENT,
`event_result` tinyint(1) unsigned DEFAULT NULL,
`user_shared_url_id` int(11) NOT NULL,
`enabled` tinyint(1) NOT NULL DEFAULT '1',
PRIMARY KEY (`user_share_event_id`),
KEY `user_shared_url_id` (`user_shared_url_id`),
KEY `usuid_enabled_result` (`user_shared_url_id`,`enabled`,`event_result`)
) ENGINE=InnoDB AUTO_INCREMENT=35067339 DEFAULT CHARSET=utf8
My indexes:
CREATE INDEX org_id_user_id ON users(org_id, user_id);
CREATE INDEX user_id_usu_id ON user_shared_urls(user_id, user_shared_url_id);
CREATE INDEX usuid_enabled_result ON user_share_events(user_shared_url_id,enabled,event_result);
My query:
SELECT
users.user_id,
users.user_email "user_email",
users.roles "role",
CONCAT(users.first_name, ' ', users.last_name) "name",
usus.target_url
FROM
users
JOIN user_shared_urls usus ON usus.user_id = users.user_id
JOIN user_share_events uses ON usus.user_shared_url_id = uses.user_shared_url_id
WHERE
users.org_id = 1523
AND
uses.enabled = '1'
AND
uses.event_result = 1
Explain output of the above query:
+----+-------------+-------+------+----------------------------------------------------------------------------------+--------------------+---------+--------------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------------------------------------------------------------------------+--------------------+---------+--------------------------------+------+-------------+
| 1 | SIMPLE | users | ref | PRIMARY,org_id,org_id_user_id | org_id | 4 | const | 1235 | NULL |
| 1 | SIMPLE | usus | ref | PRIMARY,user_id,user_id_usu_id | user_id_usu_id | 4 | luster.users.user_id | 213 | NULL |
| 1 | SIMPLE | uses | ref | user_shared_url_id,user_and_service,result_service_occurred,usuid_enabled_result | user_shared_url_id | 4 | luster.usus.user_shared_url_id | 1 | Using where |
+----+-------------+-------+------+----------------------------------------------------------------------------------+--------------------+---------+--------------------------------+------+-------------+
3 rows in set (0.00 sec)
(Please use SHOW CREATE TABLE; it is more descriptive than DESCRIBE.)
Change that index you added to
INDEX(user_shared_url_id, -- = and used for the JOIN
enabled, -- =
event_result) -- Last (not an = test)
The order of columns in an INDEX is important. Start with the columns that are tested for = (or IS NULL).
Then remove the FORCE INDEX and run the EXPLAIN again.
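Spelled out as DDL, that would be something like the following (a sketch; it reuses the usuid_enabled_result name from the table definition above, so if your existing index has a different column order you would drop it and recreate it in this order):
ALTER TABLE user_share_events
    DROP INDEX usuid_enabled_result,
    -- '=' column used for the JOIN first, the '=' filter next, the non-'=' test last
    ADD INDEX usuid_enabled_result (user_shared_url_id, enabled, event_result);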
Are these tables in a 1:many relationship? Tell us which way.
Another comment: If event_result really has only two values (true/false) and you are using NULL for false, then change the query from
uses.event_result IS NOT NULL
to
uses.event_result = 1
The point is that the optimizer likes to optimize =, but sees IS NOT NULL as potentially any of 256 values, which is very far from =. With this query change, your index should work, and it should even be picked without using FORCE INDEX.
For this query:
SELECT u.user_id, u.user_email, u.roles "role",
CONCAT(u.first_name, ' ', u.last_name) "name",
usu.target_url
FROM user_shared_urls usu JOIN
users u
ON usu.user_id = u.user_id JOIN
user_share_events usev
ON usu.user_shared_url_id = usev.user_shared_url_id
WHERE u.org_id = 1010 AND
usev.event_result IS NOT NULL AND
usev.enabled = 1;
Probably the best indexes are:
users(org_id, user_id)
user_shared_urls(user_id, user_shared_url_id)
user_share_events(user_shared_url_id, enabled, event_result)
This assumes that the filtering on org_id is more selective than the other filters.

MySql group by optimization - avoid tmp table and/or filesort

I have a slow query; without the group by it is fast (0.1-0.3 seconds), but with the (required) group by the duration is around 10-15s.
The query joins two tables, events (near 50 million rows) and events_locations (5 million rows).
Query:
SELECT `e`.`id` AS `event_id`,`e`.`time_stamp` AS `time_stamp`,`el`.`latitude` AS `latitude`,`el`.`longitude` AS `longitude`,
`el`.`time_span` AS `extra`,`e`.`entity_id` AS `asset_name`, `el`.`other_id` AS `geozone_id`,
`el`.`group_alias` AS `group_alias`,`e`.`event_type_id` AS `event_type_id`,
`e`.`entity_type_id`AS `entity_type_id`, el.some_id
FROM events e
INNER JOIN events_locations el ON el.event_id = e.id
WHERE 1=1
AND el.other_id = '1'
AND time_stamp >= '2018-01-01'
AND time_stamp <= '2019-06-02'
GROUP BY `e`.`event_type_id` , `el`.`some_id` , `el`.`group_alias`;
Table events:
CREATE TABLE `events` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`event_type_id` int(11) NOT NULL,
`entity_type_id` int(11) NOT NULL,
`entity_id` varchar(64) NOT NULL,
`alias` varchar(64) NOT NULL,
`time_stamp` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `entity_id` (`entity_id`),
KEY `event_type_idx` (`event_type_id`),
KEY `idx_events_time_stamp` (`time_stamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Table events_locations
CREATE TABLE `events_locations` (
`event_id` bigint(20) NOT NULL,
`latitude` double NOT NULL,
`longitude` double NOT NULL,
`some_id` bigint(20) DEFAULT NULL,
`other_id` bigint(20) DEFAULT NULL,
`time_span` bigint(20) DEFAULT NULL,
`group_alias` varchar(64) NOT NULL,
KEY `some_id_idx` (`some_id`),
KEY `idx_events_group_alias` (`group_alias`),
KEY `idx_event_id` (`event_id`),
CONSTRAINT `fk_event_id` FOREIGN KEY (`event_id`) REFERENCES `events` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The explain:
+----+-------------+-------+--------+---------------------------------+---------+---------+-------------------------------------------+----------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------------------------+---------+---------+-------------------------------------------+----------+------------------------------------------------+
| 1 | SIMPLE | ea | ALL | 'idx_event_id' | NULL | NULL | NULL | 5152834 | 'Using where; Using temporary; Using filesort' |
| 1 | SIMPLE | e | eq_ref | 'PRIMARY,idx_events_time_stamp' | PRIMARY | '8' | 'name.ea.event_id' | 1 | |
+----+-------------+-------+--------+---------------------------------+---------+---------+-------------------------------------------+----------+------------------------------------------------+
2 rows in set (0.08 sec)
From the doc:
Temporary tables can be created under conditions such as these:
If there is an ORDER BY clause and a different GROUP BY clause, or if the ORDER BY or GROUP BY contains columns from tables other than the first table in the join queue, a temporary table is created.
DISTINCT combined with ORDER BY may require a temporary table.
If you use the SQL_SMALL_RESULT option, MySQL uses an in-memory temporary table, unless the query also contains elements (described later) that require on-disk storage.
I already tried:
Creating an index on el.some_id, el.group_alias (see the sketch after this list)
Decreasing the varchar size to 20
Increasing the size of sort_buffer_size and read_rnd_buffer_size
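Roughly, those attempts correspond to statements like the following (a sketch; the index name is invented and the buffer sizes are only examples):
CREATE INDEX idx_someid_groupalias ON events_locations (some_id, group_alias);
SET SESSION sort_buffer_size     = 8 * 1024 * 1024;   -- bigger in-memory sort buffer
SET SESSION read_rnd_buffer_size = 4 * 1024 * 1024;   -- bigger random-read buffer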
Any suggestions for performance tuning would be much appreciated!
In your case the events table has an index on time_stamp. So before joining the two tables, first select the required records from the events table for the specific date range, then join events_locations using the table relation.
Use MySQL's EXPLAIN keyword to check how your query approaches the table records. It will tell you how many rows are scanned before the required records are selected.
The number of rows scanned also contributes to query execution time. Use the logic below to reduce the number of rows that are scanned.
SELECT
`e`.`id` AS `event_id`,
`e`.`time_stamp` AS `time_stamp`,
`el`.`latitude` AS `latitude`,
`el`.`longitude` AS `longitude`,
`el`.`time_span` AS `extra`,
`e`.`entity_id` AS `asset_name`,
`el`.`other_id` AS `geozone_id`,
`el`.`group_alias` AS `group_alias`,
`e`.`event_type_id` AS `event_type_id`,
`e`.`entity_type_id` AS `entity_type_id`,
`el`.`some_id` as `some_id`
FROM
(select
`id` AS `event_id`,
`time_stamp` AS `time_stamp`,
`entity_id` AS `asset_name`,
`event_type_id` AS `event_type_id`,
`entity_type_id` AS `entity_type_id`
from
`events`
WHERE
time_stamp >= '2018-01-01'
AND time_stamp <= '2019-06-02'
) AS `e`
JOIN `events_locations` `el` ON `e`.`event_id` = `el`.`event_id`
WHERE
`el`.`other_id` = '1'
GROUP BY
`e`.`event_type_id` ,
`el`.`some_id` ,
`el`.`group_alias`;
The relationship between these tables is 1:1, so I asked myself why a group by is required, and I found some duplicated rows: 200 in 50,000 rows. So, somehow, my system is inserting duplicates, and someone added that group by (years ago) instead of looking for the bug.
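For reference, duplicates like that can be located with a grouping query along these lines (a sketch, assuming the duplicates are extra events_locations rows sharing one event_id, which is what the 1:1 relationship should forbid):
SELECT event_id, COUNT(*) AS copies
FROM events_locations
GROUP BY event_id
HAVING COUNT(*) > 1;   -- event_ids that appear more than once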
So, I will mark this as solved, more or less...

Mysql join acting weird

I have a very simple query, but I am a beginner at this and I couldn't really understand what the problem is, as it's not working properly in the second case:
SELECT a.user_name, a.password, a.id, r.role_name
FROM accounts as a
JOIN roles as r ON a.role=r.id
SELECT accounts.user_name, accounts.password, accounts.id, roles.role_name
FROM accounts
JOIN roles ON accounts.role=roles.id
SELECT *
FROM accounts as a
JOIN roles as r ON a.role=r.id
accounts.role and roles.id are linked with a foreign key. I tried to select everything using * in the third query, but it didn't get anything from the second table, only everything from the first table (as in the last photo). So what might be the problem?
This behaviour makes no sense; all fields must appear when you use *.
Let's do a test on SQL Fiddle
MySQL 5.6 Schema Setup:
create table t1 ( i int, a char);
insert into t1 values (1,'a');
create table t2 ( i int, b char);
insert into t2 values (1,'a');
Query 1:
select *
from t1 inner join t2 on t1.i = t2.i
Results:
| i | a | i | b |
|---|---|---|---|
| 1 | a | 1 | a |
Query 2:
select *
from t1 x inner join t2 y on x.i = y.i
Results:
| i | a | i | b |
|---|---|---|---|
| 1 | a | 1 | a |
You can see that all fields appear every time. Maybe it is an issue with the program you use to connect and run the queries. Double-check that you are executing the whole statement, not only the first 2 lines, and also check whether there is a scroll bar to see more data.
I've recreated the database with the following code:
$mysqli->query('
CREATE TABLE `crm`.`roles`
(
`id` TINYINT(1) NOT NULL AUTO_INCREMENT,
`role_name` VARCHAR(20) NOT NULL,
`edit` TINYINT(1) NOT NULL DEFAULT 0,
PRIMARY KEY (`id`)
);') or die($mysqli->error);
$mysqli->query('
CREATE TABLE `crm`.`accounts`
(
`id` INT NOT NULL AUTO_INCREMENT,
`role` TINYINT(1) NOT NULL DEFAULT 1,
`user_name` VARCHAR(20) NOT NULL,
`password` VARCHAR(100) NOT NULL,
`email` VARCHAR(100) NOT NULL,
`first_name` VARCHAR(50) NOT NULL,
`last_name` VARCHAR(50) NOT NULL,
`hash` VARCHAR(32) NOT NULL,
`active` BOOL NOT NULL DEFAULT 0,
PRIMARY KEY (`id`),
FOREIGN KEY (`role`) REFERENCES roles(`id`)
);') or die($mysqli->error);
and every combination of SELECT is working fine now. I don't know what the problem was, since I don't remember making any changes to the tables.

Slow MySql query with order by limit with index

I have a query generated by Entity Framework, that looks like this:
SELECT
`Extent1`.`Id`,
`Extent1`.`Name`,
`Extent1`.`ExpireAfterUTC`,
`Extent1`.`FileId`,
`Extent1`.`FileHash`,
`Extent1`.`PasswordHash`,
`Extent1`.`Size`,
`Extent1`.`TimeStamp`,
`Extent1`.`TimeStampOffset`
FROM `files` AS `Extent1` INNER JOIN `containers` AS `Extent2` ON `Extent1`.`ContainerId` = `Extent2`.`Id`
ORDER BY
`Extent1`.`Id` ASC LIMIT 0,10
It runs painfully slow.
I have indexes on files.Id (PK), files.ContainerId (FK), and containers.Id (PK), and I don't understand why MySQL seems to be doing a full sort before returning the required records, even though there already is an index on the Id column.
Furthermore, this data is displayed in a grid which supports filters, sorts and pagination, so good use of the indexes is highly desirable.
Here are the table definitions:
CREATE TABLE `files` (
`Id` int(11) NOT NULL AUTO_INCREMENT,
`FileId` varchar(100) NOT NULL,
`ContainerId` int(11) NOT NULL,
`ContainerGuid` binary(16) NOT NULL,
`Guid` binary(16) NOT NULL,
`Name` varchar(1000) NOT NULL,
`ExpireAfterUTC` datetime DEFAULT NULL,
`PasswordHash` binary(32) DEFAULT NULL,
`FileHash` tinyblob NOT NULL,
`Size` bigint(20) NOT NULL,
`TimeStamp` double NOT NULL,
`TimeStampOffset` double NOT NULL,
`FilePostId` int(11) NOT NULL,
`FilePostGuid` binary(16) NOT NULL,
`AttributeId` int(11) NOT NULL,
PRIMARY KEY (`Id`),
UNIQUE KEY `FileId_UNIQUE` (`FileId`),
KEY `Files_ContainerId_FK` (`ContainerId`),
KEY `Files_AttributeId_FK` (`AttributeId`),
KEY `Files_FileId_index` (`FileId`),
KEY `Files_FilePostId_index` (`FilePostId`),
KEY `Files_Guid_index` (`Guid`),
CONSTRAINT `Files_AttributeId_FK` FOREIGN KEY (`AttributeId`) REFERENCES `attributes` (`Id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `Files_ContainerId_FK` FOREIGN KEY (`ContainerId`) REFERENCES `containers` (`Id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `Files_FilePostsId_FK` FOREIGN KEY (`FilePostId`) REFERENCES `fileposts` (`Id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=977942 DEFAULT CHARSET=utf8;
CREATE TABLE `containers` (
`Id` int(11) NOT NULL AUTO_INCREMENT,
`Name` varchar(255) NOT NULL,
`Guid` binary(16) NOT NULL,
`AesKey` binary(32) NOT NULL,
`FileCount` int(10) unsigned NOT NULL DEFAULT '0',
`Size` bigint(20) unsigned NOT NULL,
PRIMARY KEY (`Id`),
KEY `Containers_Guid_index` (`Guid`),
KEY `Containers_Name_index` (`Name`)
) ENGINE=InnoDB AUTO_INCREMENT=76 DEFAULT CHARSET=utf8;
You will notice there are some other relationships in the files table, which I have left out just to simplify the query without affecting the observed behavior.
Here is also an output from EXPLAIN EXTENDED:
+----+-------------+---------+-------+----------------------+-----------------------+---------+----------------------------------+-------+----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+-------+----------------------+-----------------------+---------+----------------------------------+-------+----------+----------------------------------------------+
| 1 | SIMPLE | Extent2 | index | PRIMARY | Containers_Guid_index | 16 | NULL | 9 | 100.00 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | Extent1 | ref | Files_ContainerId_FK | Files_ContainerId_FK | 4 | netachmentgeneraltest.Extent2.Id | 73850 | 100.00 | |
+----+-------------+---------+-------+----------------------+-----------------------+---------+----------------------------------+-------+----------+----------------------------------------------+
The files table has ~900,000 records (and counting) and containers has 9.
This issue only occurs when ORDER BY is present.
Also, I can't do much in terms of modifying the query because it is generated by Entity Framework. I did as much as I could with the LINQ query in order to simplify it (at first it had some horrible sub queries which executed even slower).
Query hints (as in force index) are not a solution here either, because EF does not support such features.
I am mostly hoping to find some database level optimizations to do.
For those who didn't spot the tags, the database in question is MySql.
MySQL only uses one index per table. Right now, it's preferring to use the foreign key index so the join is efficient, but that means that the sort is not using an index.
Try creating a compound index on ContainerId, FileId.
This is essentially your query:
SELECT e1.*
FROM `files` e1 INNER JOIN
`containers` e2
ON e1.`ContainerId` = e2.`Id`
ORDER BY e1.`Id` ASC
LIMIT 0, 10;
You can try an index on files(id, ContainerId). This might inspire MySQL to use the composite index, focused on the order by.
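As a statement, that suggestion would be something like the following (a sketch; the index name is made up):
CREATE INDEX Files_Id_ContainerId ON files (Id, ContainerId);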
It would probably be more likely if the query were phrased as:
SELECT e1.*
FROM `files` e1
WHERE EXISTS (SELECT 1 FROM containers e2 WHERE e1.`ContainerId` = e2.`Id`)
ORDER BY e1.`Id` ASC
LIMIT 0, 10;
There is one way that does work to use the indexes. However, it depends on something in MySQL that is not documented to work (although it does in practice). The following will read the data in order, but it incurs the overhead of materializing the subquery -- but not for a sort:
SELECT e1.*
FROM (SELECT e1.*
FROM files e1
ORDER BY e1.id ASC
) e1
WHERE EXISTS (SELECT 1 FROM containers e2 WHERE e1.`ContainerId` = e2.`Id`)
LIMIT 0, 10;

MySQL Optimization Problem

I'm having problems with a query optimization. The following query takes more than 30 seconds to get the expected result.
SELECT tbl_history.buffet_q_rating, tbl_history.cod_stock, tbl_history.bqqq_change_month, stocks.ticker, countries.country, stocks.company
FROM tbl_history
INNER JOIN stocks ON tbl_history.cod_stock = stocks.cod_stock
INNER JOIN exchange ON stocks.cod_exchange = exchange.cod_exchange
INNER JOIN countries ON exchange.cod_country = countries.cod_country
WHERE exchange.cod_country =125
AND DATE = '2011-07-25'
AND bqqq_change_month IS NOT NULL
AND buffet_q_rating IS NOT NULL
ORDER BY bqqq_change_month DESC
LIMIT 10
The tables are:
CREATE TABLE IF NOT EXISTS `tbl_history` (
`cod_stock` int(11) NOT NULL DEFAULT '0',
`date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`price` decimal(11,3) DEFAULT NULL,
`buffet_q_rating` decimal(11,4) DEFAULT NULL,
`bqqq_change_day` decimal(11,2) DEFAULT NULL,
`bqqq_change_month` decimal(11,2) DEFAULT NULL,
(...)
PRIMARY KEY (`cod_stock`,`date`),
KEY `cod_stock` (`cod_stock`),
KEY `buf_rating` (`buffet_q_rating`),
KEY `data` (`date`),
KEY `bqqq_change_month` (`bqqq_change_month`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE IF NOT EXISTS `stocks` (
`cod_stock` int(11) NOT NULL AUTO_INCREMENT,
`cod_exchange` int(11) DEFAULT NULL,
PRIMARY KEY (`cod_stock`),
KEY `exchangestocks` (`cod_exchange`),
KEY `codstock` (`cod_stock`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=0 ;
CREATE TABLE IF NOT EXISTS `exchange` (
`cod_exchange` int(11) NOT NULL AUTO_INCREMENT,
`exchange` varchar(100) DEFAULT NULL,
`cod_country` int(11) DEFAULT NULL,
PRIMARY KEY (`cod_exchange`),
KEY `countriesexchange` (`cod_country`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=0 ;
CREATE TABLE IF NOT EXISTS `countries` (
`cod_country` int(11) NOT NULL AUTO_INCREMENT,
`country` varchar(100) DEFAULT NULL,
`initial_amount` double DEFAULT NULL,
PRIMARY KEY (`cod_country`),
KEY `codcountry` (`cod_country`),
KEY `country` (`country`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=0 ;
The first table has more than 20 million rows, the second has 40k, and the others have just a few rows (maybe 100).
The problem seems to be the ORDER BY, but I have no idea how to optimize it.
I already tried a few things I found on Google/Stack Overflow, but I was unable to get good results.
Can someone give me some advice?
EDIT:
Forgot the EXPLAIN result:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE countries const PRIMARY,codcountry PRIMARY 4 const 1 Using temporary; Using filesort
1 SIMPLE exchange ref PRIMARY,countriesexchange countriesexchange 5 const 15 Using where
1 SIMPLE stocks ref PRIMARY,exchangestocks,codstock exchangestocks 5 databaseName.exchange.cod_exchange 661 Using where
1 SIMPLE tbl_history eq_ref PRIMARY,cod_stock,buf_rating,data,bqqq_change_mont... PRIMARY 12 v.stocks.cod_stock,const 1 Using where
UPDATE
this is the new EXPLAIN I got:
id select_type table type possible_keys key key_len ref rows Extra |
1 SIMPLE tbl_history range monthstats monthstats 14 NULL 80053 Using where; Using index |
1 SIMPLE countries ref country country 4 const 1 Using index |
1 SIMPLE exchange ref PRIMARY,cod_country,countryexchange countryexchange 5 const 5 Using where; Using index |
1 SIMPLE stocks ref info4stats info4stats 9 databaseName.exchange.cod_exchange,databaseName.stock_... 1 Using where; Using index |
I would try to preemptively start with the country records for 125 and work in reverse. Using STRAIGHT_JOIN forces the join order of your query as entered...
I would also have an index on your Tbl_History table on COD_Stock and DATE(date), so the query can properly and efficiently match the join condition on the pre-qualified date portion of the date/time field.
SELECT STRAIGHT_JOIN
th.buffet_q_rating,
th.cod_stock,
th.bqqq_change_month,
stocks.ticker,
c.country,
s.company
FROM
Exchange e
join Countries c
on e.Cod_Country = c.Cod_Country
join Stocks s
on e.cod_exchange = s.cod_exchange
join tbl_history th
on s.cod_stock = th.cod_stock
AND th.`Date` = '2011-07-25'
AND th.bqqq_change_month IS NOT NULL
AND th.buffet_q_rating IS NOT NULL
WHERE
e.Cod_Country = 125
ORDER BY
th.bqqq_change_month DESC
LIMIT 10
If you want to limit the result, why do you do it after you join all the tables?
Try to reduce the size of those big tables first (LIMIT or WHERE them) before joining them with the other tables.
But you have to be sure that your original query and your modified query mean the same thing.
Update (Sample) :
select
tbl_user.user_id,
tbl_group.group_name
from
tbl_grp_user
inner join
(
select
tbl_user.user_id,
tbl_user.user_name
from
tbl_user
limit
5
) as tbl_user
on
tbl_user.user_id = tbl_grp_user.user_id
inner join
(
select
group_id,
group_name
from
tbl_group
where
tbl_group.group_id > 5
) as tbl_group
on
tbl_group.group_id = tbl_grp_user.group_id
Hopefully the query above gives you a hint.