Optimizing slow query in MySql InnoDB Database - mysql

I have a MySql database with a query that is running really slow. I'm trying the best I can to make it perform better but can't see what I'm doing wrong here. Maybe you can?
CREATE TABLE `tablea` (
`a` int(11) NOT NULL auto_increment,
`d` mediumint(9) default NULL,
`c` int(11) default NULL,
PRIMARY KEY (`a`),
KEY `d` USING BTREE (`d`)
) ENGINE=InnoDB AUTO_INCREMENT=1867710 DEFAULT CHARSET=utf8;
CREATE TABLE `tableb` (
`b` int(11) NOT NULL auto_increment,
`d` mediumint(9) default '1',
`c` int(10) NOT NULL,
`e` mediumint(9) default NULL,
PRIMARY KEY (`b`),
KEY `c` (`c`),
KEY `d` (`d`),
) ENGINE=InnoDB AUTO_INCREMENT=848150 DEFAULT CHARSET=utf8;
The query:
SELECT tablea.a, tableb.e
FROM tablea
INNER JOIN tableb ON (tablea.c=tableb.c) AND (tablea.d=tableb.d OR tableb.d=1)
WHERE tablea.d=100
This query takes like 10 seconds to run if tablea.d=100 gives 1500 rows and (tablea.d=tableb.d OR tableb.d=1) gives 1600 rows. This seems really slow. I need to make it much faster but I can't see what I'm doing wrong.
MySql EXPLAIN outputs:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE tablea ref d d 4 const 1092 Using where
1 SIMPLE tableb ref c,d c 4 tablea.c 1 Using where

If I am not confused by the OR, the query is equivalent to:
SELECT tablea.a, tableb.e
FROM tablea
INNER JOIN tableb
ON tablea.c = tableb.c
WHERE tablea.d = 100
AND tableb.d IN (1,100)
Try it (using EXPLAIN) with various indexes. An index on the d field (in both tables) would help. Perhaps more, an index on (d,c).

Related

Optimizing MySQL CREATE TABLE Query

I have two tables I am trying to join in a third query and it seems to be taking far too long.
Here is the syntax I am using
CREATE TABLE active_users
(PRIMARY KEY ix_all (platform_id, login_year, login_month, person_id))
SELECT platform_id
, YEAR(my_timestamp) AS login_year
, MONTH(my_timestamp) AS login_month
, person_id
, COUNT(*) AS logins
FROM
my_login_table
GROUP BY 1,2,3,4;
CREATE TABLE active_alerts
(PRIMARY KEY ix_all (platform_id, alert_year, alert_month, person_id))
SELECT platform_id
, YEAR(alert_datetime) AS alert_year
, MONTH(alert_datetime) AS alert_month
, person_id
, COUNT(*) AS alerts
FROM
my_alert_table
GROUP BY 1,2,3,4;
CREATE TABLE all_data
(PRIMARY KEY ix_all (platform_id, theYear, theMonth, person_id))
SELECT a.platform_id
, a.login_year AS theyear
, a.login_month AS themonth
, a.person_id
, IFNULL(a.logins,0) AS logins
, IFNULL(b.alerts,0) AS job_alerts
FROM
active_users a
LEFT OUTER JOIN
active_alerts b
ON a.platform_id = b.platform_id
AND a.login_year = b.alert_year
AND a.login_month = b.alert_month
AND a.person_id = b.person_id;
The first table (logins) returns about half a million rows and takes less than 1 minute, the second table (alerts) returns about 200k rows and takes less than 1 minute.
If I run just the SELECT part of the third statement it runs in a few seconds, however as soon as I run it with the CREATE TABLE syntax it takes more than 30 minutes.
I have tried different types of indexes than a primary key, such as UNIQUE or INDEX as well as no key at all, but that doesn't seem to make much difference.
Is there something I can do to speed up the creation / insertion of this table?
EDIT:
Here is the output of the show create table statements
CREATE TABLE `active_users` (
`platform_id` int(11) NOT NULL,
`login_year` int(4) DEFAULT NULL,
`login_month` int(2) DEFAULT NULL,
`person_id` varchar(40) NOT NULL,
`logins` bigint(21) NOT NULL DEFAULT '0',
KEY `ix_all` (`platform_id`,`login_year`,`login_month`,`person_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
CREATE TABLE `alerts` (
`platform_id` int(11) NOT NULL,
`alert_year` int(4) DEFAULT NULL,
`alert_month` int(2) DEFAULT NULL,
`person_id` char(36) CHARACTER SET ascii COLLATE ascii_bin NOT NULL,
`alerts` bigint(21) NOT NULL DEFAULT '0',
KEY `ix_all` (`platform_id`,`alert_year`,`alert_month`,`person_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
and the output of the EXPLAIN
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE a (null) ALL (null) (null) (null) (null) 503504 100 (null)
1 SIMPLE b (null) ALL ix_all (null) (null) (null) 220187 100 Using where; Using join buffer (Block Nested Loop)
It's a bit of a hack but I figured out how to get it to run much faster.
I added a primary key to the third table on platform, year, month, person
I inserted the intersect data using an inner join, then insert ignore the left table plus a zero for alerts in a separate statement.

Improving query performance with Join, Full Table Scan

I am trying to improve the query performance on a stats reporting website for a Battlefield game, and am having a little bit of trouble with a very specific query. The issue I am having is that EXPLAIN is stating this query is doing a full table scan. This is troublesome because I expect this table to get very large (potentially 1 million rows or more). I am using MySQL 5.7 as my database of choice.
Here is my table and Query: http://pastebin.com/DsiGe2UB
--
-- Table structure for table `player_kit`
--
CREATE TABLE `player_kit` (
`id` TINYINT UNSIGNED NOT NULL,
`pid` INT UNSIGNED NOT NULL,
`time` INT UNSIGNED NOT NULL DEFAULT 0,
`kills` MEDIUMINT UNSIGNED NOT NULL DEFAULT 0,
`deaths` MEDIUMINT UNSIGNED NOT NULL DEFAULT 0,
PRIMARY KEY(`pid`,`id`),
FOREIGN KEY(`pid`) REFERENCES player(`id`) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY(`id`) REFERENCES kit(`id`) ON DELETE RESTRICT ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
ALTER TABLE `bf2stats`.`player_kit` ADD INDEX `reverse_ids` (`id`, `pid`);
--
-- My Full Scanning Query
-- SELECTS players, ordering them by kills and time in kit
--
SELECT p.name, p.rank, p.country, k.pid, k.kills, k.deaths, k.time
FROM player_kit AS k
INNER JOIN player AS p ON k.pid = p.id
WHERE k.id = 0 AND k.kills > 0
ORDER BY kills DESC, time DESC
LIMIT 0, 40
--
-- EXPLAIN results by MySQL
--
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE k NULL ref PRIMARY 1 const 75 32.11 Using index condition; Using where; Using filesort
1 SIMPLE p NULL eq_ref PRIMARY PRIMARY 4 bf2stats.k.pid 1 100.00 NULL
--
-- Additional Tables just in case, for reference
--
--
-- Table structure for table `kit`
--
CREATE TABLE `kit` (
`id` TINYINT UNSIGNED,
`name` VARCHAR(32) NOT NULL,
PRIMARY KEY(`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
--
-- Table structure for table `player`
--
CREATE TABLE `player` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`name` VARCHAR(32) UNIQUE NOT NULL,
`rank` TINYINT NOT NULL DEFAULT 0,
`country` CHAR(2) NOT NULL DEFAULT 'xx',
PRIMARY KEY(`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Here is the explain from phpMyAdmin:
I am hoping that one of you can help me improve the performance of this query, since any kind of index I have put on it does not seem to help much.
For this query:
SELECT p.name, p.rank, p.country, k.pid, k.kills, k.deaths, k.time
FROM player_kit k INNER JOIN
player p
ON k.pid = p.id
WHERE k.id = 0 AND k.kills > 0
ORDER BY kills DESC, time DESC
LIMIT 0, 40;
The optimal indexes are:
player_kit(id, kills, pid)
player(id) -- if this is not already there
You can also add the other columns in the index to get a covering index for the query.

Slow MySql query with order by limit with index

I have a query generated by Entity Framework, that looks like this:
SELECT
`Extent1`.`Id`,
`Extent1`.`Name`,
`Extent1`.`ExpireAfterUTC`,
`Extent1`.`FileId`,
`Extent1`.`FileHash`,
`Extent1`.`PasswordHash`,
`Extent1`.`Size`,
`Extent1`.`TimeStamp`,
`Extent1`.`TimeStampOffset`
FROM `files` AS `Extent1` INNER JOIN `containers` AS `Extent2` ON `Extent1`.`ContainerId` = `Extent2`.`Id`
ORDER BY
`Extent1`.`Id` ASC LIMIT 0,10
It runs painfully slow.
I have indexes on files.Id (PK), files.ContainerId(FK), containers.Id(PK) and I don't understand why mysql seems to be doing a full sort before returning the required records, even though there already is an index on the Id column.
Further more, this data is displayed in a grid which supports filters, sorts and pagination and a good use of the indexes is highly required.
Here are the table definitions:
CREATE TABLE `files` (
`Id` int(11) NOT NULL AUTO_INCREMENT,
`FileId` varchar(100) NOT NULL,
`ContainerId` int(11) NOT NULL,
`ContainerGuid` binary(16) NOT NULL,
`Guid` binary(16) NOT NULL,
`Name` varchar(1000) NOT NULL,
`ExpireAfterUTC` datetime DEFAULT NULL,
`PasswordHash` binary(32) DEFAULT NULL,
`FileHash` tinyblob NOT NULL,
`Size` bigint(20) NOT NULL,
`TimeStamp` double NOT NULL,
`TimeStampOffset` double NOT NULL,
`FilePostId` int(11) NOT NULL,
`FilePostGuid` binary(16) NOT NULL,
`AttributeId` int(11) NOT NULL,
PRIMARY KEY (`Id`),
UNIQUE KEY `FileId_UNIQUE` (`FileId`),
KEY `Files_ContainerId_FK` (`ContainerId`),
KEY `Files_AttributeId_FK` (`AttributeId`),
KEY `Files_FileId_index` (`FileId`),
KEY `Files_FilePostId_index` (`FilePostId`),
KEY `Files_Guid_index` (`Guid`),
CONSTRAINT `Files_AttributeId_FK` FOREIGN KEY (`AttributeId`) REFERENCES `attributes` (`Id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `Files_ContainerId_FK` FOREIGN KEY (`ContainerId`) REFERENCES `containers` (`Id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `Files_FilePostsId_FK` FOREIGN KEY (`FilePostId`) REFERENCES `fileposts` (`Id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=977942 DEFAULT CHARSET=utf8;
CREATE TABLE `containers` (
`Id` int(11) NOT NULL AUTO_INCREMENT,
`Name` varchar(255) NOT NULL,
`Guid` binary(16) NOT NULL,
`AesKey` binary(32) NOT NULL,
`FileCount` int(10) unsigned NOT NULL DEFAULT '0',
`Size` bigint(20) unsigned NOT NULL,
PRIMARY KEY (`Id`),
KEY `Containers_Guid_index` (`Guid`),
KEY `Containers_Name_index` (`Name`)
) ENGINE=InnoDB AUTO_INCREMENT=76 DEFAULT CHARSET=utf8;
You will notice there are some other relationships in the files table, which I have left out just to simplify the query without affecting the observed behavior.
Here is also an output from EXPLAIN EXTENDED:
+----+-------------+---------+-------+----------------------+-----------------------+---------+----------------------------------+-------+----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+-------+----------------------+-----------------------+---------+----------------------------------+-------+----------+----------------------------------------------+
| 1 | SIMPLE | Extent2 | index | PRIMARY | Containers_Guid_index | 16 | NULL | 9 | 100.00 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | Extent1 | ref | Files_ContainerId_FK | Files_ContainerId_FK | 4 | netachmentgeneraltest.Extent2.Id | 73850 | 100.00 | |
+----+-------------+---------+-------+----------------------+-----------------------+---------+----------------------------------+-------+----------+----------------------------------------------+
Files table has ~900000 records (and counting) and containers has 9.
This issue only occurs when ORDER BY is present.
Also, I can't do much in terms of modifying the query because it is generated by Entity Framework. I did as much as I could with the LINQ query in order to simplify it (at first it had some horrible sub queries which executed even slower).
Query hints (as in force index) are not a solution here either, because EF does not support such features.
I am mostly hoping to find some database level optimizations to do.
For those who didn't spot the tags, the database in question is MySql.
MySQL only uses one index per table. Right now, it's preferring to use the foreign key index so the join is efficient, but that means that the sort is not using an index.
Try creating a compound index on ContainerId, filedID
This is essentially your query:
SELECT e1.*
FROM `files` e1 INNER JOIN
`containers` e2
ON e1.`ContainerId` = e2.`Id`
ORDER BY e1.`Id` ASC
LIMIT 0, 10;
You can try an index on files(id, ContainerId). This might inspire MySQL to use the composite index, focused on the order by.
It would probably be more likely if the query were phrased as:
SELECT e1.*
FROM `files` e1
WHERE EXISTS (SELECT 1 FROM containers e2 WHERE e1.`ContainerId` = e2.`Id`)
ORDER BY e1.`Id` ASC
LIMIT 0, 10;
There is one way that does work to use the indexes. However, it depends on something in MySQL that is not documented to work (although it does in practice). The following will read the data in order, but it incurs the overhead of materializing the subquery -- but not for a sort:
SELECT e1.*
FROM (SELECT e1.*
FROM files e1
ORDER BY e1.id ASC
) e1
WHERE EXISTS (SELECT 1 FROM containers e2 WHERE e1.`ContainerId` = e2.`Id`)
LIMIT 0, 10;

mysql select from table 1 join and order by table 2 , optimization filesort

read mysql post but couldnt find the solution.
Any pointers would be appreciated.
I have 2 tables A,B
Table A has folderid mapped to multiple feedid (1 to many )
Table B has feedid and data associated with it.
The following query gives Using index; Using temporary; Using filesort
SELECT A.feed_id,B.feed_id,B.entry_id
FROM feed_folder A
LEFT JOIN feed_data B on A.feed_id=B.feed_id
WHERE A.folder_id=29
AND B.entry_id <= 123
AND B.entry_created <= '2012-11-01 21:38:54'
ORDER by B.entry_created desc limit 0,20;
Any ideas as to how the temporary filesort can be avoided.
Following is the table structure
CREATE TABLE `feed_folder` (
`folder_id` bigint(20) unsigned NOT NULL,
`feed_id` bigint(20) unsigned NOT NULL,
UNIQUE KEY `folder_id` (`folder_id`,`feed_id`),
KEY `feed_id` (`feed_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
CREATE TABLE `feed_data` (
`entry_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`feed_id` bigint(20) unsigned NOT NULL,
`entry_created` datetime NOT NULL,
`entry_object` text,
`entry_permalink` varchar(150) NOT NULL,
`entry_orig_created` datetime NOT NULL,
PRIMARY KEY (`entry_id`),
UNIQUE KEY `feed_id_2` (`feed_id`,`entry_permalink`),
KEY `entry_created` (`entry_created`),
KEY `feed_id` (`feed_id`,`entry_created`),
KEY `feed_id_entry_id` (`feed_id`,`entry_id`),
KEY `entry_permalink` (`entry_permalink`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
Try this query:
SELECT * FROM(SELECT
A.feed_id,B.feed_id,B.entry_id
FROM feed_folder as A
INNER JOIN feed_data as B on A.feed_id=B.feed_id
WHERE A.folder_id=29
AND B.entry_id <= 123
AND B.entry_created <= '2012-11-01 21:38:54')joinAndSortTable
ORDER by B.entry_created desc limit 0,20;
The problem is in alias, you are forgot to give as A and as B
i'm also remind you about LEFT JOIN and RIGHT JOIN, INNER JOIN...
To join all records what you selected, Using INNER JOIN is a workaround.
Because INNER JOIN can join all columns with full accessed.
GOOD LUCK

MySQL Optimization Problem

I'm having problems with a query optimization. The following query takes more than 30 seconds to get the expected result.
SELECT tbl_history.buffet_q_rating, tbl_history.cod_stock, tbl_history.bqqq_change_month, stocks.ticker, countries.country, stocks.company
FROM tbl_history
INNER JOIN stocks ON tbl_history.cod_stock = stocks.cod_stock
INNER JOIN exchange ON stocks.cod_exchange = exchange.cod_exchange
INNER JOIN countries ON exchange.cod_country = countries.cod_country
WHERE exchange.cod_country =125
AND DATE = '2011-07-25'
AND bqqq_change_month IS NOT NULL
AND buffet_q_rating IS NOT NULL
ORDER BY bqqq_change_month DESC
LIMIT 10
The tables are:
CREATE TABLE IF NOT EXISTS `tbl_history` (
`cod_stock` int(11) NOT NULL DEFAULT '0',
`date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`price` decimal(11,3) DEFAULT NULL,
`buffet_q_rating` decimal(11,4) DEFAULT NULL,
`bqqq_change_day` decimal(11,2) DEFAULT NULL,
`bqqq_change_month` decimal(11,2) DEFAULT NULL,
(...)
PRIMARY KEY (`cod_stock`,`date`),
KEY `cod_stock` (`cod_stock`),
KEY `buf_rating` (`buffet_q_rating`),
KEY `data` (`date`),
KEY `bqqq_change_month` (`bqqq_change_month`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE IF NOT EXISTS `stocks` (
`cod_stock` int(11) NOT NULL AUTO_INCREMENT,
`cod_exchange` int(11) DEFAULT NULL,
PRIMARY KEY (`cod_stock`),
KEY `exchangestocks` (`cod_exchange`),
KEY `codstock` (`cod_stock`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=0 ;
CREATE TABLE IF NOT EXISTS `exchange` (
`cod_exchange` int(11) NOT NULL AUTO_INCREMENT,
`exchange` varchar(100) DEFAULT NULL,
`cod_country` int(11) DEFAULT NULL,
PRIMARY KEY (`cod_exchange`),
KEY `countriesexchange` (`cod_country`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=0 ;
CREATE TABLE IF NOT EXISTS `countries` (
`cod_country` int(11) NOT NULL AUTO_INCREMENT,
`country` varchar(100) DEFAULT NULL,
`initial_amount` double DEFAULT NULL,
PRIMARY KEY (`cod_country`),
KEY `codcountry` (`cod_country`),
KEY `country` (`country`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=0 ;
The first table have more than 20 million rows, the second have 40k and the others have just a few rows (maybe 100).
Them problem seems to be the "order by" but I have no idea how to optimize it.
I already tried some things searching on google/stackoverflow but I was unable to get good results
Can someone give me some advice?
EDIT:
Forgot the EXPLAIN result:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE countries const PRIMARY,codcountry PRIMARY 4 const 1 Using temporary; Using filesort
1 SIMPLE exchange ref PRIMARY,countriesexchange countriesexchange 5 const 15 Using where
1 SIMPLE stocks ref PRIMARY,exchangestocks,codstock exchangestocks 5 databaseName.exchange.cod_exchange 661 Using where
1 SIMPLE tbl_history eq_ref PRIMARY,cod_stock,buf_rating,data,bqqq_change_mont... PRIMARY 12 v.stocks.cod_stock,const 1 Using where
UPDATE
this is the new EXPLAIN I got:
id select_type table type possible_keys key key_len ref rows Extra |
1 SIMPLE tbl_history range monthstats monthstats 14 NULL 80053 Using where; Using index |
1 SIMPLE countries ref country country 4 const 1 Using index |
1 SIMPLE exchange ref PRIMARY,cod_country,countryexchange countryexchange 5 cons‌​t 5 Using where; Using index |
1 SIMPLE stocks ref info4stats info4stats 9 databaseName.exchange.cod_exchange,d‌​atabaseName.stock_... 1 Using where; Using index |
I would try to preemptively start with the Country records for 125 and work in reverse. By using a Straight_join will force the order of your query as entered...
I would also have an index on your Tbl_History table by the COD_Stock and DATE( date ). So the query will properly and efficiently match the join condition on the pre-qualified date portion of the date/time field.
SELECT STRAIGHT_JOIN
th.buffet_q_rating,
th.cod_stock,
th.bqqq_change_month,
stocks.ticker,
c.country,
s.company
FROM
Exchange e
join Countries c
on e.Cod_Country = c.Cod_Country
join Stocks s
on e.cod_exchange = s.cod_exchange
join tbl_history th
on s.cod_stock = th.cod_stock
AND th.`Date` = '2011-07-25'
AND th.bqqq_change_month IS NOT NULL
AND th.buffet_q_rating IS NOT NULL
WHERE
e.Cod_Country = 125
ORDER BY
th.bqqq_change_month DESC
LIMIT 10
If you want to limit the result, why do you do it after you join all the table?
Try to reduce the size of those big tables first (LIMIT or WHERE them) before joining them with other tables.
But you have to be sure that your original query and your modified query means the same.
Update (Sample) :
select
tbl_user.user_id,
tbl_group.group_name
from
tbl_grp_user
inner join
(
select
tbl_user.user_id,
tbl_user.user_name
from
tbl_user
limit
5
) as tbl_user
on
tbl_user.user_id = tbl_grp_user.user_id
inner join
(
select
group_id,
group_name
from
tbl_group
where
tbl_group.group_id > 5
) as tbl_group
on
tbl_group.group_id = tbl_grp_user.group_id
Hopefully, query above will give you a hint