Query with LEFT JOIN on large table really slow - mysql

The following query takes approximately 12 seconds to execute. I have tried to optimize it, without success. The table being joined is pretty large (more than 8,000,000 records).
SELECT
p0_.id AS id_0,
p0_.ean AS ean_1,
p0_.brand AS brand_2,
p0_.type AS type_3,
p0_.retail_price AS retail_price_4,
p0_.target_price AS target_price_5,
min(NULLIF(c1_.delivery_price, 0)) AS sclr_6,
COALESCE(((p0_.target_price - min(NULLIF(c1_.delivery_price, 0))) / p0_.target_price * -100), 0) AS sclr_7
FROM product p0_
LEFT JOIN crawl c1_ ON (
c1_.product_ean = p0_.ean AND (
c1_.crawl_date = p0_.last_crawl_date OR
p0_.last_crawl_date IS NULL
)
AND c1_.source_id IN (
SELECT o2_.source_id AS sclr_8
FROM organisation_source o2_
WHERE o2_.organisation_id = 5
)
)
WHERE p0_.organisation_id = 5 GROUP BY p0_.ean
I have already tried writing the query in a lot of different ways, but unfortunately none of them gave me any performance win. Removing the subquery in the last AND does not help either.
See below the output of the EXPLAIN statement:
+------+--------------+-------+------+---------------------------------------------------+------------------+---------+------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------+-------+------+---------------------------------------------------+------------------+---------+------------------------+--------+-------------+
| 1 | PRIMARY | p0_ | ref | uniqueConstraint,IDX_D34A04AD9E6B1585 | uniqueConstraint | 5 | const | 69 | Using where |
| 1 | PRIMARY | c1_ | ref | IDX_product_ean,IDX_crawl_date | IDX_product_ean | 62 | admin_pricev-p.p0_.ean | 468459 | Using where |
| 2 | MATERIALIZED | o2_ | ref | PRIMARY,IDX_DD91A56E9E6B1585,IDX_DD91A56E953C1C61 | PRIMARY | 4 | const | 1 | Using index |
+------+--------------+-------+------+---------------------------------------------------+------------------+---------+------------------------+--------+-------------+
See below the CREATE TABLE statements of the product and crawl tables:
CREATE TABLE `product` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`organisation_id` int(11) DEFAULT NULL,
`ean` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
`brand` varchar(50) COLLATE utf8_unicode_ci NOT NULL,
`type` varchar(50) COLLATE utf8_unicode_ci NOT NULL,
`retail_price` decimal(10,2) NOT NULL,
`target_price` decimal(10,2) NOT NULL,
`last_crawl_date` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `uniqueConstraint` (`organisation_id`,`ean`),
KEY `IDX_D34A04AD9E6B1585` (`organisation_id`),
KEY `IDX_target_price` (`target_price`),
KEY `IDX_ean` (`ean`),
KEY `IDX_type` (`type`),
KEY `IDX_last_crawl_date` (`last_crawl_date`),
CONSTRAINT `FK_D34A04AD9E6B1585` FOREIGN KEY (`organisation_id`) REFERENCES `organisation` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=927 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
CREATE TABLE `crawl` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`source_id` int(11) DEFAULT NULL,
`store_id` int(11) DEFAULT NULL,
`product_ean` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
`crawl_date` datetime NOT NULL,
`takeaway_price` decimal(10,2) DEFAULT NULL,
`delivery_price` decimal(10,2) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `IDX_B4E9F1C2953C1C61` (`source_id`),
KEY `IDX_B4E9F1C2B092A811` (`store_id`),
KEY `IDX_product_ean` (`product_ean`),
KEY `IDX_takeaway_price` (`takeaway_price`),
KEY `IDX_crawl_date` (`crawl_date`),
CONSTRAINT `FK_B4E9F1C2953C1C61` FOREIGN KEY (`source_id`) REFERENCES `source` (`id`),
CONSTRAINT `FK_B4E9F1C2B092A811` FOREIGN KEY (`store_id`) REFERENCES `store` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=8606874 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
Does anyone have any idea how to improve the performance of this query? Many thanks! If more information is needed, please let me know!

You can probably simplify the query to:
SELECT . . .
FROM product p0_ LEFT JOIN
crawl c1_
ON c1_.product_ean = p0_.ean AND
c1_.crawl_date = p0_.last_crawl_date AND
EXISTS (SELECT 1
FROM organisation_source o2_
WHERE o2_.organisation_id = 5 AND c1_.source_id = o2_.source_id
)
WHERE p0_.organisation_id = 5
GROUP BY p0_.ean;
The p0_.last_crawl_date IS NULL is presumably unnecessary. A LEFT JOIN will keep all rows in the first table even when there is a NULL in a comparison. Your logic matches all rows in the second table (that meet the other conditions). That may be what you want, but I am guessing not.
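To make the effect of the IS NULL branch concrete, here is a minimal sketch on throwaway data. The tables demo_product and demo_crawl and their rows are hypothetical and only mimic the relevant columns of product and crawl.

```sql
-- Hypothetical scratch tables, not part of the original schema.
CREATE TEMPORARY TABLE demo_product (
  ean VARCHAR(20) NOT NULL,
  last_crawl_date DATETIME NULL
);
CREATE TEMPORARY TABLE demo_crawl (
  product_ean VARCHAR(20) NOT NULL,
  crawl_date DATETIME NOT NULL,
  delivery_price DECIMAL(10,2) NULL
);

-- One product that has never been marked as crawled, with two crawl rows.
INSERT INTO demo_product VALUES ('A', NULL);
INSERT INTO demo_crawl VALUES
  ('A', '2016-01-01 00:00:00', 10.00),
  ('A', '2016-02-01 00:00:00', 12.00);

-- Original condition: the IS NULL branch lets every crawl row of 'A' match.
SELECT p.ean, COUNT(c.product_ean) AS matched_rows
FROM demo_product p
LEFT JOIN demo_crawl c
  ON c.product_ean = p.ean
 AND (c.crawl_date = p.last_crawl_date OR p.last_crawl_date IS NULL)
GROUP BY p.ean;   -- matched_rows = 2

-- Simplified condition: the product row is still returned, with no crawl rows.
SELECT p.ean, COUNT(c.product_ean) AS matched_rows
FROM demo_product p
LEFT JOIN demo_crawl c
  ON c.product_ean = p.ean
 AND c.crawl_date = p.last_crawl_date
GROUP BY p.ean;   -- matched_rows = 0
```

If products with a NULL last_crawl_date really should aggregate over their whole crawl history, keep the original condition; otherwise the simplified form reads far fewer rows per product.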
In MySQL, exists is sometimes faster than in, which is why I've rewritten that portion.
For this query, you can speed it up using indexes: product(organisation_id, ean, last_crawl_date), crawl(product_ean, crawl_date, source_id) and organisation_source(source_id, organisation_id).
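As a sketch, the three indexes could be created as follows. The index names are made up for illustration, and the definition of organisation_source is not shown in the question, so its column names are taken from the query.

```sql
-- Index names below are hypothetical; pick names that fit your conventions.

-- Filter by organisation, then supply ean and last_crawl_date for the join.
ALTER TABLE product
  ADD INDEX idx_product_org_ean_crawl (organisation_id, ean, last_crawl_date);

-- Lets the join locate crawl rows by ean and date, with source_id available
-- in the index for the EXISTS check.
ALTER TABLE crawl
  ADD INDEX idx_crawl_ean_date_source (product_ean, crawl_date, source_id);

-- Supports the correlated EXISTS lookup on (source_id, organisation_id).
ALTER TABLE organisation_source
  ADD INDEX idx_orgsource_source_org (source_id, organisation_id);
```

After adding them, run EXPLAIN on the query again and check that the rows estimate for c1_ has dropped well below the 468459 reported in the question.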

Try with composite indexes on your LEFT JOINs
CREATE TABLE `product` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`organisation_id` int(11) DEFAULT NULL,
`ean` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
`brand` varchar(50) COLLATE utf8_unicode_ci NOT NULL,
`type` varchar(50) COLLATE utf8_unicode_ci NOT NULL,
`retail_price` decimal(10,2) NOT NULL,
`target_price` decimal(10,2) NOT NULL,
`last_crawl_date` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `uniqueConstraint` (`organisation_id`,`ean`),
KEY `IDX_D34A04AD9E6B1585` (`organisation_id`),
KEY `IDX_target_price` (`target_price`),
KEY `IDX_ean` (`ean`),
KEY `IDX_type` (`type`),
KEY `IDX_last_crawl_date` (`last_crawl_date`),
INDEX `IDX_testing1` (`ean`,`last_crawl_date`),
CONSTRAINT `FK_D34A04AD9E6B1585` FOREIGN KEY (`organisation_id`) REFERENCES `organisation` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=927 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
CREATE TABLE `crawl` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`source_id` int(11) DEFAULT NULL,
`store_id` int(11) DEFAULT NULL,
`product_ean` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
`crawl_date` datetime NOT NULL,
`takeaway_price` decimal(10,2) DEFAULT NULL,
`delivery_price` decimal(10,2) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `IDX_B4E9F1C2953C1C61` (`source_id`),
KEY `IDX_B4E9F1C2B092A811` (`store_id`),
KEY `IDX_product_ean` (`product_ean`),
KEY `IDX_takeaway_price` (`takeaway_price`),
KEY `IDX_crawl_date` (`crawl_date`),
INDEX `IDX_testing2` ( `source_id`,`product_ean`,`crawl_date`),
CONSTRAINT `FK_B4E9F1C2953C1C61` FOREIGN KEY (`source_id`) REFERENCES `source` (`id`),
CONSTRAINT `FK_B4E9F1C2B092A811` FOREIGN KEY (`store_id`) REFERENCES `store` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=8606874 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

Related

Unable to optimize queries - Bad design or Bad query or Both?

CREATE TABLE `questions` (
`question_id` int(5) unsigned NOT NULL AUTO_INCREMENT,
`FK_quiz_id` int(5) unsigned NOT NULL,
`question_identify_text` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`question_id`),
KEY `FK_quiz_id` (`FK_quiz_id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
CREATE TABLE `question_multilingual` (
`question_multilingual_id` int(5) unsigned NOT NULL AUTO_INCREMENT,
`FK_question_id` int(5) unsigned NOT NULL,
`FK_language_id` smallint(3) unsigned NOT NULL,
`question` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`question_multilingual_id`),
UNIQUE KEY `FK_question_id` (`FK_question_id`,`FK_language_id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
CREATE TABLE `choices` (
`choice_id` int(5) unsigned NOT NULL AUTO_INCREMENT,
`FK_question_id` int(5) unsigned NOT NULL,
`choice_identify_text` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`choice_id`),
KEY `FK_question_id` (`FK_question_id`)
) ENGINE=MyISAM AUTO_INCREMENT=6 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
CREATE TABLE `choice_multilingual` (
`choice_multilingual_id` int(5) unsigned NOT NULL AUTO_INCREMENT,
`FK_choice_id` int(5) unsigned NOT NULL,
`FK_language_id` smallint(3) unsigned NOT NULL,
`choice` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`choice_multilingual_id`),
UNIQUE KEY `FK_choice_id` (`FK_choice_id`,`FK_language_id`)
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
These are the table structures.
EXPLAIN SELECT * FROM `questions` inner join question_multilingual on question_multilingual.FK_question_id = questions.question_id AND question_multilingual.FK_language_id = 1 INNER JOIN choices ON choices.FK_question_id = questions.question_id INNER JOIN choice_multilingual ON choice_multilingual.FK_choice_id = choices.choice_id AND choice_multilingual.FK_language_id = 1 WHERE FK_quiz_id = 1
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
| 1 | SIMPLE | choice_multilingual | NULL | ALL | FK_choice_id | NULL | NULL | NULL | 14 | 10.00 | Using where |
| 1 | SIMPLE | question_multilingual | NULL | ref | FK_question_id,FK_language_id | FK_language_id | 2 | const | 2 | 100.00 | NULL |
| 1 | SIMPLE | choices | NULL | eq_ref | PRIMARY,FK_question_id | PRIMARY | 4 | quizzes.choice_multilingual.FK_choice_id | 1 | 14.29 | Using where |
| 1 | SIMPLE | questions | NULL | eq_ref | PRIMARY,FK_quiz_id | PRIMARY | 4 | quizzes.question_multilingual.FK_question_id | 1 | 100.00 | Using where |
I tried changing composite indexes and single-column indexes, and I tried adding columns.
The results vary, but I am not getting an optimized plan whatever I do.

How do I improve performance on a DISTINCT select across three joined tables?

I have the following tables in question:
Personas
ImpressionsPersonas [join table - Personas ManyToMany Impressions]
Impressions
My query looks like this, the EXPLAIN results are attached below:
SELECT
DISTINCT (Personas.id),
Personas.parent_id,
Personas.persona,
Personas.subpersonas_count,
Personas.is_subpersona,
Personas.impressions_count,
Personas.created,
Personas.modified
FROM personas as Personas
INNER JOIN
impressions_personas ImpressionsPersonas ON (
Personas.id = ImpressionsPersonas.persona_id
)
inner JOIN impressions Impressions ON (Impressions.id = ImpressionsPersonas.impression_id AND Impressions.timestamp >= "2016-06-01 00:00:00" AND Impressions.timestamp <= "2016-07-31 00:00:00")
EXPLAIN
+----+-------------+---------------------+--------+-----------------------------------------------------------------------+-------------+---------+---------------------------------------------+------+----------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------------------+--------+-----------------------------------------------------------------------+-------------+---------+---------------------------------------------+------+----------+-----------------------+
| 1 | SIMPLE | Personas | ALL | PRIMARY | NULL | NULL | NULL | 159 | 100.00 | Using temporary |
| 1 | SIMPLE | ImpressionsPersonas | ref | impression_idx,persona_idx,comp_imp_persona,comp_imp_pri,comp_per_pri | persona_idx | 8 | gen1_d2go.Personas.id | 396 | 100.00 | Distinct |
| 1 | SIMPLE | Impressions | eq_ref | PRIMARY,timestamp,timestamp_id | PRIMARY | 8 | gen1_d2go.ImpressionsPersonas.impression_id | 1 | 100.00 | Using where; Distinct |
+----+-------------+---------------------+--------+-----------------------------------------------------------------------+-------------+---------+---------------------------------------------+------+----------+-----------------------+
3 rows in set, 1 warning (0.00 sec)
CREATE STATEMENT FOR PERSONAS
CREATE TABLE `personas` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`parent_id` bigint(20) unsigned DEFAULT NULL,
`persona` varchar(150) NOT NULL,
`subpersonas_count` int(10) unsigned DEFAULT '0',
`is_subpersona` tinyint(1) unsigned DEFAULT '0',
`impressions_count` bigint(20) unsigned DEFAULT '0',
`created` datetime DEFAULT NULL,
`modified` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `lookup` (`parent_id`,`persona`),
KEY `parent_index` (`parent_id`),
KEY `persona` (`persona`),
KEY `persona_a_id` (`id`,`persona`),
CONSTRAINT `self_referential_join_to_self` FOREIGN KEY (`parent_id`) REFERENCES `personas` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=1049 DEFAULT CHARSET=utf8;
CREATE STATEMENT FOR IMPRESSIONS_PERSONAS
CREATE TABLE `impressions_personas` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`impression_id` bigint(20) unsigned NOT NULL,
`persona_id` bigint(20) unsigned NOT NULL,
`created` datetime DEFAULT NULL,
`modified` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `impression_idx` (`impression_id`),
KEY `persona_idx` (`persona_id`),
KEY `comp_imp_persona` (`impression_id`,`persona_id`),
KEY `comp_imp_pri` (`impression_id`,`id`),
KEY `comp_per_pri` (`persona_id`,`id`),
CONSTRAINT `impression` FOREIGN KEY (`impression_id`) REFERENCES `impressions` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION,
CONSTRAINT `persona` FOREIGN KEY (`persona_id`) REFERENCES `personas` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=19387839 DEFAULT CHARSET=utf8;
CREATE STATEMENT FOR IMPRESSIONS
CREATE TABLE `impressions` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`device_id` bigint(20) unsigned NOT NULL,
`beacon_id` bigint(20) unsigned NOT NULL,
`zone_id` bigint(20) unsigned NOT NULL,
`application_id` bigint(20) unsigned DEFAULT NULL,
`timestamp` datetime NOT NULL,
`google_place_id` bigint(20) unsigned DEFAULT NULL,
`name` varchar(60) DEFAULT NULL,
`lat` decimal(15,10) DEFAULT NULL,
`lng` decimal(15,10) DEFAULT NULL,
`personas_count` int(10) unsigned DEFAULT '0',
`created` datetime DEFAULT NULL,
`modified` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `device_idx` (`device_id`),
KEY `zone_idx` (`zone_id`),
KEY `beacon_id_idx2` (`beacon_id`),
KEY `timestamp` (`timestamp`),
KEY `appid_fk_idx_idx` (`application_id`),
KEY `comp_lookup` (`device_id`,`beacon_id`,`timestamp`),
KEY `timestamp_id` (`timestamp`,`id`),
CONSTRAINT `appid_fk_idx` FOREIGN KEY (`application_id`) REFERENCES `applications` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `beacon_id` FOREIGN KEY (`beacon_id`) REFERENCES `beacons` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION,
CONSTRAINT `device2` FOREIGN KEY (`device_id`) REFERENCES `devices` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `zone_FK` FOREIGN KEY (`zone_id`) REFERENCES `zones` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=1582724 DEFAULT CHARSET=utf8;
Now - when I run the query without the DISTINCT and using a COUNT(*), it pulls about 17,000,000 records. Running it with DISTINCT yields 112 records. I am not sure why there are so many records showing up when the explain showed only 159 and 396.
Some information about the tables:
The Personas table contains 159 records. The ImpressionsPersonas table contains about 12.6 million, and Impressions contains about 920,000 records.
What we are doing is selecting the Personas table and joining to the Impressions by way of the join table ImpressionsPersonas. There are filters applied to the Impressions table (date in this case).
Note: removing the date filter had a negligible impact on the execution time - which hovers right around 120s. Is there a way to filter these records down to cut down the execution time of this query?
I presume that you want to get the list of personas that have at least one impression within a specified time period. To get this, you can use a correlated subquery like this:
SELECT
Personas.id,
Personas.parent_id,
Personas.persona,
Personas.subpersonas_count,
Personas.is_subpersona,
Personas.impressions_count,
Personas.created,
Personas.modified
FROM personas as Personas
WHERE EXISTS(SELECT 1 FROM impressions_personas ImpressionsPersonas
INNER JOIN impressions Impressions ON
Impressions.id = ImpressionsPersonas.impression_id
WHERE Personas.id = ImpressionsPersonas.persona_id
AND Impressions.timestamp >= "2016-06-01 00:00:00"
AND Impressions.timestamp <= "2016-07-31 00:00:00"
)
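Make sure there is an index on the timestamp column of the impressions table (the posted definition already has one, named timestamp) and see whether the query improves. If it does not, try forcing that index in the query with an index hint.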
UPDATE
Use INDEX in the JOIN
SELECT
DISTINCT (Personas.id),
Personas.parent_id,
Personas.persona,
Personas.subpersonas_count,
Personas.is_subpersona,
Personas.impressions_count,
Personas.created,
Personas.modified
FROM
personas as Personas
INNER JOIN
impressions_personas ImpressionsPersonas ON (
Personas.id = ImpressionsPersonas.persona_id
)
INNER JOIN
impressions Impressions FORCE INDEX (`timestamp`) ON
(Impressions.id = ImpressionsPersonas.impression_id AND
Impressions.timestamp >= "2016-06-01 00:00:00" AND
Impressions.timestamp <= "2016-07-31 11:59:59")
At the moment you're first joining three tables with millions of rows and then using DISTINCT to get only a few rows out of it. A better way is to first get only the required IDs and then use those to select the actual result data.
For example:
SELECT column, other FROM personas
WHERE id IN
(SELECT DISTINCT ImpressionsPersonas.persona_id
FROM impressions_personas ImpressionsPersonas
INNER JOIN impressions Impressions
ON Impressions.id = ImpressionsPersonas.impression_id
AND Impressions.timestamp >= "2016-06-01 00:00:00"
AND Impressions.timestamp <= "2016-07-31 00:00:00")
This way the engine will only handle a single column for the whole procedure until getting the results.

slow mysql query with inner join 0846 sec

I have another problem when I use a query with an INNER JOIN.
This is the query:
SELECT *
FROM `engine4_product_file` INNER JOIN
`engine4_file`
ON engine4_product_file.fid = engine4_file.id
WHERE engine4_product_file.pid IN (3347,3346,3345,3343,3342,3337) and
engine4_file.active = 1 AND
engine4_file.ext IN ('jpg','gif','png','jpeg')
This is the CREATE TABLE statement for engine4_product_file:
CREATE TABLE `engine4_product_file` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`fid` int(11) NOT NULL,
`pid` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `engine4_product_file` (`fid`),
KEY `pid` (`pid`)
) ENGINE=InnoDB AUTO_INCREMENT=6549 DEFAULT CHARSET=latin1
This is the CREATE TABLE statement for engine4_file:
CREATE TABLE `engine4_file` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`uid` int(11) NOT NULL,
`name` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`url` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`active` int(11) NOT NULL DEFAULT '1',
`size` int(11) DEFAULT NULL,
`ext` varchar(10) DEFAULT NULL,
`folder` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=48801 DEFAULT CHARSET=latin1
This is the EXPLAIN output:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
| 1 | SIMPLE | engine4_product_file | range | engine4_product_file,pid | pid | 4 | NULL | 30 | Using where |
| 1 | SIMPLE | engine4_file | eq_ref | PRIMARY | PRIMARY | 4 | akafine_social2.engine4_product_file.fid | 1 | Using where |
Change your WHERE conditions
WHERE engine4_file.active = 1 AND
engine4_file.ext IN ('jpg','gif','png','jpeg') AND
engine4_product_file.pid IN (3347,3346,3345,3343,3342,3337)
Add an index
ALTER TABLE engine4_file ADD KEY (active,ext)

How to improve performance of query with two inner joins?

I have 3 tables :
CREATE TABLE `t_event` (
`id` int(10) NOT NULL auto_increment,
`title` varchar(100) NOT NULL,
`kind` int(10) NOT NULL,
`type` int(10) NOT NULL,
`short_desc` varchar(500) default NULL,
`long_desc` varchar(1500) default NULL,
`location` int(10) NOT NULL,
`price` decimal(11,0) NOT NULL,
`currency` int(11) NOT NULL default '1',
`remark_price` varchar(250) default NULL,
`remark_prerequisite` varchar(250) default NULL,
`date_start` date NOT NULL,
`date_end` date default NULL,
`date_remark` varchar(300) default NULL,
`time_start` time default NULL,
`time_end` time default NULL,
`remark_time` varchar(50) default NULL,
`leader` int(50) NOT NULL,
`leader2` int(100) NOT NULL,
`eve_contact_name` varchar(50) default NULL,
`eve_contact_phone` varchar(50) default NULL,
`eve_contact_email` varchar(50) default NULL,
`eve_contact_url` varchar(150) default NULL,
`eve_image_path` varchar(250) default NULL,
`provider` int(10) default NULL,
`timestamp` datetime NOT NULL,
`last_change` datetime NOT NULL default '0000-00-00 00:00:00',
`quality` int(10) default NULL,
`min_number` int(10) NOT NULL,
`max_number` int(10) NOT NULL,
`active_for_reservation` tinyint(1) NOT NULL,
`cancellation_day1` int(10) NOT NULL,
`cancellation_day2` int(10) NOT NULL,
`cancellation_fee1` varchar(255) NOT NULL,
`cancellation_fee2` varchar(255) NOT NULL,
PRIMARY KEY (`id`),
KEY `FK_t_event_t_event_kind` (`kind`),
KEY `FK_t_event_t_event_type` (`type`),
KEY `FK_t_event_t_location` (`location`),
KEY `FK_t_event_t_currency` (`currency`),
KEY `FK_t_event_t_leader` (`leader`),
KEY `FK_t_event_t_provider` (`provider`),
KEY `FK_t_event_t_quality` (`quality`),
CONSTRAINT `FK_t_event_t_currency` FOREIGN KEY (`currency`) REFERENCES `t_currency` (`id`),
CONSTRAINT `FK_t_event_t_event_kind` FOREIGN KEY (`kind`) REFERENCES `t_event_kind` (`id`),
CONSTRAINT `FK_t_event_t_event_type` FOREIGN KEY (`type`) REFERENCES `t_event_type` (`id`),
CONSTRAINT `FK_t_event_t_leader` FOREIGN KEY (`leader`) REFERENCES `t_leader` (`id`),
CONSTRAINT `FK_t_event_t_location` FOREIGN KEY (`location`) REFERENCES `t_location` (`id`),
CONSTRAINT `FK_t_event_t_provider` FOREIGN KEY (`provider`) REFERENCES `t_provider` (`id`),
CONSTRAINT `FK_t_event_t_quality` FOREIGN KEY (`quality`) REFERENCES `t_quality` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=8432 DEFAULT CHARSET=latin1
CREATE TABLE `t_location` (
`id` int(10) NOT NULL auto_increment,
`loc_name` varchar(50) NOT NULL,
`loc_detail` varchar(50) default NULL,
`loc_adress1` varchar(50) NOT NULL,
`loc_adress2` varchar(50) default NULL,
`loc_country` int(50) NOT NULL default '1',
`loc_zip` varchar(50) NOT NULL,
`loc_loc` varchar(50) NOT NULL,
`loc_shortdesc` varchar(250) default NULL,
`loc_contact_name` varchar(250) default NULL,
`loc_contact_gender` int(10) default NULL,
`loc_contact_phone` varchar(250) default NULL,
`loc_contact_email` varchar(250) default NULL,
`loc_contact_url` varchar(250) default NULL,
`loc_image_path` varchar(250) default NULL,
`latitude` varchar(100) default NULL,
`longitude` varchar(100) default NULL,
`created` datetime NOT NULL,
`last_change` datetime NOT NULL default '0000-00-00 00:00:00',
`provider` int(10) NOT NULL default '1',
PRIMARY KEY (`id`),
UNIQUE KEY `id` USING BTREE (`id`),
KEY `FK_t_location_t_country` (`loc_country`),
KEY `FK_t_location_t_gender` (`loc_contact_gender`),
KEY `FK_t_location_t_provider` (`provider`),
CONSTRAINT `FK_t_location_t_country` FOREIGN KEY (`loc_country`) REFERENCES `t_country`(`id`),
CONSTRAINT `FK_t_location_t_provider` FOREIGN KEY (`provider`) REFERENCES `t_provider` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1287 DEFAULT CHARSET=latin1
CREATE TABLE `t_dates` (
`id` int(10) NOT NULL auto_increment,
`events_id` int(10) NOT NULL,
`events_start_date` date NOT NULL,
`events_end_date` date NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `IND_id` (`id`),
KEY `IND_events_id` (`events_id`),
CONSTRAINT `t_dates_ibfk_1` FOREIGN KEY (`events_id`) REFERENCES `t_event` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=32048 DEFAULT CHARSET=latin1
My Query is :
SELECT e.*,I.* ,d.*
FROM t_event AS e
INNER JOIN t_location AS I ON I.id = e.location
INNER JOIN t_dates AS d ON d.events_id = e.id
;
This query takes 90 s to execute and returns 27727 rows.
The PROFILE command shows that the "sending data" stage takes almost all of the execution time.
The EXPLAIN output is the following:
+----+------------+------+------+----------------------------+--------------------+---------+-----------+-------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+------------+------+------+----------------------------+--------------------+---------+-----------+-------+-------+
| 1 | SIMPLE | I | ALL | PRIMARY,id | NULL | NULL | NULL | 1143 | |
| 1 | SIMPLE | e | ref | PRIMARY,FK_t_event_t_location | FK_t_event_t_location | 4 | wu_db.I.id | 4 | |
| 1 | SIMPLE | d | ref | IND_events_id | IND_events_id | 4 | wu_db.e.id | 3 | |
+----+------------+------+------+----------------------------+--------------------+---------+-----------+-------+-------+
My view is that the large number of columns is responsible for this slowdown, but even when I write "SELECT e.id, I.events_id, d.id" it still takes 16 s.
I think that I have to rewrite the query with LIMIT and OFFSET clauses. What do you think?
number of records for each tables :
t_event = 7991
t_location = 1086
t_dates = 27727
Broadly speaking, MySQL can only filter records using one index from each table in a query.
That is, whilst your t_event table has indexes defined on both id and location, only one of those indexes can be used to satisfy your query. You can see this in your EXPLAIN output, which indicates that both the PRIMARY and FK_t_event_t_location keys were identified as possibly useful (with the latter actually selected for use).
Therefore, your join with t_dates, which involves a test on the id column, is being fulfilled with a table scan rather than an index lookup. Again, you can see this from the first row in the EXPLAIN output which shows type = ALL (table scan) and key = NULL (no index being used).
You should create a composite index on (id, location) for your t_event table:
ALTER TABLE t_event ADD INDEX (id, location);
My view is that the large number of columns is responsible for this slowdown, but even when I write "SELECT e.id, I.events_id, d.id" it still takes 16 s.
I think that I have to rewrite the query with LIMIT and OFFSET clauses. What do you think?
I think you're right.
If you could speed up the JOIN by an infinite factor, you would reduce the "select" phase to zero but leave the "sending data" part untouched - that's the other 74 seconds.
In other words, an infinite effort of query optimization would give you an advantage of 16 seconds out of 90 - around 18% overall.
If this is a batch query, then the time isn't so important; if it is not, as I believe, then I think it's really unlikely that someone is going to want a display, or even a synopsis, of some 27 thousand items.
Apart from a "paging" approach, or if a paging approach turned out not to be practical, or even in addition to a paging approach, you could see whether your application could use some kind of "filter" query (date ranges, location ranges).
So I'd study what WHERE conditions might be used to make that selection leaner.
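A minimal sketch of such a filtered, paged query follows. The date range, page size and index name are assumptions made for illustration; only the table and column names come from the question.

```sql
-- Hypothetical index so the date range can be resolved without a full scan.
ALTER TABLE t_dates ADD INDEX idx_dates_start (events_start_date);

-- Fetch one month of events, 50 rows per page (values are placeholders).
SELECT e.id, e.title, I.loc_name, d.events_start_date, d.events_end_date
FROM t_dates AS d
INNER JOIN t_event AS e ON e.id = d.events_id
INNER JOIN t_location AS I ON I.id = e.location
WHERE d.events_start_date BETWEEN '2013-01-01' AND '2013-01-31'
ORDER BY d.events_start_date, d.id   -- stable order so pages do not overlap
LIMIT 50 OFFSET 0;
```

Whether a date range, a location filter or something else is the right predicate depends on how the application is actually used.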
If it is a Web application, you could SELECT only the IDs (the query you already tried, the one taking only 16 s; and with a WHERE, maybe even less), or as few columns as possible. Let's imagine that now you're displaying a very long page with lots of "forms" holding all the information, e.g.
...
+----------------------------------------+
| Date: .... Place: ...... Notes: .... |
| Leader: .... Cost: .... |
... and so on and so forth ...
+----------------------------------------+
| Date: .... Place: ...... Notes: .... |
... and so on and so forth ...
+----------------------------------------+
...
You could, instead, display only a very basic, minimal set of information, corresponding to the columns you have fetched:
...
+----------------------------------------+
| Date: .... Place: ...... <OPEN>|
+----------------------------------------+
| Date: .... Place: ...... <OPEN>|
+----------------------------------------+
| Date: .... Place: ...... <OPEN>|
+----------------------------------------+
| Date: .... Place: ...... <OPEN>|
...
At this point, the user will be able to quickly browse the list but almost certainly won't open all of those forms, only two or three. When the user clicks on OPEN, a jQuery function could issue a very fast AJAX call to the server, supplying the IDs; then three separate queries would retrieve all the relevant data in milliseconds.
The data would be json_encode()d and sent back to jQuery, and the form would "open" displaying all the information in "accordion" fashion:
+----------------------------------------+
| Date: .... Place: ...... <OPEN>|
+----------------------------------------+
| Date: .... Place: ...... <CLOSE>|
| large form with all the information
| ...
| ...
+----------------------------------------+
| Date: .... Place: ...... <OPEN>|
This way you would not need to immediately retrieve all the columns, especially those largish columns such as short_desc and long_desc, which can reach two whole Kb between them, and yet the user would experience very fast response.
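The two stages described above could look roughly like this in SQL. The selected columns, the page size and the literal IDs are placeholders chosen for illustration.

```sql
-- Stage 1: list view. Fetch only the IDs plus the few columns shown in
-- the collapsed rows.
SELECT e.id AS event_id,
       I.id AS location_id,
       d.id AS date_id,
       d.events_start_date,
       I.loc_name
FROM t_event AS e
INNER JOIN t_location AS I ON I.id = e.location
INNER JOIN t_dates AS d ON d.events_id = e.id
ORDER BY d.events_start_date, d.id
LIMIT 50;

-- Stage 2: detail view, run only when the user clicks OPEN.
-- Three primary-key lookups; the IDs come from the clicked row.
SELECT * FROM t_event    WHERE id = 123;   -- placeholder event_id
SELECT * FROM t_location WHERE id = 45;    -- placeholder location_id
SELECT * FROM t_dates    WHERE id = 6789;  -- placeholder date_id
```

Each stage 2 query hits the primary key of its table, so the wide columns such as short_desc and long_desc are read only for the rows the user actually opens.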

How to set correct index for this MySql query?

I have this query showing up in MySql slow query log. (It is not slow, but it is not using indexes right). I need some help on how to set up the index right.
SELECT tbladded.amount*SUM(tbladdeditem.amount)
FROM tbladded
INNER JOIN tbladdeditem ON tbladded.addedid = tbladdeditem.addedid AND tbladdeditem.deleted='False'
WHERE tbladded.userid=100
AND tbladded.date='2012-01-01'
AND tbladded.deleted='False'
GROUP BY tbladded.addedid
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE tbladded ref PRIMARY,userid_date userid_date 8 const,const 1 Using where
1 SIMPLE tbladdeditem ref addedid addedid 5 tbladded.addedid 1 Using where
This is what the tables look like:
CREATE TABLE `tbladded` (
`addedid` int(11) NOT NULL AUTO_INCREMENT,
`amount` double DEFAULT NULL,
`date` date DEFAULT NULL,
`userid` mediumint(9) DEFAULT NULL,
`deleted` enum('False','True') CHARACTER SET latin1 DEFAULT 'False',
PRIMARY KEY (`addedid`),
KEY `userid_date` (`userid`,`date`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `tbladdeditem` (
`addeditemid` int(11) NOT NULL AUTO_INCREMENT,
`amount` double DEFAULT NULL,
`addedid` int(11) DEFAULT NULL,
`userid` mediumint(9) DEFAULT NULL,
`deleted` enum('False','True') CHARACTER SET latin1 DEFAULT 'False',
PRIMARY KEY (`addeditemid`),
KEY `addedid` (`addedid`),
KEY `userid` (`userid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
try this:
ALTER TABLE `tbladded` ADD INDEX
`tbladdedIndex` (`userid`, `date`, `deleted`);
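The joined table is filtered on addedid and deleted as well, so a matching composite index there may also help. This is an assumption that goes beyond the suggestion above, and the index name is made up.

```sql
-- Hypothetical companion index for the join side of the query.
ALTER TABLE `tbladdeditem` ADD INDEX
`tbladdeditemIndex` (`addedid`, `deleted`);

-- Re-run EXPLAIN to see which keys the optimizer now picks.
EXPLAIN
SELECT tbladded.amount * SUM(tbladdeditem.amount)
FROM tbladded
INNER JOIN tbladdeditem
   ON tbladded.addedid = tbladdeditem.addedid
  AND tbladdeditem.deleted = 'False'
WHERE tbladded.userid = 100
  AND tbladded.date = '2012-01-01'
  AND tbladded.deleted = 'False'
GROUP BY tbladded.addedid;
```

If the key column in the new EXPLAIN output still shows userid_date and addedid, the optimizer has judged the existing indexes to be just as good for the current data.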