There is a table users with primary key as user_id and an indexed column called verified.
Another table user_profile has PK as profile_id and FK as user_id and has a column - name
Now, I need to find all verified users and their names. so i need to join these 2 tables on user_id -
Query becomes -
select p.name from user_profile p inner join user u on p.user_id = u.user_id
where u.verified = 1;
There are 700000 records in profile table and same number of records in user table. This query above takes 13 seconds to run. Please let me know, how can I optimize the running time.
MySQL version 5.5 , YII
EDIT
CREATE TABLE IF NOT EXISTS `tbl_profile` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(10) unsigned NOT NULL,
`regyear` int(4) DEFAULT NULL,
`firstname` varchar(128) NOT NULL,
`gender` varchar(10) NOT NULL,
`occupation` int(5) NOT NULL,
`street` varchar(255) DEFAULT NULL,
`state` int(10) DEFAULT NULL,
`city` int(10) DEFAULT NULL,
`zip` int(10) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`),
KEY `state` (`state`),
KEY `firstname` (`firstname`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=626494 ;
--
-- Table structure for table tbl_user
CREATE TABLE IF NOT EXISTS `tbl_user` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`email` varchar(255) DEFAULT NULL,
`password` varchar(128) NOT NULL,
`createtime` int(10) NOT NULL DEFAULT '0',
`lastvisit` int(10) NOT NULL DEFAULT '0',
`status` int(1) NOT NULL DEFAULT '0',
`verified` int(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `email` (`email`),
KEY `status` (`status`),
KEY `verified` (`verified`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=626494 ;
Output of EXPLAIN SELECT - I have written same query as above but substituting 999 for 1 and using column status instead of verified, which is equivalent to problem statement.
EXPLAIN SELECT p.firstname
FROM tbl_profile p
INNER JOIN tbl_user u ON p.user_id = u.id
WHERE u.status =999
+----+-------------+-------+------+----------------+---------+---------+-------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------+---------+---------+-------------+--------+-------------+
| 1 | SIMPLE | u | ref | PRIMARY,status | status | 4 | const | 313333 | Using index |
| 1 | SIMPLE | p | ref | user_id | user_id | 4 | newone.u.id | 1 | |
+----+-------------+-------+------+----------------+---------+---------+-------------+--------+-------------+
Suggestion 1
Adding an index on (user_id, first_name) will improve efficiency of this specific query:
ALTER TABLE tbl_profile
ADD INDEX user_id_first_name_IX -- just a name for the index
(user_id, first_name) ;
But if you also have similar queries, where you are selecting other columns, you'll need more indexes like this. And adding 5-10 indexes in the table is not too bad (it will only slow your inserts a bit.) But adding too many indexes will be harmful in the end.
Suggestion 2
If every user has maximum of 1 profile, then there is no need to have an auto-incrementing id in table profiles. I suggest you drop that column and make the user_id the primary key. I would also make that a foreign key as well:
ALTER TABLE tbl_profile
DROP PRIMARY KEY,
DROP COLUMN id,
ADD CONSTRAINT profile_PK
PRIMARY KEY (user_id),
ADD CONSTRAINT user_profile_FK
FOREIGN KEY (user_id)
REFERENCES tbl_user (id) ;
This is far better than suggestion 1, as you will basically make the user_id the clustered index of the table. Any query that uses user_id for a join on this table will be able to use this (primary and clustered) index.
You may gain a performance improvement by moving the condition into the join's ON clause:
select p.name
from user_profile p
join user u on p.user_id = u.user_id and u.verified = 1;
This reason it may perform better is that the WHERE clause is evaluated after all rows are joined - it's a filter on the result set. However the ON condition is evaluated as the join is being made, so there is a possibility that the database will have to deal with a lot less rows, and thus a lot less memory/resources.
Other than that change, I can't see anything else you could do.
Related
I have the following tables in question:
Personas
ImpressionsPersonas [join table - Personas ManyToMany Impressions]
Impressions
My query looks like this, the EXPLAIN results are attached below:
SELECT
DISTINCT (Personas.id),
Personas.parent_id,
Personas.persona,
Personas.subpersonas_count,
Personas.is_subpersona,
Personas.impressions_count,
Personas.created,
Personas.modified
FROM personas as Personas
INNER JOIN
impressions_personas ImpressionsPersonas ON (
Personas.id = ImpressionsPersonas.persona_id
)
inner JOIN impressions Impressions ON (Impressions.id = ImpressionsPersonas.impression_id AND Impressions.timestamp >= "2016-06-01 00:00:00" AND Impressions.timestamp <= "2016-07-31 00:00:00")
EXPLAIN
+----+-------------+---------------------+--------+-----------------------------------------------------------------------+-------------+---------+---------------------------------------------+------+----------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------------------+--------+-----------------------------------------------------------------------+-------------+---------+---------------------------------------------+------+----------+-----------------------+
| 1 | SIMPLE | Personas | ALL | PRIMARY | NULL | NULL | NULL | 159 | 100.00 | Using temporary |
| 1 | SIMPLE | ImpressionsPersonas | ref | impression_idx,persona_idx,comp_imp_persona,comp_imp_pri,comp_per_pri | persona_idx | 8 | gen1_d2go.Personas.id | 396 | 100.00 | Distinct |
| 1 | SIMPLE | Impressions | eq_ref | PRIMARY,timestamp,timestamp_id | PRIMARY | 8 | gen1_d2go.ImpressionsPersonas.impression_id | 1 | 100.00 | Using where; Distinct |
+----+-------------+---------------------+--------+-----------------------------------------------------------------------+-------------+---------+---------------------------------------------+------+----------+-----------------------+
3 rows in set, 1 warning (0.00 sec)
CREATE STATEMENT FOR PERSONAS
CREATE TABLE `personas` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`parent_id` bigint(20) unsigned DEFAULT NULL,
`persona` varchar(150) NOT NULL,
`subpersonas_count` int(10) unsigned DEFAULT '0',
`is_subpersona` tinyint(1) unsigned DEFAULT '0',
`impressions_count` bigint(20) unsigned DEFAULT '0',
`created` datetime DEFAULT NULL,
`modified` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `lookup` (`parent_id`,`persona`),
KEY `parent_index` (`parent_id`),
KEY `persona` (`persona`),
KEY `persona_a_id` (`id`,`persona`),
CONSTRAINT `self_referential_join_to_self` FOREIGN KEY (`parent_id`) REFERENCES `personas` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=1049 DEFAULT CHARSET=utf8;
CREATE STATEMENT FOR IMPRESSIONS_PERSONAS
CREATE TABLE `impressions_personas` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`impression_id` bigint(20) unsigned NOT NULL,
`persona_id` bigint(20) unsigned NOT NULL,
`created` datetime DEFAULT NULL,
`modified` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `impression_idx` (`impression_id`),
KEY `persona_idx` (`persona_id`),
KEY `comp_imp_persona` (`impression_id`,`persona_id`),
KEY `comp_imp_pri` (`impression_id`,`id`),
KEY `comp_per_pri` (`persona_id`,`id`),
CONSTRAINT `impression` FOREIGN KEY (`impression_id`) REFERENCES `impressions` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION,
CONSTRAINT `persona` FOREIGN KEY (`persona_id`) REFERENCES `personas` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=19387839 DEFAULT CHARSET=utf8;
CREATE STATEMENT FOR IMPRESSIONS
CREATE TABLE `impressions` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`device_id` bigint(20) unsigned NOT NULL,
`beacon_id` bigint(20) unsigned NOT NULL,
`zone_id` bigint(20) unsigned NOT NULL,
`application_id` bigint(20) unsigned DEFAULT NULL,
`timestamp` datetime NOT NULL,
`google_place_id` bigint(20) unsigned DEFAULT NULL,
`name` varchar(60) DEFAULT NULL,
`lat` decimal(15,10) DEFAULT NULL,
`lng` decimal(15,10) DEFAULT NULL,
`personas_count` int(10) unsigned DEFAULT '0',
`created` datetime DEFAULT NULL,
`modified` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `device_idx` (`device_id`),
KEY `zone_idx` (`zone_id`),
KEY `beacon_id_idx2` (`beacon_id`),
KEY `timestamp` (`timestamp`),
KEY `appid_fk_idx_idx` (`application_id`),
KEY `comp_lookup` (`device_id`,`beacon_id`,`timestamp`),
KEY `timestamp_id` (`timestamp`,`id`),
CONSTRAINT `appid_fk_idx` FOREIGN KEY (`application_id`) REFERENCES `applications` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `beacon_id` FOREIGN KEY (`beacon_id`) REFERENCES `beacons` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION,
CONSTRAINT `device2` FOREIGN KEY (`device_id`) REFERENCES `devices` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `zone_FK` FOREIGN KEY (`zone_id`) REFERENCES `zones` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=1582724 DEFAULT CHARSET=utf8;
Now - when I run the query without the DISTINCT and using a COUNT(*), it pulls about 17,000,000 records. Running it with DISTINCT yields 112 records. I am not sure why there are so many records showing up when the explain showed only 159 and 396.
Some information about the tables:
The Personas table contains 159 records. The ImpressionsPersonas table contains about 12.6 million, and Impressions contains about 920,000 records.
What we are doing is selecting the Personas table and joining to the Impressions by way of the join table ImpressionsPersonas. There are filters applied to the Impressions table (date in this case).
Note: removing the date filter had a negligible impact on the execution time - which hovers right around 120s. Is there a way to filter these records down to cut down the execution time of this query?
I presume that you want to get the list of persons who have at least 1 impression within a specified time period. To get this, you can use such a correlated sub-query:
SELECT
Personas.id,
Personas.parent_id,
Personas.persona,
Personas.subpersonas_count,
Personas.is_subpersona,
Personas.impressions_count,
Personas.created,
Personas.modified
FROM personas as Personas
WHERE EXISTS(SELECT 1 FROM impressions_personas
LEFT JOIN impressions Impressions ON
Impressions.id = ImpressionsPersonas.impression_id
WHERE Personas.id = ImpressionsPersonas.persona_id
AND Impressions.timestamp >= "2016-06-01 00:00:00"
AND Impressions.timestamp <= "2016-07-31 00:00:00"
)
Create an INDEX on timestamp column of impressions table. And see if improved else try using the created index in the query(forcing index).
UPDATE
Use INDEX in the JOIN
SELECT
DISTINCT (Personas.id),
Personas.parent_id,
Personas.persona,
Personas.subpersonas_count,
Personas.is_subpersona,
Personas.impressions_count,
Personas.created,
Personas.modified
FROM
personas as Personas
INNER JOIN
impressions_personas ImpressionsPersonas ON (
Personas.id = ImpressionsPersonas.persona_id
)
INNER JOIN
impressions Impressions WITH(INDEX(timestamp)) ON
(Impressions.id = ImpressionsPersonas.impression_id AND
Impressions.timestamp >= "2016-06-01 00:00:00" AND
Impressions.timestamp <= "2016-07-31 11:59:59")
At the moment you're first joining three tables with millions of rows and then using DISTINCT to get only a few rows out of it. Better way is to first get only the required IDs and then use those to select the actual result data.
For example:
SELECT column, other FROM personas
WHERE id IN
(SELECT distinct persona_id
FROM impressions_personas
INNER JOIN impressions Impressions
ON Impressions.id = ImpressionsPersonas.impression_id
AND Impressions.timestamp >= "2016-06-01 00:00:00"
AND Impressions.timestamp <= "2016-07-31 00:00:00"))
This way the engine will only handle a single column for the whole procedure until getting the results.
I have a few queries on a "custom dashboard" of my application, and one of them is taking 10-12 seconds to execute. Using EXPLAIN I can see why it's slow, but I don't know what to do about it. Here is the query:
SELECT person.PersonID,FullName,Furigana,qualdate FROM person
INNER JOIN (
SELECT pq.PersonID,MAX(ContactDate) AS qualdate FROM person pq
INNER JOIN contact cq ON pq.PersonID=cq.PersonID
WHERE cq.ContactTypeID IN (22,26,45) GROUP BY pq.PersonID
) qual ON person.PersonID=qual.PersonID
LEFT OUTER JOIN (
SELECT pe.personID,MAX(ContactDate) AS elimdate FROM person pe
INNER JOIN contact ce ON pe.PersonID=ce.PersonID WHERE ce.ContactTypeID IN (25,31,30,41,23,42,2,33,35,29,12)
GROUP BY pe.PersonID
) elim ON qual.PersonID=elim.PersonID
LEFT OUTER JOIN (
SELECT po.personID FROM person po
INNER JOIN percat pc ON po.PersonID=pc.PersonID WHERE pc.CategoryID=38
) overseas ON qual.PersonID=overseas.PersonID
WHERE (elimdate IS NULL OR qualdate > elimdate)
AND qualdate < CURDATE()-INTERVAL 7 DAY
AND overseas.PersonID IS NULL
ORDER BY qualdate
And here is the EXPLAIN result:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 5447 Using where; Using temporary; Using filesort
1 PRIMARY <derived3> ALL NULL NULL NULL NULL 5565 Using where
1 PRIMARY <derived4> ALL NULL NULL NULL NULL 9 Using where; Not exists
1 PRIMARY person eq_ref PRIMARY PRIMARY 4 qual.PersonID 1
4 DERIVED pc ref PRIMARY,CategoryID CategoryID 4 8
4 DERIVED po eq_ref PRIMARY PRIMARY 4 kizuna_misa.pc.PersonID 1 Using index
3 DERIVED pe index PRIMARY PRIMARY 4 NULL 5964 Using index
3 DERIVED ce ref PersonID,ContactTypeID PersonID 4 kizuna_misa.pe.PersonID 1 Using where
2 DERIVED pq index PRIMARY PRIMARY 4 NULL 5964 Using index
2 DERIVED cq ref PersonID,ContactTypeID PersonID 4 kizuna_misa.pq.PersonID 1 Using where
I'm sure the first line of the EXPLAIN reveals the problem (comparing with similar queries, it appears that the second line isn't too slow), but I don't know how to fix it. I already have indexes on every column that appears in the joins, but since the tables are <derived2> etc., I guess indexes are irrelevant.
The objective (since it's probably not obvious to someone unfamiliar with my application and schema) is a followup tickler list - if one of the #22/26/45 contacts has occurred but nothing has been done in response (either one of several other contacts or designating by a category assignment that the person is overseas), then the person should appear in the list for followup after waiting a week. Subqueries are easier for me to write and understand than these messy joins, but I can't check the sequence of dates (and subqueries are often slow, also).
EDIT (in response to Rick James):
MySQL version is 5.0.95 (yeah, I know...). And here is SHOW CREATE TABLE for the three tables involved, even though most of the fields in person are irrelevant:
CREATE TABLE `contact` (
`ContactID` int(11) unsigned NOT NULL auto_increment,
`PersonID` int(11) unsigned NOT NULL default '0',
`ContactTypeID` int(11) unsigned NOT NULL default '0',
`ContactDate` date NOT NULL default '0000-00-00',
`Description` text,
PRIMARY KEY (`ContactID`),
KEY `ContactDate` (`ContactDate`),
KEY `PersonID` (`PersonID`),
KEY `ContactTypeID` (`ContactTypeID`)
) ENGINE=MyISAM AUTO_INCREMENT=16901 DEFAULT CHARSET=utf8
CREATE TABLE `percat` (
`PersonID` int(11) unsigned NOT NULL default '0',
`CategoryID` int(11) unsigned NOT NULL default '0',
PRIMARY KEY (`PersonID`,`CategoryID`),
KEY `CategoryID` (`CategoryID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
CREATE TABLE `person` (
`PersonID` int(11) unsigned NOT NULL auto_increment,
`FullName` varchar(100) NOT NULL default '',
`Furigana` varchar(100) NOT NULL default '',
`Sex` enum('','M','F') character set ascii NOT NULL default '',
`HouseholdID` int(11) unsigned NOT NULL default '0',
`Relation` varchar(6) character set ascii NOT NULL default '',
`Title` varchar(6) NOT NULL default '',
`CellPhone` varchar(30) character set ascii NOT NULL default '',
`Email` varchar(70) character set ascii NOT NULL default '',
`Birthdate` date NOT NULL default '0000-00-00',
`Country` varchar(30) NOT NULL default '',
`URL` varchar(150) NOT NULL default '',
`Organization` tinyint(1) NOT NULL default '0',
`Remarks` text NOT NULL,
`Photo` tinyint(1) NOT NULL default '0',
`UpdDate` date NOT NULL default '0000-00-00',
PRIMARY KEY (`PersonID`),
KEY `Furigana` (`Furigana`),
KEY `FullName` (`FullName`),
KEY `Email` (`Email`),
KEY `Organization` (`Organization`,`Furigana`)
) ENGINE=MyISAM AUTO_INCREMENT=6063 DEFAULT CHARSET=utf8
Attempted suggestion:
I tried to implement Rick James's suggestion of putting the subselects in the field list (I didn't even know that was possible), like this:
SELECT
p.PersonID,
FullName,
Furigana,
(SELECT MAX(ContactDate) FROM contact cq
WHERE cq.PersonID=p.PersonID
AND cq.ContactTypeID IN (22,26,45))
AS qualdate,
(SELECT MAX(ContactDate) FROM contact ce
WHERE ce.PersonID=p.PersonID
AND ce.ContactTypeID IN (25,31,30,41,23,42,2,33,35,29,12))
AS elimdate
FROM person p
WHERE (elimdate IS NULL OR qualdate > elimdate)
AND qualdate < CURDATE()-INTERVAL 7 DAY
AND NOT EXISTS (SELECT * FROM percat WHERE CategoryID=38 AND percat.PersonID=p.PersonID)
ORDER BY qualdate
But it complains: #1054 - Unknown column 'elimdate' in 'where clause' According to the docs, WHERE clauses are interpreted before field lists, so this approach isn't going to work.
You have an interesting query. I am not sure what the best solution is. Here are two guesses:
Plan A
INDEX(qualdate)
may help. Please provide SHOW CREATE TABLE.
This construct optimizes poorly:
FROM ( SELECT ... )
JOIN ( SELECT ... )
In your case, overseas should probably turned into a JOIN, not a subselect. And the other two should probably be turned into a different flavor of dependent subquery:
SELECT ...,
( SELECT MAX(...) ... ) AS qualdate,
( SELECT MAX(...) ... ) AS elimdate
FROM ...
What version of MySQL are you running?
Plan B
If practical, fold these into the subqueries so that they generate fewer rows, thereby leading to less effort at the outer query. (One per subquery)
elimdate IS NOT NULL
qualdate < CURDATE()-INTERVAL 7 DAY
overseas.PersonID IS NOT NULL
Perhaps the NULL tests apply to LEFT and this suggestion may not apply.
Here's the basic query:
SELECT
some_columns
FROM
d
JOIN
m ON d.id = m.d_id
JOIN
s ON s.id = m.s_id
JOIN
p ON p.id = s.p_id
WHERE
some_criteria
ORDER BY
d.date DESC
LIMIT 25
Table m can contain multiple s_ids per each d_id. Here's a super simple example:
+--------+--------+------+
| id | d_id | s_id |
+--------+--------+------+
| 317354 | 291220 | 642 |
| 317355 | 291220 | 32 |
+--------+--------+------+
2 rows in set (0.00 sec)
Which we want. But, obviously, it's producing duplicate d records in this particular query.
These tables have lots of columns, and I need to edit these down due to the sensitive nature of the data, but here's the basic structure as it pertains to this query:
| d | CREATE TABLE `d` (
`id` bigint(22) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`),
KEY `date` (`date`)
) ENGINE=InnoDB |
| m | CREATE TABLE `m` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`d_id` bigint(20) NOT NULL,
`s_id` bigint(20) NOT NULL,
`is_king` binary(1) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `d_id` (`d_id`),
KEY `is_king` (`is_king`),
KEY `s_id` (`s_id`)
) ENGINE=InnoDB |
| s | CREATE TABLE `s` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`p_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `p_id` (`p_id`)
) ENGINE=InnoDB |
| p | CREATE TABLE `p` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)
) ENGINE=InnoDB |
Now, previously, we had a GROUP BY d.id in place to grab uniques. The data here are now huge, so we can no longer realistically do that. SELECT DISTINCT d.id is even slower.
Any ideas? Everything I come up with creates a problem elsewhere.
Does changing "JOIN m ON d.id = m.d_id" to "LEFT JOIN m ON d.id = m.d_id" accomplish what you're looking for here?
I'm not sure I understand your goal clearly, but "table m contains many rows per each d" immediately has me wondering if you should be using some other type of join to accomplish your ends.
I have got two tables. One is a User table with a primary key on the userid and the other table references the user table with a foreign key.
The User table has only one entry (for now) and the other table has one million entrys.
The following join drives me mad:
SELECT p0_.*, p1_.*
FROM photo p0_, User p1_
WHERE p0_.user_id = p1_.user_id
ORDER BY p0_.uploaddate DESC Limit 10 OFFSET 100000
The query takes 12sec on a very fast machine with the order by and 0.0005 sec without the order by.
I've got an index on user_id (IDX_14B78418A76ED395) and a composite index ("search2") on user_id and uploaddate.
EXPLAIN shows the following:
+----+-------------+-------+------+------------------------------+----------------------+---------+---------------------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+------------------------------+----------------------+---------+---------------------+-------+---------------------------------+
| 1 | SIMPLE | p1_ | ALL | PRIMARY | NULL | NULL | NULL | 1 | Using temporary; Using filesort |
| 1 | SIMPLE | p0_ | ref | IDX_14B78418A76ED395,search2 | IDX_14B78418A76ED395 | 4 | odsfoto.p1_.user_id | 58520 | |
+----+-------------+-------+------+------------------------------+----------------------+---------+---------------------+-------+---------------------------------+
Table definitions:
CREATE TABLE `photo` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`album_id` int(11) DEFAULT NULL,
`exif_id` int(11) DEFAULT NULL,
`title` varchar(50) COLLATE utf8_unicode_ci NOT NULL,
`width` int(11) NOT NULL,
`height` int(11) NOT NULL,
`uploaddate` datetime NOT NULL,
`filesize` int(11) DEFAULT NULL,
`path` varchar(200) COLLATE utf8_unicode_ci NOT NULL,
`originalFilename` varchar(200) COLLATE utf8_unicode_ci NOT NULL,
`mimeType` varchar(200) COLLATE utf8_unicode_ci NOT NULL,
`description` longtext COLLATE utf8_unicode_ci,
`gpsData_id` int(11) DEFAULT NULL,
`views` int(11) DEFAULT NULL,
`likes` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UNIQ_14B78418B0FC9251` (`exif_id`),
UNIQUE KEY `UNIQ_14B7841867E96507` (`gpsData_id`),
KEY `IDX_14B78418A76ED395` (`user_id`),
KEY `IDX_14B784181137ABCF` (`album_id`),
KEY `search_idx` (`uploaddate`),
KEY `search2` (`user_id`,`uploaddate`),
KEY `search3` (`uploaddate`,`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `user` (
`user_id` int(11) NOT NULL,
`photoCount` int(11) NOT NULL,
`photoViews` int(11) NOT NULL,
`photoComments` int(11) NOT NULL,
`photoLikes` int(11) NOT NULL,
`username` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
What can I do to speed up this query?
Seems you're suffering from MySQL's inability to do late row lookups:
MySQL ORDER BY / LIMIT performance: late row lookups
Late row lookups: InnoDB
Try this:
SELECT p.*, u.*
FROM (
SELECT id
FROM photo
ORDER BY
uploaddate DESC, id DESC
LIMIT 10
OFFSET 100000
) pi
JOIN photo p
ON p.id = pi.id
JOIN user u
ON u.user_id = p.user_id
You need a separate index on uploaddate. This sort will take advantage of composite index only if uploaddate is first column in it.
You can also try to add user_id to ORDER BY:
....
ORDER BY p0_.user_id, p0_.uploaddate
You have two problems:
You need to create an INDEX(user_id, uploaddate) which will greatly increase the efficiency of the query.
You need to find a workaround to using LIMIT 10 OFFSET 100000. MySQL is creating a recordset with 100,000 records in it, then it pulls the last 10 records off the end... that is extremely inefficient.
https://www.percona.com/blog/2006/09/01/mysql-order-by-limit-performance-optimization/
First try to get result based on primary key with out join and use result to query result again.
For ex:
$userIds=mysql::select("select user_id from photo ORDER BY p0_.uploaddate DESC Limit 10 OFFSET 100000");
$photoData=mysql::select("SELECT p0_., p1_.
FROM photo p0_, User p1_
WHERE p0_.user_id = p1_.user_id and p0_.user_id in ($userIds->user_id) order by p0_.uploaddate");
Here we had divided the statement into two parts:
1.We can easily order and get based on primary key and also there are no joins.
2.Getting query results based on id and order by is only on limited columns we can retrieve data in less time
From 30 seconds to 0.015 sec / 0.000 sec using Quassnoi answer !
This is what I called MySql expertise !
I cut out one Join from my personal project (no join with itself)
Select ser.id_table, ser.id_rec, ser.relevance, cnt, title, description, sell_url, medium_thumb,
unique_id_supplier, keywords width, height, media_type
from (
Select ser.id_rec, ser.id_table, ser.relevance, ser.cnt
from searchEngineResults ser
where thisSearch = 16287
order by ser.relevance desc, cnt desc, id_rec
) ser
join photo_resell sou on sou.id = ser.id_rec
#join searchEngineResults ser on ser.id_rec = tmp.id_rec
limit 0, 9
I'm doing a join between the "favorites" table (3 million rows) the "items" table (600k rows).
The query is taking anywhere from .3 seconds to 2 seconds, and I'm hoping I can optimize it some.
Favorites.faver_profile_id and Items.id are indexed.
Instead of using the faver_profile_id index I created a new index on (faver_profile_id,id), which eliminated the filesort needed when sorting by id. Unfortunately this index doesn't help at all and I'll probably remove it (yay, 3 more hours of downtime to drop the index..)
Any ideas on how I can optimize this query?
In case it helps:
Favorite.removed and Item.removed are "0" 98% of the time.
Favorite.collection_id is NULL about 80% of the time.
SELECT `Item`.`id`, `Item`.`source_image`, `Item`.`cached_image`, `Item`.`source_title`, `Item`.`source_url`, `Item`.`width`, `Item`.`height`, `Item`.`fave_count`, `Item`.`created`
FROM `favorites` AS `Favorite`
LEFT JOIN `items` AS `Item`
ON (`Item`.`removed` = 0 AND `Favorite`.`notice_id` = `Item`.`id`)
WHERE ((`faver_profile_id` = 1) AND (`collection_id` IS NULL) AND (`Favorite`.`removed` = 0) AND (`Item`.`removed` = '0'))
ORDER BY `Favorite`.`id` desc LIMIT 50;
+----+-------------+----------+--------+----------------------------------------------------- ----------+------------------+---------+-----------------------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+--------+---------------------------------------------------------------+------------------+---------+-----------------------------------------+------+-------------+
| 1 | SIMPLE | Favorite | ref | notice_id,faver_profile_id,collection_id_idx,idx_faver_idx_id | idx_faver_idx_id | 4 | const | 7910 | Using where |
| 1 | SIMPLE | Item | eq_ref | PRIMARY | PRIMARY | 4 | gragland_imgfavebeta.Favorite.notice_id | 1 | Using where |
+----+-------------+----------+--------+---------------------------------------------------------------+------------------+---------+-----------------------------------------+------+-------------+
+-----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| favorites | CREATE TABLE `favorites` (
`id` int(11) NOT NULL auto_increment COMMENT 'unique identifier',
`faver_profile_id` int(11) NOT NULL default '0',
`collection_id` int(11) default NULL,
`collection_order` int(8) default NULL,
`created` datetime NOT NULL default '0000-00-00 00:00:00' COMMENT 'date this record was created',
`modified` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP COMMENT 'date this record was modified',
`notice_id` int(11) NOT NULL default '0',
`removed` tinyint(1) NOT NULL default '0',
PRIMARY KEY (`id`),
KEY `notice_id` (`notice_id`),
KEY `faver_profile_id` (`faver_profile_id`),
KEY `collection_id_idx` (`collection_id`),
KEY `idx_faver_idx_id` (`faver_profile_id`,`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| items |CREATE TABLE `items` (
`id` int(11) NOT NULL auto_increment COMMENT 'unique identifier',
`submitter_id` int(11) NOT NULL default '0' COMMENT 'who made the update',
`source_image` varchar(255) default NULL COMMENT 'update content',
`cached_image` varchar(255) default NULL,
`source_title` varchar(255) NOT NULL default '',
`source_url` text NOT NULL,
`width` int(4) NOT NULL default '0',
`height` int(4) NOT NULL default '0',
`status` varchar(122) NOT NULL default '',
`popular` int(1) NOT NULL default '0',
`made_popular` timestamp NULL default NULL,
`fave_count` int(9) NOT NULL default '0',
`tags` text,
`user_art` tinyint(1) NOT NULL default '0',
`nudity` tinyint(1) NOT NULL default '0',
`created` datetime NOT NULL default '0000-00-00 00:00:00' COMMENT 'date this record was created',
`modified` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP COMMENT 'date this record was modified',
`removed` int(1) NOT NULL default '0',
`nofront` tinyint(1) NOT NULL default '0',
`test` varchar(10) NOT NULL default '',
`recs` text,
`recs_data` text,
PRIMARY KEY (`id`),
KEY `notice_profile_id_idx` (`submitter_id`),
KEY `content` (`source_image`),
KEY `idx_popular` (`popular`),
KEY `idx_madepopular` (`made_popular`),
KEY `idx_favecount_idx_id` (`fave_count`,`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
First of all, you order by favorites.id which is clustered primary key in favorites table. This wil not be necessary of you will join favorites to items instead of items to favorites.
Second, (Item.removed = '0') in WHERE is excess, because the same condition has already been used in JOIN.
Third, change the order of condition in join to:
`Favorite`.`notice_id` = `Item`.`id` AND `Item`.`removed` = 0
the optimizer will be able to use you primary key for index. You may even consider creating (id, removed) index on items table.
Next, create (faver_profile_id, removed) index in favorites (or better update faver_profile_id index) and change the order of conditions in WHERE to the following:
(`faver_profile_id` = 1)
AND (`Favorite`.`removed` = 0)
AND (`collection_id` IS NULL)
UPD: I am sorry, I missed that you already join favorites to items. Then the ORDER BY is not needed. You should result in something like the following:
SELECT
`Item`.`id`,
`Item`.`source_image`,
`Item`.`cached_image`,
`Item`.`source_title`,
`Item`.`source_url`,
`Item`.`width`,
`Item`.`height`,
`Item`.`fave_count`,
`Item`.`created`
FROM `favorites` AS `Favorite`
LEFT JOIN `items` AS `Item`
ON (`Favorite`.`notice_id` = `Item`.`id` AND `Item`.`removed` = 0)
WHERE `faver_profile_id` = 1
AND `Favorite`.`removed` = 0
AND `collection_id` IS NULL
LIMIT 50;
And one more thing, when you have KEY idx_faver_idx_id (faver_profile_id,id) you do not need KEY faver_profile_id (faver_profile_id), because the second index just duplicates half of the idx_faver_idx_id. I hope you will extend the second index, as I suggested.
Get a copy of your table from backup, and try to make an index on Favorite table covering all WHERE and JOIN conditions, namely (removed, collection_id, profile_id). Do the same with Item. It might help, but will make inserts potentially much slower.
The SQL engine won't use an index if it still has to do full table scan due to constraints, would it?