Slow mysql query with multiple joins - mysql

I have the following tables in my database:
product_fav:
CREATE TABLE `product_fav` (
`user_id` int(9) unsigned NOT NULL,
`asin` varchar(10) NOT NULL DEFAULT '',
`date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`price` decimal(7,2) NOT NULL,
PRIMARY KEY (`user_id`,`asin`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
product_info:
CREATE TABLE `product_info` (
`asin` varchar(10) NOT NULL,
`name` varchar(200) DEFAULT NULL,
`brand` varchar(50) DEFAULT NULL,
`part_number` varchar(50) DEFAULT NULL,
`url` text,
`image` text,
`availabillity` tinyint(1) NOT NULL DEFAULT '1',
PRIMARY KEY (`asin`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
product_price:
CREATE TABLE `product_price` (
`asin` varchar(10) NOT NULL,
`date` date NOT NULL,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`price` decimal(7,2) NOT NULL DEFAULT '0.00',
PRIMARY KEY (`asin`,`date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
I have the following query:
SELECT pi.*,
pp.price,
pf.date,
pf.price AS price_added,
round((100.0 * (pp.price - pf.price) / pf.price),0) AS percentdiff
FROM product_info pi
JOIN
(
SELECT *
FROM product_price
ORDER BY date DESC) pp
ON pp.asin = pi.asin
JOIN product_fav pf
ON pp.asin = pf.asin
WHERE pf.user_id=". $user['user_id'] ."
GROUP BY asin
product_price has many records and the query takes about 3 seconds. Is it possible to make it faster?
I have the same issue with a search query:
SELECT pi.*,
price,
date
FROM product_info pi
JOIN (SELECT *
FROM product_price
ORDER BY date DESC) pp
ON pi.asin = pp.asin
WHERE ( `name` LIKE '%".$search."%' )
GROUP BY pi.asin
ORDER BY price
EXPLAIN returns this:
+----+-------------+---------------+--------+---------------+---------+---------+---------------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+--------+---------------+---------+---------+---------------+--------+---------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 106709 | Using temporary; Using filesort |
| 1 | PRIMARY | pi | eq_ref | PRIMARY | PRIMARY | 32 | pp.asin | 1 | |
| 1 | PRIMARY | pf | eq_ref | PRIMARY | PRIMARY | 36 | const,pp.asin | 1 | |
| 2 | DERIVED | product_price | ALL | NULL | NULL | NULL | NULL | 112041 | Using filesort |
+----+-------------+---------------+--------+---------------+---------+---------+---------------+--------+---------------------------------+

Don't ORDER BY inside the derived table before the JOIN. If you need an ordering, apply it after the WHERE and GROUP BY so there is less data to sort.
JOIN
(
SELECT *
FROM product_price
ORDER BY date DESC) pp
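Make sure asin is indexed so the JOIN on pp.asin = pi.asin is efficient. In the schema shown it already leads each primary key; the derived table pp, however, has no index at all, which is why EXPLAIN reports a full scan on <derived2>.
Make sure user_id is indexed so the WHERE pf.user_id=". $user['user_id'] ." filter is efficient (it is the leading column of the product_fav primary key).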

Try running EXPLAIN on your query to figure out where the bottleneck is.
What's with the ORDER BY date in the inner query? Try getting rid of it. Also try replacing the inner query with a JOIN, they tend to be faster.
Also, do you have an index on the date field? Try adding one for the ORDER BY at the end of the query.
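One way to act on the advice above is to look up each product's latest price explicitly, rather than sorting all of product_price inside a derived table and relying on GROUP BY to keep the first row (MySQL does not guarantee which row it keeps). The following is a minimal sketch, not a measured replacement. It assumes "latest" means the greatest date per asin, and @user_id is a session variable introduced here to stand in for $user['user_id'].

```sql
SET @user_id = 1;  -- hypothetical value, stands in for $user['user_id']

SELECT pi.*,
       pp.price,
       pf.date,
       pf.price AS price_added,
       ROUND(100.0 * (pp.price - pf.price) / pf.price, 0) AS percentdiff
FROM product_fav pf
JOIN product_info pi
  ON pi.asin = pf.asin
JOIN (
    -- newest price date, computed only for this user's favourites
    SELECT p.asin, MAX(p.date) AS max_date
    FROM product_price p
    JOIN product_fav f ON f.asin = p.asin
    WHERE f.user_id = @user_id
    GROUP BY p.asin
) latest
  ON latest.asin = pf.asin
JOIN product_price pp
  ON pp.asin = latest.asin
 AND pp.date = latest.max_date
WHERE pf.user_id = @user_id;
```

Since (asin, date) is the primary key of product_price and (user_id, asin) that of product_fav, each favourite matches at most one price row, so the outer query should not need a GROUP BY.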

Related

Slow MySQL Query, Group By Order By Limit

I currently join 5 tables to select 20 objects to show the user. Unfortunately, if I use GROUP BY and ORDER BY it gets really slow.
An example query looks like this:
SELECT r.name, l.name, o.typ, o.id, persons, children, description, rating, totalratings, minprice, picture FROM angebote as a
JOIN objekte as o ON a.fid_objekt = o.id
JOIN regionen as r ON a.fid_region = r.id
JOIN laender as l ON a.fid_land = l.id
WHERE l.slug="aegypten" AND a.letztes_angebot >= 1
GROUP BY a.fid_objekt ORDER BY rating DESC LIMIT 0,20
The EXPLAIN of the Query shows this:
+------+-------------+-------+--------+----------------------------+------------+---------+---------------------------------------+--------+--------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+--------+----------------------------+------------+---------+---------------------------------------+--------+--------------------------------------------------------+
| 1 | SIMPLE | l | ref | PRIMARY,slug | slug | 767 | const | 1 | Using index condition; Using temporary; Using filesort |
| 1 | SIMPLE | o | ALL | PRIMARY | NULL | NULL | NULL | 186779 | Using join buffer (flat, BNL join) |
| 1 | SIMPLE | a | ref | unique_key,letztes_angebot | unique_key | 8 | ferienhaeuser.o.id,ferienhaeuser.l.id | 1 | Using where |
| 1 | SIMPLE | r | eq_ref | PRIMARY | PRIMARY | 4 | ferienhaeuser.a.fid_region | 1 | |
+------+-------------+-------+--------+----------------------------+------------+---------+---------------------------------------+--------+--------------------------------------------------------+
So it looks like it doesn't use a key for the table objekte, and profiling says it spends 2.7s on "Copying to tmp table".
Instead of FROM angebote or JOIN objekte I tried (SELECT * GROUP BY id), but unfortunately this doesn't improve things.
The fields used for WHERE, ORDER BY and GROUP BY are also indexed.
I think I missed some basic concept here; any help will be appreciated.
Since it's most probable I made a mistake with the tables, here are their definitions:
Objekte
CREATE TABLE `objekte` (
`id` int(11) NOT NULL,
`typ` varchar(50) NOT NULL,
`persons` int(11) NOT NULL,
`children` int(11) NOT NULL,
`description` text NOT NULL,
`rating` float NOT NULL,
`totalratings` int(11) NOT NULL,
`minprice` float NOT NULL,
`picture` varchar(255) NOT NULL,
`last_offer` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `minprice` (`minprice`),
KEY `rating` (`rating`),
KEY `last_offer` (`last_offer`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Angebote
CREATE TABLE `angebote` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`fid_objekt` int(11) NOT NULL,
`fid_land` int(11) NOT NULL,
`fid_region` int(11) NOT NULL,
`fid_subregion` int(11) NOT NULL,
`letztes_angebot` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `unique_key` (`fid_objekt`,`fid_land`,`fid_region`,`fid_subregion`),
KEY `letztes_angebot` (`letztes_angebot`),
KEY `fid_objekt` (`fid_objekt`),
KEY `fid_land` (`fid_land`),
KEY `fid_region` (`fid_region`),
KEY `fid_subregion` (`fid_subregion`)
) ENGINE=InnoDB AUTO_INCREMENT=2433073 DEFAULT CHARSET=utf8
laender, regionen, subregionen (same structure)
CREATE TABLE `laender` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`iso` varchar(2) NOT NULL,
`name` varchar(255) NOT NULL,
`slug` varchar(255) NOT NULL,
`letztes_angebot` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `iso` (`iso`),
KEY `slug` (`slug`),
KEY `letztes_angebot` (`letztes_angebot`)
) ENGINE=InnoDB AUTO_INCREMENT=107 DEFAULT CHARSET=utf8
First of all, this is a non-standard GROUP BY: the select list contains non-aggregated columns that are not in the GROUP BY clause. It will stop working when you upgrade to MySQL 5.7, where the ONLY_FULL_GROUP_BY SQL mode is enabled by default.
The biggest problem is that no index is used on the objekte table. To make matters worse, you are ordering by the rating column of that table, yet its index is still not being used. A possible solution is to create a composite index like this:
CREATE INDEX objekte_idx ON objekte(id,rating);
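If the grouping is only there to return each objekt once, the query can be written so that it should remain valid under ONLY_FULL_GROUP_BY. This is a sketch under assumptions: it groups by the primary keys o.id and l.id, and it picks one region name per object with MIN(), which is an arbitrary choice made here for illustration and not something the question specifies.

```sql
SELECT MIN(r.name) AS region_name,   -- arbitrary pick when an object has several regions
       l.name      AS land_name,
       o.typ, o.id, o.persons, o.children, o.description,
       o.rating, o.totalratings, o.minprice, o.picture
FROM angebote AS a
JOIN objekte  AS o ON a.fid_objekt = o.id
JOIN regionen AS r ON a.fid_region = r.id
JOIN laender  AS l ON a.fid_land   = l.id
WHERE l.slug = 'aegypten'
  AND a.letztes_angebot >= 1
GROUP BY o.id, l.id                  -- primary keys; the other o.* and l.* columns depend on them
ORDER BY o.rating DESC
LIMIT 20;
```

Whether this is any faster still depends on the join order the optimizer picks; it only addresses the portability concern.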
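You do not need GROUP BY here, since you are not using any aggregate functions. Removing it should improve query performance, but note that without it an objekt is returned once per matching angebote row, so the result is only the same if each objekt matches a single offer. There is also no need to specify the 0 offset in LIMIT.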
SELECT r.name, l.name, o.typ, o.id, persons, children, description, rating, totalratings, minprice, picture FROM angebote as a
JOIN objekte as o ON a.fid_objekt = o.id
JOIN regionen as r ON a.fid_region = r.id
JOIN laender as l ON a.fid_land = l.id
WHERE l.slug="aegypten" AND a.letztes_angebot >= 1
ORDER BY rating DESC LIMIT 20

Complex query extremely slow when grouping or ordering - index suggestions?

So, I've been given the task of replicating functionality we currently handle via code, within MySQL.
The query below works beautifully, bringing back 245,000 rows in 40ms, however as soon as you touch it with a group or order, it takes over 6s.
Does anyone have any suggestions on what changes need making to the indexes or potentially how to modify the query to improve it?
Thanks
Without any grouping or ordering
select
s.id as sensorid,
s.sensortypeid,
COALESCE(s.pulserate, 1) as pulserate,
COALESCE(s.correctionFactor, 1) as correctionFactor,
ur.id as unitrateid,
COALESCE(ur.priceperkwh, 0) as priceperkwh,
COALESCE(ur.duosCharges, 0) as duosCharges,
IF(t.blnnonunitratecharges, t.nonunitratecharge/48, 0) as nonunitratecost,
IF(t.blnFeedIn, COALESCE(t.feedInRate, 0), 0) as feedInRate,
IF(t.blnRoc, COALESCE(t.rocRate, 0), 0) as rocRate,
from_unixtime(FLOOR(UNIX_TIMESTAMP(srs.dateTimeStamp)/(30*60))*(30*60)) as timeKey
from sensorreadings srs
inner join sensorpoints sp on (sp.id = srs.sensorpointid)
inner join sensors s on (s.id = sp.sensorid)
left join unitrates ur on ur.id = (
select
ur.id
from unitrates ur, tariffs t, companyhubs ch
where
ur.tariffid = t.id and
t.companyid = ch.companyid and
ch.hubid = s.hubid and
t.utilitytypeid = s.utilitytypeid and
(srs.dateTimeStamp between t.startdate and t.enddate) and
((time(srs.dateTimeStamp) between ur.starttime and ur.endtime) and
(ur.dayMask & POW(2, WEEKDAY(srs.dateTimeStamp)) <> 0) and
(ur.monthMask & POW(2, MONTH(srs.dateTimeStamp) - 1) <> 0))
order by
t.startdate desc,
ur.starttime desc
limit 0, 1
)
left join tariffs t on (t.id = ur.tariffid)
where
s.id = 5289
With grouping and ordering
select
s.id as sensorid,
s.sensortypeid,
COALESCE(s.pulserate, 1) as pulserate,
COALESCE(s.correctionFactor, 1) as correctionFactor,
ur.id as unitrateid,
COALESCE(ur.priceperkwh, 0) as priceperkwh,
COALESCE(ur.duosCharges, 0) as duosCharges,
IF(t.blnnonunitratecharges, t.nonunitratecharge/48, 0) as nonunitratecost,
IF(t.blnFeedIn, COALESCE(t.feedInRate, 0), 0) as feedInRate,
IF(t.blnRoc, COALESCE(t.rocRate, 0), 0) as rocRate,
min(srs.reading) as minReading,
avg(srs.reading) as avgReading,
from_unixtime(FLOOR(UNIX_TIMESTAMP(srs.dateTimeStamp)/(30*60))*(30*60)) as timeKey
from sensorreadings srs
inner join sensorpoints sp on (sp.id = srs.sensorpointid)
inner join sensors s on (s.id = sp.sensorid)
left join unitrates ur on ur.id = (
select
ur.id
from unitrates ur, tariffs t, companyhubs ch
where
ur.tariffid = t.id and
t.companyid = ch.companyid and
ch.hubid = s.hubid and
t.utilitytypeid = s.utilitytypeid and
(srs.dateTimeStamp between t.startdate and t.enddate) and
((time(srs.dateTimeStamp) between ur.starttime and ur.endtime) and
(ur.dayMask & POW(2, WEEKDAY(srs.dateTimeStamp)) <> 0) and
(ur.monthMask & POW(2, MONTH(srs.dateTimeStamp) - 1) <> 0))
order by
t.startdate desc,
ur.starttime desc
limit 0, 1
)
left join tariffs t on (t.id = ur.tariffid)
where
s.id = 5289
group by timeKey
order by timeKey desc
Schemas
CREATE TABLE `sensorreadings` (
`sensorpointid` int(11) NOT NULL DEFAULT '0',
`reading` decimal(15,5) NOT NULL,
`dateTimeStamp` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`sensorpointid`,`dateTimeStamp`),
KEY `sensormetricid` (`sensormetricid`),
KEY `sensorreadings_timestamp` (`dateTimeStamp`,`sensorpointid`),
KEY `sensorpointid` (`sensorpointid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `sensorpoints` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`sensorid` int(11) DEFAULT NULL,
`hubpointid` int(11) DEFAULT NULL,
`pointlabel` varchar(255) NOT NULL,
`pointhash` varchar(255) NOT NULL,
`target` decimal(10,0) DEFAULT NULL,
`tolerance` decimal(10,0) DEFAULT '0',
`blnlivepoint` int(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `FK_sensorpoints_sensors` (`sensorid`),
CONSTRAINT `FK_sensorpoints_sensors` FOREIGN KEY (`sensorid`) REFERENCES `sensors` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=8824 DEFAULT CHARSET=latin1;
CREATE TABLE `sensors` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`hubid` int(11) DEFAULT NULL,
`sensortypeid` int(11) NOT NULL DEFAULT '5',
`pulserate` decimal(10,6) DEFAULT NULL,
`utilitytypeid` int(11) NOT NULL DEFAULT '1',
`correctionfactor` decimal(10,3) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `FK_sensors_sensortypes` (`sensortypeid`),
KEY `FK_sensors_hubs` (`hubid`),
KEY `FK_sensors_utilitytypes` (`utilitytypeid`),
CONSTRAINT `FK_sensors_hubs` FOREIGN KEY (`hubid`) REFERENCES `hubs` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `FK_sensors_sensortypes` FOREIGN KEY (`sensortypeid`) REFERENCES `sensortypes` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
) ENGINE=InnoDB AUTO_INCREMENT=5503 DEFAULT CHARSET=latin1;
CREATE TABLE `tariffs` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`companyid` int(11) DEFAULT NULL,
`utilitytypeid` int(11) DEFAULT NULL,
`startdate` date NOT NULL,
`enddate` date NOT NULL,
`blnnonunitratecharges` int(1) DEFAULT '0',
`nonunitratecharge` decimal(16,8) DEFAULT '0.00000000',
`blnFeedIn` int(1) DEFAULT '0',
`blnRoc` int(1) DEFAULT '0',
`rocRate` decimal(16,8) DEFAULT '0.00000000',
`feedInRate` decimal(16,8) DEFAULT '0.00000000',
PRIMARY KEY (`id`),
KEY `companyid` (`companyid`,`utilitytypeid`,`startdate`,`enddate`),
KEY `startdate` (`startdate`,`enddate`),
) ENGINE=InnoDB AUTO_INCREMENT=1107 DEFAULT CHARSET=latin1;
CREATE TABLE `unitrates` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`tariffid` int(11) NOT NULL,
`priceperkwh` decimal(16,8) NOT NULL,
`starttime` time NOT NULL,
`endtime` time NOT NULL,
`duoscharges` decimal(10,5) DEFAULT NULL,
`dayMask` int(11) DEFAULT '127',
`monthMask` int(11) DEFAULT '4095',
PRIMARY KEY (`id`),
KEY `FK_unitrates_tariffs` (`tariffid`),
KEY `times` (`starttime`,`endtime`),
KEY `masks` (`dayMask`,`monthMask`),
CONSTRAINT `FK_unitrates_tariffs` FOREIGN KEY (`tariffid`) REFERENCES `tariffs` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
) ENGINE=InnoDB AUTO_INCREMENT=3104 DEFAULT CHARSET=latin1;
Explains
Without groups/ordering
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|----|--------------------|-------|--------|---------------------------------|-------------------------|---------|-------------------------------|------|----------------------------------------------|
| 1 | PRIMARY | s | const | PRIMARY | PRIMARY | 4 | const | 1 | NULL |
| 1 | PRIMARY | sp | ref | PRIMARY,FK_sensorpoints_sensors | FK_sensorpoints_sensors | 5 | const | 1 | Using index |
| 1 | PRIMARY | srs | ref | PRIMARY,sensorpointid | PRIMARY | 4 | dbnameprod.sp.id | 211 | Using index |
| 1 | PRIMARY | ur | eq_ref | PRIMARY | PRIMARY | 4 | func | 1 | Using where |
| 1 | PRIMARY | t | eq_ref | PRIMARY | PRIMARY | 4 | dbnameprod.ur.tariffid | 1 | NULL |
| 2 | DEPENDENT SUBQUERY | ch | ref | hubid | hubid | 5 | const | 1 | Using where; Using temporary; Using filesort |
| 2 | DEPENDENT SUBQUERY | t | ref | PRIMARY,companyid,startdate | companyid | 10 | dbnameprod.ch.companyid,const | 1 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | ur | ref | FK_unitrates_tariffs,times | FK_unitrates_tariffs | 4 | dbnameprod.t.id | 1 | Using where |
With ordering/grouping
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|----|--------------------|-------|--------|---------------------------------------------------------------|-------------------------|---------|-------------------------------|------|----------------------------------------------|
| 1 | PRIMARY | s | const | PRIMARY | PRIMARY | 4 | const | 1 | Using temporary; Using filesort |
| 1 | PRIMARY | sp | ref | PRIMARY,FK_sensorpoints_sensors | FK_sensorpoints_sensors | 5 | const | 1 | Using index |
| 1 | PRIMARY | srs | ref | PRIMARY,sensormetricid,sensorreadings_timestamp,sensorpointid | PRIMARY | 4 | dbnameprod.sp.id | 211 | Using index |
| 1 | PRIMARY | ur | eq_ref | PRIMARY | PRIMARY | 4 | func | 1 | Using where |
| 1 | PRIMARY | t | eq_ref | PRIMARY | PRIMARY | 4 | dbnameprod.ur.tariffid | 1 | NULL |
| 2 | DEPENDENT SUBQUERY | ch | ref | hubid | hubid | 5 | const | 1 | Using where; Using temporary; Using filesort |
| 2 | DEPENDENT SUBQUERY | t | ref | PRIMARY,companyid,startdate | companyid | 10 | dbnameprod.ch.companyid,const | 1 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | ur | ref | FK_unitrates_tariffs,times | FK_unitrates_tariffs | 4 | dbnameprod.t.id | 1 | Using where |
You are grouping and ordering by a calculated field, timeKey, and the database has no index on it.
So it has to compute the value for every row before it can group and then sort, and without an index it cannot speed that up.
Suggestion: store the time key in its own column and add an index on that column.
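As a rough sketch of that suggestion, the half-hour bucket could be stored alongside each reading and indexed together with sensorpointid. The column name timeKey and the index name sensorpoint_timekey are made up for this example, and new rows would still need the column filled in on insert (by the application or a trigger), which is not shown.

```sql
-- Add a stored half-hour bucket plus an index covering the join column and the bucket.
ALTER TABLE sensorreadings
  ADD COLUMN timeKey DATETIME NULL,
  ADD KEY sensorpoint_timekey (sensorpointid, timeKey);

-- Backfill existing rows with the same expression the query currently computes per row.
UPDATE sensorreadings
SET timeKey = FROM_UNIXTIME(FLOOR(UNIX_TIMESTAMP(dateTimeStamp) / (30 * 60)) * (30 * 60));
```

The query could then select, group by and order by srs.timeKey directly instead of the expression.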
Before looking into the performance, let's discuss the likelihood that the query is broken.
When doing a GROUP BY, all the non-aggregate SELECT values should be included in the GROUP BY. Otherwise, any random value can be delivered.
Furthermore, this pattern:
SELECT ..., AVG(a.x)
FROM a
JOIN b ON ...
GROUP BY a.id
usually leads to an inflation of the number of rows (due to the JOIN), followed by computing the aggregates over the inflated number of rows. Add COUNT(*) to see if I am right for your case. For COUNT, the answer can be blatantly wrong; for AVG it can be subtly wrong; for MIN it is probably correct. And finally the GROUP BY deflates the number of rows.
The usual cure is to compute the aggregates without the JOINs (I am not sure if it is possible in your case). Maybe something like...
...
JOIN (
SELECT min(srs.reading) as minReading,
avg(srs.reading) as avgReading,
from_unixtime(FLOOR(UNIX_TIMESTAMP(srs.dateTimeStamp)/
(30*60))*(30*60)) as timeKey
FROM sensorreadings srs
GROUP BY timeKey
) AS r
JOIN ...
It is usually a 'bad' idea to have date and time in separate columns. A DATETIME or TIMESTAMP is easier to compare against, etc. (I am unclear on what you are doing with your separate date and time.) This can also be a performance issue.
The 3 tables lead to a bunch of JOINing, making the WHERE s.id = 5289 hard to transfer to srs. You may need to rethink the schema as another performance issue.
I realize the values are different, but could
order by
t.startdate desc,
ur.starttime desc
be replaced by
order by srs.dateTimeStamp
That might make it possible to use an index.
I'm surprised you are using DECIMAL(m,n) instead of FLOAT for sensor readings.

Terrible and slow query

I have some speed problems with a query that shows the list of users in my DB.
I want to show the list of users with traffic info and the last employee who worked with each user.
DB looks like this:
users table (contains users info):
CREATE TABLE `users` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`ip` tinytext NOT NULL,
`name` varchar(64) NOT NULL,
... some other fields
PRIMARY KEY (`id`),
UNIQUE KEY `name` (`name`),
KEY `ip` (`ip`(15)) USING BTREE,
)
users_trf table (contains information about users traffic; uid - id of users from users table):
CREATE TABLE `users_trf` (
`uid` int(11) unsigned NOT NULL,
`uip` varchar(15) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
`in` bigint(20) NOT NULL DEFAULT '0',
`out` bigint(20) NOT NULL DEFAULT '0',
`test` tinyint(4) NOT NULL,
UNIQUE KEY `uid` (`uid`),
KEY `test` (`test`)
)
employees with list of all employees:
CREATE TABLE `employees` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`full_name` varchar(16) NOT NULL,
PRIMARY KEY (`id`)
)
and log table where I store data about jobs which employee did with client (uid - id of the client from users table, mid - id of employees from employees table):
CREATE TABLE `employees_log` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`uid` int(10) unsigned NOT NULL,
`mid` int(10) unsigned NOT NULL,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`note` text NOT NULL,
PRIMARY KEY (`id`)
)
My query:
SELECT SQL_CALC_FOUND_ROWS *
FROM users u
LEFT JOIN users_trf t ON u.id = t.uid
LEFT JOIN (
SELECT e2.full_name, e1.uid, e1.mid AS moid
FROM employees_log e1
LEFT JOIN employees e2 ON e1.mid = e2.id
WHERE NOT
EXISTS (
SELECT *
FROM employees_log e3
WHERE e1.uid = e3.uid
AND e1.id < e3.id
)
) e ON e.uid = u.id
LIMIT 0 , 50
It runs very slowly. I think the reason is this subquery (I'm trying to select the last employee who worked with the client):
SELECT e2.full_name, e1.uid, e1.mid AS moid
FROM employees_log e1
LEFT JOIN employees e2 ON e1.mid = e2.id
WHERE NOT
EXISTS (
SELECT *
FROM employees_log e3
WHERE e1.uid = e3.uid
AND e1.id < e3.id
)
Is it possible to speed up my query?
UPD:
I added an index with ALTER TABLE employees_log ADD INDEX ( uid, id ); and the query became 2 times faster, but can I make it faster still?
+----+--------------------+------------+--------+---------------+---------+---------+-------------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------+--------+---------------+---------+---------+-------------+-------+--------------------------+
| 1 | PRIMARY | u | ALL | NULL | NULL | NULL | NULL | 12029 | |
| 1 | PRIMARY | t | eq_ref | uid | uid | 4 | bill.u.id | 1 | |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 2239 | |
| 2 | DERIVED | e1 | ALL | NULL | NULL | NULL | NULL | 2288 | Using where |
| 2 | DERIVED | e2 | eq_ref | PRIMARY | PRIMARY | 4 | bill.e1.mid | 1 | |
| 3 | DEPENDENT SUBQUERY | e3 | ref | PRIMARY,uid | uid | 4 | bill.e1.uid | 1 | Using where; Using index |
+----+--------------------+------------+--------+---------------+---------+---------+-------------+-------+--------------------------+
First of all, ask yourself why you are using int and bigint. Do you really expect that much data? Consider smallint or mediumint: they need less storage, and as unsigned types they can still hold fairly large values. Take a look at: http://dev.mysql.com/doc/refman/5.0/en/integer-types.html
Second, you need to combine some fields into one key:
ALTER TABLE `employees_log` ADD INDEX ( `uid` , `id` ) ;
If you are creating a new MySQL table, you can specify a column to index by using the INDEX keyword. Indexes are extra structures that you can enable on your MySQL tables to increase performance.
http://www.databasejournal.com/features/mysql/article.php/1382791/Optimizing-MySQL-Queries-and-Indexes.htm
http://www.tutorialspoint.com/mysql/mysql-indexes.htm is also worth reading for the basics.
With your goal of trying to join to the log for the last employee for that user in the log table (based on the key at least), maybe just try a = <subquery> instead of a NOT EXISTS?
SELECT e2.full_name, e1.uid, e1.mid AS moid
FROM employees_log e1
LEFT JOIN employees e2 ON e1.mid = e2.id
WHERE e1.id = (
SELECT MAX(e3.id)
FROM employees_log e3
WHERE e1.uid = e3.uid
)
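Another variant, sketched here under the assumption that the highest employees_log.id per uid identifies the most recent entry (as the queries above also assume), computes the latest id per user once in an uncorrelated derived table and joins back to it. The alias last_log and the explicit column list are choices made for this example.

```sql
SELECT u.id,
       u.name,
       t.`in`,
       t.`out`,
       e2.full_name,
       e1.mid AS moid
FROM users u
LEFT JOIN users_trf t
       ON t.uid = u.id
LEFT JOIN (
    -- one row per user: the id of that user's newest log entry
    SELECT uid, MAX(id) AS last_id
    FROM employees_log
    GROUP BY uid
) last_log
       ON last_log.uid = u.id
LEFT JOIN employees_log e1
       ON e1.id = last_log.last_id
LEFT JOIN employees e2
       ON e2.id = e1.mid
LIMIT 0, 50;
```

With the (uid, id) index mentioned in the update, the derived table can likely be built from the index alone, but that is worth confirming with EXPLAIN.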
Consider adding an index on the columns mid and uid of employees_log; the EXPLAIN output suggests that this join is not using an index.
Like so: create index compound on employees_log (mid, uid)

Excluding large sets of objects from a query on a table with fast changing order

I have a table of products with a score column, which has a B-Tree index on it. I have a query which returns products that have not been shown to the user in the current session. I can't use simple pagination with LIMIT for it, because the result should be ordered by the score column, which can change between query calls.
My current solution works like this:
SELECT *
FROM products p
LEFT JOIN product_seen ps
ON (ps.session_id = ? AND p.product_id = ps.product_id )
WHERE ps.product_id is null
ORDER BY p.score DESC
LIMIT 30;
This works fine for the first few pages, but the response time grows linearly with the number of products already shown in the session and reaches one second by the time that number hits ~300. Is there a way to speed this up in MySQL? Or should I solve this problem in an entirely different way?
Edit:
These are the two tables:
CREATE TABLE `products` (
`product_id` int(15) NOT NULL AUTO_INCREMENT,
`shop` varchar(15) NOT NULL,
`shop_id` varchar(25) NOT NULL,
`shop_category_id` varchar(20) DEFAULT NULL,
`shop_subcategory_id` varchar(20) DEFAULT NULL,
`shop_designer_id` varchar(20) DEFAULT NULL,
`shop_designer_name` varchar(40) NOT NULL,
`created_at` timestamp NULL DEFAULT NULL,
`product_url` varchar(255) NOT NULL,
`name` varchar(255) NOT NULL,
`description` mediumtext NOT NULL,
`price_cents` int(10) NOT NULL,
`list_image_url` varchar(255) NOT NULL,
`list_image_height` int(4) NOT NULL,
`ending` timestamp NULL DEFAULT NULL,
`category_id` int(5) NOT NULL,
`last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`included_at` timestamp NULL DEFAULT NULL,
`hearts` int(5) NOT NULL,
`score` decimal(10,5) NOT NULL,
`rand_field` decimal(16,15) NOT NULL,
`last_score_update` timestamp NULL DEFAULT NULL,
`active` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`product_id`),
UNIQUE KEY `unique_shop_id` (`shop`,`shop_id`),
KEY `score_index` (`active`,`score`),
KEY `included_at_index` (`included_at`),
KEY `active_category_score` (`active`,`category_id`,`score`),
KEY `active_category` (`active`,`category_id`,`product_id`),
KEY `active_products` (`active`,`product_id`),
KEY `active_rand` (`active`,`rand_field`),
KEY `active_category_rand` (`active`,`category_id`,`rand_field`)
) ENGINE=InnoDB AUTO_INCREMENT=55985 DEFAULT CHARSET=utf8
CREATE TABLE `product_seen` (
`seenby_id` int(20) NOT NULL AUTO_INCREMENT,
`session_id` varchar(25) NOT NULL,
`product_id` int(15) NOT NULL,
`last_seen` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`sorting` varchar(10) NOT NULL,
`in_category` int(3) DEFAULT NULL,
PRIMARY KEY (`seenby_id`),
KEY `last_seen_index` (`last_seen`),
KEY `session_id` (`session_id`,`seenby_id`),
KEY `session_id_2` (`session_id`,`sorting`,`seenby_id`)
) ENGINE=InnoDB AUTO_INCREMENT=17431 DEFAULT CHARSET=utf8
Edit 2:
The query above is a simplification, this is the real query with EXPLAIN:
EXPLAIN SELECT
DISTINCT p.product_id AS id,
p.list_image_url AS image,
p.list_image_height AS list_height,
hearts,
active AS available,
(UNIX_TIMESTAMP( ) - ulp.last_action) AS last_loved
FROM `looksandgoods`.`products` p
LEFT JOIN `looksandgoods`.`user_likes_products` ulp
ON ( p.product_id = ulp.product_id AND ulp.user_id =1 )
LEFT JOIN `looksandgoods`.`product_seen` sb
ON (sb.session_id = 'y7lWunZKKABgMoDgzjwDjZw1'
AND sb.sorting = 'trend'
AND p.product_id = sb.product_id )
WHERE p.active =1
AND sb.product_id IS NULL
ORDER BY p.score DESC
LIMIT 30 ;
Explain output, there is still a temp table and filesort, although the keys for the join exist:
+----+-------------+-------+-------+----------------------------------------------------------------------------------------------------+------------------+---------+----------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+----------------------------------------------------------------------------------------------------+------------------+---------+----------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | p | range | score_index,active_category_score,active_category,active_products,active_rand,active_category_rand | score_index | 1 | NULL | 2299 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | ulp | ref | love_count_index,user_to_product_index,product_id | love_count_index | 9 | looksandgoods.p.product_id,const | 1 | |
| 1 | SIMPLE | sb | ref | session_id,session_id_2 | session_id | 77 | const | 711 | Using where; Not exists; Distinct |
+----+-------------+-------+-------+----------------------------------------------------------------------------------------------------+------------------+---------+----------------------------------+------+----------------------------------------------+
New answer
I think the problem with the real query is the DISTINCT clause. The implication is that either or both of the product_seen and user_likes_products tables can join multiple rows for each product_id which could potentially appear in the result set (given the somewhat disturbing lack of UNIQUE KEYs on the product_seen table), and this is the reason you've included the DISTINCT clause. Unfortunately, it also means MySQL will have to create a temp table to process the query.
Before I go any further, if it's possible to do...
ALTER TABLE product_seen ADD UNIQUE KEY (session_id, product_id, sorting);
...and...
ALTER TABLE user_likes_products ADD UNIQUE KEY (user_id, product_id);
...then the DISTINCT clause is redundant, and removing it should eliminate the problem. N.B. I'm not suggesting you necessarily need to add these keys, but rather just to confirm that these fields are always unique.
If it's not possible, then there may be another solution, but I'd need to know a lot more about the tables involved in the joins.
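If those uniqueness assumptions hold, the real query could be written without DISTINCT, for instance with NOT EXISTS for the "not yet seen" check. This is a sketch rather than a measured fix. The index name session_sorting_product is introduced here for illustration, and hearts and active are assumed to be the products columns of those names.

```sql
-- Hypothetical supporting index for the "already seen" lookup.
ALTER TABLE product_seen
  ADD KEY session_sorting_product (session_id, sorting, product_id);

SELECT p.product_id        AS id,
       p.list_image_url    AS image,
       p.list_image_height AS list_height,
       p.hearts,
       p.active            AS available,
       (UNIX_TIMESTAMP() - ulp.last_action) AS last_loved
FROM products p
LEFT JOIN user_likes_products ulp
       ON ulp.product_id = p.product_id
      AND ulp.user_id = 1
WHERE p.active = 1
  AND NOT EXISTS (
        -- skip products already shown in this session for this sorting
        SELECT 1
        FROM product_seen sb
        WHERE sb.session_id = 'y7lWunZKKABgMoDgzjwDjZw1'
          AND sb.sorting = 'trend'
          AND sb.product_id = p.product_id
      )
ORDER BY p.score DESC
LIMIT 30;
```

Without DISTINCT, MySQL may be able to read products in score_index order and stop after 30 unseen rows; checking the new EXPLAIN for "Using temporary" would confirm whether that happens.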
Old answer
An EXPLAIN for your query yields...
+----+-------------+-------+------+---------------+------------+---------+-------+------+-------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------------+---------+-------+------+-------------------------+
| 1 | SIMPLE | p | ALL | NULL | NULL | NULL | NULL | 10 | Using filesort |
| 1 | SIMPLE | ps | ref | session_id | session_id | 27 | const | 1 | Using where; Not exists |
+----+-------------+-------+------+---------------+------------+---------+-------+------+-------------------------+
...which shows it's not using an index on the products table, so it's having to do a table scan and a filesort, which is why it's slow.
I noticed there's an index on (active, score) which you could use by changing the query to only show active products...
SELECT *
FROM products p
LEFT JOIN product_seen ps
ON (ps.session_id = ? AND p.product_id = ps.product_id )
WHERE p.active=TRUE AND ps.product_id is null
ORDER BY p.score DESC
LIMIT 30;
...which changes the EXPLAIN to...
+----+-------------+-------+-------+-----------------------------+-------------+---------+-------+------+-------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+-----------------------------+-------------+---------+-------+------+-------------------------+
| 1 | SIMPLE | p | range | score_index,active_products | score_index | 1 | NULL | 10 | Using where |
| 1 | SIMPLE | ps | ref | session_id | session_id | 27 | const | 1 | Using where; Not exists |
+----+-------------+-------+-------+-----------------------------+-------------+---------+-------+------+-------------------------+
...which is now doing a range scan and no filesort, which should be much faster.
Or if you want it to also return inactive products, then you'll need to add an index on score only, with...
ALTER TABLE products ADD KEY (score);
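Once the index exists, the plan can be re-checked with EXPLAIN. The sketch below assumes the original query is the one shown above minus the p.active filter, and 'example-session' is a placeholder standing in for the bound session id.

```sql
-- Ideally the row for p now uses the new score index
-- and no longer reports "Using filesort".
EXPLAIN
SELECT *
FROM products p
LEFT JOIN product_seen ps
  ON (ps.session_id = 'example-session' AND p.product_id = ps.product_id)
WHERE ps.product_id IS NULL
ORDER BY p.score DESC
LIMIT 30;
```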

MySQL structure help for joins ( large tables)

I currently have 2 tables that are used for a select query with a simple join. The first table holds around 6-9 million rows and is used for the join. The primary table has anywhere from 1 million to 300 million rows. However, I notice that once the primary table goes above 10 million rows, the select query goes from instant to very slow (3+ seconds, and growing).
Here is my table structure and queries.
CREATE TABLE IF NOT EXISTS `links` (
`link_id` int(10) unsigned NOT NULL,
`domain_id` mediumint(7) unsigned NOT NULL,
`parent_id` int(11) unsigned DEFAULT NULL,
`hash` int(10) unsigned NOT NULL,
`url` text NOT NULL,
`type` enum('html','pdf') DEFAULT NULL,
`processed` enum('N','Y') NOT NULL DEFAULT 'N',
UNIQUE KEY `hash` (`hash`),
KEY `idx_processed` (`processed`),
KEY `domain_id` (`domain_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT;
CREATE TABLE IF NOT EXISTS `domains` (
`domain_id` mediumint(7) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(170) NOT NULL,
`blocked` enum('N','Y') NOT NULL DEFAULT 'N',
`count` mediumint(6) NOT NULL DEFAULT '0',
`mcount` mediumint(3) NOT NULL,
PRIMARY KEY (`domain_id`),
KEY `name` (`name`),
KEY `blocked` (`blocked`),
KEY `mcount` (`mcount`),
KEY `count` (`count`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=10834389 ;
Query:
(SELECT link_id, url, hash FROM links, domains WHERE links.domain_id = domains.domain_id and mcount > 1 and processed='N' limit 200)
UNION
(SELECT link_id, url, hash FROM links where processed='N' and type='html' limit 200)
Explain select:
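+----+--------------+------------+-------+-------------------------+---------------+---------+---------------------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+------------+-------+-------------------------+---------------+---------+---------------------------+---------+-------------+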
| 1 | PRIMARY | domains | range | PRIMARY,mcount | mcount | 3 | NULL | 257673 | Using where |
| 1 | PRIMARY | links | ref | idx_processed,domain_id | domain_id | 3 | crawler.domains.domain_id | 1 | Using where |
| 2 | UNION | links | ref | idx_processed | idx_processed | 1 | const | 7090017 | Using where |
| NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | |
+----+--------------+------------+-------+-------------------------+---------------+---------+---------------------------+---------+-------------+
Right now, I'm trying to split links into 20 partitions, using domain_id as the partitioning key.
Any other options would be greatly appreciated.
A single SELECT statement could replace your entire UNION statement, provided you only want links that meet the conditions of both branches:
SELECT link_id, url, hash
FROM links, domains
WHERE links.domain_id = domains.domain_id
AND mcount > 1
AND processed='N'
AND type='html'
This may not be THE answer you are looking for, but it should help you simplify your question.
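If links matching either branch of the original UNION are needed instead, something along these lines might be closer. It is only a sketch. It relies on domain_id being unique in domains (it is the primary key in the schema above), so the LEFT JOIN cannot duplicate links. It also applies one LIMIT to the combined result rather than 200 per branch. Whether it beats the UNION would need checking with EXPLAIN, since an OR spanning two tables can be hard for MySQL to optimise.

```sql
-- Links that are unprocessed AND (belong to a domain with mcount > 1 OR are html).
-- Links without a matching domain row are still kept when type = 'html'.
SELECT l.link_id, l.url, l.hash
FROM links l
LEFT JOIN domains d
  ON d.domain_id = l.domain_id
WHERE l.processed = 'N'
  AND (d.mcount > 1 OR l.type = 'html')
LIMIT 400;
```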
When things suddenly slow down, you might want to compare the size of the indexes used in the query execution with the size of the relevant MySQL buffers.
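As a rough sketch of that comparison for the MyISAM tables in the question: the schema name crawler is taken from the EXPLAIN output above, and key_buffer_size is the MyISAM index cache. For InnoDB tables the setting to compare against would be innodb_buffer_pool_size instead.

```sql
-- Index and data sizes (in MB) for the two tables involved.
SELECT table_name,
       engine,
       ROUND(index_length / 1024 / 1024) AS index_mb,
       ROUND(data_length / 1024 / 1024)  AS data_mb
FROM information_schema.tables
WHERE table_schema = 'crawler'
  AND table_name IN ('links', 'domains');

-- Size of the MyISAM key cache (in MB), to compare against index_mb.
SELECT ROUND(@@key_buffer_size / 1024 / 1024) AS key_buffer_mb;
```

If index_mb is much larger than key_buffer_mb, index blocks may be read from disk rather than from the cache, which would fit a slowdown that appears only past a certain table size.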