The query below takes a long time to execute because of the OR condition. Institution 59958 is a global institution: it can have stats of its own, and its children can have stats too, so we need usage data for rows where either institutionid or parentinstitutionid is 59958. That is why I am using the OR operator.
select date(SUBDATE(m.timestamp, INTERVAL (day(m.timestamp) -1) day)) as month,
count(*) as c,
sum(case when a.streamid = 5 then 1 else 0 end) as education,
sum(case when a.streamid in(7, 1) then 1 else 0 end) as research,
sum(case when searchterms <> '' then 1 else 0 end) as search
from stats_to_institution as s
join masterstats_innodb as m on s.statid = m.id
left join articles as a on (a.productid >= 49 and a.productid = m.article)
where m.timestamp >= '2022-01-01'
and (s.institutionid = 59958 or s.institutionid in ( select institutionid from institutions where parentinstitutionid = 59958))
group by month;
The condition below is the part that takes the time:
(s.institutionid = 59958 or s.institutionid in (select institutionid from institutions where parentinstitutionid = 59958))
I cannot use a CTE because the server is on version 5.6. Is there another way to rewrite the condition above for better performance?
If I remove s.institutionid = 59958, the query takes only 5 seconds to run, since there is no OR operator left.
Any suggestions?
The table structures are as follows:
CREATE TABLE `institutions` (
`InstitutionID` int(11) NOT NULL AUTO_INCREMENT,
`Name` varchar(200) DEFAULT NULL,
`Approved` tinyint(1) NOT NULL DEFAULT '0',
`DateAdded` datetime DEFAULT CURRENT_TIMESTAMP,
`IsAcademic` tinyint(1) DEFAULT NULL,
`IsIndustry` tinyint(1) DEFAULT NULL,
`LogoFile` varchar(50) DEFAULT NULL,
`NotifyLibEveryXRequests` int(11) DEFAULT NULL,
`IsParentInstitution` int(1) NOT NULL DEFAULT '0',
`ParentInstitutionID` int(11) DEFAULT NULL,
PRIMARY KEY (`InstitutionID`),
KEY `Institutions_Name` (`Name`),
KEY `ParentInstitutionID` (`ParentInstitutionID`),
FULLTEXT KEY `Name` (`Name`)
) ;
CREATE TABLE `masterstats_innodb` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`page` text COLLATE utf8_unicode_ci NOT NULL,
`video` int(11) NOT NULL,
`language` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`referrer` text COLLATE utf8_unicode_ci NOT NULL,
`joveuser` varchar(64) COLLATE utf8_unicode_ci DEFAULT NULL,
`timestamp` date NOT NULL DEFAULT '0000-00-00',
PRIMARY KEY (`id`,`timestamp`),
KEY `joveuser` (`joveuser`),
KEY `institutionid` (`institutionid`),
KEY `timestamp` (`timestamp`),
KEY `idx__video_timestamp` (`video`,`timestamp`)
) ;
CREATE TABLE `stats_to_institution` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`statid` int(11) NOT NULL,
`institutionid` int(11) NOT NULL,
PRIMARY KEY (`id`,`institutionid`),
UNIQUE KEY `statid_2` (`statid`,`institutionid`),
KEY `statid` (`statid`),
KEY `institutionid` (`institutionid`)
) ;
CREATE TABLE `articles` (
`ProductID` int(11) NOT NULL,
`Name` varchar(1000) DEFAULT NULL,
`Tags` varchar(1000) NOT NULL,
`D` varchar(2000) DEFAULT NULL,
`Active` tinyint(1) DEFAULT NULL,
`UserID` int(11) DEFAULT NULL,
`DateAdded` datetime DEFAULT NULL,
`Detail_Abstract` text,
`StreamID` int(11) DEFAULT NULL COMMENT '-1 = Errata, 1= Article, 2= Advertisment, 3 = Editorial, 4= Junk, 5=SE',
`DatePublished` datetime DEFAULT NULL,
`AccessType` int(11) DEFAULT NULL COMMENT '-1=Unpublished, 0=Closed, 1=Free, 2=Open, 3 = Open UK',
`Rep_Results` text,
`Stage` int(11) DEFAULT NULL,
`SectionID` int(11) DEFAULT NULL,
PRIMARY KEY (`ProductID`),
KEY `Articles_StreamID_Active_DatePublished` (`StreamID`,`Active`,`DatePublished`),
KEY `articles_idx_sectionid` (`SectionID`),
FULLTEXT KEY `DetailAbstractTest` (`Detail_Abstract`,`Name`),
FULLTEXT KEY `Materials` (`Materials`),
FULLTEXT KEY `title` (`Name`)
);
EXPLAIN result:
+----+-------------+--------------+--------+------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------+-----------------+----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+--------+------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------+-----------------+----------+----------------------------------------------+
| 1 | PRIMARY | m | ALL | PRIMARY,timestamp_video,joveuser,institutionid,video_institutionid,user_id,ip_binary,time_on_page,Article,timestamp,idx__video_timestamp | NULL | NULL | NULL | 19653526 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | a | eq_ref | PRIMARY | PRIMARY | 4 | stats.m.Article | 1 | Using where |
| 1 | PRIMARY | s | ref | statid_2,statid,institutionid | statid_2 | 4 | stats.m.id | 1 | Using where; Using index |
| 2 | SUBQUERY | institutions | ref | PRIMARY,ParentInstitutionID | ParentInstitutionID | 5 | const | 173 | Using index |
+----+-------------+--------------+--------+------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------+-----------------+----------+----------------------------------------------+
No CTE? How about a join?
select date(SUBDATE(m.timestamp, INTERVAL (day(m.timestamp) -1) day)) as month,
count(*) as c,
sum(case when a.streamid = 5 then 1 else 0 end) as education,
sum(case when a.streamid in(7, 1) then 1 else 0 end) as research,
sum(case when searchterms <> '' then 1 else 0 end) as search
from stats_to_institution as s
join (select institutionid
from institutions
where parentinstitutionid = 59958
union all select 59958
) x on x.institutionid = s.institutionid
join masterstats_innodb as m on s.statid = m.id
left join articles as a on (a.productid >= 49 and a.productid = m.article)
where m.timestamp >= '2022-01-01'
group by month;
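Another variant that may be worth trying, as a sketch only: fold the parent's own id into the subquery, so the outer WHERE has a single IN and no OR. This assumes 59958 itself has a row in institutions and has not been tested against your data, so compare the EXPLAIN output before relying on it.

```sql
-- Sketch: the OR moves inside the small institutions lookup,
-- where each branch can use an index (PRIMARY / ParentInstitutionID).
-- Assumption: institution 59958 exists as a row in institutions.
select date(SUBDATE(m.timestamp, INTERVAL (day(m.timestamp) - 1) day)) as month,
       count(*) as c,
       sum(case when a.streamid = 5 then 1 else 0 end) as education,
       sum(case when a.streamid in (7, 1) then 1 else 0 end) as research,
       sum(case when searchterms <> '' then 1 else 0 end) as search
from stats_to_institution as s
join masterstats_innodb as m on s.statid = m.id
left join articles as a on (a.productid >= 49 and a.productid = m.article)
where m.timestamp >= '2022-01-01'
  and s.institutionid in (select i.institutionid
                          from institutions as i
                          where i.parentinstitutionid = 59958
                             or i.institutionid = 59958)
group by month;
```

Whether the optimizer turns this into a semijoin or materializes the subquery depends on the server and its statistics, so treat it as one more candidate to measure rather than a guaranteed win.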
Related
I have the following entry in mysql-slow.log:
# Time: 180506 21:57:03
# User@Host: mysqlserver[mysqlserver] @ localhost []
# Query_time: 88.963476 Lock_time: 0.000088 Rows_sent: 50 Rows_examined: 114197
SET timestamp=1525633023;
SELECT n1.full_name AS sender_full_name, s1.email AS sender_email, e.subject, e.body,
e.attach, e.date, e.id, r.status, n2.full_name AS receiver_full_name,
s2.email AS receiver_email, r.basket
FROM people_emails p
JOIN email_routing r ON r.receiver_email_id = 3223 AND r.status = 2
JOIN email e ON e.id = r.message_id
JOIN people_emails s1 ON s1.id = r.sender_email_id
JOIN people n1 ON n1.id = s1.people_id
JOIN people_emails s2 ON s2.id = r.receiver_email_id
JOIN people n2 ON n2.id = s2.people_id
WHERE p.internal_user_id = 314
ORDER BY e.date desc
LIMIT 0, 50;
The result of that query is similar to this:
----------------------------------------------------------------------------------------------------
|sender_full_name|sender_email|subject|body| attach | date | id |status|receiver_full_name|basket|
----------------------------------------------------------------------------------------------------
|John Blow |jb@corp.lan |Aloha |Text| |180506|856050|2 |Mary Johns |1 |
----------------------------------------------------------------------------------------------------
Here is all the data about the query and the used tables:
EXPLAIN SELECT n1.full_name AS sender_full_name, s1.email AS sender_email,
e.subject, e.body, e.attach, e.date, e.id, r.status, n2.full_name AS receiver_full_name,
s2.email AS receiver_email, r.basket, 'user777' FROM people_emails p
JOIN email_routing r ON r.receiver_email_id = 3233 AND r.status = 2
JOIN email e ON e.id = r.message_id
JOIN people_emails s1 ON s1.id = r.sender_email_id
JOIN people n1 ON n1.id = s1.people_id
JOIN people_emails s2 ON s2.id = r.receiver_email_id
JOIN people n2 ON n2.id = s2.people_id
WHERE p.internal_user_id = 314 ORDER BY e.date desc LIMIT 0, 50;
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE s2 const PRIMARY PRIMARY 4 const 1 Using temporary; Using filesort
1 SIMPLE n2 const PRIMARY PRIMARY 4 const 1
1 SIMPLE p ALL NULL NULL NULL NULL 18631 Using where
1 SIMPLE r ALL NULL NULL NULL NULL 899567 Using where; Using join buffer
1 SIMPLE e eq_ref PRIMARY PRIMARY 4 server.r.message_id 1
1 SIMPLE s1 eq_ref PRIMARY PRIMARY 4 server.r.sender_email_id 1
1 SIMPLE n1 eq_ref PRIMARY PRIMARY 4 server.s1.people_id 1
SHOW CREATE TABLE people_emails;
CREATE TABLE `people_emails` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`nick` varchar(255) NOT NULL,
`email` varchar(255) NOT NULL,
`key_name` varchar(255) NOT NULL,
`people_id` int(11) NOT NULL,
`status` int(11) NOT NULL DEFAULT '0',
`activity` int(11) NOT NULL,
`internal_user_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
FULLTEXT KEY `email` (`email`)
) ENGINE=MyISAM AUTO_INCREMENT=22114 DEFAULT CHARSET=utf8
SHOW CREATE TABLE email_routing;
CREATE TABLE `email_routing` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`message_id` int(11) NOT NULL,
`sender_email_id` int(11) NOT NULL,
`receiver_email_id` int(11) NOT NULL,
`basket` int(11) NOT NULL,
`status` int(11) NOT NULL,
`popup` int(11) NOT NULL,
`tm` int(11) NOT NULL,
KEY `id` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=987389 DEFAULT CHARSET=utf8
SHOW CREATE TABLE email;
CREATE TABLE `email` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`subject` text NOT NULL,
`body` text NOT NULL,
`date` datetime NOT NULL,
`attach` text NOT NULL,
`attach_ondisk` text NOT NULL,
`attach_dir` varchar(255) CHARACTER SET cp1251 DEFAULT NULL,
`attach_subject` varchar(255) DEFAULT NULL,
`attach_content` longtext,
PRIMARY KEY (`id`),
KEY `Index_2` (`attach_dir`),
FULLTEXT KEY `path` (`attach_dir`)
) ENGINE=MyISAM AUTO_INCREMENT=856151 DEFAULT CHARSET=utf8
SHOW CREATE TABLE people;
CREATE TABLE `people` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`fname` varchar(255) CHARACTER SET cp1251 NOT NULL,
`lname` varchar(255) CHARACTER SET cp1251 NOT NULL,
`patronymic` varchar(255) CHARACTER SET cp1251 NOT NULL,
`gender` tinyint(1) NOT NULL,
`full_name` varchar(255) NOT NULL DEFAULT ' ',
`category` int(11) NOT NULL,
`people_type_id` int(255) DEFAULT NULL,
`tags` varchar(255) CHARACTER SET cp1251 NOT NULL,
`job` varchar(255) CHARACTER SET cp1251 NOT NULL,
`post` varchar(255) CHARACTER SET cp1251 NOT NULL,
`profession` varchar(255) CHARACTER SET cp1251 DEFAULT NULL,
`zip` varchar(16) CHARACTER SET cp1251 NOT NULL,
`country` int(11) DEFAULT NULL,
`region` varchar(10) NOT NULL,
`city` varchar(255) CHARACTER SET cp1251 NOT NULL,
`address` varchar(255) CHARACTER SET cp1251 NOT NULL,
`address_date` date DEFAULT NULL,
`inner` tinyint(4) NOT NULL,
`contact_through` varchar(255) DEFAULT '',
`next_call` date NOT NULL,
`additional` text CHARACTER SET cp1251 NOT NULL,
`user_id` int(11) NOT NULL,
`changed` datetime NOT NULL,
`status` int(11) DEFAULT NULL,
`nick` varchar(255) DEFAULT NULL,
`birthday` date DEFAULT NULL,
`last_update_ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`area` text NOT NULL,
`reviewed_` tinyint(4) NOT NULL,
`phones_old` text NOT NULL,
`post_sticker` text NOT NULL,
`permissions` int(120) NOT NULL DEFAULT '0',
`internal_user_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `most_used` (`category`,`status`,`city`,`lname`,`next_call`),
KEY `registrars` (`category`,`status`,`contact_through`,`next_call`),
FULLTEXT KEY `lname` (`lname`),
FULLTEXT KEY `fname` (`fname`),
FULLTEXT KEY `mname` (`patronymic`),
FULLTEXT KEY `Full Name` (`full_name`)
) ENGINE=MyISAM AUTO_INCREMENT=415009 DEFAULT CHARSET=utf8
While gathering the output above, as requested in a comment, I also noticed that my tables use different storage engines, MyISAM and InnoDB. Could that be part of the problem too?
Did I make the table structure too complicated? I would like to understand which part of the query makes it so slow, so that I can rearrange my tables.
In general, you want to eliminate the entries from your EXPLAIN report where type=ALL. This means it's doing a table-scan, and that's bad for performance if it happens on a large table.
In your case, you have two tables that are doing table-scans. Check the numbers in the rows column of the EXPLAIN, 18631 and 899567. Multiplied together, that is 16,759,832,777. That's how many row combinations the query will potentially examine!
Part of the problem is that your query is doing a Cartesian product. You have no conditions relating your table p to the other tables. So for every row examined in p, it combines this with the rows examined in other tables. This has a very high cost.
It's not clear why you even have p in your query, since it's not related to the other tables, and you don't fetch any columns from it in the select-list. I can produce the result set you described even when I take p out of the query:
SELECT n1.full_name AS sender_full_name, s1.email AS sender_email,
e.subject, e.body, e.attach, e.date, e.id, r.status, n2.full_name AS receiver_full_name,
s2.email AS receiver_email, r.basket, 'user777'
FROM email_routing r
JOIN email e ON e.id = r.message_id
JOIN people_emails s1 ON s1.id = r.sender_email_id
JOIN people n1 ON n1.id = s1.people_id
JOIN people_emails s2 ON s2.id = r.receiver_email_id
JOIN people n2 ON n2.id = s2.people_id
WHERE r.receiver_email_id = 3233 AND r.status = 2
ORDER BY e.date desc LIMIT 0, 50;
I also suggest adding this index:
ALTER TABLE email_routing ADD KEY bk1 (receiver_email_id, status,
sender_email_id, message_id, basket);
This helps the search for r.receiver_email_id = 3233 AND r.status = 2.
The additional columns are in the index to make it a covering index. This means the query doesn't have to read the email_routing table at all, if it gets all the columns it needs from the index.
EXPLAIN for this query looks better, now that none of the tables are doing type=ALL, and one of them shows "Using index" which is the indicator of the covering index.
+----+-------------+-------+--------+---------------+---------+---------+------------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+------------------------+------+---------------------------------+
| 1 | SIMPLE | s2 | const | PRIMARY | PRIMARY | 4 | const | 1 | Using temporary; Using filesort |
| 1 | SIMPLE | n2 | const | PRIMARY | PRIMARY | 4 | const | 1 | NULL |
| 1 | SIMPLE | r | ref | bk1 | bk1 | 8 | const,const | 1 | Using index |
| 1 | SIMPLE | s1 | eq_ref | PRIMARY | PRIMARY | 4 | test.r.sender_email_id | 1 | NULL |
| 1 | SIMPLE | n1 | eq_ref | PRIMARY | PRIMARY | 4 | test.s1.people_id | 1 | NULL |
| 1 | SIMPLE | e | eq_ref | PRIMARY | PRIMARY | 4 | test.r.message_id | 1 | NULL |
+----+-------------+-------+--------+---------------+---------+---------+------------------------+------+---------------------------------+
P.S.: MyISAM vs. InnoDB makes little difference for this query optimization. The index will help a lot for both storage engines. But I always recommend converting to InnoDB (see my answer to MyISAM versus InnoDB).
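If you do decide to convert, the statements themselves are short. This is a sketch, assuming the server is new enough for InnoDB to support FULLTEXT indexes (MySQL 5.6 or later); on older versions the FULLTEXT keys on these tables would have to be dropped first. Each statement rebuilds the table, so try it on a copy.

```sql
-- Assumption: MySQL 5.6+ so the existing FULLTEXT keys survive the conversion.
-- email_routing is already InnoDB according to its SHOW CREATE TABLE output.
ALTER TABLE people_emails ENGINE=InnoDB;
ALTER TABLE email         ENGINE=InnoDB;
ALTER TABLE people        ENGINE=InnoDB;
```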
This looks wrong:
FROM people_emails p
JOIN email_routing r ON r.receiver_email_id = 3223
AND r.status = 2
p is not used in any ON clauses. Perhaps you are missing some way to tie p and r together? Without it, you have a "cross join". If there are 1K rows in each, you end up with 1M rows in the join.
Also, please use ON for showing how tables relate; use WHERE for filtering (3223 & 2).
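For illustration only: if p was meant to restrict the result to emails received by that internal user, the query might look like the sketch below. The relation p.id = r.receiver_email_id is a guess, not something the question states; substitute the real one.

```sql
SELECT n1.full_name AS sender_full_name, s1.email AS sender_email, e.subject, e.body,
       e.attach, e.date, e.id, r.status, n2.full_name AS receiver_full_name,
       s2.email AS receiver_email, r.basket
FROM people_emails p
JOIN email_routing r ON r.receiver_email_id = p.id   -- assumed relation between p and r
JOIN email e ON e.id = r.message_id
JOIN people_emails s1 ON s1.id = r.sender_email_id
JOIN people n1 ON n1.id = s1.people_id
JOIN people_emails s2 ON s2.id = r.receiver_email_id
JOIN people n2 ON n2.id = s2.people_id
WHERE p.internal_user_id = 314          -- filters stay in WHERE
  AND r.receiver_email_id = 3223
  AND r.status = 2
ORDER BY e.date DESC
LIMIT 0, 50;
```

With every table tied to another through an ON clause, the join no longer multiplies the rows of p by the rows of r.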
The more records are added to the database, the slower the query becomes. It has gone from 1 second to a few seconds now, and as a result the web page load time is way too long.
CREATE TABLE `ads` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`user_status` enum('register','unregister') COLLATE latin1_general_ci NOT NULL DEFAULT 'register',
`title` varchar(255) COLLATE latin1_general_ci NOT NULL,
`tags` varchar(255) COLLATE latin1_general_ci NOT NULL,
`ad_type` enum('offer','want') COLLATE latin1_general_ci NOT NULL,
`price` float NOT NULL,
`image` varchar(255) COLLATE latin1_general_ci NOT NULL,
`address` varchar(255) COLLATE latin1_general_ci NOT NULL,
`google_address` varchar(255) COLLATE latin1_general_ci NOT NULL,
`country_id` int(11) NOT NULL,
`state_id` int(11) NOT NULL,
`address2` text COLLATE latin1_general_ci NOT NULL,
`city` varchar(255) COLLATE latin1_general_ci NOT NULL,
`location` int(11) NOT NULL,
`postal_code` varchar(255) COLLATE latin1_general_ci NOT NULL,
`Latitude` varchar(255) COLLATE latin1_general_ci NOT NULL,
`Longitude` varchar(255) COLLATE latin1_general_ci NOT NULL,
`working_remote` varchar(255) COLLATE latin1_general_ci NOT NULL,
`emergency_service` varchar(255) COLLATE latin1_general_ci NOT NULL,
`ad_description` text CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`cat_id` int(11) NOT NULL,
`sub_cat_id` int(11) NOT NULL,
`sub_sub_cat_id` int(11) NOT NULL,
`status` enum('0','1') COLLATE latin1_general_ci NOT NULL,
`delete_status` enum('0','1') COLLATE latin1_general_ci NOT NULL DEFAULT '0',
`publication_days` varchar(255) COLLATE latin1_general_ci NOT NULL,
`publication_total` float(11,2) NOT NULL,
`added_date` datetime NOT NULL,
`expiry_date` datetime NOT NULL,
`payment_status` enum('pending','paid','cancel') COLLATE latin1_general_ci NOT NULL,
`closed_date` datetime NOT NULL,
`deleted_date` datetime NOT NULL,
`ad_status` enum('active','closed') COLLATE latin1_general_ci NOT NULL DEFAULT 'active',
`user_first_name` varchar(255) COLLATE latin1_general_ci NOT NULL,
`user_last_name` varchar(255) COLLATE latin1_general_ci NOT NULL,
`user_phone_number` varchar(255) COLLATE latin1_general_ci NOT NULL,
`user_email_id` varchar(255) COLLATE latin1_general_ci NOT NULL,
`ads_extend_date` datetime NOT NULL,
`ads_extend_expiry_date` datetime NOT NULL,
`ads_extend_status` enum('yes','no') COLLATE latin1_general_ci NOT NULL DEFAULT 'no',
`actvation_notification` enum('yes','no') COLLATE latin1_general_ci NOT NULL DEFAULT 'no',
`ads_view_count` int(11) NOT NULL,
`md5_key` varchar(100) COLLATE latin1_general_ci NOT NULL,
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`,`user_status`,`title`,`ad_type`,`price`),
KEY `title` (`title`),
KEY `ad_type` (`ad_type`),
KEY `price` (`price`),
KEY `google_address` (`google_address`),
KEY `country_id` (`country_id`),
KEY `state_id` (`state_id`),
KEY `city` (`city`),
KEY `postal_code` (`postal_code`),
KEY `cat_id` (`cat_id`),
KEY `sub_cat_id` (`sub_cat_id`),
KEY `sub_sub_cat_id` (`sub_sub_cat_id`),
KEY `status` (`status`),
KEY `payment_status` (`payment_status`),
KEY `ad_status` (`ad_status`),
KEY `added_date` (`added_date`),
KEY `expiry_date` (`expiry_date`),
KEY `id_2` (`id`,`user_id`,`user_status`,`title`,`ad_type`,`country_id`,`state_id`,`city`,`postal_code`,`cat_id`,`sub_cat_id`,`sub_sub_cat_id`,`added_date`,`expiry_date`,`payment_status`,`ad_status`)
) ENGINE=MyISAM AUTO_INCREMENT=1208 DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci
Slow query log
# Query_time: 3.859838 Lock_time: 0.000368 Rows_sent: 340 Rows_examined: 1248768
SET timestamp=1448331158;
SELECT ads.id,ads.user_id, ads.user_status, ads.title, ads.ad_type,ads.price, ads.address, ads.google_address, ads.state_id, ads.address2, ads.city as city_id, ads. location as location_id, ads.postal_code, ads.Latitude, ads.Longitude, ads.working_remote, ads.emergency_service, ads.ad_description, ads.cat_id, ads. sub_cat_id, ads.sub_sub_cat_id, ads.status, ads.publication_total, ads.ads_view_count, ads.added_date,cat.category_name, sub_cat.category_name as sub_category_name, sub_sub_cat.category_name as sub_sub_category_name, usr.id as user_id, usr.username as user_name, usr.first_name as first_name, usr.rating as rating, adimg.thumbnail, state.state,state.state_abbr, city.city, location.location as locationname,
(SELECT added_date
FROM ads_publication as pub
WHERE pub.ad_id = ads.id
AND pub.publication_id != '0'
ORDER BY pub.sort_type ASC LIMIT 0,1) as publication_srt_id,
SQRT((((69.1*(ads.Latitude -(0)))*(69.1*(ads.Latitude -(0))))+((53*(ads.Longitude -(0)))*(53*(ads.Longitude -(0)))))) as dist_in_miles
FROM ads as ads
LEFT JOIN ads_images as adimg ON (ads.id = adimg.ad_id AND default_image = '1')
LEFT JOIN workrange as wr ON ads.user_id = wr.user_id
LEFT JOIN users as usr ON ads.user_id = usr.id
LEFT JOIN ads_service as price_list ON ads.id = price_list.ad_id
LEFT JOIN ads_publication as promot ON ads.id = promot.ad_id
LEFT JOIN user_languages as language ON ads.id = language.ad_id
LEFT JOIN categories as cat ON (ads.cat_id = cat.id AND cat.parent_category_id = 0)
LEFT JOIN categories as sub_cat ON ads.sub_cat_id = sub_cat.id
LEFT JOIN categories as sub_sub_cat ON ads.sub_sub_cat_id = sub_cat.id
LEFT JOIN location as state ON ads.state_id = state.locationId
LEFT JOIN location as city ON ads.city = city.locationId
LEFT JOIN location as location ON ads.location = location.locationId
WHERE ads.status = '1'
AND ads.payment_status = 'paid'
AND ads.delete_status = '0'
AND ads.expiry_date >= '2015-11-23 21:12:38'
AND ads.ad_status = 'active'
AND ads.ad_type = 'offer'
GROUP BY ads.id
ORDER BY ads.user_status ASC, publication_srt_id DESC, ads.added_date DESC;
Explain
id | select_type |table | type | possible_keys | key | key_len | ref | rows | Extra
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 | PRIMARY | ads | index_merge | "ad_type,status,payment_status,ad_status,expiry_dat..." | "status,ad_status" | "1,1" | NULL | 173 | "Using intersect(status,ad_status); Using where; Us..."
1 | PRIMARY | adimg | index | NULL | id | 526 | NULL | 1398 | Using index
1 | PRIMARY | wr | ALL | NULL | NULL | NULL | NULL | 75 |
1 | PRIMARY | usr | eq_ref | PRIMARY | PRIMARY | 4 | serv_co_za.ads.user_id | 1 |
1 | PRIMARY | price_list | ALL | NULL | NULL | NULL | NULL | 57 |
1 | PRIMARY | promot | ref | "ad_id,ad_id_2,ad_id_3" | ad_id_3 | 4 | serv_co_za.ads.id | 11 | Using index
1 | PRIMARY | language | ALL | NULL | NULL | NULL | NULL | 393 |
1 | PRIMARY | cat | eq_ref | "PRIMARY,id" | PRIMARY | 4 | serv_co_za.ads.cat_id | 1 |
1 | PRIMARY | sub_cat | eq_ref | "PRIMARY,id" | PRIMARY | 4 | serv_co_za.ads.sub_cat_id | 1 |
1 | PRIMARY | state | eq_ref | PRIMARY | PRIMARY | 4 | serv_co_za.ads.state_id | 1 |
1 | PRIMARY | city | eq_ref | PRIMARY | PRIMARY | 4 | serv_co_za.ads.city | 1 |
1 | PRIMARY | location | eq_ref | PRIMARY | PRIMARY | 4 | serv_co_za.ads.location | 1 |
1 | PRIMARY | sub_sub_cat | index | NULL | id | 111 | NULL | 1193 | Using index
2 | DEPENDENT SUBQUERY | pub | ref | "ad_id,ad_id_2,ad_id_3" | ad_id | 4 | func | 115 | Using where; Using filesort
Config:
key_buffer_size 33554432
max_allowed_packet 268435456
query_cache_limit 1048576
query_cache_min_res_unit 4096
query_cache_size 33554432
myisam_sort_buffer_size 16777216
sort_buffer_size 524288
thread_cache_size 4
thread_concurrency 10
interactive_timeout 28800
wait_timeout 28800
What I noticed is when this part is removed from the very end of query
GROUP BY ads.id ORDER BY ads.user_status ASC, publication_srt_id DESC, ads.added_date DESC;
query time is about 0.06 sec.
Any help or starting point is highly appreciated.
Thank you in advance,
Derek
Using intersect(status,ad_status) -- A composite index will always beat that. So add INDEX(status, ad_status). Assuming those columns are simply flags, get rid of the individual indexes on them. (Get rid of other simple indexes on other status fields.)
WHERE ads.status = '1'
AND ads.payment_status = 'paid'
AND ads.delete_status = '0'
AND ads.expiry_date >= '2015-11-23 21:12:38'
AND ads.ad_status = 'active'
AND ads.ad_type = 'offer'
For that WHERE, this is better:
INDEX(status, payment_status, delete_status, ad_status, ad_type, -- in any order
expiry_date) -- deliberately last
This will make the first step more efficient. Index Cookbook explains how I got that.
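As a concrete statement, that could look like the sketch below. The index name idx_ads_filter is made up for the example; the dropped index names are the single-column ones from the CREATE TABLE in the question, and dropping them assumes no other query depends on them.

```sql
ALTER TABLE ads
  DROP INDEX status,           -- single-column flag indexes, replaced by the composite
  DROP INDEX payment_status,
  DROP INDEX ad_status,
  DROP INDEX ad_type,
  ADD INDEX idx_ads_filter (status, payment_status, delete_status,
                            ad_status, ad_type,   -- equality columns, any order
                            expiry_date);         -- range column last
```

The equality columns come first so the index can narrow to one slice; expiry_date goes last because once a range condition is applied, columns after it in the index can no longer narrow the search.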
Remove LEFT unless the 'right' table is really optional. This could give the optimizer more choices on evaluating the query.
wr, price_list, and language need to scan ALL rows. Let's figure out why. They need indexes on user_id, ad_id, and ad_id respectively. And the datatypes must match what you are comparing to.
Don't use (M,N) (eg, float(11,2)) in FLOAT or DOUBLE, it leads to an extra rounding that could cause surprises. For currency, switch to DECIMAL(11,2) (or similar).
Don't use VARCHAR for continuous, numeric, values such as Latitude and Longitude. FLOAT or DOUBLE is good.
Consider moving to InnoDB. MyISAM is dying off.
DROP INDEX id_2 -- it is likely to serve no purpose.
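For the three table scans mentioned above (wr, price_list, language), the missing indexes could be added roughly like this. The table names are taken from the query in the question; the index names are invented, and the sketch assumes the join columns are INT like ads.id and ads.user_id, since those table definitions were not posted.

```sql
-- Assumption: user_id / ad_id in these tables are int(11), matching ads.user_id and ads.id.
ALTER TABLE workrange      ADD INDEX idx_wr_user_id    (user_id);
ALTER TABLE ads_service    ADD INDEX idx_service_ad_id (ad_id);
ALTER TABLE user_languages ADD INDEX idx_lang_ad_id    (ad_id);
```

After adding them, rerun EXPLAIN and check that the type column for those three tables no longer shows ALL.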
Possible reasons for the query getting slower and slower:
wr, price_list, and language are getting larger. The indexes should cure that.
MyISAM involves table locks.
key_buffer_size should be set to about 20% of available RAM. As the tables grow, the key_buffer may be thrashing. (Note: a different setting is needed for InnoDB.)
((Edit))
Since Latitude needs to be converted for expression evaluation, it is even more important to use some numeric datatype.
pub needs INDEX(ad_id, sort_type)
publication_id may get in the way of the above INDEX; can you get rid of the test?
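A sketch of the index for the correlated subquery, using the ad_id column that the subquery filters on; the index name is invented for the example.

```sql
-- Lets the subquery find rows for one ad and read them already ordered by sort_type,
-- which is what ORDER BY pub.sort_type ASC LIMIT 0,1 needs.
ALTER TABLE ads_publication ADD INDEX idx_pub_ad_sort (ad_id, sort_type);
```

Whether the filesort in the DEPENDENT SUBQUERY row actually disappears depends on the publication_id test, so check the EXPLAIN again afterwards.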
I have these two tables (Moodle 2.8):
CREATE TABLE `mdl_course` (
`id` bigint(10) NOT NULL AUTO_INCREMENT,
`category` bigint(10) NOT NULL DEFAULT '0',
`sortorder` bigint(10) NOT NULL DEFAULT '0',
`fullname` varchar(254) NOT NULL DEFAULT '',
`shortname` varchar(255) NOT NULL DEFAULT '',
`idnumber` varchar(100) NOT NULL DEFAULT '',
`summary` longtext,
`summaryformat` tinyint(2) NOT NULL DEFAULT '0',
`format` varchar(21) NOT NULL DEFAULT 'topics',
`showgrades` tinyint(2) NOT NULL DEFAULT '1',
`newsitems` mediumint(5) NOT NULL DEFAULT '1',
`startdate` bigint(10) NOT NULL DEFAULT '0',
`marker` bigint(10) NOT NULL DEFAULT '0',
`maxbytes` bigint(10) NOT NULL DEFAULT '0',
`legacyfiles` smallint(4) NOT NULL DEFAULT '0',
`showreports` smallint(4) NOT NULL DEFAULT '0',
`visible` tinyint(1) NOT NULL DEFAULT '1',
`visibleold` tinyint(1) NOT NULL DEFAULT '1',
`groupmode` smallint(4) NOT NULL DEFAULT '0',
`groupmodeforce` smallint(4) NOT NULL DEFAULT '0',
`defaultgroupingid` bigint(10) NOT NULL DEFAULT '0',
`lang` varchar(30) NOT NULL DEFAULT '',
`theme` varchar(50) NOT NULL DEFAULT '',
`timecreated` bigint(10) NOT NULL DEFAULT '0',
`timemodified` bigint(10) NOT NULL DEFAULT '0',
`requested` tinyint(1) NOT NULL DEFAULT '0',
`enablecompletion` tinyint(1) NOT NULL DEFAULT '0',
`completionnotify` tinyint(1) NOT NULL DEFAULT '0',
`cacherev` bigint(10) NOT NULL DEFAULT '0',
`calendartype` varchar(30) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `mdl_cour_cat_ix` (`category`),
KEY `mdl_cour_idn_ix` (`idnumber`),
KEY `mdl_cour_sho_ix` (`shortname`),
KEY `mdl_cour_sor_ix` (`sortorder`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `mdl_log` (
`id` bigint(10) NOT NULL AUTO_INCREMENT,
`time` bigint(10) NOT NULL DEFAULT '0',
`userid` bigint(10) NOT NULL DEFAULT '0',
`ip` varchar(45) NOT NULL DEFAULT '',
`course` bigint(10) NOT NULL DEFAULT '0',
`module` varchar(20) NOT NULL DEFAULT '',
`cmid` bigint(10) NOT NULL DEFAULT '0',
`action` varchar(40) NOT NULL DEFAULT '',
`url` varchar(100) NOT NULL DEFAULT '',
`info` varchar(255) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `mdl_log_coumodact_ix` (`course`,`module`,`action`),
KEY `mdl_log_tim_ix` (`time`),
KEY `mdl_log_act_ix` (`action`),
KEY `mdl_log_usecou_ix` (`userid`,`course`),
KEY `mdl_log_cmi_ix` (`cmid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
And this query:
SELECT l.id,
l.userid AS participantid,
l.course AS courseid,
l.time,
l.ip,
l.action,
l.info,
l.module,
l.url
FROM mdl_log l
INNER JOIN mdl_course c ON l.course = c.id AND c.category <> 0
WHERE
l.id > [some large id]
AND
l.time > [some unix timestamp]
ORDER BY l.id ASC
LIMIT 0,200
The mdl_log table has over 200 million records, and I need to export it to a file using PHP without dying in the attempt. The main problem is that the query is too slow. The main killer is the join to the mdl_course table; if I remove it, everything runs fast.
Here is the explain:
+----+-------------+-------+-------+---------------------------------------------+----------------------+---------+----------------+------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------------------------------------+----------------------+---------+----------------+------+-----------------------------------------------------------+
| 1 | SIMPLE | c | range | PRIMARY,mdl_cour_cat_ix | mdl_cour_cat_ix | 8 | NULL | 3152 | Using where; Using index; Using temporary; Using filesort |
| 1 | SIMPLE | l | ref | PRIMARY,mdl_log_coumodact_ix,mdl_log_tim_ix | mdl_log_coumodact_ix | 8 | xray2qasb.c.id | 618 | Using index condition; Using where |
+----+-------------+-------+-------+---------------------------------------------+----------------------+---------+----------------+------+-----------------------------------------------------------+
Is there any way to remove usage of temporary and filesort? What do you propose here?
After some testing this query works fast as expected:
SELECT l.id,
l.userid AS participantid,
l.course AS courseid,
l.time,
l.ip,
l.action,
l.info,
l.module,
l.url
FROM mdl_log l
WHERE
l.id > 123456
AND
l.time > 1234
AND
EXISTS (SELECT * FROM mdl_course c WHERE l.course = c.id AND c.category <> 0 )
ORDER BY l.id ASC
LIMIT 0,200
Thanks to JamieD77 for his suggestion!
execution plan:
+----+--------------------+-------+--------+-------------------------+---------+---------+--------------------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+--------+-------------------------+---------+---------+--------------------+----------+-------------+
| 1 | PRIMARY | l | range | PRIMARY,mdl_log_tim_ix | PRIMARY | 8 | NULL | 99962199 | Using where |
| 2 | DEPENDENT SUBQUERY | c | eq_ref | PRIMARY,mdl_cour_cat_ix | PRIMARY | 8 | xray2qasb.l.course | 1 | Using where |
+----+--------------------+-------+--------+-------------------------+---------+---------+--------------------+----------+-------------+
Try moving the category selection outside the JOIN. Here I put it in an IN() which the engine will cache on successive runs. I don't have 200M rows to test on, so YMMV.
DESCRIBE
SELECT l.id,
l.userid AS participantid,
l.course AS courseid,
l.time,
l.ip,
l.action,
l.info,
l.module,
l.url
FROM mdl_log l
WHERE
l.id > 1234567890
AND
l.time > 1234567890
AND
l.course IN (SELECT c.id FROM mdl_course c WHERE c.category > 0)
ORDER BY l.id ASC
LIMIT 0,200;
(In addition to using EXISTS...)
l.id > 123456 AND l.time > 1234
seems to beg for a 2-dimensional index.
99962199 -- the table is very big, correct?
Consider PARTITION BY RANGE on mdl_log on time. But...
Don't have more than about 50 partitions; other inefficiencies kick in then.
Partitioning probably won't help if id and time are sorta in lock-step. Typical case: id is AUTO_INCREMENT and time is approximately the time of the INSERT.
If that applies, consider:
PRIMARY KEY(time, id) -- see below
INDEX(id) -- Yes, this is sufficient for `id AUTO_INCREMENT`.
With those indexes, you could efficiently do
WHERE time > ...
ORDER BY time, id
which is probably what you really wanted.
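Spelled out, the key change and a chunked export query might look like the sketch below. It is only an outline: the index name idx_log_id and the user variables @last_time and @last_id (the time and id of the last row from the previous chunk) are made up for the example, and rebuilding the primary key on a table of this size will take a long time, so test on a copy first.

```sql
-- One ALTER so that id always remains the first column of some index,
-- which AUTO_INCREMENT requires.
ALTER TABLE mdl_log
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (`time`, id),
  ADD INDEX idx_log_id (id);

-- Fetch the next chunk of 200 rows after the last row already exported.
SELECT l.id, l.userid AS participantid, l.course AS courseid,
       l.time, l.ip, l.action, l.info, l.module, l.url
FROM mdl_log l
WHERE l.time >= @last_time
  AND (l.time > @last_time OR l.id > @last_id)
  AND EXISTS (SELECT * FROM mdl_course c
              WHERE c.id = l.course AND c.category <> 0)
ORDER BY l.time, l.id
LIMIT 200;
```

Because the rows are read in primary key order, each chunk starts where the previous one stopped instead of rescanning from the beginning.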
I have this MySQL query, which should be correct syntax:
select c.cat_id,c.cat_name as cat_name,
c.cat_desc, c.cat_image, mi.filename,
l.link_id, l.user_id, l.address,l.city,
l.country,l.link_created,l.link_desc,
l.email,l.fax,l.link_hits, l.link_modified,
l.link_name,l.postcode, l.price,l.link_rating,
l.state,l.telephone,l.link_votes,
l.website, l.link_id, l.link_visited, cf.value
from j25_mt_cats as c,
j25_mt_links as l
LEFT OUTER JOIN j25_mt_cfvalues AS cf ON (cf.link_id = l.link_id),
j25_mt_images AS mi,
j25_mt_cl as cl
UNION ALL
select c.cat_id,c.cat_name as cat_name,
c.cat_desc, c.cat_image, mi.filename,
l.link_id, l.user_id, l.address,l.city,
l.country,l.link_created,l.link_desc,
l.email,l.fax,l.link_hits, l.link_modified,
l.link_name,l.postcode, l.price,l.link_rating,
l.state,l.telephone,l.link_votes,
l.website, l.link_id, l.link_visited, cf.value
FROM j25_mt_cats as c,
j25_mt_links as l
RIGHT OUTER JOIN j25_mt_cfvalues AS cf ON cf.link_id = l.link_id,
j25_mt_images AS mi,
j25_mt_cl as cl
where cf.cf_id = 40 and cl.link_id = l.link_id
AND mi.link_id = l.link_id AND mi.ordering < 2
AND c.cat_id = cl.cat_id and c.cat_published = 1
AND c.cat_approved = 1 and l.link_published = 1 and l.link_approved = 1
AND cf.link_id IS NULL;
The query eats up 3GB+ in the tmp directory and ends up timing out. I'm missing something here; how can I increase the efficiency? My goal was just to add onto an existing query to grab a value from an additional table (j25_mt_cfvalues).
explain:
+----+--------------+------------+-------+---------------+---------+---------+--------------------------+------+------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+------------+-------+---------------+---------+---------+--------------------------+------+------------------+
| 1 | PRIMARY | mi | ALL | NULL | NULL | NULL | NULL | 165 | |
| 1 | PRIMARY | c | ALL | NULL | NULL | NULL | NULL | 301 | |
| 1 | PRIMARY | l | ALL | NULL | NULL | NULL | NULL | 2139 | |
| 1 | PRIMARY | cf | ref | link_id | link_id | 4 | db_table.l.link_id | 2 | |
| 1 | PRIMARY | cl | index | NULL | PRIMARY | 4 | NULL | 2742 | Using index |
| 2 | UNION | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Impossible WHERE |
| NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | |
+----+--------------+------------+-------+---------------+---------+---------+--------------------------+------+------------------+
j25_mt_cats schema:
CREATE TABLE j25_mt_cats (
cat_id int(11) NOT NULL auto_increment,
cat_name varchar(255) NOT NULL,
alias varchar(255) NOT NULL,
title varchar(255) NOT NULL,
cat_desc text NOT NULL,
cat_parent int(11) NOT NULL default '0',
cat_links int(11) NOT NULL default '0',
cat_cats int(11) NOT NULL default '0',
cat_featured tinyint(4) NOT NULL default '0',
cat_image varchar(255) NOT NULL,
cat_published tinyint(4) NOT NULL default '0',
cat_created datetime NOT NULL default '0000-00-00 00:00:00',
cat_approved tinyint(4) NOT NULL default '0',
cat_template varchar(255) NOT NULL default '',
cat_usemainindex tinyint(4) NOT NULL default '0',
cat_allow_submission tinyint(4) NOT NULL default '1',
cat_show_listings tinyint(3) unsigned NOT NULL default '1',
metakey text NOT NULL,
metadesc text NOT NULL,
ordering int(11) NOT NULL default '0',
lft int(11) NOT NULL default '0',
rgt int(11) NOT NULL default '0',
PRIMARY KEY (cat_id),
KEY cat_id (cat_id,cat_published,cat_approved),
KEY cat_parent (cat_parent,cat_published,cat_approved,cat_cats,cat_links),
KEY dtree (cat_published,cat_approved),
KEY lft_rgt (lft,rgt),
KEY func_getPathWay (lft,rgt,cat_id,cat_parent),
KEY alias (alias)
) ENGINE=MyISAM AUTO_INCREMENT=3851 DEFAULT CHARSET=utf8
j25_mt_links schema:
CREATE TABLE j25_mt_links (
link_id int(11) NOT NULL auto_increment,
link_name varchar(255) NOT NULL,
alias varchar(255) NOT NULL,
link_desc mediumtext NOT NULL,
user_id int(11) NOT NULL default '0',
link_hits int(11) NOT NULL default '0',
link_votes int(11) NOT NULL default '0',
link_rating decimal(7,6) unsigned NOT NULL default '0.000000',
link_featured smallint(6) NOT NULL default '0',
link_published tinyint(4) NOT NULL default '0',
link_approved int(4) NOT NULL default '0',
link_template varchar(255) NOT NULL,
attribs text NOT NULL,
metakey text NOT NULL,
metadesc text NOT NULL,
internal_notes text NOT NULL,
ordering int(11) NOT NULL default '0',
link_created datetime NOT NULL default '0000-00-00 00:00:00',
publish_up datetime NOT NULL default '0000-00-00 00:00:00',
publish_down datetime NOT NULL default '0000-00-00 00:00:00',
link_modified datetime NOT NULL default '0000-00-00 00:00:00',
link_visited int(11) NOT NULL default '0',
address varchar(255) NOT NULL,
city varchar(255) NOT NULL,
state varchar(255) NOT NULL,
country varchar(255) NOT NULL,
postcode varchar(255) NOT NULL,
telephone varchar(255) NOT NULL,
fax varchar(255) NOT NULL,
email varchar(255) NOT NULL,
website varchar(255) NOT NULL,
price double(9,2) NOT NULL default '0.00',
lat float(10,6) NOT NULL COMMENT 'Latitude',
lng float(10,6) NOT NULL COMMENT 'Longitude',
zoom tinyint(3) unsigned NOT NULL COMMENT 'Map''s zoom level',
PRIMARY KEY (link_id),
KEY link_rating (link_rating),
KEY link_votes (link_votes),
KEY link_name (link_name),
KEY publishing (link_published,link_approved,publish_up,publish_down),
KEY count_listfeatured (link_published,link_approved,link_featured,publish_up,publish_down,link_id),
KEY count_viewowner (link_published,link_approved,user_id,publish_up,publish_down),
KEY mylisting (user_id,link_id),
FULLTEXT KEY link_name_desc (link_name,link_desc)
) ENGINE=MyISAM AUTO_INCREMENT=3229 DEFAULT CHARSET=utf8
j25_mt_cfvalues schema:
CREATE TABLE j25_mt_cfvalues (
id int(11) NOT NULL auto_increment,
cf_id int(11) NOT NULL,
link_id int(11) NOT NULL,
value mediumtext NOT NULL,
attachment int(10) unsigned NOT NULL default '0',
counter int(11) NOT NULL default '0',
PRIMARY KEY (id),
KEY cf_id (cf_id,link_id),
KEY link_id (link_id),
KEY value (value(8))
) ENGINE=MyISAM AUTO_INCREMENT=20876 DEFAULT CHARSET=utf8
The problem is that the first SELECT of your UNION does not have any WHERE criteria, so it produces a Cartesian product across EVERY table you are working with. The WHERE clause is applied only to the second SELECT.
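To get a feel for the size of that Cartesian product, the rows column of the PRIMARY lines in the EXPLAIN output can be multiplied together. This is only a rough estimate of how many row combinations the server has to work through (the optimizer's row counts are themselves estimates), sketched here in Python:

```python
from math import prod

# "rows" values copied from the PRIMARY lines of the EXPLAIN in the question
explain_rows = {"mi": 165, "c": 301, "l": 2139, "cf": 2, "cl": 2742}


def estimated_combinations(rows_by_table):
    """Multiply the per-table row estimates of a nested-loop join."""
    return prod(rows_by_table.values())


if __name__ == "__main__":
    print(f"{estimated_combinations(explain_rows):,} row combinations (estimate)")
```

An estimate in the hundreds of billions is consistent with the temp directory filling up and the query timing out.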
That said, you had both a LEFT and a RIGHT join to the cf table, but you can't have both cf.cf_id = 40 and cf.link_id IS NULL on the same row (this is why EXPLAIN reports "Impossible WHERE" for the second SELECT), so I simplified to a single LEFT JOIN on link_id AND cf_id = 40. If there IS a record in the cf table, it only shows if its cf_id is 40; any other value is ignored.
With that, your query can be simplified down to a single query. I also changed to JOIN syntax instead of WHERE so you and others can see the relations between the tables instead of guessing.
select
(all your fields)
from
j25_mt_cats as c
JOIN j25_mt_cl as cl
ON c.cat_id = cl.cat_id
JOIN j25_mt_links as l
ON cl.link_id = l.link_id
AND l.link_published = 1
AND l.link_approved = 1
JOIN j25_mt_images AS mi
ON l.link_id = mi.link_id
AND mi.ordering < 2
LEFT OUTER JOIN j25_mt_cfvalues AS cf
ON l.link_id = cf.link_id
AND cf.cf_id = 40
where
c.cat_published = 1
AND c.cat_approved = 1
ORDER BY
RAND() DESC;
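The reason cf.cf_id = 40 sits in the ON clause rather than in WHERE can be checked on toy data. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption for runnability); the links and cfvalues tables are cut-down, hypothetical versions of j25_mt_links and j25_mt_cfvalues.

```python
import sqlite3


def build_demo():
    """Three links: one with cf_id 40, one with another cf_id, one with none."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE links (link_id INTEGER PRIMARY KEY, link_name TEXT);
        CREATE TABLE cfvalues (
            id INTEGER PRIMARY KEY, cf_id INTEGER, link_id INTEGER, value TEXT
        );
        INSERT INTO links VALUES
            (1, 'has cf 40'), (2, 'has other cf'), (3, 'no cf rows');
        INSERT INTO cfvalues (cf_id, link_id, value) VALUES
            (40, 1, 'wanted'), (39, 2, 'ignored');
    """)
    return conn


# Filter in ON: every link survives, cf.value is NULL unless cf_id = 40.
FILTER_IN_ON = """
    SELECT l.link_id, cf.value
    FROM links AS l
    LEFT OUTER JOIN cfvalues AS cf
           ON l.link_id = cf.link_id AND cf.cf_id = 40
    ORDER BY l.link_id
"""

# Filter in WHERE: rows without a cf_id = 40 match are thrown away,
# so the LEFT JOIN behaves like an inner join.
FILTER_IN_WHERE = """
    SELECT l.link_id, cf.value
    FROM links AS l
    LEFT OUTER JOIN cfvalues AS cf ON l.link_id = cf.link_id
    WHERE cf.cf_id = 40
    ORDER BY l.link_id
"""

if __name__ == "__main__":
    conn = build_demo()
    print(conn.execute(FILTER_IN_ON).fetchall())
    print(conn.execute(FILTER_IN_WHERE).fetchall())
```

With the filter in ON, links 2 and 3 are still returned with a NULL value; with the filter in WHERE, only link 1 remains.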
To help optimize the query, your tables should have these indexes:
j25_mt_cats on (cat_published, cat_approved) (the existing dtree key already covers this)
j25_mt_cl on (cat_id)
j25_mt_links on (link_id, link_published, link_approved)
j25_mt_images on (link_id, ordering)
j25_mt_cfvalues on (link_id, cf_id)
My SQL (with sub-queries) takes very long (nearly 24 hours). Is using sub-queries bad for performance?
My table is as below:
mysql> show create table eventnew;
CREATE TABLE `eventnew` (
`id` int(50) NOT NULL AUTO_INCREMENT,
`date` datetime DEFAULT NULL,
`src_ip` int(10) unsigned DEFAULT NULL,
`src_port` int(10) unsigned DEFAULT NULL,
`dst_ip` int(10) unsigned DEFAULT NULL,
`dst_port` int(10) unsigned DEFAULT NULL,
`repo_ip` varchar(50) DEFAULT NULL,
`link` varchar(50) DEFAULT NULL,
`binary_hash` varchar(50) DEFAULT NULL,
`sensor_id` varchar(50) DEFAULT NULL,
`repox_ip` int(10) unsigned DEFAULT NULL,
`flags` varchar(50) DEFAULT NULL,
`shellcode` varchar(1000) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `date` (`date`),
KEY `sensor_id` (`sensor_id`),
KEY `src_ip` (`src_ip`)
) ENGINE=MyISAM AUTO_INCREMENT=883278 DEFAULT CHARSET=latin1
My SQL is as below:
SELECT COUNT( DISTINCT binary_hash ) AS cnt
FROM eventnew
WHERE DATE >= '2010-10-16'
AND DATE < '2010-10-17'
AND binary_hash NOT
IN (
SELECT DISTINCT binary_hash
FROM eventnew
WHERE DATE < '2010-10-16'
AND binary_hash IS NOT NULL
)
Below is the result of running EXPLAIN:
+----+--------------------+----------+-------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+----------+-------+---------------+------+---------+------+--------+-------------+
| 1 | PRIMARY | eventnew | range | date | date | 9 | NULL | 14296 | Using where |
| 2 | DEPENDENT SUBQUERY | eventnew | range | date | date | 9 | NULL | 384974 | Using where |
+----+--------------------+----------+-------+---------------+------+---------+------+--------+-------------+
Using subqueries certainly can affect your performance. For instance, let's say table T1 has n records and T2 has m records. Without a usable index, a join on T1 and T2 has to consider up to n*m row combinations and filter them by your condition. The same goes for the IN keyword: your EXPLAIN shows a DEPENDENT SUBQUERY, which means the inner SELECT (about 384974 rows) is re-evaluated for each of the roughly 14296 outer rows. Additional constraints in the subquery can reduce efficiency further. That said, subqueries can't always be avoided in practice.
I'd suggest you use NOT EXISTS instead of NOT IN.
Try this:
SELECT COUNT( DISTINCT a.binary_hash ) AS cnt
FROM eventnew a left join eventnew b on (a.binary_hash=b.binary_hash AND b.binary_hash IS NOT NULL AND b.DATE < '2010-10-16')
WHERE a.DATE >= '2010-10-16'
AND a.DATE < '2010-10-17'
and b.date is null
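Before swapping one form for another, it is worth checking that NOT IN, NOT EXISTS and LEFT JOIN ... IS NULL return the same count. The sketch below does that on a few made-up rows, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption for runnability); it checks results only, not speed. On the real table, the join and NOT EXISTS forms would probably also need an index on binary_hash, which the posted schema does not have.

```python
import sqlite3

NOT_IN = """
    SELECT COUNT(DISTINCT binary_hash) FROM eventnew
    WHERE date >= '2010-10-16' AND date < '2010-10-17'
      AND binary_hash NOT IN (
          SELECT binary_hash FROM eventnew
          WHERE date < '2010-10-16' AND binary_hash IS NOT NULL)
"""

NOT_EXISTS = """
    SELECT COUNT(DISTINCT a.binary_hash) FROM eventnew a
    WHERE a.date >= '2010-10-16' AND a.date < '2010-10-17'
      AND NOT EXISTS (
          SELECT 1 FROM eventnew b
          WHERE b.binary_hash = a.binary_hash AND b.date < '2010-10-16')
"""

LEFT_JOIN = """
    SELECT COUNT(DISTINCT a.binary_hash) FROM eventnew a
    LEFT JOIN eventnew b
           ON a.binary_hash = b.binary_hash AND b.date < '2010-10-16'
    WHERE a.date >= '2010-10-16' AND a.date < '2010-10-17'
      AND b.date IS NULL
"""


def build_demo():
    """Toy eventnew table; hashes and timestamps are made up."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE eventnew (id INTEGER PRIMARY KEY, date TEXT, binary_hash TEXT)"
    )
    conn.executemany(
        "INSERT INTO eventnew (date, binary_hash) VALUES (?, ?)",
        [
            ("2010-10-10 00:00:00", "old1"),
            ("2010-10-15 00:00:00", None),    # old row without a hash
            ("2010-10-16 01:00:00", "old1"),  # seen before, not new
            ("2010-10-16 02:00:00", "new1"),
            ("2010-10-16 03:00:00", "new1"),  # duplicate of a new hash
            ("2010-10-16 04:00:00", "new2"),
            ("2010-10-16 05:00:00", None),    # NULL hash inside the window
            ("2010-10-17 00:00:00", "new3"),  # outside the window
        ],
    )
    return conn


def count(conn, sql):
    return conn.execute(sql).fetchone()[0]


if __name__ == "__main__":
    conn = build_demo()
    for name, sql in [("NOT IN", NOT_IN), ("NOT EXISTS", NOT_EXISTS), ("LEFT JOIN", LEFT_JOIN)]:
        print(name, count(conn, sql))
```

All three queries count only new1 and new2 here. Rows with a NULL binary_hash never add to the result because COUNT(DISTINCT ...) skips NULLs.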