Can you please help me optimize this query. I use this query to get a list of friends along with their details and their status.
It takes about 0.08 secs to process this on a Athlon X2 6000
I cant use materizlized view as well because this is frequently changing.
SELECT p.userid, p.firstname, p.lastname, p.gender, p.dob, x.relationship,
IF(p.picture !=1,
IF(p.gender != 'm','/sc/f-t.jpg','/sc/m-t.jpg'),
concat('/sc/pthumb/', p.userid, '.jpg' )) AS picture
FROM `social` AS p
LEFT JOIN `friendlist` AS f1 ON (f1.`userid` = p.`userid` AND f1.`friendid` = 1 AND `f1`.`status` = 1)
LEFT JOIN `friendlist` AS f2 ON (f2.`friendid` = p.`userid` AND f2.`userid` = 1 AND `f2`.`status` = 1)
LEFT JOIN `x_relationship` AS x ON (x.`id` = p.`relationship`)
LEFT JOIN `auth` as a ON (a.`userid` = p.`userid`)
WHERE 1 AND (f1.`userid` IS NOT NULL OR f2.`userid` IS NOT NULL AND ((a.`banned` != 1 AND a.`deleted` != 1)))
ORDER BY RAND() LIMIT 0,10
This is your original query, formatted. Below it are a few thoughts.
SELECT
p.userid,
p.firstname,
p.lastname,
p.gender,
p.dob,
x.relationship,
IF(p.picture !=1, IF(p.gender != 'm', '/sc/f-t.jpg', '/sc/m-t.jpg'), concat('/sc/pthumb/', p.userid, '.jpg' )) AS picture
FROM
`social` AS p
LEFT JOIN `friendlist` AS f1 ON (f1.`friendid` = 1 AND f1.`userid` = p.`userid` AND `f1`.`status` = 1)
LEFT JOIN `friendlist` AS f2 ON (f2.`friendid` = p.`userid` AND f2.`userid` = 1 AND `f2`.`status` = 1)
LEFT JOIN `x_relationship` AS x ON (x.`id` = p.`relationship`)
LEFT JOIN `auth` AS a ON (a.`userid` = p.`userid`)
WHERE
1
AND (
f1.`userid` IS NOT NULL
OR f2.`userid` IS NOT NULL
AND (a.`banned` != 1 AND a.`deleted` != 1)
)
ORDER BY
RAND()
LIMIT
0,10
Think if some of your left joins could be inner joins. If so, change them.
Your WHERE clause is messed up. You are mixing AND and OR without properly prioritizing with parentheses. Try:
WHERE
a.`banned` != 1
AND a.`deleted` != 1
AND (
f1.`userid` IS NOT NULL
OR f2.`userid` IS NOT NULL
)
You could also try:
INNER JOIN `auth` AS a ON (
a.`userid` = p.`userid`
AND a.`banned` != 1
AND a.`deleted` != 1
)
WHERE
f1.`userid` IS NOT NULL
OR f2.`userid` IS NOT NULL
You seem to be joining against friendlist solely to check for record existence. You might want to give this a try as well:
FROM
`social` AS p
INNER JOIN `auth` AS a ON (
a.`userid` = p.`userid`
AND a.`banned` != 1
AND a.`deleted` != 1
)
LEFT JOIN `x_relationship` AS x ON (
x.`id` = p.`relationship`
)
WHERE
EXISTS (
SELECT 1 FROM `friendlist` WHERE `friendid` = 1 AND `userid` = p.`userid` AND `status` = 1
)
OR EXISTS (
SELECT 1 FROM `friendlist` WHERE `friendid` = p.`userid` AND `userid` = 1 AND `status` = 1
)
ORDER BY RAND() might not be the fastest thing on earth, but if this is what you need... Try ordering by a indexed column to see how much impact ORDER BY RAND() has.
Indexing:
You should create a composite index on friendlist(friendid, userid, status).
Make sure there is an index on relationship(id)
Make sure there is an index on auth(userid)
Related
First question on here, so I apologise if I have missed something previously asked, or don't format this well....
My company has a custom CRM + database that I am attempting to improve. We need to find a list of properties that will not have their yearly service renewed. Currently, we do this by first finding properties that will have their service renewed, which is done with the following query:
SELECT DISTINCT
j.`property_id`
FROM
`jobs` AS j
LEFT JOIN `property` AS p
ON j.`property_id` = p.`property_id`
LEFT JOIN `agency` AS a
ON p.`agency_id` = a.`agency_id`
INNER JOIN `property_services` AS ps
ON (
j.`property_id` = ps.`property_id`
AND j.`service` = ps.`alarm_job_type_id`
)
WHERE ps.`service` = 1
AND a.`country_id` = 1
AND (
j.`status` = 'Pending'
OR j.`date` IS NULL
OR j.`date` = '0000-00-00'
OR j.`job_type` = 'Once-off'
OR j.`job_type` = '240v Rebook'
OR (
j.`date` >= '2019-04-22'
AND j.`job_type` = 'Yearly Maintenance'
)
)
Then we find the details we want to display for the user, excluding other items in the process:
SELECT DISTINCT
j.`property_id`,
p.`address_1` AS p_address1,
p.`address_2` AS p_address2,
p.`address_3` AS p_address3,
p.`state` AS p_state,
p.`postcode` AS p_postcode,
a.`agency_id`,
a.`agency_name`
FROM
`jobs` AS j
LEFT JOIN `property` AS p
ON j.`property_id` = p.`property_id`
LEFT JOIN `agency` AS a
ON p.`agency_id` = a.`agency_id`
INNER JOIN `property_services` AS ps
ON (
j.`property_id` = ps.`property_id`
AND j.`service` = ps.`alarm_job_type_id`
)
WHERE p.`property_id` NOT IN (INSERT HERE THE IDS YOU GOT FROM THE FIRST QUERY)
AND ps.`service` = 1
AND p.`deleted` = 0
AND p.`agency_deleted` = 0
AND a.`status` = 'active'
AND a.`country_id` = 1
ORDER BY j.`property_id` DESC
LIMIT 0, 50
Ideally, I'd like to combine these queries, or somehow optimise them, as the page currently takes 2+ minutes to load, even with indexing.
Again, my apologies, I am not a database or query expert, pretty sure the degree only included one or two subjects on the matter!
For the record (in case someone would find it helpful), the combined version is:
SELECT DISTINCT
j.`property_id`,
p.`address_1` AS p_address1,
p.`address_2` AS p_address2,
p.`address_3` AS p_address3,
p.`state` AS p_state,
p.`postcode` AS p_postcode,
a.`agency_id`,
a.`agency_name`
FROM
`jobs` AS j
LEFT JOIN `property` AS p
ON j.`property_id` = p.`property_id`
LEFT JOIN `agency` AS a
ON p.`agency_id` = a.`agency_id`
INNER JOIN `property_services` AS ps
ON (
j.`property_id` = ps.`property_id`
AND j.`service` = ps.`alarm_job_type_id`
)
WHERE p.`property_id` NOT IN
(SELECT DISTINCT
j.`property_id`
FROM
`jobs` AS j
LEFT JOIN `property` AS p
ON j.`property_id` = p.`property_id`
LEFT JOIN `agency` AS a
ON p.`agency_id` = a.`agency_id`
INNER JOIN `property_services` AS ps
ON (
j.`property_id` = ps.`property_id`
AND j.`service` = ps.`alarm_job_type_id`
)
WHERE ps.`service` = 1
AND a.`country_id` = 1
AND (
j.`status` = 'Pending'
OR j.`date` IS NULL
OR j.`date` = '0000-00-00'
OR j.`job_type` = 'Once-off'
OR j.`job_type` = '240v Rebook'
OR (
j.`date` >= '2019-05-08'
AND j.`job_type` = 'Yearly Maintenance'
)
))
AND ps.`service` = 1
AND p.`deleted` = 0
AND p.`agency_deleted` = 0
AND a.`status` = 'active'
AND a.`country_id` = 1
AND (
j.`status` != 'Booked'
AND j.`status` != 'To Be Booked'
AND j.`status` != 'Send Letters'
AND j.`status` != 'On Hold'
AND j.`status` != 'On Hold - COVID'
AND j.`status` != 'Pre Completion'
AND j.`status` != 'Merged Certificates'
)
AND j.`date` > DATE_ADD(NOW(), INTERVAL - 350 DAY)
ORDER BY j.`property_id` DESC
Someone outside of StackOverflow helped combine, but it didn't help...
We ended up having to add a marker and search for that, because this query took 193 seconds to run on our database.
Also, I totally get the requirement to provide a minimum reproducible example, and its my fault for not doing that.
query taking 1 minute to fetch results
SELECT
`jp`.`id`,
`jp`.`title` AS game_title,
`jp`.`game_type`,
`jp`.`state_abb` AS game_state,
`jp`.`location` AS game_city,
`jp`.`zipcode` AS game_zipcode,
`jp`.`modified_on`,
`jp`.`posted_on`,
`jp`.`game_referal_amount`,
`jp`.`games_referal_amount_type`,
`jp`.`status`,
`jp`.`is_flaged`,
`u`.`id` AS employer_id,
`u`.`email` AS employer_email,
`u`.`name` AS employer_name,
`jf`.`name` AS game_function,
`jp`.`game_freeze_status`,
`jp`.`game_statistics`,
`jp`.`ats_value`,
`jp`.`integration_id`,
`u`.`account_manager_id`,
`jp`.`model_game`,
`jp`.`group_id`,
(CASE
WHEN jp.group_id != '0' THEN gm.group_name
ELSE 'NA'
END) AS group_name,
`jp`.`priority_game`,
(CASE
WHEN jp.country != 'US' THEN jp.country_name
ELSE ''
END) AS game_country,
IFNULL((CASE
WHEN
`jp`.`account_manager_id` IS NULL
OR `jp`.`account_manager_id` = 0
THEN
(SELECT
(CASE
WHEN
account_manager_id IS NULL
OR account_manager_id = 0
THEN
`u`.`account_manager_id`
ELSE account_manager_id
END) AS account_manager_id
FROM
user_user
WHERE
id = (SELECT
user_id
FROM
game_user_assigned
WHERE
game_id = `jp`.`id`
LIMIT 1))
ELSE `jp`.`account_manager_id`
END),
`u`.`account_manager_id`) AS acc,
(SELECT
COUNT(recach_limit_id)
FROM
recach_limit
WHERE
recach_limit = '1'
AND recach_limit_game_id = rpr.recach_limit_game_id) AS somewhatgame,
(SELECT
COUNT(recach_limit_id)
FROM
recach_limit
WHERE
recach_limit = '2'
AND recach_limit_game_id = rpr.recach_limit_game_id) AS verygamecommitted,
(SELECT
COUNT(recach_limit_id)
FROM
recach_limit
WHERE
recach_limit = '3'
AND recach_limit_game_id = rpr.recach_limit_game_id) AS notgame,
(SELECT
COUNT(joa.id) AS applicationcount
FROM
game_refer_to_member jrmm
INNER JOIN
game_refer jrr ON jrr.id = jrmm.rid
INNER JOIN
game_applied joa ON jrmm.id = joa.referred_by
WHERE
jrmm.STATUS = '1'
AND jrr.referby_user_id IN (SELECT
ab_testing_user_id
FROM
ab_testing)
AND joa.game_post_id = rpr.recach_limit_game_id
AND (rpr.recach_limit = 1
OR rpr.recach_limit = 2)) AS gamecount
FROM
(`game_post` AS jp)
JOIN
`user_info` AS u ON `jp`.`user_user_id` = `u`.`id`
JOIN
`game_functional` jf ON `jp`.`game_functional_id` = `jf`.`id`
LEFT JOIN
`group_musesm` gm ON `gm`.`group_id` = `jp`.`group_id`
LEFT JOIN
`recach_limit` rpr ON `jp`.`id` = `rpr`.`recach_limit_game_id`
WHERE
`jp`.`status` != '3'
GROUP BY `jp`.`id`
ORDER BY `posted_on` DESC
LIMIT 10
I would first suggest not nesting select statements because this will cause an n^x performance hit on every xth level and I see at least 3 levels of selects inside this query.
Add index
INDEX(status, posted_on)
Move LIMIT inside
Then, instead of saying
FROM (`game_post` AS jp)
say
FROM ( SELECT id FROM game_post
WHERE status != 3
ORDER BY posted_on DESC
LIMIT 10 ) AS ids
JOIN game_post AS jp USING(id)
(I am assuming that the PK of jp is (id)?)
That should efficiently use the new index to get the 10 ids needed. Then it will reach back into game_post to get the other columns.
LEFT
Also, don't say LEFT unless you need it. It costs something to generate NULLs that you may not be needing.
Is GROUP BY necessary?
If you remove the GROUP BY, does it show dup ids? The above changes may have eliminated the need.
IN(SELECT) may optimize poorly
Change
AND jrr.referby_user_id IN ( SELECT ab_testing_user_id
FROM ab_testing )
to
AND EXISTS ( SELECT * FROM ab_testing
WHERE ab_testing_user_id = jrr.referby_user_id )
(This change may or may not help, depending on the version you are running.)
More
Please provide EXPLAIN SELECT if you need further assistance.
I have MySQL query currently selecting and joining 13 tables and finally grouping ~60k rows. The query without grouping takes ~0ms but with grouping the query time increases to ~1.7sec. The field, which is used for grouping is primary field and is indexed. Where could be the issue?
I know group by without aggregate is considered invalid query and bad practise but I need distinct base table rows and can not use DISTINCT syntax.
The query itself looks like this:
SELECT `table_a`.*
FROM `table_a`
LEFT JOIN `table_b`
ON `table_b`.`invoice` = `table_a`.`id`
LEFT JOIN `table_c` AS `r1`
ON `r1`.`invoice_1` = `table_a`.`id`
LEFT JOIN `table_c` AS `r2`
ON `r2`.`invoice_2` = `table_a`.`id`
LEFT JOIN `table_a` AS `i1`
ON `i1`.`id` = `r1`.`invoice_2`
LEFT JOIN `table_a` AS `i2`
ON `i2`.`id` = `r2`.`invoice_1`
JOIN `table_d` AS `_u0`
ON `_u0`.`id` = 1
LEFT JOIN `table_e` AS `_ug0`
ON `_ug0`.`user` = `_u0`.`id`
JOIN `table_f` AS `_p0`
ON ( `_p0`.`enabled` = 1
AND ( ( `_p0`.`role` < 2
AND `_p0`.`who` IS NULL )
OR ( `_p0`.`role` = 2
AND ( `_p0`.`who` = '0'
OR `_p0`.`who` = `_u0`.`id` ) )
OR ( `_p0`.`role` = 3
AND ( `_p0`.`who` = '0'
OR `_p0`.`who` = `_ug0`.`group` ) ) ) )
AND ( `_p0`.`action` = '*'
OR `_p0`.`action` = 'read' )
AND ( `_p0`.`related_table` = '*'
OR `_p0`.`related_table` = 'table_name' )
JOIN `table_a` AS `_e0`
ON ( ( `_p0`.`related_id` = 0
OR `_p0`.`related_id` = `_e0`.`id`
OR `_p0`.`related_user` = `_e0`.`user`
OR `_p0`.`related_group` = `_e0`.`group` )
OR ( `_p0`.`role` = 0
AND `_e0`.`user` = `_u0`.`id` )
OR ( `_p0`.`role` = 1
AND `_e0`.`group` = `_ug0`.`group` ) )
AND `_e0`.`id` = `table_a`.`id`
JOIN `table_d` AS `_u1`
ON `_u1`.`id` = 1
LEFT JOIN `table_e` AS `_ug1`
ON `_ug1`.`user` = `_u1`.`id`
JOIN `table_f` AS `_p1`
ON ( `_p1`.`enabled` = 1
AND ( ( `_p1`.`role` < 2
AND `_p1`.`who` IS NULL )
OR ( `_p1`.`role` = 2
AND ( `_p1`.`who` = '0'
OR `_p1`.`who` = `_u1`.`id` ) )
OR ( `_p1`.`role` = 3
AND ( `_p1`.`who` = '0'
OR `_p1`.`who` = `_ug1`.`group` ) ) ) )
AND ( `_p1`.`action` = '*'
OR `_p1`.`action` = 'read' )
AND ( `_p1`.`related_table` = '*'
OR `_p1`.`related_table` = 'table_name' )
JOIN `table_g` AS `_e1`
ON ( ( `_p1`.`related_id` = 0
OR `_p1`.`related_id` = `_e1`.`id`
OR `_p1`.`related_user` = `_e1`.`user`
OR `_p1`.`related_group` = `_e1`.`group` )
OR ( `_p1`.`role` = 0
AND `_e1`.`user` = `_u1`.`id` )
OR ( `_p1`.`role` = 1
AND `_e1`.`group` = `_ug1`.`group` ) )
AND `_e1`.`id` = `table_a`.`company`
WHERE `table_a`.`date_deleted` IS NULL
AND `table_a`.`company` = 4
AND `table_a`.`type` = 1
AND `table_a`.`date_composed` >= '2016-05-04 14:43:55'
GROUP BY `table_a`.`id`
The ORs kill performance.
This composite index may help: INDEX(company, type, date_deleted, date_composed).
LEFT JOIN table_b ON table_b.invoice = table_a.id seems to do absolutely nothing other than slow down the processing. No fields of table_b are used or SELECTed. Since it is a LEFT join, it does not limit the output. Etc. Get rid if it, or justify it.
Ditto for other joins.
What happens with JOIN and GROUP BY: First, all the joins are performed; this explodes the number of rows in the intermediate 'table'. Then the GROUP BY implodes the set of rows.
One technique for avoiding this explode-implode sluggishness is to do
SELECT ...,
( SELECT ... ) AS ...,
...
instead of a JOIN or LEFT JOIN. However, that works only if there is zero or one row in the subquery. Usually this is beneficial when an aggregate (such as SUM) can be moved into the subquery.
For further discussion, please include SHOW CREATE TABLE.
I have following Mysql query
SELECT c.`id`
,c.`category_name`
,c.`category_type`
,c.bookmark_count
,f.category_id cat_id
,f.unfollow_at
,(
CASE WHEN c.id = f.follower_category_id
THEN (
SELECT count(`user_bookmarks`.`id`)
FROM `user_bookmarks`
WHERE (`user_bookmarks`.`category_id` = cat_id)
AND ((`f`.`unfollow_at` > `user_bookmarks`.`created_at`) || (`f`.`unfollow_at` = '0000-00-00 00:00:00'))
)
ELSE 0 END
) counter
,c.id
,f.follower_category_id follow_id
,c.user_id
FROM categories c
LEFT JOIN following_follower_categories f ON f.follower_category_id = c.id
WHERE c.user_id = 26
ORDER BY `category_name` ASC
and here is output what i am getting after execuation
now i just want to count . here i have field id having value 172 against it i have counter 30,3, 2 and Bookmark_count is 4( i need to include only once)
and i am accepting output for id 172 is 30+3+2+4(bookmark_count only once).
I am not sure how to do this.
Can anybody help me out
Thanks a lot
The following may be the most inefficient query for that purpose, but I added a cover to your query in order to hint at grouping the results.
(I removed the second c.id, and my example may have errors since I couldn't try it.)
SELECT `id`,
`category_name`,
`category_type`,
max(`bookmark_count`),
`cat_id`,
`unfollow_at`,
sum(`counter`)+max(`bookmark_count`) counter,
follow_id`, `user_id`
FROM
(SELECT c.`id`
,c.`category_name`
,c.`category_type`
,c.bookmark_count
,f.category_id cat_id
,f.unfollow_at
,(
CASE WHEN c.id = f.follower_category_id
THEN (
SELECT count(`user_bookmarks`.`id`)
FROM `user_bookmarks`
WHERE (`user_bookmarks`.`category_id` = cat_id)
AND ((`f`.`unfollow_at` > `user_bookmarks`.`created_at`) || (`f`.`unfollow_at` = '0000-00-00 00:00:00'))
)
ELSE 0 END
) counter
,f.follower_category_id follow_id
,c.user_id
FROM categories c
LEFT JOIN following_follower_categories f ON f.follower_category_id = c.id
WHERE c.user_id = 26)
GROUP BY `id`, `category_name`, `category_type`, `cat_id`, `unfollow_at`, `follow_id`, `user_id`
ORDER BY `category_name` ASC
I have a sql query that is running really slow and I am perplexed as to why. The query is:
SELECT DISTINCT(c.ID),c.* FROM `content` c
LEFT JOIN `content_meta` cm1 ON c.id = cm1.content_id
WHERE 1=1
AND c.site_id IN (14)
AND c.type IN ('a','t')
AND c.status = 'visible'
AND (c.lock = 0 OR c.site_id = 14)
AND c.level = 0
OR
(
( c.site_id = 14
AND cm1.meta_key = 'g_id'
AND cm1.meta_value IN ('12','13','7')
)
OR
( c.status = 'visible'
AND (
(c.type = 'topic' AND c.parent_id IN (628,633,624))
)
)
)
ORDER BY c.date_updated DESC LIMIT 20
The content table has about 1250 rows and the content meta table has about 3000 rows. This isn't a lot of data and I'm not quite sure what may be causing it to run so slow. Any thoughts/opinions would be greatly appreciated.
Thanks!
Is you where clause correct? You are making a series of AND statements and later you are executing a OR.
Wouldn't the correct be something like:
AND (c.lock = 0 OR c.site_id = 14)
AND (
( ... )
OR
( ... )
)
If it is indeed correct, you could think on changing the structure or treating the result in an script or procedure.
It might have to do with your "OR" clause at the end... your up-front are being utilized by the indexes where possible, but then you throw this huge OR condition at the end that can be either one or the other. Not knowing more of the underlying content, I would adjust to have a UNION on the inside so each entity can utilize its own indexes, get you qualified CIDs, THEN join to final results.
select
c2.*
from
( select distinct
c.ID
from
`content` c
where
c.site_id in (14)
and c.type in ('a', 't' )
and c.status = 'visible'
and c.lock in ( 0, 14 )
and c.level = 0
UNION
select
c.ID
from
`content` c
where
c.status = 'visible'
and c.type = 'topic'
and c.parent_id in ( 628, 633, 624 )
UNION
select
c.ID
from
`content` c
join `content_meta` cm1
on c.id = cm1.content_id
AND cm1.meta_key = 'g_id'
AND cm1.meta_value in ( '12', '13', '7' )
where
c.site_id = 14 ) PreQuery
JOIN `content` c2
on PreQuery.cID = c2.cID
order by
c2.date_updated desc
limit
20
I would ensure content table has an index on ( site_id, type, status ) another on (parent_id, type, status)
and the meta table, an index on ( content_id, meta_key, meta_value )