MYSQL Query Optimization with subqueries and joins - mysql

Hi I have 8 million row data and need to optimize Mysql query to fetch a row from that data. I am using below query but its server response time is too high that creating issue in page loading speed
SELECT q.id
, q.title
, q.question
, q.price
, q.created
, q.userid
, q.duedate
, q.tags
, s.id subjectid
, sc.id subcategoryid
, s.title subject
, sc.title subcategory
, q.statusid
, (SELECT COUNT(id) FROM tbl_answers a WHERE a.questionid = q.id AND a.statusid = 1 AND a.deleted = 'N') q_num_answers
, u.username
, u.image
, u.gender
, (SELECT COUNT(id) FROM tbl_answers a WHERE a.userid = q.userid AND a.statusid = 1 AND a.deleted = 'N') num_answers
, (SELECT COUNT(id) FROM tbl_questions WHERE userid = q.userid AND statusid = 1 AND deleted = 'N') AS num_questions
, 0 amt_earned
, 0 amt_spent
, 0 num_sold
, (SELECT COUNT(ur.id) FROM tbl_users_ratings ur WHERE ur.userid = q.userid AND ur.deleted = 'N') u_num_ratings
, (SELECT COALESCE(SUM(ur.rating), 0) FROM tbl_users_ratings ur WHERE ur.userid = q.userid AND ur.deleted = 'N') u_score
FROM tbl_questions q
JOIN tbl_subjects s
ON q.subject = s.id
JOIN tbl_subjects sc
ON q.subcategory = sc.id
LEFT
JOIN tbl_users u
ON u.id = q.userid
WHERE q.deleted = '$show_deleted'
AND q.id = ?
LIMIT 1

These indexes may help. (I am assuming that id is the PRIMARY KEY of each table.)
ur: (deleted, userid, rating)
a: (deleted, statusid, userid)
a: (deleted, statusid, questionid)
Please provide EXPLAIN SELECT ....
Don't use COUNT(id) unless you need to check for id not being NULL. The usual way to write it is COUNT(*).
In one place you are checking for a provided values for deleted. In another, you hard code it. Perhaps wrong?
AND ur.deleted = 'N'
If the PRIMARY KEY for q is id, then this will lead to either 1 row or no row. What am I missing?
WHERE q.deleted = '$show_deleted'
AND q.id = ?
(There may be more tips. Please change the things I suggested and follow the suggestions from others. Then let's have another look.)

Related

How to Make This SQL Query More Efficient?

I'm not sure how to make the following SQL query more efficient. Right now, the query is taking 8 - 12 seconds on a pretty fast server, but that's not close to fast enough for a Website when users are trying to load a page with this code on it. It's looking through tables with many rows, for instance the "Post" table has 717,873 rows. Basically, the query lists all Posts related to what the user is following (newest to oldest).
Is there a way to make it faster by only getting the last 20 results total based on PostTimeOrder?
Any help would be much appreciated or insight on anything that can be done to improve this situation. Thank you.
Here's the full SQL query (lots of nesting):
SELECT DISTINCT p.Id, UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime, p.Content AS Content, p.Bu AS Bu, p.Se AS Se, UNIX_TIMESTAMP(p.PostCreationTime) AS PostTimeOrder
FROM Post p
WHERE (p.Id IN (SELECT pc.PostId
FROM PostCreator pc
WHERE (pc.UserId IN (SELECT uf.FollowedId
FROM UserFollowing uf
WHERE uf.FollowingId = '100')
OR pc.UserId = '100')
))
OR (p.Id IN (SELECT pum.PostId
FROM PostUserMentions pum
WHERE (pum.UserId IN (SELECT uf.FollowedId
FROM UserFollowing uf
WHERE uf.FollowingId = '100')
OR pum.UserId = '100')
))
OR (p.Id IN (SELECT ssp.PostId
FROM SStreamPost ssp
WHERE (ssp.SStreamId IN (SELECT ssf.SStreamId
FROM SStreamFollowing ssf
WHERE ssf.UserId = '100'))
))
OR (p.Id IN (SELECT psm.PostId
FROM PostSMentions psm
WHERE (psm.StockId IN (SELECT sf.StockId
FROM StockFollowing sf
WHERE sf.UserId = '100' ))
))
UNION ALL
SELECT DISTINCT p.Id AS Id, UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime, p.Content AS Content, p.Bu AS Bu, p.Se AS Se, UNIX_TIMESTAMP(upe.PostEchoTime) AS PostTimeOrder
FROM Post p
INNER JOIN UserPostE upe
on p.Id = upe.PostId
INNER JOIN UserFollowing uf
on (upe.UserId = uf.FollowedId AND (uf.FollowingId = '100' OR upe.UserId = '100'))
ORDER BY PostTimeOrder DESC;
Changing your p.ID in (...) predicates to existence predicates with correlated subqueries may help. Also since both halves of your union all query are pulling from the Post table and possibly returning nearly identical records you might be able to combine the two into one query by left outer joining to UserPostE and adding upe.PostID is not null as an OR condition in the WHERE clause. UserFollowing will still inner join to UPE. If you want the same Post record twice once with upe.PostEchoTime and once with p.PostCreationTime as the PostTimeOrder you'll need keep the UNION ALL
SELECT
DISTINCT -- <<=- May not be needed
p.Id
, UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime
, p.Content AS Content
, p.Bu AS Bu
, p.Se AS Se
, UNIX_TIMESTAMP(coalesce( upe.PostEchoTime
, p.PostCreationTime)) AS PostTimeOrder
FROM Post p
LEFT JOIN UserPostE upe
INNER JOIN UserFollowing uf
on (upe.UserId = uf.FollowedId AND
(uf.FollowingId = '100' OR
upe.UserId = '100'))
on p.Id = upe.PostId
WHERE upe.PostID is not null
or exists (SELECT 1
FROM PostCreator pc
WHERE pc.PostId = p.ID
and pc.UserId = '100'
or exists (SELECT 1
FROM UserFollowing uf
WHERE uf.FollowedId = pc.UserID
and uf.FollowingId = '100')
)
OR exists (SELECT 1
FROM PostUserMentions pum
WHERE pum.PostId = p.ID
and pum.UserId = '100'
or exists (SELECT 1
FROM UserFollowing uf
WHERE uf.FollowedId = pum.UserId
and uf.FollowingId = '100')
)
OR exists (SELECT 1
FROM SStreamPost ssp
WHERE ssp.PostId = p.ID
and exists (SELECT 1
FROM SStreamFollowing ssf
WHERE ssf.SStreamId = ssp.SStreamId
and ssf.UserId = '100')
)
OR exists (SELECT 1
FROM PostSMentions psm
WHERE psm.PostId = p.ID
and exists (SELECT
FROM StockFollowing sf
WHERE sf.StockId = psm.StockId
and sf.UserId = '100' )
)
ORDER BY PostTimeOrder DESC
The from section could alternatively be rewritten to also use an existence clause with a correlated sub query:
FROM Post p
LEFT JOIN UserPostE upe
on p.Id = upe.PostId
and ( upe.UserId = '100'
or exists (select 1
from UserFollowing uf
where uf.FollwedID = upe.UserID
and uf.FollowingId = '100'))
Turn IN ( SELECT ... ) into a JOIN .. ON ... (see below)
Turn OR into UNION (see below)
Some the tables are many:many mappings? Such as SStreamFollowing? Follow the tips in http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
Example of IN:
SELECT ssp.PostId
FROM SStreamPost ssp
WHERE (ssp.SStreamId IN (
SELECT ssf.SStreamId
FROM SStreamFollowing ssf
WHERE ssf.UserId = '100' ))
-->
SELECT ssp.PostId
FROM SStreamPost ssp
JOIN SStreamFollowing ssf ON ssp.SStreamId = ssf.SStreamId
WHERE ssf.UserId = '100'
The big WHERE with all the INs becomes something like
JOIN ( ( SELECT pc.PostId AS id ... )
UNION ( SELECT pum.PostId ... )
UNION ( SELECT ssp.PostId ... )
UNION ( SELECT psm.PostId ... ) )
Get what you can done of that those suggestions, then come back for more advice if you still need it. And bring SHOW CREATE TABLE with you.

MySQL query taking too much time

query taking 1 minute to fetch results
SELECT
`jp`.`id`,
`jp`.`title` AS game_title,
`jp`.`game_type`,
`jp`.`state_abb` AS game_state,
`jp`.`location` AS game_city,
`jp`.`zipcode` AS game_zipcode,
`jp`.`modified_on`,
`jp`.`posted_on`,
`jp`.`game_referal_amount`,
`jp`.`games_referal_amount_type`,
`jp`.`status`,
`jp`.`is_flaged`,
`u`.`id` AS employer_id,
`u`.`email` AS employer_email,
`u`.`name` AS employer_name,
`jf`.`name` AS game_function,
`jp`.`game_freeze_status`,
`jp`.`game_statistics`,
`jp`.`ats_value`,
`jp`.`integration_id`,
`u`.`account_manager_id`,
`jp`.`model_game`,
`jp`.`group_id`,
(CASE
WHEN jp.group_id != '0' THEN gm.group_name
ELSE 'NA'
END) AS group_name,
`jp`.`priority_game`,
(CASE
WHEN jp.country != 'US' THEN jp.country_name
ELSE ''
END) AS game_country,
IFNULL((CASE
WHEN
`jp`.`account_manager_id` IS NULL
OR `jp`.`account_manager_id` = 0
THEN
(SELECT
(CASE
WHEN
account_manager_id IS NULL
OR account_manager_id = 0
THEN
`u`.`account_manager_id`
ELSE account_manager_id
END) AS account_manager_id
FROM
user_user
WHERE
id = (SELECT
user_id
FROM
game_user_assigned
WHERE
game_id = `jp`.`id`
LIMIT 1))
ELSE `jp`.`account_manager_id`
END),
`u`.`account_manager_id`) AS acc,
(SELECT
COUNT(recach_limit_id)
FROM
recach_limit
WHERE
recach_limit = '1'
AND recach_limit_game_id = rpr.recach_limit_game_id) AS somewhatgame,
(SELECT
COUNT(recach_limit_id)
FROM
recach_limit
WHERE
recach_limit = '2'
AND recach_limit_game_id = rpr.recach_limit_game_id) AS verygamecommitted,
(SELECT
COUNT(recach_limit_id)
FROM
recach_limit
WHERE
recach_limit = '3'
AND recach_limit_game_id = rpr.recach_limit_game_id) AS notgame,
(SELECT
COUNT(joa.id) AS applicationcount
FROM
game_refer_to_member jrmm
INNER JOIN
game_refer jrr ON jrr.id = jrmm.rid
INNER JOIN
game_applied joa ON jrmm.id = joa.referred_by
WHERE
jrmm.STATUS = '1'
AND jrr.referby_user_id IN (SELECT
ab_testing_user_id
FROM
ab_testing)
AND joa.game_post_id = rpr.recach_limit_game_id
AND (rpr.recach_limit = 1
OR rpr.recach_limit = 2)) AS gamecount
FROM
(`game_post` AS jp)
JOIN
`user_info` AS u ON `jp`.`user_user_id` = `u`.`id`
JOIN
`game_functional` jf ON `jp`.`game_functional_id` = `jf`.`id`
LEFT JOIN
`group_musesm` gm ON `gm`.`group_id` = `jp`.`group_id`
LEFT JOIN
`recach_limit` rpr ON `jp`.`id` = `rpr`.`recach_limit_game_id`
WHERE
`jp`.`status` != '3'
GROUP BY `jp`.`id`
ORDER BY `posted_on` DESC
LIMIT 10
I would first suggest not nesting select statements because this will cause an n^x performance hit on every xth level and I see at least 3 levels of selects inside this query.
Add index
INDEX(status, posted_on)
Move LIMIT inside
Then, instead of saying
FROM (`game_post` AS jp)
say
FROM ( SELECT id FROM game_post
WHERE status != 3
ORDER BY posted_on DESC
LIMIT 10 ) AS ids
JOIN game_post AS jp USING(id)
(I am assuming that the PK of jp is (id)?)
That should efficiently use the new index to get the 10 ids needed. Then it will reach back into game_post to get the other columns.
LEFT
Also, don't say LEFT unless you need it. It costs something to generate NULLs that you may not be needing.
Is GROUP BY necessary?
If you remove the GROUP BY, does it show dup ids? The above changes may have eliminated the need.
IN(SELECT) may optimize poorly
Change
AND jrr.referby_user_id IN ( SELECT ab_testing_user_id
FROM ab_testing )
to
AND EXISTS ( SELECT * FROM ab_testing
WHERE ab_testing_user_id = jrr.referby_user_id )
(This change may or may not help, depending on the version you are running.)
More
Please provide EXPLAIN SELECT if you need further assistance.

MySQL query is taking too much time

I have query below:
SELECT t.t_id
, t.usr_idx
, t.t_is_for
, t.tg_ids
, t.created_time
, t.allow_reply
, u.usr_name
, u.usr_avatar
, u.show_profile
, IF(u.usr_timeline != '',CONCAT('https://s3.amazonaws.com/tuurnts3thumbnail/',u.usr_timeline),'') as usr_timeline
, u.node_userid
, t.t_time
FROM tuu_tuurnt t
JOIN tuu_user u
ON t.usr_idx = u.usr_idx
AND u.usr_state = 1
LEFT
JOIN tuu_post p
ON t.t_id = p.t_id
AND p.usr_idx = 44756
LEFT
JOIN tuu_friend f
ON f.frd_my_idx = 44756
AND f.frd_your_idx = t.usr_idx
LEFT
JOIN tuu_friend fl
ON fl.frd_your_idx = 44756
AND fl.frd_my_idx = t.usr_idx
WHERE t.status = 0
AND NOT EXISTS ( SELECT b.tuu_b_by_usr_idx
FROM tuu_blocked as b
WHERE b.tuu_b_usr_idx = 44756
AND t.usr_idx = b.tuu_b_by_usr_idx
)
GROUP
BY t.t_id
ORDER
BY t.t_time DESC
, t.t_id DESC
LIMIT 0,30;
It takes almost 7-8 second to give result but when I remove order by t.t_time and t.t_id then it runs within 1 sec max.
Is there anything I am doing wrong?
Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows. The larger the table, the more this costs. See How MySQL Uses Indexes for details. See also this topic about using indexes and aliases together.

(My)SQL JOIN - get teams with exactly specified members

Assume tables
team: id, title
team_user: id_team, id_user
I'd like to select teams with just and only specified members. In this example I want team(s) where the only users are those with id 1 and 5, noone else. I came up with this SQL, but it seems to be a little overkill for such simple task.
SELECT team.*, COUNT(`team_user`.id_user) AS cnt
FROM `team`
JOIN `team_user` user0 ON `user0`.id_team = `team`.id AND `user0`.id_user = 1
JOIN `team_user` user1 ON `user1`.id_team = `team`.id AND `user1`.id_user = 5
JOIN `team_user` ON `team_user`.id_team = `team`.id
GROUP BY `team`.id
HAVING cnt = 2
EDIT: Thank you all for your help. If you want to actually try your ideas, you can use example database structure and data found here: http://down.lipe.cz/team_members.sql
How about
SELECT *
FROM team t
JOIN team_user tu ON (tu.id_team = t.id)
GROUP BY t.id
HAVING (SUM(tu.id_user IN (1,5)) = 2) AND (SUM(tu.id_user NOT IN (1,5)) = 0)
I'm assuming a unique index on team_user(id_team, id_user).
You can use
SELECT
DISTINCT id,
COUNT(tu.id_user) as cnt
FROM
team t
JOIN team_user tu ON ( tu.id_team = t.id )
GROUP BY
t.id
HAVING
count(tu.user_id) = count( CASE WHEN tu.user_id = 1 or tu.user_id = 5 THEN 1 ELSE 0 END )
AND cnt = 2
Not sure why you'd need the cnt = 2 condition, the query would get only those teams where all of users having the ID of either 1 or 5
Try This
SELECT team.*, COUNT(`team_user`.id_user) AS cnt FROM `team`
JOIN `team_user` ON `team_user`.id_team = `team`.id
where `team_user`.id_user IN (1,5)
GROUP BY `team`.id
HAVING cnt = 2

How to write complex query with mysql join with tables

I have tables
Tasks- id,name
then i have
userTasks id , task_id , user_id
and
User - id , name
Suppose i have 10 tasks in task table and out of those i have 3 tasks in userTask table
I want query like this
Select task.id , task.name , STATUS (if(presentInUserTasks),1,0) FROM whatever
the STATUS word should 1 if that task id is present in usertasks table for that userid otherwise it should be 0
So that i am able to find which of those tasks are alreadu in userTask table
You're looking for the EXISTS keyword:
SELECT tasks.id, tasks.name,
IF(EXISTS(SELECT id
FROM userTasks
WHERE userTasks.task_id = tasks.id
AND userTasks.user_id = #that_user_id)
,1,0) AS STATUS
FROM tasks
Try this one:
SELECT b.id,
b.name,
IF(coalesce(c.Task_ID, -1) = -1, 0, 1) `Status`
FROM `User` a
CROSS JOIN `Task` b
LEFT JOIN UserTask c
ON a.ID = c.user_ID AND
b.ID = c.Task_ID
Where a.id = 1
Demo: http://sqlfiddle.com/#!2/a22d0/7
try this:
select T.id , T.name ,
case when u.task_id is not null then 1 else 0 end as STATUS
from
Tasks T left outer join usertasks U
on T.id=u.task_id
TRY
SELECT t.id,t.name,
CASE WHEN ut.task_id IS NULL THEN '0' ELSE '1' END
FROM Tasks t
LEFT JOIN UserTask ut ON ut.task_id = t.id