ORDER BY clause not behaving correctly after several joins - MySQL

Here is my query:
SELECT DISTINCT `post_data`. * , pv.`seller_id` , pv.`islimited` , pv.`isquantity` , pv.`isslider`, `price`.`original_price` , `price`.`discount_percentage` , `timelimit`.`start_date` , `timelimit`.`expire_date` , `quantity`.`in_stock`, `currency`.`currency_symbol`, `seller`.`directory`, `post_to_cat`.`cat_id`, count(`sales`.`sales_id`) as sale FROM `post_view` AS pv
INNER JOIN `post_data` ON pv.`post_id` = `post_data`.`post_id` AND pv.`status` = 1
INNER JOIN `price` ON pv.`post_id` = `price`.`post_id`
INNER JOIN `currency` ON `price`.`currency_id` = `currency`.`currency_id`
INNER JOIN `seller` ON pv.`seller_id` = `seller`.`seller_id`
INNER JOIN `post_to_cat` ON `post_to_cat`.`cat_id` = 1 AND `post_to_cat`.`post_id` = `post_data`.`post_id`
LEFT JOIN `timelimit` ON ( CASE WHEN pv.`islimited` = 1 THEN `timelimit`.`post_id` ELSE -1 END ) = pv.`post_id`
LEFT JOIN `quantity` ON ( CASE WHEN pv.`isquantity` = 1 THEN `quantity`.`post_id` ELSE -1 END ) = pv.`post_id`
LEFT JOIN `sales` ON `sales`.`post_id` = pv.`post_id` AND `sales`.`status` = 1
WHERE pv.`status` = 1
ORDER BY pv.`post_id` DESC LIMIT 1
The ORDER BY DESC is not working; it just returns the first row from the table, but I want the row with the highest post_id. What mistake am I making?

As @Alex said in the comments, you've got a LIMIT 1 at the end. You should probably also bracket the last LEFT JOIN for readability.
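For instance, the last LEFT JOIN from the query could have its ON condition wrapped in parentheses (purely a readability change, no effect on the result):
LEFT JOIN `sales`
    ON ( `sales`.`post_id` = pv.`post_id` AND `sales`.`status` = 1 )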

As @McAdam331 said, we need a data sample and an SQL Fiddle to investigate what is wrong with your query. But in the meantime I have some suggestions on how to improve and debug it.
First of all, the main (leftmost) table in your query is post_view, so every other table should be LEFT JOINed if you want to get the max id. Use INNER JOIN only when you actually want the other table to filter your main table, so that the ordering or the result depends on it. In your case I see no reason to use INNER JOIN.
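To illustrate why that matters here (same tables and columns as in your query): the INNER JOIN on post_to_cat keeps only posts that are in category 1, so the post with the highest post_id can be filtered out before ORDER BY ... LIMIT 1 is applied, whereas a LEFT JOIN keeps every post and simply returns NULL for cat_id:
-- drops posts that have no row in post_to_cat with cat_id = 1
INNER JOIN `post_to_cat` ON `post_to_cat`.`cat_id` = 1
    AND `post_to_cat`.`post_id` = pv.`post_id`
-- keeps every post; cat_id is NULL when there is no matching row
LEFT JOIN `post_to_cat` ON `post_to_cat`.`cat_id` = 1
    AND `post_to_cat`.`post_id` = pv.`post_id`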
My second point is about your rather odd ON conditions:
LEFT JOIN `timelimit` ON ( CASE WHEN pv.`islimited` = 1 THEN `timelimit`.`post_id` ELSE -1 END ) = pv.`post_id`
LEFT JOIN `quantity` ON ( CASE WHEN pv.`isquantity` = 1 THEN `quantity`.`post_id` ELSE -1 END ) = pv.`post_id`
I have converted them into expressions in the SELECT list instead:
CASE WHEN pv.`islimited`=1 THEN `timelimit`.`start_date` ELSE NULL END as start_date ,
CASE WHEN pv.`islimited`=1 THEN `timelimit`.`expire_date` ELSE NULL END as expire_date,
CASE WHEN pv.`isquantity`=1 THEN `quantity`.`in_stock` ELSE NULL END as in_stock,
But I still don't like it; it seems pointless to me. Reading CASE WHEN pv.islimited=1 THEN timelimit.start_date ELSE NULL END as start_date literally means that when the flag pv.islimited = 0 you don't need start_date at all. Are you sure about that?
And the last thing I can suggest: take my query (or even your own), but add the tables back one at a time while debugging. So the first query is just:
SELECT
pv.`post_id`, pv.`seller_id` , pv.`islimited` , pv.`isquantity` ,
pv.`isslider`
FROM `post_view` AS pv
WHERE pv.`status` = 1
ORDER BY pv.`post_id` DESC
LIMIT 1
If it returns the correct post_id, add the next table:
SELECT
pv.`post_id`, pv.`seller_id` , pv.`islimited` , pv.`isquantity` ,
pv.`isslider`,
`post_data`. *
FROM `post_view` AS pv
LEFT JOIN `post_data`
ON pv.`post_id` = `post_data`.`post_id`
WHERE pv.`status` = 1
AND `post_data`.`slug` = 'abc'
ORDER BY pv.`post_id` DESC
LIMIT 1
Check the result, and continue step by step.
Yes, it takes time, but that is the debugging process, and it can be the fastest way to get the query working. :-) Here is the full rewritten query:
SELECT `post_data`. * ,
pv.`post_id`, pv.`seller_id` , pv.`islimited` , pv.`isquantity` ,
pv.`isslider`, `price`.`original_price` , `price`.`discount_percentage` ,
CASE WHEN pv.`islimited`=1 THEN `timelimit`.`start_date` ELSE NULL END as start_date ,
CASE WHEN pv.`islimited`=1 THEN `timelimit`.`expire_date` ELSE NULL END as expire_date,
CASE WHEN pv.`isquantity`=1 THEN `quantity`.`in_stock` ELSE NULL END as in_stock,
`currency`.`currency_symbol`, `seller`.`directory`, `post_to_cat`.`cat_id`, count(`sales`.`sales_id`) as sale
FROM `post_view` AS pv
LEFT JOIN `post_data`
ON pv.`post_id` = `post_data`.`post_id`
LEFT JOIN `price`
ON pv.`post_id` = `price`.`post_id`
LEFT JOIN `currency`
ON `price`.`currency_id` = `currency`.`currency_id`
LEFT JOIN `seller`
ON pv.`seller_id` = `seller`.`seller_id`
LEFT JOIN `post_to_cat`
ON `post_to_cat`.`cat_id` = 1
AND `post_to_cat`.`post_id` = pv.`post_id`
LEFT JOIN `timelimit`
ON `timelimit`.`post_id` = pv.`post_id`
LEFT JOIN `quantity`
ON `quantity`.`post_id` = pv.`post_id`
LEFT JOIN `sales`
ON `sales`.`post_id` = pv.`post_id`
AND `sales`.`status` = 1
WHERE pv.`status` = 1
AND `post_data`.`slug` = 'abc'
GROUP BY pv.`post_id`
ORDER BY pv.`post_id` DESC
LIMIT 1
EDIT 1 - the final GROUP BY pv.post_id was added per @McAdam331's note about using the COUNT() function without a GROUP BY

I believe the issue here is mostly a result of performing aggregation (using the COUNT() function) without any GROUP BY. Although it seems like you don't necessarily need one, because you want that count only for the post in question.
If you're trying to gather all of that information for a single post, I would adjust your WHERE clause so it only gathers that information for the post with the largest ID.
Instead of ordering by ID and limiting to 1, use a subquery to get the largest id, like this:
...
WHERE pv.status = 1 AND post_data.slug = 'abc' AND pv.post_id = (SELECT MAX(post_id) FROM post_view);
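Applied to the simplified first debugging query from the earlier answer, the idea looks like this (just a sketch; the same condition slots into the full query with all of the joins):
SELECT pv.`post_id`, pv.`seller_id`, pv.`islimited`, pv.`isquantity`, pv.`isslider`
FROM `post_view` AS pv
WHERE pv.`status` = 1
  AND pv.`post_id` = (SELECT MAX(`post_id`) FROM `post_view`);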

Related

MySQL query taking too much time

The query below is taking 1 minute to fetch results:
SELECT
`jp`.`id`,
`jp`.`title` AS game_title,
`jp`.`game_type`,
`jp`.`state_abb` AS game_state,
`jp`.`location` AS game_city,
`jp`.`zipcode` AS game_zipcode,
`jp`.`modified_on`,
`jp`.`posted_on`,
`jp`.`game_referal_amount`,
`jp`.`games_referal_amount_type`,
`jp`.`status`,
`jp`.`is_flaged`,
`u`.`id` AS employer_id,
`u`.`email` AS employer_email,
`u`.`name` AS employer_name,
`jf`.`name` AS game_function,
`jp`.`game_freeze_status`,
`jp`.`game_statistics`,
`jp`.`ats_value`,
`jp`.`integration_id`,
`u`.`account_manager_id`,
`jp`.`model_game`,
`jp`.`group_id`,
(CASE
WHEN jp.group_id != '0' THEN gm.group_name
ELSE 'NA'
END) AS group_name,
`jp`.`priority_game`,
(CASE
WHEN jp.country != 'US' THEN jp.country_name
ELSE ''
END) AS game_country,
IFNULL((CASE
WHEN
`jp`.`account_manager_id` IS NULL
OR `jp`.`account_manager_id` = 0
THEN
(SELECT
(CASE
WHEN
account_manager_id IS NULL
OR account_manager_id = 0
THEN
`u`.`account_manager_id`
ELSE account_manager_id
END) AS account_manager_id
FROM
user_user
WHERE
id = (SELECT
user_id
FROM
game_user_assigned
WHERE
game_id = `jp`.`id`
LIMIT 1))
ELSE `jp`.`account_manager_id`
END),
`u`.`account_manager_id`) AS acc,
(SELECT
COUNT(recach_limit_id)
FROM
recach_limit
WHERE
recach_limit = '1'
AND recach_limit_game_id = rpr.recach_limit_game_id) AS somewhatgame,
(SELECT
COUNT(recach_limit_id)
FROM
recach_limit
WHERE
recach_limit = '2'
AND recach_limit_game_id = rpr.recach_limit_game_id) AS verygamecommitted,
(SELECT
COUNT(recach_limit_id)
FROM
recach_limit
WHERE
recach_limit = '3'
AND recach_limit_game_id = rpr.recach_limit_game_id) AS notgame,
(SELECT
COUNT(joa.id) AS applicationcount
FROM
game_refer_to_member jrmm
INNER JOIN
game_refer jrr ON jrr.id = jrmm.rid
INNER JOIN
game_applied joa ON jrmm.id = joa.referred_by
WHERE
jrmm.STATUS = '1'
AND jrr.referby_user_id IN (SELECT
ab_testing_user_id
FROM
ab_testing)
AND joa.game_post_id = rpr.recach_limit_game_id
AND (rpr.recach_limit = 1
OR rpr.recach_limit = 2)) AS gamecount
FROM
(`game_post` AS jp)
JOIN
`user_info` AS u ON `jp`.`user_user_id` = `u`.`id`
JOIN
`game_functional` jf ON `jp`.`game_functional_id` = `jf`.`id`
LEFT JOIN
`group_musesm` gm ON `gm`.`group_id` = `jp`.`group_id`
LEFT JOIN
`recach_limit` rpr ON `jp`.`id` = `rpr`.`recach_limit_game_id`
WHERE
`jp`.`status` != '3'
GROUP BY `jp`.`id`
ORDER BY `posted_on` DESC
LIMIT 10
I would first suggest not nesting SELECT statements in the column list: each correlated subquery runs once per row of the outer result, so the cost multiplies with every extra level of nesting, and I see at least three levels of SELECTs inside this query.
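For example (a sketch, reusing the column names from the query above), the three per-game counts could be pre-aggregated once in a derived table and joined in, instead of running one correlated subquery per row and per count:
LEFT JOIN (
    SELECT recach_limit_game_id,
           SUM(recach_limit = '1') AS somewhatgame,
           SUM(recach_limit = '2') AS verygamecommitted,
           SUM(recach_limit = '3') AS notgame
    FROM recach_limit
    GROUP BY recach_limit_game_id
) rc ON rc.recach_limit_game_id = jp.id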
Add index
INDEX(status, posted_on)
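In MySQL DDL that would be something along these lines (the index name is just a suggestion):
ALTER TABLE game_post
    ADD INDEX idx_status_posted_on (status, posted_on);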
Move LIMIT inside
Then, instead of saying
FROM (`game_post` AS jp)
say
FROM ( SELECT id FROM game_post
WHERE status != 3
ORDER BY posted_on DESC
LIMIT 10 ) AS ids
JOIN game_post AS jp USING(id)
(I am assuming that the PK of jp is (id)?)
That should efficiently use the new index to get the 10 ids needed. Then it will reach back into game_post to get the other columns.
LEFT
Also, don't say LEFT unless you need it. It costs something to generate NULLs that you may not need.
Is GROUP BY necessary?
If you remove the GROUP BY, does it show dup ids? The above changes may have eliminated the need.
IN(SELECT) may optimize poorly
Change
AND jrr.referby_user_id IN ( SELECT ab_testing_user_id
FROM ab_testing )
to
AND EXISTS ( SELECT * FROM ab_testing
WHERE ab_testing_user_id = jrr.referby_user_id )
(This change may or may not help, depending on the version you are running.)
More
Please provide EXPLAIN SELECT if you need further assistance.

IF / CASE statement in SQL

I'm really struggling to find an answer to this issue. I want to write an IF or CASE expression that changes a value in another column: if column 'check' is 0, then column 'points' should be 0 as well.
SELECT club_name, COALESCE(club_winner,0) AS 'check', COUNT(*) AS points,
CASE
WHEN check = '0'
THEN points = '0'
ELSE check
END as Saleable
FROM clubs c
LEFT JOIN club_results cr ON c.club_id = cr.club_winner
GROUP BY club_name ORDER BY points DESC
You can't "SET" columns in a CASE expression; you can only return a value. Think of the CASE expression as a computed column: return '0' AS Saleable (or whatever you wish to name the column).
SELECT club_name, COALESCE(club_winner,0) AS 'check', COUNT(*) AS points,
CASE
WHEN COALESCE(club_winner,0) = 0
THEN 0
ELSE COUNT(*)
END AS points_or_0_if_check_is_0
FROM clubs c
LEFT JOIN club_results cr ON c.club_id = cr.club_winner
GROUP BY club_name ORDER BY points_or_0_if_check_is_0 DESC
You can count only the rows that match your LEFT JOIN, like so:
SELECT
club_name,
COUNT(
CASE WHEN
club_winner IS NOT NULL
THEN
CR.id
ELSE
NULL
END
) AS `points`
FROM
clubs C
LEFT JOIN
club_results CR ON C.club_id = CR.club_winner
GROUP BY
C.club_id
ORDER BY
points DESC

MySQL query is taking too much time

I have the query below:
SELECT t.t_id
, t.usr_idx
, t.t_is_for
, t.tg_ids
, t.created_time
, t.allow_reply
, u.usr_name
, u.usr_avatar
, u.show_profile
, IF(u.usr_timeline != '',CONCAT('https://s3.amazonaws.com/tuurnts3thumbnail/',u.usr_timeline),'') as usr_timeline
, u.node_userid
, t.t_time
FROM tuu_tuurnt t
JOIN tuu_user u
ON t.usr_idx = u.usr_idx
AND u.usr_state = 1
LEFT
JOIN tuu_post p
ON t.t_id = p.t_id
AND p.usr_idx = 44756
LEFT
JOIN tuu_friend f
ON f.frd_my_idx = 44756
AND f.frd_your_idx = t.usr_idx
LEFT
JOIN tuu_friend fl
ON fl.frd_your_idx = 44756
AND fl.frd_my_idx = t.usr_idx
WHERE t.status = 0
AND NOT EXISTS ( SELECT b.tuu_b_by_usr_idx
FROM tuu_blocked as b
WHERE b.tuu_b_usr_idx = 44756
AND t.usr_idx = b.tuu_b_by_usr_idx
)
GROUP
BY t.t_id
ORDER
BY t.t_time DESC
, t.t_id DESC
LIMIT 0,30;
It takes almost 7-8 seconds to return results, but when I remove the ORDER BY on t.t_time and t.t_id it runs in at most 1 second.
Am I doing anything wrong?
Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows. The larger the table, the more this costs. See How MySQL Uses Indexes for details. See also this topic about using indexes and aliases together.
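One thing to try here (a sketch, assuming nothing else in your schema conflicts): a composite index on tuu_tuurnt that covers both the WHERE filter and the ORDER BY columns, so MySQL can read the rows already in the required order instead of sorting the whole joined result:
ALTER TABLE tuu_tuurnt
    ADD INDEX idx_status_time_id (status, t_time, t_id);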

Adding a subquery to a join

I've been using the following join to pull rows of users who have volunteered for various project positions.
SELECT p.id, up.position_id, title, max_vol, current_datetime, IF(up.id IS NULL, "0", "1") volunteered
FROM positions AS p
LEFT JOIN users_positions AS up
ON p.id = up.position_id
AND up.user_id = 1
AND up.calendar_date = '2016-10-03'
WHERE
p.project_id = 1
AND p.day = 1
...but due to a change in functionality, I now have to account for the date of the latest edit to a project. In another query, I solved it like so:
SELECT *
FROM positions
WHERE
current_datetime = (SELECT MAX(current_datetime)
FROM positions
WHERE
project_id = 1 AND day = 1)
That works fine, but now I also have to make the LEFT JOIN query return only the rows that match the latest datetime.
I just can't seem to wrap my head around it. Any suggestions? Thanks.
Use a subquery, like this:
SELECT
p.id,
up.position_id,
title,
max_vol,
current_datetime,
IF(up.id IS NULL,
"0",
"1") volunteered
FROM
( SELECT
*
FROM
positions
WHERE
current_datetime = (
SELECT
MAX(current_datetime)
FROM
positions
WHERE
project_id = 1
AND day = 1
)
) AS p
LEFT JOIN
users_positions AS up
ON p.id = up.position_id
AND up.user_id = 1
AND up.calendar_date = '2016-10-03'
WHERE
p.project_id = 1
AND p.day = 1

Join between sub-queries in SQLAlchemy

In relation to the answer I accepted for this post, SQL Group By and Limit issue, I need to figure out how to create that query using SQLAlchemy. For reference, the query I need to run is:
SELECT t.id, t.creation_time, c.id, c.creation_time
FROM (SELECT id, creation_time
FROM thread
ORDER BY creation_time DESC
LIMIT 5
) t
LEFT OUTER JOIN comment c ON c.thread_id = t.id
WHERE 3 >= (SELECT COUNT(1)
FROM comment c2
WHERE c.thread_id = c2.thread_id
AND c.creation_time <= c2.creation_time
)
I have the first half of the query, but I am struggling with the syntax for the WHERE clause and how to combine it with the JOIN. Anyone have any suggestions?
Thanks!
EDIT: My first attempt seems to go wrong around the .filter() call:
c = aliased(Comment)
c2 = aliased(Comment)
subq = db.session.query(Thread.id).filter_by(topic_id=122098).order_by(Thread.creation_time.desc()).limit(2).offset(2).subquery('t')
subq2 = db.session.query(func.count(1).label("count")).filter(c.id==c2.id).subquery('z')
q = db.session.query(subq.c.id, c.id).outerjoin(c, c.thread_id==subq.c.id).filter(3 >= subq2.c.count)
This generates the following SQL:
SELECT t.id AS t_id, comment_1.id AS comment_1_id
FROM (SELECT count(1) AS count
FROM comment AS comment_1, comment AS comment_2
WHERE comment_1.id = comment_2.id) AS z, (SELECT thread.id AS id
FROM thread
WHERE thread.topic_id = :topic_id ORDER BY thread.creation_time DESC
LIMIT 2 OFFSET 2) AS t LEFT OUTER JOIN comment AS comment_1 ON comment_1.thread_id = t.id
WHERE z.count <= 3
Notice that the subquery ordering is incorrect, and subq2 somehow selects from comment twice. Manually fixing that gives the right results; I am just unsure how to get SQLAlchemy to generate it correctly.
Try this:
c = db.aliased(Comment, name='c')
c2 = db.aliased(Comment, name='c2')
sq = (db.session
.query(Thread.id, Thread.creation_time)
.order_by(Thread.creation_time.desc())
.limit(5)
).subquery(name='t')
sq2 = (
db.session.query(db.func.count(1))
.select_from(c2)
.filter(c.thread_id == c2.thread_id)
.filter(c.creation_time <= c2.creation_time)
.correlate(c)
.as_scalar()
)
q = (db.session
.query(
sq.c.id, sq.c.creation_time,
c.id, c.creation_time,
)
.outerjoin(c, c.thread_id == sq.c.id)
.filter(3 >= sq2)
)
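The key differences from the first attempt are .correlate(c), which keeps the aliased comment c out of the count subquery's FROM list, and .as_scalar(), which lets the count be compared directly in .filter(). Together they should render the count as a correlated scalar subquery in the outer WHERE clause, matching the hand-written SQL, rather than adding a second entry to the outer FROM clause.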