MySQL ORDER BY clause not working

I use ORDER BY in a MySQL query, but it is not working.
When I do this:
SELECT t.id,t.user_id,t.title,c.comment,d.has_answer,IF(c.id IS NULL, t.date_created, d.recent_date) recent_date,MIN(i.id) image_id
FROM threads t
LEFT JOIN comments c ON c.thread_id = t.id
INNER JOIN (
SELECT thread_id, MAX(date_sent) recent_date, MAX(is_answer) has_answer
FROM comments
GROUP BY thread_id
) d ON c.id IS NULL OR (d.thread_id = c.thread_id AND d.recent_date = c.date_sent)
LEFT JOIN thread_images i ON t.id = i.thread_id
WHERE t.user_id = t.user_id
GROUP BY t.id
ORDER BY d.recent_date DESC
LIMIT 0, 10
It doesn't properly order them. But if I do this:
SELECT *
FROM (
SELECT t.id,t.user_id,t.title,c.comment,d.has_answer,IF(c.id IS NULL, t.date_created, d.recent_date) recent_date,MIN(i.id) image_id
FROM threads t
LEFT JOIN comments c ON c.thread_id = t.id
INNER JOIN (
SELECT thread_id, MAX(date_sent) recent_date, MAX(is_answer) has_answer
FROM comments
GROUP BY thread_id
) d ON c.id IS NULL OR (d.thread_id = c.thread_id AND d.recent_date = c.date_sent)
LEFT JOIN thread_images i ON t.id = i.thread_id
WHERE t.user_id = t.user_id
GROUP BY t.id
LIMIT 0, 10) qwerty
ORDER BY recent_date DESC
Then it does work. Why does the top one not work, and is the second way the best way to fix that?
Thanks

Those two statements are ordering by two different things.
The second statement is ordering by the result of an expression in the SELECT list.
But the first statement specifies ordering by a value of recent_date returned by the inline view d; if you remove "d." from in front of recent_date, then the ORDER BY clause would reference the alias assigned to the expression in the SELECT list, as the second statement does.
Because recent_date is an alias for an expression in the SELECT list, these two are equivalent:
ORDER BY recent_date
ORDER BY IF(c.id IS NULL, t.date_created, d.recent_date)
^^
but those are significantly different from:
ORDER BY d.recent_date
         ^^
Note that the non-standard use of the GROUP BY clause may be masking some values of recent_date which are discarded by the query. This usage of the GROUP BY clause is a MySQL extension to the SQL Standard; most other relational databases would throw an error with this statement. It's possible to get MySQL to throw the same type of error by enabling the ONLY_FULL_GROUP_BY SQL mode.
Q: Is the second statement the best way to fix that?
A: If that statement guarantees that the result set returned meets your specification, then it's a workable approach. (One downside is the overhead of the inline view query.)
But I strongly suspect that the second statement is really just masking the problem, not really fixing it.
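The difference between ordering by the SELECT-list alias and ordering by the derived table's column can be reproduced on a small scale. The following is a minimal sketch, not the asker's actual query: it assumes a simplified schema with made-up rows, drops the GROUP BY and image join, uses COALESCE in place of the IF expression, and runs on SQLite through Python's sqlite3 module as a stand-in for MySQL.

```python
import sqlite3

# SQLite stands in for MySQL here; schema and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE threads (id INTEGER PRIMARY KEY, date_created TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, thread_id INTEGER, date_sent TEXT);
    INSERT INTO threads VALUES (1, '2020-01-01'), (2, '2020-03-01'), (3, '2020-01-10');
    -- thread 2 has no comments, so d.recent_date is NULL for it
    INSERT INTO comments VALUES (1, 1, '2020-02-15'), (2, 3, '2020-01-20');
""")

BASE = """
    SELECT t.id,
           COALESCE(d.recent_date, t.date_created) AS recent_date
    FROM threads t
    LEFT JOIN (
        SELECT thread_id, MAX(date_sent) AS recent_date
        FROM comments
        GROUP BY thread_id
    ) d ON d.thread_id = t.id
    ORDER BY {order_by} DESC
"""


def thread_order(order_by):
    """Return thread ids in the order produced by the given ORDER BY target."""
    return [row[0] for row in conn.execute(BASE.format(order_by=order_by))]


by_alias = thread_order("recent_date")     # the SELECT-list expression
by_column = thread_order("d.recent_date")  # the derived table's own column

print(by_alias)   # [2, 1, 3]
print(by_column)  # [1, 3, 2]
```

Thread 2 has no comments, so it sorts first by the alias (its creation date is the newest value) but last by d.recent_date (NULL sorts last in descending order).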

SELECT t.id,t.user_id,t.title,c.comment,d.has_answer,IF(c.id IS NULL, t.date_created, d.recent_date) recent_date,MIN(i.id) image_id
FROM (threads t
LEFT JOIN comments c ON c.thread_id = t.id
INNER JOIN (
SELECT thread_id, MAX(date_sent) recent_date, MAX(is_answer) has_answer
FROM comments
GROUP BY thread_id
) d ON c.id IS NULL OR (d.thread_id = c.thread_id AND d.recent_date = c.date_sent)
LEFT JOIN thread_images i ON t.id = i.thread_id
WHERE t.user_id = t.user_id
GROUP BY t.id
LIMIT 0, 10) x
ORDER BY d.recent_date DESC

Related

How to limit record before group by for pagination?

I have this query that will LEFT JOIN and GROUP BY to get SUM of column.
SELECT
c.id,
SUM(
r.score
) AS score_sum,
SUM(
CASE WHEN r.is_active = '0' THEN r.negative ELSE 0 END
) AS negative_sum
FROM comments AS c
LEFT JOIN rates AS r ON (r.comment_id = c.id)
WHERE r.comment_id = c.id
GROUP BY c.id
DB Fiddle link:
https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=fadba795d8426f91471fa4db83845b6f
The query works, but if the comments table is large (10K rows, for example), I need to implement pagination. How do I modify this query to limit the comments records first, before the GROUP BY?
In short:
Get the first 5 comments with a LIMIT of 5
LEFT JOIN the rates table
Get the SUMs with GROUP BY
For example, show the SUMs for the first 4 comments.
Thanks
You can use a subquery such as "select c.id from comments limit N" in the FROM clause.
select c.id,
sum(r.score) as score_sum,
SUM(
CASE WHEN r.is_active = '0' THEN r.negative ELSE 0 END
) AS negative_sum
from ( select c.id from comments c limit 2) c
LEFT JOIN rates AS r ON (r.comment_id = c.id)
GROUP BY c.id;
You may apply ORDER BY in the subquery to determine the order in which the comments are selected (top N). Without it, which rows LIMIT returns is not guaranteed.
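The limit-first subquery extends naturally to pagination. Here is a minimal sketch under a few assumptions: the rows are made up, `fetch_page` is a hypothetical helper name, and SQLite (via Python's sqlite3) stands in for MySQL. The `LIMIT ... OFFSET ...` form used here is also accepted by MySQL.

```python
import sqlite3

# SQLite stands in for MySQL; the rows below are invented for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (id INTEGER PRIMARY KEY);
    CREATE TABLE rates (comment_id INTEGER, score INTEGER, is_active TEXT, negative INTEGER);
    INSERT INTO comments VALUES (1), (2), (3), (4), (5);
    INSERT INTO rates VALUES
        (1, 5, '1', 0), (1, 3, '0', 2),
        (2, 4, '0', 1),
        (4, 1, '1', 7);
""")

PAGE_QUERY = """
    SELECT c.id,
           SUM(r.score) AS score_sum,
           SUM(CASE WHEN r.is_active = '0' THEN r.negative ELSE 0 END) AS negative_sum
    FROM (
        -- limit the comments first, in a stable order
        SELECT id FROM comments ORDER BY id LIMIT ? OFFSET ?
    ) c
    LEFT JOIN rates r ON r.comment_id = c.id
    GROUP BY c.id
    ORDER BY c.id
"""


def fetch_page(page, page_size):
    """Aggregate rates for one page of comments (page is zero-based)."""
    return conn.execute(PAGE_QUERY, (page_size, page * page_size)).fetchall()


first_page = fetch_page(0, 2)
second_page = fetch_page(1, 2)
print(first_page)   # [(1, 8, 2), (2, 4, 1)]
print(second_page)  # [(3, None, 0), (4, 1, 0)]
```

Comment 3 has no rates, so it still appears on its page with a NULL score_sum; that only works because the LEFT JOIN is not followed by a WHERE filter on the rates columns.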
Try the following:
SELECT
c.id,
SUM(
r.score
) AS score_sum,
SUM(
CASE WHEN r.is_active = '0' THEN r.negative ELSE 0 END
) AS negative_sum
FROM comments AS c
LEFT JOIN rates AS r ON (r.comment_id = c.id)
WHERE r.comment_id = c.id
GROUP BY c.id
ORDER BY c.id ASC
LIMIT 5
The rationale behind the above query is that id is the primary key (hence indexed) in your comments table. Also, your GROUP BY and ORDER BY are on the same column, id, so MySQL can use the index on id to read the rows in order and stop once the LIMIT is satisfied, instead of aggregating the whole table first.
Give it a try! More details here: https://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html
You can confirm this by running EXPLAIN on the query.

Delete an SQL query result set

I'm trying to delete an SQL result set but it won't work:
DELETE FROM votes
WHERE id IN (
SELECT *
FROM votes v
LEFT JOIN comments c ON f.id = v.post_id
GROUP BY v.id
HAVING COUNT(c.comment) = 0 )
It's true that you can't use the table you are deleting from in a direct subselect, but with a little trick (a subselect on a subselect, used as a derived table) you can do it:
DELETE FROM votes
WHERE id IN (
SELECT
t.id
FROM (
SELECT v.id, COUNT(c.comment) cnt
FROM votes v
LEFT JOIN comments c ON f.id = v.post_id
GROUP BY v.id
HAVING COUNT(c.comment) = 0
) t
);
I'm assuming that the rows without comments should be deleted.
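The shape of the nested-subselect delete can be checked on a toy data set. This is a minimal sketch with invented rows, run on SQLite through Python's sqlite3 as a stand-in (SQLite does not have MySQL's restriction, so it only demonstrates which rows the statement selects). The join condition `c.post_id = v.post_id` is an assumption made for the example, since the question's `f` alias is not defined.

```python
import sqlite3

# SQLite stands in for MySQL; schema, rows and the join column are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE votes (id INTEGER PRIMARY KEY, post_id INTEGER);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, comment TEXT);
    INSERT INTO votes VALUES (1, 10), (2, 10), (3, 20), (4, 30);
    -- post 20 has no comments, so vote 3 should be deleted
    INSERT INTO comments VALUES (1, 10, 'first'), (2, 30, 'second');
""")

conn.execute("""
    DELETE FROM votes
    WHERE id IN (
        SELECT t.id
        FROM (
            SELECT v.id
            FROM votes v
            LEFT JOIN comments c ON c.post_id = v.post_id  -- assumed join column
            GROUP BY v.id
            HAVING COUNT(c.comment) = 0
        ) t
    )
""")

remaining_vote_ids = [row[0] for row in conn.execute("SELECT id FROM votes ORDER BY id")]
print(remaining_vote_ids)  # [1, 2, 4]
```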
You are close. Two changes:
A subquery used with WHERE ... IN () can only return one column. Change select * to select v.id.
HAVING COUNT(c.comment) = 0 does pick out the votes that have no comments, because COUNT skips the NULLs that the left join produces for unmatched rows, but the GROUP BY is not needed for that. The simpler form is where c.comment is null (the left join produces NULLs, so c.comment is null means no comment was found).
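Of course this still won't work in MySQL, which does not allow the table being deleted from to appear in a subquery of the same statement (error 1093):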
DELETE FROM votes
WHERE id IN (
SELECT v.id
FROM votes v
LEFT JOIN comments c ON f.id = v.post_id
where c.comment is null)
If I were stuck with MySQL, I would stage the ids in a temporary table first:
CREATE TEMPORARY TABLE temp_ids AS
SELECT v.id
FROM votes v
LEFT JOIN comments c ON f.id = v.post_id -- join condition as written in the question
WHERE c.comment IS NULL;
DELETE FROM votes WHERE id IN (SELECT id FROM temp_ids);
DROP TEMPORARY TABLE temp_ids;
Seems like a silly workaround.

SQL query to check if value doesn't exist in another table

I have a SQL query which does most of what I need it to do but I'm running into a problem.
There are 3 tables in total. entries, entry_meta and votes.
I need to get an entire row from entries when competition_id = 420 in the entry_meta table and the ID either doesn't exist in votes or it does exist but the user_id column value isn't 1.
Here's the query I'm using:
SELECT entries.* FROM entries
INNER JOIN entry_meta ON (entries.ID = entry_meta.entry_id)
WHERE 1=1
AND ( ( entry_meta.meta_key = 'competition_id' AND CAST(entry_meta.meta_value AS CHAR) = '420') )
GROUP BY entries.ID
ORDER BY entries.submission_date DESC
LIMIT 0, 25;
The votes table has 4 columns. vote_id, entry_id, user_id, value.
One option I was thinking of was to SELECT entry_id FROM votes WHERE user_id = 1 and include it in an AND clause in my query. Is this acceptable/efficient?
E.g.
AND entries.ID NOT IN (SELECT entry_id FROM votes WHERE user_id = 1)
A left join with an appropriate where clause might be useful:
SELECT
entries.*
FROM
entries
INNER JOIN entry_meta ON (entries.ID = entry_meta.entry_id)
LEFT JOIN votes ON entries.ID = votes.entry_id
WHERE 1=1
AND (
entry_meta.meta_key = 'competition_id'
AND CAST(entry_meta.meta_value AS CHAR) = '420'
AND votes.entry_id IS NULL -- This will remove any entry with votes
)
GROUP BY entries.ID
ORDER BY entries.submission_date DESC
Here's an implementation of Andrew's suggestion to use exists / not exists.
select
e.*
from
entries e
join entry_meta em on e.ID = em.entry_id
where
em.meta_key = 'competition_id'
and cast(em.meta_value as char) = '420'
and (
not exists (
select 1
from votes v
where
v.entry_id = e.ID
)
or exists (
select 1
from votes v
where
v.entry_id = e.ID
and v.user_id != 1
)
)
group by e.ID
order by e.submission_date desc
limit 0, 25;
Note: it's generally not a good idea to wrap a column in a function inside a WHERE clause, because the function call keeps MySQL from using an ordinary index on that column, but since you're also joining on IDs you should be OK.
Also, the left join suggestion by Barranka may cause the query to return more rows than you are expecting (assuming that there is a 1:many relationship between entries and votes).
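The NOT IN filter from the question and the NOT EXISTS / EXISTS pair above are not interchangeable when an entry has votes from several users, so it may be worth checking which one matches the requirement. The sketch below uses made-up rows, leaves out entry_meta for brevity, and runs on SQLite through Python's sqlite3 as a stand-in for MySQL.

```python
import sqlite3

# SQLite stands in for MySQL; rows are invented, entry_meta is omitted for brevity.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entries (ID INTEGER PRIMARY KEY);
    CREATE TABLE votes (vote_id INTEGER PRIMARY KEY, entry_id INTEGER,
                        user_id INTEGER, value INTEGER);
    INSERT INTO entries VALUES (1), (2), (3), (4);
    INSERT INTO votes (entry_id, user_id, value) VALUES
        (1, 1, 1),              -- entry 1: only user 1 voted
        (2, 1, 1), (2, 2, 1),   -- entry 2: users 1 and 2 voted
        (4, 2, 1);              -- entry 4: only user 2 voted; entry 3 has no votes
""")


def entry_ids(where_clause):
    """Return the entry IDs that satisfy the given WHERE clause."""
    sql = "SELECT e.ID FROM entries e WHERE " + where_clause + " ORDER BY e.ID"
    return [row[0] for row in conn.execute(sql)]


# Entries that user 1 has not voted on.
not_in = entry_ids("e.ID NOT IN (SELECT entry_id FROM votes WHERE user_id = 1)")

# Entries with no votes at all, or with at least one vote from someone else.
exists_pair = entry_ids("""
    NOT EXISTS (SELECT 1 FROM votes v WHERE v.entry_id = e.ID)
    OR EXISTS (SELECT 1 FROM votes v WHERE v.entry_id = e.ID AND v.user_id != 1)
""")

# A single NOT EXISTS that matches the NOT IN result.
not_exists_user = entry_ids(
    "NOT EXISTS (SELECT 1 FROM votes v WHERE v.entry_id = e.ID AND v.user_id = 1)"
)

print(not_in)           # [3, 4]
print(exists_pair)      # [2, 3, 4]
print(not_exists_user)  # [3, 4]
```

Entry 2 is the difference: user 1 voted on it, but so did user 2, so the EXISTS pair keeps it while NOT IN drops it.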

Slow query execution time

SELECT p.id,
p.title,
p.slug,
p.content,
(SELECT url
FROM gallery
WHERE postid = p.id
LIMIT 1) AS url,
t.name
FROM posts AS p
INNER JOIN termrel AS tr
ON ( tr.object = p.id )
INNER JOIN termtax AS tx
ON ( tx.id = tr.termtax_id )
INNER JOIN terms AS t
ON ( t.id = tx.term_id )
WHERE tx.taxonomy_id = 3
AND p.post_status IS NULL
ORDER BY t.name ASC
This query takes about 0.2407s to execute. How can I make it faster?
Correlated subqueries can have subpar performance as they are executed row by row.
To solve this, move your correlated subquery into a regular subquery/derived table and join to it. It will then not have to execute row by row for the entire returned result set, as it will be evaluated once, before the outer select.
MySQL-specific links that confirm correlated subqueries are not optimal choices in MySQL:
How to optimize
Answer indicating MySQL is notoriously bad at optimizing correlated subqueries
I use SQL Server, but I'm sure the principle is the same for MySQL, so I hope this at least points you in the right direction. The derived table needs to return exactly one result per post; maybe someone could chime in on MySQL-specific syntax and I could update my answer.
select
p.id
,p.title
,p.slug
,p.content
,t.name
,mySubquery.url
from
posts as p
inner join termrel as tr
on ( tr.object = p.id )
inner join termtax as tx
on ( tx.id = tr.termtax_id )
inner join terms as t
on ( t.id = tx.term_id )
left join (
-- one row per post: MIN(url) picks a single url for each postid
select
postid
,min(url) as url
from
gallery
group by
postid
) as mySubquery
on mySubquery.postid = p.id
where
tx.taxonomy_id = 3
and p.post_status is null
order by
t.name asc
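To check that a joined derived table returns the same rows as the correlated subquery, the two forms can be compared on a small data set. This is a minimal sketch with invented rows and only the posts and gallery tables; it uses MIN(url) in both forms so the chosen url is deterministic (the original LIMIT 1 without ORDER BY picks an arbitrary row), and it runs on SQLite through Python's sqlite3 as a stand-in for MySQL. It says nothing about timing, only about equivalence of results.

```python
import sqlite3

# SQLite stands in for MySQL; rows are invented and the term tables are omitted.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE gallery (id INTEGER PRIMARY KEY, postid INTEGER, url TEXT);
    INSERT INTO posts VALUES (1, 'a'), (2, 'b'), (3, 'c');
    -- post 1 has two images, post 2 has none
    INSERT INTO gallery (postid, url) VALUES (1, 'x.png'), (1, 'y.png'), (3, 'z.png');
""")

# Correlated form: the inner query runs once per post.
correlated = conn.execute("""
    SELECT p.id,
           (SELECT MIN(g.url) FROM gallery g WHERE g.postid = p.id) AS url
    FROM posts p
    ORDER BY p.id
""").fetchall()

# Derived-table form: gallery is grouped once, then joined.
joined = conn.execute("""
    SELECT p.id, g.url
    FROM posts p
    LEFT JOIN (
        SELECT postid, MIN(url) AS url
        FROM gallery
        GROUP BY postid
    ) g ON g.postid = p.id
    ORDER BY p.id
""").fetchall()

print(correlated)  # [(1, 'x.png'), (2, None), (3, 'z.png')]
print(joined)      # [(1, 'x.png'), (2, None), (3, 'z.png')]
```

The LEFT JOIN matters: an inner join would drop post 2, which the correlated version returns with a NULL url.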

mysql limit join - is there a more efficient way of doing this?

I have three tables - tblpollquestions, tblpollanswers and tblpollresponses.
I want to select a random question that a user hasn't responded to yet, with the respective answers.
The SQL below returns exactly what I need, but I'm concerned that it takes three SELECTs to do it. There must surely be a more efficient way?
SELECT
poll.id,
poll.question,
a.answer
FROM tblpollquestions poll
INNER JOIN tblpollanswers a ON a.question_id = poll.id
INNER JOIN (
SELECT id FROM tblpollquestions WHERE id NOT IN(
SELECT question_id FROM tblpollresponses WHERE user_id = 1
) ORDER BY RAND() LIMIT 1
) as t ON t.id = poll.id
This could be made a bit better by switching NOT IN(SELECT...) into LEFT JOIN
SELECT
poll.id,
poll.question,
a.answer
FROM
tblpollquestions poll
INNER JOIN
tblpollanswers a
ON
a.question_id = poll.id
INNER JOIN (
SELECT
q.id
FROM
tblpollquestions AS q
LEFT JOIN
tblpollresponses AS r
ON
q.id = r.question_id
AND r.user_id = 1
WHERE
r.question_id IS NULL
ORDER BY RAND() LIMIT 1
) as t ON t.id = poll.id
ORDER BY RAND() can also be slow if there are many rows in tblpollquestions table. See this presentation from Bill Karwin (slide 142 and onwards) for some other ideas on selecting a random row.
http://www.slideshare.net/billkarwin/sql-antipatterns-strike-back
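One commonly used alternative to ORDER BY RAND() is to count the candidate rows, pick a random offset in application code, and fetch a single row at that offset. The following is a minimal sketch of that idea, not a tuned implementation: the rows are made up, `random_unanswered_question` is a hypothetical helper name, and SQLite (via Python's sqlite3) stands in for MySQL.

```python
import random
import sqlite3

# SQLite stands in for MySQL; the rows are invented for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblpollquestions (id INTEGER PRIMARY KEY, question TEXT);
    CREATE TABLE tblpollresponses (user_id INTEGER, question_id INTEGER);
    INSERT INTO tblpollquestions VALUES (1, 'q1'), (2, 'q2'), (3, 'q3'), (4, 'q4');
    INSERT INTO tblpollresponses VALUES (1, 2), (1, 4), (2, 1);
""")

# Shared FROM/WHERE part: questions the given user has not responded to.
UNANSWERED = """
    FROM tblpollquestions q
    LEFT JOIN tblpollresponses r
           ON r.question_id = q.id AND r.user_id = ?
    WHERE r.question_id IS NULL
"""


def random_unanswered_question(user_id, rng=random):
    """Pick one unanswered question id via COUNT plus a random OFFSET, or None."""
    (total,) = conn.execute("SELECT COUNT(*) " + UNANSWERED, (user_id,)).fetchone()
    if total == 0:
        return None
    offset = rng.randrange(total)
    row = conn.execute(
        "SELECT q.id " + UNANSWERED + " ORDER BY q.id LIMIT 1 OFFSET ?",
        (user_id, offset),
    ).fetchone()
    return row[0]


# User 1 has answered questions 2 and 4, so the result is 1 or 3.
print(random_unanswered_question(1))
```

This costs two queries instead of one, and the database still has to skip over `offset` rows, so whether it beats ORDER BY RAND() on a given table is something to measure rather than assume.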
It seems fine to me, although I would change it slightly:
SELECT
poll.id,
poll.question,
a.answer
FROM tblpollquestions poll
INNER JOIN tblpollanswers a ON a.question_id = poll.id
WHERE poll.id = (
SELECT id FROM tblpollquestions WHERE NOT EXISTS (
SELECT * FROM tblpollresponses WHERE user_id = 1 AND question_id = tblpollquestions.id )
ORDER BY RAND() LIMIT 1)
Written that way, it should do a better job of using indexes, and it avoids checking the join conditions for every single row in tblpollanswers.
Make sure you have a UNIQUE index (or primary key) on tblpollresponses for (user_id, question_id) (in that order). If you need it for other queries, you can add an additional UNIQUE index with the columns in the reverse order.
Edit: Actually, putting it in the WHERE clause might not be so good: http://jan.kneschke.de/projects/mysql/order-by-rand/ You will need to run EXPLAIN on both queries and compare.
Use a left join like this:
SELECT ques.id, ques.question, ans.answer FROM tblpollquestions ques
INNER JOIN tblpollanswers ans ON (ans.question_id = ques.id)
LEFT JOIN tblpollresponses res ON (res.question_id = ques.id AND res.user_id = 1)
WHERE res.question_id IS NULL ORDER BY RAND() LIMIT 1;
I changed your table aliases to make better sense.
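One thing that may be worth checking with the flat form is where the LIMIT applies: on the joined rows it yields a single answer row, while inside a derived table it selects one question and the outer join then returns all of that question's answers. The sketch below compares the two placements on invented rows, using SQLite through Python's sqlite3 as a stand-in for MySQL (RANDOM() is SQLite's counterpart of RAND()).

```python
import sqlite3

# SQLite stands in for MySQL; the rows are invented for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblpollquestions (id INTEGER PRIMARY KEY, question TEXT);
    CREATE TABLE tblpollanswers (id INTEGER PRIMARY KEY, question_id INTEGER, answer TEXT);
    CREATE TABLE tblpollresponses (user_id INTEGER, question_id INTEGER);
    INSERT INTO tblpollquestions VALUES (1, 'q1'), (2, 'q2');
    INSERT INTO tblpollanswers (question_id, answer) VALUES
        (1, 'a'), (1, 'b'), (1, 'c'),
        (2, 'd'), (2, 'e'), (2, 'f');
    -- user 1 already answered question 2, so only question 1 is eligible
    INSERT INTO tblpollresponses VALUES (1, 2);
""")

# LIMIT applied to the joined question/answer rows.
limit_on_join = conn.execute("""
    SELECT ques.id, ques.question, ans.answer
    FROM tblpollquestions ques
    INNER JOIN tblpollanswers ans ON ans.question_id = ques.id
    LEFT JOIN tblpollresponses res
           ON res.question_id = ques.id AND res.user_id = 1
    WHERE res.question_id IS NULL
    ORDER BY RANDOM() LIMIT 1
""").fetchall()

# LIMIT applied to the question choice only, inside a derived table.
limit_in_derived_table = conn.execute("""
    SELECT poll.id, poll.question, a.answer
    FROM tblpollquestions poll
    INNER JOIN tblpollanswers a ON a.question_id = poll.id
    INNER JOIN (
        SELECT q.id
        FROM tblpollquestions q
        LEFT JOIN tblpollresponses r
               ON r.question_id = q.id AND r.user_id = 1
        WHERE r.question_id IS NULL
        ORDER BY RANDOM() LIMIT 1
    ) t ON t.id = poll.id
""").fetchall()

print(len(limit_on_join))           # 1
print(len(limit_in_derived_table))  # 3
```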