Delete an SQL query result set - mysql

I'm trying to delete an SQL result set but it won't work:
DELETE FROM votes
WHERE id IN (
SELECT *
FROM votes v
LEFT JOIN comments c ON f.id = v.post_id
GROUP BY v.id
HAVING COUNT(c.comment) = 0 )

It's true that MySQL won't let you select from the table you are deleting from in a direct subquery, but a small trick gets around it: wrap the subquery in another subquery, so that it becomes a derived table:
DELETE FROM votes
WHERE id IN (
SELECT
t.id
FROM (
SELECT v.id, COUNT(c.comment) cnt
FROM votes v
LEFT JOIN comments c ON c.post_id = v.post_id -- assumed join column; the question's alias f is undefined
GROUP BY v.id
HAVING COUNT(c.comment) = 0
) t
);
I'm assuming that the rows without comments should be deleted.

You are close. Two changes:
A subquery used with WHERE ... IN (...) can only return one column. Change SELECT * to SELECT v.id.
HAVING COUNT(c.comment) = 0 is a roundabout way to find votes with no comments. I suspect that with the LEFT JOIN you are going for votes that have 0 comments? Right idea with the LEFT JOIN, but you can skip the GROUP BY and filter on WHERE c.comment IS NULL instead: a LEFT JOIN produces NULLs for unmatched rows, so c.comment IS NULL means no comment was found.
Of course, this still won't work in MySQL, because the subquery reads from the table being deleted from:
DELETE FROM votes
WHERE id IN (
SELECT v.id
FROM votes v
LEFT JOIN comments c ON c.post_id = v.post_id -- assumed join column; the question's alias f is undefined
WHERE c.comment IS NULL)
If I were stuck with this in MySQL, I would stage the ids in a temporary table first (untested; the join column c.post_id is a guess, because the question's alias f is undefined):
CREATE TEMPORARY TABLE tmp_vote_ids AS
SELECT v.id
FROM votes v
LEFT JOIN comments c ON c.post_id = v.post_id
WHERE c.comment IS NULL;
DELETE FROM votes WHERE id IN (SELECT id FROM tmp_vote_ids);
DROP TEMPORARY TABLE tmp_vote_ids;
It seems like a silly workaround.
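The temporary-table idea can be tried end to end with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the schema (votes.post_id, comments.post_id, comments.id) and the sample rows are assumptions made up for illustration, since the question never shows the real tables.

```python
import sqlite3

# Hypothetical schema and data: votes(id, post_id), comments(id, post_id, comment).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE votes (id INTEGER PRIMARY KEY, post_id INTEGER);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, comment TEXT);
    INSERT INTO votes VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO comments VALUES (1, 10, 'first'), (2, 10, 'second'), (3, 30, 'third');
""")


def delete_votes_without_comments(conn):
    """Stage the ids of comment-less votes in a temp table, then delete by that list."""
    conn.executescript("""
        CREATE TEMPORARY TABLE tmp_vote_ids AS
            SELECT v.id
            FROM votes v
            LEFT JOIN comments c ON c.post_id = v.post_id
            WHERE c.id IS NULL;  -- the LEFT JOIN found no comment row
        DELETE FROM votes WHERE id IN (SELECT id FROM tmp_vote_ids);
        DROP TABLE tmp_vote_ids;
    """)


delete_votes_without_comments(conn)
remaining_vote_ids = [row[0] for row in conn.execute("SELECT id FROM votes ORDER BY id")]
print(remaining_vote_ids)
```

Only the vote on post 20, which has no comments in the sample data, is removed. SQLite does not share MySQL's restriction on selecting from the table being deleted from, so this checks the logic of the anti-join rather than the MySQL error itself.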

Related

Why is this query so slow and what can I do about it

I have the following SELECT UPDATE statement from MySQL
UPDATE table_Learning l
INNER JOIN (select ULN, id from table_users group by ULN having count(ULN) =1) u
ON l.ULN = u.ULN
set l.user_id=u.id
WHERE l.user_id is null
The problem is, it is so slow that it times out, and basically does not work.
I am sure it is to do with the line:
INNER JOIN (select ULN, id from table_users group by ULN having count(ULN) =1) u
and specifically because there is both a GROUP BY and a HAVING clause in this inner select, and from what I have read, because INNER JOINS are very slow with MySQL.
My overall aim is to:
Populate the userID's that are null in table_learning
To do so using the userID's in table_users
To Join on the field named ULN in both tables
To only populate the fields where the ULN is unique in table_users; that is, if more than one user has this ULN, do not populate the user_id in table_learning
This is your query:
UPDATE table_Learning l INNER JOIN
(select ULN, id
from table_users
group by ULN
having count(ULN) = 1
) u
ON l.ULN = u.ULN
set l.user_id=u.id
WHERE l.user_id is null;
In MySQL, the subquery is going to be expensive. An index on table_learning(user_id) might help a bit. But filtering inside the subquery could also help:
UPDATE table_Learning l INNER JOIN
(select ULN, id
from table_users tu
where exists (select 1
from table_learning tl
where tl.ULN = tu.ULN and tl.user_id is null
)
group by ULN
having count(ULN) = 1
) u
ON l.ULN = u.ULN
set l.user_id=u.id
WHERE l.user_id is null;
For this, you want a composite index on table_learning(ULN, user_id).
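To check that a rewrite still meets the stated aim (fill user_id only where the ULN belongs to exactly one user), it helps to run it against a few rows first. The sketch below uses Python's sqlite3 as a stand-in for MySQL and swaps the UPDATE ... JOIN for correlated subqueries, which is a different technique from the answer above. The sample rows and the index name idx_users_uln are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_users (id INTEGER PRIMARY KEY, ULN TEXT);
    CREATE TABLE table_learning (id INTEGER PRIMARY KEY, ULN TEXT, user_id INTEGER);
    -- Hypothetical index so the per-row lookups on ULN do not scan table_users.
    CREATE INDEX idx_users_uln ON table_users (ULN);
    INSERT INTO table_users VALUES (1, 'A'), (2, 'B'), (3, 'B');
    INSERT INTO table_learning VALUES
        (1, 'A', NULL),  -- ULN 'A' is unique: should be filled with user 1
        (2, 'B', NULL),  -- ULN 'B' is shared by two users: must stay NULL
        (3, 'A', 99),    -- already populated: must not change
        (4, 'C', NULL);  -- no matching user: must stay NULL
""")

conn.execute("""
    UPDATE table_learning
    SET user_id = (SELECT MIN(u.id) FROM table_users u
                   WHERE u.ULN = table_learning.ULN)
    WHERE user_id IS NULL
      AND (SELECT COUNT(*) FROM table_users u
           WHERE u.ULN = table_learning.ULN) = 1
""")

learning_rows = conn.execute(
    "SELECT id, ULN, user_id FROM table_learning ORDER BY id"
).fetchall()
print(learning_rows)
```

Whether this form is faster than the join in MySQL depends on the data and the indexes, so compare the EXPLAIN output of both before choosing.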

Duplicated rows

SQL Query:
SELECT
T.*,
U.nick AS author_nick,
P.id AS post_id,
P.name AS post_name,
P.author AS post_author_id,
P.date AS post_date,
U2.nick AS post_author
FROM
zero_topics T
LEFT JOIN
zero_posts P
ON
T.id = P.topic_id
LEFT JOIN
zero_players U
ON
T.author = U.uuid
LEFT JOIN
zero_players U2
ON
P.author = U2.uuid
ORDER BY
CASE
WHEN P.date is null THEN T.date
ELSE P.date
END DESC
Output: [screenshots of the result set and of the topics and posts tables, not reproduced here]
Question: Why is topic id 22 duplicated? I have two topics in MySQL (ids 22 and 23) and two posts (ids 24 and 25). I want to see each topic with its last post only.
If a join produces multiple matches and you want at most one, you have to rewrite the join and/or the filtering criteria to provide that result. Picking only the latest of all the matches is doable, and reasonably easy once you have used the pattern a few times.
select a.Data, b.Data
from Table1 a
left join Table2 b
on b.JoinValue = a.JoinValue
and b.DateField =(
select Max( DateField )
from Table2
where JoinValue = b.JoinValue );
The correlated subquery pulls out the one date that is the highest (most recent) value of all the joinable candidates. That then becomes the row that takes part in the join -- or, of course, nothing if there are no candidates at all. This is a pattern I use quite a lot.
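Applied to the tables from the question, the pattern might look like the sketch below. It runs on Python's sqlite3 as a stand-in for MySQL, keeps only a few of the columns, and assumes that both posts (24 and 25) belong to topic 22, which is what the duplicated row suggests. The names and dates in the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE zero_topics (id INTEGER PRIMARY KEY, name TEXT, date TEXT);
    CREATE TABLE zero_posts (id INTEGER PRIMARY KEY, topic_id INTEGER, name TEXT, date TEXT);
    INSERT INTO zero_topics VALUES
        (22, 'first topic', '2015-01-01'),
        (23, 'second topic', '2015-01-02');
    INSERT INTO zero_posts VALUES
        (24, 22, 'older post', '2015-01-03'),
        (25, 22, 'newer post', '2015-01-04');
""")

# Join each topic only to its most recent post; topics without posts keep NULLs.
latest_post_per_topic = conn.execute("""
    SELECT T.id, P.id
    FROM zero_topics T
    LEFT JOIN zero_posts P
           ON P.topic_id = T.id
          AND P.date = (SELECT MAX(P2.date)
                        FROM zero_posts P2
                        WHERE P2.topic_id = T.id)
    ORDER BY COALESCE(P.date, T.date) DESC
""").fetchall()
print(latest_post_per_topic)
```

Topic 22 now appears once, paired with post 25, and topic 23 appears once with no post. If two posts in a topic can share the same date, the join still returns both, so a tie-breaker such as the highest post id would be needed.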

SQL query to check if value doesn't exist in another table

I have a SQL query which does most of what I need it to do but I'm running into a problem.
There are 3 tables in total. entries, entry_meta and votes.
I need to get an entire row from entries when competition_id = 420 in the entry_meta table and the ID either doesn't exist in votes or it does exist but the user_id column value isn't 1.
Here's the query I'm using:
SELECT entries.* FROM entries
INNER JOIN entry_meta ON (entries.ID = entry_meta.entry_id)
WHERE 1=1
AND ( ( entry_meta.meta_key = 'competition_id' AND CAST(entry_meta.meta_value AS CHAR) = '420') )
GROUP BY entries.ID
ORDER BY entries.submission_date DESC
LIMIT 0, 25;
The votes table has 4 columns. vote_id, entry_id, user_id, value.
One option I was thinking of was to SELECT entry_id FROM votes WHERE user_id = 1 and include it in an AND clause in my query. Is this acceptable/efficient?
E.g.
AND entries.ID NOT IN (SELECT entry_id FROM votes WHERE user_id = 1)
A left join with an appropriate where clause might be useful:
SELECT
entries.*
FROM
entries
INNER JOIN entry_meta ON (entries.ID = entry_meta.entry_id)
LEFT JOIN votes ON entries.ID = votes.entry_id
WHERE 1=1
AND (
entry_meta.meta_key = 'competition_id'
AND CAST(entry_meta.meta_value AS CHAR) = '420'
AND votes.entry_id IS NULL -- This will remove any entry with votes
)
GROUP BY entries.ID
ORDER BY entries.submission_date DESC
Here's an implementation of Andrew's suggestion to use exists / not exists.
select
e.*
from
entries e
join entry_meta em on e.ID = em.entry_id
where
em.meta_key = 'competition_id'
and cast(em.meta_value as char) = '420'
and (
not exists (
select 1
from votes v
where
v.entry_id = e.ID
)
or exists (
select 1
from votes v
where
v.entry_id = e.ID
and v.user_id != 1
)
)
group by e.ID
order by e.submission_date desc
limit 0, 25;
Note: it's generally not a good idea to wrap a column in a function inside a WHERE clause, because it usually prevents an index on that column from being used, but since you're also joining on IDs you should be OK.
Also, the left join suggestion by Barranka may cause the query to return more rows than you are expecting (assuming that there is a 1:many relationship between entries and votes).
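The requirement can also be read as "entries that user 1 has not voted on", which is what the NOT IN idea in the question expresses. Under that reading a single NOT EXISTS is enough. The sketch below uses Python's sqlite3 as a stand-in for MySQL; the function name entries_not_voted_by and the sample rows are hypothetical, and only the columns mentioned in the question are modelled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entries (ID INTEGER PRIMARY KEY, submission_date TEXT);
    CREATE TABLE entry_meta (entry_id INTEGER, meta_key TEXT, meta_value TEXT);
    CREATE TABLE votes (vote_id INTEGER PRIMARY KEY, entry_id INTEGER,
                        user_id INTEGER, value INTEGER);
    INSERT INTO entries VALUES
        (1, '2015-01-01'), (2, '2015-01-02'), (3, '2015-01-03'), (4, '2015-01-04');
    INSERT INTO entry_meta VALUES
        (1, 'competition_id', '420'), (2, 'competition_id', '420'),
        (3, 'competition_id', '420'), (4, 'competition_id', '7');
    -- Entry 1: votes from users 1 and 2. Entry 2: vote from user 2 only. Entry 3: none.
    INSERT INTO votes (entry_id, user_id, value) VALUES (1, 1, 5), (1, 2, 3), (2, 2, 4);
""")


def entries_not_voted_by(conn, competition_id, user_id):
    """Ids of entries in a competition that the given user has not voted on."""
    rows = conn.execute("""
        SELECT e.ID
        FROM entries e
        JOIN entry_meta em ON em.entry_id = e.ID
        WHERE em.meta_key = 'competition_id'
          AND em.meta_value = ?
          AND NOT EXISTS (
              SELECT 1 FROM votes v
              WHERE v.entry_id = e.ID AND v.user_id = ?
          )
        ORDER BY e.submission_date DESC
        LIMIT 25
    """, (str(competition_id), user_id))
    return [row[0] for row in rows]


unvoted = entries_not_voted_by(conn, 420, 1)
print(unvoted)
```

Entry 1 is excluded here because user 1 voted on it, even though user 2 did as well. The EXISTS ... user_id != 1 branch in the query above would keep it, so pick the form that matches the intended meaning.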

mysql limit join - is there a more efficient way of doing this?

I have three tables - tblpollquestions, tblpollanswers and tblpollresponses.
I want to select a random question that a user hasn't responded to yet, with the respective answers.
The SQL below returns exactly what I need, but I'm concerned that it takes three SELECTs to do it. There must surely be a more efficient way?
SELECT
poll.id,
poll.question,
a.answer
FROM tblpollquestions poll
INNER JOIN tblpollanswers a ON a.question_id = poll.id
INNER JOIN (
SELECT id FROM tblpollquestions WHERE id NOT IN(
SELECT question_id FROM tblpollresponses WHERE user_id = 1
) ORDER BY RAND() LIMIT 1
) as t ON t.id = poll.id
This could be made a bit better by switching NOT IN(SELECT...) into LEFT JOIN
SELECT
poll.id,
poll.question,
a.answer
FROM
tblpollquestions poll
INNER JOIN
tblpollanswers a
ON
a.question_id = poll.id
INNER JOIN (
SELECT
q.id
FROM
tblpollquestions AS q
LEFT JOIN
tblpollresponses AS r
ON
q.id = r.question_id
AND r.user_id = 1
WHERE
r.question_id IS NULL
ORDER BY RAND() LIMIT 1
) as t ON t.id = poll.id
ORDER BY RAND() can also be slow if there are many rows in the tblpollquestions table. See this presentation from Bill Karwin (slide 142 and onwards) for some other ideas on selecting a random row.
http://www.slideshare.net/billkarwin/sql-antipatterns-strike-back
It seems fine to me, although I would change it slightly:
SELECT
poll.id,
poll.question,
a.answer
FROM tblpollquestions poll
INNER JOIN tblpollanswers a ON a.question_id = poll.id
WHERE poll.id = (
SELECT id FROM tblpollquestions WHERE NOT EXISTS (
SELECT * FROM tblpollresponses WHERE user_id = 1 AND question_id = tblpollquestions.id )
ORDER BY RAND() LIMIT 1)
Written that way, it should do a better job of using indexes and avoid checking the join conditions for every single tblpollanswers row.
Make sure you have a UNIQUE index (or primary key) on tblpollresponses for (user_id, question_id) (in that order). If you need it for other queries, you can add an additional UNIQUE index with the columns in the reverse order.
Edit: Actually, putting it in the WHERE clause might not be so good: see http://jan.kneschke.de/projects/mysql/order-by-rand/ You will need to run EXPLAIN on both queries and compare.
Use a left join like this:
SELECT ques.id, ques.question, ans.answer FROM tblpollquestions ques
INNER JOIN tblpollanswers ans ON (ans.question_id = ques.id)
LEFT JOIN tblpollresponses res ON (res.question_id = ques.id AND res.user_id = 1)
WHERE res.question_id IS NULL ORDER BY RAND() LIMIT 1;
I changed your table aliases to make better sense. Note that LIMIT 1 here applies to the joined rows, so this returns a single answer row, not every answer for the chosen question.
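To get every answer for the randomly chosen question, the LIMIT 1 has to sit in a derived table that picks the question, with the answers joined outside it. The sketch below shows that shape using Python's sqlite3 as a stand-in for MySQL (SQLite spells the function RANDOM() rather than RAND()). The function name random_unanswered_question, the id column on tblpollanswers, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblpollquestions (id INTEGER PRIMARY KEY, question TEXT);
    CREATE TABLE tblpollanswers (id INTEGER PRIMARY KEY, question_id INTEGER, answer TEXT);
    CREATE TABLE tblpollresponses (user_id INTEGER, question_id INTEGER,
                                   UNIQUE (user_id, question_id));
    INSERT INTO tblpollquestions VALUES (1, 'Tabs or spaces?'), (2, 'Vim or Emacs?');
    INSERT INTO tblpollanswers (question_id, answer) VALUES
        (1, 'Tabs'), (1, 'Spaces'), (2, 'Vim'), (2, 'Emacs');
    INSERT INTO tblpollresponses VALUES (1, 1);  -- user 1 already answered question 1
""")


def random_unanswered_question(conn, user_id):
    """Pick one question the user has not answered and return all of its answers."""
    return conn.execute("""
        SELECT poll.id, poll.question, a.answer
        FROM tblpollquestions poll
        JOIN tblpollanswers a ON a.question_id = poll.id
        JOIN (
            SELECT q.id
            FROM tblpollquestions q
            LEFT JOIN tblpollresponses r
                   ON r.question_id = q.id AND r.user_id = ?
            WHERE r.question_id IS NULL
            ORDER BY RANDOM()  -- SQLite's counterpart of MySQL's RAND()
            LIMIT 1            -- limits questions, not answer rows
        ) t ON t.id = poll.id
        ORDER BY a.id
    """, (user_id,)).fetchall()


rows = random_unanswered_question(conn, 1)
print(rows)
```

With the sample data only question 2 is left for user 1, so both of its answers come back regardless of the random choice.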

mysql: multiple join problem

I'm trying to select from a table with multiple joins: one for the number of comments using COUNT, and one for the total vote value using SUM. The problem is that the two joins affect each other. Instead of showing:
3 votes 2 comments
I get 3 * 2 = 6 votes and 2 * 3 comments
This is the query I'm using:
SELECT t.*, COUNT(c.id) as comments, COALESCE(SUM(v.vote), 0) as votes
FROM (topics t)
LEFT JOIN comments c ON c.topic_id = t.id
LEFT JOIN votes v ON v.topic_id = t.id
WHERE t.id = 9
What you're doing is an SQL antipattern that I call Goldberg Machine. Why make the problem so much harder by forcing it to be done in a single SQL query?
Here is how I would really solve this problem:
SELECT t.*, COUNT(c.id) as comments
FROM topics t
LEFT JOIN comments c ON c.topic_id = t.id
WHERE t.id = 9;
SELECT t.*, SUM(v.vote) as votes
FROM topics t
LEFT JOIN votes v ON v.topic_id = t.id
WHERE t.id = 9;
As you have found, combining these two into one query results in a Cartesian product. There may be clever and subtle ways to force it to give you the correct answer in one query, but what happens when you need a third statistic? It's much simpler to do it in two queries.
SELECT t.*, COUNT(c.id) as comments, COALESCE(SUM(v.vote), 0) as votes
FROM (topics t)
LEFT JOIN comments c ON c.topic_id = t.id
LEFT JOIN votes v ON v.topic_id = t.id
WHERE t.id = 9
GROUP BY t.id
or perhaps
SELECT `topics`.*,
(
SELECT COUNT(*)
FROM `comments`
WHERE `topic_id` = `topics`.`id`
) AS `num_comments`,
(
SELECT IFNULL(SUM(`vote`), 0)
FROM `votes`
WHERE `topic_id` = `topics`.`id`
) AS `vote_total`
FROM `topics`
WHERE `id` = 9
SELECT t.*, COUNT(DISTINCT c.id) as comments, COALESCE(SUM(v.vote), 0) as votes
FROM (topics t)
LEFT JOIN comments c ON c.topic_id = t.id
LEFT JOIN votes v ON v.topic_id = t.id
WHERE t.id = 9
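For comparison, here is a sketch that reproduces the row multiplication and then avoids it by aggregating each child table in its own derived table before joining. It uses Python's sqlite3 as a stand-in for MySQL, and the title column and sample rows (2 comments, 3 votes of 1 each) are assumptions chosen to match the numbers in the question. Note that adding GROUP BY t.id alone leaves both figures inflated, and COUNT(DISTINCT c.id) corrects the comment count but not the vote sum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE topics (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, topic_id INTEGER);
    CREATE TABLE votes (id INTEGER PRIMARY KEY, topic_id INTEGER, vote INTEGER);
    INSERT INTO topics VALUES (9, 'example');
    INSERT INTO comments (topic_id) VALUES (9), (9);
    INSERT INTO votes (topic_id, vote) VALUES (9, 1), (9, 1), (9, 1);
""")

# Joining both child tables directly yields 2 x 3 = 6 rows for topic 9.
naive = conn.execute("""
    SELECT COUNT(c.id), COALESCE(SUM(v.vote), 0)
    FROM topics t
    LEFT JOIN comments c ON c.topic_id = t.id
    LEFT JOIN votes v ON v.topic_id = t.id
    WHERE t.id = 9
""").fetchone()

# Aggregating first gives at most one row per topic from each side of the join.
pre_aggregated = conn.execute("""
    SELECT COALESCE(c.comments, 0), COALESCE(v.votes, 0)
    FROM topics t
    LEFT JOIN (SELECT topic_id, COUNT(*) AS comments
               FROM comments GROUP BY topic_id) c ON c.topic_id = t.id
    LEFT JOIN (SELECT topic_id, SUM(vote) AS votes
               FROM votes GROUP BY topic_id) v ON v.topic_id = t.id
    WHERE t.id = 9
""").fetchone()

print(naive, pre_aggregated)
```

The correlated-subquery version shown earlier in this thread gives the same correct figures; which of the two performs better in MySQL depends on how many topics are selected at once.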