I have a SQL query which does most of what I need it to do but I'm running into a problem.
There are 3 tables in total: entries, entry_meta and votes.
I need to get an entire row from entries when competition_id = 420 in the entry_meta table and the entry's ID either doesn't exist in votes, or it does exist but the user_id column value isn't 1.
Here's the query I'm using:
SELECT entries.* FROM entries
INNER JOIN entry_meta ON (entries.ID = entry_meta.entry_id)
WHERE 1=1
AND ( ( entry_meta.meta_key = 'competition_id' AND CAST(entry_meta.meta_value AS CHAR) = '420') )
GROUP BY entries.ID
ORDER BY entries.submission_date DESC
LIMIT 0, 25;
The votes table has 4 columns: vote_id, entry_id, user_id and value.
One option I was thinking of was to SELECT entry_id FROM votes WHERE user_id = 1 and include it in an AND clause in my query. Is this acceptable/efficient?
E.g.
AND entries.ID NOT IN (SELECT entry_id FROM votes WHERE user_id = 1)
A left join with an appropriate where clause might be useful:
SELECT
entries.*
FROM
entries
INNER JOIN entry_meta ON (entries.ID = entry_meta.entry_id)
LEFT JOIN votes ON entries.ID = votes.entry_id
WHERE 1=1
AND (
entry_meta.meta_key = 'competition_id'
AND CAST(entry_meta.meta_value AS CHAR) = '420'
AND votes.entry_id IS NULL -- This will remove any entry with votes
)
GROUP BY entries.ID
ORDER BY entries.submission_date DESC
Here's an implementation of Andrew's suggestion to use exists / not exists.
select
e.*
from
entries e
join entry_meta em on e.ID = em.entry_id
where
em.meta_key = 'competition_id'
and cast(em.meta_value as char) = '420'
and (
not exists (
select 1
from votes v
where
v.entry_id = e.ID
)
or exists (
select 1
from votes v
where
v.entry_id = e.ID
and v.user_id != 1
)
)
group by e.ID
order by e.submission_date desc
limit 0, 25;
Note: it's generally not a good idea to wrap a column in a function inside a where clause, because it can prevent an index on that column from being used, but since you're also joining on IDs you should be OK.
Also, the left join suggestion by Barranka may cause the query to return more rows than you are expecting (assuming that there is a 1:many relationship between entries and votes).
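If the goal is simply "entries that user 1 has not voted on", a single NOT EXISTS that filters on user_id inside the subquery may be enough. Here is a minimal sketch of that idea, run against an in-memory SQLite database through Python's sqlite3 module rather than against MySQL. The table and column names follow the question; the sample rows and the helper name unvoted_entries are made up for illustration.

```python
import sqlite3

def unvoted_entries(conn, competition_id, user_id):
    """Return IDs of entries in a competition that user_id has not voted on."""
    sql = """
        SELECT e.ID
        FROM entries e
        JOIN entry_meta em ON em.entry_id = e.ID
        WHERE em.meta_key = 'competition_id'
          AND em.meta_value = ?
          AND NOT EXISTS (
                SELECT 1 FROM votes v
                WHERE v.entry_id = e.ID AND v.user_id = ?
          )
        ORDER BY e.submission_date DESC
    """
    return [row[0] for row in conn.execute(sql, (str(competition_id), user_id))]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entries (ID INTEGER PRIMARY KEY, submission_date TEXT);
    CREATE TABLE entry_meta (entry_id INTEGER, meta_key TEXT, meta_value TEXT);
    CREATE TABLE votes (vote_id INTEGER PRIMARY KEY, entry_id INTEGER,
                        user_id INTEGER, value INTEGER);
    INSERT INTO entries VALUES (1, '2014-01-01'), (2, '2014-01-02'),
                               (3, '2014-01-03'), (4, '2014-01-04');
    INSERT INTO entry_meta VALUES (1, 'competition_id', '420'),
                                  (2, 'competition_id', '420'),
                                  (3, 'competition_id', '420'),
                                  (4, 'competition_id', '7');
    -- entry 1: no votes; entry 2: users 1 and 2 voted; entry 3: only user 2 voted
    INSERT INTO votes (entry_id, user_id, value) VALUES (2, 1, 1), (2, 2, 1), (3, 2, 1);
""")

result = unvoted_entries(conn, 420, 1)
print(result)
```

In this sample data, entry 2 is left out because user 1 voted on it, even though user 2 voted on it too. A condition of the form "or exists (... user_id != 1)" would keep entry 2, so which variant is right depends on how the requirement is read.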
Related
My query gets the results of these products based on if they exist in a separate table index. I am trying to get a count of all the instances where they exist so I can ORDER the results by relevance. Everything I try seems to return the variable @priority as 0. Any ideas?
Maybe it is better to use join statements?
Thank you for your help. Here is my MySQL query:
SELECT `products`.*, @priority
FROM `products`
LEFT JOIN productstypes_index ON productstypes_index.product_id = products.id
WHERE (
    EXISTS (
        SELECT *
        FROM `productstypes_index`
        WHERE `productstypes_index`.`product_id` = `products`.`id`
        AND `productstypes_index`.`_type_id` = '1'
    )
    AND (
        (
            (
                EXISTS (
                    SELECT @priority := COUNT( * )
                    FROM `producthashtags_index`
                    WHERE `producthashtags_index`.`product_id` = `products`.`id`
                    AND `producthashtags_index`.`producthashtag_id` = '43'
                )
            )
            AND (
                EXISTS (
                    SELECT @priority := COUNT( * )
                    FROM `producthashtags_index`
                    WHERE `producthashtags_index`.`product_id` = `products`.`id`
                    AND `producthashtags_index`.`producthashtag_id` = '11'
                )
            )
        )
    )
)
ORDER BY `updated_at` DESC;
You could do without those exists, and without variables. Also, a left join makes no sense if you have an exists condition on the joined table. You might as well use the more efficient inner join and put the extra type condition in the join condition.
The priority can be calculated by a count over the hash tags, but only those with id in ('43', '11').
SELECT products.*,
count(distinct producthashtags_index.producthashtag_id) priority
FROM products
INNER JOIN productstypes_index
ON productstypes_index.product_id = products.id
AND productstypes_index._type_id = '1'
INNER JOIN producthashtags_index
ON producthashtags_index.product_id = products.id
AND producthashtags_index.producthashtag_id in ('43', '11')
GROUP BY products.id
ORDER BY updated_at DESC;
MySQL ignores the SELECT list in an EXISTS subquery, so it makes no difference what you type in there. This is documented in the MySQL manual's section on subqueries with EXISTS or NOT EXISTS.
An approach using joins would look like below:
SELECT p.id,
COUNT(case when phi.product_id is not null then 1 end) AS instances
FROM products p
INNER JOIN productstypes_index pti ON pti.product_id = p.id AND pti.`_type_id` = 1
LEFT JOIN producthashtags_index phi ON phi.product_id = p.id AND phi.producthashtag_id IN (11,43)
GROUP BY p.id
ORDER BY instances DESC;
I have removed additional backticks where I believe they are not necessary. Also, if the id columns in your tables are integers, you do not need quotation marks.
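To see the join-and-count idea produce a usable relevance score, here is a small sketch that runs it against an in-memory SQLite database through Python's sqlite3 module (not MySQL). The table and column names come from the question; the sample rows and the secondary sort on updated_at are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, updated_at TEXT);
    CREATE TABLE productstypes_index (product_id INTEGER, _type_id INTEGER);
    CREATE TABLE producthashtags_index (product_id INTEGER, producthashtag_id INTEGER);
    INSERT INTO products VALUES (1, '2015-01-01'), (2, '2015-01-02'), (3, '2015-01-03');
    -- products 1 and 2 are of type 1, product 3 is not
    INSERT INTO productstypes_index VALUES (1, 1), (2, 1), (3, 2);
    -- product 1 carries both tags, product 2 only tag 43
    INSERT INTO producthashtags_index VALUES (1, 43), (1, 11), (2, 43), (3, 43), (3, 11);
""")

rows = conn.execute("""
    SELECT p.id, COUNT(DISTINCT phi.producthashtag_id) AS priority
    FROM products p
    JOIN productstypes_index pti
      ON pti.product_id = p.id AND pti._type_id = 1
    JOIN producthashtags_index phi
      ON phi.product_id = p.id AND phi.producthashtag_id IN (11, 43)
    GROUP BY p.id
    ORDER BY priority DESC, p.updated_at DESC
""").fetchall()

for product_id, priority in rows:
    print(product_id, priority)
```

If a product must carry both tags, as the nested EXISTS conditions in the question require, adding HAVING COUNT(DISTINCT phi.producthashtag_id) = 2 after the GROUP BY should restrict the result accordingly.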
I'm trying to delete an SQL result set but it won't work:
DELETE FROM votes
WHERE id IN (
SELECT *
FROM votes v
LEFT JOIN comments c ON f.id = v.post_id
GROUP BY v.id
HAVING COUNT(c.comment) = 0 )
It's true that you can't select from the table you are deleting from in a direct subselect, but with a little trick (a subselect on a subselect, used as a derived table) you can do it:
DELETE FROM votes
WHERE id IN (
SELECT
t.id
FROM (
SELECT v.id, COUNT(c.comment) cnt
FROM votes v
LEFT JOIN comments c ON f.id = v.post_id
GROUP BY v.id
HAVING COUNT(c.comment) = 0
) t
);
I'm assuming that the rows without comments should be deleted.
You are close. Two changes:
1. A subquery in a where in() statement can only return one field. Change select * to select v.id.
2. having count(...) = 0 is a roundabout way to say it. I suspect that, with the left join syntax you've used, you are going for votes that have 0 comments? Right idea with the left join, but you can simply filter on where c.comment is null (a left join produces NULLs for unmatched rows, so c.comment is null means no comment was found).
Of course, this still won't work in MySQL, which does not allow a subquery to select from the table you are deleting from:
DELETE FROM votes
WHERE id IN (
SELECT v.id
FROM votes v
LEFT JOIN comments c ON f.id = v.post_id
where c.comment is null)
If I were stuck with MySQL, I'd stage the ids in a temporary table first (a rough sketch; I haven't used MySQL enough to guarantee the exact syntax):
create temporary table temp_ids as
select v.id
from votes v
left join comments c on f.id = v.post_id
where c.comment is null;
delete from votes where id in (select id from temp_ids);
drop temporary table temp_ids;
Seems like a silly workaround.
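Another way to sidestep the problem is to avoid naming votes inside the subquery at all and use a correlated NOT EXISTS against comments instead. The sketch below runs that idea on an in-memory SQLite database through Python's sqlite3 module, so it says nothing about MySQL's restriction itself. The comments.post_id column is an assumption, since the question does not show the schema, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE votes (id INTEGER PRIMARY KEY, post_id INTEGER);
    -- comments.post_id is an assumed column; the question does not show the schema
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, comment TEXT);
    INSERT INTO votes VALUES (1, 10), (2, 10), (3, 20), (4, 30);
    INSERT INTO comments (post_id, comment) VALUES (10, 'nice'), (30, 'hm');
""")

# Delete every vote whose post has no comment at all.
cur = conn.execute("""
    DELETE FROM votes
    WHERE NOT EXISTS (
        SELECT 1 FROM comments c WHERE c.post_id = votes.post_id
    )
""")
deleted = cur.rowcount
remaining = [row[0] for row in conn.execute("SELECT id FROM votes ORDER BY id")]
print(deleted, remaining)
```

Only the vote on post 20 is removed here, because posts 10 and 30 each have at least one comment.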
Here is a sample SQL dump: https://gist.github.com/JREAM/99287d033320b2978728
I have a SELECT that grabs a bundle of users.
I then do a foreach loop to attach all the associated tree_processes to that user.
So I end up doing X Queries: users * tree.
Wouldn't it be much more efficient to fetch the two together?
I've thought about doing a LEFT JOIN Subselect, but I'm having a hard time getting it correct.
Below I've done a query to select the correct data in the SELECT, however I would have to do this for all 15 rows and it seems like a TERRIBLE waste of memory.
This is my dirty attempt:
SELECT
s.id,
s.firstname,
s.lastname,
s.email,
(
SELECT tp.id FROM tree_processes AS tp
JOIN tree AS t ON (
t.id = tp.tree_id
)
WHERE subscribers_id = s.id
ORDER BY tp.id DESC
LIMIT 1
) AS newest_tree_id,
#
# Don't want to have to do this below for every row
(
SELECT t.type FROM tree_processes AS tp
JOIN tree AS t ON (
t.id = tp.tree_id
)
WHERE subscribers_id = s.id
ORDER BY tp.id DESC
LIMIT 1
) AS tree_type
FROM subscribers AS s
INNER JOIN scenario_subscriptions AS ss ON (
ss.subscribers_id = s.id
)
WHERE ss.scenarios_id = 1
AND ss.completed != 1
AND ss.purchased_exit != 1
AND deleted != 1
GROUP BY s.id
LIMIT 0, 100
This is my LEFT JOIN attempt, but I am having trouble getting the SELECT values
SELECT
s.id,
s.firstname,
s.lastname,
s.email,
freshness.id,
# freshness.subscribers_id < -- Cant get multiples out of the LEFT join
FROM subscribers AS s
INNER JOIN scenario_subscriptions AS ss ON (
ss.subscribers_id = s.id
)
LEFT JOIN ( SELECT tp.id, tp.subscribers_id AS tp FROM tree_processes AS tp
JOIN tree AS t ON (
t.id = tp.tree_id
)
ORDER BY tp.id DESC
LIMIT 1 ) AS freshness
ON (
s.id = subscribers_id
)
WHERE ss.scenarios_id = 1
AND ss.completed != 1
AND ss.purchased_exit != 1
AND deleted != 1
GROUP BY s.id
LIMIT 0, 100
In the LEFT JOIN you are using 'freshness' as the table alias, so in your select you need to state which column(s) you want from it. The derived table exposes id (and tp.subscribers_id, but under the alias tp), so to get the id you need to add:
freshness.id
to the select clause.
Your ON clause of the left join looks pretty dodgy too. Maybe freshness.id = ss.subscribers_id?
Cheers -
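One common way to fetch "the newest row per group" in a single query is to join against a derived table that holds MAX(id) per subscriber, then join back to pick up the other columns. The sketch below shows that pattern on an in-memory SQLite database through Python's sqlite3 module. It assumes, as the question's queries suggest, that tree_processes has id, tree_id and subscribers_id and that tree has id and type; the sample rows are invented, and the scenario_subscriptions filter is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subscribers (id INTEGER PRIMARY KEY, firstname TEXT);
    CREATE TABLE tree (id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE tree_processes (id INTEGER PRIMARY KEY, tree_id INTEGER,
                                 subscribers_id INTEGER);
    INSERT INTO subscribers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO tree VALUES (1, 'email'), (2, 'sms');
    -- Ann has two processes (newest is id 2), Bob has one, Cy has none
    INSERT INTO tree_processes VALUES (1, 1, 1), (2, 2, 1), (3, 1, 2);
""")

rows = conn.execute("""
    SELECT s.id, s.firstname, tp.id AS newest_tree_id, t.type AS tree_type
    FROM subscribers s
    LEFT JOIN (
        -- one row per subscriber: the highest tree_processes id
        SELECT subscribers_id, MAX(id) AS newest_id
        FROM tree_processes
        GROUP BY subscribers_id
    ) AS latest ON latest.subscribers_id = s.id
    LEFT JOIN tree_processes tp ON tp.id = latest.newest_id
    LEFT JOIN tree t ON t.id = tp.tree_id
    ORDER BY s.id
""").fetchall()

for row in rows:
    print(row)
```

Because the derived table is grouped per subscriber rather than limited to one row overall, every subscriber gets their own newest process, and subscribers without any process still appear with NULLs.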
Say I have a table of ratings:
create table ratings (
user_id int unsigned not null,
post_id int unsigned not null,
rating set('like', 'dislike') not null,
primary key (user_id, post_id)
);
Given a user with id 1, how can I select the user with the most likes in common? And the user with the most dislikes in common? And the user with the most ratings (likes or dislikes) in common? I guess that the queries would be very similar, but I can't figure any of them out yet. I'll update with any progress I make.
Any help is appreciated, thanks!
select
r1.user_id as user1
,r2.user_id as user2
,r1.rating as rating
,count(*) as num_matching_ratings
from
ratings r1
inner join ratings r2
on r1.post_id = r2.post_id
and r1.rating = r2.rating
and r1.user_id <> r2.user_id -- don't want to count
-- matches with self
where
r1.user_id = 1 -- change this to any user, or use a
-- variable to increase reusability
and r1.rating = 'like' -- set this to 'dislike' for common dislikes
group by
r1.user_id
,r2.user_id
,r1.rating
having
count(*) > 1 -- show only those with more than 1 in common
order by
count(*) desc
/* limit 1 -- uncomment to show just the top match */
By joining the table to itself, we can count the number of occurrences where the second user has rated a post the same way. This query returns the matches ordered from the most in common to the least. If you uncomment the "limit 1" statement, it will only return the match with the most in common.
Give this a try:
select r2.user_id from (
    select post_id, rating from ratings,
    (select @userId := 2) init
    where user_id = @userId
) as r1
join ratings r2
on r1.post_id = r2.post_id and r1.rating = r2.rating
where r2.user_id != @userId and r2.rating = 'like'
group by r2.user_id
order by count(*) desc
limit 1
It should work for likes and dislikes by changing the string. To change the user, just modify the variable assignment.
The following should work for both dislikes and likes in common (just by removing the filtering condition):
select r2.user_id from (
    select post_id, rating from ratings,
    (select @userId := 2) init
    where user_id = @userId
) as r1
join ratings r2
on r1.post_id = r2.post_id and r1.rating = r2.rating
where r2.user_id != @userId
group by r2.user_id
order by count(*) desc
limit 1
pardon my syntax, i don't write raw sql very often. you can consider this pseudocode.
first, i'd get the table where id is 1
view1 = SELECT * FROM ratings, WHERE ( user_id = 1)
then i'd join it with ratings
view2 = select * from view1, ratings, where(view1.rating = ratings.rating AND view1.post_id = ratings.post_id)
then i'd aggregate by count
view3 = select count from view2 group by (user_id)
and then i'd get the max of that.
now, that's only an algorithmic overview of what my first thoughts would be. I don't think it would be particularly efficient, and you probably wouldn't use that syntax.
Building on Chris's and Mostacho's answers, I made the following query. I'm not 100% sure that it works every time, but I haven't found a flaw yet.
select r2.user_id
from ratings r1
join ratings r2
on r1.user_id <> r2.user_id
and r1.post_id = r2.post_id
and r1.rating = r2.rating
where r1.user_id = 1
and r1.rating = 'like'
group by r2.user_id
order by count(r2.user_id) desc
limit 1
This query returns the id of the user who has the most likes in common with user 1. To fetch the user with the most ratings in common, just remove and r1.rating = 'like' from the where clause.
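To check the self-join on concrete data, here is a small sketch that runs it on an in-memory SQLite database through Python's sqlite3 module. The ratings table mirrors the question, with a TEXT column and a CHECK constraint standing in for MySQL's set type. The helper name most_similar_user, the optional rating parameter, the tie-break on user_id and the sample rows are all assumptions made for the demo.

```python
import sqlite3

def most_similar_user(conn, user_id, rating=None):
    """Return (other_user_id, shared_count) for the best match, or None.

    rating may be 'like', 'dislike', or None to count every matching rating.
    """
    sql = """
        SELECT r2.user_id, COUNT(*) AS shared
        FROM ratings r1
        JOIN ratings r2
          ON r2.post_id = r1.post_id
         AND r2.rating = r1.rating
         AND r2.user_id <> r1.user_id
        WHERE r1.user_id = :uid
          AND (:rating IS NULL OR r1.rating = :rating)
        GROUP BY r2.user_id
        ORDER BY shared DESC, r2.user_id
        LIMIT 1
    """
    return conn.execute(sql, {"uid": user_id, "rating": rating}).fetchone()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ratings (
        user_id INTEGER NOT NULL,
        post_id INTEGER NOT NULL,
        rating TEXT NOT NULL CHECK (rating IN ('like', 'dislike')),
        PRIMARY KEY (user_id, post_id)
    );
    INSERT INTO ratings VALUES
        (1, 1, 'like'), (1, 2, 'like'), (1, 3, 'dislike'), (1, 4, 'dislike'),
        (2, 1, 'like'), (2, 2, 'like'), (2, 3, 'like'),
        (3, 1, 'like'), (3, 3, 'dislike'), (3, 4, 'dislike');
""")

likes = most_similar_user(conn, 1, "like")
dislikes = most_similar_user(conn, 1, "dislike")
overall = most_similar_user(conn, 1)
print(likes, dislikes, overall)
```

In this data, user 2 shares the most likes with user 1, while user 3 shares the most dislikes and the most ratings overall, which shows why the three questions can have different answers.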
I have three tables - tblpollquestions, tblpollanswers and tblpollresponses.
I want to select a random question that a user hasn't responded to yet, with the respective answers.
The SQL below returns exactly what I need, but I'm concerned that it takes three SELECTs to do it. There must surely be a more efficient way?
SELECT
poll.id,
poll.question,
a.answer
FROM tblpollquestions poll
INNER JOIN tblpollanswers a ON a.question_id = poll.id
INNER JOIN (
SELECT id FROM tblpollquestions WHERE id NOT IN(
SELECT question_id FROM tblpollresponses WHERE user_id = 1
) ORDER BY RAND() LIMIT 1
) as t ON t.id = poll.id
This could be made a bit better by switching NOT IN(SELECT...) into LEFT JOIN
SELECT
poll.id,
poll.question,
a.answer
FROM
tblpollquestions poll
INNER JOIN
tblpollanswers a
ON
a.question_id = poll.id
INNER JOIN (
SELECT
q.id
FROM
tblpollquestions AS q
LEFT JOIN
tblpollresponses AS r
ON
q.id = r.question_id
AND r.user_id = 1
WHERE
r.question_id IS NULL
ORDER BY RAND() LIMIT 1
) as t ON t.id = poll.id
ORDER BY RAND() can also be slow if there are many rows in tblpollquestions table. See this presentation from Bill Karwin (slide 142 and onwards) for some other ideas on selecting a random row.
http://www.slideshare.net/billkarwin/sql-antipatterns-strike-back
It seems fine to me, although I would change it slightly:
SELECT
poll.id,
poll.question,
a.answer
FROM tblpollquestions poll
INNER JOIN tblpollanswers a ON a.question_id = poll.id
WHERE poll.id = (
SELECT id FROM tblpollquestions WHERE NOT EXISTS (
SELECT * FROM tblpollresponses WHERE user_id = 1 AND question_id = tblpollquestions.id )
ORDER BY RAND() LIMIT 1)
Written that way, it should do a better job of using indexes and avoid checking the join conditions for every single tblpollanswers row.
Make sure you have a UNIQUE index (or primary key) on tblpollresponses for (user_id, question_id) (in that order). If you need it for other queries, you can add an additional UNIQUE index with the columns in the reverse order.
Edit: Actually, putting it in the WHERE clause might not be so good: http://jan.kneschke.de/projects/mysql/order-by-rand/ You will need to run EXPLAIN on both versions of the query and compare.
Use left join like this:
SELECT ques.id, ques.question, ans.answer FROM tblpollquestions ques
INNER JOIN tblpollanswers ans ON(ans.question_id = ques.id)
left join tblpollresponses res on(res.question_id=ques.id and user_id = 1)
where res.question_id is null ORDER BY RAND() LIMIT 1;
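I changed your table aliases to make better sense. Note that LIMIT 1 applies to the joined rows here, so this returns a single question/answer row rather than every answer for the chosen question.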
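To return every answer for the randomly chosen question, the random pick has to happen in a derived table before the join to the answers. Below is a minimal sketch of that shape, run on an in-memory SQLite database through Python's sqlite3 module. SQLite spells the function RANDOM() where MySQL uses RAND(), and the sample rows are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblpollquestions (id INTEGER PRIMARY KEY, question TEXT);
    CREATE TABLE tblpollanswers (id INTEGER PRIMARY KEY, question_id INTEGER,
                                 answer TEXT);
    CREATE TABLE tblpollresponses (user_id INTEGER, question_id INTEGER,
                                   PRIMARY KEY (user_id, question_id));
    INSERT INTO tblpollquestions VALUES (1, 'Q1'), (2, 'Q2'), (3, 'Q3');
    INSERT INTO tblpollanswers (question_id, answer) VALUES
        (1, 'a'), (1, 'b'), (2, 'c'), (2, 'd'), (2, 'e'), (3, 'f'), (3, 'g');
    -- user 1 has already answered question 1
    INSERT INTO tblpollresponses VALUES (1, 1), (2, 2);
""")

rows = conn.execute("""
    SELECT poll.id, poll.question, a.answer
    FROM tblpollquestions poll
    JOIN tblpollanswers a ON a.question_id = poll.id
    JOIN (
        -- pick one unanswered question first, then join its answers
        SELECT q.id
        FROM tblpollquestions q
        WHERE NOT EXISTS (
            SELECT 1 FROM tblpollresponses r
            WHERE r.user_id = ? AND r.question_id = q.id
        )
        ORDER BY RANDOM()   -- SQLite's spelling; MySQL uses RAND()
        LIMIT 1
    ) AS picked ON picked.id = poll.id
""", (1,)).fetchall()

for row in rows:
    print(row)
```

Each run prints all answers of one question that user 1 has not responded to yet, so the output is either the three answers of question 2 or the two answers of question 3.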