I have two tables in my DB:
Advices and Votes.
An advice can have zero or more votes.
I'd like to get all validated advices, ordered by notoriety:
SELECT advices.*, COUNT(upvotes.id) - COUNT(downvotes.id) AS notoriety
FROM `advices`
LEFT JOIN votes AS upvotes ON upvotes.is_good=1 AND upvotes.advice_id=advices.id
LEFT JOIN votes AS downvotes ON downvotes.is_good=0 AND downvotes.advice_id=advices.id
WHERE `advices`.`subject_id` = 1
AND `advices`.`state` = 'validated'
ORDER BY notoriety ASC
But the result only shows advices that have votes! What should I change to include advices without votes too?
Thanks
Use conditional aggregation instead of two joins:
SELECT a.*,
(COALESCE(SUM(v.is_good = 1), 0) - COALESCE(SUM(v.is_good = 0), 0)) AS notoriety
FROM advices a LEFT JOIN
votes v
ON a.id = v.advice_id
WHERE a.`subject_id` = 1 AND a.`state` = 'validated'
GROUP BY a.id
ORDER BY notoriety ASC;
You can get your version to work using count(distinct) rather than count(). However, the above version is simpler and should perform better.
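To see why the two-join version miscounts, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL (the sample rows are made up, and a GROUP BY is added to the two-join query so each advice gets its own row). Each upvote row is paired with each downvote row, so plain COUNT() counts the pairs; COUNT(DISTINCT ...) and conditional aggregation both avoid that.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE advices (id INTEGER PRIMARY KEY, subject_id INTEGER, state TEXT);
CREATE TABLE votes (id INTEGER PRIMARY KEY, advice_id INTEGER, is_good INTEGER);
INSERT INTO advices VALUES (1, 1, 'validated'), (2, 1, 'validated'), (3, 1, 'validated');
-- advice 1: two upvotes and one downvote; advice 2: one downvote; advice 3: no votes
INSERT INTO votes (advice_id, is_good) VALUES (1, 1), (1, 1), (1, 0), (2, 0);
""")

# Template for the two-join query; {count} is either "COUNT(" or "COUNT(DISTINCT ".
TWO_JOINS = """
SELECT a.id, {count}up.id) - {count}down.id) AS notoriety
FROM advices a
LEFT JOIN votes up ON up.is_good = 1 AND up.advice_id = a.id
LEFT JOIN votes down ON down.is_good = 0 AND down.advice_id = a.id
WHERE a.subject_id = 1 AND a.state = 'validated'
GROUP BY a.id
ORDER BY a.id
"""

# Single join, with the up/down split done inside the aggregate.
CONDITIONAL = """
SELECT a.id,
       COALESCE(SUM(v.is_good = 1), 0) - COALESCE(SUM(v.is_good = 0), 0) AS notoriety
FROM advices a
LEFT JOIN votes v ON v.advice_id = a.id
WHERE a.subject_id = 1 AND a.state = 'validated'
GROUP BY a.id
ORDER BY a.id
"""

plain_count = conn.execute(TWO_JOINS.format(count="COUNT(")).fetchall()
distinct_count = conn.execute(TWO_JOINS.format(count="COUNT(DISTINCT ")).fetchall()
conditional = conn.execute(CONDITIONAL).fetchall()

print(plain_count)     # advice 1 is miscounted: 2 x 1 joined rows are counted on both sides
print(distinct_count)  # correct, advice 3 included
print(conditional)     # correct, advice 3 included
```

Advice 3 appears in all three results because of the LEFT JOIN; only the plain COUNT() variant gets advice 1 wrong.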
The problem may come from NULLs where no votes exist. Note that NVL is Oracle-specific; in MySQL, use IFNULL (or COALESCE) to replace those NULLs with 0.
SELECT advices.*, IFNULL(COUNT(upvotes.id),0) - IFNULL(COUNT(downvotes.id),0) AS notoriety
FROM `advices`
LEFT JOIN votes AS upvotes ON upvotes.is_good=1 AND upvotes.advice_id=advices.id
LEFT JOIN votes AS downvotes ON downvotes.is_good=0 AND downvotes.advice_id=advices.id
WHERE `advices`.`subject_id` = 1
AND `advices`.`state` = 'validated'
ORDER BY notoriety ASC
Related
I have a fairly complex query like this:
SELECT qa.id,
qa.subject,
qa.category cat,
qa.keywords tags,
qa.body_html,
qa.amount,
qa.visibility,
qa.date_time,
COALESCE(u.reputation, 'N') reputation,
COALESCE(Concat(u.user_fname, ' ', u.user_lname), 'unknown') NAME,
COALESCE(u.avatar, 'anonymous.png') avatar,
(
SELECT COALESCE(Sum(vv.value),0)
FROM votes vv
WHERE qa.id = vv.post_id
AND 15 = vv.table_code) AS total_votes,
(
SELECT COALESCE(Sum(vt.total_viewed),0)
FROM viewed_total vt
WHERE qa.id = vt.post_id
AND 15 = vt.table_code limit 1) AS total_viewed
FROM qanda qa
LEFT JOIN users u
ON qa.author_id = u.id
AND qa.visibility = 1
WHERE qa.type = 0 $query_where
ORDER BY $query_order
LIMIT :j, 11;
Note that the $query_where variable contains other conditions that are built dynamically. Anyway, as you can see, it returns at most 10 posts.
Currently, to count total matched rows, I use another query like this:
SELECT COUNT(amount) paid_qs,
COUNT(*) all_qs
FROM qanda qa
WHERE type = 0 $query_where
I suspect this wastes processing: running two separate queries, each with complex conditions in the WHERE clause, seems like too much.
Is there a way to use one query instead?
You can get the number of found rows after the query with the FOUND_ROWS() function.
Reference: MySQL Reference Manual
For this to work, you have to add the SQL_CALC_FOUND_ROWS modifier to your query (SELECT SQL_CALC_FOUND_ROWS ...).
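SQLite has no FOUND_ROWS(), so the sketch below swaps in a different technique to show the same idea of getting the page and the totals in one round trip: window aggregates with an empty OVER (), which MySQL 8.0 also supports. The table is a cut-down, hypothetical version of qanda with made-up rows, and the dynamic $query_where part is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE qanda (id INTEGER PRIMARY KEY, type INTEGER, amount INTEGER)")
# 25 rows of type 0 (every third one has an amount), plus 5 rows of another type.
rows = [(0, 5 if i % 3 == 0 else None) for i in range(25)] + [(1, None)] * 5
conn.executemany("INSERT INTO qanda (type, amount) VALUES (?, ?)", rows)

PAGE_WITH_TOTALS = """
SELECT id,
       COUNT(*)      OVER () AS all_qs,   -- total rows matching the WHERE clause
       COUNT(amount) OVER () AS paid_qs   -- matching rows with a non-NULL amount
FROM qanda
WHERE type = 0
ORDER BY id
LIMIT ? OFFSET ?
"""

page = conn.execute(PAGE_WITH_TOTALS, (10, 0)).fetchall()
page_size = len(page)
all_qs, paid_qs = page[0][1], page[0][2]
print(page_size, all_qs, paid_qs)
```

The window aggregates are computed over all matching rows before LIMIT is applied, so every returned row carries the totals. One caveat: if the requested page is empty, no row comes back and the totals are lost.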
I have this query to load the user stream in my app. Will it be too slow if there are 10,000 matching rows in 'follow'?
SELECT *
FROM post
WHERE user_id
IN (SELECT follow_id
FROM follow
WHERE id='$some_id')
AND type='accepted'
ORDER BY id DESC LIMIT $page , 20
Syntactically your code looks correct; I don't see any errors. If you're asking about efficiency, I would join the tables and put the second filter in the JOIN condition:
SELECT p.*
FROM post p
JOIN follow f
ON f.follow_id = p.user_id
AND f.id = '$some_id'
WHERE p.type = 'accepted'
ORDER BY p.id DESC LIMIT $page , 20
MySQL often handles large sets of data a lot better through a join than with an IN().
Think of it this way: because the IN() can have pretty much anything inside of it, MySQL may have to check it against everything returned for each row, instead of checking once when you JOIN.
With that many rows returned, I have a feeling a join might be more efficient:
SELECT *
FROM post p Join
follow f On p.user_id = f.follow_id
WHERE f.id='$some_id'
AND p.type='accepted'
ORDER BY p.id DESC LIMIT $page , 20
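One thing worth checking before swapping IN for JOIN is whether the two really return the same rows. The sketch below (Python's sqlite3 as a stand-in for MySQL, made-up data, bound parameters instead of interpolating $some_id) suggests they do as long as each (id, follow_id) pair appears only once in follow; a duplicate follow row makes the JOIN repeat posts, while IN does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE post (id INTEGER PRIMARY KEY, user_id INTEGER, type TEXT);
CREATE TABLE follow (id INTEGER, follow_id INTEGER);
INSERT INTO post VALUES (1, 1, 'accepted'), (2, 2, 'accepted'), (3, 3, 'accepted'),
                        (4, 1, 'pending'),  (5, 2, 'accepted');
-- follower 7 follows users 1 and 2; follower 8 follows user 1
INSERT INTO follow VALUES (7, 1), (7, 2), (8, 1);
""")

WITH_IN = """
SELECT p.id FROM post p
WHERE p.user_id IN (SELECT follow_id FROM follow WHERE id = ?)
  AND p.type = 'accepted'
ORDER BY p.id DESC LIMIT 20 OFFSET ?
"""

WITH_JOIN = """
SELECT p.id FROM post p
JOIN follow f ON f.follow_id = p.user_id AND f.id = ?
WHERE p.type = 'accepted'
ORDER BY p.id DESC LIMIT 20 OFFSET ?
"""

def post_ids(sql, follower_id, offset=0):
    """Run one of the queries with bound parameters and return the post ids."""
    return [row[0] for row in conn.execute(sql, (follower_id, offset))]

in_ids = post_ids(WITH_IN, 7)
join_ids = post_ids(WITH_JOIN, 7)

# A duplicated follow row changes the JOIN result but not the IN result.
conn.execute("INSERT INTO follow VALUES (7, 1)")
in_ids_dup = post_ids(WITH_IN, 7)
join_ids_dup = post_ids(WITH_JOIN, 7)
print(in_ids, join_ids, in_ids_dup, join_ids_dup)
```

A unique key on (id, follow_id) would rule out the duplicate case. Which form is actually faster depends on the MySQL version and indexes, so comparing EXPLAIN output for both is the safer way to decide.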
I have written an SQL statement that, besides all the other columns, should return the number of comments and the number of likes of a certain post. It works perfectly as long as I don't also try to get the number of times the post has been shared. When I add the share count, it returns a wrong number of likes, which seems to be some combination of the shares and the likes. Here is the code:
SELECT
[...],
count(CS.commentId) as shares,
count(CL.commentId) as numberOfLikes
FROM
(SELECT *
FROM accountSpecifics
WHERE institutionId= '{$keyword['id']}') `AS`
INNER JOIN
account A ON A.id = `AS`.accountId
INNER JOIN
comment C ON C.accountId = A.id
LEFT JOIN
commentLikes CL ON C.commentId = CL.commentId
LEFT JOIN
commentShares CS ON C.commentId = CS.commentId
GROUP BY
C.time
ORDER BY
year, month, hour, month
Could you also tell me whether you think this is an efficient SQL statement, or whether you would do it differently? Thank you!
Do this instead:
SELECT
[...],
(select count(*) from commentShares CS where C.commentId = CS.commentId) as shares,
(select count(*) from commentLikes CL where C.commentId = CL.commentId) as numberOfLikes
FROM
(SELECT *
FROM accountSpecifics
WHERE institutionId= '{$keyword['id']}') `AS`
INNER JOIN account A ON A.id = `AS`.accountId
INNER JOIN comment C ON C.accountId = A.id
GROUP BY C.time
ORDER BY year, month, hour, month
If you use JOINs, you get back one combined result set in which every like row is paired with every share row, so COUNT(any field) simply counts the joined rows and computes the same thing for both columns, which in this case is the wrong thing. Subqueries are what you need here. Good luck!
EDIT: as posted below, count(distinct something) can also work, but it's making the database do more work than necessary for the answer you want to end up with.
Quick fix:
SELECT
[...],
count(DISTINCT CS.commentId) as shares,
count(DISTINCT CL.commentId) as numberOfLikes
Better approach:
SELECT [...]
, Coalesce(shares.numberOfShares, 0) As numberOfShares
, Coalesce(likes.numberOfLikes , 0) As numberOfLikes
FROM [...]
LEFT
JOIN (
SELECT commentId
, Count(*) As numberOfShares
FROM commentShares
GROUP
BY commentId
) As shares
ON shares.commentId = c.commentId
LEFT
JOIN (
SELECT commentId
, Count(*) As numberOfLikes
FROM commentLikes
GROUP
BY commentId
) As likes
ON likes.commentId = c.commentId
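As a rough check of the difference, the sketch below runs both shapes against a tiny made-up data set, using Python's sqlite3 as a stand-in for MySQL. Only the three comment tables are modelled (with hypothetical accountId columns); the account tables and the elided [...] columns are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE comment (commentId INTEGER PRIMARY KEY);
CREATE TABLE commentLikes (commentId INTEGER, accountId INTEGER);
CREATE TABLE commentShares (commentId INTEGER, accountId INTEGER);
INSERT INTO comment VALUES (1), (2), (3);
-- comment 1: 3 likes, 2 shares; comment 2: 1 like; comment 3: nothing
INSERT INTO commentLikes VALUES (1, 10), (1, 11), (1, 12), (2, 10);
INSERT INTO commentShares VALUES (1, 10), (1, 11);
""")

# Both child tables joined directly: likes and shares multiply each other.
JOINED = """
SELECT c.commentId, COUNT(cs.commentId) AS shares, COUNT(cl.commentId) AS numberOfLikes
FROM comment c
LEFT JOIN commentLikes cl ON c.commentId = cl.commentId
LEFT JOIN commentShares cs ON c.commentId = cs.commentId
GROUP BY c.commentId
ORDER BY c.commentId
"""

# Each child table is aggregated on its own first, then joined once per comment.
PRE_AGGREGATED = """
SELECT c.commentId,
       COALESCE(shares.numberOfShares, 0) AS numberOfShares,
       COALESCE(likes.numberOfLikes, 0)   AS numberOfLikes
FROM comment c
LEFT JOIN (SELECT commentId, COUNT(*) AS numberOfShares
           FROM commentShares GROUP BY commentId) AS shares
       ON shares.commentId = c.commentId
LEFT JOIN (SELECT commentId, COUNT(*) AS numberOfLikes
           FROM commentLikes GROUP BY commentId) AS likes
       ON likes.commentId = c.commentId
ORDER BY c.commentId
"""

joined = conn.execute(JOINED).fetchall()
pre_aggregated = conn.execute(PRE_AGGREGATED).fetchall()
print(joined)          # comment 1 reports 3 x 2 = 6 for both columns
print(pre_aggregated)  # comment 1 reports 2 shares and 3 likes
```

The counts only go wrong for comments that have both likes and shares, which may explain why the original query looked fine until shares were added.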
I cannot figure out why this is not working. Basically, I am running a subquery to count all rows of p.songid WHERE trackDeleted=0. The subquery works fine when I execute it by itself, but when I embed it in the full query I get "subquery returned more than 1 row".
SELECT u.username, u.id, u.score, s.genre, s.songid, s.songTitle, s.timeSubmitted, s.userid, s.insWanted, s.bounty,
(SELECT COUNT(p.songid)
FROM songs s
LEFT JOIN users u
ON u.id = s.userid
LEFT JOIN posttracks p
ON s.songid = p.songid
WHERE p.trackDeleted=0
GROUP BY s.timeSubmitted ASC
LIMIT 25)
AS trackCount
FROM songs s
LEFT JOIN users u
ON u.id = s.userid
LEFT JOIN posttracks p
ON s.songid = p.songid
WHERE paid=1 AND s.timeSubmitted >= ( CURDATE() - INTERVAL 60 DAY )
GROUP BY s.timeSubmitted ASC
LIMIT 25
Obviously, a sub-query can't return more than one row, as this makes no sense. You only expect one value to be returned - COUNT(p.songid) - yet you GROUP BY s.timeSubmitted, which will make it return multiple rows, and multiple counts of p.songid.
Think about it this way: a subquery in the SELECT list, like the one you have, needs to return a single value, since it is going to act like just another column in your select list. Since you have a LIMIT 25 on yours, you're obviously expecting more than one value back, which is incorrect for this usage.
OK, your query is a mess. Not only is the subquery broken, but I'm pretty sure the GROUP BY s.timeSubmitted ASC isn't doing what you think it does. (Did you mean ORDER BY instead?) It might help if you explained in words what you're trying to accomplish.
Anyway, I'm going to take a wild guess and suggest that this might be what you want:
SELECT
u.username, u.id, u.score, s.genre, s.songid, s.songTitle,
s.timeSubmitted, s.userid, s.insWanted, s.bounty,
COUNT(p.songid) AS trackCount
FROM songs s
LEFT JOIN users u ON u.id = s.userid
LEFT JOIN posttracks p ON p.songid = s.songid AND p.trackDeleted = 0
WHERE paid = 1 AND s.timeSubmitted >= ( CURDATE() - INTERVAL 60 DAY )
GROUP BY s.songid
ORDER BY s.timeSubmitted ASC
LIMIT 25
Edit: Fixed the COUNT() so that it will correctly return 0 if there are no matching tracks.
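The reason the trackDeleted test belongs in the ON clause rather than in WHERE can be seen in a small sketch (Python's sqlite3 as a stand-in for MySQL, with cut-down tables and made-up rows). A WHERE filter on the outer-joined table throws away the NULL-extended rows, so songs without live tracks disappear instead of showing a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE songs (songid INTEGER PRIMARY KEY, songTitle TEXT);
CREATE TABLE posttracks (id INTEGER PRIMARY KEY, songid INTEGER, trackDeleted INTEGER);
INSERT INTO songs VALUES (1, 'a'), (2, 'b'), (3, 'c');
-- song 1: two live tracks and one deleted; song 2: one deleted track; song 3: no tracks
INSERT INTO posttracks (songid, trackDeleted) VALUES (1, 0), (1, 0), (1, 1), (2, 1);
""")

FILTER_IN_WHERE = """
SELECT s.songid, COUNT(p.songid) AS trackCount
FROM songs s
LEFT JOIN posttracks p ON p.songid = s.songid
WHERE p.trackDeleted = 0
GROUP BY s.songid
ORDER BY s.songid
"""

FILTER_IN_ON = """
SELECT s.songid, COUNT(p.songid) AS trackCount
FROM songs s
LEFT JOIN posttracks p ON p.songid = s.songid AND p.trackDeleted = 0
GROUP BY s.songid
ORDER BY s.songid
"""

filter_in_where = conn.execute(FILTER_IN_WHERE).fetchall()
filter_in_on = conn.execute(FILTER_IN_ON).fetchall()
print(filter_in_where)  # songs 2 and 3 are missing
print(filter_in_on)     # songs 2 and 3 are listed with a count of 0
```

COUNT(p.songid) skips NULLs, which is why the unmatched songs come out as 0 rather than 1 in the second query.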
SELECT i.*, i.id IN (
SELECT id
FROM w
WHERE w.status='active') AS wish
FROM i
INNER JOIN r ON i.id=r.id
WHERE r.member_id=1 && r.status='active'
ORDER BY wish DESC
LIMIT 0,50
That's a query that I'm trying to run. It doesn't scale well, and I'm wondering if someone here can tell me where I could improve things. I don't join w to r and i because I need to show rows from i that are unrepresented in w. I tried a left join, but it didn't perform too well. This is better, but not ideal yet. All three tables are very large. All three are indexed on the fields I'm joining and selecting on.
Any comments, pointers, or constructive criticisms would be greatly appreciated.
EDIT Addition:
I should have put this in my original question. It's the EXPLAIN output as returned by SQLyog.
id|select_type |table|type |possible_keys|key |key_len|ref |rows|Extra|
1 |PRIMARY |r |ref |member_id,id |member_id|3 |const|3120|Using where; Using temporary; Using filesort
1 |PRIMARY |i |eq_ref |id |id |8 |r.id |1 |
2 |DEPENDENT SUBQUERY|w |index_subquery|id,status |id |8 |func |8 |Using where
EDIT le dorfier - more comments ...
I should mention that the key for w is (member_id, id). So each id can exist multiple times in w, and I only want to know if it exists.
WHERE x IN () is identical to an INNER JOIN to a SELECT DISTINCT subquery, and in general, a join to a subquery will typically perform better if the optimizer doesn't turn the IN into a JOIN - which it should:
SELECT i.*
FROM i
INNER JOIN (
SELECT DISTINCT id
FROM w
WHERE w.status = 'active'
) AS wish
ON i.id = wish.id
INNER JOIN r
ON i.id = r.id
WHERE r.member_id = 1 && r.status = 'active'
ORDER BY wish.id DESC
LIMIT 0,50
Which would probably be equivalent to this if you don't need the DISTINCT:
SELECT i.*
FROM i
INNER JOIN w
ON w.status = 'active'
AND i.id = w.id
INNER JOIN r
ON i.id = r.id
AND r.member_id = 1 && r.status = 'active'
ORDER BY i.id DESC
LIMIT 0,50
Please post your schema.
If you are using wish as an existence flag, try:
SELECT i.*, CASE WHEN w.id IS NOT NULL THEN 1 ELSE 0 END AS wish
FROM i
INNER JOIN r
ON i.id = r.id
AND r.member_id = 1 && r.status = 'active'
LEFT JOIN w
ON w.status = 'active'
AND i.id = w.id
ORDER BY wish DESC
LIMIT 0,50
You can use the same technique with a LEFT JOIN to a SELECT DISTINCT subquery. I assume you aren't specifying the w.member_id because you want to know if any members have this? In this case, definitely use the SELECT DISTINCT. You should have an index with id as the first column on w as well in order for that to perform:
SELECT i.*, CASE WHEN w.id IS NOT NULL THEN 1 ELSE 0 END AS wish
FROM i
INNER JOIN r
ON i.id = r.id
AND r.member_id = 1 && r.status = 'active'
LEFT JOIN (
SELECT DISTINCT w.id
FROM w
WHERE w.status = 'active'
) AS w
ON i.id = w.id
ORDER BY wish DESC
LIMIT 0,50
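To illustrate why the DISTINCT matters when w is keyed on (member_id, id), here is a minimal sketch with Python's sqlite3 as a stand-in for MySQL and made-up rows. A plain LEFT JOIN to w repeats an item once per member who wished for it, while the DISTINCT derived table (or an EXISTS test) yields one row per item.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE i (id INTEGER PRIMARY KEY);
CREATE TABLE r (id INTEGER, member_id INTEGER, status TEXT);
CREATE TABLE w (member_id INTEGER, id INTEGER, status TEXT, PRIMARY KEY (member_id, id));
INSERT INTO i VALUES (1), (2), (3);
INSERT INTO r VALUES (1, 1, 'active'), (2, 1, 'active'), (3, 1, 'active');
-- item 1 is wished for by two members; item 2 only has an inactive wish; item 3 has none
INSERT INTO w VALUES (10, 1, 'active'), (11, 1, 'active'), (12, 2, 'inactive');
""")

PLAIN_LEFT_JOIN = """
SELECT i.id, CASE WHEN w.id IS NOT NULL THEN 1 ELSE 0 END AS wish
FROM i
INNER JOIN r ON i.id = r.id AND r.member_id = 1 AND r.status = 'active'
LEFT JOIN w ON w.status = 'active' AND i.id = w.id
ORDER BY wish DESC, i.id
"""

DISTINCT_SUBQUERY = """
SELECT i.id, CASE WHEN aw.id IS NOT NULL THEN 1 ELSE 0 END AS wish
FROM i
INNER JOIN r ON i.id = r.id AND r.member_id = 1 AND r.status = 'active'
LEFT JOIN (SELECT DISTINCT w.id FROM w WHERE w.status = 'active') AS aw
       ON i.id = aw.id
ORDER BY wish DESC, i.id
"""

WITH_EXISTS = """
SELECT i.id,
       CASE WHEN EXISTS (SELECT 1 FROM w WHERE w.id = i.id AND w.status = 'active')
            THEN 1 ELSE 0 END AS wish
FROM i
INNER JOIN r ON i.id = r.id AND r.member_id = 1 AND r.status = 'active'
ORDER BY wish DESC, i.id
"""

plain = conn.execute(PLAIN_LEFT_JOIN).fetchall()
distinct = conn.execute(DISTINCT_SUBQUERY).fetchall()
exists = conn.execute(WITH_EXISTS).fetchall()
print(plain)     # item 1 appears twice
print(distinct)  # one row per item
print(exists)    # one row per item
```

This only shows that the results differ; it says nothing about speed on large tables, where the EXPLAIN output for each variant would be the thing to compare.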
Please post the EXPLAIN listing. And explain what the tables and columns mean.
wish appears to be a boolean - and you're ORDERing by it?
EDIT: Well, it looks like it's doing what it's being instructed to do. Cade seems to be thinking expansively on what this all could possibly mean (he probably deserves a vote just for effort.) But I'd really rather you tell us.
Wild guessing just confuses everyone (including you, I'm sure.)
OK, based on new info, here's my (slightly less wild) guess.
SELECT i.*,
CASE WHEN EXISTS (SELECT 1 FROM w WHERE w.id = i.id AND w.status = 'active') THEN 1 ELSE 0 END AS wish
FROM i
INNER JOIN r ON i.id = r.id AND r.status = 'active'
WHERE r.member_id = 1
Do you want a row for each match in w? Or just to know, for each i.id, whether there is an active w record? I assumed the second answer, so you don't need to ORDER BY - it's for only one ID anyway. And since you're only returning columns from i, if there are multiple rows in r, you'll just get duplicate rows.
How about posting what you expect to get for a proper answer?
...
ORDER BY wish DESC
LIMIT 0,50
This appears to be the big expense. You're sorting by a computed column "wish", which cannot benefit from an index. This forces a temporary table and a filesort (as indicated by the EXPLAIN output), which means the whole result set has to be materialized and sorted, possibly using disk I/O, which is very slow.
When you post questions like this, you should not expect people to guess how you have defined your tables and indexes. It's very simple to get the full definitions:
mysql> SHOW CREATE TABLE w;
mysql> SHOW CREATE TABLE i;
mysql> SHOW CREATE TABLE r;
Then paste the output into your question.
It's not clear what your purpose is for the "wish" column. The "IN" predicate is a boolean expression, so it always results in 0 or 1. But I'm guessing you're trying to use "IN" in hopes of accomplishing a join without doing a join. It would help if you describe what you're trying to accomplish.
Try this:
SELECT i.*
FROM i
INNER JOIN r ON i.id=r.id
LEFT OUTER JOIN w ON i.id=w.id AND w.status='active'
WHERE r.member_id=1 AND r.status='active'
AND w.id IS NULL
LIMIT 0,50;
It uses an additional outer join, but it doesn't incur a filesort according to my test with EXPLAIN.
Have you tried this?
SELECT i.*, w.id as wish FROM i
LEFT OUTER JOIN w ON i.id = w.id
AND w.status = 'active'
WHERE i.id in (SELECT id FROM r WHERE r.member_id = 1 AND r.status = 'active')
ORDER BY wish DESC
LIMIT 0,50