Find out AVG column in SQL - mysql

I have this php/sql Query:
$result = mysql_query("
SELECT r.item_id, AVG(rating) AS avgrating, count(rating) AS count, i.item, c.category
FROM ratings AS r
LEFT JOIN items AS i
ON r.item_id = i.items_id
INNER JOIN master_cat c
ON c.cat_id = i.cat_id
GROUP BY item_id
ORDER BY avgrating DESC
LIMIT 25;");
When I output this, count is correct: it shows how many votes each item has received.
I simply want to add a WHERE count >= 10 clause, but everything breaks. Obviously, when there are thousands of items, some will get one vote and have 100%, which is not a good indicator. I want to print only items that have at least 10 votes (count >= 10).

You should use HAVING instead of WHERE:
SELECT
r.item_id, AVG(rating) AS avgrating,
count(rating) AS count, i.item, c.category
FROM
ratings AS r
LEFT JOIN items AS i
ON r.item_id = i.items_id
INNER JOIN master_cat c
ON c.cat_id = i.cat_id
GROUP BY
item_id
HAVING
count >= 10
ORDER BY
avgrating DESC
LIMIT 25;

You can't use a where filter on the results of an aggregate function (count()). where is applied at the row-level, as the DB is deciding whether to include the row or not in the result set - at this point the results of the count aren't available yet.
What you want is a having clause, which is applied as one of the last steps before results are sent to the client, after all the aggregate results have been calculated.
...
GROUP BY item_id
HAVING count >= 10
ORDER BY ...
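The WHERE versus HAVING distinction above can be checked with a small sketch. It runs the SQL through Python's built-in sqlite3 module as a stand-in for MySQL, and the two-item ratings table is made-up sample data, not the asker's schema. The exact error raised for the WHERE version is SQLite's; MySQL reports it differently.

```python
import sqlite3

# Hypothetical miniature of the ratings table: item 1 has ten votes, item 2 has one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ratings (item_id INTEGER, rating INTEGER)")
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?)",
    [(1, 80)] * 10 + [(2, 100)],
)

# HAVING is evaluated after GROUP BY, so it can test the aggregate.
popular = conn.execute(
    """
    SELECT item_id, AVG(rating) AS avgrating, COUNT(rating) AS votes
    FROM ratings
    GROUP BY item_id
    HAVING COUNT(rating) >= 10
    ORDER BY avgrating DESC
    """
).fetchall()
print(popular)  # [(1, 80.0, 10)]

# WHERE is evaluated per row, before grouping, so an aggregate there is rejected.
try:
    conn.execute(
        "SELECT item_id FROM ratings WHERE COUNT(rating) >= 10 GROUP BY item_id"
    )
    where_failed = False
except sqlite3.OperationalError:
    where_failed = True
print(where_failed)  # True
```

Item 2 has a higher average but only one vote, so the HAVING filter drops it.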

You need to tell it what you want to count:
having count(*) >= 10

Related

MySQL SUM left join show all categories even with the sum of 0

I wish to select all categories based on the sum of scores of a set of data, even if the set of data does not include the categories. However, it only ever displays categories with actual data, no matter what I try.
I have created the following SQLFiddle http://sqlfiddle.com/#!9/52a127/3
For ease of view, I'll paste the select statement here:
SELECT categories.id, IFNULL(SUM(raw_data.score), 0) as total
FROM categories
LEFT JOIN raw_data ON categories.id = raw_data.category_id
WHERE
(raw_data.quarter = '2018Q2' OR !raw_data.quarter) AND
raw_data.broker_id = 2
GROUP BY categories.id
ORDER BY total DESC
As you can see from the fiddle, it only displays 2 categories, but I wish to select all 6 and have 0 for those with no results.
Any help is appreciated, thanks!
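You need to move the conditions from the WHERE clause to the ON clause. A WHERE condition on raw_data discards the NULL-extended rows that the LEFT JOIN produces for categories with no data, which effectively turns it into an INNER JOIN: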
SELECT categories.id, IFNULL(SUM(raw_data.score), 0) as total
FROM categories
LEFT JOIN raw_data ON categories.id = raw_data.category_id
AND (raw_data.quarter = '2018Q2' OR !raw_data.quarter)
AND raw_data.broker_id = 2
GROUP BY categories.id
ORDER BY total DESC;
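To see why the position of the filter matters, here is a minimal sketch that runs both variants through Python's sqlite3 module, standing in for MySQL. The three categories and three raw_data rows are invented sample data, and the `OR !raw_data.quarter` term is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE categories (id INTEGER PRIMARY KEY);
    CREATE TABLE raw_data (
        category_id INTEGER, broker_id INTEGER, quarter TEXT, score INTEGER
    );
    INSERT INTO categories (id) VALUES (1), (2), (3);
    -- Category 1 has data for broker 2; category 2 only for another broker.
    INSERT INTO raw_data VALUES
        (1, 2, '2018Q2', 5), (1, 2, '2018Q2', 7), (2, 9, '2018Q2', 4);
    """
)

# Filter in WHERE: rows without matching raw_data are removed after the join.
where_version = conn.execute(
    """
    SELECT c.id, IFNULL(SUM(r.score), 0) AS total
    FROM categories c
    LEFT JOIN raw_data r ON c.id = r.category_id
    WHERE r.quarter = '2018Q2' AND r.broker_id = 2
    GROUP BY c.id
    ORDER BY total DESC, c.id
    """
).fetchall()

# Filter in ON: unmatched categories survive with NULL scores, summed as 0.
on_version = conn.execute(
    """
    SELECT c.id, IFNULL(SUM(r.score), 0) AS total
    FROM categories c
    LEFT JOIN raw_data r
           ON c.id = r.category_id
          AND r.quarter = '2018Q2'
          AND r.broker_id = 2
    GROUP BY c.id
    ORDER BY total DESC, c.id
    """
).fetchall()

print(where_version)  # [(1, 12)]
print(on_version)     # [(1, 12), (2, 0), (3, 0)]
```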

Select top 10 posts ordered by quantity of votes

I have two tables, one for image records (posts) and the other for like records. I made an INNER JOIN from one table to the other because I need to select each image together with the number of likes it has. I also need to order them by the number of likes so I can show a top 10 of the most voted images on the site. Here is my query:
SELECT
COUNT(DISTINCT B.votes),
A.id_image,
A.image,
A.title
FROM likes_images AS B INNER JOIN images AS A ON A.id_image = B.id_image
GROUP BY A.title
ORDER BY COUNT(DISTINCT B.votes) ASC
LIMIT 10
It works, but it only orders the images by title (alphabetically). I want to order them from the most voted to the least voted.
Any ideas?
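ORDER BY is applied after GROUP BY, so it is not ignored. The more likely culprits are COUNT(DISTINCT B.votes), which counts distinct values rather than rows and so may return the same number for every image, and ASC, which puts the least-voted images first. MySQL versions before 8.0 also sort implicitly by the GROUP BY column, so when every count ties the rows likely stay in title order. Counting the likes per image in a derived table avoids both problems.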
So you might try this:
SELECT L.id_image, B.image, B.title, L.likes
FROM (
SELECT COUNT(votes) AS likes, id_image
FROM likes_images
GROUP BY id_image
) AS L
JOIN images B ON B.id_image = L.id_image
ORDER BY L.likes DESC
LIMIT 10
Note that I set ORDER BY to DESC: since you want a top 10, I don't understand why you chose ASC.
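The derived-table approach can be tried out with the sketch below, which uses Python's sqlite3 module in place of MySQL. The sample rows are invented, and it assumes that likes_images stores one row per like with votes always set to 1, which the question does not confirm.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE images (id_image INTEGER PRIMARY KEY, image TEXT, title TEXT);
    CREATE TABLE likes_images (id_image INTEGER, votes INTEGER);
    INSERT INTO images VALUES
        (1, 'a.jpg', 'Apple'), (2, 'b.jpg', 'Banana'), (3, 'c.jpg', 'Cherry');
    -- Assumption: one row per like, votes is always 1.
    INSERT INTO likes_images VALUES (1, 1), (2, 1), (2, 1), (2, 1), (3, 1), (3, 1);
    """
)

# Count likes per image first, then join the totals to the image rows.
top = conn.execute(
    """
    SELECT L.id_image, B.image, B.title, L.likes
    FROM (
        SELECT id_image, COUNT(votes) AS likes
        FROM likes_images
        GROUP BY id_image
    ) AS L
    JOIN images B ON B.id_image = L.id_image
    ORDER BY L.likes DESC
    LIMIT 10
    """
).fetchall()
print(top)
# [(2, 'b.jpg', 'Banana', 3), (3, 'c.jpg', 'Cherry', 2), (1, 'a.jpg', 'Apple', 1)]

# For comparison: COUNT(DISTINCT votes) is 1 for every image under this assumption.
distinct_counts = conn.execute(
    """
    SELECT id_image, COUNT(DISTINCT votes)
    FROM likes_images
    GROUP BY id_image
    ORDER BY id_image
    """
).fetchall()
print(distinct_counts)  # [(1, 1), (2, 1), (3, 1)]
```

If votes does hold the same value on every row, the second query shows why sorting by COUNT(DISTINCT votes) cannot rank the images.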

Ordering JOIN results when grouping

Take the below for example:
SELECT
*
FROM
auctions AS a
LEFT JOIN winners AS w ON a.auction_id=w.auction_id
WHERE
a.active=1
AND
a.approved=1
GROUP BY
a.auction_id
ORDER BY
a.start_time DESC
LIMIT
0, 10;
Sometimes this may match multiple results in the winners table; I don't need both of them, however I want to have control over which row I get if there are multiple matches. How can I do an ORDER BY on the winners table so that I can make sure the row I want is the first one?
It is difficult to accurately answer without seeing your table structure but if your winners table has a winner date column or something similar, then you can use an aggregate function to get the first record.
Then you can return the record with that earliest date similar to this:
SELECT *
FROM auctions AS a
LEFT JOIN
(
select auction_id, min(winner_date) MinDate -- replace this with the date column in winners
from winners
group by auction_id
) AS w2
ON a.auction_id=w2.auction_id
LEFT JOIN winners w1
ON w1.auction_id=w2.auction_id
and w1.winner_date = w2.MinDate
WHERE a.active=1
AND a.approved=1
ORDER BY a.start_time DESC
SELECT *
FROM auctions AS a
LEFT JOIN (select auction_id from winners order BY auction_id limit 1) AS w ON a.auction_id = w.auction_id
WHERE a.active = 1
AND a.approved = 1
GROUP BY a.auction_id
ORDER BY a.start_time DESC
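Change the reference to the winners table in the join clause to a sub-query. This gives you control over which records are returned, and in what order. Note that LIMIT 1 applies to the whole sub-query rather than to each auction, so as written it returns a single row in total; picking one row per auction needs a per-auction condition such as a grouped or correlated sub-query.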
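The earliest-date approach can be sketched as follows, using Python's sqlite3 module as a stand-in for MySQL. The winner_date and name columns and all sample rows are hypothetical, since the real table structure is not shown. The grouped sub-query is joined first and winners is then matched on both auction id and date, so only the earliest winner row per auction is kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE auctions (
        auction_id INTEGER PRIMARY KEY, start_time TEXT,
        active INTEGER, approved INTEGER
    );
    -- winner_date and name are assumed columns for illustration.
    CREATE TABLE winners (auction_id INTEGER, name TEXT, winner_date TEXT);
    INSERT INTO auctions VALUES
        (1, '2020-01-01', 1, 1), (2, '2020-02-01', 1, 1), (3, '2020-03-01', 0, 1);
    INSERT INTO winners VALUES
        (1, 'late', '2020-01-09'), (1, 'early', '2020-01-05'),
        (3, 'hidden', '2020-03-02');
    """
)

rows = conn.execute(
    """
    SELECT a.auction_id, w1.name, w1.winner_date
    FROM auctions AS a
    LEFT JOIN (
        -- One row per auction: the earliest winner date.
        SELECT auction_id, MIN(winner_date) AS min_date
        FROM winners
        GROUP BY auction_id
    ) AS w2 ON a.auction_id = w2.auction_id
    LEFT JOIN winners AS w1
           ON w1.auction_id = w2.auction_id
          AND w1.winner_date = w2.min_date
    WHERE a.active = 1 AND a.approved = 1
    ORDER BY a.start_time DESC
    LIMIT 10
    """
).fetchall()
print(rows)  # [(2, None, None), (1, 'early', '2020-01-05')]
```

Auction 2 has no winners and still appears with NULLs, while auction 1 appears once with its earliest winner. Swapping MIN for MAX would pick the latest row instead. If two winners share the same earliest date, both rows would still be returned.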

Count on joined table causes return of 1 row

I've got a query like:
SELECT
b.title,
b.url,
b.`date`,
b.gallery,
count(c.id) as comments_count,
a.name,
b.content,
b.comments,
LEFT(b.content, LOCATE('<page>', b.content)-1) as content_short
FROM blog b
LEFT JOIN blog_comments c ON
(b.id = c.note AND c.approved = 1)
LEFT JOIN administrators a ON
(b.aid = a.id)
WHERE
b.`date` < now() AND
b.active = 1
ORDER BY b.`date` DESC;
Now, when I remove count(c.id) as comments_count, I get 2 rows returned. When it's present, only 1 row is returned.
Is there some way to fix it, or do I simply have to change
count(c.id) as comments_count, to (select count(id) as comments_count from blog_comments where note = b.id) as comments_count,?
COUNT() is an aggregate function, so it is applied per group.
Groups are formed by GROUP BY. Since your query has none, MySQL treats the entire joined result as a single group.
It therefore applies the count to that one group and returns a single row holding the overall count.
You should add a GROUP BY on the column you want one row for, e.g. GROUP BY b.id.
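A small sketch of the effect, run through Python's sqlite3 module rather than MySQL, with a cut-down version of the blog tables and invented rows. Note that MySQL with ONLY_FULL_GROUP_BY enabled would reject the ungrouped query instead of returning one row; SQLite accepts it, which makes the collapse easy to observe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE blog (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE blog_comments (
        id INTEGER PRIMARY KEY, note INTEGER, approved INTEGER
    );
    INSERT INTO blog VALUES (1, 'first'), (2, 'second');
    -- Post 1 has two approved comments and one unapproved; post 2 has none.
    INSERT INTO blog_comments (note, approved) VALUES (1, 1), (1, 1), (1, 0);
    """
)

# No GROUP BY: the whole join is one group, so one row comes back.
ungrouped = conn.execute(
    """
    SELECT b.title, COUNT(c.id) AS comments_count
    FROM blog b
    LEFT JOIN blog_comments c ON b.id = c.note AND c.approved = 1
    """
).fetchall()
print(len(ungrouped), ungrouped[0][1])  # 1 2

# GROUP BY b.id: one group, and therefore one row, per blog post.
grouped = conn.execute(
    """
    SELECT b.title, COUNT(c.id) AS comments_count
    FROM blog b
    LEFT JOIN blog_comments c ON b.id = c.note AND c.approved = 1
    GROUP BY b.id
    ORDER BY b.id
    """
).fetchall()
print(grouped)  # [('first', 2), ('second', 0)]
```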

Getting rid of duplicate results in MySQL query when using UNION

I have a MySQL query to get items that have had recent activity. Basically users can post a review or add it to their wishlist, and I want to get all items that have either had a new review in the last x days, or was placed on someone's wishlist.
The query goes a bit like this (slightly simplified):
SELECT items.*, reactions.timestamp AS date FROM items
LEFT JOIN reactions ON reactions.item_id = items.id
WHERE reactions.timestamp > 1251806994
GROUP BY items.id
UNION
SELECT items.*, wishlists.timestamp AS date FROM items
LEFT JOIN wishlists ON wishlists.item_id = items.id
WHERE wishlists.timestamp > 1251806994
GROUP BY items.id
ORDER BY date DESC LIMIT 5
This works, but when an item has been placed both on someone's wishlist and a review was posted, the item is returned twice. UNION removes duplicates normally, but because the date differs between the two rows, both rows are returned. Can I somehow tell MySQL to ignore the date when removing duplicate rows?
I also tried doing something like this:
SELECT items.*, IF(wishlists.id IS NOT NULL, wishlists.timestamp, reactions.timestamp) AS date FROM items
LEFT JOIN reactions ON reactions.item_id = items.id
LEFT JOIN wishlists ON wishlists.item_id = items.id
WHERE (wishlists.id IS NOT NULL AND wishlists.timestamp > 1251806994) OR
(reactions.id IS NOT NULL AND reactions.timestamp > 1251806994)
GROUP BY items.id
ORDER BY date DESC LIMIT 5
But that turned out to be insanely slow for some reason (took about half a minute).
I solved it myself, based on larryb82's idea. I basically did the following:
SELECT * FROM (
SELECT items.*, reactions.timestamp AS date FROM items
LEFT JOIN reactions ON reactions.item_id = items.id
WHERE reactions.timestamp > 1251806994
GROUP BY items.id
UNION
SELECT items.*, wishlists.timestamp AS date FROM items
LEFT JOIN wishlists ON wishlists.item_id = items.id
WHERE wishlists.timestamp > 1251806994
GROUP BY items.id
ORDER BY date DESC LIMIT 5
) AS items
GROUP BY items.id
ORDER BY date DESC LIMIT 5
Though I realize this probably doesn't take into account which date is the highest for each item... Not sure yet if that matters and if so, what to do about it.
Not sure if this would be a huge performance hit but you could try
SELECT item_field_1, item_field_2, ..., max(date) as date
FROM
(the query you posted)
GROUP BY item_field_1, item_field_2, ...
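The max(date) idea can be sketched like this, again using Python's sqlite3 module as a stand-in for MySQL. The items, reactions and wishlists rows are invented, the items table is reduced to two columns, and the outer alias is named latest here instead of date. Plain JOIN is used inside the union because the WHERE filter on the joined table discards unmatched rows anyway.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE reactions (item_id INTEGER, timestamp INTEGER);
    CREATE TABLE wishlists (item_id INTEGER, timestamp INTEGER);
    INSERT INTO items VALUES (1, 'book'), (2, 'lamp'), (3, 'idle');
    -- Item 1 has a recent review and wishlist entry; item 2 only a recent wishlist entry.
    INSERT INTO reactions VALUES (1, 1251900000), (2, 1251000000);
    INSERT INTO wishlists VALUES (1, 1251950000), (2, 1251850000);
    """
)

CUTOFF = 1251806994

# Collect both kinds of activity, then keep the newest timestamp per item.
recent = conn.execute(
    """
    SELECT id, name, MAX(date) AS latest
    FROM (
        SELECT items.id, items.name, reactions.timestamp AS date
        FROM items
        JOIN reactions ON reactions.item_id = items.id
        WHERE reactions.timestamp > :cutoff
        UNION ALL
        SELECT items.id, items.name, wishlists.timestamp AS date
        FROM items
        JOIN wishlists ON wishlists.item_id = items.id
        WHERE wishlists.timestamp > :cutoff
    ) AS activity
    GROUP BY id, name
    ORDER BY latest DESC
    LIMIT 5
    """,
    {"cutoff": CUTOFF},
).fetchall()
print(recent)  # [(1, 'book', 1251950000), (2, 'lamp', 1251850000)]
```

Each item appears once, carrying the most recent of its review and wishlist timestamps, and the item with no recent activity is absent.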
I don't think you need a UNION here at all.
SELECT items.*, GREATEST(COALESCE(wishlists.timestamp, 0), COALESCE(reactions.timestamp, 0)) as date
FROM items
LEFT JOIN reactions ON reactions.item_id = items.id AND reactions.timestamp > 1251806994
LEFT JOIN wishlists ON wishlists.item_id = items.id AND wishlists.timestamp > 1251806994
ORDER BY date DESC limit 5
Your use of LEFT JOIN above was probably very slow because of the predicate with the OR in it. You asked the database to join the three tables together and then examined that result for timestamp information. My statement should form a smaller intermediate table. Items that have neither a reaction nor a wishlist entry get a date of 0 and sort last, so they drop out of the LIMIT 5 as long as at least five items have recent activity.