I'm hoping to get some explanation of a ranking query I've come across. I have a similar setup where I've got a points field that I want to order by, but I can't for the life of me understand this query: http://www.artfulsoftware.com/infotree/queries.php#460
I don't understand what the join is actually doing, and why if I don't include the group by statements, I just get one record that is completely jumbled and incorrect. I always see Group By statements as limiting the number of results, but this query seems to be adding them to the final results set (as without the group by, you get one row returned)
The query is essentially doing this:
For each row in the original votes table, count how many rows (in the same table) have votes <= that row's votes. The count is the same as the rank.
The JOIN is needed to link the votes table to each individual row in the votes table.
GROUP BY is needed because the select list mixes COUNT() with per-person columns. The JOIN multiplies the votes table against itself, pairing each person with every row whose votes are <= theirs. GROUP BY then collapses those pairs back to one row per person, and COUNT() counts the pairs in each group. Without GROUP BY, all the pairs form a single group, which is why you get one row back.
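The explanation above can be tried out with a small sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, a made-up three-row votes table, and a join condition written exactly as described in this answer rather than copied from the linked page.

```python
import sqlite3

# Hypothetical sample data: three people with distinct vote totals.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE votes (name TEXT, votes INTEGER)")
conn.executemany("INSERT INTO votes VALUES (?, ?)",
                 [("ann", 10), ("bob", 30), ("cat", 20)])

# Self-join: pair each row (v1) with every row (v2) whose votes are <= v1's.
# GROUP BY folds the pairs back into one row per person; COUNT(*) is the rank.
ranked = conn.execute("""
    SELECT v1.name, v1.votes, COUNT(*) AS rnk
    FROM votes v1
    JOIN votes v2 ON v2.votes <= v1.votes
    GROUP BY v1.name, v1.votes
    ORDER BY rnk
""").fetchall()

# Same join without GROUP BY: every pair lands in one group, so one row.
# (SQLite accepts the bare name column here; which name it shows is arbitrary.)
ungrouped = conn.execute("""
    SELECT v1.name, COUNT(*)
    FROM votes v1
    JOIN votes v2 ON v2.votes <= v1.votes
""").fetchall()

print(ranked)
print(ungrouped)
```

With "<=" the person with the fewest votes gets rank 1. Flipping the comparison to ">=" would rank from the top instead.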
Okay, I know that the query works, as it runs just fine in the DB that I use for practice. However, I am still extremely new to MySQL and would just like to understand a little bit more.
here is my query...
SELECT
    photos.id, photos.image_url, COUNT(*) AS total
FROM photos
JOIN likes
    ON likes.photo_id = photos.id
GROUP BY photos.id
ORDER BY total DESC
LIMIT 1;
so my question is, how does the aggregate function "COUNT(*)" know to count from the correct table? I want it to count from the "likes" table and it does that, but how does it understand that is what I am asking?
I was originally thinking I would need to do a "COUNT(likes.photo_id)" but it was unnecessary.
so how does it know?
am I just going down a rabbit hole that in the long run just does not matter?
COUNT(*) counts rows in the result the query produces, not rows in any one table. If you run the query without the COUNT and the GROUP BY, it returns a specific set of rows. Those are the rows COUNT(*) is counting, one total per group.
It isn't counting rows in either photos or likes. It's counting rows in the joined result set.
Here's a cool thing about SQL: the result of relational operations (for example JOIN or UNION) between tables is... another table! It isn't a table that is stored in your database, but it's a table.
You can think of an analogy in arithmetic: the sum of two positive integers is another positive integer. In mathematics, this property is called closure.
It's the same in relational algebra. When you combine two tables with one of the relational operators, the result is another thing that could be a table itself.
So COUNT(*) is not counting rows in either table. It's counting the rows in the table produced as the result of the JOIN, and because of the GROUP BY it does so separately for each photos.id.
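To see the joined table that COUNT(*) works on, here is a minimal sketch. It assumes SQLite via Python's sqlite3 in place of MySQL, and the user_id column and all row values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photos (id INTEGER PRIMARY KEY, image_url TEXT);
    CREATE TABLE likes (user_id INTEGER, photo_id INTEGER);
    INSERT INTO photos VALUES (1, 'a.jpg'), (2, 'b.jpg');
    -- Photo 1 has two likes, photo 2 has three (made-up data).
    INSERT INTO likes VALUES (10, 1), (11, 1), (10, 2), (11, 2), (12, 2);
""")

# The JOIN alone: one row per (photo, like) pair. This is the "table"
# that the aggregate operates on.
joined = conn.execute("""
    SELECT photos.id, likes.user_id
    FROM photos
    JOIN likes ON likes.photo_id = photos.id
""").fetchall()

# The query from the question: COUNT(*) counts joined rows per photos.id.
top = conn.execute("""
    SELECT photos.id, photos.image_url, COUNT(*) AS total
    FROM photos
    JOIN likes ON likes.photo_id = photos.id
    GROUP BY photos.id
    ORDER BY total DESC
    LIMIT 1
""").fetchone()

print(len(joined), top)
```

Because every joined row here has a non-NULL likes.photo_id, COUNT(likes.photo_id) would give the same totals as COUNT(*) in this inner join.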
Why do I not get the same results when running the two queries? If I run the second one I get the course with the smallest amount of credits and when I run the first one I get the courses ordered by courseid
select min(credits), title, courseid
from course
group by title, courseid;

select min(credits)
from course;
An aggregation query is any query that has a group by or an aggregation function in the select.
An aggregation query returns one row per group, where a "group" is defined as the unique combination of values of the keys in the group by clause. If there is no group by clause, then all rows are taken to be a single group and one row is returned.
So, your first query returns one row for each combination of title and courseid in the course table. That row contains the minimum value of credits for that combination. If the course table has only one row per courseid, then the results are very similar to the contents of the table.
The second query returns one row overall, with the minimum number of credits of all rows.
If you want to get the one row with the minimum number of credits, then you don't want an aggregation query. Instead, you can use:
select c.*
from course c
order by c.credits
limit 1;
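The three queries discussed in this answer can be compared side by side. This is only a sketch: it assumes SQLite via Python's sqlite3 rather than the asker's database, a made-up course table with one row per courseid, and an ORDER BY added to the grouped query so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (courseid INTEGER, title TEXT, credits INTEGER)")
conn.executemany("INSERT INTO course VALUES (?, ?, ?)",
                 [(1, "Algebra", 5), (2, "Biology", 3), (3, "Chemistry", 4)])

# First query: one row per (title, courseid) group, so three rows here.
per_group = conn.execute("""
    SELECT MIN(credits), title, courseid
    FROM course
    GROUP BY title, courseid
    ORDER BY courseid
""").fetchall()

# Second query: no GROUP BY, so the whole table is one group and one row.
overall = conn.execute("SELECT MIN(credits) FROM course").fetchall()

# Non-aggregating alternative: the full row with the fewest credits.
cheapest = conn.execute("""
    SELECT c.*
    FROM course c
    ORDER BY c.credits
    LIMIT 1
""").fetchone()

print(per_group, overall, cheapest)
```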
A group by does not filter rows; it partitions them. In the first query the rows are split into one group per distinct (title, courseid) pair and min(credits) is computed inside each group, so you get one row per pair. In the second query there is no grouping, so min(credits) is computed once over the whole table.
Take a look at a group by doc maybe with some graphical examples like this:
https://www.geeksforgeeks.org/sql-group-by/
I have a very basic question which I cannot answer myself but shouldn't take much of your time.
The following query works, it lists all the exhibition_category_id and counts the total of objects that are assigned to each category.
My question is: Why does it do it? I don't understand the query. It says count(*) - why doesn't it give me the total of different exhibition_category_id's (79), but instead counts, how many objects are assigned to each category?
Here is the query in question, as well as a screen shot from the actual output:
SELECT eb.exhibition_category_id, count(*) AS total
FROM exhibition_brand eb
GROUP BY eb.exhibition_category_id
https://i.stack.imgur.com/6deMv.png
I hope it's understandable what I am asking; I'm eager to improve my post based on feedback.
Cheers
Your query is a basic aggregation query:
SELECT eb.exhibition_category_id, count(*) AS total
FROM exhibition_brand eb
GROUP BY eb.exhibition_category_id;
The GROUP BY specifies that the result set will contain one row for each value of eb.exhibition_category_id. The result set consists of two columns: one is the value that defines the group, and the other is a count of the number of rows in that group. That is what COUNT(*) does.
If you wanted the total count of different eb.exhibition_category_id, then you want one row and COUNT(DISTINCT):
select count(distinct eb.exhibition_category_id)
from exhibition_brand eb;
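A small sketch of the difference between the two queries, assuming SQLite via Python's sqlite3 instead of MySQL. The object_id column and the six sample rows are hypothetical, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exhibition_brand (object_id INTEGER, exhibition_category_id INTEGER)"
)
# Six objects spread over three categories (made-up data).
conn.executemany("INSERT INTO exhibition_brand VALUES (?, ?)",
                 [(1, 1), (2, 1), (3, 2), (4, 3), (5, 3), (6, 3)])

# One row per category; COUNT(*) is the number of rows inside each group.
per_category = conn.execute("""
    SELECT eb.exhibition_category_id, COUNT(*) AS total
    FROM exhibition_brand eb
    GROUP BY eb.exhibition_category_id
    ORDER BY eb.exhibition_category_id
""").fetchall()

# No GROUP BY: a single row holding the number of different categories.
distinct_categories = conn.execute("""
    SELECT COUNT(DISTINCT eb.exhibition_category_id)
    FROM exhibition_brand eb
""").fetchone()[0]

print(per_category, distinct_categories)
```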
The GROUP BY clause splits the records into groups by eb.exhibition_category_id, and COUNT(*) is then evaluated once per group, so you get the number of records in each category.
I need to get the average rating and the total number of ratings for a particular user and then select all single ratings (rating_value, rating_text, creator) as well:
$rating_query = mysql_query("SELECT COUNT(1) as rating_count
,AVG(rating_value), rating_value, rating_text, creator
FROM user_rating WHERE rated_user = $user_id");
This query would return the COUNT(1) result and the AVG(rating_value) for every row, but I only need those values once.
Is there any way to do this without making 2 separate queries?
There may be a trick I'm not aware of, but I don't think that's possible to do in a single query. You could try using a GROUP BY clause if that would make sense for you, but I'm guessing it probably doesn't from the column names you're using. Any relation requires a single atomic value at any given row and column, even if that value is null. What you are requesting is that columns 1 and 2 in every row but the first have no value, and again I don't think this is possible.
I have the classic 'get all rows in one table with number of corresponding rows in another table' issue which should be solved by this query:
SELECT
ideas.id,
ideas.idea,
submitted,
COUNT(votes.id) AS vote_count
FROM ideas
LEFT OUTER JOIN votes ON ideas.id = votes.idea
WHERE dead = 0
GROUP BY votes.idea
ORDER BY vote_count DESC, submitted DESC
LIMIT 10;
There are 4 rows (with dead = 0) in ideas and one row in votes (relating to the first idea). However this query returns two records (idea #1 and idea #2) with correct vote_counts. Why is this not returning all of the records in ideas?
When you say GROUP BY votes.idea, you are asking for one result row per idea value in votes. Since you say votes has only one row, you should expect only two records in the result — one corresponding to the idea value in that row of votes, and the other with NULL (condensing the three rows with no matching vote record).
Did you mean GROUP BY ideas.idea?
Change:
GROUP BY votes.idea
to:
GROUP BY ideas.id
Because votes.idea is NULL for every idea that has no votes, so all of those ideas collapse into a single NULL group.
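The effect of the two GROUP BY columns can be reproduced with the row counts from the question (four live ideas, one vote for the first). This is a sketch under assumptions: SQLite via Python's sqlite3 stands in for MySQL, the column types and sample values are guesses, and run_query is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ideas (id INTEGER PRIMARY KEY, idea TEXT, submitted TEXT, dead INTEGER);
    CREATE TABLE votes (id INTEGER PRIMARY KEY, idea INTEGER);
    INSERT INTO ideas VALUES
        (1, 'first',  '2010-01-01', 0),
        (2, 'second', '2010-01-02', 0),
        (3, 'third',  '2010-01-03', 0),
        (4, 'fourth', '2010-01-04', 0);
    INSERT INTO votes VALUES (1, 1);
""")

def run_query(group_column):
    # Hypothetical helper: the question's query with a swappable GROUP BY column.
    return conn.execute(f"""
        SELECT ideas.id, ideas.idea, submitted, COUNT(votes.id) AS vote_count
        FROM ideas
        LEFT OUTER JOIN votes ON ideas.id = votes.idea
        WHERE dead = 0
        GROUP BY {group_column}
        ORDER BY vote_count DESC, submitted DESC
        LIMIT 10
    """).fetchall()

# Grouping by votes.idea: one group for idea 1, one NULL group for the rest.
by_votes_idea = run_query("votes.idea")

# Grouping by ideas.id: one group per idea, including ideas with zero votes.
by_ideas_id = run_query("ideas.id")

print(len(by_votes_idea), len(by_ideas_id))
```

COUNT(votes.id) skips NULLs, which is why ideas with no matching vote report 0 rather than 1.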