Why does COUNT(*) do the same as COUNT(column) in this query? - mysql

I am finishing a SQL course and they have a query example that I don't quite understand.
I have these tables, and I need to get how many 'etiquetas' (tags in Spanish) each post has, so they have this solution:
SELECT posts.titulo, COUNT(*) num_etiquetas
FROM posts
INNER JOIN posts_etiquetas ON posts.id = posts_etiquetas.post_id
INNER JOIN etiquetas ON etiquetas.id = posts_etiquetas.etiqueta_id
GROUP BY posts.id
ORDER BY num_etiquetas DESC;
I have been trying to understand this query and two questions came up:
Why does COUNT(*) do the same as COUNT(etiquetas.nombre)? To me only the latter makes sense. I don't quite understand why COUNT(*) works in the given solution: isn't COUNT(*) supposed to count the total number of rows? Maybe the issue is that I don't really understand how GROUP BY works.
Why doesn't deleting this line change the result? What is its use in the original solution?
INNER JOIN etiquetas ON etiquetas.id = posts_etiquetas.etiqueta_id
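Both questions can be checked on a tiny data set. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for MySQL, with made-up rows and an assumed NOT NULL constraint on etiquetas.nombre. After GROUP BY, COUNT(*) counts the rows in each group, not in the whole table, while COUNT(column) counts the rows where that column is not NULL. With an inner join and no NULL names the two agree. Dropping the second join changes nothing as long as every posts_etiquetas row points at an existing etiqueta.

```python
import sqlite3

# SQLite stands in for MySQL here; the table contents are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id INTEGER PRIMARY KEY, titulo TEXT);
CREATE TABLE etiquetas (id INTEGER PRIMARY KEY, nombre TEXT NOT NULL);
CREATE TABLE posts_etiquetas (post_id INTEGER, etiqueta_id INTEGER);
INSERT INTO posts VALUES (1, 'Post A'), (2, 'Post B');
INSERT INTO etiquetas VALUES (1, 'sql'), (2, 'mysql'), (3, 'joins');
INSERT INTO posts_etiquetas VALUES (1, 1), (1, 2), (1, 3), (2, 1);
""")


def tag_counts(count_expr, join_etiquetas=True):
    """Run the course query with a given COUNT expression, with or without the second join."""
    second_join = (
        "INNER JOIN etiquetas ON etiquetas.id = posts_etiquetas.etiqueta_id"
        if join_etiquetas
        else ""
    )
    sql = f"""
        SELECT posts.titulo, {count_expr} AS num_etiquetas
        FROM posts
        INNER JOIN posts_etiquetas ON posts.id = posts_etiquetas.post_id
        {second_join}
        GROUP BY posts.id
        ORDER BY num_etiquetas DESC
    """
    return conn.execute(sql).fetchall()


star = tag_counts("COUNT(*)")                      # rows per group
by_column = tag_counts("COUNT(etiquetas.nombre)")  # non-NULL names per group
without_join = tag_counts("COUNT(*)", join_etiquetas=False)
print(star, by_column, without_join)
```

All three calls return the same list here. They would differ only if nombre could be NULL, or if the junction table referenced an etiqueta id that does not exist.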

Related

PHP & MySQL Distinct results using INNER JOIN, WHERE and OR clauses (filtering duplicates)

I know the question is very old, but it's hard to run a very specific search, and situations may still differ in small but crucial ways. I'm trying to figure out the correct MySQL query for my forum search script. What I need is to list both topics and posts containing matching strings with a single query, and each result has to come with the corresponding data from the other table (topics and posts). If the match is in the topic name, there should be only 1 post per topic, which is the very first one. My initial query is thus:
SELECT topics.topic, posts.post
FROM topics
JOIN posts
ON
posts.topic_id=topics.id
WHERE topics.topic LIKE '%$search%'
OR posts.post LIKE '%$search%'
This returns all the posts with matching strings correctly, but also a lot of duplicate topics, along with as many irrelevant posts as each matching topic contains. Typically one would use GROUP BY in such a situation, but it won't do any good here, since grouping by either topics.topic or posts.post spoils the other half of the results. I also don't see how SELECT DISTINCT could help, and it doesn't seem to do anything anyway, and LIMIT 1 cannot be applied individually here. As always, my last resort is 2 individual queries, but if a single call is possible I would be delighted to see it.
Ok everyone, the solution was much simpler than expected (no sub-queries are necessary after all):
SELECT topics.topic, posts.post
FROM topics
JOIN posts ON posts.topic_id = topics.id
WHERE topic LIKE '%$search%'
GROUP BY topics.id
UNION
SELECT topics.topic, posts.post
FROM posts
JOIN topics ON topics.id = posts.topic_id
WHERE posts.post LIKE '%$search%'
Of course, the SELECT order matters a lot and one little thing can change everything. Thank you all for the help, and let's hope this will assist others in the future!
Try this:
SELECT t.topic as tpost, ppost = (SELECT p.post FROM posts p WHERE p.topic_id = t.id ORDER BY p.id LIMIT 1)
FROM topics t
where t.topic LIKE 'xxx'
union
SELECT t2.topic, p2.post
FROM posts p2
JOIN topics t2 on p2.topic_id = t2.id
WHERE p2.post LIKE 'xxx'
EDIT
Testing of this suggested solution showed that a modification is required (the column alias has to be written with AS, and every column has to be referenced through its table alias):
SELECT t.topic as tpost, (SELECT p.post FROM posts p WHERE p.topic_id = t.id ORDER BY p.id LIMIT 1) as ppost
FROM topics t
where t.topic LIKE 'xxx'
union
SELECT t2.topic, p2.post
FROM posts p2
JOIN topics t2 on p2.topic_id = t2.id
WHERE p2.post LIKE 'xxx'
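To see what the correlated subquery plus UNION returns, here is a minimal sketch that runs it on SQLite via Python's sqlite3, as a stand-in for MySQL. The rows are made up, and the search pattern is passed as a bound parameter (:pattern) rather than pasted into the SQL string the way '%$search%' is in the question.

```python
import sqlite3

# SQLite stands in for MySQL; topics and posts below are made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE topics (id INTEGER PRIMARY KEY, topic TEXT);
CREATE TABLE posts (id INTEGER PRIMARY KEY, topic_id INTEGER, post TEXT);
INSERT INTO topics VALUES (1, 'cats and dogs'), (2, 'weather');
INSERT INTO posts VALUES
  (1, 1, 'first post'), (2, 1, 'second post'),
  (3, 2, 'it rains cats today'), (4, 2, 'sunny');
""")

sql = """
SELECT t.topic AS tpost,
       -- first post of the matching topic, picked deterministically by lowest id
       (SELECT p.post FROM posts p
        WHERE p.topic_id = t.id ORDER BY p.id LIMIT 1) AS ppost
FROM topics t
WHERE t.topic LIKE :pattern
UNION
SELECT t2.topic, p2.post
FROM posts p2
JOIN topics t2 ON p2.topic_id = t2.id
WHERE p2.post LIKE :pattern
"""

# UNION does not guarantee an order, so sort on the Python side for display.
rows = sorted(conn.execute(sql, {"pattern": "%cats%"}).fetchall())
print(rows)
```

The topic match contributes exactly one row (the topic with its first post), and the post match contributes each matching post with its topic. UNION also removes exact duplicates if the same pair shows up in both halves.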

MySql: order by along with group by - performance

I have a performance problem with a query that has both ORDER BY and GROUP BY. I have checked similar problems on SO but did not find a solution to this :(
I have something like this in my db schema:
pattern has many pattern_file belongs to project_template which belongs to project
Now I want to get projects filtered by some data (from additional tables that I join), ordered for example by projects.priority, and grouped by patterns.id. I have tried many things, and to get the desired result I came up with this query:
SELECT DISTINCT `projects`.* FROM `projects`
INNER JOIN `project_templates` ON `project_templates`.`project_id` = `projects`.`id`
INNER JOIN `pattern_files` ON `pattern_files`.`id` = `project_templates`.`pattern_file_id`
INNER JOIN `patterns` ON `patterns`.`id` = `pattern_files`.`pattern_id`
...[ truncated ]
INNER JOIN (SELECT DISTINCT projects.id FROM `projects` INNER JOIN `project_templates` ON `project_templates`.`project_id` = `projects`.`id`
INNER JOIN `pattern_files` ON `pattern_files`.`id` = `project_templates`.`pattern_file_id`
INNER JOIN `patterns` ON `patterns`.`id` = `pattern_files`.`pattern_id`
...[ truncated ]
WHERE [here my conditions] ORDER BY [here my order]) P
ON P.id = projects.id
WHERE [here my conditions]
GROUP BY patterns.id
ORDER BY [here my order]
From my research, I have to INNER JOIN with a subquery to get around the "ORDER BY before GROUP BY" problem. I then put the same conditions on the outer query for performance. I had to repeat the ORDER BY in the outer query too, otherwise the result comes back in the default order.
Now there is a real performance problem: I have about 6k projects, and when I run this query without any conditions it takes about 15s :/ When I narrow the result by specifying conditions, the time drops drastically. I've read somewhere that the subquery is run for every row of the outer query, which could be true judging by the execution time :/
Could you please give some advice on how I can optimize the query? I do not work much with SQL, so maybe I have been approaching this from the wrong side from the very beginning?
P.S. I have tried WHERE projects.id IN (SELECT projects.id FROM projects ....), and that got rid of the performance issue but also lost the ORDER BY before GROUP BY.
EDIT.
I want to retrieve a list of projects, but I also want to filter and order it, and finally I want patterns.id to be unique (that is why I use the GROUP BY).
order by in your inner query (p) doesn't make sense (any inner sort will only have an arbitrary effect).
@Solarflare Unfortunately it does. GROUP BY will take the first row from the grouped result, and the order is preserved through the join. Well, I believe that is specific to MySQL. Furthermore, to keep the order from the subquery I could use ORDER BY NULL in the outer query :-)
Also, select projects.* ... group by pattern.id is fishy (although MySQL, in contrast to every other dbms, allows you to do this)
so we can assume I retrieve only projects.id, but from docs:
MySQL extends the use of GROUP BY to permit selecting fields that are not mentioned in the GROUP BY clause
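One way to avoid depending on which row GROUP BY happens to keep is to state the "top row per group" rule explicitly. The sketch below is not the query from the question but a swapped-in technique: a join against a per-group MAX. It assumes a hypothetical flattened schema where pattern_id sits directly on projects (in the question it is reached through project_templates and pattern_files), and it runs on SQLite via Python's sqlite3 as a stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical flattened schema: pattern_id stored directly on projects.
CREATE TABLE projects (id INTEGER PRIMARY KEY, priority INTEGER, pattern_id INTEGER);
INSERT INTO projects VALUES (1, 5, 10), (2, 9, 10), (3, 1, 20), (4, 7, 20), (5, 3, 30);
""")

sql = """
SELECT p.id, p.priority, p.pattern_id
FROM projects p
INNER JOIN (
    -- one row per pattern carrying its highest priority
    SELECT pattern_id, MAX(priority) AS top_priority
    FROM projects
    GROUP BY pattern_id
) best ON best.pattern_id = p.pattern_id AND best.top_priority = p.priority
ORDER BY p.priority DESC
"""
top_per_pattern = conn.execute(sql).fetchall()
print(top_per_pattern)
```

The derived table is computed once rather than once per outer row, and the result does not depend on an inner ORDER BY. If two projects of the same pattern share the top priority, both are returned, so a tie-breaker (for example the lowest id) would be needed to keep patterns.id strictly unique.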

Multiple joins on the same table with counting in one query

I have an elementary question about an SQL query that joins the same table twice. It sounds very simple, but I am having some trouble with it. I hope someone can help me with this issue :)
I have two small tables: "people" (columns: id, name, ...) and "likes" (id, who, whom). People can "like" each other. The relationship is many-to-many.
I want to get a table of people's likes: the count of received likes, the count of delivered likes, and the count of mutual likes.
Everything is correct when I use only one join. But with two joins (or more), MySQL combines all the rows (as expected) and I get wrong values in the counts. I don't know how I should use count/sum/group-by in this case :( I would like to do this in one query, without subqueries.
I used a query like this:
SELECT *, count(l1.whom), count(l2.whom)
FROM people p
LEFT JOIN likes l1 ON l1.who = p.id
LEFT JOIN likes l2 ON l2.whom = p.id
GROUP BY p.id;
and a second one like this:
SELECT p.id, name,
count(lwho.who) delivered_likes,
count(lwhom.whom) received_likes,
count(lmut.who) mutual_likes
FROM people AS p
LEFT JOIN likes AS lwho ON p.id = lwho.who
LEFT JOIN likes AS lwhom ON lwhom.id = lwho.id
LEFT JOIN likes AS lmut ON lwhom.who = lmut.whom AND lwhom.whom = lmut.who
GROUP BY p.id;
But it calculates the like counts incorrectly.
This is just a training exercise and performance is not important, but I guess that three joins in my last query are too many. Can I do it with 2 joins?
Thanks in advance for help.
I surmise that there is a 1:N relationship between people and likes.
One problem with your second query, as far as I can tell, is that the lwhom correlation of likes is joined to lwho via id=id. Basically lwhom is lwho. I'd recommend changing the ON clause for this correlation from lwhom.id = lwho.id to p.id = lwhom.whom.
The counts will still be affected by the JOINs, however. Supposing that you have an ID column in the likes table, though, you could then have each COUNT tally the distinct like IDs per person; if not, consider COUNT(DISTINCT correlation.who, correlation.whom) instead, since MySQL accepts several expressions inside COUNT(DISTINCT ...).
Digressions aside, the following should hopefully work:
SELECT p.id, name,
count(distinct lwho.id) delivered_likes,
count(distinct lwhom.id) received_likes,
count(distinct lmut.id) mutual_likes
FROM people AS p
LEFT JOIN likes AS lwho ON p.id = lwho.who
LEFT JOIN likes AS lwhom ON p.id = lwhom.whom
LEFT JOIN likes AS lmut ON lwhom.who = lmut.whom AND lwhom.whom = lmut.who
GROUP BY p.id,p.name;
I have an SQL Fiddle here.
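For readers without the fiddle, here is a small sketch that reproduces the effect on SQLite via Python's sqlite3 (a stand-in for MySQL, with made-up people and likes). It runs the two-join query from the question next to the COUNT(DISTINCT ...) version, so the inflation caused by the joins is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE likes (id INTEGER PRIMARY KEY, who INTEGER, whom INTEGER);
INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
-- Ann likes Bob and Cid, Bob likes Ann, Cid likes Bob
INSERT INTO likes VALUES (1, 1, 2), (2, 1, 3), (3, 2, 1), (4, 3, 2);
""")

# Plain COUNT over two joins: every delivered like is paired with every
# received like, so both counts are multiplied by each other.
naive = conn.execute("""
    SELECT p.name, COUNT(l1.whom), COUNT(l2.whom)
    FROM people p
    LEFT JOIN likes l1 ON l1.who = p.id
    LEFT JOIN likes l2 ON l2.whom = p.id
    GROUP BY p.id, p.name
    ORDER BY p.id
""").fetchall()

# COUNT(DISTINCT likes.id) collapses the repeated rows again.
distinct = conn.execute("""
    SELECT p.name,
           COUNT(DISTINCT lwho.id)  AS delivered_likes,
           COUNT(DISTINCT lwhom.id) AS received_likes,
           COUNT(DISTINCT lmut.id)  AS mutual_likes
    FROM people AS p
    LEFT JOIN likes AS lwho ON p.id = lwho.who
    LEFT JOIN likes AS lwhom ON p.id = lwhom.whom
    LEFT JOIN likes AS lmut ON lwhom.who = lmut.whom AND lwhom.whom = lmut.who
    GROUP BY p.id, p.name
    ORDER BY p.id
""").fetchall()

print(naive)
print(distinct)
```

With this data the naive query reports 2 delivered and 2 received likes for both Ann and Bob, while the DISTINCT version gives Ann 2 delivered, 1 received, 1 mutual and Bob 1 delivered, 2 received, 1 mutual.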

Adding count(*) from a join to a query

Ugh... I really struggle with these MySQL joins...
Here's what I'm after. My current query looks like this.
SELECT profiles.photo, postings.postid, postings.text, postings.date, members.fname, members.lname, members.userid
FROM profiles, postings, members
WHERE postings.wallid=postings.posterid
AND postings.wallid=members.userid
AND postings.wallid=profiles.userid
I'm trying to add in a count(*) of matching records from another table. The other table is called likes, and postings.postid = likes.post_id.
What I've tried (and returns no results) is this...
SELECT profiles.photo, postings.postid, postings.text, postings.date, members.fname, members.lname, members.userid,
(SELECT COUNT(*) FROM likes WHERE postings.postid=likes.post_id)
FROM profiles, postings, members, likes
WHERE postings.wallid=postings.posterid
AND postings.wallid=members.userid
AND postings.wallid=profiles.userid
What I've done here is add the nested SELECT, which I thought would resolve this.
Basically, what I'm asking... given my first query, how can I also obtain the count of the number of records in the likes table where postings.postid = likes.post_id? As always, any help / tips / suggestions is always appreciated.
Try this:
SELECT profiles.photo, postings.postid, postings.text, postings.date, members.fname,members.lname, members.userid,
likes.like_count
FROM profiles, members, postings
LEFT JOIN (SELECT post_id, COUNT(*) like_count FROM likes GROUP BY post_id) as likes
ON postings.postid=likes.post_id
WHERE postings.wallid=postings.posterid
AND postings.wallid=members.userid
AND postings.wallid=profiles.userid
First off, it's a lot easier to use the new-style sql syntax for this. Conceptually, it's going to be confusing and tough to do outer joins with that old syntax. Second, you're missing the "profile" table and mistakenly seem to have self-joined the postings table. Try writing your queries more like this:
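SELECT
profiles.photo,
postings.postid,
postings.text,
postings.date,
members.fname,
members.lname,
members.userid,
COUNT(DISTINCT likes.like_id) as CountOfLikes -- like_id: assumed name of the key column in likes
FROM profiles
INNER JOIN postings
ON postings.wallid=profiles.userid
INNER JOIN members
ON members.userid=postings.wallid
LEFT JOIN likes
ON likes.post_id = postings.postid
-- the aggregate needs a GROUP BY over the non-aggregated columns
GROUP BY profiles.photo, postings.postid, postings.text, postings.date,
members.fname, members.lname, members.userid;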
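The derived-table approach from the first answer can be tried in isolation. This is a minimal sketch on SQLite via Python's sqlite3 (standing in for MySQL), using trimmed-down, made-up versions of postings and likes. The like_id column and the COALESCE wrapper are additions for illustration; without COALESCE, postings with no likes get NULL instead of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Trimmed-down, made-up versions of the tables in the question.
CREATE TABLE postings (postid INTEGER PRIMARY KEY, text TEXT);
CREATE TABLE likes (like_id INTEGER PRIMARY KEY, post_id INTEGER);
INSERT INTO postings VALUES (1, 'hello'), (2, 'world'), (3, 'nobody likes me');
INSERT INTO likes (post_id) VALUES (1), (1), (2);
""")

like_counts = conn.execute("""
    SELECT postings.postid, postings.text,
           COALESCE(lc.like_count, 0) AS like_count
    FROM postings
    LEFT JOIN (
        -- aggregate likes first, so the outer query needs no GROUP BY
        SELECT post_id, COUNT(*) AS like_count
        FROM likes
        GROUP BY post_id
    ) AS lc ON lc.post_id = postings.postid
    ORDER BY postings.postid
""").fetchall()
print(like_counts)
```

Because likes is aggregated before it is joined, each posting matches at most one row of the derived table, so further joins to profiles and members would not multiply the count.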

MySQL alternative to nested queries?

I have read that using nested queries is not a good idea. It was said that nested queries slow MySQL down a lot, so I figured I should not use them, but what is the alternative?
For example, I have a comment rating system which helps bring top-rated comments to the top, and it uses 2 tables:
comments which stores comments
comment_ratings which stores the comment ID and the person who has rated it.
Note: there are only positive ratings, so if a record exists in the comment_ratings table it's a +1.
So now if I wanted to pick up comments for some stuff I'd go like
SELECT stuff, (SELECT COUNT(*) FROM comment_ratings s WHERE s.id = c.id) as votes
FROM comments c
ORDER BY votes DESC
How would I do this without using a nested query?
Whether nested queries are good or bad depends on the situation. In your particular example, if you have an index on comment_ratings(id), then there is probably no issue. Maybe that should be comment_ratings(comment_id) -- the naming convention is poor for these tables.
You could replace this with an aggregation query:
select c.*, count(cr.id) as votes
from comments c left join
comment_ratings cr
on c.id = cr.id
group by c.id
order by votes desc;
However, because of the way that MySQL implements group by, this might perform worse than your original query. I prefer the group by. To me, it more clearly describes what you want and most other database engines will optimize it well.
select stuff, count(*) as votes
from comments c, comment_ratings cr
where c.id = cr.id
group by stuff
order by votes desc;
And as Gordon mentioned, so as not to lose the comments with no rating, go for a left join:
select stuff, count(cr.id) as votes
from comments c left join
comment_ratings cr on c.id = cr.id
group by stuff
order by votes desc;
http://sqlfiddle.com/#!2/79e54/2
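The three variants in this thread can be compared side by side. The sketch below runs them on SQLite via Python's sqlite3 as a stand-in for MySQL, so it says nothing about MySQL's performance, only about the results. The rows are made up, and user_id is an assumed name for the column that stores who rated the comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- comment_ratings.id holds the rated comment's id, as in the question;
-- user_id is an assumed column name for the person who rated it.
CREATE TABLE comments (id INTEGER PRIMARY KEY, stuff TEXT);
CREATE TABLE comment_ratings (id INTEGER, user_id INTEGER);
INSERT INTO comments VALUES (1, 'great'), (2, 'meh'), (3, 'unrated');
INSERT INTO comment_ratings VALUES (1, 100), (1, 101), (2, 100);
""")


def run(sql):
    return conn.execute(sql).fetchall()


# Original nested (correlated) subquery.
nested = run("""
    SELECT stuff, (SELECT COUNT(*) FROM comment_ratings s WHERE s.id = c.id) AS votes
    FROM comments c
    ORDER BY votes DESC, stuff
""")

# LEFT JOIN + GROUP BY: COUNT(cr.id) skips the NULLs of unrated comments.
left_join = run("""
    SELECT c.stuff, COUNT(cr.id) AS votes
    FROM comments c LEFT JOIN comment_ratings cr ON c.id = cr.id
    GROUP BY c.id, c.stuff
    ORDER BY votes DESC, c.stuff
""")

# Inner join: comments without ratings disappear from the result.
inner_join = run("""
    SELECT c.stuff, COUNT(*) AS votes
    FROM comments c INNER JOIN comment_ratings cr ON c.id = cr.id
    GROUP BY c.id, c.stuff
    ORDER BY votes DESC, c.stuff
""")

print(nested, left_join, inner_join)
```

The nested query and the left join return the same rows, including the unrated comment with 0 votes. The inner join drops that comment. Grouping by c.id as well as c.stuff keeps two different comments with identical text from being merged into one row.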