Sql Count on many to many - mysql

I have three tables
post
id | statement | date
features
id | feature
post_feature (many to many table between Post and Feature)
post_id | feature_id
I want a query that gives me each distinct feature and its count for the posts that fall within a given date period. I have just started learning SQL and I am not able to crack this one.
I tried the following, but it does not give correct results.
SELECT f.feature, count(f.feature)
FROM post_feature l
JOIN features f ON (l.featureid = f.id AND l.featureid IN (
select post.id from post where post.date > 'some_date'))
GROUP BY f.feature

You can try like this:
SELECT f.feature, count(f.feature)
FROM post_feature l
JOIN features f ON l.feature_id = f.id
JOIN post p ON l.post_id = p.id
WHERE p.date > 'some_date'
GROUP BY f.feature

select f.feature, count(*)
from post_feature l inner join features f on l.feature_id = f.id
inner join post p on l.post_id = p.id
where p.date > 'some_date'
group by f.feature

Your SQL is quite creative. However, the IN clause compares the wrong columns: it matches featureid against post ids. It should compare the post id in post_feature with post.id.
Although that fixes the query, here is a better way to write it:
SELECT f.feature, count(f.feature)
FROM post p join
post_feature pf
on p.id = pf.post_id join
features f
on pf.feature_id = f.id
where p.date > 'some_date'
GROUP BY f.feature
This joins all the tables, and then summarizes by the information you want to know.
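To see the three-table join in action, here is a minimal sketch using Python's built-in sqlite3 module in place of MySQL. The sample rows, the feature names and the cutoff date are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE post (id INTEGER PRIMARY KEY, statement TEXT, date TEXT);
CREATE TABLE features (id INTEGER PRIMARY KEY, feature TEXT);
CREATE TABLE post_feature (post_id INTEGER, feature_id INTEGER);
INSERT INTO post VALUES (1, 'old post', '2013-01-01'),
                        (2, 'new post', '2013-06-01'),
                        (3, 'newer post', '2013-07-01');
INSERT INTO features VALUES (1, 'wifi'), (2, 'parking');
INSERT INTO post_feature VALUES (1, 1), (2, 1), (3, 1), (3, 2);
""")

# Join post -> post_feature -> features, keep only posts after the cutoff,
# then count the surviving link rows per feature.
feature_counts = conn.execute("""
    SELECT f.feature, COUNT(*)
    FROM post p
    JOIN post_feature pf ON p.id = pf.post_id
    JOIN features f ON pf.feature_id = f.id
    WHERE p.date > ?
    GROUP BY f.feature
    ORDER BY f.feature
""", ("2013-03-01",)).fetchall()

print(feature_counts)
```

Post 1 falls before the cutoff, so its link to 'wifi' is not counted; only the links of posts 2 and 3 contribute.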

Try
SELECT f.feature, count(DISTINCT l.post_id)
FROM post_feature l
JOIN features f ON (l.feature_id = f.id AND l.post_id IN (
select post.id from post where post.date > 'some_date'))
GROUP BY f.feature

Related

Multiple aggregate functions in SQL query

For this example I got 3 simple tables (Page, Subs and Followers):
For each page I need to know how many subs and followers it has.
My result is supposed to look like this:
I tried using the COUNT function in combination with a GROUP BY like this:
SELECT p.ID, COUNT(s.UID) AS SubCount, COUNT(f.UID) AS FollowCount
FROM page p, subs s, followers f
WHERE p.ID = s.ID AND p.ID = f.ID AND s.ID = f.ID
GROUP BY p.ID
Obviously this statement returns a wrong result.
My other attempt was using two different SELECT statements and then combining the two subresults into one table.
SELECT p.ID, COUNT(s.UID) AS SubCount FROM page p, subs s WHERE p.ID = s.ID GROUP BY p.ID
and
SELECT p.ID, COUNT(f.UID) AS FollowCount FROM page p, followers f WHERE p.ID = f.ID GROUP BY p.ID
I feel like there has to be a simpler / shorter way of doing it but I'm too inexperienced to find it.
Never use commas in the FROM clause. Always use proper, explicit, standard JOIN syntax.
Next, learn what COUNT() does. It counts the number of non-NULL values. So, your expressions are going to return the same value -- because f.UID and s.UID are never NULL (due to the JOIN conditions).
The issue is that the different dimensions are multiplying the amounts. A simple fix is to use COUNT(DISTINCT):
SELECT p.ID, COUNT(DISTINCT s.UID) AS SubCount, COUNT(DISTINCT f.UID) AS FollowCount
FROM page p JOIN
subs s
ON p.ID = s.ID JOIN
followers f
ON s.ID = f.ID
GROUP BY p.ID;
The inner joins are equivalent to the original query. You probably want left joins so you can get counts of zero:
SELECT p.ID, COUNT(DISTINCT s.UID) AS SubCount, COUNT(DISTINCT f.UID) AS FollowCount
FROM page p LEFT JOIN
subs s
ON p.ID = s.ID LEFT JOIN
followers f
ON p.ID = f.ID
GROUP BY p.ID;
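The multiplication effect is easy to reproduce. The following is a rough sketch with Python's sqlite3 module standing in for MySQL; the rows are invented (page 1 has two subs and three followers, page 2 has none), and the column names are taken from the queries in the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (ID INTEGER PRIMARY KEY);
CREATE TABLE subs (UID INTEGER, ID INTEGER);
CREATE TABLE followers (UID INTEGER, ID INTEGER);
INSERT INTO page VALUES (1), (2);
INSERT INTO subs VALUES (10, 1), (11, 1);
INSERT INTO followers VALUES (20, 1), (21, 1), (22, 1);
""")

# Same query twice: once with plain COUNT, once with COUNT(DISTINCT ...).
query = """
    SELECT p.ID, COUNT({d} s.UID), COUNT({d} f.UID)
    FROM page p
    LEFT JOIN subs s ON p.ID = s.ID
    LEFT JOIN followers f ON p.ID = f.ID
    GROUP BY p.ID
    ORDER BY p.ID
"""
plain = conn.execute(query.format(d="")).fetchall()
distinct = conn.execute(query.format(d="DISTINCT")).fetchall()

print(plain)     # page 1 yields 2 x 3 joined rows, so both counts are inflated
print(distinct)  # DISTINCT collapses the duplicates back to the real counts
```

Page 2 still appears with zero counts because of the LEFT JOINs: its joined row holds only NULLs in s.UID and f.UID, and COUNT skips NULLs.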
A scalar subquery should also work in this case, and it avoids the row multiplication altogether.
SELECT p.ID,
(SELECT COUNT(s1.UID)
FROM subs s1
WHERE s1.ID = p.ID) AS SubCount,
(SELECT COUNT(f1.UID)
FROM followers f1
WHERE f1.ID = p.ID) AS FollowCount
FROM page p;

MySQL query with multiple INNER JOIN

I'm a little bit confused about a stupid query:
I get rows from the table posts joined with the table authors and the table comments, in a way like this:
SELECT posts.*, authors.name, COUNT(comments.id_post) AS num_comments
FROM posts JOIN authors ON posts.id_author = authors.id_author
LEFT JOIN comments ON posts.id_post = comments.id_post
WHERE posts.active = 1
AND comments.active = 1
This doesn't work, of course.
What I try to do is to retrieve:
1) all my active posts (those that were not marked as deleted);
2) the names of their authors;
3) the number of active comments (those that were not marked as deleted) for each post (if there is at least one);
What's the way? I know it's a trivial one, but by now my brain is in offside…
Thanks!
Presumably, id_post uniquely identifies each row in posts. Try this:
SELECT p.*, a.name, COUNT(c.id_post) AS num_comments
FROM posts p JOIN
authors a
ON p.id_author = a.id_author LEFT JOIN
comments c
ON p.id_post = c.id_post
WHERE p.active = 1 AND c.active = 1
GROUP BY p.id_post;
Note that this uses a MySQL extension. In most other databases, you would need to list all the columns in posts plus a.name in the group by clause.
EDIT:
The above is based on your query. If you want all active posts with a count of active comments, just do:
SELECT p.*, a.name, COALESCE(SUM(c.active = 1), 0) AS num_comments
FROM posts p LEFT JOIN
authors a
ON p.id_author = a.id_author LEFT JOIN
comments c
ON p.id_post = c.id_post
WHERE p.active = 1
GROUP BY p.id_post;
Since you are doing a count, you need a GROUP BY. MySQL does not accept posts.* there, so group by the key columns instead. You will need to add
GROUP BY posts.id_post, authors.name
You should use a GROUP BY clause together with aggregate functions, and it has to come after WHERE. Try something similar to:
SELECT posts.*, authors.name, COUNT(comments.id_post) AS num_comments
FROM posts JOIN authors ON posts.id_author = authors.id_author
LEFT JOIN comments ON posts.id_post = comments.id_post
WHERE posts.active = 1
AND comments.active = 1
-- group by
GROUP BY posts.id_post, authors.name
I found the correct solution:
SELECT posts.id_post, authors.name, COUNT(comments.id_post) AS num_comments
FROM posts JOIN authors
ON posts.id_author = authors.id_author
LEFT OUTER JOIN comments
ON (posts.id_post = comments.id_post AND comments.active = 1)
WHERE posts.active = 1
GROUP BY posts.id_post;
Thanks everyone for the help!
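Why moving comments.active = 1 from WHERE into the ON clause matters can be shown with a small sketch. It uses Python's sqlite3 module instead of MySQL, and the rows (and the id_comment column) are assumptions made up for the demo: post 1 has one active and one deleted comment, post 2 has only a deleted comment, post 3 is itself deleted.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (id_author INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE posts (id_post INTEGER PRIMARY KEY, id_author INTEGER, active INTEGER);
CREATE TABLE comments (id_comment INTEGER PRIMARY KEY, id_post INTEGER, active INTEGER);
INSERT INTO authors VALUES (1, 'ann');
INSERT INTO posts VALUES (1, 1, 1), (2, 1, 1), (3, 1, 0);
INSERT INTO comments VALUES (1, 1, 1), (2, 1, 0), (3, 2, 0);
""")

# The same query with the comment filter placed either in ON or in WHERE.
base = """
    SELECT posts.id_post, authors.name, COUNT(comments.id_post)
    FROM posts
    JOIN authors ON posts.id_author = authors.id_author
    LEFT JOIN comments ON posts.id_post = comments.id_post {on_extra}
    WHERE posts.active = 1 {where_extra}
    GROUP BY posts.id_post, authors.name
    ORDER BY posts.id_post
"""
filter_in_where = conn.execute(
    base.format(on_extra="", where_extra="AND comments.active = 1")).fetchall()
filter_in_on = conn.execute(
    base.format(on_extra="AND comments.active = 1", where_extra="")).fetchall()

print(filter_in_where)  # post 2 disappears: WHERE rejects its NULL-extended row
print(filter_in_on)     # post 2 is kept with a count of 0
```

With the filter in WHERE, the LEFT JOIN effectively behaves like an inner join, which is why posts without active comments went missing.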

Disambiguating identical columns in a join

[Yes, I've searched for an answer for this here and in google but this is a little difficult to query for.]
(MySQL database.)
messages table:
messageid
senderid
recipientid
people table:
personid
name
I wish to issue a query that returns the following:
messageid sender_name recipient_name
1 larry jane
2 mark alice
etc.
The following doesn't do it, and I expected that it would not, but it's a place to start:
select m.messageid, p.name as "sender_name", p.name as "recipient_name"
from messages m, people p
where m.senderid = p.personid and m.recipientid = p.personid
The issue is that I don't know how in sql to specifically reference the sender and the recipient since they are part of the same join clause, if that makes sense.
thanks
try:
select m.messageid, pSender.name as "sender_name", pRecipient.name as "recipient_name"
from messages m
inner join people pSender on m.senderId = pSender.personId
inner join people pRecipient on m.recipientid = pRecipient.personId
For your join method (I think this should work; I'm not very familiar with comma joins):
select m.messageid, pSender.name as "sender_name", pRecipient.name as "recipient_name"
from messages m, people pSender, people pRecipient
where m.senderid = pSender.personid and m.recipientid = pRecipient.personid
You can join the same table into the query twice, just alias it differently, something akin to:
select m.messageid, s.name as "sender_name", r.name as "recipient_name"
from messages m
inner join people s on m.senderid = s.personid
inner join people r on m.recipientid = r.personid
Your query returns only messages that a person sent to themselves. Try something like:
select m.messageid, p1.name as sender_name, p2.name as recipient_name
from messages m
join people p1
on m.senderid = p1.personid
join people p2
on m.recipientid = p2.personid
That is, you need one join for the sender and one join for the recipient.
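A quick way to compare the single-alias and two-alias versions is the sketch below, run on Python's sqlite3 module rather than MySQL. Messages 1 and 2 mirror the expected output in the question; message 3 (jane writing to herself) is an invented row added to show what the original query actually matches.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; message 3 is a hypothetical extra row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (personid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE messages (messageid INTEGER PRIMARY KEY, senderid INTEGER, recipientid INTEGER);
INSERT INTO people VALUES (1, 'larry'), (2, 'jane'), (3, 'mark'), (4, 'alice');
INSERT INTO messages VALUES (1, 1, 2), (2, 3, 4), (3, 2, 2);
""")

# One alias: the same people row must be both sender and recipient.
single_alias = conn.execute("""
    SELECT m.messageid, p.name, p.name
    FROM messages m
    JOIN people p ON m.senderid = p.personid AND m.recipientid = p.personid
    ORDER BY m.messageid
""").fetchall()

# Two aliases: people is joined once per role.
two_aliases = conn.execute("""
    SELECT m.messageid, s.name AS sender_name, r.name AS recipient_name
    FROM messages m
    JOIN people s ON m.senderid = s.personid
    JOIN people r ON m.recipientid = r.personid
    ORDER BY m.messageid
""").fetchall()

print(single_alias)
print(two_aliases)
```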

MySQL Subquery Question

I have a query to pull a total number for a given publisher ID. I'd like to use it as a subquery so I can iterate over all publisher IDs.
My working query for a given ID is:
SELECT SUM( d.our_cost )
FROM articles a
CROSS JOIN domains d ON a.domain_id = d.id
AND d.publisher_id = '1094'
I'd like to pull this figure for all IDs in the publishers table (p), where d.publisher_id = p.id.
So far I've tried the following to no avail:
SELECT p.id, p.contact_name, p.contact_email,
(SELECT SUM(d.our_cost)
FROM articles a
CROSS JOIN domains d ON a.domain_id = d.id and d.publisher_id = p.id) total
FROM publishers p
The specific error I'm getting is: Unknown column 'p.id' in 'on clause'
I think you should modify your query and put the subquery in the from clause, something like this:
SELECT p.id, p.contact_name, p.contact_email, total.total_cost
FROM
(
SELECT SUM(d.our_cost) as total_cost, d.publisher_id
FROM articles a CROSS JOIN domains d ON a.domain_id = d.id
GROUP BY d.publisher_id ) total
JOIN publishers p on total.publisher_id = p.id
I'm assuming you've gotten an error about your syntax, try:
SELECT p.id, p.contact_name, p.contact_email, SUM(d.our_cost) as total
FROM articles a
CROSS JOIN domains d ON a.domain_id = d.id
JOIN publishers p ON d.publisher_id = p.id
GROUP BY p.id
Seems like a GROUP BY would be handy here instead.
Also, it seems like you don't need the articles table at all (unless you have additional business rules):
SELECT p.id, p.contact_name, p.contact_email, IFNULL(SUM(d.our_cost),0) AS total
FROM publishers p
LEFT JOIN domains d ON d.publisher_id = p.id
GROUP BY p.id
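Whether the articles table can be dropped depends on what the total is supposed to mean. The sketch below, using Python's sqlite3 module in place of MySQL with invented rows, compares the per-domain GROUP BY with a per-article scalar subquery. The subquery is a variant of the one in the question with the publisher filter moved into WHERE; that it behaves the same way on MySQL is an assumption, not something tested here.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE publishers (id INTEGER PRIMARY KEY, contact_name TEXT);
CREATE TABLE domains (id INTEGER PRIMARY KEY, publisher_id INTEGER, our_cost INTEGER);
CREATE TABLE articles (id INTEGER PRIMARY KEY, domain_id INTEGER);
INSERT INTO publishers VALUES (1, 'ann'), (2, 'bob'), (3, 'cid');
INSERT INTO domains VALUES (1, 1, 5), (2, 1, 3), (3, 2, 7);
INSERT INTO articles VALUES (1, 1), (2, 1), (3, 3);
""")

# Each domain's cost counted once per publisher (no articles table).
per_domain = conn.execute("""
    SELECT p.id, IFNULL(SUM(d.our_cost), 0) AS total
    FROM publishers p
    LEFT JOIN domains d ON d.publisher_id = p.id
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

# Each domain's cost counted once per article, as in the original query.
per_article = conn.execute("""
    SELECT p.id,
           (SELECT IFNULL(SUM(d.our_cost), 0)
            FROM articles a
            JOIN domains d ON a.domain_id = d.id
            WHERE d.publisher_id = p.id) AS total
    FROM publishers p
    ORDER BY p.id
""").fetchall()

print(per_domain)
print(per_article)
```

Publisher 1 gets different totals because domain 1 has two articles and domain 2 has none, so the two queries only agree when every domain has exactly one article.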

Mysql performance/optimization help

So this was a small site that got extremely popular very fast, and now I'm having major problems with the SQL query below.
I understand that my DB design is not great. I have text fields for subjects and programs which contain a serialized array, and I search them using LIKE.
The query below takes about a minute.
SELECT p.*, e.institution
FROM cv_personal p
LEFT JOIN cv_education e
ON p.id = e.user_id
LEFT JOIN cv_literacy l
ON p.id = l.user_id
WHERE 1 = 1
AND (e.qualification LIKE '%php%' OR e.subjects LIKE '%php%' OR l.programs LIKE '%php%')
GROUP BY p.id
ORDER BY p.created_on DESC
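What does EXPLAIN show?
I think you can add conditions to the joins to reduce the number of records that are used. (Note that with LEFT JOIN this no longer filters cv_personal: people without a match are still returned, with NULL in e.institution.)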
SELECT p.*, e.institution
FROM cv_personal p
LEFT JOIN cv_education e
ON (e.qualification LIKE '%php%' OR e.subjects LIKE '%php%') AND p.id = e.user_id
LEFT JOIN cv_literacy l
ON l.programs LIKE '%php%' AND p.id = l.user_id
ORDER BY p.created_on DESC
And why do you use GROUP BY?