Refactor a join on multiple tables - mysql

I'm having an issue with a query of mine and how it's being joined. I need to pull some data from multiple tables in regards to CSR agents and the number of dealers they are associated with.
As shown below, I need to return the number of daily contact records for each user as well as the number of dealers associated with that user. Eventually I need to use a formula built from these two values, which I can do with no problem; I'm just having an issue getting the two values correctly.
Currently, I'm getting the same number for both count values, where they should be different.
The code:
SELECT
c.user AS UserID,
COUNT(*) AS NumberOfDailyContacts, -- number of records in contact_events for this user
COUNT(d.csr) AS NumberOfDealerContacts -- number of dealers associated with this user
FROM contact_events c
JOIN users u
ON c.user = u.id
JOIN dealers d
ON c.dealer_num = d.dealer_num
LEFT JOIN attr_list al
ON d.csr = al.data
GROUP BY UserID;
The fiddle: http://sqlfiddle.com/#!9/bd375/1
Desired output:
12345 | 2 | 3
23456 | 2 | 6
34567 | 2 | 2
45678 | 2 | 2
56789 | 2 | 5
67890 | 2 | 2
78911 | 2 | 4
But currently the fiddle is giving me all 2's for both columns.
The table structure for these tables sucks but it's what I'm given currently. The problem is that the contact_events table uses the user ID for the CSR, whereas the dealers table associates by the 'data' value on the attr_list table. So I basically have to say:
If the user ID in the contact_events table matches the user_id for a given data field in attr_list, show dealers associated with that user.
Hopefully the fiddle makes this a little more clear but I'll answer any questions you may have.

Use a subquery that joins attr_list with dealers to get the number of dealers per user.
select
c.user as UserID,
count(*) as NumberOfDailyContacts,
al.NumberOfDealerContacts
From contact_events c
join users u
on c.user = u.id
join dealers d
on c.dealer_num = d.dealer_num
left join (
SELECT user_id, COUNT(*) AS NumberOfDealerContacts
FROM attr_list AS al
JOIN dealers AS d ON d.csr = al.data
GROUP BY user_id) AS al
ON al.user_id = c.user
GROUP BY UserID
fiddle

Your joins were out of order, which caused your counts to get messed up. Here's what it should be, no subqueries needed:
SELECT
u.id AS UserID
,COUNT(DISTINCT c.id) AS NumberOfDailyContacts
,COUNT(DISTINCT d.dealer_num) AS NumberOfDealerContacts
FROM users u
LEFT JOIN attr_list al ON u.id = al.user_id
LEFT JOIN dealers d ON d.csr = al.data
LEFT JOIN contact_events c ON c.user = u.id
GROUP BY u.id;
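To see why the original query returned the same number in both columns, here is a minimal sketch that runs the two query shapes through Python's built-in sqlite3 module instead of MySQL. The rows are made up (one user with 2 contact events who is CSR for 3 dealers), and the `id` column on contact_events is an assumption taken from the answer above, not from the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY);
CREATE TABLE contact_events (id INTEGER PRIMARY KEY, user INTEGER, dealer_num INTEGER);
CREATE TABLE dealers (dealer_num INTEGER PRIMARY KEY, csr TEXT);
CREATE TABLE attr_list (user_id INTEGER, data TEXT);

-- Hypothetical data: user 12345 logged 2 contacts and is CSR for 3 dealers.
INSERT INTO users VALUES (12345);
INSERT INTO attr_list VALUES (12345, 'csr_a');
INSERT INTO dealers VALUES (1, 'csr_a'), (2, 'csr_a'), (3, 'csr_a');
INSERT INTO contact_events (user, dealer_num) VALUES (12345, 1), (12345, 2);
""")

# Original shape: the join produces one row per contact event, and
# COUNT(*) and COUNT(d.csr) both count those same rows.
same_counts = conn.execute("""
    SELECT c.user, COUNT(*), COUNT(d.csr)
    FROM contact_events c
    JOIN users u ON c.user = u.id
    JOIN dealers d ON c.dealer_num = d.dealer_num
    GROUP BY c.user
""").fetchall()

# Reordered joins: start from the user, reach dealers through attr_list,
# and count DISTINCT keys so the multiplied rows do not inflate either total.
fixed_counts = conn.execute("""
    SELECT u.id,
           COUNT(DISTINCT c.id),
           COUNT(DISTINCT d.dealer_num)
    FROM users u
    LEFT JOIN attr_list al ON u.id = al.user_id
    LEFT JOIN dealers d ON d.csr = al.data
    LEFT JOIN contact_events c ON c.user = u.id
    GROUP BY u.id
""").fetchall()

print(same_counts)   # [(12345, 2, 2)]
print(fixed_counts)  # [(12345, 2, 3)]
```

COUNT(column) only differs from COUNT(*) when the column is NULL, which never happens here for d.csr because the inner join already requires a matching dealer row.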

Related

Count comments and get average rating from mysql

I just can't figure out how to get average rating and count comments from my mysql database.
I have 3 tables (activity, rating, comments). activity contains the main data (the "activities"), rating holds the ratings, and comments, of course, holds the comments.
activity_table
id | title |short_desc | long_desc | address | lat | long |last_updated
rating_table
id | activityid | userid | rating
comment_table
id | activityid | userid | rating
I'm now trying to get the data from activity plus the comment count and average rating in one query.
SELECT activity.*, AVG(rating.rating) as average_rating, count(comments.activityid) as total_comments
FROM activity LEFT JOIN
rating
ON activity.aid = rating.activityid LEFT JOIN
comments
ON activity.aid = comments.activityid
GROUP BY activity.aid
...doesn't do the job. It gives me the right average_rating, but the wrong amount of comments.
Any ideas?
Thanks a lot!
You are aggregating along two different dimensions. The Cartesian product generated by the joins affects the aggregation.
So, you should aggregate before the joins:
SELECT a.*, r.average_rating, COALESCE(c.total_comments, 0) as total_comments
FROM activity a LEFT JOIN
(SELECT r.activityid, AVG(r.rating) as average_rating
FROM rating r
GROUP BY r.activityid
) r
ON a.aid = r.activityid LEFT JOIN
(SELECT c.activityid, COUNT(*) as total_comments
FROM comments c
GROUP BY c.activityid
) c
ON a.aid = c.activityid;
Notice that the outer GROUP BY is no longer needed.
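As a rough illustration of the difference, the sketch below runs both versions against a throwaway SQLite database through Python's sqlite3 module. The data is hypothetical (one activity with 2 ratings and 3 comments) and the tables are trimmed to the columns the queries touch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activity (aid INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE rating   (id INTEGER PRIMARY KEY, activityid INTEGER, rating INTEGER);
CREATE TABLE comments (id INTEGER PRIMARY KEY, activityid INTEGER);

-- Hypothetical data: one activity with 2 ratings and 3 comments.
INSERT INTO activity VALUES (1, 'hike');
INSERT INTO rating (activityid, rating) VALUES (1, 2), (1, 4);
INSERT INTO comments (activityid) VALUES (1), (1), (1);
""")

# Joining both child tables first yields 2 x 3 = 6 rows for the activity,
# so the comment count comes out as 6 instead of 3.
joined = conn.execute("""
    SELECT a.aid, AVG(r.rating), COUNT(c.activityid)
    FROM activity a
    LEFT JOIN rating r ON a.aid = r.activityid
    LEFT JOIN comments c ON a.aid = c.activityid
    GROUP BY a.aid
""").fetchone()

# Aggregating each child table in its own subquery keeps one row per activity.
pre_aggregated = conn.execute("""
    SELECT a.aid, r.average_rating, COALESCE(c.total_comments, 0)
    FROM activity a
    LEFT JOIN (SELECT activityid, AVG(rating) AS average_rating
               FROM rating GROUP BY activityid) r ON a.aid = r.activityid
    LEFT JOIN (SELECT activityid, COUNT(*) AS total_comments
               FROM comments GROUP BY activityid) c ON a.aid = c.activityid
""").fetchone()

print(joined)          # (1, 3.0, 6)
print(pre_aggregated)  # (1, 3.0, 3)
```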

Sql conditional count with join

I cannot find the answer to my problem here on stackoverflow. I have a query that spans 3 tables:
newsitem
+------+----------+----------+----------+--------+----------+
| Guid | Supplier | LastEdit | ShowDate | Title | Contents |
+------+----------+----------+----------+--------+----------+
newsrating
+----+----------+--------+--------+
| Id | NewsGuid | UserId | Rating |
+----+----------+--------+--------+
usernews
+----+----------+--------+----------+
| Id | NewsGuid | UserId | ReadDate |
+----+----------+--------+----------+
Newsitem obviously contains newsitems, newsrating contains ratings that users give to newsitems, and usernews contains the date when a user has read a newsitem.
In my query I want to get every newsitem, including the number of ratings for that newsitem and the average rating, and how many times that newsitem has been read by the current user.
What I have so far is:
select newsitem.guid, supplier, count(newsrating.id) as numberofratings,
avg(newsrating.rating) as rating,
count(case usernews.UserId when 3 then 1 else null end) as numberofreads from newsitem
left join newsrating on newsitem.guid = newsrating.newsguid
left join usernews on newsitem.guid = usernews.newsguid
group by newsitem.guid
I have created an sql fiddle here: http://sqlfiddle.com/#!9/c8add/8
Both count() calls don't return the numbers I want. numberofratings should return the total number of ratings for that newsitem (by all users). numberofreads should return the number of reads for the current user for that newsitem.
So, newsitem with guid d104c330-c319-40e8-8be3-a7c4f549d35c should have 2 ratings and 3 reads for the current user with userid = 3.
I have tried conditional counts and sums, but no success yet. How can this be accomplished?
The main problem that I see is that you're joining in both tables together, which means that you're going to effectively be multiplying out by both numbers, which is why your counts aren't going to be correct. For example, if the Newsitem has been read 3 times by the user and rated by 8 users then you're going to end up getting 24 rows, so it will look like it has been rated 24 times. You can add a DISTINCT to your COUNT of the ratings IDs and that should correct that issue. Average should be unaffected because the average of 1 and 2 is the same as the average of 1, 1, 2, & 2 (for example).
You can then handle the reads by adding the userid to the JOIN condition (since it's an OUTER JOIN it shouldn't cause any loss of results) instead of in a CASE statement for your COUNT, then you can do a COUNT on distinct id values from Usernews. The resulting query would be:
SELECT
I.guid,
I.supplier,
COUNT(DISTINCT R.id) AS number_of_ratings,
AVG(R.rating) AS avg_rating,
COUNT(DISTINCT UN.id) AS number_of_reads
FROM
NewsItem I
LEFT OUTER JOIN NewsRating R ON R.newsguid = I.guid
LEFT OUTER JOIN UserNews UN ON
UN.newsguid = I.guid AND
UN.userid = @userid
GROUP BY
I.guid,
I.supplier
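The multiplication effect described above can be checked on a tiny dataset. This is only a sketch using Python's sqlite3 module rather than MySQL, with invented rows: item 'a' has 2 ratings and 3 reads by user 3 (plus one read by another user), and item 'b' has neither. The user ID is passed as a bound parameter in place of the variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE newsitem   (guid TEXT PRIMARY KEY);
CREATE TABLE newsrating (id INTEGER PRIMARY KEY, newsguid TEXT, userid INTEGER, rating INTEGER);
CREATE TABLE usernews   (id INTEGER PRIMARY KEY, newsguid TEXT, userid INTEGER);

-- Hypothetical data.
INSERT INTO newsitem VALUES ('a'), ('b');
INSERT INTO newsrating (newsguid, userid, rating) VALUES ('a', 1, 1), ('a', 2, 2);
INSERT INTO usernews (newsguid, userid) VALUES ('a', 3), ('a', 3), ('a', 3), ('a', 4);
""")

# Columns: guid, plain count of ratings, distinct count of ratings,
# average rating, distinct count of reads by the given user.
rows = conn.execute("""
    SELECT i.guid,
           COUNT(r.id),
           COUNT(DISTINCT r.id),
           AVG(r.rating),
           COUNT(DISTINCT un.id)
    FROM newsitem i
    LEFT JOIN newsrating r ON r.newsguid = i.guid
    LEFT JOIN usernews un ON un.newsguid = i.guid AND un.userid = ?
    GROUP BY i.guid
    ORDER BY i.guid
""", (3,)).fetchall()

for row in rows:
    print(row)
# ('a', 6, 2, 1.5, 3)
# ('b', 0, 0, None, 0)
```

For item 'a' the plain COUNT sees 2 ratings x 3 reads = 6 rows, while the DISTINCT counts recover 2 and 3. The average stays at 1.5 because every rating is repeated the same number of times. Item 'b' still appears because the user filter sits in the ON clause rather than in WHERE.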
While that should work, you might get better results from a subquery, as the above needs to explode out the results and then aggregate them, perhaps unnecessarily. Also, some people might find the below to be a little clearer.
SELECT
I.guid,
I.supplier,
R.number_of_ratings,
R.avg_rating,
COUNT(UN.id) AS number_of_reads -- counts only matched reads, so items with no reads get 0 rather than 1
FROM
NewsItem I
LEFT OUTER JOIN
(
SELECT
newsguid,
COUNT(*) AS number_of_ratings,
AVG(rating) AS avg_rating
FROM
NewsRating
GROUP BY
newsguid
) R ON R.newsguid = I.guid
LEFT OUTER JOIN UserNews UN ON UN.newsguid = I.guid AND UN.userid = @userid
GROUP BY
I.guid,
I.supplier,
R.number_of_ratings,
R.avg_rating
I'm with Tom: you should use a subquery to calculate the user's read count.
SQL Fiddle Demo
SELECT NI.guid,
NI.supplier,
COUNT(NR.ID) as numberofratings,
AVG(NR.rating) as rating,
user_read as numberofreads
FROM newsitem NI
LEFT JOIN newsrating NR
ON NI.guid = NR.newsguid
LEFT JOIN (SELECT NewsGuid, COUNT(*) user_read
FROM usernews
WHERE UserId = 3 -- use a variable @user_id here
GROUP BY NewsGuid) UR
ON NI.guid = UR.NewsGuid
GROUP BY NI.guid,
NI.supplier,
numberofreads;

Select with Multiple Counts on Left Join on Same Table

I'm not certain that this can be done. I have a table of users with a related table of user activity joined on a foreign key. Activity has different types, e.g. comment, like, etc. I need to get users filtered by the number of each different type of activity.
What I have so far is this:
SELECT
users.*,
COUNT(t1.id) AS comments,
COUNT(t2.id) AS likes
FROM users
LEFT JOIN activity AS t1 ON users.id = t1.user_id
LEFT JOIN activity AS t2 ON users.id = t2.user_id
WHERE t1.activity_type_id = 1 AND t2.activity_type_id = 2
GROUP BY users.id
HAVING comments >= 5 AND likes >= 5
This seems to be close, but it's returning a user with a count of 5 for both likes and comments, when in reality the user has 5 likes and 1 comment.
To be clear I want this query to return users who have 5 or more likes and also users who have 5 or more comments.
UPDATE:
I've created an SQL Fiddle. In this case I have 3 users:
User 1: 6 comments, 8 likes
User 2: 3 comments, 2 likes
User 3: 5 comments, 2 likes
I want the query to return only user 1 and 3, with their respective totals of likes and comments.
http://sqlfiddle.com/#!2/dcc63/4
You can use conditional summing to do the count. Because MySQL evaluates a boolean expression to 1 or 0, an expression like sum(case when et.name = 'comment' then 1 else 0 end) (the "normal" SQL syntax) can be reduced to sum(et.name = 'comment').
SELECT
u.id,
sum(et.name = 'comment') AS comments,
sum(et.name = 'like') AS likes
FROM users AS u
LEFT JOIN engagements AS e ON u.id = e.user_id
JOIN engagement_types AS et ON e.engagement_type_id = et.id
GROUP BY u.id
HAVING sum(et.name = 'comment') >= 5
OR sum(et.name = 'like') >= 5
Result:
| ID | COMMENTS | LIKES |
|----|----------|-------|
| 1 | 6 | 8 |
| 3 | 5 | 2 |
Sample SQL Fiddle
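For anyone who wants to reproduce that result locally, here is a hedged sketch using Python's sqlite3 module, which also evaluates comparisons to 1 or 0. The table names follow the answer's query (engagements, engagement_types); the per-user totals are the ones from the question's update, and the type IDs 1 and 2 are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY);
CREATE TABLE engagement_types (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE engagements (id INTEGER PRIMARY KEY, user_id INTEGER, engagement_type_id INTEGER);
INSERT INTO users VALUES (1), (2), (3);
INSERT INTO engagement_types VALUES (1, 'comment'), (2, 'like');
""")

# (user_id, comments, likes) as listed in the question's update.
totals = [(1, 6, 8), (2, 3, 2), (3, 5, 2)]
for user_id, n_comments, n_likes in totals:
    events = [(user_id, 1)] * n_comments + [(user_id, 2)] * n_likes
    conn.executemany(
        "INSERT INTO engagements (user_id, engagement_type_id) VALUES (?, ?)",
        events,
    )

# The long CASE form and the short boolean form are used side by side
# to show that they count the same thing.
result = conn.execute("""
    SELECT u.id,
           SUM(CASE WHEN et.name = 'comment' THEN 1 ELSE 0 END) AS comments,
           SUM(et.name = 'like') AS likes
    FROM users u
    JOIN engagements e ON u.id = e.user_id
    JOIN engagement_types et ON e.engagement_type_id = et.id
    GROUP BY u.id
    HAVING SUM(et.name = 'comment') >= 5
        OR SUM(et.name = 'like') >= 5
    ORDER BY u.id
""").fetchall()

print(result)  # [(1, 6, 8), (3, 5, 2)]
```

A single pass over the activity rows avoids joining the table to itself, which is what multiplied the counts in the original query.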

Getting a list of data based on Items associated with User

I have a table called reviews. I get the most current user reviews like this:
SELECT b.item, b.item_id, a.review_id, a.review, c.category, u.username, c.cat_id
FROM reviews a
INNER JOIN items b
ON a.item_id = b.item_id
INNER JOIN master_cat c
ON c.cat_id = b.cat_id
INNER JOIN users AS u
ON u.user_id = a.user_id
ORDER BY a.review_id DESC;
What I want to do is slightly alter it to be more personalized for users.
I have another table of user "connections". Kind of like Twitter. When a user follows someone, it gets logged in this table called profile_follow. This has three columns. id, user_id, follow_id. Simply: If I am user #1, and I "follow" user # 3 and user #5, two rows will be added in this table:
profile_follow
------------------------
id | user_id | follow_id
| 1 | 3
| 1 | 5
Here is how I want to change the query above. I want to only show newest reviews, from people you follow.
So I will need at least one more join, for the profile_follow table. And I need to pass in a user_id (it's a PHP function), doing something like `WHERE profile_follow.user_id = '{$user_id}'`. I think I will have to add a subquery for this, but I'm not sure.
Can someone show me how to finish this query? I am not sure how to handle it from here? All of my attempts have been off so far.
I think I need to do something like:
Select follow_id where user_id = (logged in user)
And then in the main query:
Select reviews only with profile_follow.follow_id = review.user_id.
I can't figure out how to make this filter work.
Always difficult without testing, but:
SELECT b.item, b.item_id, a.review_id, a.review, c.category, u.username, c.cat_id
FROM reviews a
INNER JOIN items b
ON a.item_id = b.item_id
INNER JOIN master_cat c
ON c.cat_id = b.cat_id
INNER JOIN users AS u
ON u.user_id = a.user_id
INNER JOIN profile_follow pf
ON pf.follow_id = a.user_id
WHERE pf.user_id = '{$user_id}'
ORDER BY a.review_id DESC;
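Since the answer is untested, here is a small sketch of the follow filter on its own, run through Python's sqlite3 module with made-up reviews. The joins to items, master_cat and users are left out to keep it short, the helper name `followed_reviews` is hypothetical, and the user ID is bound as a parameter instead of being interpolated into the string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reviews (review_id INTEGER PRIMARY KEY, user_id INTEGER, review TEXT);
CREATE TABLE profile_follow (id INTEGER PRIMARY KEY, user_id INTEGER, follow_id INTEGER);

-- User 1 follows users 3 and 5, as in the question.
INSERT INTO profile_follow (user_id, follow_id) VALUES (1, 3), (1, 5);
-- Hypothetical reviews by users 3, 4, 5 and 1.
INSERT INTO reviews VALUES (10, 3, 'by 3'), (11, 4, 'by 4'), (12, 5, 'by 5'), (13, 1, 'by 1');
""")


def followed_reviews(conn, user_id):
    """Return (review_id, review) written by people user_id follows, newest first."""
    return conn.execute("""
        SELECT a.review_id, a.review
        FROM reviews a
        INNER JOIN profile_follow pf ON pf.follow_id = a.user_id
        WHERE pf.user_id = ?
        ORDER BY a.review_id DESC
    """, (user_id,)).fetchall()


feed = followed_reviews(conn, 1)
print(feed)                       # [(12, 'by 5'), (10, 'by 3')]
print(followed_reviews(conn, 2))  # [] because user 2 follows nobody
```

The inner join does the filtering: a review survives only if its author appears as follow_id in a row whose user_id is the logged-in user.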

MySQL - Retrieve last update per day per user

I have a hypothetical table "users" with the columns
user_id (auto incremented)
name
foo
bar
last_updated
This table is updated multiple times per day. How can I query to get the last update, per user, per day, going back X days?
Example Data
1 John a b "2013-01-31 02:01:12"
2 Rich c d "2013-01-31 22:41:12"
3 John e f "2013-01-31 22:01:15"
4 Rich g h "2013-02-01 16:01:12"
5 John i j "2013-02-01 22:21:12"
6 Rich k m "2013-02-01 22:21:12"
Desired Return Set:
2 Rich c d 2013-01-31
3 John e f 2013-01-31
5 John i j 2013-02-01
6 Rich k m 2013-02-01
I am able to get the last update per user overall with the following query; it's applying it to each day that I am struggling with.
SELECT u1.*
FROM users u1
LEFT JOIN users u2
ON (u1.name = u2.name AND u1.user_id < u2.user_id)
WHERE u2.user_id IS NULL
First of all the table name users is very confusing, since these aren't the users but the logins.
Beyond that, you're on the right track. You just need to add a comparison on the date in the join.
SELECT u1.*
FROM users u1
LEFT JOIN users u2
ON (u1.name = u2.name AND date_format(u1.last_updated, '%Y-%m-%d') = date_format(u2.last_updated, '%Y-%m-%d') AND u1.user_id < u2.user_id)
WHERE u2.user_id IS NULL
See it work in this SQL fiddle.
Do this by summarizing the data at the date level and then joining back:
select u.*
from users u join
(select u.name, DATE(last_updated) as thedate, MAX(last_updated) as maxlastupdated
from users u
group by u.name, DATE(last_updated)
) usum
on u.name = usum.name and
u.last_updated = usum.maxlastupdated
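The summarize-and-join-back approach can be tried against the question's example rows. The sketch below uses Python's sqlite3 module, so it relies on SQLite's DATE() and on timestamps stored as 'YYYY-MM-DD HH:MM:SS' text, which sorts chronologically; both are assumptions that differ slightly from a MySQL DATETIME column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY, name TEXT, foo TEXT, bar TEXT, last_updated TEXT
    )
""")
# Example data copied from the question.
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?, ?)", [
    (1, "John", "a", "b", "2013-01-31 02:01:12"),
    (2, "Rich", "c", "d", "2013-01-31 22:41:12"),
    (3, "John", "e", "f", "2013-01-31 22:01:15"),
    (4, "Rich", "g", "h", "2013-02-01 16:01:12"),
    (5, "John", "i", "j", "2013-02-01 22:21:12"),
    (6, "Rich", "k", "m", "2013-02-01 22:21:12"),
])

# Find the latest timestamp per (name, day), then join back to fetch the full row.
latest = conn.execute("""
    SELECT u.user_id, u.name, u.foo, u.bar, DATE(u.last_updated)
    FROM users u
    JOIN (SELECT name,
                 DATE(last_updated) AS thedate,
                 MAX(last_updated)  AS maxlastupdated
          FROM users
          GROUP BY name, DATE(last_updated)) usum
      ON u.name = usum.name
     AND u.last_updated = usum.maxlastupdated
    ORDER BY u.user_id
""").fetchall()

for row in latest:
    print(row)
# (2, 'Rich', 'c', 'd', '2013-01-31')
# (3, 'John', 'e', 'f', '2013-01-31')
# (5, 'John', 'i', 'j', '2013-02-01')
# (6, 'Rich', 'k', 'm', '2013-02-01')
```

If two rows for the same name share the exact same latest timestamp on a day, this join returns both of them, whereas the self-join on user_id keeps only the one with the highest ID.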