MySQL: counting matching results across multiple tables

I have the following tables
Business
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| b_id | bigint(20) | NO | PRI | NULL | auto_increment |
| b_name | varchar(255) | NO | | NULL | |
+-------------+--------------+------+-----+---------+----------------+
Locations
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| l_id | bigint(20) | NO | PRI | NULL | auto_increment |
| l_name | varchar(255) | NO | | NULL | |
| b_id | bigint(20) | NO | | NULL | |
+-------------+--------------+------+-----+---------+----------------+
Jobs
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| j_id | bigint(20) | NO | PRI | NULL | auto_increment |
| j_name | varchar(255) | NO | | NULL | |
| b_id | bigint(20) | NO | | NULL | |
| l_id | bigint(20) | NO | | NULL | |
+-------------+--------------+------+-----+---------+----------------+
People
+-------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------+------+-----+---------+----------------+
| u_id | bigint(20) | NO | PRI | NULL | auto_increment |
| salutation | varchar(10) | NO | | NULL | |
| first_name | varchar(25) | NO | | NULL | |
| last_name | varchar(25) | NO | | NULL | |
+-------------+---------------+------+-----+---------+----------------+
People's Jobs
+-------------+------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------+------+-----+---------+----------------+
| pj_id | bigint(20) | NO | PRI | NULL | auto_increment |
| u_id | bigint(20) | NO | | NULL | |
| j_id | bigint(20) | NO | | NULL | |
| l_id | bigint(20) | NO | MUL | NULL | |
+-------------+------------+------+-----+---------+----------------+
I need to produce a table that shows
+----------+-------------------------+------------+------------+------------+
| b_id | b_name | Locations | Jobs | People |
+----------+-------------------------+------------+------------+------------+
| 21 | Widgets Inc | 0 | x | 0 |
| 24 | Prince Privates | 0 | 0 | 0 |
| 23 | Halon plc | x | 0 | 0 |
| 18 | Stinky Hotels | x | x | x |
| 20 | Pylon Catering Corps | x | x | x |
| 22 | Skytrain Biscuits | 0 | 0 | 0 |
+----------+-------------------------+------------+------------+------------+
I can achieve a correct count of matching locations for each business with:
SELECT b.b_id,
b.b_name,
count(l.l_id) AS locations
FROM business AS b
LEFT JOIN locations AS l ON b.b_id=l.b_id
GROUP BY b.b_id
ORDER BY b_name
If I extend it to also count the jobs at each business, and then the people at each business, it all goes pear-shaped.
I know that the following is inherently wrong for counting people (a person can hold more than one job). I don't know whether I need subselects or COALESCE.
SELECT b.b_id,
b.b_name,
count(l.l_id) AS locations,
count(j.j_id) AS jobs,
count(p.u_id) AS people
FROM business AS b
LEFT JOIN locations AS l ON b.b_id=l.b_id
LEFT JOIN job AS j ON b.b_id=j.b_id
LEFT JOIN people_jobs AS p ON l.l_id=p.l_id
GROUP BY b.b_id
ORDER BY b_name

I think you can do a quick-and-dirty fix of your query by using count(distinct):
SELECT b.b_id, b.b_name,
count(distinct l.l_id) AS locations,
count(distinct j.j_id) AS jobs,
count(distinct p.u_id) AS people
FROM business b LEFT JOIN
locations l
ON b.b_id = l.b_id LEFT JOIN
job j
ON b.b_id = j.b_id LEFT JOIN
people_jobs p
ON l.l_id = p.l_id
GROUP BY b.b_id
ORDER BY b_name ;
It is also possible that the problem is simply that the join to people_jobs needs more conditions:
people_jobs p
ON l.l_id = p.l_id and j.j_id = p.j_id
And maybe a condition on u_id.
Your problem is that you are trying to do aggregation across multiple dimensions and getting a cartesian product for each business. An alternative that is sometimes necessary is to do the counts in subqueries.
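To see the fan-out concretely, here is a minimal sketch. It assumes SQLite (via Python's sqlite3) as a stand-in for MySQL, a cut-down version of the schema, and made-up rows; the query itself is plain SQL. One business with 2 locations and 3 jobs produces 2 x 3 = 6 joined rows, so a plain COUNT reports 6 for both columns while COUNT(DISTINCT ...) recovers 2 and 3.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE business  (b_id INTEGER PRIMARY KEY, b_name TEXT);
CREATE TABLE locations (l_id INTEGER PRIMARY KEY, l_name TEXT, b_id INTEGER);
CREATE TABLE jobs      (j_id INTEGER PRIMARY KEY, j_name TEXT, b_id INTEGER, l_id INTEGER);
INSERT INTO business  VALUES (18, 'Stinky Hotels'), (22, 'Skytrain Biscuits');
INSERT INTO locations VALUES (1, 'North', 18), (2, 'South', 18);
INSERT INTO jobs      VALUES (1, 'Porter', 18, 1), (2, 'Chef', 18, 1), (3, 'Clerk', 18, 2);
""")

# Joining two independent child tables to the same parent multiplies their rows.
query = """
SELECT b.b_id,
       COUNT(l.l_id)          AS naive_locations,
       COUNT(DISTINCT l.l_id) AS locations,
       COUNT(j.j_id)          AS naive_jobs,
       COUNT(DISTINCT j.j_id) AS jobs
FROM business b
LEFT JOIN locations l ON l.b_id = b.b_id
LEFT JOIN jobs j      ON j.b_id = b.b_id
GROUP BY b.b_id
ORDER BY b.b_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # (b_id, naive_locations, locations, naive_jobs, jobs)
```

Business 22 has no children at all, so every count for it is 0 either way; the inflation only shows up once both child tables have rows.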

This query should do what you need:
SELECT
b.b_id,
b.b_name,
(SELECT COALESCE(COUNT(l_id ),0) FROM locations WHERE b_id=b.b_id) AS locations,
(SELECT COALESCE(COUNT(j_id ),0) FROM jobs WHERE b_id=b.b_id) AS jobs,
(SELECT COALESCE(COUNT(DISTINCT u_id),0)
FROM jobs j
JOIN people_jobs pj ON pj.j_id=j.j_id
WHERE j.b_id=b.b_id
) AS people
FROM business as b
ORDER BY b_name
You don't need the GROUP BY if you use subselects, as the outer query returns exactly one row per b_id.
If instead you JOIN the four tables at the main query level, as you were doing, you have two difficulties:
- the number of rows increases (avoidable with GROUP BY)
- a simple COUNT does not work properly (avoidable with COUNT(DISTINCT ...), as shown in Gordon's answer)
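A small runnable sketch of the subselect approach, assuming SQLite (through Python's sqlite3) in place of MySQL, a simplified schema, and invented rows. COUNT already returns 0 for an empty set, so the sketch leaves out COALESCE. Person 7 holds two jobs at business 18 and is still counted once thanks to COUNT(DISTINCT u_id).

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE business    (b_id INTEGER PRIMARY KEY, b_name TEXT);
CREATE TABLE locations   (l_id INTEGER PRIMARY KEY, b_id INTEGER);
CREATE TABLE jobs        (j_id INTEGER PRIMARY KEY, b_id INTEGER);
CREATE TABLE people_jobs (pj_id INTEGER PRIMARY KEY, u_id INTEGER, j_id INTEGER);
INSERT INTO business    VALUES (18, 'Stinky Hotels'), (21, 'Widgets Inc'), (22, 'Skytrain Biscuits');
INSERT INTO locations   VALUES (1, 18), (2, 18);
INSERT INTO jobs        VALUES (1, 18), (2, 18), (3, 21);
-- person 7 holds two jobs at business 18, person 8 holds one
INSERT INTO people_jobs VALUES (1, 7, 1), (2, 7, 2), (3, 8, 1);
""")

# One correlated subquery per count: each runs against a single child table,
# so the counts cannot multiply each other.
query = """
SELECT b.b_id,
       b.b_name,
       (SELECT COUNT(*) FROM locations l WHERE l.b_id = b.b_id) AS locations,
       (SELECT COUNT(*) FROM jobs j WHERE j.b_id = b.b_id)      AS jobs,
       (SELECT COUNT(DISTINCT pj.u_id)
          FROM jobs j
          JOIN people_jobs pj ON pj.j_id = j.j_id
         WHERE j.b_id = b.b_id)                                 AS people
FROM business b
ORDER BY b.b_name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```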

You can try this query:
SELECT b.b_id,b.b_name,count(l.l_id) AS locations,count(j.j_id) AS jobs,count(p.u_id) AS people
FROM business as b LEFT JOIN locations as l ON b.b_id=l.b_id
LEFT JOIN job as j ON b.b_id=j.b_id
LEFT JOIN people_jobs as p ON l.l_id=p.l_id
GROUP BY b.b_id, b.b_name
ORDER BY b_name
I hope this will work for you.

Related

Debugging a rather difficult/complex MySQL query

I'm having troubles in making a rather difficult MySQL query work. I've been trying, but creating complex queries has never been my strong side.
This query includes 4 tables, which I'll describe of course.
First, we have song table, which I need to select the needed info from.
+--------------+-----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-----------+------+-----+---------+----------------+
| ID | int(6) | NO | PRI | - | auto_increment |
| Anime | char(100) | NO | | - | |
| Title | char(100) | NO | | - | |
| Type | char(20) | NO | | - | |
| Singer | char(50) | NO | | - | |
| Youtube | char(30) | NO | | - | |
| Score | double | NO | | 0 | |
| Ratings | int(8) | NO | | 0 | |
| Favourites | int(7) | NO | | 0 | |
| comments | int(11) | NO | | 0 | |
| release_year | int(4) | NO | | 2019 | |
| season | char(10) | NO | | Spring | |
+--------------+-----------+------+-----+---------+----------------+
Then we have song_ratings, which basically represents the lists of each user, since once you rate a song, it appears on your list.
+------------+----------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+----------+------+-----+-------------------+----------------+
| ID | int(11) | NO | PRI | 0 | auto_increment |
| UserID | int(11) | NO | MUL | 0 | |
| SongID | int(11) | NO | MUL | 0 | |
| Rating | double | NO | | 0 | |
| RatedAt | datetime | NO | | CURRENT_TIMESTAMP | |
| Favourited | int(1) | NO | | 0 | |
+------------+----------+------+-----+-------------------+----------------+
Users have the option to create custom lists(playlists), and this is the table which they are stored in. This is table lists.
+------------+-----------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+-----------+------+-----+-------------------+----------------+
| ID | int(11) | NO | PRI | 0 | auto_increment |
| userID | int(11) | NO | MUL | 0 | |
| name | char(50) | NO | | - | |
| likes | int(11) | NO | | 0 | |
| favourites | int(11) | NO | | 0 | |
| created_at | datetime | NO | | CURRENT_TIMESTAMP | |
| cover | char(100) | NO | | - | |
| locked | int(1) | NO | | 0 | |
| private | int(1) | NO | | 0 | |
+------------+-----------+------+-----+-------------------+----------------+
And finally, the table which contains all the songs that have been added to any playlists, called list_elements.
+--------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------+---------+------+-----+---------+----------------+
| ID | int(11) | NO | PRI | 0 | auto_increment |
| listID | int(11) | NO | MUL | 0 | |
| songID | int(11) | NO | MUL | 0 | |
+--------+---------+------+-----+---------+----------------+
What my query needs to do is list all the songs that are on a user's list (the records in song_ratings where userID = ?, the ID of the user) but are not on a specific playlist (they have no record in list_elements where listID = ?, the ID of that playlist).
This is the query I've been using so far, but after a while I realized it doesn't actually work the way I wanted.
SELECT DISTINCT
COUNT(*)
FROM
song
INNER JOIN song_ratings ON song_ratings.songID = song.ID
LEFT JOIN list_elements ON song_ratings.songID = list_elements.songID
WHERE
song_ratings.userID = 34 AND list_elements.songID IS NULL
I have also tried something like this, and several variants of it
SELECT DISTINCT
COUNT(*)
FROM
song
INNER JOIN song_ratings ON song_ratings.songID = song.ID
INNER JOIN lists ON lists.userID = song_ratings.userID
LEFT JOIN list_elements ON song_ratings.songID = list_elements.songID
WHERE
song_ratings.userID = 34 AND lists.ID = 1
To make it easier, here's a SQL Fiddle, with all the necessary tables and records in them.
What you need to know: when you check the playlist with ID 1, the query needs to return 23 (basically all matches).
When you do the same with ID 4, it needs to return 21 if the query works correctly. Playlist 1 is empty, so all of the songs in song_ratings can be added to it (at least the ones that exist in the song table, which is only half of the overall records for now).
Playlist 4 already has 2 songs added to it, so only 21 are left available for adding.
In case those numbers are wrong: playlist 1 needs to return all matches, and playlist 4 needs to return all matches minus 2 (because 2 songs are already added).
The userID needs to remain the same (34); there are no records with a different ID, so don't change it.
You could try a subquery with a NOT IN clause:
SELECT DISTINCT
COUNT(*)
FROM
song
INNER JOIN song_ratings ON song_ratings.songID = song.ID
WHERE
song_ratings.userID = 34 AND song.ID NOT IN (SELECT songID FROM list_elements WHERE listID = 4)
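Your original query was almost correct. The IS NULL test in the WHERE clause is what keeps only the songs without a matching row, so it has to stay there. What is missing is the playlist filter. If you put a condition on a column of the LEFT JOINed table into the WHERE clause and that condition rejects NULLs (for example list_elements.listID = 4), the LEFT JOIN effectively turns into an INNER JOIN.
So put the playlist condition into the ON clause instead:
SELECT COUNT(*)
FROM song
INNER JOIN song_ratings ON song_ratings.songID = song.ID
LEFT JOIN list_elements ON song_ratings.songID = list_elements.songID
AND list_elements.listID = 4
WHERE song_ratings.userID = 34 AND list_elements.songID IS NULL
JOINs are often faster than subqueries in MySQL, especially in older versions, so this may well be faster too.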
Btw, you do not need DISTINCT when you only have COUNT(*). The COUNT(*) returns only one row so there is no need to take distinct values from one value.
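One way to express "rated by the user but not on a given playlist" is a LEFT JOIN whose ON clause carries the playlist filter, with the IS NULL test kept in WHERE. The sketch below assumes SQLite (through Python's sqlite3) instead of MySQL, cut-down tables, and a handful of invented rows; count_available is a hypothetical helper name.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE song          (ID INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE song_ratings  (ID INTEGER PRIMARY KEY, UserID INTEGER, SongID INTEGER);
CREATE TABLE list_elements (ID INTEGER PRIMARY KEY, listID INTEGER, songID INTEGER);
INSERT INTO song VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
INSERT INTO song_ratings (UserID, SongID) VALUES (34, 1), (34, 2), (34, 3), (34, 4);
-- playlist 4 holds songs 1 and 2, playlist 9 holds song 3, playlist 1 is empty
INSERT INTO list_elements (listID, songID) VALUES (4, 1), (4, 2), (9, 3);
""")

def count_available(list_id, user_id=34):
    """Count songs rated by user_id that are not yet on playlist list_id."""
    sql = """
    SELECT COUNT(*)
    FROM song s
    INNER JOIN song_ratings r ON r.SongID = s.ID
    LEFT JOIN list_elements e
           ON e.songID = r.SongID
          AND e.listID = ?          -- playlist filter lives in the ON clause
    WHERE r.UserID = ?
      AND e.songID IS NULL          -- keep only songs with no match on that playlist
    """
    return conn.execute(sql, (list_id, user_id)).fetchone()[0]

for list_id in (1, 4, 9):
    print(list_id, count_available(list_id))
```

With this data the empty playlist 1 leaves all 4 rated songs available, playlist 4 leaves 2, and playlist 9 leaves 3.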

MYSQL Joins where conditions may be null

I'm having an issue with a query using INNER JOIN.
I have two tables. I need the department name and all three approvers. If any of the approvers are NULL, I need that displayed also.
mysql> desc department;
+-------------------+----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+----------+------+-----+---------+----------------+
| id | int(8) | NO | PRI | NULL | auto_increment |
| departmentName | tinytext | YES | | NULL | |
| primaryApprover | int(8) | YES | | NULL | |
| secondaryApprover | int(8) | YES | | NULL | |
| tertiaryApprover | int(8) | YES | | NULL | |
+-------------------+----------+------+-----+---------+----------------+
mysql> desc approver;
+------------------+------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------------+------------+------+-----+---------+----------------+
| id | int(8) | NO | PRI | NULL | auto_increment |
| approverName | tinytext | YES | | NULL | |
| approverPosition | tinytext | YES | | NULL | |
| approverLogonId | tinytext | YES | | NULL | |
| approverEmail | tinytext | YES | | NULL | |
| isActive | tinyint(1) | YES | | NULL | |
+------------------+------------+------+-----+---------+----------------+
The following query works, but it does not give me data where the primary or secondary approver are NULL:
SELECT
a.departmentName as DEPARTMENT,
pa.approvername as PRIMARY_APPROVER,
sa.approvername as SECONDARY_APPROVER,
ta.approvername as TERTIARY_APPROVER
FROM
department as a
INNER JOIN
approver pa on a.primaryapprover=pa.id
INNER JOIN
approver sa on a.secondaryapprover = sa.id
INNER JOIN
approver ta on a.tertiaryapprover = ta.id
ORDER BY
a.departmentname;
Using this query, I get this result:
+--------------------------------+---------------------------+---------------------------+------------------------+
| DEPARTMENT | PRIMARY_APPROVER | SECONDARY_APPROVER | TERTIARY_APPROVER |
+--------------------------------+---------------------------+---------------------------+------------------------+
| Facilities | Washburn, Hoban | Cobb, Jayne | Reynolds, Malcomn |
| Personnel / HR | Frye, Kaylee | Serra, Inara | Book, Dariel |
+--------------------------------+---------------------------+---------------------------+------------------------+
2 rows in set (0.00 sec)
but should get this result:
+--------------------------------+---------------------------+---------------------------+------------------------+
| DEPARTMENT | PRIMARY_APPROVER | SECONDARY_APPROVER | TERTIARY_APPROVER |
+--------------------------------+---------------------------+---------------------------+------------------------+
| Business Office | NULL | Rample, Fanty | Niska, Adelei |
| Facilities | Washburn, Hoban | Cobb, Jayne | Reynolds, Malcomn |
| Personnel / HR | Frye, Kaylee | Serra, Inara | Book, Dariel |
| Technical Services | Tam, River | NULL | Tam, Simon |
+--------------------------------+---------------------------+---------------------------+------------------------+
4 rows in set (0.00 sec)
I'm not good at joins to begin with....what am I missing here?
Just use LEFT JOINs:
SELECT
a.departmentName as DEPARTMENT,
pa.approvername as PRIMARY_APPROVER,
sa.approvername as SECONDARY_APPROVER,
ta.approvername as TERTIARY_APPROVER
FROM
department as a
LEFT JOIN
approver pa on a.primaryapprover=pa.id
LEFT JOIN
approver sa on a.secondaryapprover = sa.id
LEFT JOIN
approver ta on a.tertiaryapprover = ta.id
ORDER BY
a.departmentname;
INNER JOIN keeps only the records that have a match on both sides.
LEFT JOIN keeps all the records from the left table plus the matching records from the right table; where there is no match, the right-hand columns are NULL.
You can also use COALESCE to replace NULL values with a default value such as '-1'.
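A quick way to compare the two join types side by side. This is only a sketch: it assumes SQLite (via Python's sqlite3) rather than MySQL, trims the tables down to two approver columns, and uses '(none)' as an arbitrary COALESCE default.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE approver   (id INTEGER PRIMARY KEY, approverName TEXT);
CREATE TABLE department (id INTEGER PRIMARY KEY, departmentName TEXT,
                         primaryApprover INTEGER, secondaryApprover INTEGER);
INSERT INTO approver   VALUES (1, 'Washburn, Hoban'), (2, 'Cobb, Jayne'), (3, 'Rample, Fanty');
INSERT INTO department VALUES (1, 'Facilities', 1, 2), (2, 'Business Office', NULL, 3);
""")

# Same query text for both runs; only the join keyword changes.
template = """
SELECT d.departmentName,
       COALESCE(pa.approverName, '(none)') AS primary_approver,
       COALESCE(sa.approverName, '(none)') AS secondary_approver
FROM department d
{join} approver pa ON d.primaryApprover = pa.id
{join} approver sa ON d.secondaryApprover = sa.id
ORDER BY d.departmentName
"""
inner_rows = conn.execute(template.format(join="INNER JOIN")).fetchall()
left_rows = conn.execute(template.format(join="LEFT JOIN")).fetchall()

print("INNER JOIN:", inner_rows)  # the department with a NULL approver is dropped
print("LEFT JOIN: ", left_rows)   # it is kept, with the default filled in
```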

Get records from third table

I have three tables, with duplicate column names as well :) I want to join albums to products and images to albums. There are many images per album. The query below gives me duplicate products. Is there a way to grab everything in one query?
SELECT
*, p.name as nazwa, a.name as nazwa_al, i.name as obrazek
FROM products p
JOIN
albums a on p.album_id=a.id
JOIN
(SELECT *, images.name AS nazwa_im FROM images ORDER BY images.order ASC) i
ON i.album_id=a.id
ORDER BY p.order ASC
Products
+-------------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | text | NO | | NULL | |
| description | text | NO | | NULL | |
| album_id | int(11) | YES | | NULL | |
| order | int(11) | NO | | NULL | |
+-------------+---------+------+-----+---------+----------------+
Albums
+-------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | text | NO | | NULL | |
+-------+---------+------+-----+---------+----------------+
Images
+----------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+---------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | text | NO | | NULL | |
| alt | text | NO | | NULL | |
| album_id | int(11) | NO | | NULL | |
| order | int(11) | NO | | NULL | |
+----------+---------+------+-----+---------+----------------+
For the sake of simplicity, I don't want to modify the structure of the db. The easiest solution for me would be: one product=>one album=>many images
Use joins, and use aliases to resolve the duplicate column names.
You can use DISTINCT or GROUP BY to get one row per product id.
SELECT
*, p.name as nazwa, a.name as nazwa_al, i.name as obrazek
FROM
products p
JOIN
albums a on p.album_id = a.id
JOIN
images i ON i.album_id = a.id
GROUP BY p.id
ORDER BY p.order ASC
You need to use GROUP_CONCAT if there are multiple rows on the right side.
SELECT
*, p.name as nazwa, a.name as nazwa_al, group_concat(i.name) as obrazek
FROM
products p
JOIN
albums a on p.album_id = a.id
JOIN
images i ON i.album_id = a.id
GROUP BY p.id
ORDER BY p.order ASC
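Here is a small sketch of the GROUP_CONCAT idea, run on SQLite through Python's sqlite3 as a stand-in for MySQL (SQLite also has group_concat and accepts backtick-quoted identifiers). The rows are invented, and the explicit column list replaces SELECT * so that every selected column is either grouped or aggregated.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, album_id INTEGER, `order` INTEGER);
CREATE TABLE albums   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE images   (id INTEGER PRIMARY KEY, name TEXT, album_id INTEGER, `order` INTEGER);
INSERT INTO products VALUES (1, 'Lamp', 10, 2), (2, 'Chair', 11, 1);
INSERT INTO albums   VALUES (10, 'Lamp shots'), (11, 'Chair shots');
INSERT INTO images   VALUES (1, 'a.jpg', 10, 1), (2, 'b.jpg', 10, 2), (3, 'c.jpg', 11, 1);
""")

# One row per product; all image names of its album collapsed into one string.
query = """
SELECT p.id,
       p.name               AS nazwa,
       a.name               AS nazwa_al,
       GROUP_CONCAT(i.name) AS obrazki
FROM products p
JOIN albums a ON p.album_id = a.id
JOIN images i ON i.album_id = a.id
GROUP BY p.id, p.name, a.name, p.`order`
ORDER BY p.`order` ASC
"""
rows = conn.execute(query).fetchall()

# Split the concatenated string back into a list in application code.
images_by_product = {nazwa: sorted(obrazki.split(",")) for _, nazwa, _, obrazki in rows}
print(images_by_product)
```

The order of names inside the concatenated string is not guaranteed here, which is why the sketch sorts after splitting. In MySQL, GROUP_CONCAT accepts its own ORDER BY if the image order matters.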

Getting a SQL query to print 0 for null count results across 3 tables

I'm trying to get a SQL query to give me the results of a count, but I need the result to include rows where the count is 0. The solutions I found suggested using IFNULL(COUNT(*), 0) in place of COUNT(*); however, that had no effect on the result. I also tried using a LEFT JOIN, but SQL gave me a syntax error when I tried to put those in. Here's my table setup:
User
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| UserID | mediumint(9) | NO | PRI | NULL | auto_increment |
| firstName | varchar(15) | NO | | NULL | |
| lastName | varchar(15) | NO | | NULL | |
| Protocol | varchar(10) | NO | | NULL | |
| Endpoint | varchar(50) | NO | | NULL | |
| UsergroupID | mediumint(9) | NO | MUL | NULL | |
+-------------+--------------+------+-----+---------+----------------+
Subscription
+----------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+----------------+
| SubscriptionID | mediumint(9) | NO | PRI | NULL | auto_increment |
| TopicID | mediumint(9) | NO | MUL | NULL | |
| UserID | mediumint(9) | NO | MUL | NULL | |
+----------------+--------------+------+-----+---------+----------------+
Topic
+----------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+--------------+------+-----+---------+----------------+
| TopicID | mediumint(9) | NO | PRI | NULL | auto_increment |
| Name | varchar(50) | NO | | NULL | |
| FBName | varchar(30) | YES | | NULL | |
| FBToken | varchar(255) | YES | | NULL | |
| TWName | varchar(10) | YES | | NULL | |
| TWToken | varchar(50) | YES | | NULL | |
| TWSecret | varchar(50) | YES | | NULL | |
+----------+--------------+------+-----+---------+----------------+
My SQL query to try and get the COUNT is :
SELECT Topic.TopicID as ID, Topic.Name AS TopicName, COUNT(*) AS numSubscriptions
FROM User, Topic, Subscription
WHERE Subscription.UserID = User.UserID
AND Subscription.TopicID = Topic.TopicID
GROUP BY Topic.TopicID;
I've tried replacing COUNT(*) with IFNULL(COUNT(*), 0) and I've tried to replace User,Topic,Subscription with User JOIN Subscription JOIN Topic and I also tried User LEFT JOIN Subscription LEFT JOIN Topic but that got a SQL error.
The output I'm getting is:
+----+-----------+------------------+
| ID | TopicName | numSubscriptions |
+----+-----------+------------------+
| 2 | test | 2 |
| 3 | test2 | 1 |
+----+-----------+------------------+
I need to be getting
+----+-----------+------------------+
| ID | TopicName | numSubscriptions |
+----+-----------+------------------+
| 2 | test | 2 |
| 3 | test2 | 1 |
| 4 | test3 | 0 |
+----+-----------+------------------+
A LEFT JOIN keeps every row of the table on its left, so the trick is to start with Topic:
SELECT Topic.TopicID as ID, Topic.Name AS TopicName,
COUNT(User.UserID) AS numSubscriptions
FROM Topic
LEFT JOIN Subscription
ON Subscription.TopicID = Topic.TopicID
LEFT JOIN User
ON User.UserID = Subscription.UserID
GROUP BY Topic.TopicID
This allows for multiple subscriptions per user and requires that the user record exists to be considered in the count.
COUNT(NULL) evaluates to 0, so any topic records without a corresponding subscription and user record will show as 0.
If you're not concerned whether the user record exists, you could simplify it to the following:
SELECT Topic.TopicID as ID, Topic.Name AS TopicName,
COUNT(Subscription.TopicID) AS numSubscriptions
FROM Topic
LEFT JOIN Subscription
ON Subscription.TopicID = Topic.TopicID
GROUP BY Topic.TopicID
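The difference between COUNT(*) and COUNT(column) on a LEFT JOIN is easy to check with a tiny data set. The sketch below assumes SQLite (via Python's sqlite3) in place of MySQL and uses made-up rows that mirror the expected output: COUNT(*) counts the NULL-padded row of a topic without subscriptions as 1, while COUNT(Subscription.SubscriptionID) skips the NULL and reports 0.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Topic        (TopicID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Subscription (SubscriptionID INTEGER PRIMARY KEY, TopicID INTEGER, UserID INTEGER);
INSERT INTO Topic        VALUES (2, 'test'), (3, 'test2'), (4, 'test3');
INSERT INTO Subscription VALUES (1, 2, 1), (2, 2, 2), (3, 3, 1);
""")

query = """
SELECT Topic.TopicID                       AS ID,
       Topic.Name                          AS TopicName,
       COUNT(*)                            AS countStar,
       COUNT(Subscription.SubscriptionID)  AS numSubscriptions
FROM Topic
LEFT JOIN Subscription ON Subscription.TopicID = Topic.TopicID
GROUP BY Topic.TopicID, Topic.Name
ORDER BY Topic.TopicID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # (ID, TopicName, countStar, numSubscriptions)
```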
The example below should do what you're after. The column in the COUNT() can be any column of the subscription table, but using its ID is a good practice.
Using the left join ensures that all entries of the user table will show up in the results, even if there are no matching subscriptions.
SELECT User.firstName,
User.lastName,
Topic.Name AS TopicName,
COUNT(Subscription.SubscriptionId) AS numSubscriptions
FROM User
LEFT OUTER JOIN Subscription ON Subscription.UserID=User.UserID
LEFT OUTER JOIN Topic ON Subscription.TopicID=Topic.TopicID
GROUP BY User.firstName, User.lastName, Topic.Name;

Replacing right join with derived table to left join

How do I write this query with a left join? The framework I use doesn't support right joins, so I need to rewrite the query. Can anyone suggest a possible solution?
select Audit.history_id, Audit.field, modifiedtime, operation
from Audit
right join (
select History.history_id
from History
where refid=2000000020088
order by modifiedtime
limit 5
) as Hist on Audit.history_id=Hist.history_id;
desc Audit
+------------------+------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------+------------+------+-----+---------+-------+
| AUDIT_ID | bigint(19) | | PRI | 0 | |
| HISTORY_ID | bigint(19) | | MUL | 0 | |
| FIELD | varchar(50) | | | | |
| OLD_VALUE | varchar(50)| YES | | NULL | |
| NEW_VALUE | varchar(50)| YES | | NULL | |
+------------------+------------+------+-----+---------+-------+
desc History
+---------------+-------------+------+-----+---------------------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+-------------+------+-----+---------------------+-------+
| HISTORY_ID | bigint(19) | | PRI | 0 | |
| REFID | bigint(19) | | MUL | 0 | |
| OPERATION | varchar(50) | YES | | NULL | |
| MODIFIED_TIME | datetime | | | 0000-00-00 00:00:00 | |
+---------------+-------------+------+-----+---------------------+-------+
Simply switch the relations for a left join. To quote Wikipedia:
"In practice, explicit right outer joins are rarely used, since they can always be replaced with left outer joins (with the table order switched) and provide no additional functionality."
Source: Wikipedia.org
select
Audit.history_id,Audit.field,Hist.modifiedtime,Hist.operation
from
(
select History.history_id, History.modifiedtime, History.operation
from History where refid=2000000020088
order by modifiedtime limit 5
) as Hist
left join Audit on (Audit.history_id = Hist.history_id);
Not sure if this will produce exactly the same output as the one you have now (it drops the LIMIT 5, so every matching History row is returned), but it might give you the right idea:
select Audit.history_id, Audit.field, History.modifiedtime, History.operation
from History
left join Audit on Audit.history_id=History.history_id
where History.refid=2000000020088
order by History.modifiedtime
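To check that putting the limited derived table on the left keeps the "first 5 history rows" semantics, here is a minimal sketch. It assumes SQLite (via Python's sqlite3) instead of MySQL, uses the column name modified_time from the table description, and fills the tables with invented rows: six history entries for the refid, of which only the first five by time should survive.

```python
import sqlite3

REFID = 2000000020088

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE History (history_id INTEGER PRIMARY KEY, refid INTEGER,
                      operation TEXT, modified_time TEXT);
CREATE TABLE Audit   (audit_id INTEGER PRIMARY KEY, history_id INTEGER, field TEXT);
""")
conn.executemany(
    "INSERT INTO History VALUES (?, ?, 'update', ?)",
    [(i, REFID, f"2020-01-0{i}") for i in range(1, 7)],
)
conn.executemany(
    "INSERT INTO Audit VALUES (?, ?, ?)",
    [(1, 1, "name"), (2, 1, "email"), (3, 6, "phone")],
)

# The LIMIT is applied inside the derived table, before the join,
# so it limits History rows rather than joined rows.
rows = conn.execute("""
SELECT Hist.history_id, Hist.modified_time, Audit.field
FROM (
    SELECT history_id, modified_time
    FROM History
    WHERE refid = ?
    ORDER BY modified_time
    LIMIT 5
) AS Hist
LEFT JOIN Audit ON Audit.history_id = Hist.history_id
ORDER BY Hist.modified_time, Audit.audit_id
""", (REFID,)).fetchall()

for row in rows:
    print(row)
```

History row 1 appears twice because it has two audit entries, rows 2 to 5 appear once with a NULL field, and row 6 (the sixth by time) is cut off by the LIMIT along with its audit entry.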