Hybrid Left/Right Join based on condition? - mysql

I'm trying to write an SQL statement to retrieve a list of users from a database, alongside their company name (if they have a company associated with them). However, there are a couple of gotchas:
Not all users have companies, but I still need to show these people in the list.
Even if a user has a company, that company could be soft-deleted (the record is still in the database, but is flagged with is_deleted = 1), and I don't want to show users that are associated with "deleted" companies.
So essentially I want to SELECT from the User table and LEFT JOIN the company table, but I don't want to include the User record at all if the company they are assigned to is_deleted.
My first inclination is that I would have to use a UNION to merge two queries together, but I was hoping there would be a cleaner way to do it?
Using MySQL 5.1.

SELECT U.name Username, C.name Company
FROM User U
LEFT OUTER JOIN Company C
ON U.companyid = C.id
WHERE C.id IS NULL OR C.is_deleted = 0
C.id IS NULL gets the users with no company, and C.is_deleted = 0 gets the users with companies that haven't been soft-deleted.

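Try joining to a derived table that excludes the deleted companies:
SELECT U.name, C.name
FROM User U LEFT OUTER JOIN
(SELECT id, name
FROM Company
WHERE is_deleted = 0)
C ON U.companyid = C.id
Note that this keeps users whose company is soft-deleted and shows them with a NULL company; add WHERE U.companyid IS NULL OR C.id IS NOT NULL to drop them.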
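A quick way to check how the two approaches above behave is to run them against a tiny data set. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the sample rows are invented for illustration, and the table and column names follow the question.

```python
import sqlite3

# SQLite stands in for MySQL here; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company (id INTEGER PRIMARY KEY, name TEXT, is_deleted INTEGER);
    CREATE TABLE User (id INTEGER PRIMARY KEY, name TEXT, companyid INTEGER);
    INSERT INTO Company VALUES (1, 'Acme', 0), (2, 'Gone Inc', 1);
    INSERT INTO User VALUES (1, 'alice', 1), (2, 'bob', 2), (3, 'carol', NULL);
""")

# Filter in WHERE: users of soft-deleted companies are dropped entirely.
where_filter = conn.execute("""
    SELECT U.name, C.name
    FROM User U
    LEFT OUTER JOIN Company C ON U.companyid = C.id
    WHERE C.id IS NULL OR C.is_deleted = 0
    ORDER BY U.id
""").fetchall()

# Filter inside the joined subquery: those users stay, with a NULL company.
subquery_filter = conn.execute("""
    SELECT U.name, C.name
    FROM User U
    LEFT OUTER JOIN (SELECT id, name FROM Company WHERE is_deleted = 0) C
        ON U.companyid = C.id
    ORDER BY U.id
""").fetchall()

print(where_filter)
print(subquery_filter)
```

With this data, bob (assigned to the soft-deleted company) is absent from the first result but appears with no company in the second, so only the WHERE-based filter matches the requirement as stated.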

Related

join sql query based on value of another table (in one query)

Let's say that I have two tables A and B where
A is table countries with columns id, name, created, modified
that contains a bunch of countries
And B is table users with columns id, first_name, last_name, email, country_id, created, modified
that contains a bunch of users linked to countries via foreign key country_id
What is the most efficient query to get all the countries that don't have a user with email address "myemail#test.com" associated to it?
I tried something like the following but that didn't work:
SELECT DISTINCT
c.*
FROM
countries c
LEFT JOIN
users u ON u.country_id = c.id
WHERE
u.email <> 'myemail#test.com'
Thanks for any help
NOTE: I also tried putting the condition on the email column in the ON clause, but that didn't work either.
A left join is fine, you just need to set it up correctly:
SELECT c.*
FROM countries c LEFT JOIN
users u
ON u.country_id = c.id AND u.email = 'myemail#test.com'
WHERE u.country_id IS NULL;
In terms of performance, this should be pretty similar to NOT EXISTS and NOT IN (although I do not recommend the latter because it has different behavior when there are NULL values).
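The difference between the three forms mentioned above can be seen on a small example. This is a minimal sketch using Python's sqlite3 module in place of MySQL, with invented sample rows; one user deliberately has a NULL country_id to show why NOT IN behaves differently.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, country_id INTEGER);
    INSERT INTO countries VALUES (1, 'France'), (2, 'Japan'), (3, 'Peru');
    INSERT INTO users VALUES
        (1, 'myemail#test.com', 1),
        (2, 'other#test.com', 2),
        (3, 'myemail#test.com', NULL);
""")

def names(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]

# Anti-join: keep countries for which the LEFT JOIN found no matching user.
anti_join = names("""
    SELECT c.name FROM countries c
    LEFT JOIN users u ON u.country_id = c.id AND u.email = 'myemail#test.com'
    WHERE u.country_id IS NULL
    ORDER BY c.id
""")

# NOT EXISTS expresses the same condition.
not_exists = names("""
    SELECT c.name FROM countries c
    WHERE NOT EXISTS (SELECT 1 FROM users u
                      WHERE u.country_id = c.id AND u.email = 'myemail#test.com')
    ORDER BY c.id
""")

# NOT IN: the subquery returns a NULL, so the predicate is never true.
not_in = names("""
    SELECT c.name FROM countries c
    WHERE c.id NOT IN (SELECT u.country_id FROM users u
                       WHERE u.email = 'myemail#test.com')
    ORDER BY c.id
""")

print(anti_join, not_exists, not_in)
```

The anti-join and NOT EXISTS agree, while NOT IN returns nothing as soon as the subquery yields a NULL.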
When you say "that don't have a user with email address "myemail#test.com"",
do you mean no email address -or- not that exact email address?
Updated
Then this should do:
SELECT DISTINCT c.*
FROM countries c
LEFT JOIN users u ON u.country_id = c.id and u.email = 'myemail#test.com'
WHERE u.country_id is null
Which I believe is what Gordon already had.
Updated Again
In that case, try:
SELECT DISTINCT c.*
FROM countries c
INNER JOIN users u ON u.country_id = c.id AND COALESCE(u.email, '') = ''
This looks for NULL or empty-string email addresses; all others are excluded from the join and therefore from the result set.
I hope this helps.

COUNT via multi-chain join

I have this hierarchy in my database (from lowest to highest):
User => Dept => Area => Company
Now I need to build a table that shows all companies (with some info taken directly from the companies table), and I want the last column of the HTML table to be the number of users. I know I need to join the tables together, and perhaps join a table to itself, but how do I do this?
Each of these tables has a column linking to its parent table (except Company, of course).
JOIN the tables:
SELECT
c.CompanyId,
c.CompanyName,
IFNULL(COUNT(u.UserId), 0) AS 'Number Of Users'
FROM Company AS c
LEFT JOIN Area AS a ON a.CompanyId = c.CompanyId
-- AreaId and DeptId below are assumed names for the parent-link columns
LEFT JOIN Dept AS d ON d.AreaId = a.AreaId
LEFT JOIN users AS u ON u.DeptId = d.DeptId
GROUP BY c.CompanyId,
c.CompanyName;
Note: the LEFT JOINs keep companies that have no matching rows in the other tables, and COUNT(u.UserId) ignores NULLs, so those companies get a count of zero. COUNT never returns NULL, so the IFNULL is only a safeguard.
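As a rough check of the join chain, here is a small sketch run through Python's sqlite3 module rather than MySQL. The link columns (AreaId, DeptId) and the sample rows are assumptions, since the question only says that each table has a column pointing at its parent.

```python
import sqlite3

# SQLite stands in for MySQL; column names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company (CompanyId INTEGER PRIMARY KEY, CompanyName TEXT);
    CREATE TABLE Area (AreaId INTEGER PRIMARY KEY, CompanyId INTEGER);
    CREATE TABLE Dept (DeptId INTEGER PRIMARY KEY, AreaId INTEGER);
    CREATE TABLE Users (UserId INTEGER PRIMARY KEY, DeptId INTEGER);
    INSERT INTO Company VALUES (1, 'Acme'), (2, 'Empty Co');
    INSERT INTO Area VALUES (10, 1), (11, 1);
    INSERT INTO Dept VALUES (100, 10), (101, 11), (102, 11);
    INSERT INTO Users VALUES (1, 100), (2, 101), (3, 101);
""")

# Each child joins to its parent; COUNT skips the NULL UserId values that the
# LEFT JOINs produce for departments (or companies) without users.
counts = conn.execute("""
    SELECT c.CompanyId, c.CompanyName, COUNT(u.UserId) AS number_of_users
    FROM Company AS c
    LEFT JOIN Area  AS a ON a.CompanyId = c.CompanyId
    LEFT JOIN Dept  AS d ON d.AreaId = a.AreaId
    LEFT JOIN Users AS u ON u.DeptId = d.DeptId
    GROUP BY c.CompanyId, c.CompanyName
    ORDER BY c.CompanyId
""").fetchall()

print(counts)
```

The company with no areas at all still appears, with a count of zero.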

Can this query be rewritten without UNION, and is it scaleable?

I have a few tables:
In the product-table, I have a list of products.
In the user-table, I have a list of users.
In the group-table, I have groups of users.
In the group_member-table, I have linked group and member (many-to-many)
In the user_product-table, I have linked user and product (many-to-many)
In the group_product-table, I have linked group and product (many-to-many)
So a user could have many products, a product could have many users. A user can be member of many groups, a group could have many members. A group could have many products, a product could have many groups. In other words, a product can have both groups and users.
What I want to ask the database is: "List all the products that a given user has access to, either through a direct relation in the user_product-table, or through the groups that the user is member of. I want the name of the product and the name of the user."
This is the query that I have come up with:
# First get all the products the user has access to via a group.
SELECT product.name,
user.first_name
FROM product
INNER JOIN group_product
ON group_product.product_id = product.product_id
INNER JOIN group
ON group.group_id = group_product.group_id
INNER JOIN group_member
ON group_member.group_id = group.group_id
INNER JOIN user
ON user.user_id = group_member.user_id
WHERE user.user_id = 1
UNION
# Now get all the products via direct access from user_product.
SELECT product.name,
user.first_name
FROM product
INNER JOIN user_product
ON user_product.product_id = product.product_id
INNER JOIN user
ON user.user_id = user_product.user_id
WHERE user.user_id = 1
Is this a good query, or is it better/possible to rewrite this into a JOIN only query? Would this be a fast query if there were 100 000 users, 10 000 groups and 100 products?
Is this a good database design, or is it better to store this logic in another manner?
(This is my first more complex query.)
Your query takes the correct approach for your data model. The "correctness" of your data model really depends on volumes and frequency of change: you could opt to always store the explicit user-product relationship whenever a user is added to or removed from a group. This is a denormalizing tactic that moves the overhead from querying to updating; it is usually best not to consider such moves unless performance has been tested and found deficient.
A very tiny optimisation may be to avoid the join to user and product until after the union. At present you are only selecting the product name and user first_name, but if you were selecting many columns the sort/distinct would involve more work than strictly necessary, so something like:
select product.name, user.first_name
from
(
select
group_product.product_id
from
group_product
inner join `group` on `group`.group_id = group_product.group_id
inner join group_member on group_member.group_id = `group`.group_id
where group_member.user_id = 1
union
select user_product.product_id
from user_product
where user_product.user_id = 1
) as d
inner join product on product.product_id = d.product_id
inner join user on user.user_id = 1
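The union-first shape can be tried out on a few rows. This sketch uses Python's sqlite3 module as a stand-in for MySQL and invented sample data; it also skips the join to the group table, on the assumption that group_member alone is enough to link a user to a group's products.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE group_member (group_id INTEGER, user_id INTEGER);
    CREATE TABLE user_product (user_id INTEGER, product_id INTEGER);
    CREATE TABLE group_product (group_id INTEGER, product_id INTEGER);
    INSERT INTO product VALUES (1, 'Editor'), (2, 'Compiler'), (3, 'Debugger');
    INSERT INTO user VALUES (1, 'Ann'), (2, 'Ben');
    INSERT INTO group_member VALUES (7, 1), (8, 1);
    INSERT INTO group_product VALUES (7, 1), (8, 1), (8, 2);
    INSERT INTO user_product VALUES (1, 2), (2, 3);
""")

user_id = 1

# UNION de-duplicates bare product ids first; names are joined in afterwards.
rows = conn.execute("""
    SELECT product.name, user.first_name
    FROM (
        SELECT group_product.product_id
        FROM group_product
        INNER JOIN group_member ON group_member.group_id = group_product.group_id
        WHERE group_member.user_id = ?
        UNION
        SELECT user_product.product_id
        FROM user_product
        WHERE user_product.user_id = ?
    ) AS d
    INNER JOIN product ON product.product_id = d.product_id
    INNER JOIN user ON user.user_id = ?
    ORDER BY product.product_id
""", (user_id, user_id, user_id)).fetchall()

print(rows)
```

Each product is listed once even though the user reaches 'Editor' through two groups and 'Compiler' through both a group and a direct link.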

Selecting data from multiple tables, where a specific table does not contain any identifiable columns?

I have a dilemma.
Let's assume (for simplicity's sake) I have four tables with different numbers of columns and rows: users, mail, events and service.
When I receive a request, I have an ID that links to three of those tables, but it matches a different column in each.
Let's say, users matches on user_id, mail matches on user_ref and events matches on user_ref as well.
That would've been a fine query for me to write up, even with single, multiple or even all IDs.
The problem arises in the next step I have to take, and that's the service table.
The service table doesn't conform to the same standards as the others, so it does not have a user_id or user_ref that can be pulled.
What it has instead is a mail_ref column, and it can contain duplicates.
My current method is trying to use an IN() method, but it only works for selecting a single user/row.
Here's my current query:
SELECT
u.Name as Name,
COUNT(m.user_ref) AS Mail_total,
e.mail_id,
COUNT(e.user_ref) AS Event_total,
COUNT(s.mail_ref) AS service_total
FROM
users u
LEFT JOIN
mail m ON m.user_ref = u.user_id
LEFT JOIN
service s ON s.mail_ref IN(e.mail_id)
LEFT JOIN
events e ON e.user_ref = u.user_id
WHERE u.user_id IN(my,list,of,ids)
GROUP BY s.mail_ref
The problem I have with it currently, is that although it's selecting the correct data, it's not selecting unique data for every id I specify.
It works marginally fine when given a single id, but as mentioned above, not when it has to retrieve multiple rows.
If anyone could help me out it would be much appreciated.
Do a subquery in the left join for service. Instead of:
LEFT JOIN
service s ON s.mail_ref IN(e.mail_id)
Try
LEFT JOIN
(select TOP 1 mail_ref from service) as s on s.mail_ref = e.mail_id
See if that works.
Alternatively, compute each total in its own correlated subquery, so the joins cannot multiply one another's rows:
SELECT
u.Name as Name,
(select count(*) from mail m where m.user_ref = u.user_id) AS Mail_total,
(select count(*) from events e where e.user_ref = u.user_id) AS Event_total,
(select count(*) from
events e
inner join service s on s.mail_ref = e.mail_id
where
e.user_ref = u.user_id) as service_total
FROM
users u
WHERE u.user_id IN(my,list,of,ids)
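To illustrate why per-user subqueries give different totals from a chain of joins, the following sketch compares the two on invented rows, using Python's sqlite3 module in place of the asker's database. The mail table's id column and all sample values are assumptions.

```python
import sqlite3

# SQLite stands in for the real database; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE mail (id INTEGER PRIMARY KEY, user_ref INTEGER);
    CREATE TABLE events (mail_id INTEGER, user_ref INTEGER);
    CREATE TABLE service (mail_ref INTEGER);
    INSERT INTO users VALUES (1, 'Ann'), (2, 'Ben');
    INSERT INTO mail (user_ref) VALUES (1), (1), (2);
    INSERT INTO events VALUES (50, 1), (51, 1), (52, 1), (60, 2);
    INSERT INTO service VALUES (50), (50), (60);
""")

# Joining mail and events to the same user pairs every mail row with every
# event row, so both counts are inflated.
joined = conn.execute("""
    SELECT u.Name, COUNT(m.user_ref), COUNT(e.user_ref)
    FROM users u
    LEFT JOIN mail m ON m.user_ref = u.user_id
    LEFT JOIN events e ON e.user_ref = u.user_id
    WHERE u.user_id IN (1, 2)
    GROUP BY u.user_id
    ORDER BY u.user_id
""").fetchall()

# Correlated subqueries count each table independently for each user.
per_user = conn.execute("""
    SELECT u.Name,
           (SELECT COUNT(*) FROM mail m WHERE m.user_ref = u.user_id),
           (SELECT COUNT(*) FROM events e WHERE e.user_ref = u.user_id),
           (SELECT COUNT(*) FROM events e
            INNER JOIN service s ON s.mail_ref = e.mail_id
            WHERE e.user_ref = u.user_id)
    FROM users u
    WHERE u.user_id IN (1, 2)
    ORDER BY u.user_id
""").fetchall()

print(joined)
print(per_user)
```

Ann has two mails and three events, yet the joined version reports six of each, while the subquery version reports the real totals.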

Using left join for the same table or is there a better way?

I have a table called bans where I have the following fields:
room_id, banned_user_id, banned_by_id, reason, ts_start, ts_end
The user data comes from a table called users. I want to query the bans to retrieve the name of who was banned and by whom, along with the reason, the time the ban was placed and the time it ends.
So I have this query:
SELECT u.username, us.username, b.reason, b.ts_start, b.ts_end
FROM `bans` b
LEFT JOIN users us ON b.banned_by_uid = us.uid
LEFT JOIN users u ON b.banned_uid = u.uid
WHERE room_id = 3
My question is whether it is OK to use LEFT JOIN twice for the two pieces of data I need from the users table, or whether there is a different approach for this kind of scenario?
Your query is perfectly acceptable. Each join to users is on a specific ID, which translates into a simple lookup, with minimal overhead.
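For illustration, here is the same double join on a few made-up rows, run through Python's sqlite3 module rather than MySQL. The column names follow the query in the question; the output aliases (banned, banned_by) are an addition to tell the two usernames apart.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (uid INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE bans (room_id INTEGER, banned_uid INTEGER, banned_by_uid INTEGER,
                       reason TEXT, ts_start INTEGER, ts_end INTEGER);
    INSERT INTO users VALUES (1, 'mod'), (2, 'troll');
    INSERT INTO bans VALUES
        (3, 2, 1, 'spam', 100, 200),
        (3, 99, 1, 'flood', 150, 250),
        (4, 2, 1, 'other', 0, 1);
""")

# users is joined twice under different aliases, once per foreign key.
rows = conn.execute("""
    SELECT u.username AS banned, us.username AS banned_by, b.reason
    FROM bans b
    LEFT JOIN users us ON b.banned_by_uid = us.uid
    LEFT JOIN users u ON b.banned_uid = u.uid
    WHERE b.room_id = 3
    ORDER BY b.ts_start
""").fetchall()

print(rows)
```

Because both joins are LEFT JOINs, the ban that refers to a user id with no row in users (99 here) is still listed, with no name for the banned user.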