Can this query be rewritten without UNION, and is it scalable? - mysql

I have a few tables:
In the product-table, I have a list of products.
In the user-table, I have a list of users.
In the group-table, I have groups of users.
In the group_member-table, I have linked group and member (many-to-many)
In the user_product-table, I have linked user and product (many-to-many)
In the group_product-table, I have linked group and product (many-to-many)
So a user could have many products, a product could have many users. A user can be member of many groups, a group could have many members. A group could have many products, a product could have many groups. In other words, a product can have both groups and users.
What I want to ask the database is: "List all the products that a given user has access to, either through a direct relation in the user_product-table, or through the groups that the user is member of. I want the name of the product and the name of the user."
This is the query that I have come up with:
# First get all the products the user has access to via a group.
SELECT product.name,
user.first_name
FROM product
INNER JOIN group_product
ON group_product.product_id = product.product_id
INNER JOIN `group`
ON `group`.group_id = group_product.group_id
INNER JOIN group_member
ON group_member.group_id = `group`.group_id
INNER JOIN user
ON user.user_id = group_member.user_id
WHERE user.user_id = 1
UNION
# Now get all the products via direct access from user_product.
SELECT product.name,
user.first_name
FROM product
INNER JOIN user_product
ON user_product.product_id = product.product_id
INNER JOIN user
ON user.user_id = user_product.user_id
WHERE user.user_id = 1
Is this a good query, or is it better/possible to rewrite this into a JOIN only query? Would this be a fast query if there were 100 000 users, 10 000 groups and 100 products?
Is this a good database design, or is it better to store this logic in another manner?
(This is my first more complex query.)

Your query takes the correct approach for your data model. Whether the data model itself is right depends on data volumes and on how often memberships change: you could instead store an explicit user-product row whenever a user is added to or removed from a group. That is a denormalizing tactic that moves the overhead from querying to updating. It is usually best not to consider such a move unless performance has been tested and found deficient.
A very small optimisation may be to defer the joins to user and product until after the union. At present you are only selecting the product name and the user's first_name, but if you were selecting many columns, the sort/distinct performed by UNION would involve more work than strictly necessary. Something like:
select product.name, user.first_name
from
(
select
group_product.product_id
from
group_product
inner join `group` on `group`.group_id = group_product.group_id
inner join group_member on group_member.group_id = `group`.group_id
where group_member.user_id = 1
union
select user_product.product_id
from user_product
where user_product.user_id = 1
) as d
inner join product on product.product_id = d.product_id
inner join user on user.user_id = 1
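On the "without UNION" part of the question, here is a minimal sketch, not a benchmark. It assumes SQLite (through Python's sqlite3 module) in place of MySQL, invented sample rows, and a hypothetical index name, and it leaves out the group table because only its ID is needed. It runs the derived-table UNION next to a UNION-free variant built on EXISTS and checks that both return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE user (user_id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE group_member (group_id INTEGER, user_id INTEGER,
                           PRIMARY KEY (group_id, user_id));
CREATE TABLE group_product (group_id INTEGER, product_id INTEGER,
                            PRIMARY KEY (group_id, product_id));
CREATE TABLE user_product (user_id INTEGER, product_id INTEGER,
                           PRIMARY KEY (user_id, product_id));
-- Hypothetical index so the lookup by user does not scan group_member.
CREATE INDEX group_member_by_user ON group_member (user_id, group_id);

-- Invented sample data: user 1 gets Alpha and Beta via group 10, Beta directly.
INSERT INTO product VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
INSERT INTO user VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO group_member VALUES (10, 1), (20, 2);
INSERT INTO group_product VALUES (10, 1), (10, 2), (20, 3);
INSERT INTO user_product VALUES (1, 2), (2, 1);
""")

UNION_SQL = """
SELECT product.name, user.first_name
FROM (
    SELECT gp.product_id
    FROM group_product gp
    JOIN group_member gm ON gm.group_id = gp.group_id
    WHERE gm.user_id = :uid
    UNION
    SELECT up.product_id FROM user_product up WHERE up.user_id = :uid
) AS d
JOIN product ON product.product_id = d.product_id
JOIN user ON user.user_id = :uid
ORDER BY product.name
"""

# UNION-free variant: keep a product if either access path exists.
EXISTS_SQL = """
SELECT product.name, user.first_name
FROM product
JOIN user ON user.user_id = :uid
WHERE EXISTS (SELECT 1 FROM user_product up
              WHERE up.user_id = user.user_id
                AND up.product_id = product.product_id)
   OR EXISTS (SELECT 1 FROM group_product gp
              JOIN group_member gm ON gm.group_id = gp.group_id
              WHERE gm.user_id = user.user_id
                AND gp.product_id = product.product_id)
ORDER BY product.name
"""

union_rows = conn.execute(UNION_SQL, {"uid": 1}).fetchall()
exists_rows = conn.execute(EXISTS_SQL, {"uid": 1}).fetchall()
print(union_rows)
print(exists_rows)
```

Which form is faster depends on the data and the indexes, so it is worth comparing the EXPLAIN output of both on realistic volumes before choosing one.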

Related

Unable to join from 3 different tables - MySql

I am struggling to understand whether this query is possible. I have 3 tables: User, Account, Loan. Two users are linked by an account, and any loans facilitated between the two of them are linked to the account_id. However, when I show the data in a table, I want to show the borrower's name from the user table and the loan details from the loan table, but the rows need to be looked up by the lender_id in the account table.
Here is the ERD:
A practical example: as a lender, you go to the dashboard to see all the loans you have given out, and you want to see the borrower's name, the principal amount, the interest rate and so on, but nothing really about the account.
I want the final table to look something like:
BorrowerName | PrincipalAmount | InterestRate | RepaymentDate | Relationship
What I have so far:
SELECT user.first_name, user.last_name, loan.principal, loan.interest_rate,
loan.repayment_date, account.relationship
FROM user
INNER JOIN account ON account.borrower_id = user.id
INNER JOIN loan ON loan.account_id = account.id
The issue is that nowhere do I reference the lender_id, which is the variable I need to pass in to choose whose loans to show. I'm very lost; any help would be great.
Just add a WHERE clause to your query:
SELECT user.first_name, user.last_name, loan.principal, loan.interest_rate,
loan.repayment_date, account.relationship
FROM user
INNER JOIN account ON account.borrower_id = user.id
INNER JOIN loan ON loan.account_id = account.id
where account.lender_id = value_to_be_searched;
You can add a WHERE clause:
SELECT u.first_name, u.last_name, l.principal, l.interest_rate,
l.repayment_date, a.relationship
FROM user u JOIN
account a
ON a.borrower_id = u.id JOIN
loan l
ON l.account_id = a.id
WHERE a.lender_id = ?;
You don't need to SELECT the column to filter on it.
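To see the placeholder version run end to end, here is a small sketch. It assumes SQLite through Python's sqlite3 module rather than MySQL, guessed column types, invented sample rows, and a hypothetical helper name loans_given_by.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE account (id INTEGER PRIMARY KEY, lender_id INTEGER,
                      borrower_id INTEGER, relationship TEXT);
CREATE TABLE loan (id INTEGER PRIMARY KEY, account_id INTEGER,
                   principal REAL, interest_rate REAL, repayment_date TEXT);

-- Invented sample data: user 1 lends to user 2 and borrows from user 3.
INSERT INTO user VALUES (1, 'Lena', 'Lender'), (2, 'Ben', 'Borrower'),
                        (3, 'Cara', 'Client');
INSERT INTO account VALUES (1, 1, 2, 'friend'), (2, 3, 1, 'family');
INSERT INTO loan VALUES (1, 1, 500.0, 5.0, '2024-12-31'),
                        (2, 2, 900.0, 3.5, '2025-06-30');
""")

def loans_given_by(lender_id):
    """Return borrower name and loan details for one lender (hypothetical helper)."""
    sql = """
    SELECT u.first_name, u.last_name, l.principal, l.interest_rate,
           l.repayment_date, a.relationship
    FROM user u
    JOIN account a ON a.borrower_id = u.id
    JOIN loan l ON l.account_id = a.id
    WHERE a.lender_id = ?
    """
    # The lender ID is bound as a parameter, never concatenated into the SQL.
    return conn.execute(sql, (lender_id,)).fetchall()

rows = loans_given_by(1)
print(rows)
```

Only the loan on the account where user 1 is the lender comes back. The account where user 1 is the borrower is filtered out by the WHERE clause.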

List of all products with user's saved products

I'm having an issue trying to figure out a query that will show a list of all of my products, along with whether or not a user has saved any given product.
I have 3 tables involved in the query (users, product_user, products).
I am determining whether or not a user has saved a product by joining the three tables and checking whether user_id is null, with the following query:
SELECT products.*, users.id as 'user_id' FROM products
LEFT JOIN product_user ON products.id = product_user.product_id
LEFT JOIN users ON product_user.user_id = users.id AND users.id =1;
However, this returns duplicate rows when the user has saved a product (a user_id null version and a user_id = 1 version). A distinct statement won't work because the rows aren't distinct in this case. What is the best practice to ensure that I only get back distinct products? I need to get back the entire list of products, whether or not the user has saved them.
This is being queried in mysql.
I think this does what you want:
select p.*,
(select pu.user_id
from product_user pu
where pu.product_id = p.id and pu.user_id = 1
limit 1
) as user_id
from products p;
This will return only one row per product. The user_id column holds the ID you passed in when that user has saved the product, and NULL otherwise.
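An alternative to the scalar subquery, sketched here under assumptions, is to move the user filter into the LEFT JOIN on product_user so that other users' rows never join at all. The sketch uses SQLite through Python's sqlite3 module instead of MySQL, invented sample rows, and assumes (product_id, user_id) is unique in product_user. It also runs the query from the question to show where the duplicates come from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE users (id INTEGER PRIMARY KEY);
CREATE TABLE product_user (product_id INTEGER, user_id INTEGER,
                           PRIMARY KEY (product_id, user_id));

-- Invented sample data: product 1 is saved by users 1 and 2, product 2 by user 2.
INSERT INTO products VALUES (1, 'Lamp'), (2, 'Desk'), (3, 'Chair');
INSERT INTO users VALUES (1), (2);
INSERT INTO product_user VALUES (1, 1), (1, 2), (2, 2);
""")

# Query shape from the question: every product_user row joins first,
# so product 1 appears once per user who saved it.
original_rows = conn.execute("""
SELECT products.id, users.id AS user_id
FROM products
LEFT JOIN product_user ON products.id = product_user.product_id
LEFT JOIN users ON product_user.user_id = users.id AND users.id = 1
""").fetchall()

# Filter on the user inside the first join condition instead.
saved_rows = conn.execute("""
SELECT p.id, p.name, pu.user_id
FROM products p
LEFT JOIN product_user pu
       ON pu.product_id = p.id AND pu.user_id = ?
ORDER BY p.id
""", (1,)).fetchall()

print(len(original_rows))
print(saved_rows)
```

With the filter in the join condition there is at most one matching product_user row per product, so each product appears exactly once.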

using JOIN and subquery in mysql

I posted a question about 2 weeks ago about 'one to many' relation between SQL tables. Now I have a bit of a different scenario. Basically, there are two tables - coffee_users and coffee_product_registrations. The latter is connected to coffee_users table with 'uid' column. So basically coffee_users.uid = coffee_product_registrations.uid
A single user can have multiple products registered.
What I want to do is to display some product information (from coffee_product_registrations) along with some user information (from coffee_users), BUT retrieve only those rows that have more than 1 product registrations.
So to simplify, here are the steps I need to take:
Join two tables
Select users that have multiple products registered
Display all their products along with their names and stuff
My current SQL query looks like this:
SELECT c.uid, c.name, cpr.model
FROM coffee_users c
JOIN coffee_product_registrations cpr on c.uid = cpr.uid
GROUP BY c.uid
HAVING COUNT(cpr.uid) > 1
This joins the two tables on 'uid' column but displays only 1 row for each user. It selects just users that have multiple products registered.
Now I need to take these IDs and select ALL the products from coffee_product_registrations based on them.
I cannot figure out how to put this in one query.
Try this, replacing cpr.*, c.* with the columns you want to extract from the query:
SELECT cpr.*, c.*
FROM coffee_product_registrations cpr
INNER JOIN coffee_users c ON c.uid = cpr.uid
INNER JOIN (SELECT cpr.uid
FROM coffee_product_registrations cpr
GROUP BY cpr.uid
HAVING COUNT(DISTINCT cpr.productId) > 1
) AS A ON c.uid = A.uid;
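A runnable sketch of the same idea follows, assuming SQLite through Python's sqlite3 module instead of MySQL and invented sample rows. Because the question only shows a model column, this version counts rows with COUNT(*) rather than distinct productId values. That is an assumption about what "multiple products" means here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE coffee_users (uid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE coffee_product_registrations (
    id INTEGER PRIMARY KEY, uid INTEGER, model TEXT);

-- Invented sample data: Ann registered two products, Bob only one.
INSERT INTO coffee_users VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO coffee_product_registrations VALUES
    (1, 1, 'Espresso X'), (2, 1, 'Latte Pro'), (3, 2, 'Drip One');
""")

rows = conn.execute("""
SELECT c.uid, c.name, cpr.model
FROM coffee_product_registrations cpr
JOIN coffee_users c ON c.uid = cpr.uid
JOIN (
    -- Users with more than one registration.
    SELECT uid
    FROM coffee_product_registrations
    GROUP BY uid
    HAVING COUNT(*) > 1
) AS multi ON multi.uid = cpr.uid
ORDER BY c.uid, cpr.model
""").fetchall()

print(rows)
```

The derived table only decides which users qualify. The outer join then returns every registration for those users, which is the part GROUP BY alone was collapsing.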

Complex SQL query over four tables does not fetch wanted result

Imagine the following scenario: Employees of a company can give votes to an arbitrary question (integer value).
I have a complex request where I want to fetch five pieces of information:
Name of the company
Average vote value per company
Number of employees
Number of votes
Participation (no of votes/no of employees)
The SQL query shall only fetch votes for companies that the current user is employed at.
To do that I am accessing four different tables; below is an excerpt of the table declarations:
User
- id
Company
- id
- name
Employment
- user_id (FK User.id)
- company_id (FK Company.id)
Vote
- company_name
- vote_value
- timestamp
User and Company are related by an Employment (an n:m relation, so it needs its own table). The table Vote shall not be connected by a PK/FK relation, but votes can be related to a company by the company name (Company.name = Vote.company_name).
I managed to fetch all information except for the number of employees correctly by the following SQL query:
SELECT
c.name AS company,
AVG(v.vote_value) AS value,
COUNT(e.user_id) AS employees,
COUNT(v.vote_value) AS votes,
(COUNT(e.user_id) / COUNT(v.vote_value)) AS participation
FROM Company c
JOIN Employment e ON e.company_id = c.id
JOIN User u ON u.id = e.user_id
JOIN Vote v
ON v.company_name = c.name
AND YEAR(v.timestamp) = :year
AND MONTH(v.timestamp) = :month
AND DAY(v.timestamp) = :day
WHERE u.id = :u_id
GROUP BY v.company_name, e.company_id
But instead of fetching the correct number of employees, the employee field is always equal to the number of votes. (And therefore the participation value is also wrong.)
Is there any way to perform this in one query without subqueries [1]? What do I have to change so that the query fetches the correct number of employees?
[1] I am using Doctrine2 and try to avoid subqueries as Doctrine does not support them. I just did not want to pull this into a Doctrine discussion. That is why I broke this topic down to the SQL level.
If you want to fetch the number of employees then the issue is that you are filtering by only 1 employee:
WHERE u.id = :u_id
Secondly, bear in mind that once you join down to the vote level, every employee row is repeated for each vote, so a plain count of employees gives you as many rows as there are votes. You will have to use a distinct count, as @Przem... mentioned:
COUNT(DISTINCT e.user_id) AS employees,
That way you will uniquely count the employees for the company (getting rid of the repeated employee ids for all the votes the employee has).
As you mentioned in a comment:
It returns the 1 as employee count
This is because of the where condition forcing to 1 employee with many votes. The distinct will only count the unique 1 employee filtered by the where clause and that is why you get only 1. However, that is the correct result (based on your filter condition).
Adding subqueries in the select clause will also get you to the right result but at the expense of performance.
Try this. It calculates the votes in one subquery and the employees in another.
SELECT c.name,
ce.employee_count,
cv.vote_count,
cv.vote_count / ce.employee_count AS participation,
cv.vote_value
FROM
(SELECT company_id, COUNT(*) AS employee_count
FROM Employment GROUP BY company_id) ce
INNER JOIN Company c
ON c.id = ce.company_id
INNER JOIN
(SELECT company_name, AVG(vote_value) AS vote_value, COUNT(*) AS vote_count
FROM Vote v GROUP BY company_name) cv
ON c.name = cv.company_name
Well I think with a query defined like that you should add the DISTINCT keyword while counting the number of employees:
SELECT
c.name AS company,
AVG(v.vote_value) AS value,
COUNT(DISTINCT e.user_id) AS employees,
COUNT(v.vote_value) AS votes,
(COUNT(DISTINCT e.user_id) / COUNT(v.vote_value)) AS participation
FROM Company c
JOIN Employment e ON e.company_id = c.id
JOIN User u ON u.id = e.user_id
JOIN Vote v
ON v.company_name = c.name
AND YEAR(v.timestamp) = :year
AND MONTH(v.timestamp) = :month
AND DAY(v.timestamp) = :day
GROUP BY v.company_name, e.company_id;
Not sure if it is possible in MySQL, though.
Edit: as @Mosty Mostacho pointed out, the condition on u.id was the problem. Without it, and with the DISTINCT keyword added, the query returns correct results, so I edited the query above.
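One thing worth checking on real data is the vote count: joining Employment and Vote on the same company repeats each vote once per employee, so COUNT(v.vote_value) can be inflated even when COUNT(DISTINCT e.user_id) is right. The sketch below illustrates this and a pre-aggregated alternative. It assumes SQLite through Python's sqlite3 module instead of MySQL, invented sample rows, no date filter, and it does use subqueries, which the asker wanted to avoid for Doctrine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Company (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Employment (user_id INTEGER, company_id INTEGER);
CREATE TABLE Vote (company_name TEXT, vote_value INTEGER, timestamp TEXT);

-- Invented sample data: Acme has 3 employees and 2 votes, Globex 1 and 1.
INSERT INTO Company VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO Employment VALUES (1, 1), (2, 1), (3, 1), (4, 2);
INSERT INTO Vote VALUES ('Acme', 4, '2024-01-01'), ('Acme', 2, '2024-01-01'),
                        ('Globex', 5, '2024-01-01');
""")

# Joining both child tables yields employees x votes rows per company.
naive = conn.execute("""
SELECT c.name, COUNT(DISTINCT e.user_id), COUNT(v.vote_value)
FROM Company c
JOIN Employment e ON e.company_id = c.id
JOIN Vote v ON v.company_name = c.name
GROUP BY c.id, c.name
ORDER BY c.name
""").fetchall()

# Aggregate each child table on its own, then join one row per company.
fixed = conn.execute("""
SELECT c.name, cv.avg_value, ce.employees, cv.votes,
       1.0 * cv.votes / ce.employees AS participation
FROM Company c
JOIN (SELECT company_id, COUNT(*) AS employees
      FROM Employment GROUP BY company_id) ce ON ce.company_id = c.id
JOIN (SELECT company_name, AVG(vote_value) AS avg_value, COUNT(*) AS votes
      FROM Vote GROUP BY company_name) cv ON cv.company_name = c.name
WHERE c.id IN (SELECT company_id FROM Employment WHERE user_id = :u_id)
""", {"u_id": 1}).fetchall()

print(naive)
print(fixed)
```

Here the naive join reports 6 votes for Acme instead of 2, while the pre-aggregated form reports 3 employees and 2 votes. The IN filter restricts the result to the current user's companies without shrinking the employee count.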

Hybrid Left/Right Join based on condition?

I'm trying to write an SQL statement to retrieve a list of users from a database, alongside their company name (if they have a company associated with them). However, there are a couple of gotchas:
Not all users have companies, but I still need to show these people in the list.
Even if a user has a company, that company could be soft-deleted (the record is still in the database, but is flagged with is_deleted = 1), and I don't want to show users that are associated with "deleted" companies.
So essentially I want to SELECT from the User table and LEFT JOIN the company table, but I don't want to include the User record at all if the company they are assigned to is_deleted.
My first inclination is that I would have to use a UNION to merge two queries together, but I was hoping there would be a cleaner way to do it?
Using Mysql 5.1
SELECT U.name Username, C.name Company
FROM User U
LEFT OUTER JOIN Company C
ON U.companyid = C.id
WHERE C.id IS NULL OR C.is_deleted = 0
C.id IS NULL gets the users with no company, and C.is_deleted = 0 gets the users with companies that haven't been soft-deleted.
Try joining to a table that excludes the deleted companies:
SELECT U.name, C.name
FROM User U LEFT OUTER JOIN
(SELECT id, name
FROM Company
WHERE is_deleted = 0)
C ON U.companyid = C.id
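The two approaches treat users of soft-deleted companies differently, which a small sketch makes visible. It assumes SQLite through Python's sqlite3 module instead of MySQL 5.1, invented sample rows, and the column names from the question (User.companyid, Company.id, Company.is_deleted).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Company (id INTEGER PRIMARY KEY, name TEXT, is_deleted INTEGER);
CREATE TABLE User (id INTEGER PRIMARY KEY, name TEXT, companyid INTEGER);

-- Invented sample data: Ann has a live company, Bob a soft-deleted one,
-- Cy has no company at all.
INSERT INTO Company VALUES (1, 'Acme', 0), (2, 'Defunct', 1);
INSERT INTO User VALUES (1, 'Ann', 1), (2, 'Bob', 2), (3, 'Cy', NULL);
""")

# Filter after the join: rows matched to a deleted company are dropped.
where_rows = conn.execute("""
SELECT U.name, C.name
FROM User U
LEFT OUTER JOIN Company C ON U.companyid = C.id
WHERE C.id IS NULL OR C.is_deleted = 0
ORDER BY U.id
""").fetchall()

# Filter before the join: deleted companies never match, so their users
# stay in the list with a NULL company name.
derived_rows = conn.execute("""
SELECT U.name, C.name
FROM User U
LEFT OUTER JOIN (SELECT id, name FROM Company WHERE is_deleted = 0) C
             ON U.companyid = C.id
ORDER BY U.id
""").fetchall()

print(where_rows)
print(derived_rows)
```

With this data the WHERE version omits Bob, which matches the requirement in the question, while the derived-table version still lists Bob with no company.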