SQL - Does an object have all the required components? - mysql

I'm not quite sure how to phrase the question in order to really get across what I mean, so I suppose the following example illustrates the question.
Let's say I have a recipe website where users can sign up (data stored in a Users table, with user ID being the primary key) and record ingredients (global ingredient book stored in an AllIngredients table, with ingredient ID being the primary key) that they have in their cabinet (data stored in a UserCabinet table, which links to user ID and ingredient ID).
Then, let's say I have a collection of recipes (stored in a Recipes table, with a recipe ID being the primary key) which are made up of a set of ingredients (stored in a RecipeIngredients table, which links to recipe ID and ingredient ID).
In this scenario, the question I'm asking is how do I determine which recipes a user has all of the ingredients for? They might have more ingredients than the recipe calls for, which is fine, but they can't have less (i.e. they can't be missing any). Is this possible with only SQL, or does it require several queries / manipulation using a programming language?
edit: The following is the SQL to create the sample tables I'm talking about: http://pastebin.com/N9pqmC2r

select r.*
from recipes r
join recipeComponents rc on rc.recipe_id = r.id
join userCabinet uc on uc.ingredient_id = rc.ingredient_id
where uc.user_id = ?
group by r.id
having count(uc.ingredient_id) = (
select count(*)
from recipeComponents rc1
where rc1.recipe_id = r.id
)
Or
select distinct r.*
from recipes r
join recipeComponents rc on rc.recipe_id = r.id
join userCabinet uc on uc.ingredient_id = rc.ingredient_id
where uc.user_id = ?
and not exists (
select *
from recipeComponents rc1
where rc1.recipe_id = r.id
and not exists (
select *
from userCabinet uc1
where uc1.user_id = uc.user_id
and uc1.ingredient_id = rc1.ingredient_id
)
)
Or
select r.*
from recipes r
left join (
select rc.recipe_id
from recipeComponents rc
left join userCabinet uc
on uc.user_id = ?
and uc.ingredient_id = rc.ingredient_id
where uc.ingredient_id is null
) u on u.recipe_id = r.id
where u.recipe_id is null
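The double NOT EXISTS idea ("there is no required ingredient that the user lacks") can be tried end to end with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the table and column names follow the queries above, and the sample rows and the makeable_recipes helper are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the sample data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipes (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE recipeComponents (recipe_id INTEGER, ingredient_id INTEGER);
    CREATE TABLE userCabinet (user_id INTEGER, ingredient_id INTEGER);

    INSERT INTO recipes VALUES (1, 'pancakes'), (2, 'omelette'), (3, 'toast');
    -- ingredients: 1 = flour, 2 = egg, 3 = milk, 4 = bread
    INSERT INTO recipeComponents VALUES
        (1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 4);
    -- user 10 has flour, egg and milk; user 20 has only egg and bread
    INSERT INTO userCabinet VALUES (10, 1), (10, 2), (10, 3), (20, 2), (20, 4);
""")

# A recipe qualifies when no required ingredient is missing
# from this particular user's cabinet.
QUERY = """
    SELECT r.name
    FROM recipes r
    WHERE NOT EXISTS (
        SELECT 1
        FROM recipeComponents rc
        WHERE rc.recipe_id = r.id
          AND NOT EXISTS (
              SELECT 1
              FROM userCabinet uc
              WHERE uc.user_id = ?
                AND uc.ingredient_id = rc.ingredient_id
          )
    )
    ORDER BY r.name
"""

def makeable_recipes(user_id):
    """Return the names of the recipes the given user can fully make."""
    return [name for (name,) in conn.execute(QUERY, (user_id,))]

print(makeable_recipes(10))  # ['omelette', 'pancakes']
print(makeable_recipes(20))  # ['toast']
```

Note that this form also returns a recipe that has no ingredients at all, since nothing can be missing from it; the join-based variants above would not return such a recipe.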

select distinct u.user_id, r.recipe_id
from recipeComponents r
left join userCabinet u on r.ingredient_id = u.ingredient_id
where recipe_id not in (
select recipe_id
from recipeComponents r
left join userCabinet u on r.ingredient_id = u.ingredient_id
where u.user_id is null
)

Related

Substitute "OR EXISTS" in MySQL query so I can get better performance results

This query is taking forever to finish in MySQL 8. After some research I found out that EXISTS can be extremely slow in some queries.
When I remove the "OR EXISTS" subquery, it runs in less than a second.
So I need a replacement for the "OR EXISTS" in this query that still returns all the users I need:
SELECT u.name,
u.email,
u.cpf,
u.register,
r.name AS role_name,
s.name AS sector_name,
b.name AS branch_name,
u.status
FROM users u
INNER JOIN roles r ON r.id = u.role_id
INNER JOIN sectors s ON s.id = u.sector_id
INNER JOIN branches b ON b.id = u.branch_id
WHERE u.status = 2 OR EXISTS (
SELECT *
FROM user_recovery ur
WHERE ur.user_id = u.id
AND ur.status_recovery = 1
)
Is there a way to do it without the "OR EXISTS"?
An OR condition can force a full table scan, so try the UNION below.
You can't simply drop the EXISTS clause, because it adds rows to the result.
Add an index on users (status), on user_recovery (user_id, status_recovery), and on the columns used in the ON clauses.
SELECT u.name,
u.email,
u.cpf,
u.register,
r.name AS role_name,
s.name AS sector_name,
b.name AS branch_name,
u.status
FROM users u
INNER JOIN roles r ON r.id = u.role_id
INNER JOIN sectors s ON s.id = u.sector_id
INNER JOIN branches b ON b.id = u.branch_id
WHERE u.status = 2
UNION
SELECT u.name,
u.email,
u.cpf,
u.register,
r.name AS role_name,
s.name AS sector_name,
b.name AS branch_name,
u.status
FROM users u
INNER JOIN roles r ON r.id = u.role_id
INNER JOIN sectors s ON s.id = u.sector_id
INNER JOIN branches b ON b.id = u.branch_id
WHERE EXISTS (
SELECT 1
FROM user_recovery ur
WHERE ur.user_id = u.id
AND ur.status_recovery = 1
)
"I'll see your UNION; and raise you a derived table."
SELECT u.name,
u.email,
u.cpf,
u.register,
r.name AS role_name,
s.name AS sector_name,
b.name AS branch_name,
u.status
FROM ( SELECT id
FROM users
WHERE status = 2
UNION DISTINCT -- or UNION ALL; see below
SELECT user_id
FROM user_recovery
WHERE status_recovery = 1 -- see new index
) AS u1
JOIN users AS u USING(id) -- self-join to pick up other columns
JOIN roles r ON r.id = u.role_id
JOIN sectors s ON s.id = u.sector_id
JOIN branches b ON b.id = u.branch_id;
Indexes:
user_recovery: INDEX(status_recovery, user_id) -- in this order
users: INDEX(status, id) -- in this order
(I assume `id` is the PRIMARY KEY in each table)
The general rule here is: when you have a bunch of JOINs, but a single table controls which rows are returned, and selecting those rows is messy or slow (e.g. UNION in this case, GROUP BY or LIMIT in other cases),
first optimize finding the ids (users.id aka user_id) on their own.
Then JOIN back to the original table (if needed), plus the other tables.
In doing all that, it became apparent that a new index for user_recovery might be beneficial.
(If UNION ALL won't produce any dups, switch to it for a little more speed.)
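A quick way to convince yourself that the derived-table form returns the same users as the OR EXISTS form is to run both on a tiny data set. The sketch below uses Python's sqlite3 module in place of MySQL (so plain UNION instead of UNION DISTINCT); the index names and sample rows are made up, and it says nothing about MySQL's actual timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, status INTEGER);
    CREATE TABLE user_recovery (user_id INTEGER, status_recovery INTEGER);
    CREATE INDEX idx_users_status ON users (status, id);
    CREATE INDEX idx_recovery ON user_recovery (status_recovery, user_id);

    INSERT INTO users VALUES
        (1, 'ana', 2), (2, 'bruno', 1), (3, 'carla', 1), (4, 'davi', 2);
    -- user 2 has two open recoveries; user 4 matches both conditions
    INSERT INTO user_recovery VALUES (2, 1), (2, 1), (3, 0), (4, 1);
""")

# Original shape: one pass over users with an OR'ed subquery.
or_exists = conn.execute("""
    SELECT u.id
    FROM users u
    WHERE u.status = 2
       OR EXISTS (SELECT 1 FROM user_recovery ur
                  WHERE ur.user_id = u.id AND ur.status_recovery = 1)
    ORDER BY u.id
""").fetchall()

# Derived-table shape: collect the ids first, then join back for the columns.
derived = conn.execute("""
    SELECT u.id
    FROM (SELECT id FROM users WHERE status = 2
          UNION
          SELECT user_id FROM user_recovery WHERE status_recovery = 1) AS u1
    JOIN users AS u ON u.id = u1.id
    ORDER BY u.id
""").fetchall()

# With UNION ALL the same ids can appear more than once.
union_all_rows = conn.execute("""
    SELECT COUNT(*)
    FROM (SELECT id FROM users WHERE status = 2
          UNION ALL
          SELECT user_id FROM user_recovery WHERE status_recovery = 1) AS u1
    JOIN users AS u ON u.id = u1.id
""").fetchone()[0]

print(or_exists)       # [(1,), (2,), (4,)]
print(derived)         # [(1,), (2,), (4,)]
print(union_all_rows)  # 5
```

In this sample UNION ALL yields five rows for three users, because user 2 has two matching recovery rows and user 4 satisfies both branches. That is the situation in which UNION ALL is not a safe substitute.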

MYSQL: Handling Multiple LEFT JOINS

I have a query with one LEFT JOIN that works fine. When I add a second LEFT JOIN to a table with multiple records per field in the first table, however, I am getting the product of the results in the two tables ie books x publishers returned. How can I prevent this from happening?
SELECT a.*, b.*, p.*, GROUP_CONCAT(b.id) AS `bids`
FROM authors `a`
LEFT JOIN books `b`
ON b.authorid = a.id
LEFT JOIN publishers `p`
on p.authorid = a.id
GROUP by a.id
EDIT:
Figured it out. The way to do this is to use subqueries as in this answer:
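If you LEFT JOIN several tables that each have multiple rows per row of the first table, the query returns every combination of those rows (a cartesian product per row of the first table), which inflates counts and concatenated lists. To avoid this and get only one copy of the fields you want, aggregate each table in its own subquery and join the results, as below. Hope this helps someone in the future.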
SELECT u.id
, u.account_balance
, g.grocery_visits
, f.fishmarket_visits
FROM users u
LEFT JOIN (
SELECT user_id, count(*) AS grocery_visits
FROM grocery
GROUP BY user_id
) g ON g.user_id = u.id
LEFT JOIN (
SELECT user_id, count(*) AS fishmarket_visits
FROM fishmarket
GROUP BY user_id
) f ON f.user_id = u.id
ORDER BY u.id;
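The row multiplication is easy to see on a few rows. The following sketch runs both shapes of the query with Python's sqlite3 module (standing in for MySQL); the table names match the query above, and the sample data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE grocery (user_id INTEGER);
    CREATE TABLE fishmarket (user_id INTEGER);
    INSERT INTO users VALUES (1), (2);
    -- user 1: 3 grocery visits and 2 fishmarket visits; user 2: none
    INSERT INTO grocery VALUES (1), (1), (1);
    INSERT INTO fishmarket VALUES (1), (1);
""")

# Joining both child tables directly pairs every grocery row with every
# fishmarket row of the same user, so both counts become 3 * 2 = 6.
naive = conn.execute("""
    SELECT u.id, COUNT(g.user_id), COUNT(f.user_id)
    FROM users u
    LEFT JOIN grocery g ON g.user_id = u.id
    LEFT JOIN fishmarket f ON f.user_id = u.id
    GROUP BY u.id
    ORDER BY u.id
""").fetchall()

# Aggregating each child table first leaves at most one row per user
# on each side, so nothing is multiplied.
pre_aggregated = conn.execute("""
    SELECT u.id, g.grocery_visits, f.fishmarket_visits
    FROM users u
    LEFT JOIN (SELECT user_id, COUNT(*) AS grocery_visits
               FROM grocery GROUP BY user_id) g ON g.user_id = u.id
    LEFT JOIN (SELECT user_id, COUNT(*) AS fishmarket_visits
               FROM fishmarket GROUP BY user_id) f ON f.user_id = u.id
    ORDER BY u.id
""").fetchall()

print(naive)           # [(1, 6, 6), (2, 0, 0)]
print(pre_aggregated)  # [(1, 3, 2), (2, None, None)]
```

Users without visits come back as NULL (None) in the pre-aggregated form; wrap the columns in COALESCE(..., 0) if zeros are preferred.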

SQL query improvement (extra column in table)

I have this query in SQL that I KNOW it is horribly written. Could you guys help me write it in a decent, normal person manner?
Thanks.
select distinct R.*, X.LIKED
from Recipe R
left join (select distinct R.* , '1' as LIKED
from Recipe R, Likes L
where R.id = L.idRecipe
and L.email = 'dvader#deathstar.galacticempire') X
on R.id = X.id
It looks like you need every row from Recipe, with a marker on the ones liked by dvader#deathstar.galacticempire:
select R.*, likedR.LIKED
from Recipe R
left join (select distinct R.id , '1' as LIKED
from Recipe R
inner join Likes L on R.id = L.idRecipe
where
L.email = 'dvader#deathstar.galacticempire') likedR
on R.id = likedR.id
Thanks everyone for your help.
I was able to do what I wanted with this query:
select distinct R.*, X.LIKED, U.imgUrl
from User U, Recipe R
left join
(select distinct R.* , '1' as LIKED
from Recipe R, Likes L
where R.id = L.idRecipe and L.email = 'dvader#deathstar.ge') X
on R.id = X.id
where R.email = U.email
This brings all the info I need into one result set, plus one extra column that holds 1 if dvader has an entry in the Likes table and NULL otherwise, using joins.

SQL COUNT in related table of 2nd order for selecting data for a forum index

I'm having a problem with a query that fetches the data for a classic forum index with all its information, something like phpBB.
My tables look like this:
categories:
category varchar(50) -> primary key
forums:
id int -> primary key
name varchar(255)
description text
category varchar(50) -> foreign key to categories
topics:
id int -> primary key
forum_id int -> foreign key to forums
subject varchar(255)
posts:
id int -> primary key
topic_id int -> foreign key to topics
user_id int -> foreign key to users
post text
create_date datetime
modify_date timestamp, on update current_timestamp
users:
id int -> primary key
username varchar(32)
password varchar(32)
And that part is easy enough.
Then I began building the query, and it got really complex (in my world) pretty fast.
I would like to get:
categories:
forums:
name,
description,
count(topics)
count(posts)
last_post user_id
last_post username
last_post create_date
I ended up with a working query looking like this:
SELECT
f.id as fid,
f.name as name,
f.description as description,
f.category as category,
( SELECT COUNT(*)
FROM forum_topics
WHERE forum_id = f.id
) as topics,
( SELECT COUNT(*)
FROM forum_posts fp
WHERE fp.topic_id IN (
SELECT id
FROM forum_topics
WHERE forum_id = f.id
)
) as posts,
lp.user_id as lp_userid,
u.username as lp_username,
lp.create_date as lp_date
FROM forums f
LEFT OUTER JOIN (
SELECT p.create_date, p.user_id, t.forum_id
FROM forum_topics t
INNER JOIN forum_posts p ON ( t.id = p.topic_id )
ORDER BY p.create_date DESC
) lp ON (lp.forum_id = f.id)
LEFT OUTER JOIN users u ON ( u.id = lp.user_id )
GROUP BY category, f.order
It's fine; it works, but it performs very badly.
So I was wondering whether some of you clever folks here
could give me some advice on how to optimize the query,
maybe by adding indexes in some smart places, or by restructuring the schema in a smarter way.
// Thank you very much in advance
The basic query joins all the tables together along their natural relationships. This gets you everything except for the last post.
The following query uses standard SQL and should work in both MySQL and SQL Server (except for typos).
SELECT
f.category,
f.id,
f.name,
f.description,
count(distinct t.id) AS topics,
count(distinct p.id) AS posts,
min(lastuser.id) AS last_post_user_id,
min(lastuser.username) AS last_post_username,
min(lastuser.create_date) AS last_post_create_date
FROM posts p
JOIN users u ON p.user_id = u.id
JOIN topics t ON p.topic_id = t.id
JOIN forums f ON t.forum_id = f.id
JOIN (SELECT
t.forum_id,
u.id,
u.username,
p.create_date
FROM posts p
JOIN topics t ON p.topic_id = t.id
JOIN users u ON p.user_id = u.id
JOIN (SELECT
t.forum_id, max(p.id) AS max_postid
FROM posts p
JOIN topics t ON p.topic_id = t.id
GROUP BY t.forum_id
) lastpost ON p.id = lastpost.max_postid
AND t.forum_id = lastpost.forum_id
) lastuser on lastuser.forum_id = f.id
GROUP BY f.category, f.id, f.name, f.description
It gets the last user through another set of joins: the innermost subquery finds the highest post id per forum, and the next level looks up that post's user and date. The query assumes that post ids are assigned monotonically, so the most recent post has the highest post id.
There are other approaches. In particular, SQL Server supports window functions (as does MySQL from version 8.0), which would simplify the query.
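The "highest post id per forum" idea can be sketched in a more compact form that aggregates once per forum and then joins back to the last post. This is an illustration, not the query above: it runs on Python's sqlite3 module instead of MySQL, uses the table names from the question's schema, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE forums (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE topics (id INTEGER PRIMARY KEY, forum_id INTEGER);
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, topic_id INTEGER,
                        user_id INTEGER, create_date TEXT);

    INSERT INTO forums VALUES (1, 'General'), (2, 'Help');
    INSERT INTO topics VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO posts VALUES
        (1, 1, 1, '2024-01-01'), (2, 1, 2, '2024-01-02'),
        (3, 2, 1, '2024-01-03'), (4, 3, 2, '2024-01-04');
""")

# One aggregate per forum yields the counts and the id of the newest post
# (assuming ids grow with time); the outer joins fetch that post's details.
forum_index = conn.execute("""
    SELECT f.name, s.topics, s.posts, u.username, p.create_date
    FROM forums f
    JOIN (SELECT t.forum_id,
                 COUNT(DISTINCT t.id) AS topics,
                 COUNT(p.id) AS posts,
                 MAX(p.id) AS last_post_id
          FROM topics t
          JOIN posts p ON p.topic_id = t.id
          GROUP BY t.forum_id) s ON s.forum_id = f.id
    JOIN posts p ON p.id = s.last_post_id
    JOIN users u ON u.id = p.user_id
    ORDER BY f.id
""").fetchall()

for row in forum_index:
    print(row)
# ('General', 2, 3, 'alice', '2024-01-03')
# ('Help', 1, 1, 'bob', '2024-01-04')
```

Because of the inner joins, forums without any posts drop out of the result; switching the outer joins to LEFT JOIN would keep them with NULLs in the aggregate and last-post columns.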

MySQL query optimization

Just wondering what's a better way to write this query. Cheers.
SELECT r.user_id AS ID, m.prenom, m.nom
FROM `0_rank` AS l
LEFT JOIN `0_right` AS r ON r.rank_id = l.id
LEFT JOIN `0_user` AS m ON r.user_id = m.id
WHERE r.section_id = $section_id
AND l.rank = '$rank_name' AND depart_id IN
(SELECT depart_id FROM 0_depart WHERE user_id = $user_id AND section_id = $section_id)
GROUP BY r.user_id
Here are the table structures:
0_rank: id | section_id | rank_name | other_stuffs
0_user: id | prenom | nom | other_stuffs
0_right: id | section_id | user_id | rank_id | other_stuffs
0_depart: id | section_id | user_id | depart_id | other_stuffs
The idea is to use the same in a function like:
public function usergroup($section_id,$rank_name,$user_id) {
// mysql query goes here to get a list of appropriate users
}
Update: I think I have not been able to express myself clearly earlier. Here is the most recent query that seems to be working.
SELECT m.id, m.prenom, m.nom,
CAST( GROUP_CONCAT( DISTINCT d.depart ) AS char ) AS deps,
CAST( GROUP_CONCAT( DISTINCT x.depart ) AS char ) AS depx
FROM `0_rank` AS l
LEFT JOIN `0_right` AS r ON r.rank_id = l.id
LEFT JOIN `0_member` AS m ON r.user_id = m.id
LEFT JOIN `0_depart` AS d ON m.id = d.user_id
LEFT JOIN `0_depart` AS x ON x.user_id = $user_id
WHERE r.section = $section_id
AND l.rank = '$rank_name'
GROUP BY r.user_id ORDER BY prenom, nom
Now I want to get only those results where all entries of deps are present in depx.
In other words, every user is associated with some departs, and $user_id is also a user associated with some departs.
I want to get those users whose departs are all among the departs of $user_id.
Cheers.
Update
I'm not sure without being able to see the data but I believe this query will give you the results you want the fastest.
SELECT m.id, m.prenom, m.nom,
CAST( GROUP_CONCAT( DISTINCT d.depart ) AS char ) AS deps
FROM `0_rank` AS l
LEFT JOIN `0_right` AS r ON r.rank_id = l.id and r.user_id = $user_id
LEFT JOIN `0_member` AS m ON r.user_id = m.id
LEFT JOIN `0_depart` AS d ON m.id = d.user_id
WHERE r.section = $section_id
AND l.rank = '$rank_name'
GROUP BY r.user_id ORDER BY prenom, nom
Let me know if this works.
Try this:
(By converting the functionality of the IN (SELECT...) to an inner join, you get exactly the same results but it might be the optimizer will make better choices.)
SELECT r.user_id AS ID, m.prenom, m.nom
FROM `0_rank` AS l
LEFT JOIN `0_right` AS r ON r.rank_id = l.id and r.section_id = 2
LEFT JOIN `0_user` AS m ON r.user_id = m.id
INNER JOIN `0_depart` AS x ON l.section_id = x.section_id and x.user_id = $user_id AND x.section_id = $section_id
WHERE l.rank = 'mod'
GROUP BY r.user_id
I also moved the constraints on 0_right to the join statement because I think that is clearer -- presumably this change won't matter to the optimizer.
I know nothing about your DB structure but your subselect looks like it can be replaced with a simple INNER JOIN against whatever table has the depart column. MySQL is well known for its poor subquery optimization.
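One caveat when swapping IN (SELECT ...) for an INNER JOIN: the two only return the same rows when the joined side has at most one match per outer row, otherwise the join repeats rows and needs a GROUP BY or DISTINCT. A small sketch, using Python's sqlite3 module in place of MySQL and hypothetical members/departs tables rather than the real schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified tables for illustration only.
    CREATE TABLE members (id INTEGER PRIMARY KEY, depart INTEGER);
    CREATE TABLE departs (user_id INTEGER, depart INTEGER);
    INSERT INTO members VALUES (1, 10), (2, 20), (3, 30);
    -- user 2 is listed twice for depart 10 (e.g. once per section)
    INSERT INTO departs VALUES (2, 10), (2, 10), (2, 20);
""")

with_in = conn.execute("""
    SELECT m.id FROM members m
    WHERE m.depart IN (SELECT depart FROM departs WHERE user_id = 2)
    ORDER BY m.id
""").fetchall()

# The plain join repeats member 1, once per matching departs row.
with_join = conn.execute("""
    SELECT m.id FROM members m
    JOIN departs d ON d.depart = m.depart AND d.user_id = 2
    ORDER BY m.id
""").fetchall()

# Grouping (or DISTINCT) collapses the repeats again.
with_join_grouped = conn.execute("""
    SELECT m.id FROM members m
    JOIN departs d ON d.depart = m.depart AND d.user_id = 2
    GROUP BY m.id
    ORDER BY m.id
""").fetchall()

print(with_in)            # [(1,), (2,)]
print(with_join)          # [(1,), (1,), (2,)]
print(with_join_grouped)  # [(1,), (2,)]
```

The queries in this thread already end in GROUP BY r.user_id, which is what keeps the join rewrite from returning repeated users.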
Without knowing the structures or indexes, I would first add "STRAIGHT_JOIN" if the critical criteria are in fact on the 0_rank table. Then, ensure 0_rank has an index on "rank". Next, ensure 0_right has an index on rank_id at a minimum, or better on (rank_id, section) to take advantage of BOTH your criteria. Also ensure 0_member has an index on id.
Additionally, do you really mean a left join (i.e. a record is only required in 0_rank or 0_member) to the respective 0_right and 0_member tables, instead of a normal join (where BOTH tables must match on their IDs)?
Finally, ensure index on the depart table on user_id.
SELECT STRAIGHT_JOIN
r.user_id AS ID,
m.prenom,
m.nom
FROM
0_rank AS l
LEFT JOIN `0_right` AS r
ON l.id = r.rank_id
AND r.section = 2
LEFT JOIN `0_member` AS m
ON r.user_id = m.id
WHERE
l.rank = 'mod'
AND depart IN (SELECT depart
FROM 0_depart
WHERE user_id = 2
AND user_sec = 2)
GROUP BY
r.user_id
---- revised post from feedback.
From the parameters you are listing, you are always including the user ID... If so, I would completely restructure the query to get whatever info exists for that user. Each user can apparently be associated with multiple departments and may or may NOT match the given rank / department / section you are looking for... I would START the query with the ONE USER because THAT will guarantee a single entry, THEN narrow down to the other elements...
select STRAIGHT_JOIN
u.id,
u.prenom,
u.nom,
u.other_stuffs,
rank.rank_name
from
0_user u
left join 0_right r
on u.id = r.user_id
AND r.section_id = $section_id
join 0_rank rank
on r.rank_id = rank.id
AND rank.rank_name = '$rank_name'
left join 0_dept dept
on u.id = dept.user_id
where
u.id = $user_id
Additionally, I have concerns about your table relationships and don't see a legitimate join to the department table...
0_user
0_right by User_ID
0_rank by right.rank_id
0_dept has section which could join to rank or right, but nothing to user_id directly
Run EXPLAIN on the query; it will help you find where the bottlenecks are:
EXPLAIN SELECT r.user_id AS ID, m.prenom, m.nom
FROM 0_rank AS l
LEFT JOIN `0_right` AS r ON r.rank_id = l.id
LEFT JOIN `0_member` AS m ON r.user_id = m.id
WHERE r.section = 2
AND l.rank = 'mod' AND depart IN
(SELECT depart FROM 0_depart WHERE user_id = 2 AND user_sec = 2)
GROUP BY r.user_id\G
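To get a feel for what a query plan tells you, here is a rough sketch of how a plan changes once an index exists. It uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module as a stand-in; MySQL's EXPLAIN output looks different (it reports columns such as type, key and rows), and the ranks table, the idx_rank_name index and the plan helper are hypothetical names for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; only the filtered column matters for the demonstration.
conn.execute(
    "CREATE TABLE ranks (id INTEGER PRIMARY KEY, section_id INTEGER, rank_name TEXT)"
)

def plan(sql, params=()):
    """Return the textual plan steps (last column of each plan row)."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]

sql = "SELECT * FROM ranks WHERE rank_name = ?"

before = plan(sql, ("mod",))   # no usable index yet: a full scan of ranks
conn.execute("CREATE INDEX idx_rank_name ON ranks (rank_name)")
after = plan(sql, ("mod",))    # the plan now searches via idx_rank_name

print(before)
print(after)
```

The exact wording of the plan steps varies between SQLite versions, so the script prints them rather than assuming a fixed text. The equivalent signal in MySQL is a non-NULL key column and a low rows estimate for the table in question.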