Converting NOT IN with WHERE to join - mysql

I am trying to fetch all records that are not related to a certain member.
registration is an intersection entity that resolves the many-to-many relationship between members and lessons.
SELECT * FROM lesson
INNER JOIN training
ON training.id = lesson.training_id
WHERE lesson.id
NOT IN (SELECT registration.lesson_id FROM registration WHERE registration.member_id = 42)
I have been trying to convert this code to use a join statement, but I just can't get it to work.
SELECT l.*
FROM t_left l
LEFT JOIN
t_right r
ON r.value = l.value
WHERE r.value IS NULL
This technique doesn't work, since it returns only the lesson records that have no relationship at all; I need it to check whether a lesson has no relationship to a specific member.
entity relationships model

You want to turn your NOT IN into an anti join? Why? I find NOT IN much more readable.
Anyway:
SELECT *
FROM lesson l
INNER JOIN training t ON t.id = l.training_id
LEFT JOIN registration r ON r.lesson_id = l.id AND r.member_id = 42
WHERE r.id IS NULL;

I think that you just need to add a join condition on the member that you are looking for:
SELECT l.*
FROM t_left l
LEFT JOIN t_right r ON r.value = l.value and r.member_id = 42
WHERE r.value IS NULL
Or, starting from your other query:
SELECT l.*, t.*
FROM lesson l
INNER JOIN training t ON t.id = l.training_id
LEFT JOIN registration r ON r.lesson_id = l.id AND r.member_id = 42
WHERE r.lesson_id IS NULL

The conversion from NOT IN or NOT EXISTS uses a LEFT JOIN plus a comparison that checks whether the JOIN failed to find a match.
That would be:
SELECT *
FROM lesson l INNER JOIN
training t
ON t.id = l.training_id LEFT JOIN
registration r
ON r.lesson_id = l.id AND r.member_id = 42
WHERE r.lesson_id IS NULL;
Very important: The comparison on member_id needs to go in the ON clause, not the WHERE clause.
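To see why the placement of the member_id comparison matters, here is a minimal sketch that runs the three variants side by side. It uses Python's sqlite3 module with an in-memory database as a stand-in for MySQL, and the rows are invented for illustration; only the table and column names come from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the data below is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lesson (id INTEGER PRIMARY KEY);
    CREATE TABLE registration (
        id INTEGER PRIMARY KEY, lesson_id INTEGER, member_id INTEGER);
    INSERT INTO lesson (id) VALUES (1), (2), (3);
    -- member 42 takes lesson 1; member 7 takes lessons 1 and 2;
    -- nobody is registered for lesson 3
    INSERT INTO registration (lesson_id, member_id)
    VALUES (1, 42), (1, 7), (2, 7);
""")

def lesson_ids(sql):
    """Run a query and return the sorted lesson ids it yields."""
    return sorted(row[0] for row in conn.execute(sql))

# The original NOT IN form.
not_in = lesson_ids("""
    SELECT l.id FROM lesson l
    WHERE l.id NOT IN (SELECT r.lesson_id FROM registration r
                       WHERE r.member_id = 42)
""")

# Anti-join with the member filter in the ON clause.
on_clause = lesson_ids("""
    SELECT l.id FROM lesson l
    LEFT JOIN registration r ON r.lesson_id = l.id AND r.member_id = 42
    WHERE r.lesson_id IS NULL
""")

# Member filter moved to WHERE: the two conditions can never both hold.
where_clause = lesson_ids("""
    SELECT l.id FROM lesson l
    LEFT JOIN registration r ON r.lesson_id = l.id
    WHERE r.member_id = 42 AND r.lesson_id IS NULL
""")

# No member filter at all: only lessons without any registration.
no_filter = lesson_ids("""
    SELECT l.id FROM lesson l
    LEFT JOIN registration r ON r.lesson_id = l.id
    WHERE r.lesson_id IS NULL
""")

print(not_in, on_clause, where_clause, no_filter)
```

With this data the NOT IN form and the ON-clause form agree, the WHERE-clause form returns nothing, and the unfiltered form only finds the lesson nobody has registered for.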

Related

How do I replace EXISTS and NOT EXISTS with JOIN in SQL so I can translate it to relational algebra?

I want to replace EXISTS and NOT EXISTS in the following query:
SELECT pokemon_name FROM sinnohdex s
WHERE NOT EXISTS
(SELECT * FROM hoenndex h
WHERE s.id = h.id)
AND EXISTS
(SELECT * FROM johtodex j
WHERE j.id = s.id);
This is what I've got so far:
SELECT pokemon_name FROM sinnohdex s
LEFT JOIN hoenndex h ON s.id = h.id
INNER JOIN johtodex j ON j.id = s.id
WHERE h.id IS NULL;
My goal is to later "translate" the query into relational algebra. I don't know if this is even accurate. The idea is to write the query using only joins. Is this right?
I would suggest using the LEFT JOIN and INNER JOIN as follows:
SELECT distinct pokemon_name
FROM sinnohdex s JOIN johtodex j ON j.id = s.id
LEFT JOIN hoenndex h ON s.id = h.id
WHERE h.id is null
Close, but not quite. I would suggest:
SELECT s.pokemon_name
FROM sinnohdex s JOIN
(SELECT DISTINCT j.id
FROM johtodex j
) j
ON j.id = s.id LEFT JOIN
hoenndex h
ON s.id = h.id
WHERE h.id IS NULL;
The SELECT DISTINCT is needed in the subquery to ensure that the JOIN does not produce duplicates.
Removing duplicates before joining is usually more performant than removing duplicates after joining (although that might depend on the data).
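The duplicate problem can be reproduced with a small sketch. It assumes that id is not unique in johtodex (the question does not say either way), uses invented rows, and runs on an in-memory SQLite database through Python's sqlite3 module rather than on MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sinnohdex (id INTEGER, pokemon_name TEXT);
    CREATE TABLE johtodex (id INTEGER);
    CREATE TABLE hoenndex (id INTEGER);
    -- invented rows, for illustration only
    INSERT INTO sinnohdex VALUES (1, 'Pikachu'), (2, 'Zubat'), (3, 'Bidoof');
    -- assumption: id 1 appears twice in johtodex
    INSERT INTO johtodex VALUES (1), (1), (2);
    INSERT INTO hoenndex VALUES (2);
""")

def names(sql):
    """Run a query and return the sorted list of names it yields."""
    return sorted(row[0] for row in conn.execute(sql))

# Reference result: the EXISTS / NOT EXISTS form from the question.
with_exists = names("""
    SELECT pokemon_name FROM sinnohdex s
    WHERE NOT EXISTS (SELECT * FROM hoenndex h WHERE s.id = h.id)
      AND EXISTS (SELECT * FROM johtodex j WHERE j.id = s.id)
""")

# Plain joins: the duplicated johtodex row multiplies the output.
plain_join = names("""
    SELECT s.pokemon_name FROM sinnohdex s
    JOIN johtodex j ON j.id = s.id
    LEFT JOIN hoenndex h ON s.id = h.id
    WHERE h.id IS NULL
""")

# De-duplicating johtodex before the join restores the EXISTS semantics.
distinct_first = names("""
    SELECT s.pokemon_name FROM sinnohdex s
    JOIN (SELECT DISTINCT j.id FROM johtodex j) j ON j.id = s.id
    LEFT JOIN hoenndex h ON s.id = h.id
    WHERE h.id IS NULL
""")

print(with_exists, plain_join, distinct_first)
```

If id were declared unique in johtodex, the plain join would already match the EXISTS version and the DISTINCT subquery would be redundant.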

Count matched words from IN operator

I have this little MySQL query:
select t.title FROM title t
inner join movie_keyword mk on mk.movie_id = t.id
inner join keyword k on k.id = mk.keyword_id
where k.keyword IN (
select k.keyword
FROM title t
inner join movie_keyword mk on mk.movie_id = t.id
inner join keyword k on k.id = mk.keyword_id
where t.id = 166282
)
LIMIT 15
As you can see, it returns all titles from title that share at least one keyword with the movie whose id is 166282.
Now I have a problem: I also want to count how many keywords were matched by the IN operator (say I want to see only titles that share 3 or more keywords). I tried a few things with aggregate functions, but everything failed, so I came here with my problem. Maybe somebody can give me some advice or a code example.
I'm also not sure whether this subquery approach is a good one, so if there is a better way to solve my problem, I am open to any suggestions or tips.
Thank you!
Edit
After solving some problems, I have one more. This is my current query:
SELECT s.title,s.vote,s.rating,count(dk.key) as keywordCnt, count(dg.name) as genreCnt
FROM series s
INNER JOIN series_has_genre shg ON shg.series_id = s.id
INNER JOIN dict_genre dg ON dg.id = shg.dict_genre_id
INNER JOIN series_has_keyword shk ON shk.series_id = s.id
INNER JOIN dict_keyword dk ON dk.id = shk.dict_keyword_id
WHERE dk.key IN (
SELECT dki.key FROM series si
INNER JOIN series_has_keyword shki ON shki.series_id = si.id
INNER JOIN dict_keyword dki ON dki.id = shki.dict_keyword_id
WHERE si.title LIKE 'The Wire'
)
and dg.name IN (
SELECT dgo.name FROM series so
INNER JOIN series_has_genre shgo ON shgo.series_id = so.id
INNER JOIN dict_genre dgo ON dgo.id = shgo.dict_genre_id
WHERE so.title LIKE 'The Wire'
)
and s.production_year > 2000
GROUP BY s.title
ORDER BY s.vote DESC, keywordCnt DESC ,s.rating DESC, genreCnt DESC
LIMIT 5
The problem is that it is very, very slow. Any tips on what I should change to make it run faster?
Will this work for you:
select t.title, count(k.keyword) as keywordCount FROM title t
inner join movie_keyword mk on mk.movie_id = t.id
inner join keyword k on k.id = mk.keyword_id
where k.keyword IN (
select ki.keyword
FROM title ti
inner join movie_keyword mki on mki.movie_id = ti.id
inner join keyword ki on ki.id = mki.keyword_id
where ti.id = 166282
) group by t.title
LIMIT 15
Note that I have changed the table names inside the nested query to avoid confusion.
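The question also asks for only those titles that share 3 or more keywords. One way to express that, sketched here under some assumptions, is a HAVING clause on the grouped count. The rows are invented, each (movie_id, keyword_id) pair is assumed to appear at most once in movie_keyword, excluding the reference movie itself and grouping by t.id as well as t.title are my additions, and the code runs on in-memory SQLite via Python's sqlite3 module rather than on MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE title (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE keyword (id INTEGER PRIMARY KEY, keyword TEXT);
    CREATE TABLE movie_keyword (movie_id INTEGER, keyword_id INTEGER);
    -- invented rows: movie 1 is the reference movie
    INSERT INTO title VALUES
        (1, 'Reference'), (2, 'Close match'), (3, 'Weak match');
    INSERT INTO keyword VALUES
        (1, 'heist'), (2, 'noir'), (3, 'chase'), (4, 'space');
    INSERT INTO movie_keyword VALUES
        (1, 1), (1, 2), (1, 3),
        (2, 1), (2, 2), (2, 3), (2, 4),
        (3, 1), (3, 4);
""")

REFERENCE_ID = 1  # hypothetical stand-in for 166282
MIN_SHARED = 3    # keep titles sharing at least this many keywords

rows = conn.execute("""
    SELECT t.title, COUNT(k.keyword) AS keyword_count
    FROM title t
    JOIN movie_keyword mk ON mk.movie_id = t.id
    JOIN keyword k ON k.id = mk.keyword_id
    WHERE k.keyword IN (
        SELECT ki.keyword
        FROM title ti
        JOIN movie_keyword mki ON mki.movie_id = ti.id
        JOIN keyword ki ON ki.id = mki.keyword_id
        WHERE ti.id = ?
    )
      AND t.id <> ?                   -- leave out the reference movie
    GROUP BY t.id, t.title
    HAVING COUNT(k.keyword) >= ?      -- filter on the per-title count
    ORDER BY keyword_count DESC
""", (REFERENCE_ID, REFERENCE_ID, MIN_SHARED)).fetchall()

print(rows)
```

HAVING is evaluated after grouping, which is why the threshold on the count goes there and not in the WHERE clause.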

MySQL query with and without WHERE condition at once

How can I select both values at once? For example, I have a Lesson that have Students and each Student is linked to a Client, so what I want to achieve is something like:
SELECT l.id,
l.value * clientStudents/totalStudents as total
FROM Lesson l
JOIN lesson_student ls ON l.id = ls.lesson_id
JOIN Student s ON ls.student_id = s.id
JOIN Client c ON s.client_id = c.id
WHERE c.id = <SOME_CLIENT>
where clientStudents is the count with the WHERE clause applied and totalStudents is the count without it.
You can move the condition into the calculation itself. Something like:
SELECT l.id,
l.value * SUM(IF(c.id = <SOME_CLIENT>, 1, 0)) / COUNT(*) AS total
FROM Lesson l
JOIN lesson_student ls ON l.id = ls.lesson_id
JOIN Student s ON ls.student_id = s.id
JOIN Client c ON s.client_id = c.id
GROUP BY l.id, l.value
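The idea of moving the client condition into the aggregate can be sketched end to end as follows. The rows are invented, the Client table is left out because Student.client_id already carries the client id, and the example runs on in-memory SQLite through Python's sqlite3 module. The extra `* 1.0` is only there because SQLite would otherwise use integer division; MySQL's `/` operator does not need it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Lesson (id INTEGER PRIMARY KEY, value INTEGER);
    CREATE TABLE Student (id INTEGER PRIMARY KEY, client_id INTEGER);
    CREATE TABLE lesson_student (lesson_id INTEGER, student_id INTEGER);
    -- invented rows for illustration
    INSERT INTO Lesson VALUES (1, 100), (2, 60), (3, 50);
    INSERT INTO Student VALUES (1, 10), (2, 10), (3, 10), (4, 20), (5, 10);
    INSERT INTO lesson_student VALUES
        (1, 1), (1, 2), (1, 3), (1, 4),   -- 3 of 4 students from client 10
        (2, 4), (2, 5),                   -- 1 of 2 students from client 10
        (3, 4);                           -- 0 of 1 students from client 10
""")

SOME_CLIENT = 10  # hypothetical client id

rows = conn.execute("""
    SELECT l.id,
           -- numerator counts only this client's students,
           -- COUNT(*) counts every student in the lesson
           l.value * 1.0
             * SUM(CASE WHEN s.client_id = ? THEN 1 ELSE 0 END)
             / COUNT(*) AS total
    FROM Lesson l
    JOIN lesson_student ls ON l.id = ls.lesson_id
    JOIN Student s ON ls.student_id = s.id
    GROUP BY l.id, l.value
    ORDER BY l.id
""", (SOME_CLIENT,)).fetchall()

print(rows)
```

Because the condition lives inside the SUM and not in a WHERE clause, lessons where the client has no students still show up, with a total of 0.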

Rows missing from mysql pivot query results

I have a MySQL query, shown below. It returns exactly the results I want for one row, but it doesn't return any of the other rows; I expect 8 rows with my test data (there are 8 unique test ids). I was inspired by this answer but obviously messed up my implementation. Does anyone see where I'm going wrong?
SELECT
c.first_name,
c.last_name,
n.test_name,
e.doc_name,
e.email,
e.lab_id,
a.test_id,
a.date_req,
a.date_approved,
a.accepts_terms,
a.res_value,
a.reason,
a.test_type,
a.date_collected,
a.date_received,
k.kind_name,
sum(case when metabolite_name = "Creatinine" then t.res_val end) as Creatinine,
sum(case when metabolite_name = "Glucose" then t.res_val end) as Glucose,
sum(case when metabolite_name = "pH" then t.res_val end) as pH
FROM test_requisitions AS a
INNER JOIN personal_info AS c ON (a.user_id = c.user_id)
INNER JOIN test_types AS d ON (a.test_type = d.test_type)
INNER JOIN kinds AS k ON (k.id = d.kind_id)
INNER JOIN test_names AS n ON (d.name_id = n.id)
INNER JOIN docs AS e ON (a.doc_id = e.id)
INNER JOIN test_metabolites AS t ON (t.test_id = a.test_id)
RIGHT JOIN metabolites AS m ON (m.id = t.metabolite_id)
GROUP BY a.test_id
ORDER BY (a.date_approved IS NOT NULL),(a.res_value IS NOT NULL), a.date_req, c.last_name ASC;
Most of your joins are inner joins. The last is a right outer join. As written, the query keeps all the metabolites, but not necessarily all the tests.
I would suggest that you change them all to left outer joins, because you want to keep all the rows in the first table:
FROM test_requisitions AS a
LEFT JOIN personal_info AS c ON (a.user_id = c.user_id)
LEFT JOIN test_types AS d ON (a.test_type = d.test_type)
LEFT JOIN kinds AS k ON (k.id = d.kind_id)
LEFT JOIN test_names AS n ON (d.name_id = n.id)
LEFT JOIN docs AS e ON (a.doc_id = e.id)
LEFT JOIN test_metabolites AS t ON (t.test_id = a.test_id)
LEFT JOIN metabolites AS m ON (m.id = t.metabolite_id)
I would also suggest that your aliases be related to the table names, so tr for test_requisitions, pi for personal_info, and so on.
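The effect of the join type on the pivot can be checked with a reduced sketch. It keeps only three of the tables, uses invented rows, and compares INNER JOIN against LEFT JOIN (not the original RIGHT JOIN) on an in-memory SQLite database through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test_requisitions (test_id INTEGER PRIMARY KEY);
    CREATE TABLE metabolites (id INTEGER PRIMARY KEY, metabolite_name TEXT);
    CREATE TABLE test_metabolites (
        test_id INTEGER, metabolite_id INTEGER, res_val REAL);
    -- invented rows; test 3 has no results recorded yet
    INSERT INTO test_requisitions VALUES (1), (2), (3);
    INSERT INTO metabolites VALUES (1, 'Creatinine'), (2, 'Glucose'), (3, 'pH');
    INSERT INTO test_metabolites VALUES (1, 1, 1.5), (1, 2, 5.5), (2, 3, 7.0);
""")

# Same pivot query, parameterised only by the join keyword.
PIVOT = """
    SELECT a.test_id,
           SUM(CASE WHEN m.metabolite_name = 'Creatinine' THEN t.res_val END)
               AS Creatinine,
           SUM(CASE WHEN m.metabolite_name = 'Glucose' THEN t.res_val END)
               AS Glucose,
           SUM(CASE WHEN m.metabolite_name = 'pH' THEN t.res_val END)
               AS pH
    FROM test_requisitions AS a
    {join} test_metabolites AS t ON t.test_id = a.test_id
    {join} metabolites AS m ON m.id = t.metabolite_id
    GROUP BY a.test_id
    ORDER BY a.test_id
"""

inner_rows = conn.execute(PIVOT.format(join="INNER JOIN")).fetchall()
left_rows = conn.execute(PIVOT.format(join="LEFT JOIN")).fetchall()

print(inner_rows)
print(left_rows)
```

With inner joins the test without results disappears; with left joins it is kept and its pivot columns are NULL (None in Python).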

MySQL query optimization

Just wondering what's a better way to write this query. Cheers.
SELECT r.user_id AS ID, m.prenom, m.nom
FROM `0_rank` AS l
LEFT JOIN `0_right` AS r ON r.rank_id = l.id
LEFT JOIN `0_user` AS m ON r.user_id = m.id
WHERE r.section_id = $section_id
AND l.rank = '$rank_name' AND depart_id IN
(SELECT depart_id FROM 0_depart WHERE user_id = $user_id AND section_id = $section_id)
GROUP BY r.user_id
Here are the table structures:
0_rank: id | section_id | rank_name | other_stuffs
0_user: id | prenom | nom | other_stuffs
0_right: id | section_id | user_id | rank_id | other_stuffs
0_depart: id | section_id | user_id | depart_id | other_stuffs
The idea is to use the same in a function like:
public function usergroup($section_id,$rank_name,$user_id) {
// mysql query goes here to get a list of appropriate users
}
Update: I think I have not been able to express myself clearly earlier. Here is the most recent query that seems to be working.
SELECT m.id, m.prenom, m.nom,
CAST( GROUP_CONCAT( DISTINCT d.depart ) AS char ) AS deps,
CAST( GROUP_CONCAT( DISTINCT x.depart ) AS char ) AS depx
FROM `0_rank` AS l
LEFT JOIN `0_right` AS r ON r.rank_id = l.id
LEFT JOIN `0_member` AS m ON r.user_id = m.id
LEFT JOIN `0_depart` AS d ON m.id = d.user_id
LEFT JOIN `0_depart` AS x ON x.user_id = $user_id
WHERE r.section = $section_id
AND l.rank = '$rank_name'
GROUP BY r.user_id ORDER BY prenom, nom
Now I want to get only those results where all entries of deps are also present in depx.
In other words, every user is associated with some departs, and $user_id is also a user associated with some departs.
I want to get those users whose departs are in common with the departs of $user_id.
Cheers.
Update
I'm not sure without being able to see the data but I believe this query will give you the results you want the fastest.
SELECT m.id, m.prenom, m.nom,
CAST( GROUP_CONCAT( DISTINCT d.depart ) AS char ) AS deps
FROM `0_rank` AS l
LEFT JOIN `0_right` AS r ON r.rank_id = l.id and r.user_id = $user_id
LEFT JOIN `0_member` AS m ON r.user_id = m.id
LEFT JOIN `0_depart` AS d ON m.id = d.user_id
WHERE r.section = $section_id
AND l.rank = '$rank_name'
GROUP BY r.user_id ORDER BY prenom, nom
Let me know if this works.
Try this:
(Converting the IN (SELECT...) into an inner join gives the same results as long as the joined rows are unique, and the optimizer may make better choices.)
SELECT r.user_id AS ID, m.prenom, m.nom
FROM `0_rank` AS l
LEFT JOIN `0_right` AS r ON r.rank_id = l.id and r.section_id = 2
LEFT JOIN `0_user` AS m ON r.user_id = m.id
INNER JOIN `0_depart` AS x ON l.section_id = x.section_id and x.user_id = $user_id AND x.section_id = $section_id
WHERE l.rank = 'mod'
GROUP BY r.user_id
I also moved the constraints on 0_right to the join statement because I think that is clearer -- presumably this change won't matter to the optimizer.
I know nothing about your DB structure, but your subselect looks like it can be replaced with a simple INNER JOIN against whatever table has the depart column. MySQL, particularly in versions before 5.6, is well known for poor optimization of IN subqueries.
Without knowing the structures or indexes, I would first add STRAIGHT_JOIN if the critical criterion is in fact on the 0_rank table. Then ensure 0_rank has an index on rank. Next, ensure 0_right has an index on rank_id at a minimum, or better on (rank_id, section) to take advantage of both of your criteria. Also index 0_member on id.
Additionally, do you really mean a left join (i.e. a record is only required in 0_rank) to the 0_right and 0_member tables, rather than a normal join (where both tables must match on their IDs)?
Finally, ensure index on the depart table on user_id.
SELECT STRAIGHT_JOIN
r.user_id AS ID,
m.prenom,
m.nom
FROM
0_rank AS l
LEFT JOIN `0_right` AS r
ON l.id = r.rank_id
AND r.section = 2
LEFT JOIN `0_member` AS m
ON r.user_id = m.id
WHERE
l.rank = 'mod'
AND depart IN (SELECT depart
FROM 0_depart
WHERE user_id = 2
AND user_sec = 2)
GROUP BY
r.user_id
---- revised post from feedback.
From the parameters you are listing, you are always including the user ID... If so, I would completely restructure the query to get whatever info exists for that user. Each user can apparently be associated with multiple departments and may or may NOT match the given rank / department / section you are looking for... I would START the query with the ONE USER because THAT will guarantee a single entry, THEN narrow down to the other elements...
select STRAIGHT_JOIN
u.id,
u.prenom,
u.nom,
u.other_stuffs,
rk.rank_name
from
0_user u
left join 0_right r
on u.id = r.user_id
AND r.section_id = $section_id
-- alias is rk, not rank: RANK is a reserved word as of MySQL 8.0.2
join 0_rank rk
on r.rank_id = rk.id
AND rk.rank_name = '$rank_name'
left join 0_dept dept
on u.id = dept.user_id
where
u.id = $user_id
Additionally, I have concern about your table relationships and don't see a legit join to the department table...
0_user
0_right by User_ID
0_rank by right.rank_id
0_dept has section which could join to rank or right, but nothing to user_id directly
Run EXPLAIN on the query; it will help you find where the bottlenecks are:
EXPLAIN SELECT r.user_id AS ID, m.prenom, m.nom
FROM 0_rank AS l
LEFT JOIN `0_right` AS r ON r.rank_id = l.id
LEFT JOIN `0_member` AS m ON r.user_id = m.id
WHERE r.section = 2
AND l.rank = 'mod' AND depart IN
(SELECT depart FROM 0_depart WHERE user_id = 2 AND user_sec = 2)
GROUP BY r.user_id\G