Join: three tables and an OR condition - mysql

I think I should know this somehow, especially after reading a lot of questions and answers regarding "The condition must go into the ON clause, not in the WHERE clause". However, I am still lost.
I have three tables, and I join them normally with LEFT (OUTER) joins. The joined tables look like this (pretty standard):
task_id  task_questions_taskId  task_questions_questionId  question_id
1        1                      5                          5
1        1                      8                          8
2        2                      8                          8
SELECT `t`.`id` AS `task_id` ,
`task_questions`.`taskId` AS `task_questions_taskId` ,
`task_questions`.`questionId` AS `task_questions_questionId` ,
questions.id AS question_id
FROM `task` `t`
LEFT OUTER JOIN `task_questions` `task_questions`
ON ( `task_questions`.`taskId` = `t`.`id` )
LEFT OUTER JOIN `question` `questions`
ON ( `task_questions`.`questionId` = `questions`.`id` )
This is the standard query to get all the records. (It's taken from Yii; I actually want to do this with Active Record, but I can't even get the plain SQL right.)
And now I want to get ONLY those tasks that have both question_id 2 AND 8 (for example).
So if a task does not have both of those question ids, I don't want it in the result set.
In this case, the task could have other question_ids, too. Although it would be interesting to see how the query would look if it should return only those that have exactly those 2 (or any other set).
It's easy to get all the tasks that have one question, with WHERE question.id = 2,
but an AND in the WHERE clause leads to an empty result.

The WHERE clause can only apply conditions to one row at a time. But your questions of different id occur on different rows. How to solve this? Join both rows onto one row using a self-join.
Here's an example:
SELECT t.`id` AS `task_id`, ...
FROM `task` AS t
INNER JOIN `task_questions` AS tq2 ON ( tq2.`taskId` = t.`id` )
INNER JOIN `question` AS q2 ON ( tq2.`questionId` = q2.`id` )
INNER JOIN `task_questions` AS tq8 ON ( tq8.`taskId` = t.`id` )
INNER JOIN `question` AS q8 ON ( tq8.`questionId` = q8.`id` )
WHERE q2.`id` = 2 AND q8.`id` = 8
Another solution is to find the tasks that have questions 2 OR 8, and then use GROUP BY and HAVING to filter by groups that have exactly two of those.
SELECT t.`id` AS `task_id`, ...
FROM `task` AS t
INNER JOIN `task_questions` AS tq ON ( tq.`taskId` = t.`id` )
INNER JOIN `question` AS q ON ( tq.`questionId` = q.`id` )
WHERE tq.`questionId` IN (2, 8)
GROUP BY t.`id`
HAVING COUNT(DISTINCT q.`id`) = 2
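The GROUP BY / HAVING idea can be tried out with a small sketch. It runs against an in-memory SQLite database rather than MySQL (an assumption made only so the example is self-contained), the sample rows are made up, and the second query is one possible way to answer the "exactly those two" variant from the question, not the only one.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task (id INTEGER PRIMARY KEY);
    CREATE TABLE question (id INTEGER PRIMARY KEY);
    CREATE TABLE task_questions (taskId INTEGER, questionId INTEGER);
    INSERT INTO task VALUES (1), (2), (3);
    INSERT INTO question VALUES (2), (5), (8);
    -- task 1 has questions 2, 5, 8; task 2 has only 8; task 3 has exactly 2 and 8
    INSERT INTO task_questions VALUES (1, 2), (1, 5), (1, 8), (2, 8), (3, 2), (3, 8);
""")

# Tasks that have at least questions 2 and 8 (other questions allowed).
at_least = [row[0] for row in conn.execute("""
    SELECT tq.taskId
    FROM task_questions AS tq
    WHERE tq.questionId IN (2, 8)
    GROUP BY tq.taskId
    HAVING COUNT(DISTINCT tq.questionId) = 2
    ORDER BY tq.taskId
""")]

# Tasks that have exactly questions 2 and 8 and nothing else:
# no WHERE filter, and the HAVING clause also rejects any other question id.
exactly = [row[0] for row in conn.execute("""
    SELECT tq.taskId
    FROM task_questions AS tq
    GROUP BY tq.taskId
    HAVING COUNT(DISTINCT tq.questionId) = 2
       AND SUM(tq.questionId NOT IN (2, 8)) = 0
    ORDER BY tq.taskId
""")]

print(at_least)  # [1, 3]
print(exactly)   # [3]
```

The number in the HAVING clause has to match the number of ids in the list, so for three required questions it would be 3.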

You can do this even without using AND:
... where question.id IN (2,8)

Use IN:
SELECT `t`.`id` AS `task_id` ,
`task_questions`.`taskId` AS `task_questions_taskId` ,
`task_questions`.`questionId` AS `task_questions_questionId` ,
questions.id AS question_id
FROM `task` `t`
LEFT OUTER JOIN `task_questions` `task_questions`
ON ( `task_questions`.`taskId` = `t`.`id`)
LEFT OUTER JOIN `question` `questions`
ON ( `task_questions`.`questionId` = `questions`.`id` )
WHERE `task_questions`.`questionId` IN (2, 8)

This should do it
SELECT `t`.`id` AS `task_id` ,
`task_questions`.`taskId` AS `task_questions_taskId` ,
`task_questions`.`questionId` AS `task_questions_questionId` ,
questions.id AS question_id
FROM `task` `t`
LEFT OUTER JOIN `task_questions` `task_questions`
ON ( `task_questions`.`taskId` = `t`.`id` )
LEFT OUTER JOIN `question` `questions`
ON ( `task_questions`.`questionId` = `questions`.`id` )
WHERE questions.id in (2,8)

You're not looking for AND, you're looking for OR, or an IN:
WHERE `questions`.`id` IN (2,8) -- grab everything in the parens.
Or
WHERE `questions`.`id` = 2 OR -- grab each item individually
`questions`.`id` = 8
If you use AND that would mean the ID would have to be 8 and 2 at the same time. Bad deal.
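A small sketch of how IN and AND behave on a single column, using an in-memory SQLite database as a stand-in for MySQL (an assumption for self-containment; the rows are made up). Note that IN also returns a task that has only one of the two questions, so a grouping step is still needed when both are required.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task_questions (taskId INTEGER, questionId INTEGER);
    -- task 1 has questions 2 and 8; task 2 has only question 8
    INSERT INTO task_questions VALUES (1, 2), (1, 8), (2, 8);
""")

# IN (2, 8) is shorthand for "= 2 OR = 8": a row matches if it has either id.
either = [row[0] for row in conn.execute(
    "SELECT DISTINCT taskId FROM task_questions "
    "WHERE questionId IN (2, 8) ORDER BY taskId"
)]

# AND on the same column can never hold for a single row, so nothing comes back.
both_on_one_row = conn.execute(
    "SELECT taskId FROM task_questions WHERE questionId = 2 AND questionId = 8"
).fetchall()

print(either)           # [1, 2]
print(both_on_one_row)  # []
```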

Related

WHERE clause, on joined table, with multiple rows

I have an incidents table which has a 1 to many relationship with a few tables - mainly, for the context of this question, people.
Basically, one incident may have many people (involved).
At the moment, I'm retrieving the incident details - plus a concatenated comma-delimited string of people's IDs using this query:
SELECT
i.`ID` AS `id`,
i.`Author_ID` AS `author_id`,
i.`Description` AS `description`,
i.`Date` AS `date`,
i.`Datetime_Created` AS `created`,
p.`Title` AS `period`,
GROUP_CONCAT(DISTINCT ip.`Person_ID` ORDER BY FIELD(ip.`Involvement`, 'V', 'P', 'W') ASC SEPARATOR ',') AS `people_ids`,
( SELECT COUNT(`ID`) FROM `reports` r WHERE r.Incident_ID = i.ID ) AS `reports`,
i.`Status` AS `status`
FROM `incidents` i
LEFT JOIN `reports` ir ON ir.Incident_ID = i.ID
LEFT JOIN `people` ip ON ip.Incident_ID = i.ID
LEFT JOIN `periods` p ON i.Period_ID = p.ID
WHERE 1 NOT IN ( SELECT Category_ID FROM `categories_link` WHERE `Incident_ID` = i.ID )
GROUP BY i.ID
ORDER BY i.`Date` DESC, p.`ID` DESC
This works fine, and produces data like:
What I'm trying to do now is filter these reports so that only incidents where one of the people involved is a student from a certain year group are returned.
This information can be found by joining their IDs to the students table. The students table contains their ID and a Year_Group field.
One of the complexities is that some of the IDs from the people_involved table may not relate just to students - they could be staff, parents or other members of our community.
I don't want to exclude reports which have other people involved, as long as there is a student from a specific year group involved too.
I've written a query which seems to partially work:
SELECT
i.`ID` AS `id`,
i.`Author_ID` AS `author_id`,
i.`Description` AS `description`,
i.`Date` AS `date`,
i.`Datetime_Created` AS `created`,
p.`Title` AS `period`,
GROUP_CONCAT(DISTINCT ip.`Person_ID` ORDER BY FIELD(ip.`Involvement`, 'V', 'P', 'W') ASC SEPARATOR ',') AS `people_ids`,
( SELECT COUNT(`ID`) FROM `reports` r WHERE r.Incident_ID = i.ID ) AS `reports`,
i.`Status` AS `status`
FROM `incidents` i
LEFT JOIN `reports` ir ON ir.Incident_ID = i.ID
LEFT JOIN `people` ip ON ip.Incident_ID = i.ID
<< LEFT JOIN `student` stu ON ip.Person_ID = stu.db_id >>
LEFT JOIN `periods` p ON i.Period_ID = p.ID
WHERE 1 NOT IN ( SELECT Category_ID FROM `categories_link` WHERE `Incident_ID` = i.ID )
<< AND `stu`.`Year_Group` = 11 >>
GROUP BY i.ID
ORDER BY i.`Date` DESC, p.`ID` DESC
But I just can't imagine that a single simple JOIN would be sufficient for the task I'm trying to achieve.
I think a subquery might do it, but I don't know where to begin with that.
The code I would use to access this information (for year 7 students) without all of the necessary incidents data would be (I think):
SELECT DISTINCT( p.`Incident_ID` )
FROM `people` p
LEFT JOIN `student` stu ON p.Person_ID = stu.db_id
WHERE stu.Year_Group = 7
How do I bundle that into this code?
To get the incidents in which a student from a specific year group is involved, use the following query.
SELECT p.Incident_ID
FROM people p
JOIN student stu ON p.Person_ID = stu.db_id
WHERE stu.Year_Group = 11
group by p.Incident_ID
Your original query returns the incidents and the group of people involved, so filter the incidents in your original query by comparing them against the query above. This way you will get all incidents in which a student from the specific year group is involved, together with any other people involved. I think this will solve your problem.
SELECT
i.`ID` AS `id`,
i.`Author_ID` AS `author_id`,
i.`Description` AS `description`,
i.`Date` AS `date`,
i.`Datetime_Created` AS `created`,
p.`Title` AS `period`,
GROUP_CONCAT(DISTINCT ip.`Person_ID` ORDER BY FIELD(ip.`Involvement`, 'V', 'P', 'W') ASC SEPARATOR ',') AS `people_ids`,
( SELECT COUNT(`ID`) FROM `reports` r WHERE r.Incident_ID = i.ID ) AS `reports`,
i.`Status` AS `status`
FROM `incidents` i
LEFT JOIN `reports` ir ON ir.Incident_ID = i.ID
LEFT JOIN `people` ip ON ip.Incident_ID = i.ID
LEFT JOIN `periods` p ON i.Period_ID = p.ID
WHERE 1 NOT IN ( SELECT Category_ID FROM `categories_link` WHERE `Incident_ID` = i.ID )
AND i.ID IN -- the query above goes here
(
SELECT p.Incident_ID
FROM people p
JOIN student stu ON p.Person_ID = stu.db_id
WHERE stu.Year_Group = 11
group by p.Incident_ID
)
GROUP BY i.ID
ORDER BY i.`Date` DESC, p.`ID` DESC
It looks to me like you want an OUTER JOIN on students.
LEFT OUTER JOIN `student` stu ON ip.Person_ID = stu.db_id
That will include all the incidents. Then, in the WHERE clause, add the filter
WHERE 1 NOT IN ( SELECT Category_ID FROM `categories_link` WHERE `Incident_ID` = i.ID ) AND `stu`.`Year_Group` = 7
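The two approaches above differ in what ends up in the concatenated people list. The sketch below compares them on made-up rows, using an in-memory SQLite database in place of MySQL (an assumption so the example runs on its own) and a reduced set of columns; person 30 stands for a member of staff with no row in student.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE incidents (ID INTEGER PRIMARY KEY);
    CREATE TABLE people (Incident_ID INTEGER, Person_ID INTEGER);
    CREATE TABLE student (db_id INTEGER PRIMARY KEY, Year_Group INTEGER);
    INSERT INTO incidents VALUES (1), (2);
    -- person 10 is a year 11 student, person 20 a year 7 student,
    -- person 30 is staff (no row in student)
    INSERT INTO student VALUES (10, 11), (20, 7);
    INSERT INTO people VALUES (1, 10), (1, 30), (2, 20);
""")

# Filtering on the joined student row removes the non-matching people
# before GROUP_CONCAT runs, so the staff member disappears from the list.
join_filter = conn.execute("""
    SELECT i.ID, GROUP_CONCAT(ip.Person_ID) AS people_ids
    FROM incidents i
    LEFT JOIN people ip ON ip.Incident_ID = i.ID
    LEFT JOIN student stu ON ip.Person_ID = stu.db_id
    WHERE stu.Year_Group = 11
    GROUP BY i.ID
""").fetchall()

# Filtering the incident id with a subquery keeps every person involved.
subquery_filter = conn.execute("""
    SELECT i.ID, GROUP_CONCAT(ip.Person_ID) AS people_ids
    FROM incidents i
    LEFT JOIN people ip ON ip.Incident_ID = i.ID
    WHERE i.ID IN (
        SELECT p.Incident_ID
        FROM people p
        JOIN student stu ON p.Person_ID = stu.db_id
        WHERE stu.Year_Group = 11
    )
    GROUP BY i.ID
""").fetchall()

# Normalise the concatenated ids, since their order is not guaranteed here.
join_people = {row[0]: sorted(row[1].split(",")) for row in join_filter}
subquery_people = {row[0]: sorted(row[1].split(",")) for row in subquery_filter}

print(join_people)      # {1: ['10']}
print(subquery_people)  # {1: ['10', '30']}
```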

LEFT JOIN of only the latest row from a many-to-one

Been banging my head against the wall and cannot solve this :\
SELECT
`people`.*,
`students`.*,
`student_class_relationships`.*,
`geo_checkin_on_campus`.`datetime_created` as checkin_time
FROM `student_class_relationships`
LEFT OUTER JOIN `students`
ON `student_class_relationships`.`student` = `students`.`id`
LEFT OUTER JOIN `people`
ON `students`.`student` = `people`.`id`
LEFT OUTER JOIN `geo_checkin_on_campus`
ON `students`.`id` = (
SELECT MIN(`geo_checkin_on_campus`.`student`)
FROM `geo_checkin_on_campus`
WHERE `geo_checkin_on_campus`.`student` = `students`.`id`
)
WHERE `class` = 56
The expected result is many rows that have only one entry per students.id.
Here is my schema
It is not the best query from performance perspective,
but just to fix your query here is my attempt:
SELECT
`people`.*,
`students`.*,
`student_class_relationships`.*,
geoCheckinOnCampus.datetimeCreated as checkin_time
FROM `student_class_relationships`
LEFT JOIN `students`
ON `student_class_relationships`.`student` = `students`.`id`
LEFT JOIN `people`
ON `students`.`student` = `people`.`id`
LEFT JOIN
(
SELECT
student,
MAX(datetime_created) datetimeCreated
FROM `geo_checkin_on_campus`
GROUP BY `student`
) geoCheckinOnCampus
ON `students`.`id` = geoCheckinOnCampus.`student`
WHERE `class` = 56
Note: following @xQbert's answer, I would change MIN to MAX if you are looking for the latest datetime.
If I assume you want the most recent check-in (and not the earliest created date) for each student in geo_checkin_on_campus, then...
SELECT
`people`.*,
`students`.*,
`student_class_relationships`.*,
B.`datetime_Updated` as checkin_time
FROM `student_class_relationships`
LEFT OUTER JOIN `students`
ON `student_class_relationships`.`student` = `students`.`id`
LEFT OUTER JOIN `people`
ON `students`.`student` = `people`.`id`
LEFT OUTER JOIN (
SELECT MAX(datetime_updated) AS datetime_updated, student
FROM `geo_checkin_on_campus`
group by student
) B
ON `students`.`id` = B.Student
WHERE `class` = 56
NOTE: This is a probable answer. I will edit / modify this according to comments from OP.
This basically does nothing; it is equivalent to just joining on the student id:
LEFT OUTER JOIN `geo_checkin_on_campus`
ON `students`.`id` = (
SELECT MIN(`geo_checkin_on_campus`.`student`)
FROM `geo_checkin_on_campus`
WHERE `geo_checkin_on_campus`.`student` = `students`.`id`
)
If you want the min (or earliest) datetime_created use something like
LEFT OUTER JOIN (
SELECT `geo_checkin_on_campus`.`student` student,
MIN(`geo_checkin_on_campus`.`datetime_created`) dt
FROM `geo_checkin_on_campus`
GROUP BY `geo_checkin_on_campus`.`student`
) t
ON `students`.`id` = t.student
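The derived-table pattern from the answers (aggregate first, then LEFT JOIN the one-row-per-student result) can be sketched as follows. It uses an in-memory SQLite database instead of MySQL and only two of the tables; both are assumptions for a self-contained example, and the timestamps are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY);
    CREATE TABLE geo_checkin_on_campus (student INTEGER, datetime_created TEXT);
    INSERT INTO students VALUES (1), (2), (3);
    -- student 1 checked in twice, student 2 once, student 3 never
    INSERT INTO geo_checkin_on_campus VALUES
        (1, '2013-05-01 09:00:00'),
        (1, '2013-05-02 08:30:00'),
        (2, '2013-05-01 10:15:00');
""")

# The derived table has exactly one row per student, so the LEFT JOIN
# cannot multiply rows; students without a check-in get NULL.
rows = conn.execute("""
    SELECT s.id, latest.checkin_time
    FROM students s
    LEFT JOIN (
        SELECT student, MAX(datetime_created) AS checkin_time
        FROM geo_checkin_on_campus
        GROUP BY student
    ) latest ON latest.student = s.id
    ORDER BY s.id
""").fetchall()

for row in rows:
    print(row)
# (1, '2013-05-02 08:30:00')
# (2, '2013-05-01 10:15:00')
# (3, None)
```

Swapping MAX for MIN gives the earliest check-in instead of the latest.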

MYSQL GROUP BY - slow query

I have the query below; please help me find a proper index for it. The contacts table has more than 20K records and the query takes nearly 20 seconds to load.
I suspect the GROUP BY clause causes the problem; if I remove the GROUP BY clause, the total number of rows is more than 300k.
SELECT `a`.*, CONCAT(a.`firstname`, " ", a.`lastname`) AS `cont_name`, CONCAT(a.`position`, " / ", a.`company`) AS `comp_pos`, `e`.`name` AS `industry_name`, CONCAT(f.`firstname`, " ", f.`lastname`) AS `created_by`
FROM `contacts` AS `a`
LEFT JOIN `user_centres` AS `b` ON a.user_id = b.user_id
LEFT JOIN `group_contacts` AS `c` ON a.id = c.contact_id
LEFT JOIN `groups` AS `d` ON c.group_id = d.id
LEFT JOIN `industries` AS `e` ON e.id = a.industry_id
LEFT JOIN `users` AS `f` ON f.id = a.user_id
WHERE (1)
GROUP BY `a`.`id`
ORDER BY `a`.`created` desc
EXPLAIN shows this: 20145 Using temporary; Using filesort
You can try these steps:
The groups table ( d ) is not used, so you can remove the LEFT JOIN to groups from this query.
Add indexes on user_id, contact_id and industry_id in the contacts table ( I assume these ids are the primary keys of the joined tables ).
Check that the types ( INT ) of user_id, contact_id and industry_id are the same in the tables being joined.
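The jump from about 20K rows to more than 300k without GROUP BY is what joins to one-to-many tables produce. A rough sketch of that effect, and of removing joins whose columns are never selected, is shown below. It uses an in-memory SQLite database in place of MySQL, made-up rows, and hypothetical index names, and it assumes no filter on the joined tables is later added to the WHERE (1) placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE user_centres (user_id INTEGER, centre_id INTEGER);
    CREATE TABLE group_contacts (contact_id INTEGER, group_id INTEGER);
    INSERT INTO contacts VALUES (1, 100), (2, 100);
    -- user 100 belongs to 3 centres; each contact is in 2 groups
    INSERT INTO user_centres VALUES (100, 1), (100, 2), (100, 3);
    INSERT INTO group_contacts VALUES (1, 1), (1, 2), (2, 1), (2, 2);
""")

# Two one-to-many joins multiply rows: 2 contacts * 3 centres * 2 groups.
joined = conn.execute("""
    SELECT COUNT(*)
    FROM contacts a
    LEFT JOIN user_centres b ON a.user_id = b.user_id
    LEFT JOIN group_contacts c ON a.id = c.contact_id
""").fetchone()[0]

# Without the joins whose columns are never selected there is one row
# per contact again, and no GROUP BY is needed to collapse duplicates.
plain = conn.execute("SELECT COUNT(*) FROM contacts").fetchone()[0]

# Indexes on the columns used in the join conditions (names are made up).
conn.execute("CREATE INDEX idx_contacts_user ON contacts (user_id)")
conn.execute("CREATE INDEX idx_user_centres_user ON user_centres (user_id)")
conn.execute("CREATE INDEX idx_group_contacts_contact ON group_contacts (contact_id)")

print(joined, plain)  # 12 2
```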

Unknown column in 'where clause'. Nested subquery to improve performance of 'ORDER BY'

I have a relationship that is a little odd, but it has to be this way:
SELECT COUNT(DISTINCT `t`.`id`)
FROM `radcliente` `t`
LEFT OUTER JOIN `radcliente_endereco_instalacao` `endereco_instalacao`
ON (`endereco_instalacao`.`cliente_id`=`t`.`id`)
LEFT OUTER JOIN `radcliente_telefone` `telefones`
ON (`telefones`.`cliente_id`=`t`.`id`)
LEFT OUTER JOIN `radcliente_email` `emails`
ON (`emails`.`cliente_id`=`t`.`id`)
LEFT OUTER JOIN `radmetodo_cobranca` `metodo_cobranca`
ON (`metodo_cobranca`.`cliente_id`=`t`.`id`) AND (metodo_cobranca.arquivo = 'nao')
LEFT OUTER JOIN `radacct` `ultimo_acct`
ON (`ultimo_acct`.`username`=`t`.`login`)
AND (ultimo_acct.radacctid = (
SELECT `radacctid` FROM
(SELECT radacctid
FROM `radacct` `fUzDDUDv`
WHERE username = t.login
) AS `fUzDDUDv` ORDER BY `radacctid` DESC LIMIT 1
)
)
WHERE (ultimo_acct.framedipaddress = '177.23.209.194')
Unknown column 't.login' in 'where clause'.
UPDATE:
Yes, that would solve one problem. I created this subquery because I'm using the Yii Framework, and its has_one relationships have no limit: if one 'cliente' has millions of 'acct' rows, the framework fetches all of them just to take one, which means several GB of traffic and is very slow. To solve that, I used a subquery that fetches only the newest id, so that I get the 'cliente' with its last 'acct', and it works in Active Record. Up to that point everything was fine, but the search was extremely slow: finding one record took 40 seconds. Then I discovered that the problem was the 'ORDER BY' (radacctid is indexed and the tables are InnoDB), so I moved the 'ORDER BY' outside the subquery. That solved the speed problem, but now the column of table 'cliente' (t.login) is not visible inside the subquery, as I explained above.
I also tried sorting by another field, e.g. 'acctstarttime', and it was still slow. It was only solved when I did it this way:
SELECT `radacctid` FROM
(SELECT radacctid
FROM `radacct` `fUzDDUDv`
WHERE username = t.login
) AS `fUzDDUDv` ORDER BY `radacctid` DESC LIMIT 1
UPDATE:
But the problem with INNER JOIN is that if there is no 'acct' row, the 'cliente' is not returned.
UPDATE
The problem is not the WHERE on t.login itself; it is that t.login is not recognized inside the nested subquery, and I cannot move the ORDER BY back in without the query becoming slow.
UPDATE
Did you read my comments? That is the situation. It really only happens on the production server, which has inserts and updates all the time.
I hope this one could help you in some way:
SELECT
COUNT(DISTINCT `t`.`id`)
FROM
`radcliente` `t`
LEFT OUTER JOIN `radcliente_endereco_instalacao` `endereco_instalacao`
ON (`endereco_instalacao`.`cliente_id`=`t`.`id`)
LEFT OUTER JOIN `radcliente_telefone` `telefones`
ON (`telefones`.`cliente_id`=`t`.`id`)
LEFT OUTER JOIN `radcliente_email` `emails`
ON (`emails`.`cliente_id`=`t`.`id`)
LEFT OUTER JOIN `radmetodo_cobranca` `metodo_cobranca`
ON (`metodo_cobranca`.`cliente_id`=`t`.`id`)
AND (metodo_cobranca.arquivo = 'nao')
INNER JOIN
(
SELECT `ultimo_acct_INNER`.*
FROM
`radacct` AS `ultimo_acct_INNER`
INNER JOIN
(
SELECT `fUzDDUDv`.`username` AS `maxID_username`, MAX(`radacctid`) AS `maxID_radacctid`
FROM `radacct` `fUzDDUDv`
GROUP BY `fUzDDUDv`.`username`
) AS `radacct_MAX_ID`
ON `ultimo_acct_INNER`.`username`= `radacct_MAX_ID`.`maxID_username`
AND `ultimo_acct_INNER`.`radacctid` = `radacct_MAX_ID`.`maxID_radacctid`
) AS `ultimo_acct`
ON (`ultimo_acct`.`username`=`t`.`login`)
WHERE
(ultimo_acct.framedipaddress = '177.23.209.194')
I replaced section:
(...)
LEFT OUTER JOIN `radacct` `ultimo_acct`
ON (`ultimo_acct`.`username`=`t`.`login`)
AND (ultimo_acct.radacctid = (
SELECT `radacctid` FROM
(SELECT radacctid
FROM `radacct` `fUzDDUDv`
WHERE username = t.login
) AS `fUzDDUDv` ORDER BY `radacctid` DESC LIMIT 1
)
)
(...)
with this one:
(...)
INNER JOIN
(
SELECT `ultimo_acct_INNER`.*
FROM
`radacct` AS `ultimo_acct_INNER`
INNER JOIN
(
SELECT `fUzDDUDv`.`username` AS `maxID_username`, MAX(`radacctid`) AS `maxID_radacctid`
FROM `radacct` `fUzDDUDv`
GROUP BY `fUzDDUDv`.`username`
) AS `radacct_MAX_ID`
ON `ultimo_acct_INNER`.`username`= `radacct_MAX_ID`.`maxID_username`
AND `ultimo_acct_INNER`.`radacctid` = `radacct_MAX_ID`.`maxID_radacctid`
) AS `ultimo_acct`
ON (`ultimo_acct`.`username`=`t`.`login`)
(...)
Could you try the following modification (for optimizing performance) ?
SELECT COUNT(DISTINCT `t`.`id`)
FROM
(
SELECT DISTINCT `t_sub`.`id`
FROM
`radcliente` `t_sub`
LEFT OUTER JOIN `radacct` `ultimo_acct`
ON (`ultimo_acct`.`username`=`t_sub`.`login`)
AND (ultimo_acct.radacctid =
(SELECT `fUzDDUDv`.radacctid
FROM `radacct` `fUzDDUDv`
WHERE `fUzDDUDv`.username = `t_sub`.login
ORDER BY `fUzDDUDv`.`radacctid` DESC LIMIT 1))
WHERE (ultimo_acct.framedipaddress = '177.23.209.194')
) AS `t`
LEFT OUTER JOIN `radcliente_endereco_instalacao` `endereco_instalacao`
ON (`endereco_instalacao`.`cliente_id`=`t`.`id`)
LEFT OUTER JOIN `radcliente_telefone` `telefones`
ON (`telefones`.`cliente_id`=`t`.`id`)
LEFT OUTER JOIN `radcliente_email` `emails`
ON (`emails`.`cliente_id`=`t`.`id`)
LEFT OUTER JOIN `radmetodo_cobranca` `metodo_cobranca`
ON (`metodo_cobranca`.`cliente_id`=`t`.`id`) AND (metodo_cobranca.arquivo = 'nao')
It first joins the clients table to the 'big' table 'radacct' and preselects only the distinct set of matching client ids, then joins this (relatively) small set to the rest of the tables in the outermost query.
I think I understand now. I have no experience with the Yii framework, so for now (taking into account your last comment) I can't think of anything better than the following:
SELECT COUNT(DISTINCT `t`.`id`)
FROM `radcliente` `t`
LEFT OUTER JOIN `radcliente_endereco_instalacao` `endereco_instalacao`
ON (`endereco_instalacao`.`cliente_id`=`t`.`id`)
LEFT OUTER JOIN `radcliente_telefone` `telefones`
ON (`telefones`.`cliente_id`=`t`.`id`)
LEFT OUTER JOIN `radcliente_email` `emails`
ON (`emails`.`cliente_id`=`t`.`id`)
LEFT OUTER JOIN `radmetodo_cobranca` `metodo_cobranca`
ON (`metodo_cobranca`.`cliente_id`=`t`.`id`) AND (metodo_cobranca.arquivo = 'nao')
LEFT OUTER JOIN `radacct` `ultimo_acct`
ON (`ultimo_acct`.`username`=`t`.`login`)
AND (ultimo_acct.radacctid =
(SELECT `fUzDDUDv`.radacctid
FROM `radacct` `fUzDDUDv`
WHERE `fUzDDUDv`.username = t.login
ORDER BY `fUzDDUDv`.`radacctid` DESC LIMIT 1))
WHERE (ultimo_acct.framedipaddress = '177.23.209.194')
I hope someone can help with a more precise answer to your needs.
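A minimal sketch of the correlated-subquery form used in the last query, reduced to the two tables that matter. It runs against an in-memory SQLite database rather than MySQL (an assumption so it is self-contained), the rows are made up, and it says nothing about the speed on a large production table. Because the subquery sits directly in the ON clause, it can refer to t.login, and a client without any accounting row is still returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE radcliente (id INTEGER PRIMARY KEY, login TEXT);
    CREATE TABLE radacct (
        radacctid INTEGER PRIMARY KEY,
        username TEXT,
        framedipaddress TEXT
    );
    INSERT INTO radcliente VALUES (1, 'ana'), (2, 'bruno');
    -- 'ana' has two accounting rows, 'bruno' has none
    INSERT INTO radacct VALUES (1, 'ana', '10.0.0.1'), (2, 'ana', '10.0.0.2');
""")

# The scalar subquery is correlated on t.login and picks the newest
# radacctid per client; LEFT OUTER JOIN keeps clients with no match.
rows = conn.execute("""
    SELECT t.id, ultimo_acct.radacctid, ultimo_acct.framedipaddress
    FROM radcliente t
    LEFT OUTER JOIN radacct ultimo_acct
      ON ultimo_acct.username = t.login
     AND ultimo_acct.radacctid = (
           SELECT a.radacctid
           FROM radacct a
           WHERE a.username = t.login
           ORDER BY a.radacctid DESC
           LIMIT 1
         )
    ORDER BY t.id
""").fetchall()

for row in rows:
    print(row)
# (1, 2, '10.0.0.2')
# (2, None, None)
```

Adding a WHERE condition on a column of ultimo_acct, as the original query does, removes the clients that have no accounting row again.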

getting data from multiple tables in mysql

My goal is to get from the following tables - the user's unique group names and ids, the latest comments for the user's groups, the latest "done" article for the user's groups, the SUM of done articles, and total articles. Basically what is presented in the bottom sheet.
So far I've managed to get the data from the groups table and from the articles, but I can't get the latest comment.
Here is my query
SELECT `groups`.`name` , `groups`.`id` , (
SELECT MAX( `articles`.`written` )
FROM `articles`
WHERE `group` = `groups`.`id`
AND `articles`.`done` = '1'
) AS latestArt, (
SELECT MAX( `comments`.`date_added` )
FROM `comments`
WHERE `comments`.`article_id` = `a`.`id`
AND `comments`.`active` = '1'
) AS latestComm, SUM( `a`.`done` = '1' ) articlesAchieved, COUNT( `a`.`id` ) AS totalArticles
FROM `groups`
LEFT JOIN `articles` AS `a` ON `a`.`group` = `groups`.`id`
LEFT JOIN `comments` AS `c` ON `c`.`note_id` = `a`.`id`
WHERE `groups`.`user_id` = '6'
AND `n`.`active` = '1'
GROUP BY `groups`.`id`
I've also tried to get the data by joining everything to the article table but I wasn't successful with that either :(
UPDATED Your query might look like this
SELECT g.id group_id, g.name group_name,
a.last_written, a.total_articles, a.total_done,
c.last_comment
FROM groups g LEFT JOIN
(
SELECT `group`,
MAX(CASE WHEN done = 1 THEN written END) last_written,
COUNT(*) total_articles,
SUM(done) total_done
FROM articles
WHERE active = 1
AND user_id = 1
GROUP BY `group`
) a
ON g.id = a.`group` LEFT JOIN
(
SELECT a.`group`,
MAX(date_added) last_comment
FROM comments c JOIN articles a
ON c.article_id = a.id
WHERE a.active = 1
AND a.user_id = 1
GROUP BY a.`group`
) c
ON g.id = c.`group`
WHERE user_id = 1
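The article subquery relies on conditional aggregation: MAX over a CASE expression only looks at the done rows, while COUNT(*) and SUM(done) cover all rows of the group. A small sketch on made-up rows, using an in-memory SQLite database in place of MySQL; the column is called grp here instead of `group` purely to avoid quoting a reserved word, which is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (
        id INTEGER PRIMARY KEY,
        grp INTEGER,      -- stands in for the `group` column
        done INTEGER,
        written TEXT
    );
    INSERT INTO articles VALUES
        (1, 1, 1, '2013-01-10'),
        (2, 1, 0, '2013-02-20'),
        (3, 1, 1, '2013-01-15'),
        (4, 2, 0, '2013-03-01');
""")

# CASE yields NULL for articles that are not done, and MAX ignores NULLs,
# so last_written is the latest date among done articles only.
rows = conn.execute("""
    SELECT grp,
           MAX(CASE WHEN done = 1 THEN written END) AS last_written,
           COUNT(*) AS total_articles,
           SUM(done) AS total_done
    FROM articles
    GROUP BY grp
    ORDER BY grp
""").fetchall()

for row in rows:
    print(row)
# (1, '2013-01-15', 3, 2)
# (2, None, 1, 0)
```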