I'm trying to create a list of students whose behaviour is statistically worst across each of our school's year groups.
We have a table named students.
We then have behavioural flags and alerts, plus sanctions.
However, different categories of flag/alert/sanction are deemed more serious than others. These are stored with labels in their respective _categories table, e.g. flag_categories and sanction_categories. The flag table will then have a column called Category_ID (alerts is a bit different as it's just a Type field with 'A', 'C', 'P' and 'S' values).
If I want to look at data which shows our highest-flagged students in a specific year group, I'd run this query:
SELECT
CONCAT(stu.Firstname, " ", stu.Surname) AS `Student`,
COUNT(f.ID) AS `Flags`
FROM `students` stu
LEFT JOIN `flags` f ON f.Student_ID = stu.id
WHERE stu.Year_Group = 9
GROUP BY stu.id
ORDER BY `Flags` DESC
LIMIT 0, 20
If I wanted to show the students with the most flags of a single category (for example Category_ID = 10) in a specific year group, I'd run this query:
SELECT
CONCAT(stu.Firstname, " ", stu.Surname) AS `Student`,
COUNT(f.ID) AS `Flags`
FROM `students` stu
LEFT JOIN `flags` f ON f.Student_ID = stu.id
WHERE stu.Year_Group = 9
AND f.Category_ID = 10
GROUP BY stu.id
ORDER BY `Flags` DESC
LIMIT 0, 20
If I want to find how many Late or Mobile flags a student has, and perhaps add these together (with weightings), I can run the following query:
SELECT
CONCAT(stu.Firstname, " ", stu.Surname) AS `Student`,
SUM(CASE WHEN f.Category_ID = 10 THEN 1 ELSE 0 END) AS `Late Flags`,
SUM(CASE WHEN f.Category_ID = 12 THEN 2 ELSE 0 END) AS `Mobile Flags`,
## not sure about this line below... is there a nicer way of doing it? `Late Flags` isn't recognised as a field apparently
## so I can't just do ( `Late Flags` + `Mobile Flags` )
(SUM(CASE WHEN f.Category_ID = 10 THEN 1 ELSE 0 END) + SUM(CASE WHEN f.Category_ID = 12 THEN 2 ELSE 0 END)) AS `Points`
FROM `flags` f
LEFT JOIN `students` stu ON f.Student_ID = stu.id
WHERE stu.Year_Group = 9
GROUP BY stu.id
ORDER BY `Points` DESC
LIMIT 0, 20
What I don't understand is how I would do this across myriad tables. I need to be able to weight:
Late (flags, Category_ID = 10), Absconded (flags, Category_ID = 15) and Community flags (flags, Category_ID = 13) plus Safeguarding alerts (alerts, Type = 'S') are all worth 1 point
Behavioural flags (flags, Category_ID IN (1, 7, 8)) are worth 2 points
Process alerts (alerts, Type = 'P') and detention sanctions (sanctions, Category_ID = 1) are worth 3 points
So on and so forth. That's far from an exhaustive list but I've included enough variables to help me get my head round a multi-table weighted sum.
The outcome I'm looking for is just 2 columns - student's name and weighted points.
So, according to the bullet points above, if a student has received 2 Late flags (1 point each) and 1 Process alert (3 points), the output should just say Joe Bloggs and 5.
Can anyone help me to understand how I can get these weighted values from different tables into one SUM'd output for each student?
[edit] SQLFiddle here: http://sqlfiddle.com/#!9/449218/1/0
Note, I am not doing this for the bounty. Please give to someone else.
This could be done with a few LEFT JOINs of derived tables, but I used a temporary table instead. It allows maximum flexibility without building one large LEFT JOIN query that might be hard to debug. After all, you said your real querying will be much more complicated than this, so you can build out the temp table structure further as needed. Note that you did not supply the sanctions table, so it is left out here, but the example below should still be illustrative.
The stored procedure takes the student year as a parameter and loads a temp table with default 0s for the students in that year. Two updates are then performed, followed by a select that returns the result set.
Schema / Load:
create schema s38741386; -- create a test database
use s38741386;
CREATE TABLE `students` (
`id` int(11) PRIMARY KEY,
`Firstname` varchar(50) NOT NULL,
`Surname` varchar(50) NOT NULL,
`Year_Group` int(2) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
# STUDENT INSERTS
INSERT INTO `students`
(`id`, `Firstname`, `Surname`, `Year_Group`)
VALUES
(201, 'Student', 'A', 9),
(202, 'Student', 'B', 9),
(203, 'Student', 'C', 9),
(204, 'Student', 'D', 9),
(205, 'Student', 'E', 9);
CREATE TABLE `alert` (
`ID` int(11) PRIMARY KEY,
`Staff_ID` int(6) NOT NULL,
`Datetime_Raised` datetime NOT NULL,
`Room_Label` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
`Type` enum('A','C','P','Q','S') COLLATE utf8_unicode_ci NOT NULL COMMENT 'A=Absconded, C=Crisis, P=Process, Q=Quiet, S=Safeguarding',
`Details` text COLLATE utf8_unicode_ci,
`Responder` int(8) DEFAULT NULL,
`Datetime_Responded` datetime DEFAULT NULL,
`Room_ID` int(11) NOT NULL COMMENT 'will be linked to internal room id.',
`Status` varchar(1) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'O:ngoing, R:esolved'
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
# ALERT INSERTS
INSERT INTO `alert`
(`ID`, `Staff_ID`, `Datetime_Raised`, `Room_Label`, `Type`, `Details`, `Responder`, `Datetime_Responded`, `Room_ID`, `Status`)
VALUES
(1, '101', '2016-08-04 00:00:00', NULL, 'P', NULL, '103', '2016-08-04 00:00:01', '15', 'R'),
(2, '102', '2016-08-04 00:00:00', NULL, 'P', NULL, '103', '2016-08-04 00:00:01', '15', 'R'),
(3, '102', '2016-08-04 00:00:00', NULL, 'P', NULL, '103', '2016-08-04 00:00:01', '15', 'R'),
(4, '101', '2016-08-04 00:00:00', NULL, 'P', NULL, '103', '2016-08-04 00:00:01', '15', 'R');
CREATE TABLE `alert_students` (
`ID` int(11) PRIMARY KEY,
`Alert_ID` int(6) NOT NULL,
`Student_ID` int(6) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
# ALERT_STUDENT INSERTS
INSERT INTO `alert_students`
(`ID`, `Alert_ID`, `Student_ID`)
VALUES
(1, '1', '201'),
(2, '1', '202'),
(3, '2', '201'),
(4, '3', '202'),
(5, '4', '203'),
(6, '5', '204');
CREATE TABLE `flags` (
`ID` int(11) PRIMARY KEY,
`Staff_ID` int(11) NOT NULL,
`Student_ID` int(11) NOT NULL,
`Datetime` datetime NOT NULL,
`Category_ID` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
# FLAG INSERTS
-- TRUNCATE TABLE flags;
INSERT INTO `flags`
(`ID`, `Staff_ID`, `Student_ID`, `Datetime`, `Category_ID`)
VALUES
(1, '101', '201', '2016-08-04 00:00:01', 10),
(2, '102', '202', '2016-08-04 00:00:02', 12),
(3, '102', '203', '2016-08-04 00:00:03', 10),
(4, '101', '204', '2016-08-04 00:00:04', 13),
(5, '102', '202', '2016-08-04 00:00:02', 12),
(6, '102', '203', '2016-08-04 00:00:03', 10),
(7, '101', '204', '2016-08-04 00:00:04', 13),
(8, '102', '202', '2016-08-04 00:00:02', 10),
(9, '102', '203', '2016-08-04 00:00:03', 10),
(10, '101', '204', '2016-08-04 00:00:04', 7),
(11, '101', '204', '2016-08-04 00:00:07', 8),
(12, '101', '204', '2016-08-04 00:00:08', 1),
(13, '101', '204', '2016-08-04 00:00:09', 8);
Stored Procedure:
DROP PROCEDURE IF EXISTS rptSM_by_year;
DELIMITER $$
CREATE PROCEDURE rptSM_by_year
( pSY INT -- parameter student year
)
BEGIN
DROP TEMPORARY TABLE IF EXISTS tmpStudentMetrics;
CREATE TEMPORARY TABLE tmpStudentMetrics
( `StudentId` int(11) PRIMARY KEY,
LateFP INT NOT NULL,
MobiFP INT NOT NULL,
AbscFP INT NOT NULL,
CommFP INT NOT NULL,
SafeAP INT NOT NULL,
BehaFP INT NOT NULL,
ProcAP INT NOT NULL
)ENGINE=InnoDB;
INSERT tmpStudentMetrics (StudentId,LateFP,MobiFP,AbscFP,CommFP,SafeAP,BehaFP,ProcAP)
SELECT id,0,0,0,0,0,0,0
FROM students
WHERE Year_Group = pSY;
UPDATE tmpStudentMetrics tmp
JOIN
( SELECT
stu.id,
SUM(CASE WHEN f.Category_ID = 10 THEN 1 ELSE 0 END) AS `LateFP`,
SUM(CASE WHEN f.Category_ID = 15 THEN 1 ELSE 0 END) AS `AbscFP`,
SUM(CASE WHEN f.Category_ID = 13 THEN 1 ELSE 0 END) AS `CommFP`,
SUM(CASE WHEN f.Category_ID = 12 THEN 2 ELSE 0 END) AS `MobiFP`,
SUM(CASE WHEN f.Category_ID IN (1,7,8) THEN 2 ELSE 0 END) AS `BehaFP`
FROM `flags` f
LEFT JOIN `students` stu ON f.Student_ID = stu.id
WHERE stu.Year_Group = pSY
GROUP BY stu.id
) xDerived
ON xDerived.id=tmp.StudentId
SET tmp.LateFP=xDerived.LateFP,
tmp.AbscFP=xDerived.AbscFP,
tmp.CommFP=xDerived.CommFP,
tmp.MobiFP=xDerived.MobiFP,
tmp.BehaFP=xDerived.BehaFP;
UPDATE tmpStudentMetrics tmp
JOIN
( SELECT
stu.id,
SUM(CASE WHEN a.Type = 'S' THEN 1 ELSE 0 END) AS `SafeAP`,
SUM(CASE WHEN a.Type = 'P' THEN 3 ELSE 0 END) AS `ProcAP`
FROM `alert_students` als
JOIN `alert` a
ON a.ID=als.Alert_ID
JOIN `students` stu
ON stu.id=als.Student_ID and stu.Year_Group = pSY
GROUP BY stu.id
) xDerived
ON xDerived.id=tmp.StudentId
SET tmp.SafeAP=xDerived.SafeAP,
tmp.ProcAP=xDerived.ProcAP;
-- SELECT * FROM tmpStudentMetrics; -- check detail
SELECT stu.id,
CONCAT(stu.Firstname, " ", stu.Surname) AS `Student`,
tmp.LateFP+tmp.MobiFP+tmp.AbscFP+tmp.CommFP+tmp.SafeAP+tmp.BehaFP+tmp.ProcAP AS `Points`
FROM `students` stu
JOIN tmpStudentMetrics tmp
ON tmp.StudentId=stu.id
WHERE stu.`Year_Group` = pSY
ORDER BY stu.id;
-- SELECT * FROM tmpStudentMetrics; -- check detail
DROP TEMPORARY TABLE IF EXISTS tmpStudentMetrics;
-- TEMP TABLES are connection based. Explicitly dropped above for safety when done.
-- Depends on your connection type and life-span otherwise.
END$$
DELIMITER ;
Test:
call rptSM_by_year(9);
+-----+-----------+--------+
| id | Student | Points |
+-----+-----------+--------+
| 201 | Student A | 7 |
| 202 | Student B | 11 |
| 203 | Student C | 6 |
| 204 | Student D | 10 |
| 205 | Student E | 0 |
+-----+-----------+--------+
Cleanup:
drop schema s38741386; -- drop the test database
I think everything you have asked for can be done with a derived table and a couple of correlated sub-SELECTs:
SELECT `Student`,
`Late Flags` * 1
+ `Absconded Flags` * 1
+ `Community Flags` * 1
+ `Safeguarding Alerts Flags` * 1
+ `Behavioural flags` * 2
+ `Process Alerts Flags` * 3 AS `Total Points`
FROM
(
SELECT
CONCAT(stu.Firstname, " ", stu.Surname) AS `Student`,
SUM(CASE WHEN f.Category_ID = 10 THEN 1 ELSE 0 END) AS `Late Flags`,
SUM(CASE WHEN f.Category_ID = 12 THEN 1 ELSE 0 END) AS `Mobile Flags`,
SUM(CASE WHEN f.Category_ID = 15 THEN 1 ELSE 0 END) AS `Absconded Flags`,
SUM(CASE WHEN f.Category_ID = 13 THEN 1 ELSE 0 END) AS `Community Flags`,
(SELECT COUNT(*) FROM `alert` a JOIN `alert_students` ast ON ast.`Alert_ID` = a.`ID`
WHERE ast.`Student_ID` = stu.`id` AND a.`Type` = 'S') AS `Safeguarding Alerts Flags`,
SUM(CASE WHEN f.Category_ID IN (1, 7, 8) THEN 1 ELSE 0 END) AS `Behavioural flags`,
(SELECT COUNT(*) FROM `alert` a JOIN `alert_students` ast ON ast.`Alert_ID` = a.`ID`
WHERE ast.`Student_ID` = stu.`id` AND a.`Type` = 'P') AS `Process Alerts Flags`
FROM `students` stu
LEFT JOIN `flags` f ON f.Student_ID = stu.id
WHERE stu.Year_Group = 9
GROUP BY stu.id
) subq
ORDER BY `Total Points` DESC
-- LIMIT belongs on the outer query so it is applied after sorting
LIMIT 0, 20;
The above query includes everything you mentioned apart from sanctions (as your original SQL Fiddle demo didn't include this table).
Demo
An updated fiddle with the above query is here: http://sqlfiddle.com/#!9/449218/39.
You could use UNION ALL.
Basically, you write an individual query for each table and combine them all with UNION ALL.
Here is an example. I used your students table twice, but you would change the second one to whatever other table you want. SQLFiddle
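As a rough sketch of the UNION ALL idea applied to the weights in the question: each table contributes rows of (student, points), and the outer query sums them per student. This uses Python's built-in sqlite3 as a stand-in for MySQL so it can be run anywhere; the flat `alerts` and `sanctions` tables with a `Student_ID` column and the sample rows are assumptions for illustration, not the asker's real schema.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL. The flat alerts and sanctions
# tables with a Student_ID column are assumptions made for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students  (id INTEGER PRIMARY KEY, Firstname TEXT, Surname TEXT, Year_Group INTEGER);
CREATE TABLE flags     (ID INTEGER PRIMARY KEY, Student_ID INTEGER, Category_ID INTEGER);
CREATE TABLE alerts    (ID INTEGER PRIMARY KEY, Student_ID INTEGER, Type TEXT);
CREATE TABLE sanctions (ID INTEGER PRIMARY KEY, Student_ID INTEGER, Category_ID INTEGER);
INSERT INTO students  VALUES (1, 'Joe', 'Bloggs', 9), (2, 'Jane', 'Doe', 9), (3, 'Sam', 'Smith', 9);
INSERT INTO flags     VALUES (1, 1, 10), (2, 1, 10), (3, 2, 7);
INSERT INTO alerts    VALUES (1, 1, 'P');
INSERT INTO sanctions VALUES (1, 3, 1);
""")

# Each branch of the UNION ALL turns one source table into (Student_ID, points)
# rows; the outer query only has to SUM them per student.
WEIGHTED_POINTS_SQL = """
SELECT stu.Firstname || ' ' || stu.Surname AS Student,
       COALESCE(SUM(ev.points), 0)         AS Points
FROM students stu
LEFT JOIN (
    SELECT Student_ID,
           CASE WHEN Category_ID IN (10, 13, 15) THEN 1
                WHEN Category_ID IN (1, 7, 8)    THEN 2
                ELSE 0 END AS points
    FROM flags
    UNION ALL
    SELECT Student_ID,
           CASE Type WHEN 'S' THEN 1 WHEN 'P' THEN 3 ELSE 0 END
    FROM alerts
    UNION ALL
    SELECT Student_ID,
           CASE WHEN Category_ID = 1 THEN 3 ELSE 0 END
    FROM sanctions
) ev ON ev.Student_ID = stu.id
WHERE stu.Year_Group = ?
GROUP BY stu.id
ORDER BY Points DESC, Student
"""

weighted_rows = conn.execute(WEIGHTED_POINTS_SQL, (9,)).fetchall()
for student, points in weighted_rows:
    print(student, points)
```

With this sample data Joe Bloggs has two Late flags and one Process alert, so he ends up with 5 points, as in the question's example. In MySQL the string concatenation would be written with CONCAT rather than `||`.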
You can do it with LEFT JOINS:
SELECT CONCAT(stu.firstname,' ', stu.surname) student,
COALESCE(f_group.weight_sum,0) + COALESCE(a_group.weight_sum,0) + COALESCE(s_group.weight_sum,0) points
FROM students stu
LEFT JOIN (
-- parentheses make each IN test a 0/1 value before it is weighted
SELECT s_f.id, SUM((f.category_id IN (10,13,15)) + 2 * (f.category_id IN (1,7,8))) weight_sum
FROM students s_f
JOIN flags f
ON f.student_id = s_f.id
AND f.category_id IN (1,7,8,10,13,15)
WHERE s_f.year_group = :year_group
GROUP BY s_f.id
) f_group ON f_group.id = stu.id
LEFT JOIN (
SELECT s_a.id, 3 * COUNT(*) weight_sum
FROM students s_a
JOIN alerts a
ON a.student_id = s_a.id
AND a.type = 'P'
WHERE s_a.year_group = :year_group
GROUP BY s_a.id
) a_group ON a_group.id = stu.id
LEFT JOIN (
-- detention sanctions are worth 3 points each
SELECT s_s.id, 3 * COUNT(*) weight_sum
FROM students s_s
JOIN sanctions s
ON s.student_id = s_s.id
AND s.category_id = 1
WHERE s_s.year_group = :year_group
GROUP BY s_s.id
) s_group ON s_group.id = stu.id
WHERE stu.year_group = :year_group
ORDER BY points DESC
LIMIT 0, 20
However, if you have full access to the DB, I'd put those weights in the respective category and type tables, which would simplify the logic.
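A minimal sketch of that suggestion, assuming a hypothetical `Weight` column is added to `flag_categories` (the column name, labels and sample rows are made up for illustration, and SQLite stands in for MySQL). Once the weight lives next to the category, the query no longer needs any CASE expressions:

```python
import sqlite3

# SQLite (in memory) stands in for MySQL. The Weight column is hypothetical:
# it is the kind of column the suggestion above would add to flag_categories.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, Firstname TEXT, Surname TEXT, Year_Group INTEGER);
CREATE TABLE flag_categories (ID INTEGER PRIMARY KEY, Label TEXT, Weight INTEGER NOT NULL DEFAULT 0);
CREATE TABLE flags (ID INTEGER PRIMARY KEY, Student_ID INTEGER, Category_ID INTEGER);
INSERT INTO students VALUES (1, 'Joe', 'Bloggs', 9), (2, 'Jane', 'Doe', 9);
INSERT INTO flag_categories VALUES (10, 'Late', 1), (7, 'Behaviour', 2), (12, 'Mobile', 0);
INSERT INTO flags VALUES (1, 1, 10), (2, 1, 7), (3, 1, 12), (4, 2, 10);
""")

# The weighting is now data, so the query is a plain join plus SUM.
CATEGORY_WEIGHT_SQL = """
SELECT stu.Firstname || ' ' || stu.Surname AS Student,
       COALESCE(SUM(fc.Weight), 0)         AS Points
FROM students stu
LEFT JOIN flags f            ON f.Student_ID = stu.id
LEFT JOIN flag_categories fc ON fc.ID = f.Category_ID
WHERE stu.Year_Group = ?
GROUP BY stu.id
ORDER BY Points DESC
"""

category_rows = conn.execute(CATEGORY_WEIGHT_SQL, (9,)).fetchall()
print(category_rows)
```

Changing how serious a category is then becomes an UPDATE on one row rather than an edit to every report query.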
I'm working on a practice problem with DDL as follows:
CREATE TABLE people (
id SMALLINT NOT NULL AUTO_INCREMENT,
first_name VARCHAR(50),
last_name VARCHAR(50),
PRIMARY KEY (id)
)
;
CREATE TABLE cd (
id SMALLINT NOT NULL AUTO_INCREMENT,
artist VARCHAR(50),
title VARCHAR(50),
PRIMARY KEY(id),
owner SMALLINT,
FOREIGN KEY (owner) REFERENCES people(id)
)
;
CREATE TABLE lend (
id SMALLINT NOT NULL AUTO_INCREMENT,
cd_id SMALLINT,
lend_to SMALLINT,
FOREIGN KEY (lend_to) REFERENCES people(id),
FOREIGN KEY (cd_id) REFERENCES cd(id),
lend_date DATE DEFAULT '0000-00-00',
PRIMARY KEY(id)
)
;
INSERT INTO people (id, first_name, last_name) VALUES
(1, 'Brett', 'CEO'),
(2, 'Jeff', 'President'),
(3, 'Beta', 'Media'),
(4, 'Casey', 'Content')
;
INSERT INTO cd (id, artist, title, owner) VALUES
(1, 'The xx', 'Coexist', 2),
(2, 'ACDC', 'High Voltage', 1),
(3, 'Bjork', 'Cocoon', 3),
(4, 'Ella Fitzgerald', 'Ella Sings Gershwin', 4),
(5, 'Fever Ray', 'Live in Lulea', 2),
(6, 'Tom Waits', 'Rain Dogs', 4),
(7, 'Howlin Wolf', 'Smokestack Lightning', 1),
(8, 'Tupac', 'Poetic Justice', 4)
;
INSERT INTO lend (id, cd_id, lend_to, lend_date) VALUES
(1, 2, 3, '2014/01/03'),
(2, 3, 1, '2014/04/02'),
(3, 7, 4, '2013/12/22'),
(4, 4, 2, '2014/01/03')
;
I want my query to show who each CD is lent to. I can get the ID from the lend table, but I want to display the full name of the borrower from the people table. Do I need to rework the design of how the lend table connects to the people table, or just use some sort of CASE expression in the query? Below is my query so far, where I'm getting l.lend_to but want to show CONCAT(p.first_name, ' ', p.last_name) for the person the CD is lent to.
SELECT /*cd.id,*/
CONCAT(p.first_name, ' ', p.last_name) 'CD OWNER',
cd.title,
l.lend_to,
p.id ,
(
CASE
WHEN l.lend_to IS NULL
THEN 'Not Lent'
ELSE DATE_FORMAT(l.lend_date, '%m-%d-%Y')
END
) 'LEND DATE',
(
CASE
WHEN l.lend_to IS NULL
THEN 'Not Lent'
ELSE TIMESTAMPDIFF(day, l.lend_date, NOW())
END
) 'DAYS LENT'
FROM
people p
LEFT JOIN cd cd
ON p.id = cd.owner
LEFT JOIN lend l
ON cd.id = l.cd_id
LEFT JOIN lend l1
on p.id = l1.lend_to
;
See if this query gives you the basic information you are looking for
select c.title as 'Title', c.artist as 'Artist', o.first_name as 'Owner',
l.lend_date as 'Lend Date', p.first_name as 'Borrower'
from cd c
left outer join people o on c.owner = o.id
left outer join lend l on c.id = l.cd_id
left outer join people p on l.lend_to = p.id
You can add additional CASE logic to refine the result, if this is what you are looking for.
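The key idea in the query above is that the people table is joined twice under two aliases, once for the owner and once for the borrower. A small runnable sketch of that, using SQLite in place of MySQL and a cut-down copy of the sample data (the output column names `owner_name` and `borrower_name` are my own):

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; data is a subset of the question's.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE cd (id INTEGER PRIMARY KEY, artist TEXT, title TEXT,
                 owner INTEGER REFERENCES people(id));
CREATE TABLE lend (id INTEGER PRIMARY KEY,
                   cd_id INTEGER REFERENCES cd(id),
                   lend_to INTEGER REFERENCES people(id),
                   lend_date TEXT);
INSERT INTO people VALUES (1, 'Brett', 'CEO'), (2, 'Jeff', 'President'), (3, 'Beta', 'Media');
INSERT INTO cd VALUES (1, 'The xx', 'Coexist', 2), (2, 'ACDC', 'High Voltage', 1);
INSERT INTO lend VALUES (1, 2, 3, '2014-01-03');
""")

# people appears twice: alias o resolves cd.owner, alias b resolves lend.lend_to.
# The LEFT JOINs keep CDs that have never been lent.
OWNER_AND_BORROWER_SQL = """
SELECT c.title,
       o.first_name || ' ' || o.last_name                       AS owner_name,
       COALESCE(b.first_name || ' ' || b.last_name, 'Not Lent') AS borrower_name
FROM cd c
JOIN people o      ON o.id = c.owner
LEFT JOIN lend l   ON l.cd_id = c.id
LEFT JOIN people b ON b.id = l.lend_to
ORDER BY c.id
"""

lend_rows = conn.execute(OWNER_AND_BORROWER_SQL).fetchall()
for row in lend_rows:
    print(row)
```

No schema change is needed for this; the existing foreign keys are enough.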
I've resolved the issue with a data architecture redesign. Take a look if interested.
http://sqlfiddle.com/#!2/b6158/3
Assume a main "job" table and two corresponding "log" tables (one for server events and the other for user events, with quite different data stored in each).
What would be the best way to return a selection of "job" records along with the latest corresponding log record (with multiple fields) from each of the two "log" tables, if there are any?
I did get some inspiration from: MySQL Order before Group by
The following SQL would create some example tables/data...
CREATE TABLE job (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` tinytext NOT NULL,
PRIMARY KEY (id)
);
CREATE TABLE job_log_server (
`id` int(11) NOT NULL AUTO_INCREMENT,
`job_id` int(11) NOT NULL,
`event` tinytext NOT NULL,
`ip` tinytext NOT NULL,
`created` datetime NOT NULL,
PRIMARY KEY (id),
KEY job_id (job_id)
);
CREATE TABLE job_log_user (
`id` int(11) NOT NULL AUTO_INCREMENT,
`job_id` int(11) NOT NULL,
`event` tinytext NOT NULL,
`user_id` int(11) NOT NULL,
`created` datetime NOT NULL,
PRIMARY KEY (id),
KEY job_id (job_id)
);
INSERT INTO job VALUES (1, 'Job A');
INSERT INTO job VALUES (2, 'Job B');
INSERT INTO job VALUES (3, 'Job C');
INSERT INTO job VALUES (4, 'Job D');
INSERT INTO job_log_server VALUES (1, 2, 'Job B Event 1', '127.0.0.1', '2000-01-01 00:00:01');
INSERT INTO job_log_server VALUES (2, 2, 'Job B Event 2', '127.0.0.1', '2000-01-01 00:00:02');
INSERT INTO job_log_server VALUES (3, 2, 'Job B Event 3*', '127.0.0.1', '2000-01-01 00:00:03');
INSERT INTO job_log_server VALUES (4, 3, 'Job C Event 1*', '127.0.0.1', '2000-01-01 00:00:04');
INSERT INTO job_log_user VALUES (1, 1, 'Job A Event 1', 5, '2000-01-01 00:00:01');
INSERT INTO job_log_user VALUES (2, 1, 'Job A Event 2*', 5, '2000-01-01 00:00:02');
INSERT INTO job_log_user VALUES (3, 2, 'Job B Event 1*', 5, '2000-01-01 00:00:03');
INSERT INTO job_log_user VALUES (4, 4, 'Job D Event 1', 5, '2000-01-01 00:00:04');
INSERT INTO job_log_user VALUES (5, 4, 'Job D Event 2', 5, '2000-01-01 00:00:05');
INSERT INTO job_log_user VALUES (6, 4, 'Job D Event 3*', 5, '2000-01-01 00:00:06');
One option (only returning 1 field from each table) would be to use nested sub-queries... but the ORDER BY will have to be done in separate queries to the GROUP BY (x2):
SELECT
*
FROM
(
SELECT
s2.*,
jlu.event AS user_event
FROM
(
SELECT
*
FROM
(
SELECT
j.id,
j.name,
jls.event AS server_event
FROM
job AS j
LEFT JOIN
job_log_server AS jls ON jls.job_id = j.id
ORDER BY
jls.created DESC
) AS s1
GROUP BY
s1.id
) AS s2
LEFT JOIN
job_log_user AS jlu ON jlu.job_id = s2.id
ORDER BY
jlu.created DESC
) AS s3
GROUP BY
s3.id;
Which actually seems to perform quite well... just not very easy to understand.
Or you could try to return and sort the log records in two separate sub-queries:
SELECT
j.id,
j.name,
jls2.event AS server_event,
jlu2.event AS user_event
FROM
job AS j
LEFT JOIN
(
SELECT
jls.job_id,
jls.event
FROM
job_log_server AS jls
ORDER BY
jls.created DESC
) AS jls2 ON jls2.job_id = j.id
LEFT JOIN
(
SELECT
jlu.job_id,
jlu.event
FROM
job_log_user AS jlu
ORDER BY
jlu.created DESC
) AS jlu2 ON jlu2.job_id = j.id
GROUP BY
j.id;
But this seems to take quite a bit longer to run, possibly because of the number of records it adds to a temporary table, most of which are then ignored (to keep this short-ish, I've not added any conditions to the job table, which would otherwise only return active jobs).
Not sure if I've missed anything obvious.
How about the following (SQL Fiddle)? It produces the same results as both of your queries.
SELECT j.id, j.name,
(
SELECT s.event
FROM job_log_server s
WHERE j.id = s.job_id
ORDER BY s.id DESC
LIMIT 1
)AS SERVER_EVENT,
(
SELECT u.event
FROM job_log_user u
WHERE j.id = u.job_id
ORDER BY u.id DESC
LIMIT 1
)AS USER_EVENT
FROM job j
EDIT SQL Fiddle:
SELECT m.id, m.name, js.event AS SERVER_EVENT, ju.event AS USER_EVENT
FROM
(
SELECT j.id, j.name,
(
SELECT s.id
FROM job_log_server s
WHERE j.id = s.job_id
ORDER BY s.id DESC
LIMIT 1
)AS S_E,
(
SELECT u.id
FROM job_log_user u
WHERE j.id = u.job_id
ORDER BY u.id DESC
LIMIT 1
)AS U_E
FROM job j
) m
LEFT JOIN job_log_server js ON js.id = m.S_E
LEFT JOIN job_log_user ju ON ju.id = m.U_E
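The same "pick the latest log id, then join back for the full row" idea can also be written with the correlated subquery directly in the join condition. The sketch below is a variation rather than the exact query above: it orders by `created` (with `id` as a tie-breaker) instead of `id` alone, which is an assumption about what "latest" should mean, and it runs on SQLite in place of MySQL using the sample data from the question.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; data is copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE job_log_server (id INTEGER PRIMARY KEY, job_id INTEGER, event TEXT, ip TEXT, created TEXT);
CREATE TABLE job_log_user (id INTEGER PRIMARY KEY, job_id INTEGER, event TEXT, user_id INTEGER, created TEXT);
INSERT INTO job VALUES (1, 'Job A'), (2, 'Job B'), (3, 'Job C'), (4, 'Job D');
INSERT INTO job_log_server VALUES
  (1, 2, 'Job B Event 1',  '127.0.0.1', '2000-01-01 00:00:01'),
  (2, 2, 'Job B Event 2',  '127.0.0.1', '2000-01-01 00:00:02'),
  (3, 2, 'Job B Event 3*', '127.0.0.1', '2000-01-01 00:00:03'),
  (4, 3, 'Job C Event 1*', '127.0.0.1', '2000-01-01 00:00:04');
INSERT INTO job_log_user VALUES
  (1, 1, 'Job A Event 1',  5, '2000-01-01 00:00:01'),
  (2, 1, 'Job A Event 2*', 5, '2000-01-01 00:00:02'),
  (3, 2, 'Job B Event 1*', 5, '2000-01-01 00:00:03'),
  (4, 4, 'Job D Event 1',  5, '2000-01-01 00:00:04'),
  (5, 4, 'Job D Event 2',  5, '2000-01-01 00:00:05'),
  (6, 4, 'Job D Event 3*', 5, '2000-01-01 00:00:06');
""")

# For each job, the subquery picks the id of its newest log row; joining on
# that id gives access to every column of that row, not just one field.
LATEST_LOGS_SQL = """
SELECT j.id, j.name,
       s.event AS server_event, s.ip,
       u.event AS user_event,   u.user_id
FROM job j
LEFT JOIN job_log_server s ON s.id = (
    SELECT s2.id FROM job_log_server s2
    WHERE s2.job_id = j.id
    ORDER BY s2.created DESC, s2.id DESC
    LIMIT 1)
LEFT JOIN job_log_user u ON u.id = (
    SELECT u2.id FROM job_log_user u2
    WHERE u2.job_id = j.id
    ORDER BY u2.created DESC, u2.id DESC
    LIMIT 1)
ORDER BY j.id
"""

latest_rows = conn.execute(LATEST_LOGS_SQL).fetchall()
for row in latest_rows:
    print(row)
```

The rows marked with `*` in the sample data are the ones that come back, and jobs with no log in one of the tables get NULLs for that table's columns.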
This may be a relatively simple question, but I cannot find a solution to it. It concerns two tables in a many-to-many relationship, so there's a third table linking them. The schema is below:
CREATE TABLE `options` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(200) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `options` (`id`, `name`) VALUES
(1, 'something'),
(2, 'thing'),
(3, 'some option'),
(4, 'other thing'),
(5, 'vacuity'),
(6, 'etc');
CREATE TABLE `person` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(200) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `person` (`id`, `name`) VALUES
(1, 'ROBERT'),
(2, 'BOB'),
(3, 'FRANK'),
(4, 'JOHN'),
(5, 'PAULINE'),
(6, 'VERENA'),
(7, 'MARCEL'),
(8, 'PAULO'),
(9, 'SCHRODINGER');
CREATE TABLE `person_option_link` (
`person_id` int(11) NOT NULL,
`option_id` int(11) NOT NULL,
UNIQUE KEY `person_id` (`person_id`,`option_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `person_option_link` (`person_id`, `option_id`) VALUES
(1, 1),
(2, 1),
(2, 2),
(3, 2),
(3, 3),
(3, 4),
(3, 5),
(4, 1),
(4, 3),
(4, 6),
(5, 3),
(5, 4),
(5, 5),
(6, 1),
(7, 2),
(8, 3),
(9, 4),
(5, 6);
The idea is as follows: I would like to retrieve all people who have a link to option_id=1 AND option_id=3.
The expected result should be one person: John.
I tried something like this, which doesn't work because it also returns people who have 1 OR 3:
SELECT *
FROM person p
LEFT JOIN person_option_link l ON p.id = l.person_id
WHERE l.option_id IN ( 1, 3 )
What is the best practice in this case?
EDIT: I need to focus on another important point.
What if we add a new condition with NOT IN, like this:
SELECT *
FROM person p
LEFT JOIN person_option_link l ON p.id = l.person_id
WHERE l.option_id IN ( 3, 4 )
AND l.option_id NOT IN ( 6 )
In this case, the result should be FRANK, because PAULINE, who also has 3 and 4, has option 6 as well, and we don't want that.
Thanks!
This is a Relational Division Problem.
SELECT p.id, p.name
FROM person p
INNER JOIN person_option_link l
ON p.id = l.person_id
WHERE l.option_id IN ( 1, 3 )
GROUP BY p.id, p.name
HAVING COUNT(*) = 2
SQLFiddle Demo
If a unique constraint was not enforced on option_id for every person_id, the DISTINCT keyword is required so that each option_id is counted only once:
SELECT p.id, p.name
FROM person p
INNER JOIN person_option_link l
ON p.id = l.person_id
WHERE l.option_id IN ( 1, 3 )
GROUP BY p.id, p.name
HAVING COUNT(DISTINCT l.option_id) = 2
SQL of Relational Division
Use GROUP BY and COUNT:
SELECT p.id, p.name
FROM person p
LEFT JOIN person_option_link l ON p.id = l.person_id
WHERE l.option_id IN ( 1, 3 )
GROUP BY p.id, p.name
HAVING COUNT(Distinct l.option_id) = 2
I prefer using COUNT DISTINCT in case you could have the same option id multiple times.
Good luck.
It may not be the best option, but you could use a 'double join' to the person_option_link table:
SELECT *
FROM person AS p
JOIN person_option_link AS l1 ON p.id = l1.person_id AND l1.option_id = 1
JOIN person_option_link AS l2 ON p.id = l2.person_id AND l2.option_id = 3
This ensures that there is simultaneously a row with option ID of 1 and another with option ID of 3 for the given user.
The GROUP BY alternatives certainly work; they might well be quicker too (but you'd need to scrutinize query plans to be sure). The GROUP BY alternatives scale better to handle more values: for example, a list of the users with option IDs 2, 3, 5, 7, 11, 13, 17, 19 is fiddly with this variant but the GROUP BY variants work without structural changes to the query. You can also use the GROUP BY variants to select users with at least 4 of the 8 values which is substantially infeasible using this technique.
Using the GROUP BY does require a slight restatement (or rethinking) of the query, though, to:
How can I select people who have 2 of the option IDs in the set {1, 3}?
How can I select people who have 8 of the option IDs in the set {2, 3, 5, 7, 11, 13, 17, 19}?
How can I select people who have at least 4 of the option IDs in the set {2, 3, 5, 7, 11, 13, 17, 19}?
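The restated questions above map onto one parameterised query: the set of option IDs goes into the IN list and the required count goes into the HAVING clause. A rough sketch, run on SQLite in place of MySQL with the question's data; the function name `people_with_at_least` is made up for illustration.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; data is copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE person_option_link (
    person_id INTEGER NOT NULL,
    option_id INTEGER NOT NULL,
    UNIQUE (person_id, option_id)
);
INSERT INTO person VALUES
  (1, 'ROBERT'), (2, 'BOB'), (3, 'FRANK'), (4, 'JOHN'), (5, 'PAULINE'),
  (6, 'VERENA'), (7, 'MARCEL'), (8, 'PAULO'), (9, 'SCHRODINGER');
INSERT INTO person_option_link VALUES
  (1, 1), (2, 1), (2, 2), (3, 2), (3, 3), (3, 4), (3, 5), (4, 1), (4, 3),
  (4, 6), (5, 3), (5, 4), (5, 5), (6, 1), (7, 2), (8, 3), (9, 4), (5, 6);
""")


def people_with_at_least(conn, option_ids, minimum):
    """Names of people linked to at least `minimum` of the given option IDs."""
    ids = sorted(set(option_ids))
    if not ids:
        raise ValueError("option_ids must not be empty")
    placeholders = ", ".join("?" for _ in ids)
    sql = f"""
        SELECT p.name
        FROM person p
        JOIN person_option_link l ON l.person_id = p.id
        WHERE l.option_id IN ({placeholders})
        GROUP BY p.id, p.name
        HAVING COUNT(DISTINCT l.option_id) >= ?
        ORDER BY p.name
    """
    return [name for (name,) in conn.execute(sql, (*ids, minimum))]


primes = [2, 3, 5, 7, 11, 13, 17, 19]
print(people_with_at_least(conn, [1, 3], 2))  # both of {1, 3}
print(people_with_at_least(conn, primes, 2))  # at least 2 of the larger set
```

Asking for "all of the set" is just the special case where `minimum` equals the number of IDs, so the structure of the query never changes as the list grows.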
For the "has not these ids" part of the question, simply add a WHERE clause:
WHERE person_id NOT IN
(
SELECT person_id
FROM person_option_link
WHERE option_id = 4
)
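Putting the two parts together for the edited question (must have options 3 and 4, must not have option 6), a sketch might look like the following. It combines the GROUP BY/HAVING form with a NOT IN subquery, runs on SQLite in place of MySQL, and the helper name `people_matching` is hypothetical.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; data is copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE person_option_link (
    person_id INTEGER NOT NULL,
    option_id INTEGER NOT NULL,
    UNIQUE (person_id, option_id)
);
INSERT INTO person VALUES
  (1, 'ROBERT'), (2, 'BOB'), (3, 'FRANK'), (4, 'JOHN'), (5, 'PAULINE'),
  (6, 'VERENA'), (7, 'MARCEL'), (8, 'PAULO'), (9, 'SCHRODINGER');
INSERT INTO person_option_link VALUES
  (1, 1), (2, 1), (2, 2), (3, 2), (3, 3), (3, 4), (3, 5), (4, 1), (4, 3),
  (4, 6), (5, 3), (5, 4), (5, 5), (6, 1), (7, 2), (8, 3), (9, 4), (5, 6);
""")


def people_matching(conn, required, excluded=()):
    """People who have every option in `required` and none in `excluded`."""
    req = sorted(set(required))
    exc = sorted(set(excluded))
    if not req:
        raise ValueError("required must not be empty")
    req_marks = ", ".join("?" for _ in req)
    exclusion = ""
    if exc:  # only add the NOT IN clause when there is something to exclude
        exc_marks = ", ".join("?" for _ in exc)
        exclusion = f"""AND p.id NOT IN (
                SELECT person_id FROM person_option_link
                WHERE option_id IN ({exc_marks}))"""
    sql = f"""
        SELECT p.name
        FROM person p
        JOIN person_option_link l ON l.person_id = p.id
        WHERE l.option_id IN ({req_marks})
        {exclusion}
        GROUP BY p.id, p.name
        HAVING COUNT(DISTINCT l.option_id) = ?
        ORDER BY p.name
    """
    # Parameter order follows placeholder order: required, excluded, count.
    return [name for (name,) in conn.execute(sql, (*req, *exc, len(req)))]


print(people_matching(conn, [3, 4], [6]))
print(people_matching(conn, [3, 4]))
```

Note that the exclusion has to look at all of a person's links, which is why it is a subquery on person_id rather than a second condition on `l.option_id` in the same row.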
I have a simple application that tracks diners and their favorite flavors and desserts. The records table is just the diner's name and ID, the mid table tracks the desserts and flavors (again by an ID linked to another table of values).
CREATE TABLE IF NOT EXISTS `records` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=3 ;
INSERT INTO `records` (`id`, `name`) VALUES
(1, 'Jimmy Jones'),
(2, 'William Henry');
CREATE TABLE IF NOT EXISTS `mid` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`diner` int(11) NOT NULL,
`dessert` int(11) NOT NULL DEFAULT '0',
`flavor` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=11 ;
INSERT INTO `mid` (`id`, `diner`, `dessert`, `flavor`) VALUES
(1, 1, 3, 0),
(2, 1, 2, 0),
(3, 1, 15, 0),
(4, 1, 0, 1),
(5, 2, 3, 0),
(6, 2, 6, 0),
(7, 2, 0, 4),
(8, 1, 34, 0),
(9, 2, 0, 4),
(10, 2, 0, 22);
I'm a little stumped by what should be a simple query-- I want to get all IDs from the records table where certain dessert or flavor requirements are met:
SELECT a.id
FROM records AS a
JOIN mid AS b ON a.id = b.diner
WHERE b.dessert IN (3,2,6)
AND b.flavor IN (4,22)
This query returns no rows, even though there are records that match the where clauses. I am pretty sure I'm missing something obvious with the JOIN but I've tried INNER, OUTER, LEFT and RIGHT with no success.
Can someone put me on the right track and explain what I'm missing?
Thanks
You seem to want diners that have the combinations. Here is one way:
select b.diner
from mid b
group by b.diner
having max(b.dessert = 3) = 1 and
max(b.dessert = 2) = 1 and
max(b.dessert = 6) = 1 and
max(b.flavor = 4) = 1 and
max(b.flavor = 22) = 1
This answers your comment:
select b.diner
from mid b
group by b.diner
having max(case when b.dessert in (2, 3, 6) then 1 else 0 end) = 1 and
max(case when b.flavor in (4, 22) then 1 else 0 end) = 1
If you are just looking for the individual rows in mid that match both conditions at once, use:
select r.*
from records r join
mid b
on b.diner = r.id
where b.dessert IN (3,2,6) AND b.flavor IN (4,22)
This is essentially your query. The join condition (a.id = b.diner) is already right for this schema, and it returns nothing only because no single row of mid has both a matching dessert and a matching flavor.
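To make the difference between the row-level and the group-level condition concrete, here is a small sketch on SQLite (standing in for MySQL) with the question's data. The row-wise AND finds nothing, while the grouped version asks whether each diner has at least one matching dessert row and at least one matching flavor row.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; data is copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE mid (id INTEGER PRIMARY KEY, diner INTEGER NOT NULL,
                  dessert INTEGER NOT NULL DEFAULT 0,
                  flavor INTEGER NOT NULL DEFAULT 0);
INSERT INTO records VALUES (1, 'Jimmy Jones'), (2, 'William Henry');
INSERT INTO mid VALUES
  (1, 1, 3, 0), (2, 1, 2, 0), (3, 1, 15, 0), (4, 1, 0, 1), (5, 2, 3, 0),
  (6, 2, 6, 0), (7, 2, 0, 4), (8, 1, 34, 0), (9, 2, 0, 4), (10, 2, 0, 22);
""")

# Row-wise: both conditions must hold on the SAME mid row, which never happens
# here because each row stores either a dessert or a flavor.
row_wise = conn.execute("""
    SELECT a.id FROM records a
    JOIN mid b ON a.id = b.diner
    WHERE b.dessert IN (3, 2, 6) AND b.flavor IN (4, 22)
""").fetchall()

# Group-wise: MAX over a 0/1 test is 1 if ANY row of the diner passes it.
grouped = conn.execute("""
    SELECT r.id, r.name
    FROM records r
    JOIN mid b ON b.diner = r.id
    GROUP BY r.id, r.name
    HAVING MAX(b.dessert IN (2, 3, 6)) = 1
       AND MAX(b.flavor IN (4, 22)) = 1
""").fetchall()

print(row_wise)
print(grouped)
```

Only William Henry qualifies: Jimmy Jones has matching desserts but his single flavor (1) is not in the list.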
Your SQL statement is fine, but none of your sample records meet your condition. Records that would match should look like this:
dessert flavor
3 4
3 22
2 4
2 22
6 4
6 22
None of your input records has any of these combinations.
Your WHERE condition does not fit any record in the "mid" table.
There are no records that have dessert in (3, 2, 6) AND flavor in (4, 22), so the query (correctly) returns no result.
You don't have any records that match both where conditions.
( 1, 1, 3, 0) - Matches dessert IN (3,2,6)
( 2, 1, 2, 0) - Matches dessert IN (3,2,6)
( 3, 1, 15, 0)
( 4, 1, 0, 1)
( 5, 2, 3, 0) - Matches dessert IN (3,2,6)
( 6, 2, 6, 0) - Matches dessert IN (3,2,6)
( 7, 2, 0, 4) - Matches flavor IN (4,22)
( 8, 1, 34, 0)
( 9, 2, 0, 4) - Matches flavor IN (4,22)
(10, 2, 0, 22) - Matches flavor IN (4,22)
Perhaps you meant OR?
SELECT a.id
FROM records AS a
JOIN mid AS b ON a.id = b.diner
WHERE b.dessert IN (3,2,6)
OR b.flavor IN (4,22)
Should return 7 results.
Also, your thoughts on JOIN are a red herring. The difference between LEFT and RIGHT is just which table's rows are kept when the join clause doesn't match records between them. The difference between INNER and OUTER is just what happens when there isn't a matching record between the two tables: INNER drops the row, OUTER keeps it with NULLs. Try this explanatory article from Coding Horror for more details on joins (helpfully pointed out to me in a different SO question, heh).
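A quick way to see why the join type could not have been the problem is to run the same join both ways. In this sketch (SQLite standing in for MySQL) a third diner with no rows in mid is added purely for illustration; it is not part of the question's data.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL. Diner 3 is a made-up extra row
# with no entries in mid, added to show where INNER and LEFT differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE mid (id INTEGER PRIMARY KEY, diner INTEGER NOT NULL,
                  dessert INTEGER NOT NULL DEFAULT 0,
                  flavor INTEGER NOT NULL DEFAULT 0);
INSERT INTO records VALUES (1, 'Jimmy Jones'), (2, 'William Henry'), (3, 'No Orders');
INSERT INTO mid VALUES (1, 1, 3, 0), (2, 2, 0, 4);
""")

# INNER JOIN: a diner without any mid row disappears from the result.
inner_rows = conn.execute("""
    SELECT a.name, b.dessert FROM records a
    JOIN mid b ON a.id = b.diner
    ORDER BY a.id
""").fetchall()

# LEFT JOIN: every diner is kept; missing mid columns come back as NULL.
left_rows = conn.execute("""
    SELECT a.name, b.dessert FROM records a
    LEFT JOIN mid b ON a.id = b.diner
    ORDER BY a.id
""").fetchall()

# A WHERE filter on the right-hand table discards the NULL rows again, so the
# LEFT JOIN behaves like an INNER JOIN once such a filter is present.
left_filtered = conn.execute("""
    SELECT a.name, b.dessert FROM records a
    LEFT JOIN mid b ON a.id = b.diner
    WHERE b.dessert IN (3, 2, 6)
    ORDER BY a.id
""").fetchall()

print(inner_rows)
print(left_rows)
print(left_filtered)
```

Switching between INNER and LEFT only changes which unmatched rows are kept, so it can never make a row pass a WHERE condition that it fails.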