I am trying to find the name of the person who received the highest grade in the "Big Data" course.
I have 3 different tables:
People (id, name, age, address)
---------------------------------------------------
p1 | Tom Martin| 24 | 11, Integer Avenue, Fractions, MA
p2 | Al Smith | 33 | 26, Main Street, Noman's Land, PA
p3 | Kim Burton| 40 | 45, Elm Street, Blacksburg, VA
---------------------------------------------------
Courses (cid, name, department)
---------------------------------------------------------
c1 | Systematic Torture | MATH
c2 | Pretty Painful | CS
c3 | Not so Bad | MATH
c4 | Big Data | CS
---------------------------------------------------------
Grades (pid, cid, grade)
---------------------------------------------------
p1 | c1 | 3.5
p2 | c3 | 2.5
p3 | c2 | 4.0
p3 | c4 | 3.85
---------------------------------------------------
I can't figure out how to find the person with the highest grade without using any fancy SQL feature. That is, I just want to use SELECT, FROM, WHERE, UNION, INTERSECT, EXCEPT, CREATE VIEW and arithmetic comparison operators like =, <, >.
My query returns something other than what I am trying to achieve.
This is what I have tried so far:
CREATE VIEW TEMPFIVE AS
SELECT G1.pid FROM Grades AS G1, Grades AS G2 WHERE G1.pid = G2.pid AND G1.cid = G2.cid
SELECT People.name, Courses.name FROM TEMPFIVE, People, Courses WHERE TEMPFIVE.pid = People.pid AND Courses.name = "Big Data";
+------------+----------+
| name | name |
+------------+----------+
| Tom Martin | Big Data |
| Al Smith | Big Data |
|Kim Burton | Big Data |
|Kim Burton | Big Data |
+------------+----------+
The easiest way is to use LIMIT 1 with an ORDER BY ... DESC clause:
SELECT p.name, c.name, g.grade
FROM People AS p
JOIN Grades AS g ON p.id = g.pid
JOIN Courses AS c ON c.cid = g.cid
WHERE c.name = "Big Data"
ORDER BY g.grade DESC LIMIT 1
I don't know the exact MySQL query structure, so I'll explain it in steps. I hope you can build the query from them:
Join the three tables according to their relationships.
Filter on the course name 'Big Data' in the WHERE clause.
Order by grade in DESC order.
Set the limit to fetch only the first row.
Try this
select * from (
select p.id pid, p.name name, p.age age, p.address address,
c.cid cid, c.name coursename, c.department department, g.grade grade
from Grades g
left join
Courses c on g.cid = c.cid
left join
People p on g.pid = p.id
) a where coursename = 'Big Data' order by grade desc
you can apply the operators on the where clause
GiorgosBestos shows the correct way if you only want one record. If you want ties, meaning that more than one student has the same MAX grade, you can use a subselect as follows:
SELECT p.name, c.name, g.grade
FROM
(
SELECT c.cid, MAX(g.grade) MaxGrade
FROM
Grades g
INNER JOIN Courses c
ON c.cid = g.cid
AND c.name = 'Big Data'
GROUP BY
c.cid
) m
INNER JOIN Grades g
ON g.cid = m.cid
AND g.grade = m.MaxGrade
INNER JOIN People p
ON g.pid = p.id
INNER JOIN Courses c
ON c.cid = m.cid
The following SQL covers the case when two or more students have the same maximum grade:
SELECT P.NAME,
C.NAME,
G.GRADE
FROM PEOPLE P
JOIN GRADES G ON G.PID = P.ID
JOIN COURSES C ON C.CID = G.CID
WHERE C.NAME = 'Big Data'
AND G.GRADE = (SELECT MAX(G2.GRADE)
FROM PEOPLE P2
JOIN GRADES G2 ON G2.PID = P2.ID
JOIN COURSES C2 ON C2.CID = G2.CID
WHERE C2.NAME = 'Big Data');
It is similar but not identical to the SQL proposed by Matt.
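The answers above all rely on ORDER BY/LIMIT, MAX or GROUP BY, which the question rules out. Here is a minimal sketch of one way to stay inside the allowed set (SELECT, FROM, WHERE, EXCEPT, CREATE VIEW and comparisons): take everyone graded in 'Big Data' and subtract, with EXCEPT, everyone who has a lower grade than somebody else in that course. It runs against an in-memory SQLite database from Python purely for illustration. The view name BigDataGrades and the extra grade row for p1 are made up so that the comparison has something to eliminate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE People  (id TEXT, name TEXT, age INTEGER, address TEXT);
CREATE TABLE Courses (cid TEXT, name TEXT, department TEXT);
CREATE TABLE Grades  (pid TEXT, cid TEXT, grade REAL);

INSERT INTO People VALUES
  ('p1', 'Tom Martin', 24, '11, Integer Avenue, Fractions, MA'),
  ('p2', 'Al Smith',   33, '26, Main Street, Noman''s Land, PA'),
  ('p3', 'Kim Burton', 40, '45, Elm Street, Blacksburg, VA');
INSERT INTO Courses VALUES
  ('c1', 'Systematic Torture', 'MATH'),
  ('c2', 'Pretty Painful', 'CS'),
  ('c3', 'Not so Bad', 'MATH'),
  ('c4', 'Big Data', 'CS');
INSERT INTO Grades VALUES
  ('p1', 'c1', 3.5), ('p2', 'c3', 2.5), ('p3', 'c2', 4.0), ('p3', 'c4', 3.85),
  ('p1', 'c4', 3.2);  -- hypothetical extra row, not in the question

-- Every (person, grade) pair for the Big Data course.
CREATE VIEW BigDataGrades AS
SELECT G.pid, G.grade
FROM Grades AS G, Courses AS C
WHERE G.cid = C.cid AND C.name = 'Big Data';
""")

# Everyone in the course, minus everyone beaten by at least one other grade.
top_students = conn.execute("""
SELECT P.id, P.name
FROM People AS P, BigDataGrades AS B
WHERE P.id = B.pid
EXCEPT
SELECT P.id, P.name
FROM People AS P, BigDataGrades AS B1, BigDataGrades AS B2
WHERE P.id = B1.pid AND B1.grade < B2.grade
""").fetchall()

print(top_students)
```

If several students tie for the top grade, none of them is removed by the EXCEPT branch, so all of them are returned.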
Related
Running into a seemingly simple JOIN problem here.
I have two tables, users and courses
| users.id | users.name |
| 1 | Joe |
| 2 | Mary |
| 3 | Mark |
| courses.id | courses.name |
| 1 | History |
| 2 | Math |
| 3 | Science |
| 4 | English |
and another table that joins the two:
| users_id | courses_id |
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 1 |
I'm trying to find distinct user names who are in course 1 and course 2
It's possible a user is in other courses, too, but I only care that they're in 1 and 2 at a minimum
SELECT DISTINCT(users.name)
FROM users_courses
LEFT JOIN users ON users_courses.users_id = users.id
LEFT JOIN courses ON users_courses.courses_id = courses.id
WHERE courses.name = "History" AND courses.name = "Math"
AND courses.name NOT IN ("English")
I understand why this is returning an empty set: no single joined row has both History and Math, since each row holds only one value.
How can I structure the query so that it returns "Joe" because he is in both courses?
Update - I'm hoping to avoid hard-coding the expected total count of courses for a given user, since they might be in other courses my search does not care about.
Join users to a query that returns the user ids that are in both courses:
select u.name
from users u
inner join (
select users_id
from users_courses
where courses_id in (1, 2)
group by users_id
having count(distinct courses_id) = 2
) c on c.users_id = u.id
You can omit distinct from the condition:
count(distinct courses_id) = 2
if there are no duplicates in users_courses.
See the demo.
If you want to search by course names and not ids:
select u.name
from users u
inner join (
select uc.users_id
from users_courses uc inner join courses c
on c.id = uc.courses_id
where c.name in ('History', 'Math')
group by uc.users_id
having count(distinct c.id) = 2
) c on c.users_id = u.id
See the demo.
Results:
| name |
| ---- |
| Joe |
You can use the IN operator with a subselect that lists the users_id values attending the second course, then match them against the first course. Depending on the optimizer and the available indexes, this can be faster than the join-based queries.
select distinct u.users_id, users.name
from users_courses u, users
where u.users_id in (select distinct users_id from users_courses where courses_id = 2)
and u.courses_id = 1
and users.id = u.users_id
Almost the same as @Nae's solution.
select u.name from users u
where exists
(select 1
from users_courses uc
where uc.courses_id in (1, 2)
and uc.users_id = u.id
group by uc.users_id
having count(0) = 2);
Your code is close. Just use GROUP BY and a HAVING clause:
SELECT u.name
FROM users_courses uc JOIN
users u
ON uc.users_id = u.id JOIN
courses c
ON uc.courses_id = c.id
WHERE c.name IN ('History', 'Math')
GROUP BY u.name
HAVING COUNT(DISTINCT c.name) = 2;
Notes:
This assumes that users cannot have the same name. You might want to use GROUP BY u.id, u.name to ensure that you are counting individual users.
If users cannot take the same course multiple times, then use COUNT(*) = 2 rather than COUNT(DISTINCT).
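To make the first note concrete, here is a small sketch that runs the same query twice against an in-memory SQLite database, once grouped by u.name and once by u.id, u.name. The two users called 'Ann' and their enrolments are hypothetical rows added to the question's data: one Ann takes only History and the other only Math.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE courses (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE users_courses (users_id INTEGER, courses_id INTEGER);

INSERT INTO users VALUES (1, 'Joe'), (2, 'Mary'), (3, 'Mark'),
                         (4, 'Ann'), (5, 'Ann');   -- hypothetical namesakes
INSERT INTO courses VALUES (1, 'History'), (2, 'Math'),
                           (3, 'Science'), (4, 'English');
INSERT INTO users_courses VALUES (1, 1), (1, 2), (1, 3), (2, 1),
                                 (4, 1), (5, 2);   -- hypothetical enrolments
""")

# Same query, parameterised only by the grouping columns.
QUERY = """
SELECT {cols}
FROM users_courses uc
JOIN users u ON uc.users_id = u.id
JOIN courses c ON uc.courses_id = c.id
WHERE c.name IN ('History', 'Math')
GROUP BY {cols}
HAVING COUNT(DISTINCT c.name) = 2
ORDER BY {cols}
"""

by_name = conn.execute(QUERY.format(cols="u.name")).fetchall()
by_id = conn.execute(QUERY.format(cols="u.id, u.name")).fetchall()

print(by_name)  # the two Anns are merged into one group
print(by_id)    # each user is counted separately
```

Grouping by name reports 'Ann' even though neither Ann is in both courses; grouping by id does not.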
I'd write:
SELECT MAX(u.name)
FROM users_courses uc
LEFT JOIN users u ON uc.users_id = u.id
WHERE uc.courses_id IN (1, 2)
GROUP BY uc.users_id
HAVING COUNT(0) = 2
;
For more complex conditions (for example requiring the user to be in certain classes but also not in certain classes such as "Science") this should also work:
SELECT MAX(u.name)
FROM users_courses uc
LEFT JOIN users u ON uc.users_id = u.id
GROUP BY uc.users_id
HAVING (
-- user enrolled exactly once in course 1
SUM(uc.courses_id = 1) = 1
-- user enrolled exactly once in course 2
AND SUM(uc.courses_id = 2) = 1
-- user enrolled in course 3, 0 times
AND SUM(uc.courses_id = 3) = 0
)
;
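SUM(uc.courses_id = 1) relies on a comparison evaluating to 0 or 1, which MySQL allows but not every engine accepts. A sketch of the same idea written with CASE expressions, which should be more portable, run here against in-memory SQLite. The enrolments for Mark (user 3) are hypothetical rows added so that someone satisfies the "in 1 and 2 but not in 3" condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE users_courses (users_id INTEGER, courses_id INTEGER);

INSERT INTO users VALUES (1, 'Joe'), (2, 'Mary'), (3, 'Mark');
INSERT INTO users_courses VALUES (1, 1), (1, 2), (1, 3), (2, 1),
                                 (3, 1), (3, 2);   -- hypothetical rows for Mark
""")

# One conditional count per course of interest.
matches = conn.execute("""
SELECT u.name
FROM users_courses uc
JOIN users u ON uc.users_id = u.id
GROUP BY u.id, u.name
HAVING SUM(CASE WHEN uc.courses_id = 1 THEN 1 ELSE 0 END) = 1
   AND SUM(CASE WHEN uc.courses_id = 2 THEN 1 ELSE 0 END) = 1
   AND SUM(CASE WHEN uc.courses_id = 3 THEN 1 ELSE 0 END) = 0
""").fetchall()

print(matches)
```

Joe is excluded here because he is also enrolled in course 3, and Mary because she is not in course 2.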
I've run into an issue where every time I attempt to use GROUP BY, H2 informs me that I need to add certain column names into the GROUP BY clause because, based on my research, it's unclear to H2 how to sort columns with non-repeating data.
Here's an example to elaborate:
Person table
+------------+------------+
| ID | Name |
+============+============+
| 1 | John |
+------------+------------+
| 2 | Jane |
+------------+------------+
Pet table
+------------+------------+------------+------------+
| ID | PERSON_ID | NAME | BIRTHDATE |
+============+============+============+============+
| 1 | 1 | Rufus | 2012 |
+------------+------------+------------+------------+
| 2 | 1 | Ben | 2014 |
+------------+------------+------------+------------+
Let's say I want all the oldest pets belonging to John.
SELECT PERSON.NAME, PET.NAME, PET.BIRTHDATE FROM PERSON
INNER JOIN PET ON PET.PERSON_ID = PERSON.ID
GROUP BY PERSON.NAME
ORDER BY PET.BIRTHDATE ASC
This would work perfectly in MySQL because it will simply group by PERSON.NAME and, by default, select the first record in the set. However, in H2 it needs to have aggregation such as MAX, MIN, etc.
The problem, as you can see in this example, is that you could use MIN to get the BIRTHDATE ordered correctly but there does not appear to be any aggregation function available for sorting NAME based on the oldest BIRTHDATE?
If you want the oldest pets, I would recommend:
SELECT p.NAME, pt.NAME, pt.BIRTHDATE
FROM PERSON p INNER JOIN
PET pt
ON pt.PERSON_ID = p.ID
WHERE pt.BIRTHDATE = (SELECT MIN(pt2.BIRTHDATE)
FROM pet pt2
WHERE pt2.PERSON_ID = PT.PERSON_ID
);
This explicitly selects the pet or pets (for each person) that have the earliest birth year. No aggregation is necessary.
You can also phrase this with JOINs only in the FROM:
SELECT p.NAME, pt.NAME, pt.BIRTHDATE
FROM PERSON p INNER JOIN
PET pt
ON pt.PERSON_ID = p.ID JOIN
(SELECT PERSON_ID, MIN(pt2.BIRTHDATE) as MINBT
FROM pet pt2
GROUP BY pt2.PERSON_ID
) pt2
ON pt2.PERSON_ID = pt.PERSON_ID AND
pt2.MINBT = pt.BIRTHDATE;
You can always resort to NOT EXISTS in such cases: if the person has no pet with an earlier birthdate, then that pet is the oldest. If two pets share the earliest birthdate for a person, both are selected:
SELECT p.NAME, q.NAME, q.BIRTHDATE
FROM PERSON p
INNER JOIN PET q ON q.PERSON_ID = p.ID AND NOT EXISTS (
SELECT * FROM PET WHERE PERSON_ID = p.ID AND BIRTHDATE < q.BIRTHDATE
)
ORDER BY q.BIRTHDATE ASC
If you insist on GROUP BY you can do it like this:
SELECT a.name, b.name, b.BIRTHDATE FROM (
SELECT p.id, MIN(q.BIRTHDATE) birthdate FROM PERSON p
INNER JOIN PET q ON q.PERSON_ID = p.ID
GROUP BY p.ID
) o INNER JOIN PERSON a ON a.ID = o.ID
INNER JOIN PET b ON b.PERSON_ID = a.ID AND b.BIRTHDATE = o.BIRTHDATE
ORDER BY b.BIRTHDATE
If you can use WITH, the query can be written more simply.
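A minimal sketch of what the WITH form might look like, assuming the database supports common table expressions. It is run here against in-memory SQLite rather than H2, and the CTE name oldest and the column alias MIN_BIRTHDATE are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PERSON (ID INTEGER PRIMARY KEY, NAME TEXT);
CREATE TABLE PET (ID INTEGER PRIMARY KEY, PERSON_ID INTEGER,
                  NAME TEXT, BIRTHDATE INTEGER);

INSERT INTO PERSON VALUES (1, 'John'), (2, 'Jane');
INSERT INTO PET VALUES (1, 1, 'Rufus', 2012), (2, 1, 'Ben', 2014);
""")

# The CTE computes the earliest birthdate per person once;
# the main query keeps only the pets that match it.
oldest_pets = conn.execute("""
WITH oldest AS (
    SELECT PERSON_ID, MIN(BIRTHDATE) AS MIN_BIRTHDATE
    FROM PET
    GROUP BY PERSON_ID
)
SELECT p.NAME, q.NAME, q.BIRTHDATE
FROM PERSON p
INNER JOIN PET q ON q.PERSON_ID = p.ID
INNER JOIN oldest o ON o.PERSON_ID = q.PERSON_ID
                   AND o.MIN_BIRTHDATE = q.BIRTHDATE
ORDER BY q.BIRTHDATE
""").fetchall()

print(oldest_pets)
```

Every selected column is either a plain column of the joined rows or comes from the CTE, so the outer query needs no GROUP BY.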
All, I've found out that my users have been entering customer names all wrong. Below is an example of how they are entering them. I guess they thought they needed an account for each residence this guy owns. I have similar entries as well, where the fake middle initial comes before the last name. If I wanted to pull a list of customers that share names and emails, how would I go about it? I've already tried the query included below my example data, but it misses results like records 1 and 2 in the example. It does return the other duplicates I want, just not records like those.
Example:
ID | first Name | last Name | email          | Residence     |
---+------------+-----------+----------------+---------------+
1  | Bill A     | Bob       | bill@bob.com   | 1-2 broad st  |
2  | Bill B     | Bob       | bill@bob.com   | 1-3 broad st  |
3  | Fred       | Jones     | f.jones@me.com | 1 example st  |
4  | Fred       | Jones     | f.jones@me.com | 200 South ave |
5  | Alex       | Man       | Manley@grt.com | 25 N Main st  |
6  | Alex       | Man       | Manley@grt.com | 39 Front st   |
Query:
SELECT C.ID, R.Customer_ID , C.orgName, C.fName, C.lName, C.email, R.hNumber, R.street, R.aNumber, R.city
FROM Customer C
LEFT JOIN Residence R ON C.ID = R.Customer_ID
JOIN (
SELECT X.fName, X.lName
FROM Customer X
GROUP BY X.fName, X.lName
HAVING COUNT(*) > 1
) X ON X.fName = C.fName AND X.lName = C.lName
ORDER BY C.fName, C.lName
You can use (at least for MySQL):
SELECT C.ID, R.Customer_ID , C.orgName, C.fName, C.lName, C.email,
R.hNumber, R.street, R.aNumber, R.city
FROM Customer C
LEFT JOIN Residence R ON C.ID = R.Customer_ID
JOIN Customer C1 on C.ID <> C1.id
LEFT JOIN Residence R1 ON C1.ID = R1.Customer_ID
where
(C1.fName = C.fName AND C1.lName = C.lName)
or C1.email = C.email
-- or <whatever else you like to compare, e.g. same address + same last name>
group by C.ID
or, more general,
SELECT C.ID, R.Customer_ID , C.orgName, C.fName, C.lName, C.email,
R.hNumber, R.street, R.aNumber, R.city
FROM Customer C
LEFT JOIN Residence R ON C.ID = R.Customer_ID
where exists (
select * from
Customer C1
LEFT JOIN Residence R1 ON C1.ID = R1.Customer_ID
where
C.ID <> C1.id
and (
(C1.fName = C.fName AND C1.lName = C.lName)
or C1.email = C.email
-- or <whatever else you like to compare, e.g. same address + same last name>
)
)
Of course, this only gives you a limited duplicate check, especially if someone is intentionally trying to bypass it (e.g. in a shop system; there are tools and procedures to help you with that).
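For the sample rows in the question specifically, the original query misses records 1 and 2 because it groups on fName and lName, and 'Bill A' and 'Bill B' never fall into the same group. A sketch of grouping on lName and email instead, run against in-memory SQLite. Only the Customer columns needed for the check are modelled, and customer 7 is a hypothetical non-duplicate added for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (ID INTEGER PRIMARY KEY, fName TEXT, lName TEXT, email TEXT);
INSERT INTO Customer VALUES
  (1, 'Bill A', 'Bob',   'bill@bob.com'),
  (2, 'Bill B', 'Bob',   'bill@bob.com'),
  (3, 'Fred',   'Jones', 'f.jones@me.com'),
  (4, 'Fred',   'Jones', 'f.jones@me.com'),
  (5, 'Alex',   'Man',   'Manley@grt.com'),
  (6, 'Alex',   'Man',   'Manley@grt.com'),
  (7, 'Sam',    'Lee',   'sam@lee.com');  -- hypothetical non-duplicate
""")

# Group on the fields that are entered consistently (last name and email),
# ignoring the first name where the fake initial was added.
duplicate_ids = [row[0] for row in conn.execute("""
SELECT C.ID
FROM Customer C
JOIN (
    SELECT lName, email
    FROM Customer
    GROUP BY lName, email
    HAVING COUNT(*) > 1
) X ON X.lName = C.lName AND X.email = C.email
ORDER BY C.ID
""")]

print(duplicate_ids)
```

This would not catch the variant where the fake initial is attached to the last name; grouping on email alone would, at the cost of more false positives.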
I don't think there is an automatic way. Each approach will probably involve manually identifying a pattern that has been used and correcting for it, for example with a large CASE statement, which isn't that "automatic".
The closest would be to use SOUNDEX to tell if the names sound the same: http://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_soundex
If you can use another programming language then I'd recommend something like http://php.net/manual/en/function.similar-text.php but it will be computationally heavy.
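As an illustration of the "another programming language" route, here is a sketch in Python that uses difflib.SequenceMatcher from the standard library in place of PHP's similar_text (a different algorithm, but the same kind of similarity score). It compares full names only between customers that share an email, which keeps the number of pairwise comparisons down. The 0.8 threshold and customer 7 are assumptions made for the example, not tuned values.

```python
from difflib import SequenceMatcher
from itertools import combinations

# (ID, first name, last name, email); row 7 is a hypothetical non-duplicate.
customers = [
    (1, "Bill A", "Bob", "bill@bob.com"),
    (2, "Bill B", "Bob", "bill@bob.com"),
    (3, "Fred", "Jones", "f.jones@me.com"),
    (4, "Fred", "Jones", "f.jones@me.com"),
    (5, "Alex", "Man", "Manley@grt.com"),
    (6, "Alex", "Man", "Manley@grt.com"),
    (7, "Sam", "Lee", "sam@lee.com"),
]

THRESHOLD = 0.8  # assumed cut-off; adjust after inspecting real data


def name_similarity(a, b):
    """Similarity of two full names, between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


flagged = []
for c1, c2 in combinations(customers, 2):
    same_email = c1[3].lower() == c2[3].lower()
    if same_email:
        score = name_similarity(f"{c1[1]} {c1[2]}", f"{c2[1]} {c2[2]}")
        if score >= THRESHOLD:
            flagged.append((c1[0], c2[0]))

print(flagged)
```

Comparing every pair is quadratic in the number of customers, so on a large table it helps to pre-filter in SQL (for example by email, as here) before scoring names.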
Say I have this data from two tables:
Student Table columns:
id | name
Course Table columns:
id | code | name
and I want to use the Student.id AS Student and Course.id AS Course
to get the following:
Student | Course
-----------------
1 | C
1 | B
1 | A
2 | F
2 | B
2 | A
3 | C
3 | B
3 | F
How would I query it so it will return only the Students with a Course C and their other Courses like below:
Student | Course
-----------------
1 | C
1 | B
1 | A
3 | C
3 | B
3 | F
?
I have tried :
SELECT Student.id, Course.code FROM Course
INNER JOIN Student ON Course.student = Student.id
WHERE Course.code = 'C'
but I got only
Student | Course
-----------------
1 | C
3 | C
SELECT s.id, c.code
FROM Course c
INNER JOIN Student s
ON c.student = s.id
WHERE EXISTS
(
SELECT 1
FROM Course c1
WHERE c.student = c1.student
AND c1.code = 'C'
)
The most efficient approach to this problem is usually an inline view and a JOIN operation, although there are several ways to get an equivalent result.
SELECT Student.id
, Course.code
FROM ( SELECT c.Student
FROM Course c
WHERE c.code = 'C'
GROUP BY c.Student
) o
JOIN Course
ON Course.Student = o.Student
JOIN Student
ON Student.id = Course.Student
Here, we're using an inline view (aliased as o) to get a list of Student taking course code = 'C'.
(NOTE: the query in my answer is based on your original query. If there's a foreign key definition between Course and Student, and we only need to return the Student.id, we could improve performance by omitting the join to Student, and return Course.Student AS id in place of Student.id in the SELECT list.)
Here the first JOIN selects only those students which have course C, and second JOIN gives you all the courses for each of those students.
SELECT st.id, c2.code FROM
Student st
JOIN Course c ON c.student = st.id AND c.code = "C"
JOIN Course c2 ON c2.student = st.id
You actually don't even need two tables here, because both the student and the course code are available in the Course table. Just JOIN it to itself:
SELECT c2.student, c2.code FROM
Course c JOIN Course c2 ON c.student = c2.student
WHERE c.code = "C"
Here the WHERE clause keeps the student ids that have course C, and the JOIN then finds all of their courses.
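A quick way to check the self-join against the sample data is to run it on an in-memory SQLite database. This sketch assumes, as the queries in this thread do, that the Course table carries a student column; the name column is left out because it plays no part in the result. DISTINCT is added as a precaution in case a student has course C recorded more than once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Course (id INTEGER PRIMARY KEY, code TEXT, student INTEGER)"
)
# (student, code) pairs from the question's first result table.
enrolments = [
    (1, "C"), (1, "B"), (1, "A"),
    (2, "F"), (2, "B"), (2, "A"),
    (3, "C"), (3, "B"), (3, "F"),
]
conn.executemany("INSERT INTO Course (student, code) VALUES (?, ?)", enrolments)

# c picks the students that have course C; c2 lists all their courses.
rows = conn.execute("""
SELECT DISTINCT c2.student, c2.code
FROM Course c
JOIN Course c2 ON c.student = c2.student
WHERE c.code = 'C'
ORDER BY c2.student, c2.code
""").fetchall()

print(rows)
```

Student 2 has no course C, so none of that student's rows appear.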
edited to make clearer - many apologies for the confusion of the original example
I have the following table structure representing married couples:
id | Person | Spouse
______________________
1 | Mary | John
2 | John | Mary
3 | Katy | Bob
4 | Bob | Katy
5 | Mary | John
6 | John | Mary
In this example Mary is married to John, Katy to Bob and a different Mary is married to a different John.
How can I retrieve these pairs of married couples?
I have got close with this:
SELECT
p.id id1,
q.id id2
FROM
people p
INNER JOIN people q ON
p.person = q.spouse AND
q.person = p.spouse AND
p.id < q.id
ORDER BY p.id
However this returns:
1 | 2 (1st Mary & 1st John)
1 | 6 (1st Mary & 2nd John) *problem*
2 | 5 (1st John & 2nd Mary) *problem*
3 | 4 (Katy & Bob)
5 | 6 (2nd Mary & 2nd John)
How can I make sure the 1st Mary and 1st John are only married once (i.e. remove the problem rows above)?
Many thanks
Here's the SQL to create the example:
CREATE TABLE people
(`id` int, `person` varchar(7), `spouse` varchar(7))
;
INSERT INTO people
(`id`, `person`, `spouse`)
VALUES
(1, 'Mary', 'John'),
(2, 'John', 'Mary'),
(3, 'Katy', 'Bob'),
(4, 'Bob', 'Katy'),
(5, 'Mary', 'John'),
(6, 'John', 'Mary')
;
SELECT
p.id id1,
q.id id2
FROM
people p
INNER JOIN people q ON
p.person = q.spouse AND
q.person = p.spouse AND
p.id < q.id
ORDER BY p.id
;
I'll give it a try:
SELECT
p.id AS id1,
q.id AS id2
FROM
people AS p
JOIN people AS q ON
p.person = q.spouse AND
q.person = p.spouse AND
p.id < q.id
JOIN (SELECT
p.id, COUNT(*) AS rnk
FROM
people AS p
INNER JOIN people AS p2 ON
p.person = p2.person AND
p.spouse = p2.spouse AND
p.id >= p2.id
GROUP BY p.id
) AS x ON
x.id = p.id
JOIN (SELECT
p.id, COUNT(*) AS rnk
FROM
people AS p
INNER JOIN people AS p2 ON
p.person = p2.person AND
p.spouse = p2.spouse AND
p.id >= p2.id
GROUP BY p.id
) AS y ON
y.id = q.id AND
y.rnk = x.rnk ;
And another one:
SELECT
p.id AS id1,
q.id AS id2
FROM
people AS p
JOIN people AS q ON
p.person = q.spouse AND
q.person = p.spouse
JOIN people AS p2 ON
p.person = p2.person AND
p.spouse = p2.spouse AND
p.id >= p2.id
JOIN people AS q2 ON
q.person = q2.person AND
q.spouse = q2.spouse AND
q.id >= q2.id
WHERE
p.id < q.id
GROUP BY
p.id, q.id
HAVING
COUNT(DISTINCT p2.id) = COUNT(DISTINCT q2.id) ;
Both tested at SQL-Fiddle
It would be much simpler with window functions, which MySQL only gained in version 8.0 (almost all other DBMSs have had them for longer). Tested at Postgres fiddle:
WITH cte AS
( SELECT
id, person, spouse,
ROW_NUMBER() OVER( PARTITION BY person, spouse
ORDER BY id )
AS rn
FROM
people
)
SELECT
p.id AS id1,
q.id AS id2
FROM
cte AS p
JOIN cte AS q ON
p.person = q.spouse AND
q.person = p.spouse AND
p.rn = q.rn AND
p.id < q.id ;
In this example Mary is married to John, Katy to Bob and a different Mary is married to Richard.
Nothing in the data structure you show makes it possible to differentiate between those two "Marys", because there is no difference between them.
Both are just the text literal Mary. If you want to differentiate between different people who might have the same name, then you need another criterion, and a unique one at that (for example, the id of the database record for each individual person).
Your database structure is wrong.
People like Mary, John, etc. do not have an identity.
Some heuristic query might help, but it is not a reliable solution.
So, please, improve your data structure.
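One possible shape for such an improved structure, as a sketch only: give each person a row with a unique id and store the spouse as a reference to that id instead of a name. The table name person, the column spouse_id and the choice of which Mary is married to which John are assumptions made for illustration. It is run here against in-memory SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    spouse_id INTEGER REFERENCES person(id)   -- hypothetical column
);
-- Assumed pairing: 1-2, 3-4, 5-6.
INSERT INTO person VALUES
  (1, 'Mary', 2), (2, 'John', 1),
  (3, 'Katy', 4), (4, 'Bob',  3),
  (5, 'Mary', 6), (6, 'John', 5);
""")

# With ids as the link, namesakes can no longer be confused.
couples = conn.execute("""
SELECT p.id, q.id
FROM person p
JOIN person q ON q.id = p.spouse_id
             AND q.spouse_id = p.id
WHERE p.id < q.id
ORDER BY p.id
""").fetchall()

print(couples)
```

The p.id < q.id condition keeps one row per couple, just as in the question's query, but the join no longer depends on names being unique.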
Not very elegant, but works:
SELECT p.id, q.id
FROM people p
INNER JOIN people q ON
p.person1 = q.person2 and
q.person1 = p.person2
which in fact uses the existence of an inverted row as a selector
There are lots of ways of doing it. However, one of the most important reasons for using a database is that it holds lots of data, and you should rarely need to write a query that retrieves lots of data. Except in very unusual circumstances, and for homework assignments, the results should be filtered according to some criteria. Hence the most appropriate solution depends on what else you add to the query later.
But here's a couple of examples of how to get the unique pairs:
SELECT a, b, GROUP_CONCAT(id)
FROM (SELECT id
, IF (person>=spouse, person, spouse) as a
, IF (person>=spouse, spouse, person) as b
FROM yourtable ) AS pairs
GROUP BY a,b;
SELECT id, person, spouse
FROM yourtable s1
WHERE NOT EXISTS ( SELECT 1
FROM yourtable s2
WHERE s2.id>s1.id
AND s1.person=s2.spouse
AND s1.spouse=s2.person);
(there are several other solutions).