Is INTERSECT preferred over subquery? - mysql

I am working on this question:
Find all students who do not appear in the Likes table (as a student who likes or is liked) and return their names and grades. Sort by grade, then by name within each grade.
I proposed doing the following, getting all people who don't have Likes and intersecting those with the people who don't like anyone:
SELECT name, grade
FROM Highschooler h1
LEFT JOIN Likes l1
ON (l1.ID1 = h1.ID)
WHERE l1.ID1 IS NULL
INTERSECT
SELECT name, grade
FROM Highschooler h1
LEFT JOIN Likes l1
ON (l1.ID2 = h1.ID)
WHERE l1.ID2 IS NULL
ORDER BY grade, name
An alternative way to do this is with a subquery (as I found online)
select name, grade from Highschooler H1
where H1.ID not in (select ID1 from Likes union select ID2 from Likes)
order by grade, name;
Which way is preferred? I think my method is more readable.

Like Adrien said, both are acceptable ways of doing it.
I think your way is more readable because of the capitalized keywords and better indentation, not because of the query itself.
This is how I would do it, for example:
SELECT name, grade
FROM Highschooler H1
LEFT JOIN Likes AS L1 ON L1.ID1 = H1.ID OR L1.ID2 = H1.ID
WHERE L1.ID1 IS NULL
ORDER BY grade, name
This case would be called an Anti-Join. Read more about those here
https://explainextended.com/2009/09/18/not-in-vs-not-exists-vs-left-join-is-null-mysql/
Here is another nice blog post about the different kind of joins with some diagrams to accompany them:
http://blog.jooq.org/2015/10/06/you-probably-dont-use-sql-intersect-or-except-often-enough/

No, your query is not very readable. Read it in plain English: 1. Make a large list of all pupils and their likes. 2. If no likes record is found for a pupil, keep the pupil record nonetheless. 3. Then remove all records that do contain likes (thus only keeping those pupils who don't have likes). 4.-6. Do the same with the other person in the likes table. 7. Intersect the two result sets.
The original task was only: Find pupils that don't exist in the likes table.
select name, grade
from highschooler
where not exists
(
select *
from likes
where likes.id1 = highschooler.id
or likes.id2 = highschooler.id
)
order by grade, name;
Keep your queries as simple as possible. SQL is made to read more or less like you would word the task in English. It's not always possible to formulate a task in such simple form, but you can always try :-)
The anti-join pattern you are using in your intersect query is a trick to overcome weaknesses in young DBMS that don't deal well with IN and EXISTS yet. You should use tricks only when necessary, not when dealing with such a simple task as the one given.
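To compare the variants discussed in this thread side by side, here is a minimal sketch that runs them against a tiny in-memory SQLite database through Python's sqlite3 module. The sample rows are invented for illustration, and SQLite stands in for MySQL here, so treat it as a sanity check rather than a statement about MySQL behavior. One difference worth noting: the INTERSECT version compares (name, grade) pairs rather than IDs, so two different students who share a name and grade could be merged or matched against each other, while the other three variants work on ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Highschooler (ID INTEGER PRIMARY KEY, name TEXT, grade INTEGER);
    CREATE TABLE Likes (ID1 INTEGER, ID2 INTEGER);
    -- Invented sample data.
    INSERT INTO Highschooler VALUES
        (1, 'Ann', 9), (2, 'Bob', 9), (3, 'Cat', 10), (4, 'Dan', 10), (5, 'Eve', 11);
    -- Ann likes Bob, Cat likes Ann; Dan and Eve never appear in Likes.
    INSERT INTO Likes VALUES (1, 2), (3, 1);
""")

queries = {
    "intersect": """
        SELECT name, grade FROM Highschooler h1
        LEFT JOIN Likes l1 ON l1.ID1 = h1.ID
        WHERE l1.ID1 IS NULL
        INTERSECT
        SELECT name, grade FROM Highschooler h1
        LEFT JOIN Likes l1 ON l1.ID2 = h1.ID
        WHERE l1.ID2 IS NULL
        ORDER BY grade, name
    """,
    "not_in": """
        SELECT name, grade FROM Highschooler
        WHERE ID NOT IN (SELECT ID1 FROM Likes UNION SELECT ID2 FROM Likes)
        ORDER BY grade, name
    """,
    "left_join_or": """
        SELECT name, grade FROM Highschooler h
        LEFT JOIN Likes l ON l.ID1 = h.ID OR l.ID2 = h.ID
        WHERE l.ID1 IS NULL
        ORDER BY grade, name
    """,
    "not_exists": """
        SELECT name, grade FROM Highschooler h
        WHERE NOT EXISTS (
            SELECT * FROM Likes l WHERE l.ID1 = h.ID OR l.ID2 = h.ID
        )
        ORDER BY grade, name
    """,
}

# Run every variant and collect its rows under the variant's label.
results = {label: conn.execute(sql).fetchall() for label, sql in queries.items()}
for label, rows in results.items():
    print(label, rows)
```

On this data all four variants return the same two students, so the choice between them comes down to readability and how a given DBMS optimizes each form.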

Related

SQL query for finding the two teachers who have access to most number of course rooms

I am a newbie with sql queries so I have no clue how to create an accurate SQL.
I tried my best but I literally cannot find any similar example online, please help me out here.
The Data Schema as follows:
User(userID, username, password, email, userType)
Course(courseID, courseTitle)
Enroll(userID, courseID)
course rooms that users can access; note that users include all sorts of users such as teachers and administrators
Material(materialID, materialText, teacherUserID, courseID)
Question:
Find the two teachers who have access to most number of course rooms. Should there be a tie break, choose the ones with smaller user IDs. List the user ID, email, and the number of course rooms that s/he can access for the two teachers.
The problems are:
SELECT userid, email, MIN (userid)
How can I specifically find the 2 smaller user IDs and which table should I select for finding out the course rooms? Do I have to use COUNT in this case?
FROM user JOIN enroll ON (user.usertype=enroll.userid)
As the enroll_table cannot identify whether the userID is teacher or administrator, if I use JOIN, can I find the result that I want?
WHERE....
I don't know how to specifically find two teachers AND make sure they have tie break
Do I have to use GROUP BY and ORDER BY as well?
Just saw your attempted query. Look up how to format the code, so it stands out from the text. But you started about right. While we don't have the full info, try the following:
select user.userid,user.username, count(*) as cnt
from enroll
join user on user.userid=enroll.userid
where user.usertype="teacher"
group by user.userid
order by cnt DESC;
So Mary teaches three courses and comes out ahead. Since you want only the top two you can add the line LIMIT 2 to just get the two most prolific teachers.
The part that is hardest to understand for beginners is the group by clause, which generates aggregation, and which requires something like a count(*) clause in the first line. Read up on this separately and make yourself an even smaller example so you understand this well.
kenken068 also asked for a "tie break" using the userid so maybe the "order by" should be
order by cnt DESC, userid ASC;
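Putting the pieces together (filter on user type, GROUP BY, order by count with userid as tie-breaker, LIMIT 2), a minimal sketch on an in-memory SQLite database could look like this. The sample rows and the 'teacher' value for userType are assumptions, since the real contents of the tables are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "User" (userID INTEGER PRIMARY KEY, email TEXT, userType TEXT);
    CREATE TABLE Enroll (userID INTEGER, courseID TEXT);
    -- Invented sample data; the userType values are an assumption.
    INSERT INTO "User" VALUES
        (1, 'a@example.com', 'teacher'),
        (2, 'b@example.com', 'teacher'),
        (3, 'c@example.com', 'teacher'),
        (4, 'd@example.com', 'administrator');
    INSERT INTO Enroll VALUES
        (1, 'c1'), (1, 'c2'),
        (2, 'c1'), (2, 'c2'), (2, 'c3'),
        (3, 'c1'), (3, 'c2'),
        (4, 'c1'), (4, 'c2'), (4, 'c3'), (4, 'c4');
""")

top_two = conn.execute("""
    SELECT u.userID, u.email, COUNT(DISTINCT e.courseID) AS rooms
    FROM "User" AS u
    JOIN Enroll AS e ON e.userID = u.userID
    WHERE u.userType = 'teacher'          -- the administrator is filtered out
    GROUP BY u.userID, u.email
    ORDER BY rooms DESC, u.userID ASC     -- smaller userID wins a tie
    LIMIT 2
""").fetchall()

print(top_two)
```

Teachers 1 and 3 tie on two course rooms each, and the secondary sort on userID is what picks teacher 1 for the second slot.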
Problem 1?
Limit to 2 based on the ordering by total, with userid as the tie-breaker.
Problem 2?
That info should be in User.userType
But then you need to know which userType is used for the teachers.
However, teachers have Material?
Problem 3?
See problem 1.
Problem 4?
Not always. But to calculate a total, a count is often used together with a group by.
SELECT
u.userID,
u.email,
COUNT(DISTINCT e.courseID) as TotalCourses
FROM `User` AS u
LEFT JOIN `Enroll` AS e
ON u.userID = e.userID
WHERE u.userID IN (SELECT DISTINCT teacherUserID FROM `Material`)
GROUP BY u.userID, u.email
ORDER BY TotalCourses DESC, u.userID
LIMIT 2
SELECT teacherUserID, COUNT(DISTINCT courseID) FROM Material
GROUP BY teacherUserID;
This will give you the number of distinct courses each teacher has material in.
Then simply sort the result in descending order with ORDER BY ... DESC,
and select the top 2 with the TOP keyword (SQL Server) or LIMIT 2 (MySQL).

Combining LIKE and EXISTS?

Here is the database I'm using: https://drive.google.com/file/d/1ArJekOQpal0JFIr1h3NXYcFVngnCNUxg/view?usp=sharing
Find the papers whose title contain the string 'data' and where at least one author is
from the department with deptnum 100. List the panum and title of these papers. You
must use the EXISTS operator. Ensure your query is case-insensitive.
I'm unsure how to output the total number of papers for each academic.
My attempt at this question:
SELECT panum, title
FROM department NATURAL JOIN paper
WHERE UPPER(title) LIKE ('%data%') AND EXISTS (SELECT deptnum FROM
department WHERE deptnum = 100);
This seems to come up empty. I'm not sure what I'm doing wrong, can LIKE and EXISTS be combined?
Thank you.
Don't use natural join! It is an abomination because it does not make use of explicitly declared foreign key relationships. Explicitly list your join keys, so the queries are more understandable and more maintainable.
That said, your subquery is the problem. I would expect a query more like this:
SELECT p.panum, p.title
FROM paper p
WHERE lower(p.title) LIKE '%data%' AND
EXISTS (SELECT 1
FROM authors a
WHERE a.author = p.author AND -- or whatever the column should be
a.deptnum = 100
);
Since they are requiring EXISTS, the operator needs to be applied to author, not department table. The query inside EXISTS needs to be correlated with the query on papers, so there should be no JOIN on the top level:
SELECT p.PANUM, p.TITLE
FROM paper p
WHERE LOWER(p.Title) LIKE '%data%' AND EXISTS (
SELECT *
FROM author a
JOIN academic ac ON ac.ACNUM=a.ACNUM
WHERE a.PANUM=p.PANUM AND ac.DEPTNUM=100
)
Note that since author table lacks DEPTNUM, you do need a join inside the EXISTS query to bring in a row of academic for its DEPTNUM column.
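As a quick check of the correlated EXISTS with a join inside it, here is a minimal sketch against an in-memory SQLite database. The table and column names follow the query above, but the sample rows are invented, since the linked database is not reproduced here. Note that SQLite's LIKE is already case-insensitive for ASCII letters by default, so LOWER() is redundant there; it is kept because it makes the intent portable to systems where LIKE is case-sensitive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE paper (panum INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE academic (acnum INTEGER PRIMARY KEY, deptnum INTEGER);
    CREATE TABLE author (panum INTEGER, acnum INTEGER);
    -- Invented sample data.
    INSERT INTO paper VALUES
        (1, 'Big Data Systems'), (2, 'data mining'),
        (3, 'Compilers'), (4, 'Database Theory');
    INSERT INTO academic VALUES (10, 100), (11, 200), (12, 100);
    -- Paper 4 has two authors in department 100; EXISTS still returns it once.
    INSERT INTO author VALUES (1, 10), (2, 11), (3, 10), (4, 10), (4, 12);
""")

papers = conn.execute("""
    SELECT p.panum, p.title
    FROM paper p
    WHERE LOWER(p.title) LIKE '%data%'
      AND EXISTS (
          SELECT 1
          FROM author a
          JOIN academic ac ON ac.acnum = a.acnum
          WHERE a.panum = p.panum      -- correlation with the outer row
            AND ac.deptnum = 100
      )
    ORDER BY p.panum
""").fetchall()

print(papers)
```

Paper 2 matches on title but has no author in department 100, and paper 3 has such an author but no 'data' in its title, so only papers 1 and 4 come back.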
The phrase UPPER(title) LIKE ('%data%') is never going to find any rows, since an uppercase version of whatever is in title will never contain the lowercase letters data.
select p.TITLE, p.PANUM from PAPER p where lower(p.TITLE) like '%data%'
AND EXISTS(
SELECT * FROM AUTHOR a join ACADEMIC d
on d.ACNUM=a.ACNUM where d.DEPTNUM=100 AND a.PANUM=p.PANUM)

Using two SELECT statements in SQL?

I have two tables, one is 'points' which contains ID and points. The other table is 'name' and contains ID, Forename, and Surname.
I'm trying to search for the total number of points someone with the forename Anne, and surname Brown, scored.
Would I have to do a join? If so, is this correct?
SELECT Name.Forename, Name.Surname
FROM Name
FULL OUTER JOIN Points
ON Name.ID=Points.ID
ORDER BY Name.Forename;
But then I also have to add the points, so would I have to use:
SELECT SUM (`points`) FROM Points
Then there is also the WHERE statement so that it only searches for the person with this name:
WHERE `Forename`="Anne" OR `Surname`="Brown";
So how does this all come together (based on the assumption that you do something like this)?
SELECT Name.ID, Forename, Surname, SUM(Points)
FROM Name
INNER JOIN Points ON Name.ID = Points.ID
/* Optional WHERE clause:
WHERE Name.ForeName = 'Anne' AND Name.Surname='Brown'
*/
GROUP BY Name.ID, Name.Forename, Name.Surname
So, first, your answer:
select sum(points) as Points
from
Points
inner join Name on Name.ID = Points.ID
where
Name.Forename ='Anne' and Name.SurName='Brown'
Secondly, a FULL OUTER JOIN is the wrong tool here, since it returns all rows from both tables, even those without matches (and MySQL does not support FULL OUTER JOIN in any case). If you only want rows that match in both tables (A & B), use an INNER JOIN.
Thirdly, here is the MySQL reference documentation on SQL statement syntax. Please consider reading up on it and familiarizing yourself at least with the basics like JOINs, aggregation (including GROUP BY and HAVING), WHERE clauses, UNIONs, some of the basic functions provided, and perhaps subqueries. Having a good base in those will get you 99% of the way through most MySQL queries.
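To see why the filter needs AND rather than the OR from the question, here is a minimal sketch on an in-memory SQLite database. The sample rows are invented, and it assumes, as the question does, that Points.ID refers to Name.ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Name (ID INTEGER PRIMARY KEY, Forename TEXT, Surname TEXT);
    CREATE TABLE Points (ID INTEGER, points INTEGER);
    -- Invented sample data.
    INSERT INTO Name VALUES
        (1, 'Anne', 'Brown'), (2, 'Anne', 'Smith'),
        (3, 'John', 'Brown'), (4, 'John', 'Smith');
    INSERT INTO Points VALUES (1, 5), (1, 7), (2, 3), (3, 10), (4, 100);
""")

# AND: only rows belonging to the person who is both 'Anne' and 'Brown'.
total_and = conn.execute("""
    SELECT SUM(p.points)
    FROM Points p
    INNER JOIN Name n ON n.ID = p.ID
    WHERE n.Forename = 'Anne' AND n.Surname = 'Brown'
""").fetchone()[0]

# OR: also sums the points of every other Anne and every other Brown.
total_or = conn.execute("""
    SELECT SUM(p.points)
    FROM Points p
    INNER JOIN Name n ON n.ID = p.ID
    WHERE n.Forename = 'Anne' OR n.Surname = 'Brown'
""").fetchone()[0]

print(total_and, total_or)
```

With OR, the points of Anne Smith and John Brown are added to Anne Brown's total, which is not what was asked for.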
You can write it like this with a subquery.
SELECT Name.Forename, Name.Surname, Name.ID,
(SELECT SUM(`points`) FROM Points where Points.ID = Name.ID) as total_points
FROM Name ORDER BY Name.Forename;
However, I would like to point out that your linking of the tables appears to be incorrect. I cannot be completely sure without seeing the tables, but I imagine it should be where points.userid = name.id

Remove duplicates from LEFT JOIN query

I am using the following JOIN statement:
SELECT *
FROM students2014
JOIN notes2014 ON (students2014.Student = notes2014.NoteStudent)
WHERE students2014.Consultant='$Consultant'
ORDER BY students2014.LastName
to retrieve a list of students (students2014) and corresponding notes for each student stored in (notes2014).
Each student has multiple notes within the notes2014 table and each note has an ID that corresponds with each student's unique ID. The above statement is returning the list of students but duplicating every student that has more than one note. I only want to display the latest note for each student (which is determined by the highest note ID).
Is this possible?
You need another join based on the MAX noteId you got from your select.
Something like this should do it (not tested; next time I'd recommend pasting a link to http://sqlfiddle.com/ with your table structure and some sample data):
SELECT *
FROM students s
LEFT JOIN (
SELECT MAX(NoteId) max_id, NoteStudent
FROM notes
GROUP BY NoteStudent
) aux ON aux.NoteStudent = s.Student
LEFT JOIN notes n2 ON aux.max_id = n2.NoteId
If I may say so, the fact that a table is called students2014 is a big code smell. You'd be much better off with a students table and a year field, for many reasons (just a couple: you won't need to change your DB structure every year, querying across years is much, much easier, etc, etc). Perhaps you "inherited" this, but I thought I'd mention it.
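The join on the per-student MAX(NoteId) can be tried out with a minimal sketch on an in-memory SQLite database. It uses the simplified table names students and notes from the query above; the sample rows and the NoteText column are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (Student INTEGER PRIMARY KEY, LastName TEXT);
    -- NoteText is a hypothetical column, added only to have something to show.
    CREATE TABLE notes (NoteId INTEGER PRIMARY KEY, NoteStudent INTEGER, NoteText TEXT);
    INSERT INTO students VALUES (1, 'Adams'), (2, 'Baker'), (3, 'Clark');
    INSERT INTO notes VALUES (1, 1, 'first'), (2, 2, 'only'), (3, 1, 'second');
""")

latest = conn.execute("""
    SELECT s.LastName, n2.NoteText
    FROM students s
    LEFT JOIN (
        -- One row per student: the highest note ID that student has.
        SELECT MAX(NoteId) AS max_id, NoteStudent
        FROM notes
        GROUP BY NoteStudent
    ) aux ON aux.NoteStudent = s.Student
    LEFT JOIN notes n2 ON n2.NoteId = aux.max_id
    ORDER BY s.LastName
""").fetchall()

print(latest)
```

Each student appears exactly once: Adams with the newer of two notes, and Clark with NULL because the LEFT JOINs keep students who have no notes at all.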
GROUP the query by studentId and select the MAX of the noteId
Try :
SELECT
students2014.Student,
IFNULL(MAX(NoteId),0)
FROM students2014
LEFT JOIN notes2014 ON (students2014.Student = notes2014.NoteStudent)
WHERE students2014.Consultant='$Consultant'
GROUP BY students2014.Student
ORDER BY students2014.LastName

Getting object if count is less than a number

I have 2 simple tables - Firm and Groups. I also have a table FirmGroupsLink for making connections between them (connection is one to many).
Table Firm has attributes - FirmID, FirmName, City
Table Groups has attributes - GroupID, GroupName
Table FirmGroupsLink has attributes - FrmID, GrpID
Now I want to write a query that returns all firms that have fewer groups than #num, so I wrote:
SELECT FirmID, FirmName, City
FROM (Firm INNER JOIN FirmGroupsLink ON Firm.FirmID =
FirmGroupsLink.FrmID)
HAVING COUNT(FrmID)<#num
But it doesn't run. I am trying this in Microsoft Access, but it should eventually work on Sybase. Please show me what I'm doing wrong.
Thank you in advance.
In order to count properly, you need to specify which group you are counting by.
The having clause, and moreover the count can't work if you are not grouping.
Here you are counting by Firm. In fact, because you need to retrieve information about the Firm, you are grouping by FirmId, FirmName and City, so the query should look like this:
SELECT Firm.FirmID, Firm.FirmName, Firm.City
FROM Firm
LEFT OUTER JOIN FirmGroupsLink
ON Firm.FirmID = FirmGroupsLink.FrmID
GROUP BY Firm.FirmID, Firm.FirmName, Firm.City
HAVING COUNT(FrmID) < #num
Note that I replaced the INNER JOIN with a LEFT OUTER JOIN, because you might also want firms that don't belong to any group.
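The effect of the join type on the result can be seen with a minimal sketch on an in-memory SQLite database. The sample rows are invented, and the #num placeholder is passed as a bound parameter, here set to 2. SQLite stands in for Access and Sybase, so the parameter syntax will differ on those systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Firm (FirmID INTEGER PRIMARY KEY, FirmName TEXT, City TEXT);
    CREATE TABLE FirmGroupsLink (FrmID INTEGER, GrpID INTEGER);
    -- Invented sample data: firm 1 has three groups, firm 2 has one, firm 3 has none.
    INSERT INTO Firm VALUES (1, 'Acme', 'Oslo'), (2, 'Bolt', 'Riga'), (3, 'Core', 'Bern');
    INSERT INTO FirmGroupsLink VALUES (1, 1), (1, 2), (1, 3), (2, 1);
""")

num = 2  # stands in for #num

# LEFT OUTER JOIN keeps firms without groups; COUNT(l.FrmID) is 0 for them.
with_left_join = conn.execute("""
    SELECT f.FirmID, f.FirmName, f.City
    FROM Firm f
    LEFT OUTER JOIN FirmGroupsLink l ON f.FirmID = l.FrmID
    GROUP BY f.FirmID, f.FirmName, f.City
    HAVING COUNT(l.FrmID) < ?
    ORDER BY f.FirmID
""", (num,)).fetchall()

# INNER JOIN drops firms without groups before they can be counted.
with_inner_join = conn.execute("""
    SELECT f.FirmID, f.FirmName, f.City
    FROM Firm f
    INNER JOIN FirmGroupsLink l ON f.FirmID = l.FrmID
    GROUP BY f.FirmID, f.FirmName, f.City
    HAVING COUNT(l.FrmID) < ?
    ORDER BY f.FirmID
""", (num,)).fetchall()

print(with_left_join)
print(with_inner_join)
```

Counting l.FrmID rather than * matters with the outer join: COUNT(*) would report 1 for a firm with no groups, because the outer join still produces one row for it.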