Difference between these sql queries - mysql

I can't seem to understand why these two queries return different results for the following task: "Find names and grades of students who only have friends in the same grade. Return the result sorted by grade, then by name within each grade."
Tables here: https://lagunita.stanford.edu/c4x/DB/SQL/asset/socialdata.html
The first query:
SELECT DISTINCT h1.name, h1.grade
FROM Highschooler h1, Friend f, Highschooler h2
WHERE h1.ID = f.ID1 AND h2.ID = f.ID2 AND h1.grade = h2.grade
ORDER BY h1.grade, h1.name
The second query:
select name, grade from Highschooler
where ID not in (
select ID1 from Highschooler H1, Friend, Highschooler H2
where H1.ID = Friend.ID1 and Friend.ID2 = H2.ID and H1.grade <> H2.grade)
order by grade, name;
The second one returns the expected result, but not the first one. If anyone cares to clarify, Thanks.

The first query applies all three filters simultaneously and returns only the rows that match every one of them. The second query first runs the subquery, which collects the IDs that match the subquery condition, and then returns every ID that is not in that set. That also includes IDs for which H1.ID = Friend.ID1 and Friend.ID2 = H2.ID never hold true. You can try something like:
select name, grade from Highschooler
where ID in (select ID1 from Friend)
and ID not in (
select ID1 from Highschooler H1, Friend, Highschooler H2
where H1.ID = Friend.ID1 and Friend.ID2 = H2.ID and H1.grade <> H2.grade)
order by grade, name;

It can be standard NULL-related behavior. Demo:
create table tble (ID int, col int);
insert tble(ID, col)
values (1,1),(2,null),(3,2);
select *
from tble
where col=1;
select *
from tble
where ID not in (select t2.ID from tble t2 where t2.col<>1);
The two queries differ because select t2.ID from tble t2 where t2.col<>1 does not return ID 2: the predicate NULL <> 1 does not evaluate to TRUE. So the NOT IN query returns IDs 1 and 2, while col=1 returns only ID 1.
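To check the demo above without a MySQL server, here is a minimal sketch that runs the same statements through Python's built-in sqlite3 module. SQLite is only a stand-in here; the three-valued NULL logic shown is standard SQL, but the variable names are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tble (ID INTEGER, col INTEGER);
INSERT INTO tble (ID, col) VALUES (1, 1), (2, NULL), (3, 2);
""")

# Rows where col equals 1.
equals_one = [r[0] for r in conn.execute(
    "SELECT ID FROM tble WHERE col = 1 ORDER BY ID")]

# Rows whose ID is not among those with col <> 1.
# The row with col NULL is missing from the subquery, so it survives NOT IN.
not_in = [r[0] for r in conn.execute(
    "SELECT ID FROM tble WHERE ID NOT IN "
    "(SELECT t2.ID FROM tble t2 WHERE t2.col <> 1) ORDER BY ID")]

# A comparison with NULL yields NULL (None in Python), not TRUE or FALSE.
null_cmp = conn.execute("SELECT NULL <> 1").fetchone()[0]

print(equals_one, not_in, null_cmp)
```

The two result lists differ exactly by the row whose col is NULL.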

I just wanted to add further clarification on the first query explanations. The first query results in this:
SELECT DISTINCT h1.name, h1.grade FROM Highschooler h1, Friend f, Highschooler h2 WHERE h1.ID = f.ID1 AND h2.ID = f.ID2 AND h1.grade = h2.grade ORDER BY h1.grade, h1.name;
+-----------+-------+
| name | grade |
+-----------+-------+
| Cassandra | 9 |
| Gabriel | 9 |
| Jordan | 9 |
| Tiffany | 9 |
| Andrew | 10 |
| Brittany | 10 |
| Haley | 10 |
| Kris | 10 |
| Alexis | 11 |
| Gabriel | 11 |
| Jessica | 11 |
| John | 12 |
| Jordan | 12 |
| Kyle | 12 |
| Logan | 12 |
+-----------+-------+
15 rows in set (0,00 sec)
Since you are joining Highschooler to itself through Friend (selecting the same table twice), and one of your conditions is h1.grade = h2.grade, you retrieve every student who has at least one friend in the same grade. The only student you are not getting is Austin, who is the only one that doesn't have any friends in his grade.
The second query is explained in Radek's answer.
I hope this helps.
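To make the difference between the two readings concrete, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The four students and their friendships are hypothetical rows invented for this example, not the course data set, and friendships are stored in both directions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Highschooler (ID INTEGER PRIMARY KEY, name TEXT, grade INTEGER);
CREATE TABLE Friend (ID1 INTEGER, ID2 INTEGER);
-- Hypothetical rows for illustration only.
INSERT INTO Highschooler VALUES (1, 'Ann', 9), (2, 'Bob', 9), (3, 'Cid', 10), (4, 'Dee', 10);
-- Ann is friends with Bob (same grade) and Cid (different grade); Dee has no friends.
INSERT INTO Friend VALUES (1, 2), (2, 1), (1, 3), (3, 1);
""")

# First query: students with AT LEAST ONE friend in the same grade.
at_least_one_same = conn.execute("""
SELECT DISTINCT h1.name, h1.grade
FROM Highschooler h1, Friend f, Highschooler h2
WHERE h1.ID = f.ID1 AND h2.ID = f.ID2 AND h1.grade = h2.grade
ORDER BY h1.grade, h1.name
""").fetchall()

# Second query: students with NO friend in a different grade.
none_different = conn.execute("""
SELECT name, grade FROM Highschooler
WHERE ID NOT IN (
    SELECT ID1 FROM Highschooler H1, Friend, Highschooler H2
    WHERE H1.ID = Friend.ID1 AND Friend.ID2 = H2.ID AND H1.grade <> H2.grade)
ORDER BY grade, name
""").fetchall()

print(at_least_one_same)
print(none_different)
```

With these rows, Ann shows up only in the first result (she has a same-grade friend but also a different-grade one), and Dee shows up only in the second (she has no friends, so she has none in a different grade).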

Related

UNION records from two tables and favor fields that are not NULL (otherwise favor values from first table)

I have fought my way through various answers and made some progress, but I have not found the final solution yet.
The DB situation:
Table "clients_a":
userid | name
1 | Steve
2 | John
3 | Paul
Table "clients_b":
userid | name
1 | NULL
3 | Jokename
4 | Jessy
Desired result/output:
userid | name
1 | Steve
2 | John
3 | Paul
4 | Jessy
Description of what is going on:
userid is unique in the result (merged)
a result for name that is not NULL is favored
if two entries, then result from table clients_a is favored
all entries have a groupid (see below), that has to be taken into account
MySql queries I tried (and came close):
Attempt 1: This query works, but it does not regard the name. It takes all names from client_a:
SELECT * FROM
(
SELECT userid, name
FROM `client_a`
WHERE groupid = 123
UNION DISTINCT
SELECT userid, name
FROM `client_b`
WHERE groupid = 123
) AS res
GROUP BY res.userid
Attempt 2: This query creates duplicate entries (one userid can occur twice), but regards the name, as it seems:
SELECT o.*, i.* FROM
(
SELECT userid, name
FROM `client_a`
WHERE groupid = 123
UNION DISTINCT
SELECT userid, realname
FROM `client_b`
WHERE groupid = 123
GROUP BY userid
) AS o
LEFT JOIN `client_a` as i on i.userid = o.userid
I also tried to use MIN(name) without success.
Any help is appreciated.
You can do it with NOT EXISTS:
SELECT a.userid, a.name
FROM clients_a a
WHERE a.groupid = 123
AND (name IS NOT NULL OR NOT EXISTS (SELECT 1 FROM clients_b b WHERE b.userid = a.userid))
UNION
SELECT b.userid, b.name
FROM clients_b b
WHERE b.groupid = 123
AND NOT EXISTS (SELECT 1 FROM clients_a a WHERE a.userid = b.userid AND a.name IS NOT NULL)
See the demo.
Results:
userid | name
1 | Steve
2 | John
3 | Paul
4 | Jessy
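For anyone who wants to try the NOT EXISTS approach locally, here is a minimal sketch that runs it against the sample rows using Python's sqlite3 module as a stand-in for MySQL. The groupid column is not shown in the question's tables, so giving every row groupid 123 is an assumption made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clients_a (userid INTEGER, name TEXT, groupid INTEGER);
CREATE TABLE clients_b (userid INTEGER, name TEXT, groupid INTEGER);
-- groupid = 123 on every row is an assumption; the question does not show it.
INSERT INTO clients_a VALUES (1, 'Steve', 123), (2, 'John', 123), (3, 'Paul', 123);
INSERT INTO clients_b VALUES (1, NULL, 123), (3, 'Jokename', 123), (4, 'Jessy', 123);
""")

merged = conn.execute("""
-- Rows from clients_a win when they have a name, or when clients_b has no such user.
SELECT a.userid, a.name
FROM clients_a a
WHERE a.groupid = 123
  AND (a.name IS NOT NULL
       OR NOT EXISTS (SELECT 1 FROM clients_b b WHERE b.userid = a.userid))
UNION
-- Rows from clients_b fill in only where clients_a has no named row for that user.
SELECT b.userid, b.name
FROM clients_b b
WHERE b.groupid = 123
  AND NOT EXISTS (SELECT 1 FROM clients_a a
                  WHERE a.userid = b.userid AND a.name IS NOT NULL)
ORDER BY 1
""").fetchall()

print(merged)
```

The ORDER BY 1 is only there to make the output order deterministic.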

SQL select from 1 x N where all bigger than

I have tables books and bookTypes, which are in a 1:n relationship (one book type, many books).
books
+-----+------------------+----------+-------+
| id | title | bookType | price |
+-----+------------------+----------+-------+
| 1 | Wizard of Oz | 3 | 14 |
| 2 | Huckleberry Finn | 1 | 16 |
| 3 | Harry Potter | 2 | 25 |
| 4 | Moby Dick | 2 | 11 |
+-----+------------------+----------+-------+
bookTypes
+-----+----------+
| id | name |
+-----+----------+
| 1 | Fiction |
| 2 | Drama |
| 3 | Children |
+-----+----------+
How would I retrieve bookTypes where all books are more expensive than e.g. 12($)?
In this case, the expected output would be:
+-----+----------+
| id | name |
+-----+----------+
| 1 | Fiction |
| 3 | Children |
+-----+----------+
You can use not exists:
select t.*
from bookTypes t
where not exists (
select 1
from books b
where b.bookType = t.id and b.price <= 12
)
If you want to select book types that also have at least one associated book:
select t.*
from bookTypes t
where
exists (select 1 from books b where b.bookType = t.id)
and not exists (select 1 from books b where b.bookType = t.id and b.price <= 12)
Do a GROUP BY, use HAVING to return only booktypes having the lowest price > 12.
SELECT bt.name
FROM bookTypes bt
INNER JOIN books b ON b.bookType = bt.id
group by bt.name
HAVING SUM(b.price <= 12) = 0;
You can directly consider using having min(price) >= 12 with grouping by bookType
select t.id, t.name
from bookTypes t
join books b
on t.id = b.bookType
group by b.bookType
having min(price) >= 12
Moreover, if you are on MariaDB 10.2 or later (or MySQL 8.0 or later), you can also use window functions for analytical queries, such as min(..) over (partition by .. order by ..):
with t as
(
select t.id, t.name, min(price) over (partition by bookType) as price
from bookTypes t
join books b
on t.id = b.bookType
)
select distinct id, name
from t
where price >= 12
in which min() over (..) window function determines minimum price for each booktype by use of partition by bookType
I think GMB's solution is likely the best so far. But for the sake of completeness: you can also use the ALL operator with a correlated subquery. That's probably the most straightforward solution.
SELECT *
FROM booktypes bt
WHERE 12 < ALL (SELECT b.price
FROM books b
WHERE b.booktype = bt.id);
Can you not just select from books inner join bookTypes on id WHERE price > 12?
SELECT bt.*
FROM bookTypes bt
INNER JOIN books b ON b.bookType = bt.id
WHERE b.price > 12
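To compare the approaches in this thread on the question's own rows, here is a sketch using Python's sqlite3 module as a stand-in for MySQL. It runs a NOT EXISTS variant, a MIN-based variant, and the plain join with a WHERE filter side by side; the variable names are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bookTypes (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, bookType INTEGER, price INTEGER);
INSERT INTO bookTypes VALUES (1, 'Fiction'), (2, 'Drama'), (3, 'Children');
INSERT INTO books VALUES
  (1, 'Wizard of Oz', 3, 14), (2, 'Huckleberry Finn', 1, 16),
  (3, 'Harry Potter', 2, 25), (4, 'Moby Dick', 2, 11);
""")

# "All books above 12" phrased as "no book at or below 12".
not_exists = conn.execute("""
SELECT t.id, t.name FROM bookTypes t
WHERE NOT EXISTS (SELECT 1 FROM books b WHERE b.bookType = t.id AND b.price <= 12)
ORDER BY t.id
""").fetchall()

# Same idea via the cheapest book per type.
min_based = conn.execute("""
SELECT t.id, t.name FROM bookTypes t
JOIN books b ON b.bookType = t.id
GROUP BY t.id, t.name
HAVING MIN(b.price) > 12
ORDER BY t.id
""").fetchall()

# Plain join with a row filter: keeps any type with AT LEAST ONE book above 12.
join_filter = conn.execute("""
SELECT DISTINCT bt.id, bt.name FROM bookTypes bt
JOIN books b ON b.bookType = bt.id
WHERE b.price > 12
ORDER BY bt.id
""").fetchall()

print(not_exists)
print(min_based)
print(join_filter)
```

On this data the plain join also returns Drama, because Harry Potter costs 25 even though Moby Dick costs 11. A WHERE filter on individual rows answers "some book is above 12", not "all books are above 12".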

MySQL Relational Division with multiple IDs

Please take a look at the question described here: "MySQL ONLY IN() equivalent clause", regarding relational division in MySQL.
My database structure is very similar to the one described, but in the "Chocolate Boys Table", I have an additional ID field - let's call it milk ID.
Chocolates Boys Table
+----+--------------+---------+--------+
| id | chocolate_id | milk_id | boy_id |
+----+--------------+---------+--------+
| 1  | 1000         | 2000    | 10007  |
| 2  | 1003         | 2001    | 10007  |
| 3  | 1006         | 2005    | 10007  |
| 4  | 1000         | 2001    | 10009  |
| 5  | 1001         | 2000    | 10009  |
| 6  | 1005         | 2008    | 10009  |
+----+--------------+---------+--------+
The objective is to run a query that retrieves the boy ID that contains the exact chocolate and milk IDs that I pass in. Here are some examples of my expected results:
Example #1:
Chocolate IDs Passed In (in order) - 1000,1003,1006.
Milk IDs Passed In (in order) - 2000,2001,2005.
Expected Result: Query returns boy ID of 10007.
Example #2:
Chocolate IDs Passed In (in order) - 1000,1003.
Milk IDs Passed In (in order) - 2000,2001.
Expected Result: Empty result set.
Example #3:
Chocolate IDs Passed In (in order) - 1003,1000,1006.
Milk IDs Passed In (in order) - 2000,2001,2005.
Expected Result: Empty result set - The passed in IDs are included in boy ID 10007, but the order is wrong. The values of Chocolate ID and Milk ID don't match up if examined on a row by row basis.
I am attempting to use a slightly modified version of John Woo's solution in order to incorporate the added ID field:
SELECT boy_id
FROM boys_chocolates a
WHERE chocolate_id IN (1003,1000,1006) AND milk_id IN (2000,2001,2005) AND
EXISTS
(
SELECT 1
FROM boys_chocolates b
WHERE a.boy_ID = b.boy_ID
GROUP BY boy_id
HAVING COUNT(DISTINCT chocolate_id) = 3
)
GROUP BY boy_id
HAVING COUNT(*) = 3
The problem that I'm having is that the IN function does not enforce order, as seen in example #3. I would like the above query to return an empty result set. What needs to be changed in order to address this problem? Thank you!
Try this approach:
SELECT a.boy_id
FROM
(SELECT id, boy_id FROM boys_chocolates WHERE chocolate_id = 1000) a
JOIN
(
(SELECT id, boy_id FROM boys_chocolates WHERE chocolate_id = 1003) b,
(SELECT id, boy_id FROM boys_chocolates WHERE chocolate_id = 1006) c,
(SELECT id, boy_id FROM boys_chocolates WHERE milk_id = 2000) d,
(SELECT id, boy_id FROM boys_chocolates WHERE milk_id = 2001) e,
(SELECT id, boy_id FROM boys_chocolates WHERE milk_id = 2005) f
)
ON a.boy_id = b.boy_id AND a.boy_id = c.boy_id AND a.boy_id = d.boy_id
AND a.boy_id = e.boy_id AND a.boy_id = f.boy_id AND b.id > a.id
AND c.id > b.id AND e.id > d.id AND f.id > e.id;
Replace 1000, 1003, 1006 with your first, second, and third chocolate_id respectively. Likewise, replace 2000, 2001, 2005 with your first, second, and third milk_id.
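As an alternative sketch (not the method from the answer above), the row-by-row requirement can be expressed by treating each chocolate_id/milk_id combination as one pair and counting matching rows per boy. The helper name find_boys is hypothetical, Python's sqlite3 module stands in for MySQL, and the sketch assumes a boy never has the same (chocolate_id, milk_id) row twice.

```python
import sqlite3

def find_boys(conn, pairs):
    """Return boy_ids whose rows are exactly the given (chocolate_id, milk_id) pairs."""
    if not pairs:
        return []
    n = len(pairs)
    # One "(chocolate_id = ? AND milk_id = ?)" test per requested pair.
    pair_match = " OR ".join("(chocolate_id = ? AND milk_id = ?)" for _ in pairs)
    sql = f"""
        SELECT boy_id
        FROM boys_chocolates
        GROUP BY boy_id
        HAVING COUNT(*) = ?            -- the boy has no extra rows
           AND SUM({pair_match}) = ?   -- every row matches a requested pair
        ORDER BY boy_id
    """
    params = [n] + [value for pair in pairs for value in pair] + [n]
    return [row[0] for row in conn.execute(sql, params)]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE boys_chocolates (id INTEGER PRIMARY KEY, chocolate_id INTEGER,
                              milk_id INTEGER, boy_id INTEGER);
INSERT INTO boys_chocolates VALUES
  (1, 1000, 2000, 10007), (2, 1003, 2001, 10007), (3, 1006, 2005, 10007),
  (4, 1000, 2001, 10009), (5, 1001, 2000, 10009), (6, 1005, 2008, 10009);
""")

example1 = find_boys(conn, [(1000, 2000), (1003, 2001), (1006, 2005)])
example2 = find_boys(conn, [(1000, 2000), (1003, 2001)])
example3 = find_boys(conn, [(1003, 2000), (1000, 2001), (1006, 2005)])
print(example1, example2, example3)
```

The comparison inside SUM evaluates to 1 or 0 per row, which also works in MySQL. Pairing the IDs up front is a design choice: it ties each chocolate_id to its milk_id on the same row, which two independent IN lists cannot do.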

Sum Log values from table using second table

I have a huge table where a new row could be an "adjustment" to a previous row.
TableA:
Id | RefId | TransId |Score
----------------------------------
101 | null | 3001 | 10
102 | null | 3002 | 15
103 | null | 3003 | 15
104 | 101 | | -5
105 | null | 3004 | 5
106 | 105 | | -10
107 | null | 3005 | 15
TableB:
TransId | Person
----------------
3001 | Harry
3002 | Draco
3003 | Sarah
3004 | Ron
3005 | Harry
In the table above, Harry was given 10 points in TableA.Id=101, deducted 5 of those points in TableA.Id=104, and then given another 15 points in TableA.Id=107.
What I want to do here, is return all the rows where Harry is the person connected to the score. The problem is that there is no name attached to a row where points are deducted, only to the rows where scores are given (through TableB). However, scores are always deducted from a previously given score, where the original transaction's Id is referred to in the tables as "RefId".
SELECT
SUM(TableA.Score)
FROM TableA
LEFT JOIN TableB ON TableA.TransId=TableB.TransId
WHERE 1
AND TableB.Person='Harry'
GROUP BY TableA.Score
That only gives me the points given to Harry, not the deducted ones. I would like to get the total score returned, which would be 20 for Harry (10-5+15=20).
How do I get MySQL to include the negative scores as well? I feel like it should be possible using the TableA.RefId. Something like "if there is a RefId, get the score from this row, but look at the corresponding TableA.Id for the rest of the data".
Select sum(total) AS total
From tableb
Join
(
Select t1.transid, t1.score + coalesce(sum(t2.score), 0) AS total
From tablea t1
Left Join tablea t2 on t1.id = t2.refid
group by t1.id, t1.transid, t1.score
) x on x.transid = tableb.transid
Where TableB.Person='Harry'
Try this:
select sum(sum1 + sums) as sum_all from (
SELECT t1.id,T1.Score sum1, coalesce(T2.score,0) sums
FROM Table1 t1
inner JOIN Table2 ON T1.TransId=Table2.TransId
left JOIN Table1 t2 ON t2.RefId = t1.id
WHERE Table2.Person='Harry'
)c
Output:
SUM_ALL
20
If you assume that adjustments don't modify adjustments, you can do this without aggregating all the data:
select sum(a.score) as HarryScore
from tableA a left outer join
tableA aref
on a.refId = aref.id left outer join
tableB b
on a.TransId = b.Transid left outer join
tableB bref
on aref.TransId = bref.TransId
where b.Person = 'Harry' or bref.Person = 'Harry';
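The self-join idea can be sketched on the question's rows as follows, using Python's sqlite3 module as a stand-in for MySQL. Each row is attributed to the transaction of the row it refers to, or to its own transaction when RefId is empty. The alias orig is made up for this example, the blank TransId cells are stored as NULL, and adjustments are assumed never to refer to other adjustments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (Id INTEGER PRIMARY KEY, RefId INTEGER, TransId INTEGER, Score INTEGER);
CREATE TABLE TableB (TransId INTEGER PRIMARY KEY, Person TEXT);
INSERT INTO TableA VALUES
  (101, NULL, 3001, 10), (102, NULL, 3002, 15), (103, NULL, 3003, 15),
  (104, 101,  NULL, -5), (105, NULL, 3004, 5),  (106, 105,  NULL, -10),
  (107, NULL, 3005, 15);
INSERT INTO TableB VALUES
  (3001, 'Harry'), (3002, 'Draco'), (3003, 'Sarah'), (3004, 'Ron'), (3005, 'Harry');
""")

# For an adjustment row, orig is the row it adjusts; otherwise orig is all NULL.
# COALESCE then picks the transaction that owns the score.
totals = dict(conn.execute("""
SELECT b.Person, SUM(a.Score)
FROM TableA a
LEFT JOIN TableA orig ON a.RefId = orig.Id
JOIN TableB b ON b.TransId = COALESCE(orig.TransId, a.TransId)
GROUP BY b.Person
"""))

print(totals["Harry"])
```

Summing only a.Score counts every row exactly once, so the original score is not added again when its adjustment row is processed.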

MySQL Filtering rows from three tables

Let's say I've got this database:
book
| idBook | name |
|--------|----------|
| 1 |Book#1 |
category
| idCateg| category |
|--------|----------|
| 1 |Adventures|
| 2 |Science F.|
book_categ
| id | idBook | idCateg | DATA |
|--------|--------|----------|--------|
| 1 | 1 | 1 | (null) |
| 2 | 1 | 2 | (null) |
I'm trying to select only the books which are in category 1 AND category 2
This is what I've got so far:
SELECT book.* FROM book,book_categ
WHERE book_categ.idCateg = 1 AND book_categ.idCateg = 2
Obviously, this gives 0 results because each row has only one idCateg. It does work with OR, but the results are not what I need. I've also tried to use a join, but I just can't get the results I expect.
Here is the SQLFiddle of my current project; the data at the beginning is just a sample.
SQLFiddle
Any help will be really appreciated.
You could double join with a constraint on the category id:
SELECT a.* FROM book AS a
INNER JOIN book_categ AS b ON a.idBook = b.idBook AND b.idCateg = 1
INNER JOIN book_categ AS c ON a.idBook = c.idBook AND c.idCateg = 2
You could use a subquery:
SELECT a.* FROM book AS a
WHERE
(SELECT COUNT(DISTINCT idCateg) FROM book_categ AS b
WHERE b.idBook = a.idBook AND b.idCateg IN (1,2)) = 2
If you are on MySQL, as your fiddle implies, you should prefer the join variant, since joins are often faster than correlated subqueries in MySQL, especially in older versions.
edit
This one should also work:
SELECT a.* FROM book a
INNER JOIN book_categ AS b ON a.idBook = b.idBook
WHERE b.idCateg IN (1, 2)
GROUP BY a.idBook
HAVING COUNT(DISTINCT b.idCateg) = 2
and should be faster than the two above, although you have to change the last number according to the number of category ids you are requesting.
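Here is a sketch of the GROUP BY / HAVING variant with the category list passed in as a parameter, so the count in HAVING always matches the number of requested categories. It uses Python's sqlite3 module as a stand-in for MySQL. The helper name books_in_all_categories and the second book (Book#2, in category 1 only) are hypothetical additions for illustration.

```python
import sqlite3

def books_in_all_categories(conn, categ_ids):
    """Return (idBook, name) for books linked to every category in categ_ids."""
    wanted = sorted(set(categ_ids))
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT a.idBook, a.name
        FROM book a
        INNER JOIN book_categ b ON a.idBook = b.idBook
        WHERE b.idCateg IN ({placeholders})
        GROUP BY a.idBook, a.name
        HAVING COUNT(DISTINCT b.idCateg) = ?
        ORDER BY a.idBook
    """
    return conn.execute(sql, wanted + [len(wanted)]).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book (idBook INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE category (idCateg INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE book_categ (id INTEGER PRIMARY KEY, idBook INTEGER, idCateg INTEGER, DATA TEXT);
-- Book#2 is a hypothetical extra row that is only in category 1.
INSERT INTO book VALUES (1, 'Book#1'), (2, 'Book#2');
INSERT INTO category VALUES (1, 'Adventures'), (2, 'Science F.');
INSERT INTO book_categ VALUES (1, 1, 1, NULL), (2, 1, 2, NULL), (3, 2, 1, NULL);
""")

both = books_in_all_categories(conn, [1, 2])
only_first = books_in_all_categories(conn, [1])
print(both, only_first)
```

Deriving the HAVING count from the list length avoids having to edit the last number by hand whenever the category list changes.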