SQL - Find AT LEAST TWO DISTINCT / SEPARATE / DIFFERENT values on another Table - mysql

While taking an online database course (for beginners), I ran into a class of queries that involve "at least two distinct values". For example,
the COMPANY database in the Elmasri book asks: find all employees who work on at least two distinct projects. The solution (which works) is
SELECT DISTINCT LName FROM Employee e1
JOIN Works_On AS w1 ON (e1.Ssn = w1.Essn)
JOIN Works_On AS w2 ON (e1.Ssn = w2.Essn)
WHERE w1.Pno <> w2.Pno
Similarly, for the STUDENT/COURSE database (I forgot the source): find the Student_ID of the students who take at least two distinct courses. The solution also looks simple (though it is untested):
SELECT e1.Student_ID FROM Enroll AS e1, Enroll AS e2
WHERE e1.Student_ID = e2.Student_ID
AND e1.Course_ID <> e2.Course_ID
In my problem, I have to find the name and customer ID of those customers who have accounts in at least two branches of distinct types (i.e., branches that do not have the same Br_Type),
from the following tables (MySQL):
CUSTOMER:
Cust_ID  Lname
-------  -----
1        Mr.A
2        Mr.B
3        Mr.C
4        Mr.D
BRANCH:
Br_ID  Br_Type
-----  -------
10     big
11     small
12     big
13     small
ACCOUNT:
Acc_Num  Br_ID  Cust_ID  Balance
-------  -----  -------  -------
1001     10     1        2000
1002     11     1        2500
1003     13     1        3000
1004     12     2        4000
1005     13     3        4500
1006     10     4        5000
1007     12     4        6000
Result Table should look like the following:
Lname Cust_ID
----- -------
Mr.A 1
Only Mr.A has an account in a branch whose type is 'big' as well as in a branch whose type is 'small'.
I tried the following, which didn't work:
SELECT DISTINCT c1.Lname, a1.Cust_ID FROM Customer AS c1
JOIN Account a1 ON (c1.Cust_ID=a1.Cust_ID)
JOIN Branch b1 ON (a1.Br_ID=b1.Br_ID)
JOIN Branch b2 ON (a1.Br_ID=b2.Br_ID)
WHERE b1.Br_Type<>b2.Br_Type;
What exactly am I doing wrong? Sorry for such a long description, but I wanted to make sure the question is understandable. A short explanation of the <> part would be highly appreciated.

You're trying to pull 2 different Branch records off the same Account record - but that can't happen. What you want is to search on 2 different Account records with associated Branches of a different type:
SELECT DISTINCT c1.Lname, a1.Cust_ID FROM Customer AS c1
JOIN Account a1 ON (c1.Cust_ID=a1.Cust_ID)
JOIN Account a2 ON (c1.Cust_ID=a2.Cust_ID)
JOIN Branch b1 ON (a1.Br_ID=b1.Br_ID)
JOIN Branch b2 ON (a2.Br_ID=b2.Br_ID)
WHERE b1.Br_Type<>b2.Br_Type;
A more efficient approach that gives the same result would be to use GROUP BY and HAVING COUNT(DISTINCT Br_Type) >= 2, which is what @GordonLindoff proposed.
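To see what the <> condition is actually comparing, it can help to run both versions on the sample data. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only to keep the example self-contained); table and column names follow the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (Cust_ID INTEGER, Lname TEXT);
CREATE TABLE Branch   (Br_ID INTEGER, Br_Type TEXT);
CREATE TABLE Account  (Acc_Num INTEGER, Br_ID INTEGER, Cust_ID INTEGER, Balance INTEGER);
INSERT INTO Customer VALUES (1,'Mr.A'),(2,'Mr.B'),(3,'Mr.C'),(4,'Mr.D');
INSERT INTO Branch   VALUES (10,'big'),(11,'small'),(12,'big'),(13,'small');
INSERT INTO Account  VALUES
  (1001,10,1,2000),(1002,11,1,2500),(1003,13,1,3000),(1004,12,2,4000),
  (1005,13,3,4500),(1006,10,4,5000),(1007,12,4,6000);
""")

# Original attempt: b1 and b2 are both joined to the SAME account row (a1),
# so they always resolve to the same branch and the <> test is never true.
broken = conn.execute("""
SELECT DISTINCT c1.Lname, a1.Cust_ID FROM Customer AS c1
JOIN Account a1 ON (c1.Cust_ID = a1.Cust_ID)
JOIN Branch b1 ON (a1.Br_ID = b1.Br_ID)
JOIN Branch b2 ON (a1.Br_ID = b2.Br_ID)
WHERE b1.Br_Type <> b2.Br_Type
""").fetchall()

# Corrected version: two account rows per customer (a1, a2), each with its own
# branch, so <> compares the branch types of two different accounts.
fixed = conn.execute("""
SELECT DISTINCT c1.Lname, a1.Cust_ID FROM Customer AS c1
JOIN Account a1 ON (c1.Cust_ID = a1.Cust_ID)
JOIN Account a2 ON (c1.Cust_ID = a2.Cust_ID)
JOIN Branch b1 ON (a1.Br_ID = b1.Br_ID)
JOIN Branch b2 ON (a2.Br_ID = b2.Br_ID)
WHERE b1.Br_Type <> b2.Br_Type
""").fetchall()

print(broken)  # []
print(fixed)   # [('Mr.A', 1)]
```

Mr.D also has two accounts, but both are in 'big' branches, so no pair of his rows passes the <> filter.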

The problem with your query is the two on conditions. They are returning the same row in branch, because the join conditions are the same.
In any case, I think there is a better way to think about these types of queries (what I call "set-within-sets" queries). Think of them as aggregation: aggregate at the customer level, then use the HAVING clause to filter the customers:
SELECT c.Lname, a.Cust_ID
FROM Customer AS c JOIN
Account a
ON c.Cust_ID = a.Cust_ID JOIN
Branch b
ON a.Br_ID = b.Br_ID
GROUP BY c.Lname, a.Cust_ID
HAVING count(distinct b.br_type) > 1;
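As a rough check of the aggregation approach, the sketch below first lists the number of distinct branch types per customer and then applies the HAVING filter. It uses Python's built-in sqlite3 module in place of MySQL (an assumption for the sake of a self-contained example), with the sample data from the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (Cust_ID INTEGER, Lname TEXT);
CREATE TABLE Branch   (Br_ID INTEGER, Br_Type TEXT);
CREATE TABLE Account  (Acc_Num INTEGER, Br_ID INTEGER, Cust_ID INTEGER, Balance INTEGER);
INSERT INTO Customer VALUES (1,'Mr.A'),(2,'Mr.B'),(3,'Mr.C'),(4,'Mr.D');
INSERT INTO Branch   VALUES (10,'big'),(11,'small'),(12,'big'),(13,'small');
INSERT INTO Account  VALUES
  (1001,10,1,2000),(1002,11,1,2500),(1003,13,1,3000),(1004,12,2,4000),
  (1005,13,3,4500),(1006,10,4,5000),(1007,12,4,6000);
""")

joined = """
FROM Customer AS c
JOIN Account a ON c.Cust_ID = a.Cust_ID
JOIN Branch b  ON a.Br_ID = b.Br_ID
GROUP BY c.Lname, a.Cust_ID
"""

# Step 1: one row per customer with the number of distinct branch types.
type_counts = conn.execute(
    "SELECT c.Lname, a.Cust_ID, COUNT(DISTINCT b.Br_Type) " + joined +
    "ORDER BY a.Cust_ID"
).fetchall()

# Step 2: keep only the customers with more than one distinct type.
result = conn.execute(
    "SELECT c.Lname, a.Cust_ID " + joined +
    "HAVING COUNT(DISTINCT b.Br_Type) > 1"
).fetchall()

print(type_counts)  # [('Mr.A', 1, 2), ('Mr.B', 2, 1), ('Mr.C', 3, 1), ('Mr.D', 4, 1)]
print(result)       # [('Mr.A', 1)]
```

Changing the threshold (for example to >= 3) is a one-character edit here, whereas the self-join version would need another pair of joins.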

Related

sql select master records based on ANDing multiple detail records

I have master and detail tables as described below (with representative data). I want to select (mysql-compliant) Student.id for students that have an "A" in both Biology and Chemistry.
Students
id  name
1   ken
2   beth
3   joe
grades
id  student_id  class      grade
1   1           Biology    A
2   1           Chemistry  A
3   1           Math       B
4   2           Biology    A
5   2           Chemistry  A
6   2           Math       A
7   3           Biology    B
8   3           Chemistry  A
9   3           Math       A
Currently, I'm just pulling in all the data into my program (java) but figure there's got to be a way in SQL to get the right records.
The results I'm looking for from the data above would be 1 & 2 (ken and beth). I've tried a few variations using joins and inner selects but can't quite get it to work. My main problem seems to be that I'm ANDing my detail records, e.g. ...where grades.class='Biology' and grades.grade='A'
I took a look at SQL select from header table where detail table rows have multiple values but that didn't quite get me where I need to be.
Assistance greatly appreciated.
Try this
select s.id from students as s
inner join grades as g on s.id=g.student_id
group by s.id
having max(case when g.class='Biology' and g.grade='A' then 1 else 0 end)=1
and max(case when g.class='Chemistry' and g.grade='A' then 1 else 0 end)=1
Here is one way:
select g.student_id
from grades g
where g.class in ('Biology', 'Chemistry') and g.grade = 'A'
group by g.student_id
having count(distinct class) = 2;
Notes:
A join is not necessary because the grades table has the student id.
The where clause only gets records where a student has an 'A' in either class.
The having guarantees that a student has an 'A' in both classes.
I should note that there is an alternative method that uses join but not group by:
select gb.student_id
from grades gb join
grades gc
on gb.student_id = gc.student_id and
gb.class = 'Biology' and gc.class = 'Chemistry' and
gb.grade = 'A' and gc.grade = 'A';
This works -- and the performance might even be better. I like the group by and having approach because it is more flexible.
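A minimal sketch that runs both variants on the sample grades, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example is self-contained). An ORDER BY is added only to make the output deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grades (id INTEGER, student_id INTEGER, class TEXT, grade TEXT);
INSERT INTO grades VALUES
  (1,1,'Biology','A'),(2,1,'Chemistry','A'),(3,1,'Math','B'),
  (4,2,'Biology','A'),(5,2,'Chemistry','A'),(6,2,'Math','A'),
  (7,3,'Biology','B'),(8,3,'Chemistry','A'),(9,3,'Math','A');
""")

# GROUP BY / HAVING: filter to the relevant 'A' rows, then require both classes.
by_group = conn.execute("""
SELECT g.student_id
FROM grades g
WHERE g.class IN ('Biology', 'Chemistry') AND g.grade = 'A'
GROUP BY g.student_id
HAVING COUNT(DISTINCT g.class) = 2
ORDER BY g.student_id
""").fetchall()

# Self-join: one alias per required class.
by_join = conn.execute("""
SELECT gb.student_id
FROM grades gb
JOIN grades gc
  ON gb.student_id = gc.student_id
 AND gb.class = 'Biology'   AND gb.grade = 'A'
 AND gc.class = 'Chemistry' AND gc.grade = 'A'
ORDER BY gb.student_id
""").fetchall()

print(by_group)  # [(1,), (2,)]
print(by_join)   # [(1,), (2,)]
```

Student 3 is excluded by both queries because the Biology grade is a B.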

In MySQL, how to query multiple table counts on column when joining on common columns

I have two tables (they come from different original data sources)
Projects (cross sectional data)
Timestamp Project Owner Name Submission_Date
2014-02-18 1 Tim Susan 2014-02-10
2014-02-18 2 Matt Jaclyn 2014-02-10
2014-02-18 2 Tim Mary 2014-02-11
etc
and Hitups (activity log)
Project Owner Name Hitup_Date
1 Tim Susan 2014-02-01
2 Matt Jaclyn 2014-02-02
etc
And I want to run a query to get a count of activities from both tables, and grouped by the common project and owner. Given the above totally made up sample data, I'd expect to see results similar to
Project Owner count(Hitup_Date) count(Submission_Date)
1 Tim 1 1
2 Matt 1 1
2 Tim null 1
My attempt to query this is as follows
SELECT p.Project, p.Owner, COUNT(Hitup_Date), COUNT( p.Submission_Date )
FROM projects p, hitups h
WHERE p.project = h.project and p.owner = h.owner
AND p.date = ( SELECT MAX( DATE ) FROM projects )
GROUP BY p.project, p.owner
... which fails miserably. What's going wrong? I've searched exhaustively and have been unable to find prior examples on how to tackle a similar situation explicitly - thank you for your guidance.
You need to perform the grouping and aggregation first, and only then join the results:
SELECT p.project, p.owner, submission_count, hitup_count
FROM (SELECT project, owner, COUNT(*) AS submission_count
FROM projects
GROUP BY project, owner) p
LEFT OUTER JOIN (SELECT project, owner, COUNT(*) AS hitup_count
FROM hitups
GROUP BY project, owner) h
ON p.project = h.project AND p.owner = h.owner
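The aggregate-then-join pattern can be tried on the sample rows with the sketch below. It uses Python's built-in sqlite3 module instead of MySQL (an assumption for a self-contained example) and omits the Timestamp column, which the query does not use. An ORDER BY is added for deterministic output.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (project INTEGER, owner TEXT, name TEXT, submission_date TEXT);
CREATE TABLE hitups   (project INTEGER, owner TEXT, name TEXT, hitup_date TEXT);
INSERT INTO projects VALUES
  (1,'Tim','Susan','2014-02-10'),
  (2,'Matt','Jaclyn','2014-02-10'),
  (2,'Tim','Mary','2014-02-11');
INSERT INTO hitups VALUES
  (1,'Tim','Susan','2014-02-01'),
  (2,'Matt','Jaclyn','2014-02-02');
""")

# Each table is counted on its own, so rows from one side cannot multiply
# the counts of the other; the outer join keeps groups with no hitups.
rows = conn.execute("""
SELECT p.project, p.owner, submission_count, hitup_count
FROM (SELECT project, owner, COUNT(*) AS submission_count
      FROM projects
      GROUP BY project, owner) p
LEFT OUTER JOIN (SELECT project, owner, COUNT(*) AS hitup_count
                 FROM hitups
                 GROUP BY project, owner) h
  ON p.project = h.project AND p.owner = h.owner
ORDER BY p.project, p.owner
""").fetchall()

for row in rows:
    print(row)
# (1, 'Tim', 1, 1)
# (2, 'Matt', 1, 1)
# (2, 'Tim', 1, None)
```

The (2, 'Tim') group has no matching hitups row, so its hitup_count comes back as NULL (None in Python), as in the expected output in the question.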

SQL select rows if a column value is not in a group of a different column's values

For each identifier, how can I return the quantity when the received country is not equal to any of the delivered countries? I need an efficient query for the steps below since my table is huge.
These are the steps I would think could do this, of course you don't need to follow them :)
Create a group of 'delivered' countries for each identifier.
See if 'received' is any of these countries for each identifier. If there is no match, return this result.
Starting Table:
identifier delivered received quantity
------------- ------------ ----------- ------------
1 USA France 432
1 France USA 450
1 Ireland Russia 100
2 Germany Germany 1,034
3 USA France 50
3 USA USA 120
Result:
identifier delivered received quantity
------------- ------------ ----------- ------------
1 Ireland Russia 100
The starting table is about 30,000,000 rows, so self-joins will be impossible unfortunately. I am using something similar to MySQL.
I think LEFT JOIN query should work for you:
SELECT a.*
FROM starting a
LEFT JOIN starting b
ON a.id = b.id
AND a.delivered = b.received
WHERE b.received IS NULL;
To optimize the above query, adding the following composite index should give you better performance:
ALTER TABLE starting ADD KEY ix1(id, delivered, received);
You could use a not exists subquery:
SELECT a.*
FROM starting a
WHERE NOT EXISTS
(
SELECT *
FROM starting b
WHERE a.id = b.id
AND a.delivered = b.received
)
This is not a self-join, but the query optimizer is free to execute it as one (and usually does.)
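Both anti-join forms can be checked against the sample table with the sketch below. It uses Python's built-in sqlite3 module in place of MySQL (an assumption for a self-contained example) and the column name identifier from the question's table rather than id. The condition compares a.delivered with b.received, which is the direction that reproduces the result table shown in the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE starting (identifier INTEGER, delivered TEXT, received TEXT, quantity INTEGER);
INSERT INTO starting VALUES
  (1,'USA','France',432),
  (1,'France','USA',450),
  (1,'Ireland','Russia',100),
  (2,'Germany','Germany',1034),
  (3,'USA','France',50),
  (3,'USA','USA',120);
""")

# LEFT JOIN form: keep the rows of a that found no partner row in b.
left_join_rows = conn.execute("""
SELECT a.*
FROM starting a
LEFT JOIN starting b
  ON a.identifier = b.identifier
 AND a.delivered = b.received
WHERE b.received IS NULL
""").fetchall()

# NOT EXISTS form: the same filter written as a correlated subquery.
not_exists_rows = conn.execute("""
SELECT a.*
FROM starting a
WHERE NOT EXISTS (
  SELECT *
  FROM starting b
  WHERE a.identifier = b.identifier
    AND a.delivered = b.received
)
""").fetchall()

print(left_join_rows)   # [(1, 'Ireland', 'Russia', 100)]
print(not_exists_rows)  # [(1, 'Ireland', 'Russia', 100)]
```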

MySQL Join Multiple (More than 2) Tables with Conditions

Assume I have 4 tables:
Table 1: Task
ID Task Schedule
1 Cut Grass Mon
2 Sweep Floor Fri
3 Wash Dishes Fri
Table 2: Assigned
ID TaskID (FK) PersonID (FK)
1 1 1
2 1 2
3 2 3
4 3 2
Table 3: Person
ID Name
1 Tom
2 Dick
3 Harry
Table 4: Mobile
ID PersonID (FK) CountryCode MobileNumber
1 1 1 555-555-5555
2 2 44 555-555-1234
3 3 81 555-555-5678
4 3 81 555-555-0000
I'm trying to display the
Task on a certain day
Name of person assigned to task
Phone numbers of said person
I think it should be something like the following, but I'm not sure how to set up the conditions so that the results are limited correctly:
SELECT T.ID, T.Task, P.Name, M.MobileNumber
FROM Task AS T
LEFT JOIN Assigned AS A
ON T.ID = A.TaskID
LEFT JOIN Person AS P
ON A.PersonID = P.ID
LEFT JOIN Mobile AS M
ON M.PersonID = P.ID
WHERE T.Schedule = 'Fri'
My goal is to fetch the following information (it will be displayed differently):
Tasks Name MobileNumber
Sweep Floor, Wash Dishes Dick, Harry 44-555-555-1234, 81-555-555-5678, 81-555-555-0000
Of course, if JOIN is the wrong way to do this, please say so.
It's unclear what you want to do with duplicate data in this case, but you should be looking at using inner joins instead of outer joins, and using something like group_concat() to combine the phone numbers.
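One possible reading of that suggestion, sketched with Python's built-in sqlite3 module as a stand-in for MySQL (an assumption for a self-contained example): inner joins across the four tables, with one output row per task and person and that person's numbers combined by group_concat(). The grouping level is an assumption, since the question leaves the handling of duplicates open.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Task     (ID INTEGER, Task TEXT, Schedule TEXT);
CREATE TABLE Assigned (ID INTEGER, TaskID INTEGER, PersonID INTEGER);
CREATE TABLE Person   (ID INTEGER, Name TEXT);
CREATE TABLE Mobile   (ID INTEGER, PersonID INTEGER, CountryCode INTEGER, MobileNumber TEXT);
INSERT INTO Task     VALUES (1,'Cut Grass','Mon'),(2,'Sweep Floor','Fri'),(3,'Wash Dishes','Fri');
INSERT INTO Assigned VALUES (1,1,1),(2,1,2),(3,2,3),(4,3,2);
INSERT INTO Person   VALUES (1,'Tom'),(2,'Dick'),(3,'Harry');
INSERT INTO Mobile   VALUES
  (1,1,1,'555-555-5555'),(2,2,44,'555-555-1234'),
  (3,3,81,'555-555-5678'),(4,3,81,'555-555-0000');
""")

# SQLite syntax: group_concat(expr, separator) and || for string concatenation.
# In MySQL the equivalent expression would be
#   GROUP_CONCAT(CONCAT(M.CountryCode, '-', M.MobileNumber) SEPARATOR ', ')
rows = conn.execute("""
SELECT T.Task, P.Name,
       group_concat(M.CountryCode || '-' || M.MobileNumber, ', ') AS Numbers
FROM Task AS T
JOIN Assigned AS A ON T.ID = A.TaskID
JOIN Person   AS P ON A.PersonID = P.ID
JOIN Mobile   AS M ON M.PersonID = P.ID
WHERE T.Schedule = 'Fri'
GROUP BY T.ID, T.Task, P.ID, P.Name
ORDER BY T.ID, P.ID
""").fetchall()

for task, name, numbers in rows:
    print(task, "|", name, "|", numbers)
```

This yields one row for Sweep Floor (Harry, with both of his numbers) and one for Wash Dishes (Dick). The order of the numbers inside a concatenated string is not guaranteed unless an ordering is specified.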

Grouping and Accumulating Records at the same time

I am writing a query against an advanced many-to-many table in my database. I call it advanced because it is a many-to-many table with an extra field. The table maps data between the fields table and the students table. The fields table holds the potential fields that a student can use, much like a contact system (e.g. name, school, address). The studentvalues table that I need to query holds the field id, the student id, and the field answer (e.g. studentid=1; fieldid=2; response=Dave Long).
So my table looks like this:
What I need to do is take a few passed in values and create a grouped accumulated report. I would like to do as much in the SQL as possible.
The inputs I have are the group-by field (a field id) and the cumulative field (a field id). I need to group the students by the group-by field and then, within each group, count the number of students for each value of the cumulative field.
So for example I have this data
ID STUDENTID FIELDID RESPONSE
1 1 2 *(city)* Wallingford
2 1 3 *(state)* CT
3 2 2 *(city)* Wallingford
4 2 3 *(state)* CT
5 3 2 *(city)* Berlin
6 3 3 *(state)* CT
7 4 2 *(city)* Costa Mesa
8 4 3 *(state)* CA
I am hoping to write one query that I can generate a report that looks like this:
CA - 1 Student
Costa Mesa 1
CT - 3 Students
Berlin 1
Wallingford 2
Is this possible to do with a single SQL statement or do I have to get all the groups and then loop over them?
EDIT Here is the code that I have gotten so far, but it doesn't give the proper stateSubtotal (the stateSubtotal is the same as the citySubtotal)
SELECT state, count(state) AS stateSubtotal, city, count(city) AS citySubtotal
FROM(
SELECT s1.response AS city, s2.response AS state
FROM studentvalues s1
INNER JOIN studentvalues s2
ON s1.studentid = s2.studentid
WHERE s1.fieldid = 5
AND s2.fieldid = 6
) t
GROUP BY city, state
So to make a table that looks like that, I would assume something like
State StateSubtotal City CitySubtotal
CA 1 Costa Mesa 1
CT 3 Berlin 1
CT 3 Wallingford 2
Would be what you want. We can't just group on Response: if one student answered LA for city and another answered LA for state (Louisiana), the two would be counted together. Also, since the same city name can occur in different states, we first need to establish the association between a city and a state by joining on the student id.
edit - indeed, flawed first approach. The different aggregates need different groupings, so really, one select per aggregation is required. This gives the right result but it's ugly and I bet it could be improved on. If you were on SQL Server I would think a CTE would help but that's not an option.
select t2.stateAbb, stateSubtotal, t2.city, t2.citySubtotal from
(
select city, count(city) as citySubTotal, stateAbb from (
select s1.Response as city, s2.Response as StateAbb
from aaa s1 inner join aaa s2 on s1.studentId = s2.studentId
where s1.fieldId = 2 and s2.fieldId=3
) t1
group by city, stateabb
) t2 inner join (
select stateAbb, count(stateabb) as stateSubTotal from (
select s1.Response as city, s2.Response as StateAbb
from aaa s1 inner join aaa s2 on s1.studentId = s2.studentId
where s1.fieldId = 2 and s2.fieldId=3
) t3
group by stateabb
) t4 on t2.stateabb = t4.stateabb
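The sketch below runs the same two-aggregation structure on the sample data. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption for a self-contained example), the table name studentvalues from the question instead of aaa, and a Python string (city_state, a name chosen only for this illustration) to avoid writing the city/state pairing subquery twice.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE studentvalues (id INTEGER, studentid INTEGER, fieldid INTEGER, response TEXT);
INSERT INTO studentvalues VALUES
  (1,1,2,'Wallingford'),(2,1,3,'CT'),
  (3,2,2,'Wallingford'),(4,2,3,'CT'),
  (5,3,2,'Berlin'),     (6,3,3,'CT'),
  (7,4,2,'Costa Mesa'), (8,4,3,'CA');
""")

# One (city, state) row per student: fieldid 2 is the city, fieldid 3 the state.
city_state = """
SELECT s1.response AS city, s2.response AS state
FROM studentvalues s1
JOIN studentvalues s2 ON s1.studentid = s2.studentid
WHERE s1.fieldid = 2 AND s2.fieldid = 3
"""

# Two aggregations at different grouping levels, joined on state.
rows = conn.execute(f"""
SELECT c.state, s.stateSubtotal, c.city, c.citySubtotal
FROM (SELECT city, state, COUNT(*) AS citySubtotal
      FROM ({city_state}) t1
      GROUP BY city, state) c
JOIN (SELECT state, COUNT(*) AS stateSubtotal
      FROM ({city_state}) t2
      GROUP BY state) s
  ON c.state = s.state
ORDER BY c.state, c.city
""").fetchall()

for row in rows:
    print(row)
# ('CA', 1, 'Costa Mesa', 1)
# ('CT', 3, 'Berlin', 1)
# ('CT', 3, 'Wallingford', 2)
```

The state subtotal repeats on every city row of that state, which matches the flat table layout described in the answer; turning it into the nested report is left to the presentation layer.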