SQL inner join between two tables - sql-server-2008

I am using SQL Server 2008 and I have 4 tables: StudentAbsentees, Students, StudentSections and Sections.
In the StudentAbsentees table I store the StudentId of each student who is absent (absentees only) on a particular day, like this:
StudentId Time Date
----------- ------ ------
1 10:00 2012-04-13
and in the StudentSections I am storing the studentId in a particular section like
StudentId SectionId
---------- ------------
1 1
2 1
3 1
and in the Students table I am storing student details, likewise in Sections table I have section details like name and capacity of that section.
I need to join these tables and display whether the student is present/absent on a particular day... the result should be
StudentId Status
--------- ------
1 Absent
2 Present
3 Present
I can get the absentees list from these tables, but I don't know how to display whether each student is present or absent. Can anyone help me here?

select * from (
select s.id,
case
when sa.date = '2012-01-01'
then 'absent'
else 'present'
end as status,
ROW_NUMBER() OVER (PARTITION BY s.id ORDER BY CASE WHEN sa.date = '2012-01-01' THEN 1 ELSE 2 END) AS RowNumber
from students s
left outer join studentabsentees sa on s.id = sa.studentid
)
as a where a.RowNumber = 1

Your query to show the status of all students for a particular day would look like:
select s.id, s.name, a.status
from students s
left join studentabsentees a on s.id = a.studentid and a.date = ?
Obviously you have to supply a date. The date condition belongs in the join's ON clause: in a WHERE clause it would discard every student without an absentee row on that day, which makes the left join behave like an inner join.
Note: Your question uses "inner join" in the title. I think left is a better fit because it shows all students. But if you really want just the ones that have a record in the absentee table, then you can change the word "left" in the query to "inner".
Note2: My query assumes a status field, which will be NULL for students with no matching absentee row. If you don't have one then look at juergen d's answer.

No need for joins, you can just use set operators:
SELECT StudentID, 'Absent'
FROM StudentAbsentees
WHERE [date] = ...
UNION
(
SELECT StudentID, 'Present'
FROM Students
EXCEPT
SELECT StudentID, 'Present'
FROM StudentAbsentees
WHERE [date] = ...
)
You can display 'Present' and 'Absent' by selecting them as constants. Getting the list of absent students is easy; union it with the list of present students. The present students are found by taking the complete student list and applying the EXCEPT operator to remove the absent ones. In the EXCEPT part, make sure you label the absent students 'Present' too, so that their rows match, and are subtracted from, the 'Present' rows you just built from the full student list.
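The set-operator idea can be tried end to end with a small script. This is only a sketch: it runs against an in-memory SQLite database through Python's sqlite3 module rather than SQL Server, and the Students table with a StudentId column and the attendance helper are assumptions made for illustration. SQLite does not accept parentheses around a compound SELECT and evaluates the operators left to right, so the EXCEPT part is written first.

```python
import sqlite3

def attendance(conn, day):
    """Return (StudentId, Status) for every student on the given day."""
    query = """
        -- everyone labelled 'Present' ...
        SELECT StudentId, 'Present' AS Status FROM Students
        EXCEPT
        -- ... minus the absentees, also labelled 'Present' so the rows match
        SELECT StudentId, 'Present' FROM StudentAbsentees WHERE Date = :day
        UNION
        -- ... plus the absentees with their real label
        SELECT StudentId, 'Absent' FROM StudentAbsentees WHERE Date = :day
        ORDER BY StudentId
    """
    return conn.execute(query, {"day": day}).fetchall()

# Hypothetical sample data mirroring the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (StudentId INTEGER PRIMARY KEY);
    CREATE TABLE StudentAbsentees (StudentId INTEGER, Time TEXT, Date TEXT);
    INSERT INTO Students VALUES (1), (2), (3);
    INSERT INTO StudentAbsentees VALUES (1, '10:00', '2012-04-13');
""")

print(attendance(conn, "2012-04-13"))
```

On a day with no absentee rows, the same function simply returns every student as 'Present'.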

Related

Aggregating three tables but getting wrong values during the aggregation operation

"employee" Table
emp_id  empName
------  -------
1       ABC
2       xyx
"client" Table:
id  emp_id  clientName
--  ------  ----------
1   1       a
2   1       b
3   1       c
4   2       d
"collection" Table
id  emp_id  Amount
--  ------  ------
1   2       1000
2   1       2000
3   1       1000
4   1       1200
I want to aggregate values from the three input tables shown above as samples. For each employee I need to find:
the total collection amount for that employee (as a sum)
the clients that are involved with the corresponding employee (as a comma-separated value)
Here is my current query:
SELECT emp_id,
empName,
GROUP_CONCAT(client.clientName ORDER BY client.id SEPARATOR '') AS clientName,
SUM(collection.Amount)
FROM employee
LEFT JOIN client
ON clent.emp_id = employee.emp_id
LEFT JOIN collection
ON collection.emp_id = employee.emp_id
GROUP BY employee.emp_id;
The problem with this query is that I get wrong sums and wrong client lists when an employee is associated with several clients or collections.
Current Output:
emp_id  empName  clientName         TotalCollection
------  -------  -----------------  ---------------
1       ABC      a,b,c,c,b,a,a,b,c  8400
2       xyz      d,d                1000
Expected Output:
emp_id  empName  clientName  TotalCollection
------  -------  ----------  ---------------
1       ABC      a , b , c   4200
2       xyz      d           1000
How can I solve this problem?
There are some typos in your query:
the separator inside the GROUP_CONCAT function should be a comma, given your output; since a comma is the default value, you can simply omit that clause.
each column in your SELECT clause needs to be qualified with the table it comes from whenever the same column name appears in more than one of the joined tables (emp_id, for example).
your GROUP BY clause should contain at least every non-aggregated field of the SELECT clause in order to produce a reliably correct output.
The overall conceptual problem in your query is that, for each employee, the joins pair every matching row of the "client" table with every matching row of the "collection" table (resulting in repeated rows and inflated sums during the aggregation). One way out of the rabbit hole is to aggregate the "client" table first (so that there is one row per "emp_id" value), then join the result back with the other tables.
SELECT emp.emp_id,
emp.empName,
cl.clientName,
SUM(coll.Amount)
FROM employee emp
LEFT JOIN (SELECT emp_id,
GROUP_CONCAT(client.clientName
ORDER BY client.id) AS clientName
FROM client
GROUP BY emp_id) cl
ON cl.emp_id = emp.emp_id
LEFT JOIN (SELECT emp_id, Amount FROM collection) coll
ON coll.emp_id = emp.emp_id
GROUP BY emp.emp_id,
emp.empName,
cl.clientName
Check the demo here.
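To see the row multiplication for yourself, here is a minimal sketch using Python's sqlite3 module and the sample data from the question (SQLite instead of MySQL is an assumption made so the script is self-contained; SQLite's GROUP_CONCAT takes the separator as a second argument). It first counts the rows produced by joining both child tables directly, then aggregates each child table before joining.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, empName TEXT);
    CREATE TABLE client (id INTEGER PRIMARY KEY, emp_id INTEGER, clientName TEXT);
    CREATE TABLE collection (id INTEGER PRIMARY KEY, emp_id INTEGER, Amount INTEGER);
    INSERT INTO employee VALUES (1, 'ABC'), (2, 'xyz');
    INSERT INTO client VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 1, 'c'), (4, 2, 'd');
    INSERT INTO collection VALUES (1, 2, 1000), (2, 1, 2000), (3, 1, 1000), (4, 1, 1200);
""")

# Direct double join: every client row is paired with every collection row
# of the same employee, so amounts are summed once per client.
fan_out = conn.execute("""
    SELECT e.emp_id, COUNT(*) AS joined_rows, SUM(col.Amount) AS total
    FROM employee e
    LEFT JOIN client c ON c.emp_id = e.emp_id
    LEFT JOIN collection col ON col.emp_id = e.emp_id
    GROUP BY e.emp_id
    ORDER BY e.emp_id
""").fetchall()

# Aggregating each child table first leaves one row per employee on both
# sides of the join, so nothing is multiplied.
fixed = conn.execute("""
    SELECT e.emp_id, e.empName, cl.clientNames, co.total
    FROM employee e
    LEFT JOIN (SELECT emp_id, GROUP_CONCAT(clientName, ',') AS clientNames
               FROM client GROUP BY emp_id) cl ON cl.emp_id = e.emp_id
    LEFT JOIN (SELECT emp_id, SUM(Amount) AS total
               FROM collection GROUP BY emp_id) co ON co.emp_id = e.emp_id
    ORDER BY e.emp_id
""").fetchall()

print(fan_out)
print(fixed)
```

Pre-aggregating both child tables, rather than only "client", is a design choice of this sketch: it keeps the outer query free of GROUP BY altogether.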
Regardless of my comment, here is a query for your desired output:
SELECT
a.emp_id,
a.empName,
a.clientName,
SUM(col.Amount) AS totalCollection
FROM (SELECT e.emp_id,
e.`empName`,
GROUP_CONCAT(DISTINCT c.clientName ORDER BY c.id ) AS clientName
FROM employee e
LEFT JOIN `client` c
ON c.emp_id = e.emp_id
GROUP BY e.`emp_id`) a
LEFT JOIN collection col
ON col.emp_id = a.emp_id
GROUP BY col.emp_id;
With multiple joins, you should be careful about the relationships between the tables and the number of rows that your query generates. You may well end up with more records in the output than you wanted.
Hope this helps
SELECT employee.emp_id,
employee.empName,
GROUP_CONCAT(client.clientName ORDER BY client.id SEPARATOR '') AS clientName,
C.Amount
FROM employee
LEFT JOIN client
ON client.emp_id = employee.emp_id
LEFT JOIN (select collection.emp_id , sum(collection.Amount ) as Amount from collection group by collection.emp_id) C
ON C.emp_id = employee.emp_id
GROUP BY employee.emp_id;
it works for me now

SQL Select From Master - Detail Tables (Formatted Data)

I have two tables, named supplier and contacts.
The data in the contact table corresponds to a record on the supplier table.
Data of supplier
ID  Name
--  ------
1   Hp
2   Huawei
Data for the contact
id  supplierId  Contact
--  ----------  -------
1   1           John
2   1           Smith
3   1           Will
4   2           Doe
5   2           Wick
Now, I want to make a query that should return the following result
ID  Name    Contact
--  ------  -----------------
1   Hp      John, Smith, Will
2   Huawei  Doe, Wick
or should return the following result
ID  Name    Contact  Contact  Contact
--  ------  -------  -------  -------
1   Hp      John     Smith    Will
2   Huawei  Doe      Wick
You can use the MySQL GROUP_CONCAT aggregation function to get your first output table. Its own ORDER BY clause lets you control the order in which the rows are concatenated.
SELECT s.ID,
s.Name,
GROUP_CONCAT(c.Contact ORDER BY c.id)
FROM Supplier s
INNER JOIN Contact c
ON s.ID = c.supplierId
GROUP BY s.ID,
s.Name
For the second output table, you can use the window function ROW_NUMBER to assign a rank to each row inside the Contact table, partitioning on the supplier. Then split the contacts into three columns using the IF function, which checks for the three possible values of the ranking. The MAX aggregation function then removes the NULLs.
SELECT s.ID,
s.Name,
MAX(IF(c.rn = 1, c.Contact, NULL)) AS Contact1,
MAX(IF(c.rn = 2, c.Contact, NULL)) AS Contact2,
MAX(IF(c.rn = 3, c.Contact, NULL)) AS Contact3
FROM Supplier s
INNER JOIN (SELECT *, ROW_NUMBER() OVER(PARTITION BY supplierId
ORDER BY id) AS rn
FROM Contact ) c
ON s.ID = c.supplierId
GROUP BY s.ID,
s.Name;
This second query will not show every contact if you have more than three contacts per supplier. In that case you either modify the query to cover the maximum possible number of contacts, or you use a prepared statement. If you really need such a solution, leave a comment below.
The first solution will work on any MySQL version, while the second one requires MySQL 8 or later.
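The ranking-then-pivoting step can be checked on the sample data with a short script. This is a sketch under assumptions: it uses Python's sqlite3 module instead of MySQL (window functions need SQLite 3.25 or later), and CASE replaces MySQL's IF function, which SQLite does not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE contact (id INTEGER PRIMARY KEY, supplierId INTEGER, Contact TEXT);
    INSERT INTO supplier VALUES (1, 'Hp'), (2, 'Huawei');
    INSERT INTO contact VALUES (1, 1, 'John'), (2, 1, 'Smith'), (3, 1, 'Will'),
                               (4, 2, 'Doe'), (5, 2, 'Wick');
""")

pivot = conn.execute("""
    SELECT s.ID,
           s.Name,
           -- each CASE keeps only the contact with the wanted rank;
           -- MAX collapses the remaining NULLs of the group
           MAX(CASE WHEN c.rn = 1 THEN c.Contact END) AS Contact1,
           MAX(CASE WHEN c.rn = 2 THEN c.Contact END) AS Contact2,
           MAX(CASE WHEN c.rn = 3 THEN c.Contact END) AS Contact3
    FROM supplier s
    JOIN (SELECT supplierId, Contact,
                 ROW_NUMBER() OVER (PARTITION BY supplierId ORDER BY id) AS rn
          FROM contact) c
      ON c.supplierId = s.ID
    GROUP BY s.ID, s.Name
    ORDER BY s.ID
""").fetchall()

for row in pivot:
    print(row)
```

A supplier with fewer than three contacts gets NULL (None in Python) in the unused columns.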
Query to show the table like you want :
SELECT supplier.ID, supplier.Name, contact.Contact
FROM supplier
INNER JOIN contact
ON supplier.ID = contact.supplierId;

SQL Query to Return Top Field for Each Member

Let's say I have a table with the following fields...
MEMBER_ID (text)
CATEGORY1 (int)
CATEGORY2 (int)
CATEGORY3 (int)
CATEGORY4 (int)
...and let's say I have, like, 30+ more CATEGORY fields, all numbered accordingly. And in each category field, there is a numerical score.
Is there a query that could be used to populate a new table that looks like so...
MEMBER_ID
TOP_CATEGORY (the category name from the previous table with the
highest score for this MEMBER_ID)
SECOND_CATEGORY (the category name from the previous table with the
second-highest score for this MEMBER_ID)
THIRD_CATEGORY (the category name from the previous table with the
third-highest score for this MEMBER_ID)
...I know I could use CASE, but if I have a ton of CATEGORY fields, I assume that would get unwieldy. Do I have any other options?
Your best option in my opinion would be to normalise your database structure. Create a table of members, a table of categories and a table of scores by member & category. Having normalised, the problem you are trying to solve becomes a "top n per group" problem, followed by conditional aggregation. For example, if you follow the template for normalising that I have made in this demo, your query would look something like this:
select member_id,
max(case when `rank` = 1 then name end) as top_category,
max(case when `rank` = 2 then name end) as second_category,
max(case when `rank` = 3 then name end) as third_category
from (select n.member_id, c.name, s1.score, count(s2.score) + 1 as `rank`
from score s1
left join score s2 on s2.member = s1.member and s1.score < s2.score
join new_members n on n.id = s1.member
join category c on c.id = s1.category
group by n.member_id, c.name, s1.score
having count(s2.score) < 3) s
group by member_id
Once your database is normalised, adding new members, categories and scores becomes a lot easier, and queries to get the data out such as the above don't have to change at all.
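As a rough illustration of the normalise-then-rank idea, the sketch below unpivots a wide table into one row per member and category, then applies the same self-join ranking and conditional aggregation. It is a simplification run on SQLite through Python's sqlite3 module: the wide and score table names, the sample scores, and the single score table (without separate member and category tables) are all assumptions made for the example.

```python
import sqlite3

# Column names come from this fixed, trusted list, so building SQL with
# an f-string is safe here; never do this with user-supplied names.
CATEGORIES = ["CATEGORY1", "CATEGORY2", "CATEGORY3", "CATEGORY4"]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wide (MEMBER_ID TEXT, CATEGORY1 INT, CATEGORY2 INT,
                       CATEGORY3 INT, CATEGORY4 INT);
    INSERT INTO wide VALUES ('m1', 5, 9, 7, 1), ('m2', 3, 2, 8, 6);
    CREATE TABLE score (member_id TEXT, category TEXT, score INT);
""")

# Unpivot: one INSERT ... SELECT per category column.
for col in CATEGORIES:
    conn.execute(f"INSERT INTO score SELECT MEMBER_ID, '{col}', {col} FROM wide")

top3 = conn.execute("""
    SELECT member_id,
           MAX(CASE WHEN rnk = 1 THEN category END) AS top_category,
           MAX(CASE WHEN rnk = 2 THEN category END) AS second_category,
           MAX(CASE WHEN rnk = 3 THEN category END) AS third_category
    FROM (SELECT s1.member_id, s1.category,
                 COUNT(s2.score) + 1 AS rnk   -- 1 + number of higher scores
          FROM score s1
          LEFT JOIN score s2
                 ON s2.member_id = s1.member_id AND s2.score > s1.score
          GROUP BY s1.member_id, s1.category
          HAVING COUNT(s2.score) < 3) ranked
    GROUP BY member_id
    ORDER BY member_id
""").fetchall()

print(top3)
```

Note that tied scores share a rank with this counting approach, so a tie can leave one of the three output columns empty.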

Group SQL by newly defined parameters

I have been given the task of creating a statistic from tables that look like this:
Faculty
1 FacultyName1
2 FacultyName2
3 FacultyName3
4 FacultyName4
5 FacultyName5
and this:
Student
1 StudentName1 FacultyNr2
2 StudentName2 FacultyNr3
3 StudentName3 FacultyNr5
4 StudentName4 FacultyNr2
Now I have to create a statistic which groups the faculties into newly defined groups and counts the students per group.
Say:
Faculty Group 1 Count: 3
Faculty Group 2 Count: 1
For this example, let's say that FacultyName1, FacultyName2 and FacultyName3 should be listed under "Faculty Group 1", and FacultyName4 and FacultyName5 under "Faculty Group 2".
I started by doing the following:
Select Count(*)
FROM Student INNER JOIN Faculty on Student.FacultyID = Faculty.ID
But I am stuck trying to understand how to group: how could I create the groups in the code, so that I could just say something like: Group by FacultyGroups (Select Case When FacultyName = 'FacultyName1' = 'Faculty Group 1')
or something similar? Does anybody have any idea?
Assuming that you have added a GroupID column in your Faculty table
SELECT COUNT(*), f.GroupID
FROM Student AS s
INNER JOIN Faculty AS f ON s.FacultyID = f.ID
GROUP BY f.GroupID
It gives you the number of students per group of faculties, together with the id of that group.
There are better ways, but this should work:
SELECT
CASE
WHEN f.Name IN ('FacultyName1', 'FacultyName2', 'FacultyName3') THEN 'FacultyGroup1'
WHEN f.Name IN ('FacultyName4', 'FacultyName5') THEN 'FacultyGroup2'
END AS FacultyGroup,
COUNT(*) AS Students
FROM
Student s
INNER JOIN Faculty f ON s.FacultyID = f.ID
GROUP BY
CASE
WHEN f.Name IN ('FacultyName1', 'FacultyName2', 'FacultyName3') THEN 'FacultyGroup1'
WHEN f.Name IN ('FacultyName4', 'FacultyName5') THEN 'FacultyGroup2'
END;
If your "group" logic becomes too long then it will look messy in your query, so you might want to pre-calculate this. You could do this by using a sub-query for example, so one part of your query (the sub query) would convert faculties to groups and the other "main" part would count the students per group.
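A sub-query version of that idea might look like the sketch below, run here on SQLite through Python's sqlite3 module with the sample rows from the question. The column names (ID, Name, FacultyID) are assumptions, since the question does not list them. The sub-query maps each faculty to its group once, and the outer query only counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Faculty (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Student (ID INTEGER PRIMARY KEY, Name TEXT, FacultyID INTEGER);
    INSERT INTO Faculty VALUES (1, 'FacultyName1'), (2, 'FacultyName2'),
                               (3, 'FacultyName3'), (4, 'FacultyName4'),
                               (5, 'FacultyName5');
    INSERT INTO Student VALUES (1, 'StudentName1', 2), (2, 'StudentName2', 3),
                               (3, 'StudentName3', 5), (4, 'StudentName4', 2);
""")

counts = conn.execute("""
    SELECT fg.FacultyGroup, COUNT(*) AS Students
    FROM Student s
    JOIN (-- the grouping logic lives in one place only
          SELECT ID,
                 CASE
                     WHEN Name IN ('FacultyName1', 'FacultyName2', 'FacultyName3')
                         THEN 'Faculty Group 1'
                     WHEN Name IN ('FacultyName4', 'FacultyName5')
                         THEN 'Faculty Group 2'
                 END AS FacultyGroup
          FROM Faculty) fg
      ON fg.ID = s.FacultyID
    GROUP BY fg.FacultyGroup
    ORDER BY fg.FacultyGroup
""").fetchall()

print(counts)
```

Because the CASE expression appears only inside the sub-query, it no longer has to be repeated in the GROUP BY clause.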

SQL Questions on generating a new table [closed]

The image shows the structure of my table. The first line means tutorB gives 10 marks to studentD. The second line means tutorE does not give any marks to studentD yet.
How can I generate the following table? I have referred to another post on stackoverflow.com, Collaborative filtering in MySQL?, yet I am still quite confused.
In the image shown above, o means recommended (the rate is higher than or equal to 7); x means not recommended (the rate is less than 7).
For example, tutorB gives studentD 10 marks; therefore, in the second line of the image, there is an "o" in column StudentD. (The other three rows' data are just randomly assigned for now.)
Now, suppose I want to recommend a student for Tutor A. The ranks (or similarities) of TutorB, C and D are 0, 2 and 3 respectively.
How can I write SQL that converts the rate to "o" and "x" and calculates the rank? And, most importantly, I want to recommend StudentH to TutorA, as in the image.
How should I modify the code from the previous post? And is my idea described above correct?
Thanks.
============================================================================
EDITED
I have the following data in the database. The first row means 10 marks is given by tutorA to studentC.
I convert it as another table for a better understanding. v is the value of Rate.
create temporary table ub_rank as
select similar.NameA,count(*) rank
from tbl_rating target
join tbl_rating similar on target.NameB= similar.NameB and target.NameA != similar.NameA
where target.NameA = "tutorA"
group by similar.NameA;
select similar.NameB, sum(ub_rank.rank) total_rank
from ub_rank
join ub similar on ub_rank.NameA = similar.NameA
left join ub target on target.NameA = "tutorA" and target.NameB = similar.NameB
where target.NameB is null
group by similar.NameB
order by total_rank desc;
select * from ub_rank;
The code above is referenced from Collaborative filtering in MySQL?. I have a few questions.
There are 2 parts in the SQL. I can select * from the first part. However, if I run the whole SQL as shown above, the system reports: Table 'mydatabase.ub' doesn't exist. How should I modify the code?
The code will find the similarity. How should I change the code so that marks less than 7 are converted to o and other marks to v, and the similarity of a given user is counted?
Shamelessly borrowing from the answer to this previous question, see if this does the trick:
SET @sql = NULL;
SELECT
GROUP_CONCAT(DISTINCT
CONCAT(
'max(case when NameB = ''',
NameB,
''' then (case when rate >= 7 then ''x'' else ''o'' end) else '' '' end) AS ',
replace(NameB, ' ', '')
)
) INTO @sql
from tbl_rating
where RoleA = 'Tutor';
SET @sql = CONCAT('SELECT NameA, ', @sql,
' from tbl_rating
where RoleA = ''Tutor''
group by NameA');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
Here is a SQL Fiddle.
Your DB schema is actually not very easy to work with.
Here's a query to get an exhaustive rating table:
SELECT Tutor.Name, Student.Name,
CASE WHEN Rating.Rate IS NULL THEN ''
WHEN Rating.Rate > 6 THEN 'o'
ELSE 'x' END
FROM (
SELECT DISTINCT NameB AS Name
FROM tbl_rating
WHERE RoleB='Tutor'
UNION
SELECT DISTINCT NameA AS Name
FROM tbl_rating
WHERE RoleA='Tutor'
ORDER BY Name) AS Tutor
CROSS JOIN (
SELECT DISTINCT NameB AS Name
FROM tbl_rating
WHERE RoleB='Student'
UNION
SELECT DISTINCT NameA AS Name
FROM tbl_rating
WHERE RoleA='Student'
ORDER BY Name) AS Student
LEFT JOIN tbl_rating AS Rating
ON Tutor.Name = Rating.NameA
AND Student.Name = Rating.NameB
ORDER BY Tutor.Name, Student.Name
The above query works by extracting from the table the list of all tutors (first subquery, aliased to Tutor) and the list of all students (second subquery, Student), then taking the product of both sets to obtain every possible combination of tutor and student. It then does an outer join with the rating table, which finds all the ratings given by tutors to students and fills in NULL for non-existent ratings.
(The query for the opposite ratings, i.e. those given by students to tutors, can be obtained by swapping NameA and NameB in the LEFT JOIN clause.)
The CASE turns numerical (or null) ratings into symbols as requested.
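The cross-join-then-outer-join pattern can be tried on a tiny data set. The sketch below runs on SQLite through Python's sqlite3 module; the tbl_rating columns are inferred from the queries in this thread, the three sample rows are hypothetical, and the tutor and student lists are simplified to a single SELECT each instead of the UNION used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_rating (NameA TEXT, RoleA TEXT, NameB TEXT, RoleB TEXT, Rate INT);
    -- hypothetical ratings given by tutors (NameA) to students (NameB)
    INSERT INTO tbl_rating VALUES
        ('tutorA', 'Tutor', 'studentC', 'Student', 10),
        ('tutorA', 'Tutor', 'studentD', 'Student', 4),
        ('tutorB', 'Tutor', 'studentD', 'Student', 10);
""")

matrix = conn.execute("""
    SELECT t.Name, s.Name,
           CASE WHEN r.Rate IS NULL THEN ''   -- no rating for this pair
                WHEN r.Rate > 6 THEN 'o'      -- recommended
                ELSE 'x'                      -- not recommended
           END AS Mark
    FROM (SELECT DISTINCT NameA AS Name FROM tbl_rating WHERE RoleA = 'Tutor') t
    CROSS JOIN (SELECT DISTINCT NameB AS Name FROM tbl_rating WHERE RoleB = 'Student') s
    LEFT JOIN tbl_rating r
           ON r.NameA = t.Name AND r.NameB = s.Name
    ORDER BY t.Name, s.Name
""").fetchall()

for row in matrix:
    print(row)
```

The cross join guarantees one output row per tutor and student pair, and the left join is what leaves the mark empty for pairs that have no rating.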
For similarities, we need to add two more joins:
one more on Tutor,
and another one on Rating
thus giving:
SELECT T1.Name AS Tutor1 , T2.Name AS Tutor2,
SUM( CASE
WHEN (R1.Rate > 6 && R2.Rate > 6) ||
(R1.Rate < 7 && R2.Rate < 7) THEN 1
ELSE 0 END) AS SIMILARITY
FROM (
SELECT DISTINCT NameB AS Name
FROM tbl_rating
WHERE RoleB='Tutor'
UNION
SELECT DISTINCT NameA AS Name
FROM tbl_rating
WHERE RoleA='Tutor'
ORDER BY Name) AS T1
CROSS JOIN (
SELECT DISTINCT NameB AS Name
FROM tbl_rating
WHERE RoleB='Tutor'
UNION
SELECT DISTINCT NameA AS Name
FROM tbl_rating
WHERE RoleA='Tutor'
ORDER BY Name) AS T2
CROSS JOIN (
SELECT DISTINCT NameB AS Name
FROM tbl_rating
WHERE RoleB='Student'
UNION
SELECT DISTINCT NameA AS Name
FROM tbl_rating
WHERE RoleA='Student'
ORDER BY Name) AS Student
LEFT JOIN tbl_rating AS R1
ON T1.Name = R1.NameA
AND Student.Name = R1.NameB
LEFT JOIN tbl_rating AS R2
ON T2.Name = R2.NameA
AND Student.Name = R2.NameB
WHERE T1.Name < T2.Name
GROUP BY Tutor1, Tutor2
ORDER BY Tutor1, Tutor2
You could make these queries much more efficient by extracting the student-specific and tutor-specific data into their own tables, splitting the rating table into student ratings and tutor ratings, and using foreign keys:
Table student : Id | Name
Table tutor: Id | Name
Table tutor_rating: StudentId | TutorId | Rate
Table student_rating: StudentId | TutorId | Rate
and possibly a tutor_similarity table to avoid recomputing the whole dataset all the time, with a couple of triggers on the rating tables to update it (the similarity computation would then be incremental, and queries would just dump its content).
Table tutor_similarity: TutorId1 | TutorId2 | Similarity
This is really a comment but it is too long for a comment.
First, you cannot easily create a table with a variable number of columns. Do you know the columns in advance? In general, you represent a matrix the way you do in your original table . . . the "x" and "y" values are columns and the value goes in a third column.
Second, is the x and o based on the rating from the tutor to the student or vice versa? Your question is entirely ambiguous.
Third, to convert a rating to an "x" or "o", just use a case statement:
select (case when rating >= 7 then 'x' else 'o' end)
Fourth, you say the similarities from A to B, C, and D are 0, 2, and 3 respectively. I have no idea how you are getting this from the matrix that you show. If it is by overlap of "x"s, then the values would seem to be 0, 1, and 2.
My final conclusion is that you don't need to create a matrix like that at all because you already have the data in the correct format.