join 2 different tables to a third table - mysql

I have an sql issue. I have 3 tables like in the image. In the front end (User Interface), I have a selectone box to select a course and an employee autocomplete. The autocomplete must retrieve all employee names along with the status for the selected course.
I tried
select e.id,per.id,t.status
from employee e
join person per on e.personId=per.id
left join training t on e.id=t.employeeId
but this retrieves duplicate rows for the employeeId '1'.
For the employee with id 1, I need to retrieve only the row with the selected courseId (selected from the user interface).
In short, I need all employees' information plus the status for the selected course, and employee ids must not repeat.
If selected course id is 34, the retrieved output must contain
Empid,PersonName,Status
1, Ravi , 1;
2, Meera , 0;
3, Rahul ,0;
4, Vinu, 0.
How do I form the required SQL query?
Following the suggestion provided, I modified the accepted answer slightly to fit my requirement:
SELECT e.id,per.name,COALESCE(t.status,0)
FROM employee e
JOIN person per ON e.personId=per.id
LEFT JOIN training t ON e.id=t.employeeId
AND t.courseId = ?

The trick with left joins is to add the condition on the left-joined table to the join condition:
select e.id,per.id,t.status
from employee e
join person per on e.personId=per.id
left join training t on e.id=t.employeeId
and t.courseId = ?
This only attempts joins to specific training rows.
If you put the course condition into a WHERE clause, you lose the left join: it effectively becomes an inner join, because WHERE conditions are applied after the join is made, and they discard the rows where t.courseId is NULL. Conditions in the join condition, however, are applied as the join is made.
As a general comment, many people don't realise that you can put non-key conditions into a join condition. In fact, as in this situation, it is the cleanest way to achieve the output you want.
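To make the difference concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; the join behaviour shown is standard SQL). The sample rows, including the second course id 35, are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (id INTEGER PRIMARY KEY, personId INTEGER);
    CREATE TABLE training (employeeId INTEGER, courseId INTEGER, status INTEGER);
    INSERT INTO person   VALUES (10, 'Ravi'), (20, 'Meera'), (30, 'Rahul'), (40, 'Vinu');
    INSERT INTO employee VALUES (1, 10), (2, 20), (3, 30), (4, 40);
    -- Hypothetical data: employee 1 took courses 34 and 35, employee 2 only course 35.
    INSERT INTO training VALUES (1, 34, 1), (1, 35, 1), (2, 35, 1);
""")

BASE = """
    SELECT e.id, per.name, COALESCE(t.status, 0)
    FROM employee e
    JOIN person per ON e.personId = per.id
    LEFT JOIN training t ON e.id = t.employeeId
"""

def rows_filter_in_on(course_id):
    # The course filter is part of the join condition: every employee is kept.
    sql = BASE + " AND t.courseId = ? ORDER BY e.id"
    return conn.execute(sql, (course_id,)).fetchall()

def rows_filter_in_where(course_id):
    # The course filter runs after the join: unmatched employees are dropped.
    sql = BASE + " WHERE t.courseId = ? ORDER BY e.id"
    return conn.execute(sql, (course_id,)).fetchall()

on_rows = rows_filter_in_on(34)
where_rows = rows_filter_in_where(34)
print(on_rows)
print(where_rows)
```

With the filter in the join condition, all four employees come back once each; with the filter in WHERE, only the employee who actually has a row for course 34 survives.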

Related

Proper SQL Statements in a PDF?

I am in the process of creating an attendance system and have created 3 different reports to generate based on the content of 3 different MySQL tables: members, attendance, and absence.
I am having an issue though. One of the reports is working since I have the correct statement. However, I cannot get the other two to work, so I need some help on how to figure out the best SQL statement for these reports.
The first report I need has to look like this:
This report shows how many people in each precinct showed up to the event and how many excused absences are in that precinct. For this report, I will also need a "Totals" line at the very bottom to count the total number of attendees, excused absences and totals from each precinct (like this):
The second report is similar to the report that is already completed. The difference is that instead of the member's email and phone number, I need to see whether they were marked present and whether they had an excused absence. I cannot show the report since it contains real data about real people, but I can show you the SQL statement that the completed report is using:
SELECT
precinct, name, residential_address, member_email, member_phone, present, alternate
FROM
attendance INNER JOIN members ON members.id = attendance.member_id
WHERE
present = 1
ORDER BY
members.precinct
I've tried SQL COUNT statements and various JOIN queries to try and make the queries work, but nothing is working at all. What is the correct query and why?
UPDATE
Here is my table structure of the 3 tables involved in the report generation. Note that each table (other than Members) shares the Member ID column:
Members Table:
Attendance Table:
Absence Table:
This is untested against actual data, but should be close to what you're looking for.
Your first report (http://sqlfiddle.com/#!9/191d8d/1) should be:
SELECT
m.precinct as 'precinct',
COUNT(at.member_id) as 'delegates_present',
COUNT(ab.member_id) as 'delegates_absent',
COUNT(at.member_id) + COUNT(ab.member_id) as 'total'
FROM
members m
LEFT JOIN attendance at ON at.member_id = m.id
LEFT JOIN absence ab ON ab.member_id = m.id
GROUP BY
m.precinct
WITH ROLLUP;
This selects all members, groups them by precinct, counts how many were present or absent, and then adds those together for the total column. Additionally, WITH ROLLUP will give you the sums of the columns (https://dev.mysql.com/doc/refman/8.0/en/group-by-modifiers.html) as the last row.
Then your second report (http://sqlfiddle.com/#!9/191d8d/2) should be:
SELECT
precinct,
name,
residential_address,
IF(at.member_id IS NULL, 0, 1) as 'present',
IF(ab.member_id IS NULL, 0, 1) as 'absent',
alternate
FROM
members m
LEFT JOIN attendance at ON at.member_id = m.id
LEFT JOIN absence ab ON ab.member_id = m.id
ORDER BY
m.precinct,
m.id;
which selects all members and does a LEFT JOIN on the other 2 tables. Then we can use a condition in the SELECT to determine if they were present or absent. There are a number of ways that data could be represented and returned, but I've opted for a simple 1 or 0 in both of those columns.
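As a rough check of the first report's logic, the sketch below runs the same two LEFT JOINs and COUNTs with Python's built-in sqlite3 module. Several things here are assumptions: SQLite stands in for MySQL, the three tables are reduced to a hypothetical minimal schema with invented rows, the aliases are renamed to att and ab, and because SQLite has no WITH ROLLUP the totals row is produced by a second, ungrouped query (with NULL in the precinct column, which is how ROLLUP marks its summary row).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified schema; the real tables have more columns.
    CREATE TABLE members    (id INTEGER PRIMARY KEY, precinct TEXT, name TEXT);
    CREATE TABLE attendance (member_id INTEGER);
    CREATE TABLE absence    (member_id INTEGER);
    INSERT INTO members VALUES
        (1, 'P1', 'Ann'), (2, 'P1', 'Ben'), (3, 'P2', 'Cat'), (4, 'P2', 'Dan');
    INSERT INTO attendance VALUES (1), (3);
    INSERT INTO absence    VALUES (2);
""")

# Shared tail of both queries: the counts and the two LEFT JOINs.
COUNTS = """
    COUNT(att.member_id),
    COUNT(ab.member_id),
    COUNT(att.member_id) + COUNT(ab.member_id)
    FROM members m
    LEFT JOIN attendance att ON att.member_id = m.id
    LEFT JOIN absence ab ON ab.member_id = m.id
"""

per_precinct = conn.execute(
    "SELECT m.precinct, " + COUNTS + " GROUP BY m.precinct ORDER BY m.precinct"
).fetchall()

# Stand-in for the WITH ROLLUP summary row: same counts over all members.
totals = conn.execute("SELECT NULL, " + COUNTS).fetchone()

report = per_precinct + [totals]
for row in report:
    print(row)
```

Note that these counts are only right when a member has at most one attendance row and at most one absence row; with several rows per member the two joins multiply each other, and something like COUNT(DISTINCT ...) would be needed.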

Sql statement fetching information from two different tables

I have two tables: one is called participants_tb and the other is called allocation_tb. In participants_tb, the columns are participant_id, name, username.
Under the allocation_tb, I have my columns as allocation_id, sender_username, receiver_username, done. The column done holds any of these three numbers: 0, 1, 2.
I used this sql statement to fetch my values
SELECT *, COUNT(done) d
FROM participants_tb
JOIN allocation_tb ON (username=receiver_username)
WHERE done = 0 || done = 1
GROUP BY receiver_username
It worked very well. The problem is that I also want to include participants that are in participants_tb but not in allocation_tb. I tried a left outer join, but it did not work as expected: because the done condition in the WHERE clause refers to allocation_tb, the participants with no allocation rows are still excluded.
You seem to want:
SELECT p.*, COUNT(a.done) as d
FROM participants_tb p LEFT JOIN
allocation_tb a
ON p.username = a.receiver_username AND
a.done IN (0, 1)
GROUP BY p.participant_id;
Notes:
The LEFT JOIN keeps all participants.
The GROUP BY needs to be on the first table.
You can use SELECT p.* with the GROUP BY -- assuming that the GROUP BY key is unique (or the primary key).
All columns should be qualified.
IN is an easier way to express your logic.
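The notes above can be checked with a small sketch. It uses Python's built-in sqlite3 module in place of MySQL (an assumption), and the three participants and their allocation rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE participants_tb (participant_id INTEGER PRIMARY KEY, name TEXT, username TEXT);
    CREATE TABLE allocation_tb (
        allocation_id INTEGER PRIMARY KEY,
        sender_username TEXT, receiver_username TEXT, done INTEGER);
    -- Hypothetical rows: 'ann' receives two open allocations, 'bob' only a done = 2 one,
    -- and 'cy' receives nothing at all.
    INSERT INTO participants_tb VALUES (1, 'Ann', 'ann'), (2, 'Bob', 'bob'), (3, 'Cy', 'cy');
    INSERT INTO allocation_tb VALUES
        (1, 'bob', 'ann', 0), (2, 'cy', 'ann', 1), (3, 'ann', 'bob', 2);
""")

counts = conn.execute("""
    SELECT p.participant_id, p.username, COUNT(a.done) AS d
    FROM participants_tb p
    LEFT JOIN allocation_tb a
           ON p.username = a.receiver_username
          AND a.done IN (0, 1)      -- filter lives in the join, not in WHERE
    GROUP BY p.participant_id
    ORDER BY p.participant_id
""").fetchall()
print(counts)
```

COUNT(a.done) skips the NULLs produced for unmatched participants, so they appear with a count of 0 instead of disappearing.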

How to properly use inner join in SQL when I have to join a table twice?

Note: The actual schema isn't male/female, but some other criteria. I'm just using male/female for this example to make it easier to understand.
I have a table "users", which contains a column user_name and user_gender. The gender can be "M" or "F".
The problem is that I have another table, "messages", that has a column for "sender" and "receiver". These columns contains user_name in each row.
How can I use INNER JOIN so that I can get messages where only males send to females?
I know how to specify it once, binding users.user_name to either "sender" or "receiver", but not to both.
To expand on my question, how do I find the top 10 pairs where a male sent the most messages to a female? Note that this means unique A/B pairs: I want cases where a guy sends a single female a ton of messages, not cases where a guy spams a lot of messages to different females.
Think of your messages table as a "cross table" connecting two rows in the users table. When you join to a table like that, give users two different aliases, and refer to them in your join conditions, like this:
select *
from messages msg
join users m on msg.sender = m.user_name AND m.user_gender='M'
join users f on msg.receiver = f.user_name AND f.user_gender='F'
With this skeleton in hand, you should be able to figure out the rest of your query:
Use GROUP BY to group by m.user_name, f.user_name, and COUNT(*) to count the messages in each pair
Order by COUNT(*) DESC to get the highest sender+receiver pairs at the top
Use LIMIT to grab the top ten pairs.
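Putting those three steps together might look like the sketch below. It assumes, as the question states, that sender and receiver hold user_name values; it runs on Python's built-in sqlite3 module rather than MySQL, and the users and messages are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users    (user_name TEXT PRIMARY KEY, user_gender TEXT);
    CREATE TABLE messages (sender TEXT, receiver TEXT);
    INSERT INTO users VALUES ('al', 'M'), ('bo', 'M'), ('cy', 'F'), ('di', 'F');
    -- Hypothetical traffic: al->cy x3, bo->cy x2, al->di x1,
    -- plus one F->M and one M->M message that must be ignored.
    INSERT INTO messages VALUES
        ('al', 'cy'), ('al', 'cy'), ('al', 'cy'),
        ('bo', 'cy'), ('bo', 'cy'),
        ('al', 'di'),
        ('cy', 'al'),
        ('al', 'bo');
""")

top_pairs = conn.execute("""
    SELECT m.user_name AS sender, f.user_name AS receiver, COUNT(*) AS n
    FROM messages msg
    JOIN users m ON msg.sender   = m.user_name AND m.user_gender = 'M'
    JOIN users f ON msg.receiver = f.user_name AND f.user_gender = 'F'
    GROUP BY m.user_name, f.user_name
    ORDER BY n DESC, sender, receiver   -- name columns only break ties
    LIMIT 10
""").fetchall()
print(top_pairs)
```

The users table appears twice under the aliases m and f, so each message row is matched once against its sender and once against its receiver.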

Conditional JOIN determines table based on INSTR of main table field

Slightly unusual requirement here, which unfortunately is down to a poor table design a long way back down the development path!
I have 3 tables, repairs, staff, technicians
The repairs table contains all the information on repair tasks booked in to my system. This contains a field "Technician" this field will contain the ID of either a staff member (from the table staff) or an ID of an outsourced (offsite) technician (from the table technicians), in this latter case the ID will have a prefix of "T"
So, due to this latter case prefix, I need to be able to grab that T and use it to determine whether my SQL query needs to JOIN table staff or table technicians
So, I have a fairly simple SQL Query:
SELECT technicians.screenName,
repairs.turnAround, repairs.technician, repairs.dateIn, repairs.Type,
invoices.status, invoices.grossTotal
FROM repairs
LEFT JOIN invoices ON repairs.invNo=invoices.id
LEFT JOIN technicians ON technicians.id = REPLACE(repairs.technician, 'T','')
WHERE repairs.id ='REQUIRED JOB ID' ORDER BY repairs.dateIn DESC
This will work fine, and overcomes the "T" prefix for all cases where I have an outsourced technician.
BUT...
The IDs will mix up if I try to JOIN the staff table.
So I need a conditional join, such as:
LEFT JOIN
WHEN instr(repairs.technician,'T') > 0 THEN
JOIN TECHNICIANS TABLE
ELSE
JOIN STAFF TABLE
END
The further issue I can see is that technicians.screenName in the field list will not work if I am not joining the technicians table. However, the staff table also has a screenName field, which I'd need if I joined that table. If I used the field name screenName with no table prefix, it should work, shouldn't it?
EDIT: I should probably add that the conditional join example above does NOT work!
You can LEFT JOIN both tables, putting the INSTR test into each join condition so that only one of them can match a given row, and then use COALESCE to show whichever screenName is not NULL:
SELECT COALESCE(technicians.screenName, staff.screenName) AS screenName,
repairs.turnAround, repairs.technician, repairs.dateIn, repairs.Type,
invoices.status, invoices.grossTotal
FROM repairs
LEFT JOIN invoices ON repairs.invNo=invoices.id
LEFT JOIN technicians ON instr(repairs.technician,'T') > 0 AND technicians.id = REPLACE(repairs.technician, 'T','')
LEFT JOIN staff ON instr(repairs.technician,'T') = 0 AND staff.id = repairs.technician
WHERE repairs.id ='REQUIRED JOB ID' ORDER BY repairs.dateIn DESC
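A cut-down sketch of this pattern, run on Python's built-in sqlite3 module instead of MySQL (an assumption; SQLite also provides instr and replace). The tables are reduced to the columns that matter, ids are stored as text to keep the comparison simple, and the names and ids are invented. A staff member and an outsourced technician deliberately share the id 5 to show that the prefix keeps them apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical minimal versions of the three tables.
    CREATE TABLE staff       (id TEXT, screenName TEXT);
    CREATE TABLE technicians (id TEXT, screenName TEXT);
    CREATE TABLE repairs     (id INTEGER PRIMARY KEY, technician TEXT);
    INSERT INTO staff       VALUES ('5', 'Sam Staff');
    INSERT INTO technicians VALUES ('5', 'Tia Tech');
    INSERT INTO repairs     VALUES (1, '5'), (2, 'T5');
""")

assigned = conn.execute("""
    SELECT repairs.id,
           COALESCE(technicians.screenName, staff.screenName) AS screenName
    FROM repairs
    LEFT JOIN technicians
           ON instr(repairs.technician, 'T') > 0
          AND technicians.id = REPLACE(repairs.technician, 'T', '')
    LEFT JOIN staff
           ON instr(repairs.technician, 'T') = 0
          AND staff.id = repairs.technician
    ORDER BY repairs.id
""").fetchall()
print(assigned)
```

Because the two INSTR tests are mutually exclusive, at most one of the two joins supplies a row, and COALESCE picks it.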

In what order are MySQL JOINs evaluated?

I have the following query:
SELECT c.*
FROM companies AS c
JOIN users AS u USING(companyid)
JOIN jobs AS j USING(userid)
JOIN useraccounts AS us USING(userid)
WHERE j.jobid = 123;
I have the following questions:
Is the USING syntax synonymous with ON syntax?
Are these joins evaluated left to right? In other words, does this query say: x = companies JOIN users; y = x JOIN jobs; z = y JOIN useraccounts;
If the answer to question 2 is yes, is it safe to assume that the companies table has companyid, userid and jobid columns?
I don't understand how the WHERE clause can be used to pick rows on the companies table when it is referring to the alias "j"
Any help would be appreciated!
USING (fieldname) is a shorthand way of saying ON table1.fieldname = table2.fieldname.
SQL doesn't define the 'order' in which JOINS are done because it is not the nature of the language. Obviously an order has to be specified in the statement, but an INNER JOIN can be considered commutative: you can list them in any order and you will get the same results.
That said, when constructing a SELECT ... JOIN, particularly one that includes LEFT JOINs, I've found it makes sense to regard the second JOIN as joining the new table to the results of the first JOIN, the third JOIN as joining to the results of the second JOIN, and so on.
More rarely, the specified order can influence the behaviour of the query optimizer, due to the way it influences the heuristics.
No. The way the query is assembled, it requires that companies and users both have a companyid, jobs has a userid and a jobid, and useraccounts has a userid. However, only one of companies or users needs a userid for the JOIN to work.
The WHERE clause is filtering the whole result -- i.e. all JOINed columns -- using a column provided by the jobs table.
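The points in this answer can be tried out with a small sketch. It runs on Python's built-in sqlite3 module rather than MySQL (an assumption), and the four tables use a hypothetical minimal schema with invented rows. It compares the USING form with a spelled-out ON form and filters on j.jobid even though only companies columns are selected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical minimal schema matching the column names in the question.
    CREATE TABLE companies    (companyid INTEGER, name TEXT);
    CREATE TABLE users        (userid INTEGER, companyid INTEGER);
    CREATE TABLE jobs         (jobid INTEGER, userid INTEGER);
    CREATE TABLE useraccounts (userid INTEGER, account TEXT);
    INSERT INTO companies    VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO users        VALUES (10, 1), (20, 2);
    INSERT INTO jobs         VALUES (123, 10), (456, 20);
    INSERT INTO useraccounts VALUES (10, 'a'), (20, 'b');
""")

using_rows = conn.execute("""
    SELECT c.* FROM companies AS c
    JOIN users AS u USING(companyid)
    JOIN jobs AS j USING(userid)
    JOIN useraccounts AS us USING(userid)
    WHERE j.jobid = 123
""").fetchall()

on_rows = conn.execute("""
    SELECT c.* FROM companies AS c
    JOIN users AS u ON u.companyid = c.companyid
    JOIN jobs AS j ON j.userid = u.userid
    JOIN useraccounts AS us ON us.userid = u.userid
    WHERE j.jobid = 123
""").fetchall()

def column_names(sql):
    # Names of the result columns, taken from the cursor metadata.
    return [d[0] for d in conn.execute(sql).description]

# One visible difference: with USING, SELECT * lists the shared column once.
using_cols = column_names("SELECT * FROM companies JOIN users USING(companyid)")
on_cols = column_names(
    "SELECT * FROM companies c JOIN users u ON u.companyid = c.companyid")
print(using_rows, on_rows, using_cols, on_cols)
```

Both forms return the same company here; the only difference this sketch surfaces is how many times the join column shows up under SELECT *.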
I can't answer the bit about the USING syntax. That's weird. I've never seen it before, having always used an ON clause instead.
But what I can tell you is that the order of JOIN operations is determined dynamically by the query optimizer when it constructs its query plan, based on a system of optimization heuristics, some of which are:
Is the JOIN performed on a primary key field? If so, this gets high priority in the query plan.
Is the JOIN performed on a foreign key field? This also gets high priority.
Does an index exist on the joined field? If so, bump the priority.
Is a JOIN operation performed on a field in WHERE clause? Can the WHERE clause expression be evaluated by examining the index (rather than by performing a table scan)? This is a major optimization opportunity, so it gets a major priority bump.
What is the cardinality of the joined column? Columns with high cardinality give the optimizer more opportunities to discriminate against false matches (those that don't satisfy the WHERE clause or the ON clause), so high-cardinality joins are usually processed before low-cardinality joins.
How many actual rows are in the joined table? Joining against a table with only 100 values is going to create less of a data explosion than joining against a table with ten million rows.
Anyhow... the point is... there are a LOT of variables that go into the query execution plan. If you want to see how MySQL optimizes its queries, use the EXPLAIN syntax.
And here's a good article to read:
http://www.informit.com/articles/article.aspx?p=377652
ON EDIT:
To answer your 4th question: You aren't querying the "companies" table. You're querying the joined cross-product of ALL four tables in your FROM and USING clauses.
The "j.jobid" alias is just the fully-qualified name of one of the columns in that joined collection of tables.
In MySQL, it's often interesting to ask the query optimizer what it plans to do, with:
EXPLAIN SELECT [...]
See "7.2.1 Optimizing Queries with EXPLAIN"
Here is a more detailed answer on JOIN precedence. In your case, the JOINs are all commutative. Let's try one where they aren't.
Build schema:
CREATE TABLE users (
name text
);
CREATE TABLE orders (
order_id text,
user_name text
);
CREATE TABLE shipments (
order_id text,
fulfiller text
);
Add data:
INSERT INTO users VALUES ('Bob'), ('Mary');
INSERT INTO orders VALUES ('order1', 'Bob');
INSERT INTO shipments VALUES ('order1', 'Fulfilling Mary');
Run query:
SELECT *
FROM users
LEFT OUTER JOIN orders
ON orders.user_name = users.name
JOIN shipments
ON shipments.order_id = orders.order_id
Result:
Only the Bob row is returned
Analysis:
In this query the LEFT OUTER JOIN was evaluated first and the JOIN was evaluated on the composite result of the LEFT OUTER JOIN.
Second query:
SELECT *
FROM users
LEFT OUTER JOIN (
orders
JOIN shipments
ON shipments.order_id = orders.order_id)
ON orders.user_name = users.name
Result:
One row for Bob (with the fulfillment data) and one row for Mary with NULLs for fulfillment data.
Analysis:
The parentheses changed the evaluation order.
Further MySQL documentation is at https://dev.mysql.com/doc/refman/5.5/en/nested-join-optimization.html
SEE http://dev.mysql.com/doc/refman/5.0/en/join.html
AND start reading here:
Join Processing Changes in MySQL 5.0.12
Beginning with MySQL 5.0.12, natural joins and joins with USING, including outer join variants, are processed according to the SQL:2003 standard. The goal was to align the syntax and semantics of MySQL with respect to NATURAL JOIN and JOIN ... USING according to SQL:2003. However, these changes in join processing can result in different output columns for some joins. Also, some queries that appeared to work correctly in older versions must be rewritten to comply with the standard.
These changes have five main aspects:
The way that MySQL determines the result columns of NATURAL or USING join operations (and thus the result of the entire FROM clause).
Expansion of SELECT * and SELECT tbl_name.* into a list of selected columns.
Resolution of column names in NATURAL or USING joins.
Transformation of NATURAL or USING joins into JOIN ... ON.
Resolution of column names in the ON condition of a JOIN ... ON.
I'm not sure about the ON vs USING part (though this website says they are the same)
As for the ordering question, it's entirely implementation (and probably query) specific. MySQL most likely picks an order when compiling the request. If you do want to enforce a particular order you would have to 'nest' your queries:
SELECT c.*
FROM companies AS c
JOIN (SELECT u.companyid
FROM users AS u
JOIN (SELECT j.userid
FROM jobs AS j
JOIN useraccounts AS us USING(userid)
WHERE j.jobid = 123) AS ju USING(userid)
) AS uj USING(companyid)
As for part 4: the WHERE clause limits which rows from the jobs table are eligible to be joined. Rows that would join on matching userids but do not have the correct jobid are omitted.
1) Using is not exactly the same as on, but it is short hand where both tables have a column with the same name you are joining on... see: http://www.java2s.com/Tutorial/MySQL/0100__Table-Join/ThekeywordUSINGcanbeusedasareplacementfortheONkeywordduringthetableJoins.htm
It is more difficult to read in my opinion, so I'd spell out the joins with ON.
3) It is not clear from this query, but I would guess it does not.
2) Assuming you are joining through the other tables (not all directly on companies), the order in this query does matter... see the comparisons below:
Original:
SELECT c.*
FROM companies AS c
JOIN users AS u USING(companyid)
JOIN jobs AS j USING(userid)
JOIN useraccounts AS us USING(userid)
WHERE j.jobid = 123
What I think it is likely suggesting:
SELECT c.*
FROM companies AS c
JOIN users AS u on u.companyid = c.companyid
JOIN jobs AS j on j.userid = u.userid
JOIN useraccounts AS us on us.userid = u.userid
WHERE j.jobid = 123
You could switch the lines joining jobs and useraccounts here.
What it would look like if everything joined on company:
SELECT c.*
FROM companies AS c
JOIN users AS u on u.companyid = c.companyid
JOIN jobs AS j on j.userid = c.userid
JOIN useraccounts AS us on us.userid = c.userid
WHERE j.jobid = 123
This doesn't really make logical sense... unless each user has their own company.
4.) The magic of SQL is that you can show only certain columns, but all of them are there for sorting and filtering...
If you returned
SELECT c.*, j.jobid....
you could clearly see what it was filtering on, but the database server doesn't care whether a column is output or not when filtering.