I have been trying to figure this out for a while -- I'm working on a simulation (run in PHP because I hate myself). I've gotten the thing up to the point where I can start adding in "Viruses" and such.
Right now, I'm working on an 'virus' that limits fertility of citizens -- I've got it all working perfectly save for the actual method that finds out if they are 'infertile' or not.
I'm trying to pull data from the database using a query similar to this (note this query semi-working, but I can't figure out how to make it work properly):
SELECT g.infert1 as gert,
v.infert1 as vert
FROM genetics as g,
virus as v,
citizens as c
WHERE c.cid = 1
AND g.cid = c.cid
AND v.vid = c.infected
The GERT and VERT data associate to two tables (Genetics and Virus). The virus table contains most of the same rows as the genetics table (Imm00-12, infert1, etc.). Both sets of data select correctly if a 'citizen' is infected, however if they are not, then the result returns null and causes an error.
I'm trying to figure out if there is a way to get conditional data where if a citizen is not infected, it'll still select the GERT information and just return null for VERT as opposed returning nothing at all.
Please try the following...
SELECT genetics.infert1 AS gert,
virus.infert1 AS vert
FROM citizens
JOIN genetics ON citizens.cid = genetics.cid
LEFT JOIN virus ON virus.vid = citizens.infected
WHERE citizens.cid = 1;
This statement starts by performing an INNER JOIN between citizens and genetics based on their common value of cid. This effectively prepends the fields of citizen to those of genetics for each citizen.
It then performs a LEFT JOIN between this joined dataset and virus where virus.vid = citizens.infected. This has the effect of appending the values of the fields of a virus that a citizen is afflicted with along with the fields for that citizen to each record from genetics for that citizen. If the citizen is not afflicted with a virus, then NULL values are used for the virus fields.
The resulting dataset is then refined to just those records where the value of citizens.cid is 1.
If you wish to produce the same list but for all citizens, then remove the WHERE clause.
Using INNER JOIN and LEFT JOIN and is a more modern and efficient way of joining than using CROSS JOIN's refined by the WHERE clause. A CROSS JOIN between two tables, performed using tblTable1, tblTable2 appends a copy of every record from tblTable2 to every record from tblTable. The resulting (often very large) dataset is then refined by the WHERE clause. With INNER JOIN and LEFT JOIN only those record that meet the ON criteria are joined, which is typically significantly faster.
If you have any questions or comments, then please feel free to post a Comment accordingly.
Something like this is probably what you want.
SELECT g.infert1 as gert,
cv.vert as vert
FROM
genetics as g
LEFT JOIN
( SELECT c.cid as cid, v.infert1 as vert
FROM virus as v
INNER JOIN citizens as c ON v.vid = c.infected
WHERE c.cid = 1
) AS cv
ON g.cid = cv.cid
Here I use SQL JOIN syntax:
Inner Join means give me rows where they exactly match. Which is what your orignal SQL does, but we want that only on virus vs citizens
LEFT JOIN means "give me rows from the LEFT hand table, and if there are any in the other, then include them too" - which is what I think you want.
You may wish to re-evaluate whether citizens can only have one virus though.
Related
SELECT team_with.participant1,team_with.participant2,team_with.participant3
FROM event,team_with
WHERE team_with.for_event_no=event.event_no AND
event.event_no=4 AND
team_with.participant1=9 OR
team_with.participant2=9 OR
team_with.participant3=9;
I have written the particular query, and obtained the required id's in a row. I am not able to modify this query such that, in place of these id's, names connected to the id's are displayed.
The student_detatil table consists of PK(sam_id) and the attribute name.
IDs displayed by the present query are FKs connected to student_detail.sam_id..
It seems like a bad design to multiply columns storing different participants. Consider creating a separate row for each participant and storing them in a table. Your joining logic would also be easier.
Also, please use explicit JOIN syntax - it makes the query clearer and easier to understand by separating join logic with conditions for data retrieval.
Remember that operator AND has a precedence over OR, so that your event.event_no = 4 does not apply to each participant condition. I believe this was a mistake, but you are the one to judge.
As to the query itself, you could apply OR conditions into join, or simply join the student_detail table thrice.
SELECT
s1.name,
s2.name,
s3.name
FROM
event e
INNER JOIN team_with t ON t.for_event_no = e.event_no
LEFT JOIN student_detail s1 ON s1.sam_id = t.participant1
LEFT JOIN student_detail s2 ON s2.sam_id = t.participant2
LEFT JOIN student_detail s3 ON s3.sam_id = t.participant3
WHERE
e.event_no = 4
AND ( t.participant1=9 OR t.participant2=9 OR t.participant3=9 );
I'm working through the JOIN tutorial on SQL zoo.
Let's say I'm about to execute the code below:
SELECT a.stadium, COUNT(g.matchid)
FROM game a
JOIN goal g
ON g.matchid = a.id
GROUP BY a.stadium
As it happens, it produces the same output as the code below:
SELECT a.stadium, COUNT(g.matchid)
FROM goal g
JOIN game a
ON g.matchid = a.id
GROUP BY a.stadium
So then, when does it matter which table you assign at FROM and which one you assign at JOIN?
When you are using an INNER JOIN like you are here, the order doesn't matter. That is because you are connecting two tables on a common index, so the order in which you use them is up to you. You should pick an order that is most logical to you, and easiest to read. A habit of mine is to put the table I'm selecting from first. In your case, you're selecting information about a stadium, which comes from the game table, so my preference would be to put that first.
In other joins, however, such as LEFT OUTER JOIN and RIGHT OUTER JOIN the order will matter. That is because these joins will select all rows from one table. Consider for example I have a table for Students and a table for Projects. They can exist independently, some students may have an associated project, but not all will.
If I want to get all students and project information while still seeing students without projects, I need a LEFT JOIN:
SELECT s.name, p.project
FROM student s
LEFT JOIN project p ON p.student_id = s.id;
Note here, that the LEFT JOIN refers to the table in the FROM clause, so that means ALL of students were being selected. This also means that p.project will be null for some rows. Order matters here.
If I took the same concept with a RIGHT JOIN, it will select all rows from the table in the join clause. So if I changed the query to this:
SELECT s.name, p.project
FROM student s
RIGHT JOIN project p ON p.student_id = s.id;
This will return all rows from the project table, regardless of whether or not it has a match for students. This means that in some rows, s.name will be null. Similar to the first example, because I've made project the outer joined table, p.project will never be null (assuming it isn't in the original table). In the first example, s.name should never be null.
In the case of outer joins, order will matter. Thankfully, you can think intuitively with LEFT and RIGHT joins. A left join will return all rows in the table to the left of that statement, while a right join returns all rows from the right of that statement. Take this as a rule of thumb, but be careful. You might want to develop a pattern to be consistent with yourself, as I mentioned earlier, so these queries are easier for you to understand later on.
When you only JOIN 2 tables, usually the order does not matter: MySQL scans the tables in the optimal order.
When you scan more than 2 tables, the order could matter:
SELECT ...
FROM a
JOIN b ON ...
JOIN c ON ...
Also, MySQL tries to scan the tables in the fastest way (large tables first). But if a join is slow, it is possible that MySQL is scanning them in a non-optimal order. You can verify this with EXPLAIN. In this case, you can force the join order by adding the STRAIGHT_JOIN keyword.
The order doesn't always matter, I usually just order it in a way that makes sense to someone reading your query.
Sometime order does matter. Try it with LEFT JOIN and RIGHT JOIN.
In this instance you are using an INNER JOIN, if you're expecting a match on a common ID or foreign key, it probably doesn't matter too much.
You would however need to specify the tables the correct way round if you were performing an OUTER JOIN, as not all records in this type of join are guaranteed to match via the same field.
yes, it will matter when you will user another join LEFT JOIN, RIGHT JOIN
currently You are using NATURAL JOIN that is return all tables related data, if JOIN table row not match then it will exclude row from result
If you use LEFT / RIGHT {OUTER} join then result will be different, follow this link for more detail
I have 4 tables which need to be joined. They are:
Contractors
Crews
Skill_type
Location
I also need to be able to count the number of contractors in each crew.
I have created some SQL (MySql) which does the job nicely:
Select contractors.crew_id as contractors_crew_id,
auburntree.crews.crew_name, count(*) as members, skill_type.skill,
location.location_name
FROM contractors
JOIN auburntree.crews on contractors.crew_id=crews.id
JOIN skill_type on skill_type.skill_id = crews.skill_id
JOIN location on location.id = crews.location_id
GROUP BY contractors.crew_id ;
The contractors table contains a foreign key reference to which crew they are assigned to. Hence the column "contractors_crew_id"
I get a nice result, as you can see in this screen capture image:
https://drive.google.com/file/d/0B5elaUk7GlRoS0JWblE3ZzFXY0E/view?usp=sharing
Problem Defined:
I need to define an empty crew first before I add contractors/members. I will give it a name, a skill and a location. I might define an empty crew several days in advance. Currently an empty crew does not show up in my results.
I want to see:
contractors_crew_id = Null, crew_name, skill_type, Location, members 0
I have tried using RIGHT OUTER JOIN on my tables and I still do not see empty crew. LEFT OUTER JOIN does not work either.
contractors_crew_id will be Null at that point, as no contractors have been assigned yet.
I Hope this makes, sense - sorry it is a bit long and complicated. I have tried really hard to explain in.
Many Thanks !!
Because you are actually focussing on information about crews, and not contractors, I would suggest to start by querying the crews table (in the FROM part of your query) and then join the other tables.
Using a LEFT JOIN on the contractors table should then give you the desired result. Since crews is now the "left" table in the query, the result will include all rows from crews, even when there is no corresponding contractor. This answer explains it very well.
This query works for me on test tables based on your examples:
SELECT
contractors.crew,
crews.crew,
COUNT(contractors.id) as members,
skills.skill,
locations.location
FROM crews
JOIN skills ON crews.skill = skills.id
JOIN locations ON crews.location = locations.id
LEFT JOIN contractors ON contractors.crew = crews.id
GROUP BY contractors.crew, crews.id;
To see for yourself, try this SQL Fiddle.
Try this, instead:
Select contractors.crew_id as contractors_crew_id, crews.crew_name, count(*) as members, skill_type.skill, location.location_name
FROM crews
LEFT OUTER JOIN contractors on contractors.crew_id=crews.id
LEFT OUTER JOIN skill_type on skill_type.skill_id = crews.skill_id
LEFT OUTER JOIN location on location.id = crews.location_id
GROUP BY contractors.crew_id ;
That should pull all crews, regardless of whether or not they have contractors assigned. The way you wrote your original query starts by looking at your contractors and then pulling in any crews they might be assigned to. Crews is clearly your focal table here, so you should be joining the rest of your tables to it, rather than to the contractors table.
I need to perform a query SELECT that joins three tables (no problem with that). Nonetheless, the third table can, or NOT, have any element that match the joining KEY.
I want ALL data from the first two tables and if the ITEMS have ALSO information in the third table, fetch this data to.
For example, imagine that the first table have a person, the second table have his/her address (everyone lives anywhere), the third table stores the driving license (not everyone has this) - but I need to fetch all data whether or not people (all people) have driving license.
Thanks a lot for reading, if possible to give you suggestion / solution!
Use LEFT JOIN to join the third table. Using INNER JOIN a row has to exists. Using LEFT JOIN, the 'gaps' will be filled with NULLs.
SELECT
p.PersonID, -- NOT NULL
-- dl.PersonID, -- Can be null. Don't use this one.
p.FirstName,
p.LastName,
a.City,
a.Street,
dl.ValidUntilDate
FROM
Person p
INNER JOIN Addresse a ON a.AddressID = p.HomeAddressID
LEFT JOIN DrivingLicence dl ON dl.PersonId = p.PersonID
I have the following query:
SELECT c.*
FROM companies AS c
JOIN users AS u USING(companyid)
JOIN jobs AS j USING(userid)
JOIN useraccounts AS us USING(userid)
WHERE j.jobid = 123;
I have the following questions:
Is the USING syntax synonymous with ON syntax?
Are these joins evaluated left to right? In other words, does this query say: x = companies JOIN users; y = x JOIN jobs; z = y JOIN useraccounts;
If the answer to question 2 is yes, is it safe to assume that the companies table has companyid, userid and jobid columns?
I don't understand how the WHERE clause can be used to pick rows on the companies table when it is referring to the alias "j"
Any help would be appreciated!
USING (fieldname) is a shorthand way of saying ON table1.fieldname = table2.fieldname.
SQL doesn't define the 'order' in which JOINS are done because it is not the nature of the language. Obviously an order has to be specified in the statement, but an INNER JOIN can be considered commutative: you can list them in any order and you will get the same results.
That said, when constructing a SELECT ... JOIN, particularly one that includes LEFT JOINs, I've found it makes sense to regard the third JOIN as joining the new table to the results of the first JOIN, the fourth JOIN as joining the results of the second JOIN, and so on.
More rarely, the specified order can influence the behaviour of the query optimizer, due to the way it influences the heuristics.
No. The way the query is assembled, it requires that companies and users both have a companyid, jobs has a userid and a jobid and useraccounts has a userid. However, only one of companies or user needs a userid for the JOIN to work.
The WHERE clause is filtering the whole result -- i.e. all JOINed columns -- using a column provided by the jobs table.
I can't answer the bit about the USING syntax. That's weird. I've never seen it before, having always used an ON clause instead.
But what I can tell you is that the order of JOIN operations is determined dynamically by the query optimizer when it constructs its query plan, based on a system of optimization heuristics, some of which are:
Is the JOIN performed on a primary key field? If so, this gets high priority in the query plan.
Is the JOIN performed on a foreign key field? This also gets high priority.
Does an index exist on the joined field? If so, bump the priority.
Is a JOIN operation performed on a field in WHERE clause? Can the WHERE clause expression be evaluated by examining the index (rather than by performing a table scan)? This is a major optimization opportunity, so it gets a major priority bump.
What is the cardinality of the joined column? Columns with high cardinality give the optimizer more opportunities to discriminate against false matches (those that don't satisfy the WHERE clause or the ON clause), so high-cardinality joins are usually processed before low-cardinality joins.
How many actual rows are in the joined table? Joining against a table with only 100 values is going to create less of a data explosion than joining against a table with ten million rows.
Anyhow... the point is... there are a LOT of variables that go into the query execution plan. If you want to see how MySQL optimizes its queries, use the EXPLAIN syntax.
And here's a good article to read:
http://www.informit.com/articles/article.aspx?p=377652
ON EDIT:
To answer your 4th question: You aren't querying the "companies" table. You're querying the joined cross-product of ALL four tables in your FROM and USING clauses.
The "j.jobid" alias is just the fully-qualified name of one of the columns in that joined collection of tables.
In MySQL, it's often interesting to ask the query optimizer what it plans to do, with:
EXPLAIN SELECT [...]
See "7.2.1 Optimizing Queries with EXPLAIN"
Here is a more detailed answer on JOIN precedence. In your case, the JOINs are all commutative. Let's try one where they aren't.
Build schema:
CREATE TABLE users (
name text
);
CREATE TABLE orders (
order_id text,
user_name text
);
CREATE TABLE shipments (
order_id text,
fulfiller text
);
Add data:
INSERT INTO users VALUES ('Bob'), ('Mary');
INSERT INTO orders VALUES ('order1', 'Bob');
INSERT INTO shipments VALUES ('order1', 'Fulfilling Mary');
Run query:
SELECT *
FROM users
LEFT OUTER JOIN orders
ON orders.user_name = users.name
JOIN shipments
ON shipments.order_id = orders.order_id
Result:
Only the Bob row is returned
Analysis:
In this query the LEFT OUTER JOIN was evaluated first and the JOIN was evaluated on the composite result of the LEFT OUTER JOIN.
Second query:
SELECT *
FROM users
LEFT OUTER JOIN (
orders
JOIN shipments
ON shipments.order_id = orders.order_id)
ON orders.user_name = users.name
Result:
One row for Bob (with the fulfillment data) and one row for Mary with NULLs for fulfillment data.
Analysis:
The parenthesis changed the evaluation order.
Further MySQL documentation is at https://dev.mysql.com/doc/refman/5.5/en/nested-join-optimization.html
SEE http://dev.mysql.com/doc/refman/5.0/en/join.html
AND start reading here:
Join Processing Changes in MySQL 5.0.12
Beginning with MySQL 5.0.12, natural joins and joins with USING, including outer join variants, are processed according to the SQL:2003 standard. The goal was to align the syntax and semantics of MySQL with respect to NATURAL JOIN and JOIN ... USING according to SQL:2003. However, these changes in join processing can result in different output columns for some joins. Also, some queries that appeared to work correctly in older versions must be rewritten to comply with the standard.
These changes have five main aspects:
The way that MySQL determines the result columns of NATURAL or USING join operations (and thus the result of the entire FROM clause).
Expansion of SELECT * and SELECT tbl_name.* into a list of selected columns.
Resolution of column names in NATURAL or USING joins.
Transformation of NATURAL or USING joins into JOIN ... ON.
Resolution of column names in the ON condition of a JOIN ... ON.
Im not sure about the ON vs USING part (though this website says they are the same)
As for the ordering question, its entirely implementation (and probably query) specific. MYSQL most likely picks an order when compiling the request. If you do want to enforce a particular order you would have to 'nest' your queries:
SELECT c.*
FROM companies AS c
JOIN (SELECT * FROM users AS u
JOIN (SELECT * FROM jobs AS j USING(userid)
JOIN useraccounts AS us USING(userid)
WHERE j.jobid = 123)
)
as for part 4: the where clause limits what rows from the jobs table are eligible to be JOINed on. So if there are rows which would join due to the matching userids but don't have the correct jobid then they will be omitted.
1) Using is not exactly the same as on, but it is short hand where both tables have a column with the same name you are joining on... see: http://www.java2s.com/Tutorial/MySQL/0100__Table-Join/ThekeywordUSINGcanbeusedasareplacementfortheONkeywordduringthetableJoins.htm
It is more difficult to read in my opinion, so I'd go spelling out the joins.
3) It is not clear from this query, but I would guess it does not.
2) Assuming you are joining through the other tables (not all directly on companyies) the order in this query does matter... see comparisons below:
Origional:
SELECT c.*
FROM companies AS c
JOIN users AS u USING(companyid)
JOIN jobs AS j USING(userid)
JOIN useraccounts AS us USING(userid)
WHERE j.jobid = 123
What I think it is likely suggesting:
SELECT c.*
FROM companies AS c
JOIN users AS u on u.companyid = c.companyid
JOIN jobs AS j on j.userid = u.userid
JOIN useraccounts AS us on us.userid = u.userid
WHERE j.jobid = 123
You could switch you lines joining jobs & usersaccounts here.
What it would look like if everything joined on company:
SELECT c.*
FROM companies AS c
JOIN users AS u on u.companyid = c.companyid
JOIN jobs AS j on j.userid = c.userid
JOIN useraccounts AS us on us.userid = c.userid
WHERE j.jobid = 123
This doesn't really make logical sense... unless each user has their own company.
4.) The magic of sql is that you can only show certain columns but all of them are their for sorting and filtering...
if you returned
SELECT c.*, j.jobid....
you could clearly see what it was filtering on, but the database server doesn't care if you output a row or not for filtering.