Is there a reason MySQL doesn't support FULL OUTER JOINS?

Is there a reason MySQL doesn't support FULL OUTER JOINs? I've tried the FULL OUTER JOIN syntax in MySQL many times and it never worked. I just found out that MySQL doesn't support it, so I'm curious as to why.

MySQL lacks a lot of functionality that other databases have*. I think they have a pretty huge backlog of ideas and not enough developers to implement them all.
This feature was requested in 2006 and is still not implemented. I guess it has low priority because you can work around it by combining LEFT and RIGHT OUTER JOIN with a UNION ALL. Not pleasant, but it does the trick. Change this:
SELECT *
FROM table1
FULL OUTER JOIN table2
ON table1.table2_id = table2.id
to this:
SELECT *
FROM table1
LEFT JOIN table2
ON table1.table2_id = table2.id
UNION ALL
SELECT *
FROM table1
RIGHT JOIN table2
ON table1.table2_id = table2.id
WHERE table1.table2_id IS NULL
* To be fair to MySQL, they also have some features that many other databases don't have.
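As a minimal sketch of the workaround above, the following Python script runs the UNION ALL emulation against an in-memory SQLite database using the standard-library sqlite3 module. The sample rows and the `label` column are invented for illustration. The RIGHT JOIN half is written as a LEFT JOIN with the tables swapped, because older SQLite builds do not accept RIGHT JOIN; the anti-join filter also tests `table1.id` rather than `table1.table2_id`, which is an equivalent choice here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, table2_id INTEGER);
    -- Hypothetical sample data: one matching pair, one orphan on each side.
    INSERT INTO table2 VALUES (10, 'matched'), (20, 'only in table2');
    INSERT INTO table1 VALUES (1, 10), (2, NULL);
""")

rows = conn.execute("""
    -- First half: every table1 row, with table2 columns NULL when unmatched.
    SELECT table1.id, table1.table2_id, table2.id, table2.label
    FROM table1
    LEFT JOIN table2 ON table1.table2_id = table2.id
    UNION ALL
    -- Second half: only the table2 rows that no table1 row points at.
    SELECT table1.id, table1.table2_id, table2.id, table2.label
    FROM table2
    LEFT JOIN table1 ON table1.table2_id = table2.id
    WHERE table1.id IS NULL
""").fetchall()

for row in rows:
    print(row)
```

The second SELECT keeps only unmatched rows, so UNION ALL does not emit the matched pair twice.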

I don't believe the MySQL devs have ever stated any technical reason why it might be difficult to implement.
But MySQL, like most DBMSs, has many places where it does not fully implement the ANSI standard. Since FULL OUTER JOIN is a rarely-used feature, and can typically be replaced by a UNION workaround, there is little pressure to get it fixed.
I suggest adding your voice to bug 18003.

Because it was never implemented by MySQL developers.
Why?
Because there was not enough pressure from customers.

In a large system, I might use a FULL OUTER JOIN once or twice, so there isn't a huge demand for it. You can also work around it fairly easily, and arguably more explicitly and readably (if you are deriving columns based on the left/right results), with a UNION of LEFT and RIGHT JOINs.

As a product matures, more features get added with each release. Just chalk this up to not being implemented yet. I'm sure it will eventually be there. It doesn't mean that MySQL is bad or anything; every database has extra as well as missing features. I wish SQL Server had the group concatenation feature that MySQL has!

From High Performance MySQL:
At the moment, MySQL’s join execution strategy is simple: it treats every join as a nested-loop join. A FULL OUTER JOIN can’t be executed with nested loops and backtracking as soon as a table with no matching rows is found, because it might begin with a table that has no matching rows. This explains why MySQL doesn’t support FULL OUTER JOIN.
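To make the quoted argument concrete, here is a toy Python sketch of a nested-loop join. It is an illustration only, not MySQL's actual executor; the function names and the sample lists are hypothetical. A left join only needs one flag per outer row, while a full outer join also has to remember which inner rows were ever matched and emit the leftovers in a separate pass.

```python
def nested_loop_left_join(outer, inner, match):
    """For each outer row, scan inner; emit a NULL-extended row if nothing matched."""
    result = []
    for o in outer:
        matched = False
        for i in inner:
            if match(o, i):
                result.append((o, i))
                matched = True
        if not matched:
            result.append((o, None))
    return result


def nested_loop_full_outer_join(outer, inner, match):
    """Same loop, plus bookkeeping for inner rows that no outer row matched."""
    result = []
    seen_inner = set()
    for o in outer:
        matched = False
        for idx, i in enumerate(inner):
            if match(o, i):
                result.append((o, i))
                seen_inner.add(idx)
                matched = True
        if not matched:
            result.append((o, None))
    # Extra pass that a plain nested-loop strategy has no place for.
    result.extend((None, i) for idx, i in enumerate(inner) if idx not in seen_inner)
    return result


left_rows = [1, 2]
right_rows = [2, 3]
same = lambda a, b: a == b

left_result = nested_loop_left_join(left_rows, right_rows, same)
full_result = nested_loop_full_outer_join(left_rows, right_rows, same)
print(left_result)
print(full_result)
```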

Related

Is this SQL statement making a join? [duplicate]

I develop against Oracle databases. When I need to write SQL by hand (rather than use an ORM like Hibernate), I use a WHERE condition instead of a JOIN.
For example (this is simplistic, just to illustrate the style):
Select *
from customers c, invoices i, shipment_info si
where c.customer_id = i.customer_id
and i.amount > 999.99
and i.invoice_id = si.invoice_id(+) -- added to show a replacement for a join
order by i.amount, c.name
I learned this style from an OLD oracle DBA. I have since learned that this is not standard SQL syntax. Other than being non-standard and much less database portable, are there any other repercussions to using this format?
I don't like the style because it makes it harder to determine which WHERE clauses are for simulating JOINs and which ones are for actual filters, and I don't like code that makes it unnecessarily difficult to determine the original intent of the programmer.
The biggest issue that I have run into with this format is the tendency to forget some join's WHERE clause, thereby resulting in a cartesian product. This is particularly common (for me, at least) when adding a new table to the query. For example, suppose an ADDRESSES table is thrown into the mix and your mind is a bit forgetful:
SELECT *
FROM customers c, invoices i, addresses a
WHERE c.customer_id = i.customer_id
AND i.amount > 999.99
ORDER BY i.amount, c.name
Boom! Cartesian product! :)
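A small, hedged reproduction of this mistake, using Python's standard-library sqlite3 module: the table contents and the `city` column are invented for illustration. The comma-style query that forgets the ADDRESSES condition pairs every matching invoice with every address, while the explicit JOIN version places each condition next to the table it belongs to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoices (invoice_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    CREATE TABLE addresses (address_id INTEGER PRIMARY KEY, customer_id INTEGER, city TEXT);
    -- Hypothetical sample data: one invoice and one address per customer.
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO invoices VALUES (100, 1, 1500.0), (101, 2, 2000.0);
    INSERT INTO addresses VALUES (1, 1, 'Oslo'), (2, 2, 'Rome');
""")

# Comma-style join with the addresses condition forgotten.
forgotten = conn.execute("""
    SELECT c.name, i.amount, a.city
    FROM customers c, invoices i, addresses a
    WHERE c.customer_id = i.customer_id
      AND i.amount > 999.99
""").fetchall()

# Explicit joins: each table carries its own join condition.
explicit = conn.execute("""
    SELECT c.name, i.amount, a.city
    FROM customers c
    JOIN invoices i ON i.customer_id = c.customer_id
    JOIN addresses a ON a.customer_id = c.customer_id
    WHERE i.amount > 999.99
""").fetchall()

print(len(forgotten), "rows with the forgotten condition")
print(len(explicit), "rows with explicit joins")
```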
The old-style join is flat-out wrong in some cases (outer joins are the culprit). Although the two styles are more or less equivalent for inner joins, the old style can generate incorrect results with outer joins, especially if columns on the outer side can be null. This is because, with the older syntax, the join conditions are not logically evaluated until the entire result set has been constructed. It is simply not possible to express a condition on a column from the outer side of a join that will filter records when the column can be null because there is no matching record.
As an example:
Select all Customers, and the sum of the sales of Widgets on all their Invoices in the month of August, where the Invoice has been processed (Invoice.ProcessDate is not null), using the new ANSI-92 join syntax:
Select c.name, Sum(d.Amount)
From customer c
Left Join Invoice i
On i.custId = c.custId
And i.SalesDate Between '8/1/2009'
and '8/31/2009 23:59:59'
And i.ProcessDate Is Not Null
Left Join InvoiceDetails d
On d.InvoiceId = i.InvoiceId
And d.Product = 'widget'
Group By c.Name
Try doing this with the old syntax. With the old-style syntax, all the conditions in the WHERE clause are evaluated and applied BEFORE the 'outer' rows are added back in, so all the unprocessed Invoice rows get added back into the final result set. This is therefore not possible with the old syntax: anything that attempts to filter out the invoices with null process dates will eliminate customers. The only alternative is to use a correlated subquery.
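The difference between filtering inside the join condition and filtering afterwards can be sketched with Python's standard-library sqlite3 module. The schema is a simplified, hypothetical version of the one in the example (no InvoiceDetails table, ISO date strings, invented rows). It does not reproduce the old outer-join syntax itself; it only shows why the position of the filter matters for an outer join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (custId INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice (invoiceId INTEGER PRIMARY KEY, custId INTEGER,
                          processDate TEXT, amount REAL);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    -- Ann has a processed invoice; Bob's only invoice is unprocessed.
    INSERT INTO invoice VALUES (1, 1, '2009-08-05', 100.0),
                               (2, 2, NULL, 50.0);
""")

# Filter in the ON clause: applied while joining, so Bob survives with NULL.
filter_in_on = conn.execute("""
    SELECT c.name, SUM(i.amount)
    FROM customer c
    LEFT JOIN invoice i
      ON i.custId = c.custId AND i.processDate IS NOT NULL
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

# Same filter in WHERE: applied after the join, so Bob's row is discarded.
filter_in_where = conn.execute("""
    SELECT c.name, SUM(i.amount)
    FROM customer c
    LEFT JOIN invoice i ON i.custId = c.custId
    WHERE i.processDate IS NOT NULL
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```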
Some people will say that this style is less readable, but that's a matter of habit. From a performance point of view, it doesn't matter, since the query optimizer takes care of that.
I have since learned that this is not standard SQL syntax.
That's not quite true. The "a, b WHERE" syntax is from the ANSI-89 standard; the "a JOIN b ON" syntax is ANSI-92. However, the 89 syntax is deprecated, which means you should not use it for new queries.
Also, there are some situations where the older style lacks expressive power, especially with regard to outer joins or complex queries.
It can be a pain going through the where clause trying to pick out join conditions. For anything more than one join the old style is absolute evil. And once you know the new style, you may as well just keep using it.
This is a standard SQL syntax, just an older standard than JOIN. There's a reason that the syntax has evolved and you should use the newer JOIN syntax because:
It's more expressive, clearly indicating which tables are JOINed, the JOIN order, which conditions apply to which JOIN, and separating out the filtering WHERE conditions from the JOIN conditions.
It supports LEFT, RIGHT, and FULL OUTER JOINs, which the WHERE syntax does not.
I don't think you'll find the WHERE-type JOIN substantially less portable than the JOIN syntax.
As long as you don't use the ANSI natural join feature I'm OK with it.
I found this quote by ScottCher, and I totally agree:
I find the WHERE syntax easier to read than INNER JOIN - I guess it's like Vegemite. Most people in the world probably find it disgusting but kids brought up eating it love it.
It really depends on habits, but I have always found Oracle's comma-separated syntax more natural. The first reason is that I think using (INNER) JOIN diminishes readability. The second is about flexibility. In the end, a join is a cartesian product by definition. You do not necessarily have to restrict the results based on IDs of both tables. Although it is very seldom needed, one might well need the cartesian product of two tables. Restricting them based on IDs is just a very reasonable practice, but NOT A RULE. However, if you use the JOIN keyword in e.g. SQL Server, it won't let you omit the ON clause. Suppose you want to create a combination list. You have to write it like this:
SELECT *
FROM numbers
JOIN letters
ON 1=1
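For comparison, the ANSI join syntax also has a CROSS JOIN keyword that states the cartesian product directly, without a dummy ON condition. A minimal sketch with Python's standard-library sqlite3 module follows; the column names `n` and `l` and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE numbers (n INTEGER);
    CREATE TABLE letters (l TEXT);
    INSERT INTO numbers VALUES (1), (2);
    INSERT INTO letters VALUES ('a'), ('b'), ('c');
""")

# Every number paired with every letter; no ON clause is needed.
combos = conn.execute("""
    SELECT n, l
    FROM numbers
    CROSS JOIN letters
    ORDER BY n, l
""").fetchall()

print(combos)
```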
Apart from that, I find the (+) syntax of Oracle also very reasonable. It is a nice way to say, "Add this record to the resultset too, even if it is null." It is way better than the RIGHT/LEFT JOIN syntax, because in fact there is no left or right! When you want to join 10 tables with several different types of outer joins, it gets confusing which table is on the "left hand side" and which one on the right.
By the way, as a more general comment, I don't think SQL portability exists in the practical world any more. Standard SQL is so poor, and the expressiveness of DBMS-specific syntax is so often demanded, that I don't think 100% portable SQL code is an achievable goal. The most obvious evidence for my observation is the good old row number problem. Just search any forum for "sql row number", including SO, and you will see hundreds of posts asking how it can be achieved in a specific DBMS. Limiting the number of returned rows is a similar and related example.
This is Transact SQL syntax, and I'm not quite sure how "unportable" it is - it is the main syntax used in Sybase, for example (Sybase supports ANSI syntax as well) as well as many other databases (if not all).
The main benefit of ANSI syntax is that it allows you to write some fairly tricky chained joins that T-SQL prohibits.
Speaking as someone who writes automated sql query transformers (inline view expansions, grafted joins, union factoring) and thinks of SQL as a data structure to manipulate: the non-JOIN syntax is far less pain to manipulate.
I can't speak to "harder to read" complaints; JOIN looks like a lunge toward relational algebra operators. Don't go there :-)
Actually, this syntax is more portable than a JOIN, because it will work with pretty much any database, whereas not everybody supports the JOIN syntax (Oracle Lite doesn't, for example [unless this has changed recently]).

MySQL - Left Joins are Officially Preferred Over Right Joins?

In the MySQL documentation for joins, a coworker pointed out this gem to me today:
RIGHT JOIN works analogously to LEFT JOIN. To keep code portable across databases, it is recommended that you use LEFT JOIN instead of RIGHT JOIN.
Is anyone able to shed some light on this? This strikes me as probably a remnant of a past age - as in maybe the documentation means to say "To keep code reverse compatible with earlier versions of MySQL..."
Is there a modern RDBMS that doesn't support RIGHT JOIN? I get that RIGHT JOIN is syntactic sugar over LEFT JOIN, and any RIGHT JOIN can be expressed as a LEFT JOIN, but there are times when readability suffers if you write a query in that direction.
Is this advice still modern and valid? Is there a compelling reason to avoid RIGHT JOIN?
There's at least one SQL engine that does not support RIGHT JOIN: SQLite. Maybe that's the reason why compatibility was listed as a concern. There may potentially be other SQL engines as well.
So, RIGHT and LEFT JOIN perform the same action in typical SQL engines. LEFT JOIN of table a to table b returns every row from a, whether or not it has a match in b. RIGHT JOIN of table a to table b returns every row from b, whether or not it has a match in a. Prior to optimizing the query, the LEFT and RIGHT keywords only indicate which table the action applies to. The MySQL optimizer will always normalize the query and make the JOIN effectively a LEFT JOIN. Thus, writing your query to use LEFT JOIN instead of RIGHT will cost less in the optimizer.
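The rewrite from RIGHT JOIN to LEFT JOIN is mechanical: swap the two tables. Here is a minimal sketch with Python's standard-library sqlite3 module, using two invented single-column tables. RIGHT JOIN was only added to SQLite in version 3.39.0, so the script runs the RIGHT JOIN form only when the linked library is new enough.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (2), (3);
""")

# "a RIGHT JOIN b" rewritten as "b LEFT JOIN a": every row of b is kept.
as_left_join = conn.execute(
    "SELECT a.id, b.id FROM b LEFT JOIN a ON a.id = b.id ORDER BY b.id"
).fetchall()
print(as_left_join)

# Only attempt the RIGHT JOIN spelling where the SQLite library supports it.
if sqlite3.sqlite_version_info >= (3, 39, 0):
    as_right_join = conn.execute(
        "SELECT a.id, b.id FROM a RIGHT JOIN b ON a.id = b.id ORDER BY b.id"
    ).fetchall()
    assert as_right_join == as_left_join
```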
It's a sensible convention so it should be preferred unless you want to express something distinct or out of the ordinary to anyone who might read it later, (importantly, including yourself).
Your question is valid, though to answer it completely would require either:
1) a counter-example where a RIGHT JOIN doesn't operate, or operates with significant differences
-or-
2) proof that no such case (1) exists
These will be hard to come by, I think. Your suggestion that the advice may have become outdated since it was written is possible. It may also be due to some wrinkle of MySQL supporting multiple database back-ends. Perhaps one of them is RIGHT JOIN intolerant?

Does a nested SELECT clause decrease database performance?

I used to write a SELECT clause inside another SELECT clause to avoid joins in the FROM clause. But I am not sure whether this is good coding practice or whether it will degrade database performance. Below is a query that involves multiple tables, but I have written it using nested SELECT clauses without any JOIN. Please let me know whether I am making a mistake or whether it is OK. At the moment, I am getting accurate results.
SELECT * ,
(select POrderNo from PurchaseOrderMST POM
where POM.POrderID=CET.POrderID)as POrderNo,
(select SiteName from SiteTRS ST where ST.SiteID=CET.SiteID)as SiteName,
(select ParticularName from ParticularMST PM where
PM.ParticularID=CET.ParticularID)as ParticulerName
FROM ClaimExpenseTRS CET
WHERE ClaimID=#ClaimID
I'd use joins for this because it is best practice to do so and will be better for the query optimizer.
But to learn, just try executing the script with and without joins and see what happens to the query plan and the execution time. Usually this answers your questions right away.
Your solution is just fine.
As long as you are only using one column from each "joined" table, and there are no multiple matching rows, it is fine. In some cases, it is even better than joining.
(The DB engine can change the direction of a join at any time, unless you use tricks to force a given direction, which can cause performance surprises. This is called query optimization, but if you really know your database, you should be the one to decide how the query runs.)
I think you should JOIN indeed.
Right now you're creating your own JOIN with WHERE and SELECT statements.
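As a rough sketch of the JOIN rewrite suggested in the answers, the script below compares a nested SELECT with the equivalent LEFT JOIN, using Python's standard-library sqlite3 module. Only one of the three lookup tables is shown. The `ExpenseID` column, the sample rows, and the claim id value are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SiteTRS (SiteID INTEGER PRIMARY KEY, SiteName TEXT);
    CREATE TABLE ClaimExpenseTRS (ExpenseID INTEGER PRIMARY KEY,
                                  ClaimID INTEGER, SiteID INTEGER);
    -- Hypothetical sample data, including one expense with no site.
    INSERT INTO SiteTRS VALUES (1, 'North'), (2, 'South');
    INSERT INTO ClaimExpenseTRS VALUES (10, 7, 1), (11, 7, 2), (12, 7, NULL), (13, 8, 1);
""")
claim_id = 7  # stands in for the claim id parameter of the original query

# Nested SELECT in the select list, as in the question.
with_subselect = conn.execute("""
    SELECT CET.ExpenseID,
           (SELECT ST.SiteName FROM SiteTRS ST WHERE ST.SiteID = CET.SiteID) AS SiteName
    FROM ClaimExpenseTRS CET
    WHERE CET.ClaimID = ?
    ORDER BY CET.ExpenseID
""", (claim_id,)).fetchall()

# LEFT JOIN keeps expenses without a site, matching the nested SELECT.
with_join = conn.execute("""
    SELECT CET.ExpenseID, ST.SiteName
    FROM ClaimExpenseTRS CET
    LEFT JOIN SiteTRS ST ON ST.SiteID = CET.SiteID
    WHERE CET.ClaimID = ?
    ORDER BY CET.ExpenseID
""", (claim_id,)).fetchall()

print(with_subselect)
print(with_join)
```

The LEFT JOIN (rather than an inner JOIN) is what preserves rows whose lookup finds nothing. To compare plans as suggested above, SQLite accepts an `EXPLAIN QUERY PLAN` prefix on either query; other engines have their own equivalent.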

What's better: joins or multiple sub-select statements as part of one query

Performance wise, what is better?
If I have 3 or 4 join statements in my query or use embedded select statements to pull the same information from my database as part of one query?
I would say joins are better because:
They are easier to read.
You have more control over whether you want to do an inner, left/right outer, or full outer join.
Join statements cannot be so easily abused to create query abominations.
With joins it is easier for the query optimizer to create a fast query (if the inner select is simple, it might work out the same, but with more complicated stuff joins will work better).
Embedded selects can only simulate a left/right outer join.
Sometimes you cannot do stuff using joins, in that case (and only then) you'll have to fall back on an inner select.
It rather depends on your database: sizes of tables particularly, but also the memory parameters and sometimes even how the tables are indexed.
On older versions of MySQL, there was a real possibility of a query with a sub-select being considerably slower than a query that would return the same results structured with a join. (In the MySQL 4.1 days, I saw the difference be greater than an order of magnitude.) As a result, I prefer to build queries with joins.
That said, there are some types of queries that are extremely difficult to build with a join and a sub-select is the only way to really do it.
Assuming the database engine does absolutely no optimization, I would say it depends on how consistent you need your data to be. If you're doing multiple SELECT statements on a busy database, where the data you are looking at may change rapidly, you may run into issues where your data does not match up, between queries.
Assuming your data contains no inter-dependencies, then multiple queries will work fine. However, if your data requires consistency, use a single query.
This viewpoint boils down to keeping your data transactionally safe. Consider the situation where you have to pull a total of all accounts receivable, which is kept in a separate table from the monetary transaction amounts. If someone were to add another transaction in between your two queries, the accounts receivable total would not match the sum of the transaction amounts.
Most databases will optimize both queries below into the same plan, so whether you do:
select A.a1, B.b1 from A left outer join B on A.id = B.a_id
or
select A.a1, (select B.b1 from B where B.a_id = A.id) as b1 from A
It ends up being the same. However, in most cases for non-trivial queries you'd better stick with joins whenever you can, especially since some types of joins (such as an inner join) are not possible to achieve using sub-selects.
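The two queries above can be checked side by side with Python's standard-library sqlite3 module. This sketch uses invented rows in which one row of A has no match in B. It compares results only, not execution plans, and adds an INNER JOIN to show the case that a select-list sub-select alone does not reproduce.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, a1 TEXT);
    CREATE TABLE B (a_id INTEGER, b1 TEXT);
    -- Hypothetical sample data: A.id = 2 has no matching row in B.
    INSERT INTO A VALUES (1, 'x'), (2, 'y');
    INSERT INTO B VALUES (1, 'p');
""")

left_join = conn.execute(
    "SELECT A.a1, B.b1 FROM A LEFT OUTER JOIN B ON A.id = B.a_id ORDER BY A.id"
).fetchall()

sub_select = conn.execute(
    "SELECT A.a1, (SELECT B.b1 FROM B WHERE B.a_id = A.id) AS b1 FROM A ORDER BY A.id"
).fetchall()

# The inner join drops the unmatched row of A; the sub-select form keeps it.
inner_join = conn.execute(
    "SELECT A.a1, B.b1 FROM A INNER JOIN B ON A.id = B.a_id ORDER BY A.id"
).fetchall()

print(left_join)
print(sub_select)
print(inner_join)
```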

when to use join or simple 2 table condition? [duplicate]

Possible Duplicate:
SQL left join vs multiple tables on FROM line?
I think this is much clearer:
"SELECT * FROM t1,t2 WHERE t2.foreignID = t1.id "
than a query with a JOIN.
Are there any specs on when to use one or the other?
Thanks
Personally, I prefer the explicitness of
SELECT t1.*, t2.*
FROM t1
JOIN t2 ON t1.id = t2.foreignID
I like to see exactly what my JOIN conditions are, and then I use WHERE to further filter the results (current year, only a certain user, etc). It shouldn't matter in a simple query like this, but will definitely help with longer, more complex queries.
It doesn't make sense to me to have one style for shorter queries and a different for longer ones. Usually the simple ones turn into complex queries soon enough.
This is a cartesian join which is actually very, very slightly different from an inner join. More than 99.99% of the time, you will get identical results. But there are edge cases that will run much slower, or possibly give extra rows. But you are more likely to run into parsing problems when you add left joins. The MySQL documentation gives a little bit of explanation, but not much. The differences lie in how the query optimiser chooses how to scan the tables.
This format also gets messy when you start adding other conditions in your WHERE clause.