When not to use NATURAL JOIN - sql-server-2008

I was debating natural joins with my TL. He said natural joins should not be used. In what cases should we avoid a natural join or equi-join and go with an inner join instead?

Please rephrase your question along the lines of "What are the implications of a NATURAL JOIN versus an INNER JOIN?" or "What are the limitations of these joins in SQL Server 2008?"
SQL Server does not support NATURAL JOIN, so there is also a portability aspect to its usage. Without getting too specific, a NATURAL JOIN is essentially like an INNER JOIN except that it
A) returns each commonly named column only once (think INTERSECT/UNION, except that the tables can have differing columns)
B) implicitly adds an EQUI JOIN on all of the commonly named columns.
Illustration: Note, the setup and the INNER JOIN were written for SQL Server 2012; the NATURAL JOIN is standard SQL and will not run there.
DECLARE @TableA TABLE (Col1 VARCHAR(10)
, Col2 VARCHAR(10) );
DECLARE @TableB TABLE (Col1 VARCHAR(10)
, Col2 VARCHAR(10)
, Col3 VARCHAR(10) );
INSERT INTO @TableA (Col1, Col2)
VALUES ('C', 'D');
INSERT INTO @TableB (Col1, Col2, Col3)
VALUES ('C', 'D', 'E');
-- standard SQL only; SQL Server rejects NATURAL JOIN
SELECT *
FROM @TableA
NATURAL JOIN (SELECT Col1, Col2, Col3
FROM @TableB) AS B
-- returns
Col1 | Col2 | Col3
'C' 'D' 'E'
SELECT *
FROM @TableA AS A
INNER JOIN (SELECT Col1, Col2, Col3
FROM @TableB) AS B ON A.Col1 = B.Col1
AND A.Col2 = B.Col2
-- returns A = @TableA, B = @TableB
A.Col1 | A.Col2 | B.Col1 | B.Col2 | B.Col3
'C' 'D' 'C' 'D' 'E'
Do you see the difference? Rather significant, no? With the INNER JOIN you can still compare the two tables' results, but a NATURAL JOIN is like an INTERSECT, only it merges the matching columns together. You lose the relation in your result set.
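Since SQL Server will not run a NATURAL JOIN, here is a minimal sketch of the same comparison using Python's built-in sqlite3 module, assuming SQLite as a stand-in engine (it does accept NATURAL JOIN). The table names mirror the illustration; the run helper is made up for this example.

```python
import sqlite3

# SQLite is used here only as a stand-in engine that supports NATURAL JOIN.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Col1 TEXT, Col2 TEXT);
    CREATE TABLE TableB (Col1 TEXT, Col2 TEXT, Col3 TEXT);
    INSERT INTO TableA VALUES ('C', 'D');
    INSERT INTO TableB VALUES ('C', 'D', 'E');
""")

def run(sql):
    """Hypothetical helper: return (column names, rows) for a query."""
    cur = conn.execute(sql)
    return [d[0] for d in cur.description], cur.fetchall()

# NATURAL JOIN: implicit equi-join on Col1 and Col2, each shared column once.
natural_cols, natural_rows = run("SELECT * FROM TableA NATURAL JOIN TableB")

# INNER JOIN: explicit condition, all columns of both tables are kept.
inner_cols, inner_rows = run(
    "SELECT * FROM TableA AS A INNER JOIN TableB AS B "
    "ON A.Col1 = B.Col1 AND A.Col2 = B.Col2"
)

print(natural_cols, natural_rows)
print(inner_cols, inner_rows)
```

The NATURAL JOIN result has three columns while the INNER JOIN result has five, which is the loss of "which table did this come from" that the answer describes.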
Conclusion:
Everyone is entitled to their own opinion, but SQL will parse your query the same regardless.
Learning how the joins work helps you understand which business uses each join suits...or at least helps you work with other SQL dialects.
T-SQL does not implement NATURAL JOIN; alongside the explicit joins it offers the set operators UNION, UNION ALL, INTERSECT and EXCEPT.
If you can, ask your TL the 'whys' behind the business logic of using one or the other. Find out how he understands SQL Querying. You might either get something insightful or...well, be prepared to hear unconventional things...but at least you are learning more about SQL, your company, and how to ask questions.
A Win/Win, I say.

Related

sql join issues with different data

I want to join two tables whose columns have the same data type but differently formatted values.
In tableA, column col1 is a varchar holding values such as 123; in tableB, column col1 is a varchar holding values such as ABC-123.
Is there any way to join on both columns, either by adding the prefix ABC- to col1 in tableA or by removing it from col1 in tableB?
You can use CONCAT(), as in:
select *
from table_a a
join table_b b on concat('ABC-', a.col1) = b.col1
This issue is quite common, especially in old databases, where you need to join VARCHAR columns with NUMERIC ones because the designers back in the 90s thought of it that way.
Just use the CONCAT() function in the ON clause of the INNER JOIN:
select * from
tableA a inner join tableB b
on CONCAT('ABC-', a.col1) = b.col1
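A minimal runnable sketch of the prefix join, assuming SQLite via Python's sqlite3 module as a stand-in engine. It uses the || concatenation operator instead of CONCAT(), since CONCAT() is not available in older SQLite versions; the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (col1 TEXT);
    CREATE TABLE tableB (col1 TEXT);
    -- Invented sample data: only 123 has a prefixed counterpart.
    INSERT INTO tableA VALUES ('123'), ('456');
    INSERT INTO tableB VALUES ('ABC-123'), ('ABC-999');
""")

# Build the prefixed value on the tableA side and compare it to tableB.
matches = conn.execute("""
    SELECT a.col1, b.col1
    FROM tableA AS a
    JOIN tableB AS b ON 'ABC-' || a.col1 = b.col1
""").fetchall()

print(matches)
```

Computing the join key with an expression usually prevents the engine from using an index on that side, so on large tables a stored, normalized key column may be worth considering.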

Subqueries in SQL for select

I want to filter records in my cursor using inner and outer select statements.
How do I achieve that?
I want only '_02' records from both tables.
table A:
col1
1122_01
1234_02
3456_02
7899_02
table B:
col1
1111_02
1234_02
4567_02
table Final:
col1
3456_02
7899_02
SELECT distinct a.col1
FROM A a
WHERE NOT EXISTS (SELECT 1 FROM B b
WHERE b.col1 = a.col1
and b.col1='02')
and a.col1='02'
will this work?
Or this?
SELECT distinct t.item, t.skuloc loc
FROM SCPOMGR.UDT_DFUTOSKUMAP t
, SCPOMGR.udt_gen_param G
WHERE NOT EXISTS (SELECT *
FROM SCPOMGR.SKU s1
, SCPOMGR.udt_gen_param G
,SCPOMGR.UDT_DFUTOSKUMAP t
WHERE s1.ITEM = t.ITEM
AND s1.LOC = t.SKULOC
and G.region='XYZ'
and G.jda_code= substr(s1.loc,-2,2)
)
and G.region='XYZ'
and G.jda_code= substr(T.SKUloc,-2,2)
It looks like you want the set of
distinct values of A.col1
for which the same value does not exist in B.col1
ending in _02.
If you are going to work with SQL, it is vitally important to learn to specify what you want very precisely. SQL is, at its heart, a scheme for specifying sets of data. Use the concepts of elementary set theory when specifying requirements to be realized with SQL.
Once you have a precise specification, writing it in SQL is usually easy. If you can't express what you want in SQL, revisit the specification.
In your case:
SELECT DISTINCT A.col1
FROM A
LEFT JOIN B ON A.col1 = B.col1
WHERE B.col1 IS NULL
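AND A.col1 LIKE '%_02'
This uses LIKE '%suffix' for the third requirement, the old LEFT JOIN ... IS NULL trick for the second requirement, and DISTINCT A.col1 for the first. Note that _ is itself a single-character wildcard in LIKE, so '%_02' matches any value ending in one arbitrary character followed by 02; escape the underscore (for example with an ESCAPE clause) if that matters for your data.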
This particular query works in various SQL dialects. Cursors have nothing to do with this case.
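To check the query against the sample data from the question, here is a small sketch using Python's sqlite3 module, assuming SQLite as a stand-in engine. The ESCAPE '!' clause is an addition so that the underscore is matched literally; the ORDER BY is there only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (col1 TEXT);
    CREATE TABLE B (col1 TEXT);
    INSERT INTO A VALUES ('1122_01'), ('1234_02'), ('3456_02'), ('7899_02');
    INSERT INTO B VALUES ('1111_02'), ('1234_02'), ('4567_02');
""")

# Anti-join: keep rows of A with no partner in B, then filter on the suffix.
# '!' is declared as the escape character so '!_' means a literal underscore.
final = [row[0] for row in conn.execute("""
    SELECT DISTINCT A.col1
    FROM A
    LEFT JOIN B ON A.col1 = B.col1
    WHERE B.col1 IS NULL
      AND A.col1 LIKE '%!_02' ESCAPE '!'
    ORDER BY A.col1
""")]

print(final)  # ['3456_02', '7899_02']
```

This matches the "table Final" listed in the question: 1234_02 is dropped because it exists in B, and 1122_01 is dropped by the suffix filter.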

Left Join with Null Script Efficiency Explanation Needed

Why would I use a LEFT JOIN in SQL in a FROM clause and attach a WHERE clause where the entity "is null"? I was told this is a very efficient script and I should learn the methodology behind it.
For example:
FROM
something
LEFT JOIN aRow a AND bRow b AND cRow c AND dRow d
WHERE
bRow.b IS NULL;
This kind of construct is used when you specifically want to know something like "a list of all customers who have never ordered anything" :
SELECT
customers.*
FROM
customers
LEFT JOIN
orders
ON
orders.customerid = customers.id
WHERE
orders.id IS NULL
Or to quote an old manager of mine: "Can you get the database to give me a list of everything that isn't in the database?"
Me> "Sure, can you give me a list of what things the database should tell you it doesn't have?"
Him> "How am I supposed to know that?"
This really is a fairly generic, non-RDBMS-specific question. The logic will apply to pretty much any flavor of SQL. And this is a technique that anyone who works with data queries should be familiar with.
For all intents and purposes (and moving past the flawed syntax in the OP), this is the same query as the following, provided the subquery never returns NULL:
SELECT *
FROM table1
WHERE table1.col1 NOT IN (
SELECT table2.col1 FROM table2 WHERE table2.col2 = <filterHere>
)
When you are dealing with a couple of hundred rows, you may not see a significant difference in performance. But when you're working with a few million rows in both tables, you will often see a significant performance increase with
SELECT table1.*
FROM table1
LEFT OUTER JOIN table2 ON table1.col1 = table2.col1
AND table2.col2 = 42
WHERE table2.id IS NULL
Let's illustrate what is happening with these queries.
Create test tables.
CREATE TABLE table1 (col1 int, col2 varchar(10)) ;
INSERT INTO table1 ( col1, col2 )
VALUES (1,'a')
, (2,'b')
, (3,'c')
, (4,'d')
CREATE TABLE table2 (col1 int, col2 varchar(10)) ;
INSERT INTO table2 ( col1, col2 )
VALUES (1,'a')
, (3,'c')
This gives us
table1
col1 col2
1 a
2 b
3 c
4 d
table2
col1 col2
1 a
3 c
Now we want the rows that are in table1 but not in table2.
SELECT t1.col1, t1.col2
FROM table1 t1
WHERE t1.col1 NOT IN (
SELECT t2.col1 FROM table2 t2
)
We can't SELECT anything from table2, because that table is just a sub-query and not part of the whole query. It's not available to us.
This breaks down to
SELECT t1.col1, t1.col2
FROM table1 t1
WHERE t1.col1 NOT IN ( 1,3 )
Which further breaks down to
SELECT t1.col1, t1.col2
FROM table1 t1
WHERE t1.col1 <> 1
AND t1.col1 <> 3
These queries give us
col1 col2
2 b
4 d
That's a subquery broken down into two AND-ed comparisons to filter our results.
So let's look at a JOIN. We want all of the records on the left side, and only those on the right side that match. So
SELECT t1.col1 AS t1_col1, t1.col2 AS t1_col2, t2.col1 AS t2_col1, t2.col2 AS t2_col2
FROM table1 t1
LEFT OUTER JOIN table2 t2 ON t1.col1 = t2.col1
With a JOIN, both tables are available to our SELECT, so we can see which records in table2 match up to those in table1. The above gives us
t1_col1 t1_col2 t2_col1 t2_col2
1 a 1 a
2 b NULL NULL
3 c 3 c
4 d NULL NULL
With the extra data, we can see that the rows with col1 values 2 and 4 have no match in table2. We can now keep only those rows with a simple WHERE clause.
SELECT t1.col1, t1.col2
FROM table1 t1
LEFT OUTER JOIN table2 t2 ON t1.col1 = t2.col1
WHERE t2.col1 IS NULL
Giving us
col1 col2
2 b
4 d
There's no subquery and just one condition in the filter. Plus, this can allow the engine's optimizer to build a more efficient query plan.
It's impossible to see a difference in performance when we're only dealing with a couple of rows, but multiply these tables by a few million rows, and the JOIN can be noticeably faster, depending on the engine.
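The walkthrough above can be reproduced with a short sketch using Python's sqlite3 module, assuming SQLite as a stand-in engine. It also shows one behavioral difference worth knowing: once the subquery returns a NULL, NOT IN yields no rows at all, while the LEFT JOIN ... IS NULL form is unaffected. The extra NULL row is invented for that demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER, col2 TEXT);
    CREATE TABLE table2 (col1 INTEGER, col2 TEXT);
    INSERT INTO table1 VALUES (1,'a'), (2,'b'), (3,'c'), (4,'d');
    INSERT INTO table2 VALUES (1,'a'), (3,'c');
""")

NOT_IN = """
    SELECT t1.col1, t1.col2 FROM table1 t1
    WHERE t1.col1 NOT IN (SELECT t2.col1 FROM table2 t2)
    ORDER BY t1.col1
"""
ANTI_JOIN = """
    SELECT t1.col1, t1.col2 FROM table1 t1
    LEFT OUTER JOIN table2 t2 ON t1.col1 = t2.col1
    WHERE t2.col1 IS NULL
    ORDER BY t1.col1
"""

# With no NULLs in table2.col1 both forms return the same rows.
not_in_rows = conn.execute(NOT_IN).fetchall()
anti_join_rows = conn.execute(ANTI_JOIN).fetchall()

# Invented extra row: a NULL key in table2 changes the NOT IN result only.
conn.execute("INSERT INTO table2 VALUES (NULL, 'x')")
not_in_with_null = conn.execute(NOT_IN).fetchall()
anti_join_with_null = conn.execute(ANTI_JOIN).fetchall()

print(not_in_rows, anti_join_rows)
print(not_in_with_null, anti_join_with_null)
```

This sketch only compares results, not speed; whether the two forms get different query plans depends on the engine and is best checked with its own EXPLAIN facility.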

Matching unique Id's from two mysql tables, my query is only pulling a portion of them

I'm not sure what I'm doing wrong but I have my tables. I know this question has been asked a million times but I can't seem to figure out why it's only pulling a portion of the transaction ID's from one of my tables.
I have two tables. author and tab2.
author looks like so and the unique id is COL9:
COL1 COL2 COL3 COL4 COL5 COL6 COL7 COL8 COL9
data data data data data data data data 6314085733
My other table is tab2 and there is only one column in it:
COL1
6300798484
6300917409
6301563169
UPDATE: I just realized that there are two spaces in quite a few of the COL1 fields. I changed the data type to varchar(10) to eliminate those extra spaces and reran the query... still nothing.
My query is like so:
SELECT b.Col1, a.*
FROM author a
JOIN tab2 b
ON b.COL1 = a.COL9
ORDER BY a.COL9 DESC
I know there has to be more than 600 and the results I'm getting are:
Showing rows 0 - 24 (29 total, Query took 7.2141 sec) [COL9: 6319720972 - 6302432564]
You need to use TRIM, like this:
SELECT b.Col1, a.*
FROM author a
JOIN tab2 b
ON TRIM(b.COL1) = TRIM(a.COL9)
ORDER BY a.COL9 DESC
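To see why stray whitespace hides matches, here is a minimal sketch using Python's sqlite3 module, assuming SQLite as a stand-in engine. MySQL's handling of trailing spaces depends on the column collation, so treat this as an illustration of the idea rather than of MySQL's exact behavior. The padded values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (COL9 TEXT);
    CREATE TABLE tab2 (COL1 TEXT);
    INSERT INTO author VALUES ('6300798484'), ('6314085733');
    -- Invented data: same ids, but padded with two spaces on either side.
    INSERT INTO tab2 VALUES ('  6300798484'), ('6314085733  ');
""")

# Plain equality: the padded strings do not equal the clean ones.
plain = conn.execute("""
    SELECT a.COL9 FROM author a
    JOIN tab2 b ON b.COL1 = a.COL9
""").fetchall()

# TRIM on both sides removes the padding before comparing.
trimmed = conn.execute("""
    SELECT a.COL9 FROM author a
    JOIN tab2 b ON TRIM(b.COL1) = TRIM(a.COL9)
    ORDER BY a.COL9 DESC
""").fetchall()

print(plain)
print(trimmed)
```

If the join works after trimming, cleaning the stored values once with an UPDATE is usually preferable to trimming in every query.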
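If COL1 and COL9 hold numeric values, cast them before comparing. The example below uses SQL Server syntax; MySQL's CAST does not accept BIGINT, so use SIGNED or UNSIGNED there.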
SELECT b.Col1, a.*
FROM author a
JOIN tab2 b
ON CAST(b.COL1 AS BIGINT) = CAST(a.COL9 AS BIGINT)
ORDER BY CAST(a.COL9 AS BIGINT) DESC;
Or you can use TRIM.
TRIM is not available by default in SQL Server before 2017.
First create a scalar-valued function:
CREATE FUNCTION [dbo].[Trim]
(
-- Add the parameters for the function here
@pString VARCHAR(2000)
)
RETURNS VARCHAR(2000)
AS
BEGIN
-- strip line feeds, carriage returns and tabs, then trim spaces on both ends
RETURN LTRIM(RTRIM(REPLACE(REPLACE(REPLACE(@pString, CHAR(10), ''), CHAR(13), ''), CHAR(9), '')));
END
GO
Use the below query
SELECT b.Col1, a.*
FROM author a
JOIN tab2 b
ON dbo.Trim(b.COL1) = dbo.Trim(a.COL9)
ORDER BY dbo.Trim(a.COL9) DESC;

How to determine if a MySQL query is valid?

Look here and here.
With the answers above, I have made this query, is it valid? If not, how can I correct it?
SELECT *,
FROM TABLE_2 t
WHERE EXISTS(SELECT IF(column1 = 'smith', column2, column1)
FROM TABLE_1 a
WHERE 'smith' IN (a.column1, a.column2)
AND a.status = 1
AND ( 'smith' IN (t.column1, t.column2)
)
To start with, the comma after select * does not belong.
Second, you alias your tables (table_2 t and table_1 a), but then you don't consistently use the aliases, so you might have issues at run time. Also from a maintenance perspective, I think most folks prefer to use aliases when declared, and no aliases otherwise.
Third, you do a comparison against cols from the t table in the outer select ('smith' in (t.column1, t.column2) ), when that appears unnecessary. You can just do it in the outer select. In other words, you can move that terminal paren to before the AND ('smith'...
As for whether it works -- I have no idea, since I don't know what you are trying to accomplish.
Combined, that would leave you with :
SELECT t.*
FROM TABLE_2 t
WHERE EXISTS (SELECT IF(a.column1 = 'smith', a.column2, a.column1)
FROM TABLE_1 a
WHERE 'smith' IN (a.column1, a.column2)
AND a.status = 1)
AND ( 'smith' IN (t.column1, t.column2) )