SQL Server 2008 column in only one table - sql-server-2008

First a quick explanation: I am actually dealing with four tables and mining data from different places but my problem comes down to this seemingly simple concept and yes I am very new to this...
I have two tables (one and two) that both have ID columns in them. I want to query only the ID columns that are in table two only, not in both. As in..
Select ID
From dbo.one, dbo.two
Where dbo.two != dbo.one
I actually thought this would work but I'm getting odd results. Can anyone help?

SELECT t2.ID
FROM dbo.two t2
WHERE NOT EXISTS(SELECT NULL
FROM dbo.one t1
WHERE t2.ID = t1.ID)
This could also be done with a LEFT JOIN:
SELECT t2.ID
FROM dbo.two t2
LEFT JOIN dbo.one t1
ON t2.ID = t1.ID
WHERE t1.ID IS NULL

Completing the other 2 options after Joe's answer...
SELECT id
FROM dbo.two
EXCEPT
SELECT id
FROM dbo.one
SELECT t2.ID
FROM dbo.two t2
WHERE t2.ID NOT IN (SELECT t1.ID FROM dbo.one t1)
Note: LEFT JOIN will be slower than the other three, which should all give the same plan.
That's because LEFT JOIN is a join followed by a filter, the other 3 are semi-join

Related

A query that will see which results are the most

There is a table where strings are stored by type.
How to write 1 query so that it counts and outputs the number of results of each type.
Tried to do this:
SELECT COUNT(t1.id), COUNT(t2.id)
FROM table t1
LEFT JOIN table t2
ON t2.type='n1'
WHERE t1.type='b2';
But nothing works.
There can be many types, how can this be done?
I think this is what you want:
SELECT t1.type, COUNT(DISTINCT t1.id), COUNT(DISTINCT t2.id)
FROM table1 AS t1
LEFT JOIN table2 AS t2 ON t1.type = t2.type
GROUP BY t1.type
Note that this won't show any types that are only in table2. For that you'd need a FULL OUTER JOIN, which MySQL doesn't have. See How can I do a FULL OUTER JOIN in MySQL? for how to emulate it.

MySQL: How to Remove One Row of a Multi-Row Record Based on Column

If I have two tables that I'm joining and I write the most simple query possible like this:
SELECT *
FROM t1
LEFT JOIN t2 ON t1.id = t2.id
There are a few records who have multiple rows per ID because they have multiple employers, so t1 looks like this:
ID Name Employer
12345 Jerry Comedy Cellar
12345 Jerry NBC
12348 Elaine Pendant Publishing
12346 George Real Estate
12346 George Yankees
12346 George NBC
12347 Kramer Kramerica Industries
t2 is linked with the similar IDs but with some activities that I'd like to see -- hence the SELECT * above. Though I don't want multiple rows to return if the Employer column is "NBC" -- but everything else is good.
The only other thing that matters here is that t2 is smaller than t1, because t1 is everybody and t2 are only from people who did particular activities -- so some of the matches won't return anything from t2, but I would still like them to be returned, hence the LEFT JOIN.
If I write the query like this:
SELECT *
FROM t1
LEFT JOIN t2 ON t1.id = t2.id
WHERE Employer <> "NBC"
Then it removes Jerry and George completely -- when really all I want is for the NBC row to not be returned, but to return any other rows that are associated with them.
How can I write the query while joining t1 with t2 to return each row except for the NBC ones? The ideal output would be all of the rows from t1 regardless if they match up with all of t2 except removing all of the rows with "NBC" as the employer in the return file. Basically the ideal here is to return the JOINs where they fit, but regardless remove the entire row for anybody with "NBC" as employer without removing their other rows.
The more I write about it, it seems like I should potentially just run a query prior to my JOIN to delete all the rows in t1 who have "NBC" as their employer and then run the normal query.
Basic subset filtering
You can filter either of the two merged (joined) subsets by extending the ON clause.
SELECT *
FROM t1
LEFT JOIN t2
ON t1.ID = t2.ID
AND t2.Employer != 'NBC'
If you get null values now, and you don't want them, you'd add:
WHERE t2.Employer IS NOT NULL
extended logic:
SELECT *
FROM t1
LEFT JOIN t2
ON (t1.ID = t2.ID AND t2.Employer != 'NBC')
OR (t2.ID = t2.ID AND t2.Employer IS NULL)
Using UNION
Basically, JOIN is for horizontal linking and UNION does vertical linking of datasets.
It merges to resultsets: the first without NBC, and the second (which is basically an OUTER JOIN), adds everyone in t1 which is not part of t2.
SELECT *
FROM t1
LEFT JOIN t2
ON t1.ID = t2.ID
AND t2.Employer != 'NBC'
UNION
SELECT *
FROM t1
LEFT JOIN t2
ON t1.ID = t2.ID
AND t2.Employer IS NULL
String manipulation in the resultset
If you just want to remove NBC as a string, here is a workaround:
SELECT
t1.*,
IF (t2.Employer = 'NBC', NULL, t2.Employer) AS Employer
FROM t1
LEFT JOIN t2
ON t1.id = t2.id
This basically replaces "NBC" by NULL

Fetch the rows which have the Max value for a column - with duplicates

This answer from Bill Karwin to question 121387 worked perfectly for me..
"I see many people use subqueries or else vendor-specific features to do this, but I often do this kind of query without subqueries in the following way. It uses plain, standard SQL so it should work in any brand of RDBMS.
SELECT t1.*
FROM mytable AS t1
LEFT OUTER JOIN mytable AS t2
ON (t1.UserId = t2.UserId AND t1."Date" < t2."Date")
WHERE t2.UserId IS NULL;
In other words: fetch the row from t1 where no other row exists with the same UserId and a greater Date."
However, I also need to include in the result a column from a third table (imagine another table with UserId and UserPhoneNumber columns). It feels as if it should be straightforward but it's driving me nuts. Any help would be appreciated.
Simply join the third table after the LEFT JOIN:
SELECT t1.*, t3.*
FROM mytable AS t1
LEFT OUTER JOIN mytable AS t2
ON (t1.UserId = t2.UserId AND t1."Date" < t2."Date")
JOIN myothertable AS t3 ON t1.UserId = t3.UserId
WHERE t2.UserId IS NULL;

Shorten a join query

I have a query with 3 joins:
SELECT t1.email, t2.firstname, t2.lastname, t4.value
FROM t1
left join t2 on t1.email = t2.email
Inner join t3 on t2.entity_id = t3.order_id
Inner join t4 on t3.product_id = t4.entity_id
WHERE t4.attribute_id = 126
I think my server just can't make it :) --> time is running out so an error occurs!
Thanks a lot
Table structur:
T1:
email (which is the same then in t2)
T2:
email firstname lastname orderid (which is called entity id in t3)
T3:
entityid product id (which is called entity id in t4)
T4:
entityid attributeid value
Unless t2 links straight to t4 there is no way.
Also, do you need a left join between t1 and t2?
As #Sachin already stated, you can't "shorten" this query unless t2 links straight to t4 without requiring a comparison with t3. However, in order to speed up your query, you should have indexes on some or all of the columns referenced in your join conditions (i.e. t1.email, t2.email, t2.entity_id, etc).
Having an index on each of these columns will give you much faster SELECT queries, but it will slow down your INSERT and UPDATE queries. So if you SELECT more often than you INSERT or UPDATE, then you should definitely be using indexes. If not, try to make indexes in wise places (tables that have INSERT or UPDATE statements run less often but still have a lot of rows, for instance).
For further clarification, see the following links:
More information on how indexes work
Syntax for creating indexes
Try your query this way:
SELECT t1.email, t2.firstname, t2.lastname, t4.value
FROM t4
INNER JOIN t3 ON t3.product_id = t4.entity_id
INNER JOIN t2 ON t2.entity_id = t3.order_id
INNER JOIN t1 ON t1.email = t2.email
WHERE t4.attribute_id = 126
It's basically your query but "backwards". Your original way, your DBMS has to try to join t2 for ALL records in t1, then join t3 for ALL records found in t2 before it can even attempt to address your WHERE clause.
My way, you're finding all the records in t4 where attribute_id = 126 first, THEN attempting to join other tables. It should be a lot quicker. You should then be able to speed things up even more by making sure the proper indexes exist on the tables involved. You can prepend the keyword EXPLAIN to your query to see how the DBMS attempts to seek data in your query.

which method is better to join mysql tables?

What is difference between these two methods of selecting data from multiple tables. First one does not use JOIN while the second does. Which one is prefered method?
Method 1:
SELECT t1.a, t1.b, t2.c, t2.d, t3.e, t3.f
FROM table1 t1, table2 t2, table3 t3
WHERE t1.id = t2.id
AND t2.id = t3.id
AND t3.id = x
Method 2:
SELECT t1.a, t1.b, t2.c, t2.d, t3.e, t3.f
FROM `table1` t1
JOIN `table2` t2 ON t1.id = t2.id
JOIN `table3` t3 ON t1.id = t3.id
WHERE t1.id = x
For your simple case, they're equivalent. Even though the 'JOIN' keyword is not present in Method #1, it's still doing joins.
However, method #2 offers the flexibility of allowing extra conditions in the JOIN condition that can't be accomplished via WHERE clauses. Such as when you're doing aliased multi-joins against the same table.
select a.id, b.id, c.id
from sometable A
left join othertable as b on a.id=b.a_id and some_condition_in_othertable
left join othertable as c on a.id=c.a_id and other_condition_in_othertable
Putting the two extra conditions in the whereclause would cause the query to return nothing, as both conditions cannot be true at the same time in the where clause, but are possible in the join.
The methods are apparently identical in performance, it's just new vs old syntax.
I don't think there is much of a difference. You could use the EXPLAIN statement to check if MySQL does anything differently. For this trivial example I doubt it matters.