I have three tables, T1(C1,C2), T2(C1,C2) and T3(C1,C2), and I want to select the rows that occur only in T1, not in T2 or T3.
Right now I use LEFT JOINs to do this:
SELECT T1.C1, T1.C2
FROM T1
LEFT JOIN T2 ON T1.C1=T2.C1 AND T1.C2=T2.C2
LEFT JOIN T3 ON T1.C1=T3.C1 AND T1.C2=T3.C2
WHERE T2.C1 IS NULL AND T3.C1 IS NULL;
But the problem is how to improve the performance when the database is quite large (e.g. more than 10,000,000 rows in the three tables). Using LEFT JOIN, the query takes too long...
Three things that immediately jump to mind are:
- Indexes on the fields used for the joins
- SELECT only the data you really need
- Give your database server lots of memory
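A subquery can help you here, since it avoids the cost of the two left joins. Note that the version below compares only C1, not the (C1, C2) pair, and that NOT IN returns no rows at all if the subquery produces a NULL, so it only fits if C1 is never NULL in T2 and T3.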
SELECT
C1,C2
FROM
T1
WHERE
C1 NOT IN (
SELECT C1 FROM T2
UNION ALL
SELECT C1 FROM T3
);
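If both columns have to match, as in the original LEFT JOIN, a NOT EXISTS per table is one way to write it. The following is only a sketch, run against an in-memory SQLite database from Python as a stand-in for the real server; the sample rows and the index names (idx_t2_c1_c2, idx_t3_c1_c2) are made up for illustration. It also shows the NULL behaviour of NOT IN.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; sample data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (C1 INTEGER, C2 INTEGER);
    CREATE TABLE T2 (C1 INTEGER, C2 INTEGER);
    CREATE TABLE T3 (C1 INTEGER, C2 INTEGER);
    INSERT INTO T1 VALUES (1, 10), (2, 20), (3, 30), (4, 40);
    INSERT INTO T2 VALUES (1, 10), (2, 99);
    INSERT INTO T3 VALUES (3, 30), (NULL, 0);
    -- Hypothetical composite indexes so each NOT EXISTS probe can be an index lookup.
    CREATE INDEX idx_t2_c1_c2 ON T2 (C1, C2);
    CREATE INDEX idx_t3_c1_c2 ON T3 (C1, C2);
""")

# Rows of T1 whose (C1, C2) pair appears in neither T2 nor T3.
only_in_t1 = conn.execute("""
    SELECT T1.C1, T1.C2
    FROM T1
    WHERE NOT EXISTS (SELECT 1 FROM T2 WHERE T2.C1 = T1.C1 AND T2.C2 = T1.C2)
      AND NOT EXISTS (SELECT 1 FROM T3 WHERE T3.C1 = T1.C1 AND T3.C2 = T1.C2)
    ORDER BY T1.C1
""").fetchall()

# The NOT IN form: the NULL in T3.C1 makes the predicate unknown for every row.
not_in_rows = conn.execute("""
    SELECT C1, C2
    FROM T1
    WHERE C1 NOT IN (SELECT C1 FROM T2 UNION ALL SELECT C1 FROM T3)
""").fetchall()

print(only_in_t1)   # rows (2, 20) and (4, 40)
print(not_in_rows)  # empty, because of the NULL in the subquery
```

Whether this beats the LEFT JOIN version on your data is something only EXPLAIN and a timing run on the real tables can tell.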
Suppose I have a small table (t1) and a large table (t2). I have indexed column1 and column2 of t2. If I want to INNER JOIN t1 and (select * from t2 where column1=x), will the indexing on t2 still be helpful during the inner join with t1, after the (select * from t2 where column1=x) has run?
If my query is just (select * from t2 where column1=x), then obviously the indexing is helpful. What happens when my complete query is run? Will it first run (select * from t2 where column1=x) (here the index will be used) and then INNER JOIN with t1 without using the index?
Almost always it is better to JOIN two tables instead of JOINing to a "derived" table.
Probably inefficient:
FROM t1
JOIN ( SELECT ... FROM t2 ... ) AS t3 ON ...
Probably better:
FROM t1
JOIN t2 ON ...
One likely exception is when the derived table (t3) is much smaller than the table (t2) it comes from. This may happen when there is a GROUP BY, DISTINCT, and/or LIMIT inside t3.
If you want to discuss further, please provide the fully spelled out SELECT and SHOW CREATE TABLE for the two tables. An important discussion point is what indexes exist (or are missing).
I would like to join three tables and then union the results. Two of the joined tables are the same in both of the queries being union'd, and it seems like a waste to perform this join twice. See below for an example. How is this best performed? Thanks
SELECT t1.c1,t2.c1,t3.c1
FROM audits AS t1
INNER JOIN t2 ON t2.t1_id=t1.id
INNER JOIN t3 ON t3.t1_id=t1.id
WHERE t2.fk1=123
UNION
SELECT t1.c1,t2.c1,t4.c1
FROM audits AS t1
INNER JOIN t2 ON t2.t1_id=t1.id
INNER JOIN t4 ON t4.t1_id=t1.id
WHERE t2.fk1=123
ORDER BY t1.fk1 ASC
This would work, if the syntax is supported by MySQL, and might be slightly more efficient:
SELECT t1.c1, t2.c1, t.c1
FROM audits AS t1
INNER JOIN t2 ON t2.t1_id=t1.id
INNER JOIN (
select t1_id, c1 from t3
union
select t1_id, c1 from t4
) as t ON t.t1_id=t1.id
WHERE t2.fk1=123
ORDER BY t1.fk1 ASC
The reason for a possible performance improvement is the smaller footprint of the relation being UNION'ed: two columns instead of three, and the join to t2 is performed only once. UNION eliminates duplicates (unlike UNION ALL), so the entire collection of records must be sorted to eliminate them.
At the meta-level this query informs the optimizer of a specific optimization available, one that it may be unable to determine on its own.
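To check that the two shapes return the same rows, here is a small sketch on an in-memory SQLite database driven from Python. The sample data is invented, the derived table is assumed to carry both t1_id and c1, and the ORDER BY is left out because SQLite does not accept a table-qualified ORDER BY on a compound SELECT. The results are compared as sets, since the rewritten query applies UNION only to the inner relation and not to the final rows.

```python
import sqlite3

# In-memory SQLite stand-in; table contents are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE audits (id INTEGER, c1 TEXT);
    CREATE TABLE t2 (t1_id INTEGER, fk1 INTEGER, c1 TEXT);
    CREATE TABLE t3 (t1_id INTEGER, c1 TEXT);
    CREATE TABLE t4 (t1_id INTEGER, c1 TEXT);
    INSERT INTO audits VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO t2 VALUES (1, 123, 'x'), (2, 456, 'y');
    INSERT INTO t3 VALUES (1, 'p'), (1, 'q'), (2, 'r');
    INSERT INTO t4 VALUES (1, 'q'), (1, 's');
""")

# Original shape: the audits/t2 join is written out (and run) twice.
two_joins = conn.execute("""
    SELECT t1.c1, t2.c1, t3.c1
    FROM audits AS t1
    INNER JOIN t2 ON t2.t1_id = t1.id
    INNER JOIN t3 ON t3.t1_id = t1.id
    WHERE t2.fk1 = 123
    UNION
    SELECT t1.c1, t2.c1, t4.c1
    FROM audits AS t1
    INNER JOIN t2 ON t2.t1_id = t1.id
    INNER JOIN t4 ON t4.t1_id = t1.id
    WHERE t2.fk1 = 123
""").fetchall()

# Rewritten shape: union t3 and t4 first, then join once.
one_join = conn.execute("""
    SELECT t1.c1, t2.c1, t.c1
    FROM audits AS t1
    INNER JOIN t2 ON t2.t1_id = t1.id
    INNER JOIN (
        SELECT t1_id, c1 FROM t3
        UNION
        SELECT t1_id, c1 FROM t4
    ) AS t ON t.t1_id = t1.id
    WHERE t2.fk1 = 123
""").fetchall()

print(sorted(two_joins))
print(sorted(one_join))
```

On this sample both queries give the same three rows; on real data the rewritten form can still contain duplicate output rows if audits or t2 themselves produce them.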
I have a query with 3 joins:
SELECT t1.email, t2.firstname, t2.lastname, t4.value
FROM t1
left join t2 on t1.email = t2.email
Inner join t3 on t2.entity_id = t3.order_id
Inner join t4 on t3.product_id = t4.entity_id
WHERE t4.attribute_id = 126
I think my server just can't handle it :) The query runs until it times out, and then an error occurs!
Thanks a lot
Table structure:
T1:
email (which is the same as in t2)
T2:
email firstname lastname orderid (which is called entity id in t3)
T3:
entityid product id (which is called entity id in t4)
T4:
entityid attributeid value
Unless t2 links straight to t4 there is no way.
Also, do you need a left join between t1 and t2?
As @Sachin already stated, you can't "shorten" this query unless t2 links straight to t4 without requiring a comparison with t3. However, in order to speed up your query, you should have indexes on some or all of the columns referenced in your join conditions (i.e. t1.email, t2.email, t2.entity_id, etc).
Having an index on each of these columns will give you much faster SELECT queries, but it will slow down your INSERT and UPDATE queries. So if you SELECT more often than you INSERT or UPDATE, then you should definitely be using indexes. If not, try to make indexes in wise places (tables that have INSERT or UPDATE statements run less often but still have a lot of rows, for instance).
For further clarification, see the following links:
More information on how indexes work
Syntax for creating indexes
Try your query this way:
SELECT t1.email, t2.firstname, t2.lastname, t4.value
FROM t4
INNER JOIN t3 ON t3.product_id = t4.entity_id
INNER JOIN t2 ON t2.entity_id = t3.order_id
INNER JOIN t1 ON t1.email = t2.email
WHERE t4.attribute_id = 126
It's basically your query but "backwards". Your original way, your DBMS has to try to join t2 for ALL records in t1, then join t3 for ALL records found in t2 before it can even attempt to address your WHERE clause.
My way, you're finding all the records in t4 where attribute_id = 126 first, THEN attempting to join other tables. It should be a lot quicker. You should then be able to speed things up even more by making sure the proper indexes exist on the tables involved. You can prepend the keyword EXPLAIN to your query to see how the DBMS attempts to seek data in your query.
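As a rough illustration of the index-plus-EXPLAIN advice, here is a sketch using an in-memory SQLite database from Python. It is only a stand-in: SQLite's EXPLAIN QUERY PLAN output looks different from MySQL's EXPLAIN, and the index names, column types and sample rows are all assumptions made for the example.

```python
import sqlite3

# In-memory SQLite stand-in; schema types, index names and rows are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (email TEXT);
    CREATE TABLE t2 (email TEXT, firstname TEXT, lastname TEXT, entity_id INTEGER);
    CREATE TABLE t3 (order_id INTEGER, product_id INTEGER);
    CREATE TABLE t4 (entity_id INTEGER, attribute_id INTEGER, value TEXT);

    -- Hypothetical indexes on the filter column and the join columns.
    CREATE INDEX idx_t4_attribute ON t4 (attribute_id);
    CREATE INDEX idx_t3_product ON t3 (product_id);
    CREATE INDEX idx_t2_entity ON t2 (entity_id);
    CREATE INDEX idx_t1_email ON t1 (email);

    INSERT INTO t1 VALUES ('a@x.com'), ('b@x.com');
    INSERT INTO t2 VALUES ('a@x.com', 'Ann', 'Lee', 100), ('b@x.com', 'Bob', 'Ray', 200);
    INSERT INTO t3 VALUES (100, 7), (200, 8);
    INSERT INTO t4 VALUES (7, 126, 'red'), (8, 999, 'blue');
""")

query = """
    SELECT t1.email, t2.firstname, t2.lastname, t4.value
    FROM t4
    INNER JOIN t3 ON t3.product_id = t4.entity_id
    INNER JOIN t2 ON t2.entity_id = t3.order_id
    INNER JOIN t1 ON t1.email = t2.email
    WHERE t4.attribute_id = 126
"""

rows = conn.execute(query).fetchall()

# The last column of each EXPLAIN QUERY PLAN row is a human-readable step.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]

print(rows)
for step in plan:
    print(step)
```

In MySQL the equivalent check is to put EXPLAIN in front of the SELECT and look at which key is chosen for each table.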
I'm looking for a query to select rows from two different tables, keeping the column names the same (I did find one result here for selecting from two different tables, but it merged the column names to have an easier query). I need to keep the original column names, but have two different tables existing within the new, larger table. There are no overlapping columns between the two tables.
A picture, to visualise:
So, how can I do this? I know the query will probably be quite convoluted, but anything half-decent is probably going to be better than my current attempt:
SELECT t1.* , t2.*
FROM table1 t1 RIGHT OUTER JOIN table2 t2
ON t1.someColumn1 = t2.someColumn2
UNION
SELECT t1.* , t2.*
FROM table1 t1 LEFT OUTER JOIN table2 t2
ON t1.someColumn1 = t2.someColumn2
This does work, but only as long as there are no cases where someColumn1 = someColumn2 - which can happen quite easily, of course.
Any help is appreciated, and I apologise for what is probably a very silly question to which the smart answer is "don't do it, you fool!".
You can set your join criterion to never match:
SELECT t1.* , t2.*
FROM table1 t1 RIGHT OUTER JOIN table2 t2
ON 1 = 0
UNION
SELECT t1.* , t2.*
FROM table1 t1 LEFT OUTER JOIN table2 t2
ON 1 = 0
I don't have MySQL to test, but it works in SQL Server.
Edit: my first answer was wrong:
select * from Events
left join GroupList on ID=null
union
select Events.*,GroupList.* from GroupList
left join Events on GID=null
In the above, GID and ID are key fields in the tables.
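Here is a small sketch of the never-matching join, run on an in-memory SQLite database from Python. The tables and their columns (a, b, c, d) are invented, and both halves are written as LEFT JOINs with the table order swapped, because older SQLite builds have no RIGHT JOIN; the idea is the same as in the queries above.

```python
import sqlite3

# In-memory SQLite stand-in; tables and columns are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a INTEGER, b TEXT);
    CREATE TABLE table2 (c TEXT, d REAL);
    INSERT INTO table1 VALUES (1, 'x'), (2, 'y');
    INSERT INTO table2 VALUES ('p', 1.5);
""")

# ON 1 = 0 never matches, so every row keeps NULLs for the other table's columns.
# UNION would collapse fully identical rows; UNION ALL would keep them.
cur = conn.execute("""
    SELECT t1.*, t2.*
    FROM table1 t1 LEFT JOIN table2 t2 ON 1 = 0
    UNION
    SELECT t1.*, t2.*
    FROM table2 t2 LEFT JOIN table1 t1 ON 1 = 0
""")

columns = [d[0] for d in cur.description]  # original column names are kept
rows = cur.fetchall()

print(columns)
for row in rows:
    print(row)
```

Each table1 row comes back with NULL in c and d, and each table2 row with NULL in a and b, so the two tables sit side by side under their own column names.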
I'm sure this is straightforward, but how do I write a query in MySQL that joins two tables and then returns only those records from the first table that don't match? I want it to be something like:
Select tid from table1 inner join table2 on table2.tid = table1.tid where table1.tid != table2.tid;
but this doesn't seem to make a lot of sense!
You can use a left outer join to accomplish this:
select
t1.tid
from
table1 t1
left outer join table2 t2 on
t1.tid = t2.tid
where
t2.tid is null
What this does is it takes your first table (table1), joins it with your second table (table2), and fills in null for the table2 columns in any row in table1 that doesn't match a row in table2. Then, it filters that out by selecting only the table1 rows where no match could be found.
Alternatively, you can also use not exists:
select
t1.tid
from
table1 t1
where
not exists (select 1 from table2 t2 where t2.tid = t1.tid)
This performs a left anti semi join (an "anti-join"), and will essentially do the same thing that the left outer join does. Depending on your indexes, one may be faster than the other, but both are viable options. MySQL has some good documentation on optimizing joins, so you should check that out.
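A quick way to convince yourself that the two forms agree is to run them side by side. The sketch below uses an in-memory SQLite database from Python with made-up tid values; it says nothing about which form is faster on a real MySQL server.

```python
import sqlite3

# In-memory SQLite stand-in; the tid values are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (tid INTEGER);
    CREATE TABLE table2 (tid INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3), (4);
    INSERT INTO table2 VALUES (2), (4), (4);
""")

# Form 1: left outer join, then keep only the rows with no partner.
via_left_join = conn.execute("""
    SELECT t1.tid
    FROM table1 t1
    LEFT OUTER JOIN table2 t2 ON t1.tid = t2.tid
    WHERE t2.tid IS NULL
    ORDER BY t1.tid
""").fetchall()

# Form 2: NOT EXISTS with a correlated subquery.
via_not_exists = conn.execute("""
    SELECT t1.tid
    FROM table1 t1
    WHERE NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.tid = t1.tid)
    ORDER BY t1.tid
""").fetchall()

print(via_left_join)
print(via_not_exists)
```

Both queries return tids 1 and 3 here. The duplicate 4 in table2 does not matter, because only unmatched table1 rows survive the filter.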