I have two almost identical tables. The second one is a "slave" of the first. The first table has an auto-increment int ID column, and the second also has ID2, which is a unique indexed int but not auto-incremented. ID2 is the counterpart of ID.
I need to find the fastest way to detect new rows in the second table (those ID2 values that do not exist in the first table) and vice versa, new rows in the first table (those ID values that do not exist in the second table). The fastest way I have found is
select SQL_NO_CACHE
tab1.ID
from `tab1`
left join `tab2`
on tab1.ID = tab2.ID2
where
isnull(tab2.ID2)
It takes about 2.5 seconds on ~200k records. What would you propose to get a faster result?
SELECT * FROM Tab2 WHERE
NOT EXISTS (SELECT 'x' FROM Tab1 WHERE
Tab1.ID = Tab2.ID2)
I think this query would give you a slightly faster result.
Use is null:
select SQL_NO_CACHE tab1.ID
from `tab1` left join
`tab2`
on tab1.ID = tab2.ID2
where tab2.id2 is null;
This performance should be equivalent to:
select tab1.id
from tab1
where not exists (select 1 from tab2 where tab2.id2 = tab1.id);
But it is worth trying both approaches.
Note that these versions do the inverse of what you ask -- find rows in tab1 that are not in tab2. It should be obvious how to switch the logic, depending on what you really want.
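To make the "switch the logic" remark concrete, here is a minimal sketch that runs both forms and the reversed direction. It assumes SQLite (via Python's sqlite3 module) as a stand-in for MySQL, with a few made-up rows, so it shows that the queries agree but says nothing about MySQL timings.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (ID  INTEGER PRIMARY KEY);
    CREATE TABLE tab2 (ID2 INTEGER UNIQUE);
    INSERT INTO tab1 (ID)  VALUES (1), (2), (3), (4);
    INSERT INTO tab2 (ID2) VALUES (2), (3), (5);
""")

def ids(sql):
    """Run a single-column query and return its values sorted."""
    return sorted(row[0] for row in conn.execute(sql))

# Rows in tab1 with no partner in tab2, written as LEFT JOIN / IS NULL.
only_in_tab1_join = ids("""
    SELECT tab1.ID FROM tab1
    LEFT JOIN tab2 ON tab1.ID = tab2.ID2
    WHERE tab2.ID2 IS NULL
""")

# The same question written as NOT EXISTS.
only_in_tab1_exists = ids("""
    SELECT tab1.ID FROM tab1
    WHERE NOT EXISTS (SELECT 1 FROM tab2 WHERE tab2.ID2 = tab1.ID)
""")

# Reversed direction: rows in tab2 with no partner in tab1.
only_in_tab2 = ids("""
    SELECT tab2.ID2 FROM tab2
    WHERE NOT EXISTS (SELECT 1 FROM tab1 WHERE tab1.ID = tab2.ID2)
""")

print(only_in_tab1_join, only_in_tab1_exists, only_in_tab2)
```

Both forms return the same IDs for tab1, and swapping the table roles gives the rows that exist only in tab2.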
I need to check two tables and find inconsistencies, i.e. rows where the value in table T1 is not present in the italy_cities table. I'll explain:
T1: Includes personal data (with place of birth)
italy_cities: Includes all the municipalities of Italy.
Table T1 has about 9000 tuples.
italy_cities has 7,903 tuples.
Using "NOT IN" the query takes approximately 16 seconds to execute.
Here is the query:
SELECT
`T1`.*
FROM
T1
WHERE
(
`T1`.place NOT IN ( SELECT municipality FROM italy_cities )
)
MY QUESTION
What is the best and fastest option to check for inconsistencies, that is, to find all the "incorrect" municipalities that do not exist in the official database?
Thanks in advance
I generally recommend NOT EXISTS for this purpose:
SELECT T1.*
FROM T1
WHERE NOT EXISTS (SELECT 1
FROM italy_cities ic
WHERE t1.place = ic.municipality
);
Why? There are two reasons:
NOT IN does not do what you expect if the subquery returns any NULL values. If even one value is NULL all rows end up being filtered out.
This version of the query can take advantage of an index on italy_cities(municipality) which seems like a reasonable index on the table.
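The NULL behaviour is easy to reproduce. The following is a small sketch, assuming SQLite through Python's sqlite3 module in place of MySQL and three invented place names; the NULL handling shown is standard SQL three-valued logic.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (place TEXT);
    CREATE TABLE italy_cities (municipality TEXT);
    INSERT INTO T1 VALUES ('Roma'), ('Milano'), ('Atlantis');
    -- One NULL in the lookup table is enough to trigger the problem.
    INSERT INTO italy_cities VALUES ('Roma'), ('Milano'), (NULL);
""")

# 'Atlantis' NOT IN ('Roma', 'Milano', NULL) evaluates to NULL, not TRUE,
# so the row is filtered out along with the genuine matches.
not_in_rows = conn.execute("""
    SELECT place FROM T1
    WHERE place NOT IN (SELECT municipality FROM italy_cities)
""").fetchall()

# NOT EXISTS only asks whether a matching row exists, so the NULL is harmless.
not_exists_rows = conn.execute("""
    SELECT place FROM T1
    WHERE NOT EXISTS (SELECT 1 FROM italy_cities ic
                      WHERE ic.municipality = T1.place)
""").fetchall()

print(not_in_rows)      # no rows at all
print(not_exists_rows)  # only the unmatched place
```

If the lookup column is declared NOT NULL, both forms return the same rows and the choice comes down to performance.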
NOT EXISTS can perform better, but another option is a LEFT JOIN, as follows:
SELECT T1.*
FROM T1
LEFT JOIN italy_cities I ON I.municipality = T1.PLACE
WHERE I.municipality IS NULL;
My table has columns labeled primary_key and summary_id. The value in the second field, summary_id, in each record maps to the primary_key field of another record. There is a third field, template_id. I need to select those records for which:
template_id is a certain value. Let's say 4.
primary_key matches at least one of the records' summary_id field.
Please don't tell me to redesign the tables. My next project will have a better design, but I don't have time for that now. I need to do this with one or more queries; the fewer the better. Ideally, there's some way to do this with one query, but I'm okay if it requires more.
This is how far I've gotten with my own query. (I know it's seriously lacking, which is why I need help.)
SELECT DISTINCT esjp_content.template_id
FROM esjp_content
INNER JOIN esjp_hw_config ON esjp_content.template_id = esjp_hw_config.proc_id
INNER JOIN esjp_assets ON esjp_hw_config.primary_key = esjp_assets.hw_config_id
WHERE
esjp_content.summary_id > 0
AND
(esjp_assets.asset_label='C001498500' OR esjp_assets.asset_label='H0065' OR esjp_assets.asset_label='L0009');
SELECT
esjp_content.primary_key, esjp_content.template_id, esjp_content.content, esjp_content.summary_id
FROM
esjp_content
WHERE
esjp_content.template_id = 4;
I need the records that summary_id points to. For example, if summary_id is 90, then I need the record where primary_key is 90.
You're looking for the existence of at least one row whose summary_id equals your primary key, like this:
SELECT *
FROM esjp_content c
WHERE template_id = 4
AND EXISTS (SELECT 1 FROM esjp_content c2 WHERE c2.summary_id = c.primary_key)
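Here is a quick sketch of what the EXISTS version returns. It assumes SQLite via Python's sqlite3 module rather than MySQL, and the six rows are hypothetical. The join variant is my own addition, included only to show one practical difference: EXISTS returns each qualifying row once, however many rows point at it.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE esjp_content (
        primary_key INTEGER PRIMARY KEY,
        template_id INTEGER,
        summary_id  INTEGER
    );
    INSERT INTO esjp_content VALUES
        (90, 4, 0),   -- template 4, referenced twice below
        (91, 4, 0),   -- template 4, never referenced
        (92, 7, 90),
        (93, 7, 90),
        (94, 7, 95),
        (95, 5, 0);   -- referenced, but wrong template
""")

exists_keys = [row[0] for row in conn.execute("""
    SELECT c.primary_key
    FROM esjp_content c
    WHERE c.template_id = 4
      AND EXISTS (SELECT 1 FROM esjp_content c2
                  WHERE c2.summary_id = c.primary_key)
""")]

# A plain inner join answers the same question but repeats the row
# once per referencing record, so it would need DISTINCT.
join_keys = [row[0] for row in conn.execute("""
    SELECT c.primary_key
    FROM esjp_content c
    JOIN esjp_content c2 ON c2.summary_id = c.primary_key
    WHERE c.template_id = 4
""")]

print(exists_keys, join_keys)
```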
You can join the table to itself using both IDs:
SELECT
t1.*
FROM
esjp_content t1
INNER JOIN esjp_content t2 ON t1.summary_id = t2.primary_key
WHERE
t1.template_id = 4
I have a table of 'entries' in a MySQL database. I have another table that records activity on those entries, with the id of the entry as a foreign key. I want to select from my first table the entries that do not appear in the second table.
How can I use SQL to make this happen? Do I have to iterate through both tables and compare every entry with every other entry? Is there an easier way to do this?
ex. I have a table with an entry data column and a user name column. I have another table with an entry id column and a user id column. I want to select from my first table all of the entries which do not appear in the second table with a given user id.
Thanks ahead of time. I have been struggling with this experiment for a while. I imagine I have to join the two tables somehow?
There are several ways to achieve this: NOT IN, NOT EXISTS, or LEFT JOIN with a NULL check. Here's one with NOT EXISTS:
SELECT *
FROM FirstTable T
WHERE NOT EXISTS (
SELECT *
FROM SecondTable T2
WHERE T.Id = T2.Id
)
From what I understand, you want to select all rows where the foreign key doesn't match anything in the other table. This should do the trick:
SELECT *
FROM Data A
RIGHT JOIN Entry B
ON A.ID = B.ID
WHERE A.ID IS NULL
Here's a handy chart that illustrates how to use joins for stuff like this.
You can also use NOT IN, and the mechanics for this one are actually a bit easier to understand.
SELECT *
FROM Data A
WHERE A.ID NOT IN (SELECT ID FROM Entry)
This is probably something simple, but I can't wrap my head around it. I've tried IN, NOT EXISTS, EXCEPT, etc. and still can't seem to get this right.
I have two tables.
Table A
-----------
BK
NUM
Table B
------------
BK
NUM
How do I write a query to remove all records from table A, that are not in table B based on the two fields. So if Table A has a record where BK = 1 and NUM = 2, then it should look in table B. If table B also has a record where BK = 1 and NUM = 2 then do nothing, but if not, delete that record from table A. Does that make sense?
Any help is much appreciated.
You can do it with NOT IN:
delete from tablea
where (BK,NUM) not in
(select BK,NUM from tableb)
Or using NOT EXISTS:
delete from tablea a
where not exists
(select 1 from tableb where BK=a.BK and NUM = a.NUM)
Another alternative is to use an anti-join pattern, a LEFT [OUTER] JOIN and then a predicate in the WHERE clause that filters out all matches.
It's easiest to write this as a SELECT first, test it, and then convert to a DELETE.
SELECT t.*
FROM tablea t
LEFT
JOIN tableb s
ON s.BK = t.BK
AND s.NUM = t.NUM
WHERE s.BK IS NULL
The LEFT JOIN returns all rows from t along with matching rows from s. The "trick" is the predicate in the WHERE clause... we know that s.BK will be non-NULL on all matching rows (because the value had to satisfy an equality comparison, in a predicate in the ON clause). So s.BK will be NULL only for rows in t that didn't have a matching row in s.
For MySQL, changing that into a DELETE statement is easy: just replace the SELECT keyword with DELETE. (We could write either DELETE t or DELETE t.*; either of those will work.)
(This is an illustration of only one (of several) possible approaches.)
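The "SELECT first, then DELETE" workflow can be sketched as follows. This assumes SQLite through Python's sqlite3 module, with invented rows. SQLite has no DELETE ... JOIN syntax, so the delete step is written as NOT EXISTS instead of the MySQL multi-table DELETE described above; the preview step uses the same anti-join.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database; rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (BK INTEGER, NUM INTEGER);
    CREATE TABLE tableb (BK INTEGER, NUM INTEGER);
    INSERT INTO tablea VALUES (1, 2), (1, 3), (2, 2);
    INSERT INTO tableb VALUES (1, 2), (2, 9);
""")

# Step 1: preview the rows that would be removed, using the anti-join.
to_delete = sorted(conn.execute("""
    SELECT t.BK, t.NUM
    FROM tablea t
    LEFT JOIN tableb s ON s.BK = t.BK AND s.NUM = t.NUM
    WHERE s.BK IS NULL
""").fetchall())

# Step 2: delete the same rows. NOT EXISTS is used here because
# SQLite does not accept a JOIN inside DELETE.
cursor = conn.execute("""
    DELETE FROM tablea
    WHERE NOT EXISTS (SELECT 1 FROM tableb s
                      WHERE s.BK = tablea.BK AND s.NUM = tablea.NUM)
""")
deleted_count = cursor.rowcount
conn.commit()

remaining = conn.execute("SELECT BK, NUM FROM tablea").fetchall()
print(to_delete, deleted_count, remaining)
```

Comparing the preview with the number of rows actually deleted is a cheap sanity check before running this against real data.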
I want to join 3 or more tables:
table1 - 1 thousand records
table2 - 100 thousand records
table3 - 10 million records
Which of the following is best in terms of speed?
Note: pk and fk are the primary and foreign keys of the respective tables, and FILTER_CONDITION1 and FILTER_CONDITION2 are the row-restricting conditions normally found in the WHERE clause.
Case 1: taking the smaller tables first and joining the larger one later
Select table1.*,table2.*,table3.*
from table1
join table2
on table1.fk = table2.pk and FILTER_CONDITION1
join table3
on table2.fk = table3.pk and FILTER_CONDITION2
Case 2
Select table1.*,table2.*,table3.*
from table3
join table2
on table2.fk = table3.pk and FILTER_CONDITION2
join table1
on table1.fk = table2.pk and FILTER_CONDITION1
Case 3
Select table1.*,table2.*,table3.*
from table3
join table2
on table2.fk = table3.pk
join table1
on table1.fk = table2.pk
where FILTER_CONDITION1 and FILTER_CONDITION2
The cases you show are equivalent. What you are describing is in the end the same query and will be seen by the database as such: the database will make a query plan.
The best thing you can do is use EXPLAIN and check what your query actually does: this way you can see that they will probably be run the same way, and whether there is a bottleneck in there.
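As a rough illustration of the equivalence, the sketch below runs the three cases and compares their result sets. It assumes SQLite via Python's sqlite3 module rather than MySQL; the columns active and score, and the two filter conditions built on them, are hypothetical stand-ins for FILTER_CONDITION1 and FILTER_CONDITION2. The plan printed at the end is SQLite's EXPLAIN QUERY PLAN output, which looks different from MySQL's EXPLAIN.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; schema details are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (pk INTEGER PRIMARY KEY, fk INTEGER);
    CREATE TABLE table2 (pk INTEGER PRIMARY KEY, fk INTEGER, active INTEGER);
    CREATE TABLE table3 (pk INTEGER PRIMARY KEY, score INTEGER);
    INSERT INTO table1 VALUES (1, 10), (2, 11), (3, 12);
    INSERT INTO table2 VALUES (10, 100, 1), (11, 101, 0), (12, 102, 1);
    INSERT INTO table3 VALUES (100, 5), (101, 50), (102, 50);
""")

COLS = "table1.pk, table2.pk, table3.pk"
F1 = "table2.active = 1"   # hypothetical FILTER_CONDITION1
F2 = "table3.score > 10"   # hypothetical FILTER_CONDITION2

case1 = f"""SELECT {COLS} FROM table1
    JOIN table2 ON table1.fk = table2.pk AND {F1}
    JOIN table3 ON table2.fk = table3.pk AND {F2}"""

case2 = f"""SELECT {COLS} FROM table3
    JOIN table2 ON table2.fk = table3.pk AND {F2}
    JOIN table1 ON table1.fk = table2.pk AND {F1}"""

case3 = f"""SELECT {COLS} FROM table3
    JOIN table2 ON table2.fk = table3.pk
    JOIN table1 ON table1.fk = table2.pk
    WHERE {F1} AND {F2}"""

# For inner joins, the written table order and ON-versus-WHERE placement
# do not change which rows come back.
results = [sorted(conn.execute(q).fetchall()) for q in (case1, case2, case3)]
print(results)

# Inspect the plan the engine chose for one of the cases.
for row in conn.execute("EXPLAIN QUERY PLAN " + case1):
    print(row[-1])
```

This only demonstrates that the three cases return the same rows. Whether they also get the same plan on your MySQL server is what EXPLAIN there will tell you.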
As @Nanne notes in his answer, MySQL normally chooses the join order itself and usually gets it right. In rare cases, though, MySQL reads the tables in the wrong order, which can kill query performance. In that case you can try the following approach.
If you can filter the data in your bulky tables, table2 and table3 (suppose only 500 records remain after joining them and applying the filter), then filter first and join the filtered result to your small table. This can improve performance, but there are many possible combinations, so you have to check which join order filters out the most rows. EXPLAIN will help you find that out, and indexes will help you get the filtered data quickly.
After that, you can tell MySQL to join the tables in the order written in your query with the "SELECT STRAIGHT_JOIN ..." syntax, in the same way that we sometimes have to use FORCE INDEX when MySQL does not pick the right index.