Are these two MySQL SELECT queries functionally identical? - mysql

I have 3 tables. Tables 1 and 2 share columns 1 and 2. All 3 tables share column 2 (an ID column), but only table 3 contains column 3. I want all rows where tables 1 and 2 have equal values for columns 1 and 2, but only where Table3.Col3 (joined on the ID column 2) is equal to some specific value "X".
I have two queries which appear to be identical and which work for what I want, but I am asking the experts to make sure they are interchangeable:
SELECT *
from Table1 INNER JOIN Table2
ON Table1.Col1 = Table2.Col1 and Table1.Col2 = Table2.Col2
WHERE (Select Col3 from Table3 where Table2.Col2 = Table3.Col2) = "X"
SELECT *
from Table1 INNER JOIN Table2
ON Table1.Col1 = Table2.Col1 and Table1.Col2 = Table2.Col2
INNER JOIN Table3
ON Table1.Col2 = Table3.Col2
WHERE Table3.Col3 = "X"

I'm going to say yes, they are equivalent, and try to offer an explanation.
1st query:
The INNER JOIN selects only rows from Table1 and Table2 where both Col1 and Col2 match. The subquery is in fact a correlated subquery, which is evaluated for each row of the outer query, that is, for every row that survives the INNER JOIN. The WHERE clause then keeps only the outer rows for which the subquery returns Col3 = 'X' from Table3. This gives you exactly the data you want.
2nd query:
Slightly different. The first INNER JOIN works the same way as in the 1st query. However, you then INNER JOIN this result set with Table3, again only on rows where Table1.Col2 = Table3.Col2. Since Table1.Col2 = Table2.Col2, this yields an intermediate result set equivalent to the one defined by the correlated subquery in the 1st query. Finally, you filter on Table3.Col3 = 'X', which again gives the exact dataset you wanted. One caveat: this holds as long as Col2 is unique in Table3. If Table3 can hold several rows for the same Col2, the scalar subquery in the 1st query fails with a "Subquery returns more than 1 row" error, while the join in the 2nd query returns one row per match.
Hope this makes sense. Do correct me if my logic is wrong.
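To check the reasoning above on concrete data, here is a minimal sketch. It uses Python's sqlite3 module as a stand-in for MySQL, and the sample rows are invented for illustration. It also assumes Col2 is unique in Table3 (declared as a primary key here); without that assumption the two queries can diverge.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Col1 INTEGER, Col2 INTEGER);
    CREATE TABLE Table2 (Col1 INTEGER, Col2 INTEGER);
    CREATE TABLE Table3 (Col2 INTEGER PRIMARY KEY, Col3 TEXT);
    INSERT INTO Table1 VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO Table2 VALUES (1, 10), (2, 20), (3, 99);
    INSERT INTO Table3 VALUES (10, 'X'), (20, 'Y'), (30, 'X');
""")

# 1st query: filter with a correlated scalar subquery.
subquery_form = """
    SELECT Table1.Col1, Table1.Col2
    FROM Table1 INNER JOIN Table2
      ON Table1.Col1 = Table2.Col1 AND Table1.Col2 = Table2.Col2
    WHERE (SELECT Col3 FROM Table3 WHERE Table2.Col2 = Table3.Col2) = 'X'
    ORDER BY Table1.Col1
"""

# 2nd query: filter by joining Table3 directly.
join_form = """
    SELECT Table1.Col1, Table1.Col2
    FROM Table1 INNER JOIN Table2
      ON Table1.Col1 = Table2.Col1 AND Table1.Col2 = Table2.Col2
    INNER JOIN Table3
      ON Table1.Col2 = Table3.Col2
    WHERE Table3.Col3 = 'X'
    ORDER BY Table1.Col1
"""

rows_subquery = conn.execute(subquery_form).fetchall()
rows_join = conn.execute(join_form).fetchall()
print(rows_subquery)  # [(1, 10)]
print(rows_join)      # [(1, 10)]
```

Only the row (1, 10) matches in both Table1 and Table2 and maps to 'X' in Table3, so both forms return the same single row on this data.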

Related

Using WHERE or ON to filter

Example A
SELECT column_name(s)
FROM table1
LEFT JOIN table2
ON table1.key = table2.key
WHERE table2.key IS NULL;
Example B
SELECT column_name(s)
FROM table1
LEFT JOIN table2
ON table1.key = table2.key AND table2.key IS NULL;
Are the two SQL queries above logically equivalent? If not, what is the point of filtering within the ON?
Example A - A WHERE clause condition filters the final result. The LEFT JOIN is performed on key first, and the WHERE condition then keeps only the rows that have no matching key in table2. So Example A returns the table1 rows for which no matching data is present in table2, and the result has at most as many rows as table1.
Example B - A condition in the ON clause is used to join the two tables, so a LEFT JOIN gives you all the data from table1 and only the matching data from table2. But because you have used conflicting conditions in the ON clause (ON table1.key = table2.key AND table2.key IS NULL), Example B returns all rows of table1 and no data for table2 (NULL in all table2 columns). The final result has exactly as many rows as table1.
Example B will JOIN nothing because its ON clause can never be TRUE.
So, as far as table1 columns are concerned, your Example B is equal to:
SELECT my_column
FROM table1
whereas Example A returns all records from table1 that have no matching records in table2.
https://www.db-fiddle.com/f/jtCfu3ti1Rd3VDUB69iquF/0
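The difference between the two examples can be seen on a tiny data set. This is a sketch only: it runs against SQLite through Python's sqlite3 module rather than MySQL, the rows are invented, and the column is named key_col (a hypothetical name) instead of key to avoid the reserved word.

```python
import sqlite3

# Hypothetical data: key 1 exists only in table1, keys 2 and 3 in both tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (key_col INTEGER);
    CREATE TABLE table2 (key_col INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (2), (3);
""")

# Example A: the IS NULL test runs after the join, on the joined rows.
example_a = """
    SELECT table1.key_col, table2.key_col
    FROM table1
    LEFT JOIN table2
      ON table1.key_col = table2.key_col
    WHERE table2.key_col IS NULL
    ORDER BY table1.key_col
"""

# Example B: the IS NULL test is part of the join condition, which can
# then never be true, so every table1 row is kept with NULLs for table2.
example_b = """
    SELECT table1.key_col, table2.key_col
    FROM table1
    LEFT JOIN table2
      ON table1.key_col = table2.key_col AND table2.key_col IS NULL
    ORDER BY table1.key_col
"""

rows_a = conn.execute(example_a).fetchall()
rows_b = conn.execute(example_b).fetchall()
print(rows_a)  # [(1, None)]
print(rows_b)  # [(1, None), (2, None), (3, None)]
```

Example A keeps only the unmatched table1 row, while Example B keeps every table1 row and never attaches any table2 data.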

SQL Count + Left join + Group by ... Missing rows

I'm trying to list everything in table1 together with a count of the related records in table2.
Each row in table1 has an id, and each row in table2 has idintable1.
select table1.*, count(table2.idintable1) as total
from table1
left join table2 on table1.id=table2.idintable1
WHERE table1.deleted='0' AND table2.deleted=0
group by
table2.idintable1
My current problem is that rows from table1 with 0 records in table2 are not displayed.
I want them to be displayed.
The query that you want is:
select t1.*, count(t2.idintable1) as total
from table1 t1 left join
table2 t2
on t1.id = t2.idintable1 and t2.deleted = 0
where t1.deleted = 0
group by t1.id;
Here are the changes:
The condition on t2.deleted was moved to the on clause. Otherwise, this turns the outer join into an inner join.
The condition on t1.deleted remains in the where clause, because presumably you really do want this as a filter condition.
The group by clause is based on t1.id, because t2.idintable1 will be NULL when there are no matches. Just using t1.id is fine, assuming that id is unique (or a primary key) in table1.
The table aliases are not strictly necessary, but they make queries easier to write and to read.
You should GROUP BY table1.id.
The LEFT JOIN ensures all the rows from table1 appear in the result set. Those that do not have a pair in table2 will appear with NULL in field table2.idintable1. Because of that your original GROUP BY clause produces a single row for all the rows from table1 that do not appear in table2 (instead of one row for each row of table1).
You have fallen into the trap of MySQL's non-standard GROUP BY support.
Change your group by to list all columns of table1:
group by table1.id, table1.name, etc
or list the column positions of all table1 columns in the select:
group by 1, 2, 3, 4, etc
Or use a subquery to get the count vs the id, and join table1 to that.
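As an illustration of the points made in the answers, the sketch below runs the original query and a rewritten one side by side. Assumptions: SQLite via Python's sqlite3 stands in for MySQL, the rows and the name column are invented, and the rewritten query joins on t2.idintable1 with the t2.deleted condition in the ON clause and groups by t1.id.

```python
import sqlite3

# Hypothetical data: row 1 has two live children and one deleted child,
# row 2 has no children, row 3 is itself deleted.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT, deleted INTEGER);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, idintable1 INTEGER, deleted INTEGER);
    INSERT INTO table1 VALUES (1, 'a', 0), (2, 'b', 0), (3, 'c', 1);
    INSERT INTO table2 VALUES (1, 1, 0), (2, 1, 0), (3, 1, 1), (4, 3, 0);
""")

# Original shape: the table2 filter in WHERE discards the NULL-extended rows,
# so the LEFT JOIN behaves like an INNER JOIN.
original = """
    SELECT table1.id, COUNT(table2.idintable1) AS total
    FROM table1
    LEFT JOIN table2 ON table1.id = table2.idintable1
    WHERE table1.deleted = 0 AND table2.deleted = 0
    GROUP BY table2.idintable1
    ORDER BY table1.id
"""

# Rewritten shape: the table2 filter lives in ON, and grouping is by t1.id.
rewritten = """
    SELECT t1.id, COUNT(t2.idintable1) AS total
    FROM table1 t1
    LEFT JOIN table2 t2
      ON t1.id = t2.idintable1 AND t2.deleted = 0
    WHERE t1.deleted = 0
    GROUP BY t1.id
    ORDER BY t1.id
"""

rows_original = conn.execute(original).fetchall()
rows_rewritten = conn.execute(rewritten).fetchall()
print(rows_original)   # [(1, 2)]
print(rows_rewritten)  # [(1, 2), (2, 0)]
```

The rewritten query keeps row 2 with a count of 0, because COUNT over a column ignores the NULLs produced by the outer join.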

3 tables and 2 left joins

Query 1:
SELECT sum(total_revenue_usd)
FROM table1 c
WHERE c.irt1_search_campaign_id IN (
SELECT assign_id
FROM table2 ga
LEFT JOIN table3 d
ON d.campaign_id = ga.assign_id
)
Query 2:
SELECT sum(total_revenue_usd)
FROM table1 c
LEFT JOIN table2 ga
ON c.irt1_search_campaign_id = ga.assign_id
LEFT JOIN table3 d
ON d.campaign_id = ga.assign_id
Query 1 gives me the correct result, whereas I need it in the second style, without using 'in'. However, Query 2 doesn't give the same result.
How can I rewrite the first query without using 'in'?
The reason is that this small query is part of a much larger query, and there are other conditions that won't work with 'in'.
You could try something along the lines of
SELECT sum(total_revenue_usd)
FROM table1 c
JOIN
(
SELECT DISTINCT ga.assign_id
FROM table2 ga
JOIN table3 d
ON d.campaign_id = ga.assign_id
) x
ON c.irt1_search_campaign_id = x.assign_id
The queries do very different things:
The first query sums the total_revenue_usd from table1 where irt1_search_campaign_id exists in table2 as assign_id. (The outer join to table3 is absolutely unnecessary, by the way, because it doesn't change whether a table2.assign_id exists or not.) As you look for existence in table2, you can of course replace IN with EXISTS.
The second query gets you combinations of table1, table2 and table3. So, in case there are two records in table2 for an entry in table1 and three records in table3 for each of the two table2 records, you will get six records for the one table1 record. Thus you sum its total_revenue_usd sixfold. This is not what you want. Don't join table1 with the other tables.
EDIT: Here is the query using an exists clause. As mentioned, outer joining table3 doesn't alter the results.
Select sum(total_revenue_usd)
from table1 c
where exists
(
select *
from table2 ga
-- left join table3 d on d.campaign_id = ga.assign_id
where ga.assign_id = c.irt1_search_campaign_id
);
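The row multiplication described above can be reproduced with a small sketch. The data is invented for illustration and SQLite (through Python's sqlite3) stands in for MySQL: campaign 1 has two table2 rows and three table3 rows, so the join style counts its revenue six times.

```python
import sqlite3

# Hypothetical data chosen to trigger the fan-out described in the answer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (irt1_search_campaign_id INTEGER, total_revenue_usd INTEGER);
    CREATE TABLE table2 (assign_id INTEGER);
    CREATE TABLE table3 (campaign_id INTEGER);
    INSERT INTO table1 VALUES (1, 100), (2, 50), (3, 7);
    INSERT INTO table2 VALUES (1), (1), (2);
    INSERT INTO table3 VALUES (1), (1), (1);
""")

# Query 1: IN only tests existence, so each table1 row is summed at most once.
query_in = """
    SELECT SUM(total_revenue_usd)
    FROM table1 c
    WHERE c.irt1_search_campaign_id IN (
        SELECT assign_id
        FROM table2 ga
        LEFT JOIN table3 d ON d.campaign_id = ga.assign_id
    )
"""

# Query 2: the joins repeat a table1 row once per matching combination.
query_joins = """
    SELECT SUM(total_revenue_usd)
    FROM table1 c
    LEFT JOIN table2 ga ON c.irt1_search_campaign_id = ga.assign_id
    LEFT JOIN table3 d ON d.campaign_id = ga.assign_id
"""

# EXISTS rewrite: same existence test as IN, without using IN.
query_exists = """
    SELECT SUM(total_revenue_usd)
    FROM table1 c
    WHERE EXISTS (
        SELECT 1 FROM table2 ga
        WHERE ga.assign_id = c.irt1_search_campaign_id
    )
"""

sum_in = conn.execute(query_in).fetchone()[0]
sum_joins = conn.execute(query_joins).fetchone()[0]
sum_exists = conn.execute(query_exists).fetchone()[0]
print(sum_in, sum_joins, sum_exists)  # 150 657 150
```

In the join version, campaign 1 contributes 6 x 100, campaign 2 contributes 50, and the unmatched campaign 3 still contributes 7 because of the LEFT JOIN, giving 657 instead of 150.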

MySql View - Value from one column where some other column is max

I have Table1 and Table2 related on Table1.ID. There can be zero or more Table2 records for a given Table1.ID. I have a view where I want to get Table2.Value where Table2.ID is max for a given Table1.ID. A friend suggested a derived table, but that requires a subquery in the from clause, and MySQL doesn't like that. Are there any other ways to do this? I tried setting up a secondary view to take the place of the subquery, but it seems very slow. I also tried using a having clause to test Table2.ID = MAX(Table2.ID), but it doesn't recognize the column unless I put it into the group by, which screws everything else up.
SELECT t1.*, t2a.*
FROM Table1 t1
LEFT JOIN Table2 t2a
ON (t1.table1_id = t2a.table1_id)
LEFT JOIN Table2 t2b
ON (t1.table1_id = t2b.table1_id AND t2a.table2_id < t2b.table2_id)
WHERE t2b.table2_id IS NULL
AND t1.table1_id = ?;
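The self-join above keeps a t2a row only when no t2b row with a higher table2_id exists for the same table1_id. The sketch below tries it on invented data, with SQLite through Python's sqlite3 standing in for MySQL. The column names table1_id, table2_id and value follow the query and the question, but the schema itself is an assumption.

```python
import sqlite3

# Hypothetical data: table1_id 1 has three Table2 rows, table1_id 2 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (table1_id INTEGER PRIMARY KEY);
    CREATE TABLE Table2 (table2_id INTEGER PRIMARY KEY, table1_id INTEGER, value TEXT);
    INSERT INTO Table1 VALUES (1), (2);
    INSERT INTO Table2 VALUES (1, 1, 'old'), (2, 1, 'mid'), (3, 1, 'new');
""")

# t2b looks for a "newer" sibling of t2a; when none is found (t2b is NULL),
# t2a is the row with the highest table2_id for that table1_id.
latest_value = """
    SELECT t1.table1_id, t2a.table2_id, t2a.value
    FROM Table1 t1
    LEFT JOIN Table2 t2a
      ON (t1.table1_id = t2a.table1_id)
    LEFT JOIN Table2 t2b
      ON (t1.table1_id = t2b.table1_id AND t2a.table2_id < t2b.table2_id)
    WHERE t2b.table2_id IS NULL
      AND t1.table1_id = ?
"""

rows_for_1 = conn.execute(latest_value, (1,)).fetchall()
rows_for_2 = conn.execute(latest_value, (2,)).fetchall()
print(rows_for_1)  # [(1, 3, 'new')]
print(rows_for_2)  # [(2, None, None)]
```

Because both joins are LEFT JOINs, a Table1 row with no Table2 records is still returned, with NULLs in place of the Table2 columns.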

How to retrieve non-matching results in mysql

I'm sure this is straightforward, but how do I write a query in MySQL that joins two tables and then returns only those records from the first table that don't match? I want it to be something like:
Select tid from table1 inner join table2 on table2.tid = table1.tid where table1.tid != table2.tid;
but this doesn't seem to make a lot of sense!
You can use a left outer join to accomplish this:
select
t1.tid
from
table1 t1
left outer join table2 t2 on
t1.tid = t2.tid
where
t2.tid is null
What this does is it takes your first table (table1), joins it with your second table (table2), and fills in null for the table2 columns in any row in table1 that doesn't match a row in table2. Then, it filters that out by selecting only the table1 rows where no match could be found.
Alternatively, you can also use not exists:
select
t1.tid
from
table1 t1
where
not exists (select 1 from table2 t2 where t2.tid = t1.tid)
This performs an anti-join (sometimes called an anti semi join), and will essentially do the same thing that the left outer join does. Depending on your indexes, one may be faster than the other, but both are viable options. MySQL has some good documentation on optimizing joins, so you should check that out.
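Both approaches can be compared on a small example. This is only a sketch with invented rows, run against SQLite through Python's sqlite3 rather than MySQL, so it says nothing about which form is faster on a real MySQL server.

```python
import sqlite3

# Hypothetical data: tid 2 exists in both tables, tids 1 and 3 only in table1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (tid INTEGER);
    CREATE TABLE table2 (tid INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (2);
""")

# LEFT OUTER JOIN, then keep only the rows where no table2 match was found.
left_join_form = """
    SELECT t1.tid
    FROM table1 t1
    LEFT OUTER JOIN table2 t2 ON t1.tid = t2.tid
    WHERE t2.tid IS NULL
    ORDER BY t1.tid
"""

# NOT EXISTS: keep table1 rows for which the correlated subquery is empty.
not_exists_form = """
    SELECT t1.tid
    FROM table1 t1
    WHERE NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.tid = t1.tid)
    ORDER BY t1.tid
"""

rows_left_join = conn.execute(left_join_form).fetchall()
rows_not_exists = conn.execute(not_exists_form).fetchall()
print(rows_left_join)   # [(1,), (3,)]
print(rows_not_exists)  # [(1,), (3,)]
```

Both queries return the table1 rows whose tid has no counterpart in table2.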
