Why does LEFT JOINing leave the second table's id NULL? - mysql

Please refer to the following answer first: this
That answer shows a way to pick one row per distinct value of a column, together with the other columns of that row. Which row is picked for each distinct value (the greatest or the smallest) is controlled by the condition t1.lastname > t2.lastname or t1.lastname < t2.lastname, which I understand.
I'm still practicing SQL and I have questions in regard to the method provided in the above link.
When I tried to select t2.id, it was all NULL. I can't comprehend this: isn't t2 basically the same table as t1? If so, how is it possible that its ids became NULL, but not t1's ids?
Why is it necessary to check WHERE t2.id IS NULL when all t2.id is going to return NULL anyway?
This is the part I think I have a slight idea about, but please correct me if I'm wrong. The above method worked (let's talk about the descending order here) because, firstly, I LEFT JOINed t1 and t2 together based on their ids. Secondly, I also checked that t1.lastname has to be greater (>) than t2.lastname, which I assume compares ASCII or Unicode values, and voila, it returns only one result: the one that has the higher value. Now one more question: does it compare each t1.lastname against every t2.lastname, and then return nothing if just one of the t2.lastname values makes the condition invalid?
I think that I'm missing something about something here. Could someone please help me? Thank you in advance.

LEFT JOIN shows the left part of the join (the values from the left table) and, for the right part of the join, shows:
values from the right table if the join conditions are fulfilled, or
NULL values if the join conditions are not fulfilled.
For example:
INSERT INTO #test (id, firstname, lastname)
VALUES
(1, 'A', 'A'),
(2, 'B', 'B'),
(3, 'A', 'B'),
(4, 'B', 'C');
SELECT t1.*, t2.*
FROM #test AS t1
LEFT JOIN #test AS t2
ON t1.firstname = t2.firstname
AND t1.lastname < t2.lastname;
It will show:
id firstname lastname id firstname lastname
1 A A 3 A B
2 B B 4 B C
3 A B NULL NULL NULL
4 B C NULL NULL NULL
It shows all rows from t1, but the last two rows have NULLs for t2 because the join condition t1.lastname < t2.lastname is not fulfilled. For A B there are no rows with the same firstname and a lastname greater than B, and for B C there are no rows with the same firstname and a lastname greater than C.
If you now add WHERE condition:
SELECT t1.*, t2.*
FROM #test AS t1
LEFT JOIN #test AS t2
ON t1.firstname = t2.firstname
AND t1.lastname < t2.lastname
WHERE t2.id IS NULL;
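you limit your result to the rows that did not fulfill the join conditions. Those are exactly the rows of t1 for which no row with a greater lastname exists, which is why t2.id is NULL in every row that remains.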
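To see both steps side by side, here is a minimal sketch that replays the example with Python's built-in sqlite3 module instead of MySQL. The table name test (standing in for #test) and the variable names are assumptions made for illustration; the rows and the join condition are the ones from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
    INSERT INTO test (id, firstname, lastname) VALUES
        (1, 'A', 'A'), (2, 'B', 'B'), (3, 'A', 'B'), (4, 'B', 'C');
""")

# Self-join: pair each row with rows of the same firstname and a greater lastname.
join_sql = """
    SELECT t1.id, t1.firstname, t1.lastname, t2.id
    FROM test AS t1
    LEFT JOIN test AS t2
      ON t1.firstname = t2.firstname
     AND t1.lastname < t2.lastname
"""

# Without the WHERE clause every t1 row appears; t2.id is NULL (None in Python)
# only where no row with a greater lastname exists.
all_rows = conn.execute(join_sql + " ORDER BY t1.id").fetchall()

# With the WHERE clause only the unmatched rows survive, i.e. the row with the
# greatest lastname for each firstname.
greatest = conn.execute(join_sql + " WHERE t2.id IS NULL ORDER BY t1.id").fetchall()

for row in all_rows:
    print(row)
print(greatest)
```

So t2.id is not NULL for every row of the join; it only looks that way once the WHERE t2.id IS NULL filter has thrown away the rows that found a match.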

Related

How to find latest record from two different tables

There are 2 tables, table1 and table2.
The first column, foreign_id, is the common column between both tables.
The data types of all the related columns are the same.
Now, we need to find the latest record based on the timestamp column for each foreign_id, taking rows from both tables, for example as below, plus an extra column from_table, which shows which table the row was selected from.
One method that I can think of is:
combine both tables,
then find the latest row for each foreign_id.
Is there any better way to do this, as there could be 5000+ rows in both tables?
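The "combine, then pick the latest" method described in the question can be sketched as follows. This is only an illustration under stated assumptions: it runs on Python's built-in sqlite3 rather than MySQL, the sample rows are invented, and the WITH clause would need MySQL 8.0 or later (on older versions the UNION ALL would have to be repeated as a derived table).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (foreign_id INTEGER, timestamp TEXT);
    CREATE TABLE table2 (foreign_id INTEGER, timestamp TEXT);
    -- Hypothetical sample data, not taken from the question.
    INSERT INTO table1 VALUES
        (1, '2020-01-01 10:00:00'), (1, '2020-01-03 10:00:00'), (2, '2020-01-02 10:00:00');
    INSERT INTO table2 VALUES
        (1, '2020-01-02 10:00:00'), (2, '2020-01-05 10:00:00'), (3, '2020-01-04 10:00:00');
""")

latest_sql = """
    WITH combined AS (
        -- Step 1: stack both tables and remember where each row came from.
        SELECT foreign_id, timestamp, 'table1' AS from_table FROM table1
        UNION ALL
        SELECT foreign_id, timestamp, 'table2' AS from_table FROM table2
    )
    -- Step 2: keep the rows whose timestamp equals the per-foreign_id maximum.
    SELECT c.foreign_id, c.timestamp, c.from_table
    FROM combined AS c
    JOIN (
        SELECT foreign_id, MAX(timestamp) AS max_ts
        FROM combined
        GROUP BY foreign_id
    ) AS m
      ON m.foreign_id = c.foreign_id AND m.max_ts = c.timestamp
    ORDER BY c.foreign_id
"""
latest = conn.execute(latest_sql).fetchall()
for row in latest:
    print(row)
```

Because the rows are stacked with UNION ALL rather than joined, a foreign_id that exists in only one of the tables (3 in the sample data) is still reported. If two rows tie on the maximum timestamp, both are returned.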
Try this:
SELECT
t1.foreign_id,
MAX(t1.timestamp) max_time_table1,
MAX(t2.timestamp) max_time_table2
FROM table1 t1
LEFT JOIN table2 t2 USING (foreign_id)
GROUP BY foreign_id;
Note: This can be a bit slow if the number of records is quite large.
However you can also use this:
SELECT a.foreign_id,
IF(a.max_time_table1 > a.max_time_table2, a.max_time_table1, a.max_time_table2) latest_update
FROM(
SELECT
t1.foreign_id,
SUBSTRING_INDEX(GROUP_CONCAT(t1.timestamp ORDER BY t1.id DESC),',',1) max_time_table1,
SUBSTRING_INDEX(GROUP_CONCAT(t2.timestamp ORDER BY t2.id DESC),',',1) max_time_table2
FROM table1 t1
LEFT JOIN table2 t2 USING (foreign_id)
GROUP BY foreign_id) a;
Make sure the id columns in both tables are auto_increment.
From your explanation, this would do then:
SELECT
foreign_id,
CASE
WHEN max_time_table1 < max_time_table2 THEN max_time_table2
WHEN max_time_table2 < max_time_table1 THEN max_time_table1
END as timestamps
FROM(
SELECT
t1.foreign_id,
SUBSTRING_INDEX(GROUP_CONCAT(t1.timestamp ORDER BY t1.id DESC),',',1) max_time_table1,
SUBSTRING_INDEX(GROUP_CONCAT(t2.timestamp ORDER BY t2.id DESC),',',1) max_time_table2
FROM table1 t1
LEFT JOIN table2 t2 USING (foreign_id)
GROUP BY foreign_id) a;

MySQL: COALESCE within JOIN

I've been trying to create a query using COALESCE with multiple forms of matching in order to join two tables -- but as far as I can tell, it hasn't been working.
Can somebody tell me what's wrong with this query?
SELECT *
FROM t1
LEFT JOIN t2 ON COALESCE(t1.id,t1.phone,t1.address) = COALESCE(t2.id,t2.phone,t2.address)
Something like that. The hope would be that the query would look to see if the unique IDs in t1 and t2 matched first, and if they didn't, it would move on to see if the phones matched, etc. It would be very helpful to attempt to match on multiple sets of criteria and to return the ones that matched any of the column and to only return the NULLs from t1 if the query couldn't find a match at all.
Edit:
By "not working," I meant that it seems like the ID matching works -- where it will return the data from t2 (and not NULLs) if the unique identifiers matched, but it doesn't move on to attempt to match on phone number or address line; this is obviously likely because the t1 table isn't returning any NULL values.
As for the table definitions, t1 is a smaller subset of data that likely lives in the larger t2 table. Let's say that t1 is a table of about 100 people with only a few columns: name, phone, address, id (though id doesn't exist in every row), whereas t2 is a larger table of about 30,000 rows with many more columns (name, phone, address, id, job, volunteer, email, notes, etc.), and I'm trying to find the 100 within the 30k.
I suspect that you want this:
SELECT *
FROM t1 LEFT JOIN
t2
ON t1.id = t2.id OR t1.phone = t2.phone or t1.address = t2.address;
This will match two rows if any of the keys match. However, you might want:
SELECT *
FROM t1 LEFT JOIN
t2
ON (t1.id = t2.id) OR
((t1.id is null or t2.id is null) AND t1.phone = t2.phone) or
(((t1.id is null or t2.id is null) and (t1.phone is null or t2.phone is null)) and t1.address = t2.address
)
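Your query could be failing for a number of reasons. One possibility is type incompatibility. Another is that different rows have different distributions of NULL values, so you end up comparing different columns, such as t1.id to t2.address. Note also that COALESCE only falls through to the next column when a value is NULL, not when it fails to match, so a row with a non-NULL id is only ever compared on id.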
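The difference between the COALESCE join and the OR join can be reproduced with a minimal sketch. It uses Python's built-in sqlite3 rather than MySQL, and the two sample rows are invented for illustration: the t1 row has an id that does not exist in t2, but its phone number does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, phone TEXT, address TEXT);
    CREATE TABLE t2 (id INTEGER, phone TEXT, address TEXT);
    -- Hypothetical data: the ids differ, the phone numbers are equal.
    INSERT INTO t1 VALUES (10, '555-0100', '1 Main St');
    INSERT INTO t2 VALUES (99, '555-0100', '9 Elm St');
""")

# COALESCE picks the first non-NULL value on each side (10 and 99 here),
# so the phone columns are never compared and the join finds no match.
coalesce_rows = conn.execute("""
    SELECT t1.id, t2.id
    FROM t1
    LEFT JOIN t2
      ON COALESCE(t1.id, t1.phone, t1.address) = COALESCE(t2.id, t2.phone, t2.address)
""").fetchall()

# The OR form tests each key separately, so the equal phone numbers match.
or_rows = conn.execute("""
    SELECT t1.id, t2.id
    FROM t1
    LEFT JOIN t2
      ON t1.id = t2.id OR t1.phone = t2.phone OR t1.address = t2.address
""").fetchall()

print(coalesce_rows)
print(or_rows)
```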

INNER JOIN table and return all joined results as CSV (not separate records)

I have the following two tables with the following data. I would like to return all data when the two tables are joined. For instance, SELECT t1.data, t2.data FROM t1 INNER JOIN t2 ON t2.t1_id=t1.id WHERE t1.id=1; Now the tricky part: I don't want to return 3 rows but only one, and I would like t2.data to be comma-separated values. For instance, the above query would return "bla1","hi1,hi2,hi3" (if there are no join results, then null or "", and not ","). Is this fairly easy with just SQL, or am I better off using PHP, etc.? If it can be done with just SQL, how? Thanks
t1
-id
-data
t2
-id
-t1_id
-data
t1
-id=1, data="bla1"
-id=2, data="bla2"
-id=3, data="bla3"
t2
-id=1, t1_id=1, data=hi1
-id=2, t1_id=1, data=hi2
-id=3, t1_id=1, data=hi3
-id=4, t1_id=2, data=hi4
You can use GROUP_CONCAT, which concatenates the non-NULL values from a group using a delimiter (a comma by default):
SELECT t1.data, GROUP_CONCAT(t2.data)
FROM t1 JOIN t2
ON t1.id = t2.t1_id
WHERE t1.id = 1;
Example on SQLFiddle: http://sqlfiddle.com/#!2/68154/5
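The question also asks for null when there are no joined rows, which the inner join above cannot produce. A minimal sketch of one way to get that, assuming a LEFT JOIN is acceptable: it uses the question's sample data but runs on Python's built-in sqlite3 (whose GROUP_CONCAT, like MySQL's, yields NULL when the group has no non-NULL values), and the helper name joined_csv is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, t1_id INTEGER, data TEXT);
    INSERT INTO t1 VALUES (1, 'bla1'), (2, 'bla2'), (3, 'bla3');
    INSERT INTO t2 VALUES (1, 1, 'hi1'), (2, 1, 'hi2'), (3, 1, 'hi3'), (4, 2, 'hi4');
""")

def joined_csv(t1_id):
    """Return (t1.data, comma-separated t2.data) for one t1 row (hypothetical helper)."""
    return conn.execute("""
        SELECT t1.data, GROUP_CONCAT(t2.data)
        FROM t1
        LEFT JOIN t2 ON t2.t1_id = t1.id   -- LEFT JOIN keeps t1 rows without children
        WHERE t1.id = ?
        GROUP BY t1.id, t1.data
    """, (t1_id,)).fetchone()

print(joined_csv(1))  # one row, with the three t2 values joined by commas
print(joined_csv(3))  # t1 row without t2 rows: the second column is None
```

The order of the values inside the concatenated string is not guaranteed here; in MySQL an ORDER BY inside GROUP_CONCAT can pin it down.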
You can use CONCAT or CONCAT_WS to "stick" column values together and GROUP_CONCAT to "stick" row values together.
Example:
SELECT
x, y, z
FROM
table;
Turns into:
SELECT
GROUP_CONCAT(CONCAT('"', x, '","', y, '","', z, '"') SEPARATOR '\n')
FROM
table
GROUP BY x; -- considering x would be the unique row identifier
Without the GROUP BY, the above example returns exactly one cell (one row with one column); with GROUP BY x it returns one such cell per distinct value of x.
SELECT concat(t1.data,',',group_concat(t2.data)) FROM t1 INNER JOIN t2 ON
t2.t1_id=t1.id WHERE t1.id=1
group by t1.data

Selecting record from one table and insert it into different table with different structure avoiding duplicates

I have two tables with different structure, say property_bid and sc_property_queries.
sc_property_queries holds values from property_bid as well as from another table. There is a field called query_method in the destination table which tells which table a row came from, and the field raw_id holds the ID from the source table. What I want to do is select from the property_bid table and insert into sc_property_queries, but new items only, i.e. avoiding duplicates based on raw_id and query_method. Below is my MySQL code, which doesn't seem to work:
INSERT INTO sc_property_queries (
`property_id`,
`raw_id`, `query_method`,
`contact_fullname`,
`contact_address`,
`contact_email`,
`contact_phone`,
`contact_date`,
`entry_date`,
`title`,
`query_status`,
`remarks`,
`description`
)
SELECT
t1.property_id,
t1.id,
'web-bids',
t1.fullname,
'n/a',
t1.email,
t1.phone,
t1.on_date,
NOW(),
'n/a',
'1',
'n/a',
t1.comment
FROM
property_bid t1
LEFT JOIN sc_property_queries t2
ON (t1.id = t2.raw_id)
WHERE t2.query_method='web-bids' AND t2.raw_id IS NULL;
This query should return all the rows from property_bid that do not exist in sc_property_queries, but it is not doing anything. Can anybody shed light on this?
WHERE t2.raw_id IS NULL restricts your resultset to only those records that do not exist in t2; therefore t2.* are all NULL. Hence this criterion cannot be true simultaneously with the other criterion WHERE t2.query_method='web-bids'.
Perhaps you meant to include that criterion in the join:
FROM
property_bid t1
LEFT JOIN sc_property_queries t2
ON (t1.id = t2.raw_id AND t2.query_method='web-bids')
WHERE t2.raw_id IS NULL
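The effect of moving the query_method test from WHERE into ON can be checked with a small sketch. It runs on Python's built-in sqlite3 rather than MySQL, uses cut-down versions of the two tables, and the sample rows (including the 'phone' query method) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE property_bid (id INTEGER PRIMARY KEY, fullname TEXT);
    CREATE TABLE sc_property_queries (raw_id INTEGER, query_method TEXT);
    INSERT INTO property_bid VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    -- Hypothetical state: bid 1 was already copied; raw_id 2 belongs to another source.
    INSERT INTO sc_property_queries VALUES (1, 'web-bids'), (2, 'phone');
""")

# Original form: both tests in WHERE. For unmatched rows query_method is NULL,
# so the two conditions can never hold together and nothing is returned.
where_version = conn.execute("""
    SELECT t1.id
    FROM property_bid t1
    LEFT JOIN sc_property_queries t2 ON t1.id = t2.raw_id
    WHERE t2.query_method = 'web-bids' AND t2.raw_id IS NULL
""").fetchall()

# Corrected form: the query_method test is part of the join condition,
# so WHERE only has to pick the bids that found no 'web-bids' partner.
on_version = conn.execute("""
    SELECT t1.id
    FROM property_bid t1
    LEFT JOIN sc_property_queries t2
      ON t1.id = t2.raw_id AND t2.query_method = 'web-bids'
    WHERE t2.raw_id IS NULL
    ORDER BY t1.id
""").fetchall()

print(where_version)
print(on_version)
```

Bid 2 counts as new in the corrected form even though raw_id 2 exists, because that existing row came from a different query_method.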
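You don't need a JOIN; just use NOT IN (this assumes raw_id is never NULL, since a NULL in the subquery result makes NOT IN match no rows):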
FROM
property_bid t1
where
t1.id not in (select t2.raw_id from sc_property_queries t2 where t2.query_method='web-bids')

Left join with condition

Suppose I have these tables
create table bug (
id int primary key,
name varchar(20)
)
create table blocking (
pk int primary key,
id int,
name varchar(20)
)
insert into bug values (1, 'bad name')
insert into bug values (2, 'bad condition')
insert into bug values (3, 'about box')
insert into blocking values (0, 1, 'qa bug')
insert into blocking values (1, 1, 'doc bug')
insert into blocking values (2, 2, 'doc bug')
and I'd like to join the tables on id columns and the result should be like this:
id name blockingName
----------- -------------------- --------------------
1 bad name qa bug
2 bad condition NULL
3 about box NULL
This means:
I'd like to return all rows from #bug
there should be only 'qa bug' value in column 'blockingName' or NULL (if no matching row in #blocking was found)
My naive select was like this:
select * from #bug t1
left join #blocking t2 on t1.id = t2.id
where t2.name is null or t2.name = 'qa bug'
but this does not work, because it seems that the condition is first applied to #blocking table and then it is joined.
What is the simplest/typical solution for this problem? (I have a solution with nested select, but I hope there is something better)
Simply put the "qa bug" criteria in the join:
select t1.*, t2.name from #bug t1
left join #blocking t2 on t1.id = t2.id AND t2.name = 'qa bug'
The correct select is:
create table bug (
id int primary key,
name varchar(20)
)
insert into bug values (1, 'bad name')
insert into bug values (2, 'bad condition')
insert into bug values (3, 'about box')
CREATE TABLE blocking
(
pk int IDENTITY(1,1)PRIMARY KEY ,
id int,
name varchar(20)
)
insert into blocking values (1, 'qa bug')
insert into blocking values (1, 'doc bug')
insert into blocking values (2, 'doc bug')
select
t1.id, t1.name,
(select b.name from blocking b where b.id=t1.id and b.name='qa bug')
from bug t1
It looks like you want to select only one row from #blocking and join that to #bug. I would do:
select t1.id, t1.name, t2.name as `blockingName`
from `#bug` t1
left join (select * from `#blocking` where name = "qa bug") t2
on t1.id = t2.id
select *
from #bug t1
left join #blocking t2 on t1.id = t2.id and t2.name = 'qa bug'
Make sure the inner query only returns one row.
You may have to add a TOP 1 to it if it returns more than one.
select
t1.id, t1.name,
(select b.name from #blocking b where b.id=t1.id and b.name='qa bug')
from #bug t1
Here's a demo: http://sqlfiddle.com/#!2/414e6/1
select
bug.id,
bug.name,
blocking.name as blockingType
from
bug
left outer join blocking on
bug.id = blocking.id AND
blocking.name = 'qa bug'
order by
bug.id
By adding the "blocking.name" condition to the left outer join, rather than to the where clause, you indicate that it should also be considered "outer", or optional. When it is part of the where clause, it is considered required (which is why the rows with null values were being filtered out).
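Both placements of the condition can be compared with a minimal sketch. It uses the question's sample rows but runs on Python's built-in sqlite3, with plain table names instead of #bug and #blocking; the variable names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bug (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE blocking (pk INTEGER PRIMARY KEY, id INTEGER, name TEXT);
    INSERT INTO bug VALUES (1, 'bad name'), (2, 'bad condition'), (3, 'about box');
    INSERT INTO blocking VALUES (0, 1, 'qa bug'), (1, 1, 'doc bug'), (2, 2, 'doc bug');
""")

# Condition in WHERE: bug 2 joins to its 'doc bug' row first, and that joined
# row is then rejected by the filter, so bug 2 disappears from the result.
filter_in_where = conn.execute("""
    SELECT t1.id, t1.name, t2.name
    FROM bug t1
    LEFT JOIN blocking t2 ON t1.id = t2.id
    WHERE t2.name IS NULL OR t2.name = 'qa bug'
    ORDER BY t1.id
""").fetchall()

# Condition in ON: only 'qa bug' rows are eligible to join, and every bug
# without one still appears with NULL (None) in the blocking column.
filter_in_on = conn.execute("""
    SELECT t1.id, t1.name, t2.name AS blockingName
    FROM bug t1
    LEFT JOIN blocking t2 ON t1.id = t2.id AND t2.name = 'qa bug'
    ORDER BY t1.id
""").fetchall()

print(filter_in_where)
print(filter_in_on)
```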
BTW - sqlfiddle.com is my site.