Joining three tables such that extra matches are discarded? - mysql

How can I write a query to give the results of three tables such that there's only one result per "line"?
The tables are:
T1 (ID, name, IP)
T2 (ID, date_joined)
T3 (ID, address, date_modified)
The relations are:
T1-T2 1:1, T1-T3 1:M - there can be many address rows per ID in T3.
What I want is a listing of all users with the fields above, but IF they have an address, I only want to record ONE (bonus would be if it is the latest one based on T3.date_modified).
So I should end up with exactly the number of records in T1 (happens to be equal to T2 in this case) and no more.
I tried:
select t.ID, t.name, t.IP, tt.ID, tt.date_joined, ttt.ID, ttt.address
from T1 t JOIN T2 tt ON (t.ID = tt.ID) JOIN T3 ttt ON (t.ID = ttt.ID)
And every sensible combination of LEFT, RIGHT, INNER, etc. joins I could think of! I keep getting multiple duplicates because of T3.

This query should work:
select
t1.ID, t1.name, t1.IP, t2.date_joined, t3x.address
from t1
join t2 on t1.ID = t2.id
left join (
select t3.*
from t3
join (
select id, max(date_modified) max_date
from t3
group by id
) max_t3 on t3.id = max_t3.id and t3.date_modified = max_t3.max_date
) t3x on t1.ID = t3x.id
First you do the normal join between t1 and t2 and then you left join with a derived table (t3x) that is the set of t3 rows having the latest date.
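To see the derived-table join in action, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows, the ISO date strings and the in-memory database are assumptions made up for illustration; only the query shape comes from the answer above.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER, name TEXT, ip TEXT);
CREATE TABLE t2 (id INTEGER, date_joined TEXT);
CREATE TABLE t3 (id INTEGER, address TEXT, date_modified TEXT);
INSERT INTO t1 VALUES (1, 'ann', '10.0.0.1'), (2, 'bob', '10.0.0.2'), (3, 'cy', '10.0.0.3');
INSERT INTO t2 VALUES (1, '2020-01-01'), (2, '2020-02-01'), (3, '2020-03-01');
-- user 1 has two addresses, user 2 has none, user 3 has one
INSERT INTO t3 VALUES (1, 'old street', '2021-01-01'),
                      (1, 'new street', '2022-01-01'),
                      (3, 'only street', '2021-06-01');
""")

latest_rows = conn.execute("""
SELECT t1.id, t1.name, t1.ip, t2.date_joined, t3x.address
FROM t1
JOIN t2 ON t1.id = t2.id
LEFT JOIN (
    -- keep only the t3 row carrying the latest date_modified per id
    SELECT t3.*
    FROM t3
    JOIN (
        SELECT id, MAX(date_modified) AS max_date
        FROM t3
        GROUP BY id
    ) max_t3 ON t3.id = max_t3.id AND t3.date_modified = max_t3.max_date
) t3x ON t1.id = t3x.id
ORDER BY t1.id
""").fetchall()

for row in latest_rows:
    print(row)
```

One caveat of this technique: if two t3 rows for the same id share the same latest date_modified, both survive the inner join, so a tie-breaker column would be needed to guarantee exactly one row per user.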

So T2 is actually not relevant here. You just need a way to join from T1 to T3 in a way that gets you at most one T3 row per T1 row.
One way of doing this would be:
select
T1.*,
(select address from T3 where T3.ID=T1.ID order by date_modified desc limit 1)
from T1;
This likely won't be very efficient, since it is a correlated subquery, but you may not care depending on the size of your dataset.
It's also only good for getting one column from T3, so if you had Address, City, and State, you'd have to figure out something else.
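The correlated-subquery approach can be tried out with the sketch below, which uses Python's built-in sqlite3 module in place of MySQL. The sample data and the latest_address alias are assumptions for illustration; LIMIT 1 works the same way in both engines.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T1 (ID INTEGER, name TEXT, IP TEXT);
CREATE TABLE T3 (ID INTEGER, address TEXT, date_modified TEXT);
INSERT INTO T1 VALUES (1, 'ann', '10.0.0.1'), (2, 'bob', '10.0.0.2');
-- user 1 has two addresses, user 2 has none
INSERT INTO T3 VALUES (1, 'old street', '2021-01-01'),
                      (1, 'new street', '2022-01-01');
""")

subquery_rows = conn.execute("""
SELECT T1.*,
       (SELECT address
        FROM T3
        WHERE T3.ID = T1.ID
        ORDER BY date_modified DESC
        LIMIT 1) AS latest_address   -- hypothetical alias
FROM T1
ORDER BY T1.ID
""").fetchall()

for row in subquery_rows:
    print(row)
```

A user without any T3 row still appears once, with NULL (None in Python) in the address column, so the row count always equals the row count of T1.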

You can use a subquery with TOP 1 so that you get only one result from T3.
Here is a sample (SQL Server syntax; in MySQL, use LIMIT 1 at the end of the subquery instead of TOP 1):
select * into #T1 from(
select 1 ID
union select 2
union select 3) A
select * into #T2 from(
select 1 ID
union select 2
union select 3) A
select * into #T3 from(
select 1 ID, 'ABC' Address, getDate() dateModified
union select 1, 'DEF', getDate()
union select 3, 'GHI', getDate()) A
select *, (select top 1 Address from #T3 T3 where T3.ID= T1.ID order by datemodified desc) from #T1 T1
inner join #T2 T2 on T1.ID = T2.ID
Bonus: the order by dateModified desc in the subquery is what makes it return the latest address.

Related

Get id of the record having Min() value

I have a complex MySQL query where one of the selected fields is MIN(value). Since all the values are unique, is there also a way to get the id of the row that holds the minimum value?
In other words if we simplify the query to this question, it is like this:
SELECT t1.name, MIN(t2.value) AS minval
FROM table1 t1
LEFT JOIN table2 t2
ON t2.id_user = t1.id
GROUP BY id_user
How can I know which t2.id was chosen for the lowest t2.value for a particular user? Thank you!
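Use ROW_NUMBER() (available in MySQL 8.0 and later) to find the row with the lowest value for each user.
List the fields you need instead of * in the inner query, because both tables have an id column and a derived table cannot contain duplicate column names. Partitioning by t1.id also keeps users that have no row in table2 apart from each other.
SELECT *
FROM (
SELECT t1.name, t2.id AS t2_id, t2.value, ROW_NUMBER() OVER (PARTITION BY t1.id ORDER BY t2.value) AS rnk
FROM table1 t1
LEFT JOIN table2 t2
ON t2.id_user = t1.id
) AS X
WHERE X.rnk = 1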
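As a rough check of the ROW_NUMBER() idea, the sketch below runs a window-function query through Python's built-in sqlite3 module (window functions need SQLite 3.25 or later, just as they need MySQL 8.0 or later). The sample rows and the t2_id alias are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL 8.0+ (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, name TEXT);
CREATE TABLE table2 (id INTEGER, id_user INTEGER, value INTEGER);
INSERT INTO table1 VALUES (1, 'ann'), (2, 'bob');
INSERT INTO table2 VALUES (10, 1, 7), (11, 1, 3), (12, 2, 5);
""")

min_rows = conn.execute("""
SELECT name, t2_id, value
FROM (
    -- number each user's rows from lowest to highest value
    SELECT t1.name, t2.id AS t2_id, t2.value,
           ROW_NUMBER() OVER (PARTITION BY t1.id ORDER BY t2.value) AS rnk
    FROM table1 t1
    LEFT JOIN table2 t2 ON t2.id_user = t1.id
) AS x
WHERE x.rnk = 1
ORDER BY name
""").fetchall()

for row in min_rows:
    print(row)
```

Each output row carries both the minimum value and the table2 id it came from, which is what the plain MIN() with GROUP BY cannot provide.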
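Maybe something this simple works; I don't know how complex your statement is. Note that it depends on MySQL returning the first row of each group for the non-aggregated columns, which is not guaranteed and is rejected when ONLY_FULL_GROUP_BY is enabled: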
SELECT name,value,id
FROM(
SELECT t1.name,t2.value,t2.id
FROM table1 t1
LEFT JOIN table2 t2
ON t2.id_user = t1.id
GROUP BY t2.id,id_user
ORDER BY t1.name,t2.id asc) as test
GROUP BY name;

SQL: full outer join (ambiguous column name)

I have two tables, t1 and t2.
-- t1
id name address
1 Tim A
2 Marta B
-- t2
id name address
1 Tim A
3 Katarina C
I emulate a full outer join of t1 with t2 like this:
SELECT * FROM t1
LEFT JOIN t2 ON t1.id = t2.id
UNION ALL
SELECT * FROM t1
RIGHT JOIN t2 ON t1.id = t2.id
However, the result has ambiguous id, name, and address columns.
How do I rename them so that I don't have duplicate column names?
Attempt:
SELECT name, address FROM
(SELECT * FROM t1
LEFT JOIN t2 ON t1.id = t2.id
UNION ALL
SELECT * FROM t1
RIGHT JOIN t2 ON t1.id = t2.id) as derived_table;
Result: ERROR - duplicate column name "name".
Ditch the * in the SELECT list.
Specify the list of expressions to be returned. And qualify all column references with either the table name, or preferably, a shorter table alias.
And assign an alias to the expression and that will be the name of the column in the resultset.
Also, the query shown is not equivalent to a FULL OUTER JOIN.
If the goal is to return all rows from t1, and to also return rows from t2 where a matching row doesn't exist in t1, I'd do something like this...
SELECT t.id AS t_id
, t.name AS t_name
, t.address AS t_addr
FROM t1 t
UNION ALL
SELECT s.id
, s.name
, s.address
FROM t2 s
LEFT JOIN t1 r
ON r.id = s.id
WHERE r.id IS NULL
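The UNION ALL plus anti-join pattern can be checked against the question's sample data with the sketch below. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption), and it uses the column name address from the question's tables.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER, name TEXT, address TEXT);
CREATE TABLE t2 (id INTEGER, name TEXT, address TEXT);
INSERT INTO t1 VALUES (1, 'Tim', 'A'), (2, 'Marta', 'B');
INSERT INTO t2 VALUES (1, 'Tim', 'A'), (3, 'Katarina', 'C');
""")

cursor = conn.execute("""
SELECT t.id AS t_id, t.name AS t_name, t.address AS t_addr
FROM t1 t
UNION ALL
-- rows of t2 that have no partner in t1 (anti-join)
SELECT s.id, s.name, s.address
FROM t2 s
LEFT JOIN t1 r ON r.id = s.id
WHERE r.id IS NULL
ORDER BY 1
""")
column_names = [d[0] for d in cursor.description]
combined_rows = cursor.fetchall()

print(column_names)
for row in combined_rows:
    print(row)
```

The column aliases of the first SELECT name the whole result set, so every column name is unique and the result can be wrapped in a derived table without the duplicate-name error.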
Try fully qualifying it like
SELECT t1.id, t1.name, t1.address FROM t1

MySQL select count based on two rows data

Table column headers: n,t1,t2
entries :
1 A B
2 A C
3 B C
4 D E
5 B A
How do I count the number of rows in which each letter appears in t1 MINUS the number of rows in which it appears in t2? I need to do something like the following two queries in one query:
select count(*) as val,t1 from table group by t1
select count(*) as val,t2 from table group by t2
Thanks,
Martin
Here is one way:
select t1, max(t1cnt) - max(t2cnt) as diff
from ((select t1, count(*) as t1cnt, 0 as t2cnt
from t
group by t1
) union all
(select t2, 0 as t1cnt, count(*) as t2cnt
from t
group by t2
)
) t
group by t1
Using the union all ensures that you get all possible values from both columns, even values that only appear in one column.
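Here is a small sketch of the UNION ALL approach run on the question's five sample rows, using Python's built-in sqlite3 module as a stand-in for MySQL. The table name entries and the aliases letter and u are assumptions chosen for readability.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entries (n INTEGER, t1 TEXT, t2 TEXT);
INSERT INTO entries VALUES
    (1, 'A', 'B'), (2, 'A', 'C'), (3, 'B', 'C'), (4, 'D', 'E'), (5, 'B', 'A');
""")

diff_rows = conn.execute("""
SELECT letter, MAX(t1cnt) - MAX(t2cnt) AS diff
FROM (
    -- per-letter counts from column t1, padded with 0 for t2
    SELECT t1 AS letter, COUNT(*) AS t1cnt, 0 AS t2cnt FROM entries GROUP BY t1
    UNION ALL
    -- per-letter counts from column t2, padded with 0 for t1
    SELECT t2 AS letter, 0 AS t1cnt, COUNT(*) AS t2cnt FROM entries GROUP BY t2
) AS u
GROUP BY letter
ORDER BY letter
""").fetchall()

for row in diff_rows:
    print(row)
```

Letters that occur in only one of the two columns (C, D and E here) still show up, because the zero padding gives the missing side a count of 0.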
You can use the following query to get the result. This query first gets a list of all the distinct t1 and t2 values (this is the UNION query). Once you have the list of these values, then you can use a LEFT JOIN to the original queries that you posted:
select d.col, coalesce(totT1, 0) - coalesce(totT2, 0) Total
from
(
select t1 col
from entries
union
select t2 col
from entries
) d
left join
(
select count(*) totT1, t1
from entries
group by t1
) d1
on d.col = d1.t1
left join
(
select count(*) totT2, t2
from entries
group by t2
) d2
on d.col = d2.t2;

Simple MYSQL distinct select

I have a table with two columns, name and timestamp, and a bunch of rows that share names. How do I select the most recent row of each set of rows that shares the same name?
Thanks!
SELECT name, MAX(timestamp) FROM Table1 GROUP BY name
EDIT: Based on the comment, please try the following:
SELECT name, timestamp, col3, col4
FROM Table1 t1
WHERE timestamp = (SELECT MAX(t2.timestamp)
FROM Table1 t2
WHERE t1.name = t2.name);
Added by Mchl
Version with no dependent subquery (should perform better)
SELECT
t1.name, t1.timestamp, t1.col3, t1.col4
FROM
Table1 AS t1
INNER JOIN (
SELECT
name, MAX(timestamp) AS timestamp
FROM
Table1
GROUP BY
name
) AS sq
USING (name,timestamp)
Then you need a subquery:
SELECT columns
FROM Table1 t1
WHERE row_id = (SELECT row_id
FROM table1 t2
WHERE t1.name = t2.name
ORDER BY timestamp DESC
LIMIT 1)
GROUP BY name
Edited, forgot the group by name

inner join for a query?

I want to do a sql query and have some problems:
1. I want to select from table_1 the IDs where parent_id is the value I have:
SELECT ID
FROM table_1
WHERE parent_ID = 'x'
2. I want to use the IDs I got in step 1 in a second query:
SELECT
FROM table_2
WHERE ID = 'The ID's from Query 1.'
Like this?
select ...
from table_1 a
join table_2 b on(a.id = b.id)
where a.parent_id = 'x';
Edit
Note: the query will potentially produce duplicate rows depending on the keys and relation between the tables. For example, you will get duplicates if, for a given table_1.parent_id = X, there can be multiple occurrences of the same table_1.ID.
Another example is when table_2.ID isn't unique.
In those cases you would want to remove the duplicates (using DISTINCT, GROUP BY, a partitioned ROW_NUMBER(), etc.) or avoid producing them in the first place by using a semi-join (EXISTS, IN) instead. Have a look at @OMG Ponies' answer for reference.
Using IN
SELECT t2.*
FROM TABLE_2 t2
WHERE t2.id IN (SELECT t1.id
FROM TABLE_1 t1
WHERE t1.parent_id = 'x')
Using EXISTS
SELECT t2.*
FROM TABLE_2 t2
WHERE EXISTS (SELECT NULL
FROM TABLE_1 t1
WHERE t1.id = t2.id
AND t1.parent_id = 'x')
Using an INNER JOIN
The DISTINCT (or GROUP BY) is necessary to eliminate duplicates if more than one record in TABLE_1 relates to a record in TABLE_2:
SELECT DISTINCT t2.*
FROM TABLE_2 t2
JOIN TABLE_1 t1 ON t1.id = t2.id
AND t1.parent_id = 'x'
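To illustrate why the join needs DISTINCT while IN and EXISTS do not, the sketch below runs all variants on a small data set with Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows and the label column are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_1 (id INTEGER, parent_id TEXT);
CREATE TABLE table_2 (id INTEGER, label TEXT);
-- id 1 appears twice under parent 'x', so a plain join repeats it
INSERT INTO table_1 VALUES (1, 'x'), (1, 'x'), (2, 'x'), (3, 'y');
INSERT INTO table_2 VALUES (1, 'one'), (2, 'two'), (3, 'three');
""")

def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

join_rows = run("""
SELECT t2.* FROM table_2 t2
JOIN table_1 t1 ON t1.id = t2.id AND t1.parent_id = 'x'
ORDER BY t2.id
""")

distinct_rows = run("""
SELECT DISTINCT t2.* FROM table_2 t2
JOIN table_1 t1 ON t1.id = t2.id AND t1.parent_id = 'x'
ORDER BY t2.id
""")

in_rows = run("""
SELECT t2.* FROM table_2 t2
WHERE t2.id IN (SELECT t1.id FROM table_1 t1 WHERE t1.parent_id = 'x')
ORDER BY t2.id
""")

exists_rows = run("""
SELECT t2.* FROM table_2 t2
WHERE EXISTS (SELECT NULL FROM table_1 t1
              WHERE t1.id = t2.id AND t1.parent_id = 'x')
ORDER BY t2.id
""")

print(join_rows)      # plain join: the row for id 1 is repeated
print(distinct_rows)  # DISTINCT removes the repeat
print(in_rows)        # semi-joins never produce the repeat
print(exists_rows)
```

The semi-join forms only test whether a matching table_1 row exists, so each table_2 row is returned at most once regardless of how many matches there are.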
It can be solved with the use of IN as follows:
SELECT * FROM table_2 WHERE ID IN (SELECT ID FROM table_1 WHERE parent_ID = 'x')
select * from table_2 where id in (select id from table_1 where parent_id = 'x')
Yes, it's better to use this:
SELECT [value]
FROM [table2]
WHERE [value] IN (SELECT [value]
FROM [table1]
WHERE [value] = "[value]"
)