How to find latest record from two different tables - mysql

There are two tables, table1 and table2.
The first column, foreign_id, is common to both tables.
The data types of the corresponding columns are the same in both tables.
We need to find the latest record for each foreign_id across both tables, based on the timestamp column, for example as below. The result should also have an extra column, from_table, showing which table each row was selected from.
One method that I can think of is:
Combine both tables,
then find the latest record for each foreign_id.
Is there a better way to do this, given that there could be 5000+ rows in both tables?
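The two-step method described in the question (combine, then pick the latest per foreign_id) can be sketched as follows. This is only a sketch under assumptions: it needs MySQL 8.0 or later for ROW_NUMBER(), it assumes the columns are literally named foreign_id and timestamp, and the aliases u, ranked and rn are made up for illustration.

```sql
-- Step 1: stack both tables and tag each row with its source table.
-- Step 2: number the rows per foreign_id, newest first, and keep row 1.
SELECT foreign_id, `timestamp`, from_table
FROM (
    SELECT u.*,
           ROW_NUMBER() OVER (PARTITION BY u.foreign_id
                              ORDER BY u.`timestamp` DESC) AS rn
    FROM (
        SELECT foreign_id, `timestamp`, 'table1' AS from_table FROM table1
        UNION ALL
        SELECT foreign_id, `timestamp`, 'table2' AS from_table FROM table2
    ) AS u
) AS ranked
WHERE rn = 1;
```

If the same foreign_id has identical timestamps in both tables, ROW_NUMBER() picks one of them arbitrarily; adding from_table to the ORDER BY would make the choice deterministic.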

Try this:
SELECT
t1.foreign_id,
MAX(t1.timestamp) max_time_table1,
MAX(t2.timestamp) max_time_table2
FROM table1 t1
LEFT JOIN table2 t2 USING (foreign_id)
GROUP BY foreign_id;
Note: This can be slow if the number of records is large.
However, you can also use this:
SELECT a.foreign_id,
-- the IS NULL check covers foreign_ids with no row in table2 (the LEFT JOIN yields NULL)
IF(a.max_time_table2 IS NULL OR a.max_time_table1 > a.max_time_table2, a.max_time_table1, a.max_time_table2) latest_update
FROM(
SELECT
t1.foreign_id,
SUBSTRING_INDEX(GROUP_CONCAT(t1.timestamp ORDER BY t1.id DESC),',',1) max_time_table1,
SUBSTRING_INDEX(GROUP_CONCAT(t2.timestamp ORDER BY t2.id DESC),',',1) max_time_table2
FROM table1 t1
LEFT JOIN table2 t2 USING (foreign_id)
GROUP BY foreign_id) a;
Make sure the id columns in both tables are AUTO_INCREMENT, because the query treats the row with the highest id in each group as the latest one.
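To see what the SUBSTRING_INDEX(GROUP_CONCAT(...)) combination does, here is a small standalone illustration with a hand-written list in place of the GROUP_CONCAT output (the timestamp values are invented for the example):

```sql
-- GROUP_CONCAT(... ORDER BY id DESC) builds a comma-separated list, newest id first;
-- SUBSTRING_INDEX(list, ',', 1) then keeps everything before the first comma.
SELECT SUBSTRING_INDEX(
           '2021-03-05 10:00:00,2021-03-01 09:00:00,2021-02-20 08:00:00',
           ',', 1) AS first_item;
-- returns '2021-03-05 10:00:00'
```

Keep in mind that GROUP_CONCAT output is cut off at group_concat_max_len (1024 bytes by default). Only the first element is used here, so truncation of the tail should not change the result, but it can raise warnings.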

From your explanation, this would do then:
SELECT
foreign_id,
CASE
WHEN max_time_table1 < max_time_table2 THEN max_time_table2
-- ELSE also covers equal timestamps and foreign_ids with no row in table2
ELSE max_time_table1
END as timestamps
FROM(
SELECT
t1.foreign_id,
SUBSTRING_INDEX(GROUP_CONCAT(t1.timestamp ORDER BY t1.id DESC),',',1) max_time_table1,
SUBSTRING_INDEX(GROUP_CONCAT(t2.timestamp ORDER BY t2.id DESC),',',1) max_time_table2
FROM table1 t1
LEFT JOIN table2 t2 USING (foreign_id)
GROUP BY foreign_id) a;

Related

Have a left join where duplicates in the second table is involved - MYSQL

Table 1:
user  score
-----------
A     1
B     2
Table 2:
user  comment  time
------------------------------
A     good     <timestamp 1>
A     bad      <timestamp 2>
B     average  <timestamp 3>
I want to join these two tables such that I get the below:
user  score  comment
--------------------
A     1      good
B     2      average
As you can see I'll need to join the second table's comment based on the timestamp (the most recent timestamp). I tried
SELECT st.user as user,st.score,
case when v.comment is null then 'NA' else v.comment end as comment
FROM table1 st
left JOIN (select distinct user,comment,max(time) from table2) v ON st.user=v.user
but this doesn't work.
You can join with a correlated subquery that filters on the latest timestamp:
select
t1.*,
t2.comment
from table1 t1
left join table2 t2
on t2.user = t1.user
and t2.time = (
select max(t22.time)
from table2 t22
where t22.user = t1.user
)
Side note: I am unsure that you do need a left join here (your sample data does not demonstrate that).
You only want one column from table2 so I recommend a correlated subquery:
select t1.*,
(select t2.comment
from table2 t2
where t2.user = t1.user
order by t2.time desc
limit 1
) as comment
from table1 t1;
This query will make optimal use of an index on table2(user, time desc, comment). Note, though, that MySQL versions before 8.0 parse the desc but ignore it; descending indexes are supported from MySQL 8.0 onward.
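For reference, the index mentioned above could be created roughly like this. The index name is hypothetical, and the statement assumes comment is a VARCHAR short enough to be part of an index key (a TEXT column would need a prefix length).

```sql
-- Hypothetical index name; column order follows the answer: user, time, comment.
-- The DESC on `time` is only honored by MySQL 8.0 and later.
CREATE INDEX idx_table2_user_time_comment
    ON table2 (`user`, `time` DESC, `comment`);
```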

How to optimize mysql on left join

I'll try to explain at a very high level.
I have two complex SELECT queries (for the sake of example I have reduced them to the following):
SELECT id, t3_id FROM t1;
SELECT t3_id, MAX(added) as last FROM t2 GROUP BY t3_id;
Query 1 returns 16k rows and query 2 returns 15k.
Each query individually takes less than 1 second to run.
However, I need to sort the results by the added column of query 2. When I try to use a LEFT JOIN:
SELECT
t1.id, t1.t3_Id
FROM
t1
LEFT JOIN
(SELECT t3_id, MAX(added) as last FROM t2 GROUP BY t3_id) AS t_t2
ON t_t2.t3_id = t1.t3_id
GROUP BY t1.t3_id
ORDER BY t_t2.last
However, the execution time goes up to over 1 minute.
I would like to understand the reason.
What causes such a huge increase?
NOTE:
All the columns used in every table are indexed,
e.g.:
table t1 has index on id,t3_Id
table t2 has index on t3_id and added
EDIT1
After @Tim Biegeleisen's suggestion, I changed the query to the following, and it now executes in about 16 seconds. If I remove the ORDER BY, the query executes in less than 1 second. So the ORDER BY is the sole cause of the slowdown.
SELECT
t1.id, t1.t3_Id
FROM
t1
LEFT JOIN
t2 ON t2.t3_id = t1.t3_id
GROUP BY t1.t3_id
ORDER BY MAX(t2.added)
Even though table t2 has an index on column t3_id, when you join t1 you are actually joining to a derived table, which either can't use the index, or can't use it completely effectively. Since t1 has 16K rows and you are doing a LEFT JOIN, this means the database engine will need to scan the entire derived table for each record in t1.
You should use MySQL's EXPLAIN to see what the exact execution strategy is, but my suspicion is that the derived table is what is slowing you down.
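As a sketch of that check, EXPLAIN can simply be prefixed to the slow query from the question:

```sql
-- Shows the execution plan instead of running the query.
EXPLAIN
SELECT t1.id, t1.t3_Id
FROM t1
LEFT JOIN (SELECT t3_id, MAX(added) AS last FROM t2 GROUP BY t3_id) AS t_t2
    ON t_t2.t3_id = t1.t3_id
GROUP BY t1.t3_id
ORDER BY t_t2.last;
```

In the output, a row whose select_type is DERIVED corresponds to the subquery, and "Using temporary; Using filesort" in the Extra column would point at the GROUP BY / ORDER BY step as the expensive part.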
The correct query should be:
SELECT
t1.id,
t1.t3_Id,
MAX(t2.added) as last
FROM t1
LEFT JOIN t2 on t1.t3_Id = t2.t3_Id
GROUP BY t1.t3_Id
ORDER BY last;
This happens because a temporary table is generated for each record.
I think you could try to order everything after the records are available. Maybe:
select * from (
select t1.t3_id, t1.id, t2.last from
(select t3_id,max(id) as id from t1 group by t3_id) as t1
left join (select t3_id,max(added) as last from t2 group by t3_id) as t2
on t1.t3_id = t2.t3_id ) as xx
order by last

Mysql Join 3 Tables in One Query with sorted Result

I have 3 tables and want to join all in one query to show latest 10 entries by datetime.
t1: id, username
t2: id, id_t1, med_id, ga_id, au_id, re_id, text, datetime
t3: id, id_t1, pro_id, au_id, re_id, text, datetime
At first I thought it would be easy with a simple left join and a where on id, but I got duplicate results. Then I tried inner and outer joins, and also group by, but the results were wrong.
So my question is: how can I join them all, without duplicate results, to get the last 10 entries of t2 and t3?
It's hard to tell exactly what you are trying to achieve, but here is a clue as to how it could be accomplished.
SELECT t1.*
FROM t1
INNER JOIN t2 ON t1.id = t2.id_t1
INNER JOIN t3 ON t1.id = t3.id_t1
GROUP BY t1.id, t1.username
-- order each user by the most recent datetime found in either table
ORDER BY GREATEST(MAX(t2.`datetime`), MAX(t3.`datetime`)) DESC
LIMIT 10
If you need to select fields from T2 and T3, a GROUP BY on all T1 fields with aggregates on the fields from T2 and T3 is an option. Otherwise, a correlated subquery is the way to go.
As sgeddes commented already, it's hard to know what you need, without seeing some example data from your tables. It would really help to know what the relationship between the three tables is.
One question I have, in particular, is: how are t2 and t3 related, if at all? It looks like they might not be, as each of them has its own datetime column.
Perhaps the following could do the job, but we need some more info to know for sure:
(SELECT DISTINCT t1.*, t2.id, t2.au_id, t2.re_id, t2.text, t2.`datetime`, t2.med_id, t2.ga_id, NULL AS pro_id
FROM t1
INNER JOIN t2 ON t1.id = t2.id_t1)
UNION
(SELECT DISTINCT t1.*, t3.id, t3.au_id, t3.re_id, t3.text, t3.`datetime`, NULL AS med_id, NULL AS ga_id, t3.pro_id
FROM t1
INNER JOIN t3 ON t1.id = t3.id_t1)
ORDER BY datetime DESC
LIMIT 10
The following selects the username and the datetime for the last ten posts.
SELECT username, last_ten.`datetime` AS lastpost
FROM t1
INNER JOIN (
SELECT 't2' AS tab, id, `datetime`, t2.id_t1
FROM t2
UNION ALL
SELECT 't3' AS tab, id, `datetime`, t3.id_t1
FROM t3
ORDER BY datetime DESC
LIMIT 10
) AS last_ten ON t1.id = last_ten.id_t1
ORDER BY last_ten.`datetime` DESC

Join two tables on two columns, even if table 2 does not have row

What I am trying to do is join two tables, let's call them t1 and t2, on two columns: id and name, for this example. t1 will always have id and name, but t2 won't always have a matching id and name. t1 has more columns, like viewes and reports, and t2 has other columns that need to be joined. My question is: how can I show 0s for t2's columns when the matching row doesn't exist?
I have something similar to this, which joins the tables only if both tables' rows have some value.
SELECT
date(t1.start_time) date,
t1.name,
t1.viewes,
t1.reports,
t2.col5,
t2.col6
from
table1 t1
left outer join table2 t2
on t2.name = t1.name and date(t2.start_time) = date(t1.start_time)
group by
1,2
order by
1 desc,
2 asc
;
I have lots of experience with MySQL, but sometimes find that things need to be hacked to work correctly. What's your suggestion for this problem?

SQL query to get only the latest value by Date

I have the following two tables:
Table1 {T1ID, Name}
Table2 {T2ID, T1ID, Date, Value}
Date is of type DATE.
and I am looking for a SQL query to fetch only the latest value (by Date) for each T1ID for which the Name matches a specific string.
SELECT `Table2`.`T1ID`,
`Table2`.`Value`,
`Table2`.`Date`,
`Table1`.`Name`
FROM `Table1`
INNER JOIN `Table2` ON `Table2`.`T1ID` = `Table1`.`T1ID`
WHERE `Table1`.`Name` LIKE 'Smith'
but this returns the value for several dates for the same T1ID.
How do I get only the latest value by Date?
Edit:
I am using MySQL 5.5.8
If I've understood the question correctly, and assuming MySQL:
SELECT `Table2`.`T1ID`,
`Table2`.`Value`,
`Table2`.`Date`,
`Table1`.`Name`
FROM `Table1`
INNER JOIN `Table2` ON `Table2`.`T1ID` = `Table1`.`T1ID`
-- Table3 holds the latest Date for each T1ID
INNER JOIN (SELECT T1ID, MAX(`Date`) AS `Date` FROM Table2 GROUP BY T1ID) Table3
ON `Table3`.`T1ID` = `Table2`.`T1ID`
AND `Table3`.`Date` = `Table2`.`Date`
WHERE `Table1`.`Name` LIKE 'Smith'
EDIT: Updated the code to bring back the correct result set. Removed MSSQL answer as it wasn't relevant
You have two options.
select t1.t1id, max(t1.Name) Name, max(t2.date) Date,
(select Value from table2 t22
where t22.date = max(t2.date) and t22.t1id = t2.t1id) Value
from table1 t1 left join table2 t2 on t1.t1id = t2.t1id
where Name like '%Smith%'
group by t2.t1id order by 2
OR
select mx.t1id, mx.Name, mx.Date, t2.Value
from
(
select t1.t1id, max(t1.Name) Name, max(t2.date) Date
from table1 t1 left join table2 t2 on t1.t1id = t2.t1id
where Name like '%Smith%'
group by t1.t1id
) mx left join table2 t2 on (t2.t1id = mx.t1id and t2.date = mx.date)
order by 2
Both will produce the same result. The first one takes less code, but you might have performance issues with a large data set. The second one takes a little more code, but it is also a little more optimized.
Notes on the JOIN option:
If you use LEFT JOIN (as the example shows), items in Table1 with no corresponding records in Table2 will be displayed in the result, but the values in the Date and Value columns will be NULL.
If you use INNER JOIN, items in Table1 with no corresponding records in Table2 will not be displayed.
EDIT
I missed one of the requirements, which was the Name matching a specific string. The code is now updated. The '%' acts as a wildcard, so it will match names like 'Will Smith' and 'Wail Smithers'. If you want an exact match, remove the wildcards ('%').
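A quick way to check the wildcard behaviour described above is to evaluate LIKE on literal strings; MySQL returns 1 for a match and 0 otherwise (the column aliases are only there for readability):

```sql
SELECT 'Will Smith'    LIKE '%Smith%' AS with_wildcards_1,  -- 1
       'Wail Smithers' LIKE '%Smith%' AS with_wildcards_2,  -- 1
       'Will Smith'    LIKE 'Smith'   AS no_wildcards_1,    -- 0
       'Smith'         LIKE 'Smith'   AS no_wildcards_2;    -- 1
```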
Add this to your SQL:
ORDER BY `Table2`.`Date` DESC LIMIT 1
Note that this returns only the single latest row overall, not one row per T1ID.