left join two separate queries - mysql

I have two separate queries which pretty much return the same thing:
select id
from t
where id<>''
GROUP BY id
having count(*) >= 2;
select id
from t2
where id is not null
GROUP BY id
having count(*) >= 2
ORDER BY id ASC;
a list of ids whose values appear more than once. The first query returns more than the second, So I need to left join them somehow to get the results that are in the first query but not in the second. I tried to do a left join but it is not working properly.
I also tried the following to no avail:
select id
from t
where id<>''
GROUP BY id
having count(*) >= 2
not in (select id from t2 where id is not null GROUP BY id having count(*) >= 2
ORDER BY id ASC)
Additional Info
Query one is giving me all ids that have the same value for table 1 and query 2 is giving me all the same value ones for table 2. There are additional gotchas like there are some blank ids in table one while there are some nulls in table 2, hence the conditions excluding blanks for the former and nulls for the latter. So I get back these two separate results, which are almost the same, except in results 1 there are claims that arent in results 2 but only when these queries are run because they are duplicated in table 1 but not in table 2. although they do exist in table 2. So a simple left join where t1.id <> t2.id will not work because they do exist in t2.

You want to select ids in t1 that are not in t2. JOIN on t2 and ensure that the result is NULL.
SELECT t.id
FROM t1
LEFT JOIN t2 ON t1.id = t2.id
WHERE t2.id IS NULL
GROUP BY t.id
HAVING COUNT(t.id) > 1

SELECT t.id, t2.id FROM t
LEFT OUTER JOIN t2
ON t.id <> t2.id
WHERE t.id<>''
GROUP BY t.id having count(t.id) >= 2
ORDER BY t.id ASC
I assume your not in part before the subquery is actually mean that the id from the table t and t2 are not same and so added the condition t.id <> t2.id
EDIT
SELECT t.id FROM t
WHERE t.id<>''
AND t.id NOT IN (SELECT id FROM t2 where id is not null )
GROUP BY t.id having count(t.id) >= 2
ORDER BY t.id ASC
SQLFIDDLE

Related

Have a left join where duplicates in the second table is involved - MYSQL

Table 1:
user score
------------
A 1
B 2
Table 2:
user comment time
----------------------------
A good <timestamp 1>
A bad <timestamp 2>
B average <timestamp 3>
I want to join these two tables such that I get the below:
user score comment
-------------------------
A 1 good
B 2 average
As you can see I'll need to join the second table's comment based on the timestamp (the most recent timestamp). I tried
SELECT st.user as user,st.score,
case when v.comment is null then 'NA' else v.comment end as comment
FROM tale1
left JOIN (select distinct user,comment,max(time) from table2) v ON st.user=v.user
but this doesnt work.
You can join with a correlated subquery that filters on the latest timestamp:
select
t1.*,
t2.comment
from table1 t1
left join table2 t2
on t2.user = t1.user
and t2.time = (
select max(t22.time)
from table2 t22
where t21.user = t1.user
)
Side note: I am unsure that you do need a left join here (your sample data does not demonstrate that).
You only want one column from table2 so I recommend a correlated subquery:
select t1.*,
(select t2.comment
from table2 t2
where t2.user = t1.user
order by t2.time desc
limit 1
) as comment
from table1 t1;
This query will make optimal use of an index on table2(user, time desc, comment) -- alas, though, I think the desc is ignored in MySQL.

why the sql correct and the inner mechanism for run it?

the sql as follows come from mysql document. it is:
SELECT * FROM t1 AS t
WHERE 2 = (SELECT COUNT(*) FROM t1 WHERE t1.id = t.id);
The document say It finds all rows in table t1 containing a value that occurs twice in a given column , and doesnot explain the sql.
t1 and t is the same table, so the
count(*) in subquery == select count(*) from t
, isn't it?
count(*) in subquery == select count(*) from t
is wrong. because in mysql you can't use it like that. so you have to run it like that to get result of same id having two rows.
if you want to get count of same occurrence,
SELECT id, name, count(*) AS all_count FROM t1 GROUP BY id HAVING all_count > 1 ORDER BY all_count DESC
And also you can get values as your query like this as well,
select * from t1 where id in ( select id from t1 group by id having count(*) > 1 )
The query contains a correlated subquery in WHERE clause:
SELECT COUNT(*) FROM t1 WHERE t1.id = t.id
It is called correlated because it is related to the main query via t.id. So, this subquery counts the number of records having an id value that is equal to the current id value of the record returned by the main query.
Thus, predicate
(SELECT COUNT(*) FROM t1 WHERE t1.id = t.id) = 2
evaluates to true for any row with an id value that occurs twice in the table.
SELECT * FROM t1 AS t
WHERE 2 = (SELECT COUNT(*) FROM t1 WHERE t1.id = t.id);
This query goes through each record in t1 and then in the subquery looks into t1 again to see if in this case id is found 2 times (and only 2 times). You can do the same for any other column in t1 (or any table for that matter).
When you would like to see all values that are multiple times in the table, change WHERE 2 = by WHERE 1 <. This will also give you the values that are 3 times, 4 times, etc. in the table.
{
SELECT id,count( * )
FROM
MyTable
group by id
having count( * )>1
}
with this code, you can see the rows which repet more than one,
and you can change this query by yourself
How about using GROUP BY and HAVING:
SELECT id, count(1) as Total FROM MyTable AS t1
GROUP BY t1.id
HAVING Total = 2

How to find latest record from two different tables

There are 2 tables table1 and table 2
First column, foreign_id is the common column between both tables.
Data type of all the related columns are same.
Now, we need to find the latest record based on timestamp column, for each foreign_id taking from both the tables, for example as below, also an extra column from_table, which shows from which table this row is selected.
One method that I can think of is
Combine both the tables
then, find the latest for each foreign_id column
Any, better way to do this as there could be 5000+ rows in both the tables.
Try this:
SELECT
t1.foreign_id,
MAX(t1.timestamp) max_time_table1,
MAX(t2.timestamp) max_time_table2
FROM *table1* t1
LEFT JOIN *table2* t2 USING (foreign_id)
GROUP BY foreign_id;
Note: This can be a bit slow, if the number of records are quite large.
However you can also use this:
SELECT a.foreign_id,
IF(a.max_time_table1 > a.max_time_table2, a.max_time_table1, a.max_time_table2) latest_update
FROM(
SELECT
t1.foreign_id,
SUBSTRING_INDEX(GROUP_CONCAT(t1.timestamp ORDER BY t1.id DESC),',',1) max_time_table1,
SUBSTRING_INDEX(GROUP_CONCAT(t2.timestamp ORDER BY t2.id DESC),',',1) max_time_table2
FROM *table1* t1
LEFT JOIN *table2* t2 USING (foreign_id)
GROUP BY foreign_id) a;
Make sure the id columns in both tables are auto_increment.
From your explanation, this would do then:
SELECT
foreign_id,
CASE
WHEN max_time_table1 < max_time_table2 THEN max_time_table2
WHEN max_time_table2 < max_time_table1 THEN max_time_table1
END as timestamps
FROM(
SELECT
t1.foreign_id,
SUBSTRING_INDEX(GROUP_CONCAT(t1.timestamp ORDER BY t1.id DESC),',',1) max_time_table1,
SUBSTRING_INDEX(GROUP_CONCAT(t2.timestamp ORDER BY t2.id DESC),',',1) max_time_table2
FROM *table1* t1
LEFT JOIN *table2* t2 USING (foreign_id)
GROUP BY foreign_id) a;

Select most recent rows from inner joined tables in MySQL

I want to get the most recent row from each inner joined tables. Both tables have a timestamp field. Below is what I have so far. But it only targets for table1 how about for table2?
SELECT
`table1`.`fieldX`,
`table2`.`fieldY`
FROM `db`.`table1`
INNER JOIN `db`.`table2`
ON `table1`.`id` = `table2`.`id`
WHERE `table1`.`id` = ?
ORDER BY `table1`.`timestamp`
DESC LIMIT 1
table1
row_id
id
fieldX
timestamp
table2
row_id
id
fieldY
timestamp
Both tables can have repeating ids. It was designed this way to store older versions of the data entries.
For example: table1 can have 3 rows with the same id while table2 can have 2 rows of the same id. I want to get the latest row from both tables.
Use can use it like this if you want recent record from table2
SELECT SUBSTRING_INDEX(GROUP_CONCAT(t1.fieldX ORDER BY t1.timestamp DESC),',',1) as field_x, SUBSTRING_INDEX(GROUP_CONCAT(t2.fieldY ORDER BY t2.timestamp DESC),',',1) as field_y
FROM table1 t1
JOIN table2 t2 ON(t2.id = t1.id)
GROUP BY t1.id
ORDER BY t1.timestamp

Mysql Join 3 Tables in One Query with sorted Result

I have 3 tables and want to join all in one query to show latest 10 entries by datetime.
t1: id, username
t2: id, id_t1, med_id, ga_id, au_id, re_id, text, datetime
t3: id, id_t1, pro_id, au_id, re_id, text, datetime
First I saw it would be easy with simple left join and where id, but i got double results. Then i tried inner and outer join, also group by, but the result was bad.
So my question is how can i join all without double results of the last 10 of t2 and t3?
Hard to tell what exactly you are trying to acheive, but here is a clue how it could be complemented.
SELECT TOP 10 DISTINCT T1.*
FROM T1
INNER JOIN T2 ON T1.id = T2.id_t1
INNER JOIN T3 ON T1.id = T3.id_t1
ORDER BY (CASE WHEN T2.[DateTime] > T3.[DateTime] THEN
T2.[DateTime]
ELSE
T3.[DateTime]
END) DESC
If you need to select field from T2 and T3, GROUP BY on all T1 field with aggregate on field from t2 and t3 is an option. Otherwise, linked-subquery is the way to go.
As sgeddes commented already, it's hard to know what you need, without seeing some example data from your tables. It would really help to know what the relationship between the three tables is.
One question I have, in particular, is: how are t2 and t3 related, if at all? It looks like they might not be, as each of them has its own datetime column.
Perhaps the following could do the job, but we need some more info to know for sure:
(SELECT DISTINCT t1.*, t2.id, t2.au_id, t2.re_id, t2.text, t2.`datetime`, t2.med_id, t2.ga_id, NULL AS pro_id
FROM t1
INNER JOIN t2 ON t1.id = t2.id_t1)
UNION
(SELECT DISTINCT t1.*, t3.id, t3.au_id, t3.re_id, t3.text, t3.`datetime`, NULL AS med_id, NULL AS ga_id, t3.pro_id
FROM t1
INNER JOIN t3 ON t1.id = t3.id_t1)
ORDER BY datetime DESC
LIMIT 10
The following selects the username and the datetime for the last ten posts.
SELECT username, last_ten.`datetime` AS lastpost
FROM t1
INNER JOIN (
SELECT 't2' AS tab, id, `datetime`, t2.id_t1
FROM t2
UNION ALL
SELECT 't3' AS tab, id, `datetime`, t3.id_t1
FROM t3
ORDER BY datetime DESC
LIMIT 10
) AS last_ten ON t1.id = last_ten.id_t1