I have a situation where we have inserted duplicated data into some tables.
Given the following database schema, I want to find all s_id and co_id combinations that are associated with more than one record from table A. The highlighted rows are the rows I'm looking for. Once I've found the duplicates, I need the ids from table A associated with the duplicate records.
I'm able to group by s_id and co_id to determine potential duplicates, but because Table B is 1:M, this isn't entirely accurate: a single record in A can reach the same combination through several B rows.
Select c.s_id, c.co_id, Count(*)
from c
INNER JOIN b on c.b_id = b.id
INNER JOIN a on a.id = b.a_id
Group By c.s_id, c.co_id
Having count(*) > 1;
I think you just want count(distinct):
Select c.s_id, c.co_id, Count(distinct a.id)
from c join
b
on c.b_id = b.id join
a
on a.id = b.a_id
Group By c.s_id, c.co_id
having count(distinct a.id) > 1;
Gordon's answer will get you the s_id and co_id values. If you need to trace those back to a then try this:
select distinct a.id
from
a inner join b on b.a_id = a.id inner join c on c.b_id = b.id inner join
(
select c.s_id, c.co_id
from a inner join b on b.a_id = a.id inner join c on c.b_id = b.id
group by c.s_id, c.co_id
having count(distinct a.id) > 1
) as dups
on dups.s_id = c.s_id and dups.co_id = c.co_id
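To make the difference between count(*) and count(distinct a.id) concrete, here is a minimal sketch using Python's built-in sqlite3 module. The rows are made up for illustration and only the column names come from the question; SQLite stands in for whatever database is actually in use.

```python
import sqlite3

# Hypothetical toy data: the tables mirror the schema in the question,
# but every row below is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER);
    CREATE TABLE c (id INTEGER PRIMARY KEY, b_id INTEGER, s_id INTEGER, co_id INTEGER);

    INSERT INTO a (id) VALUES (1), (2), (3);
    -- a=1 owns two b rows; a=2 and a=3 own one each
    INSERT INTO b (id, a_id) VALUES (10, 1), (11, 1), (20, 2), (30, 3);
    INSERT INTO c (id, b_id, s_id, co_id) VALUES
        (100, 10, 7, 7),   -- (7, 7) appears twice, but both times via a=1
        (101, 11, 7, 7),
        (102, 20, 8, 8),   -- (8, 8) appears via a=2 ...
        (103, 30, 8, 8);   -- ... and via a=3, so it is a real duplicate
""")

JOINS = "FROM c JOIN b ON c.b_id = b.id JOIN a ON a.id = b.a_id"

# COUNT(*) counts joined rows, so the 1:M fan-out through b is flagged too.
naive = conn.execute(
    f"SELECT c.s_id, c.co_id {JOINS} "
    "GROUP BY c.s_id, c.co_id HAVING COUNT(*) > 1 ORDER BY c.s_id"
).fetchall()

# COUNT(DISTINCT a.id) counts distinct parents, so only (8, 8) remains.
dups = conn.execute(
    f"SELECT c.s_id, c.co_id {JOINS} "
    "GROUP BY c.s_id, c.co_id HAVING COUNT(DISTINCT a.id) > 1 ORDER BY c.s_id"
).fetchall()

# Trace the duplicated combinations back to the a.id values behind them.
a_ids = conn.execute(
    f"SELECT DISTINCT a.id {JOINS} "
    "JOIN (SELECT c.s_id, c.co_id "
    f"      {JOINS} "
    "      GROUP BY c.s_id, c.co_id HAVING COUNT(DISTINCT a.id) > 1) AS dups "
    "  ON dups.s_id = c.s_id AND dups.co_id = c.co_id "
    "ORDER BY a.id"
).fetchall()

print(naive)   # [(7, 7), (8, 8)]
print(dups)    # [(8, 8)]
print(a_ids)   # [(2,), (3,)]
```

With this data, the (7, 7) combination is a false positive for count(*) because both of its rows belong to the same record in a.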
Related
I think it's impossible, but I'm asking in case there's a good way.
There are three tables: A, B and C.
B and C are each LEFT JOINed to table A on a key called id that exists in every table.
From this join, I would like to output the number of B rows and C rows as B_CNT and C_CNT, counted by b.id and c.id.
SELECT
*
FROM
A
LEFT JOIN B ON A.ID = B.ID
LEFT JOIN C ON A.ID = C.ID -- base query
How could I count the rows grouped by b.id and c.id?
You could try:
SELECT
COUNT(DISTINCT B.ID), COUNT(DISTINCT C.ID)
FROM A
LEFT JOIN B
ON A.ID = B.ID
LEFT JOIN C
ON A.ID = C.ID
(I couldn't quite understand from your question, but I'm making an assumption that you want the distinct count of "ID" from each table)
You can use a couple of scalar subqueries. For example:
select id,
(select count(*) from b where b.id = a.id) as b,
(select count(*) from c where c.id = a.id) as c
from a
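As a rough illustration of why the scalar subqueries can be the safer choice, the sketch below (Python's sqlite3 with invented rows, assuming each table has only an id column) compares them with counting directly on the double LEFT JOIN. When both B and C have several rows for the same id, the join multiplies them, so counts taken from the joined rows are inflated.

```python
import sqlite3

# Hypothetical data; only the table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER);
    CREATE TABLE c (id INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (1), (1), (1);   -- three b rows for a=1
    INSERT INTO c VALUES (1), (1), (2);   -- two c rows for a=1, one for a=2
""")

# Joining both children at once multiplies rows: a=1 yields 3 * 2 = 6 rows.
fan_out = conn.execute("""
    SELECT a.id, COUNT(b.id), COUNT(c.id)
    FROM a
    LEFT JOIN b ON a.id = b.id
    LEFT JOIN c ON a.id = c.id
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

# Scalar subqueries count each child table independently of the other.
per_table = conn.execute("""
    SELECT a.id,
           (SELECT COUNT(*) FROM b WHERE b.id = a.id) AS b_cnt,
           (SELECT COUNT(*) FROM c WHERE c.id = a.id) AS c_cnt
    FROM a
    ORDER BY a.id
""").fetchall()

print(fan_out)    # [(1, 6, 6), (2, 0, 1)]
print(per_table)  # [(1, 3, 2), (2, 0, 1)]
```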
How can I count the output of an inner join? Thanks a lot.
-- Quantity A = 981
SELECT COUNT(DISTINCT ID) FROM A;
-- Quantity B = 673
SELECT COUNT(DISTINCT ID) FROM B;
How can I count this inner join?
SELECT * FROM A
INNER JOIN B
ON A.ID = B.ID;
Combine your two attempts into one. Since you're performing an INNER JOIN on ID, it does not matter whether you use A.ID or B.ID in the COUNT(DISTINCT ...):
SELECT COUNT(DISTINCT A.ID) AS AB_Count FROM A INNER JOIN B ON A.ID = B.ID;
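A small sketch of what that query counts, using Python's sqlite3 and made-up ids (the real tables from the question are not available): COUNT(DISTINCT A.ID) gives the number of ids present in both tables, while COUNT(*) gives the number of joined rows, which is larger whenever an id repeats.

```python
import sqlite3

# Hypothetical ids, chosen so that one id repeats in b.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (2), (2), (3), (4);   -- id 2 appears twice in b
""")

# Number of rows the inner join produces (id 2 matches twice, id 3 once).
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM a INNER JOIN b ON a.id = b.id"
).fetchone()[0]

# Number of distinct ids that exist in both tables.
distinct_ids = conn.execute(
    "SELECT COUNT(DISTINCT a.id) FROM a INNER JOIN b ON a.id = b.id"
).fetchone()[0]

print(joined_rows, distinct_ids)   # 3 2
```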
For example, I have 3 tables: A, B and C. Table A has a relation with table B, and table B has a relation with table C. I want to get the SUM of a field from table A, filtered on a field from table C.
Table A has > 300k rows, Table B has > 4k rows, Table C has ~ 100 rows
My query looks like this:
SELECT SUM(a.hours) AS total
FROM table_a a
LEFT JOIN table_b b
ON a.table_b_id = b.id
LEFT JOIN table_c c
ON b.table_c_id = c.id
WHERE a.customer_id = 1
AND c.title IN ('Title D','Title E')
The query takes ~7 sec to execute, which is very slow. But a query like the one below takes ~0.0 sec:
SELECT a.hours
FROM table_a a
LEFT JOIN table_b b
ON a.table_b_id = b.id
LEFT JOIN table_c c
ON b.table_c_id = c.id
WHERE a.customer_id = 1
AND c.title IN ('Title D','Title E')
Why is SUM so slow? What should I do?
Move your condition to the ON clause of the related table:
SELECT SUM(a.hours) AS total
FROM table_a a
LEFT JOIN table_b b
ON a.table_b_id = b.id
LEFT JOIN table_c c
ON b.table_c_id = c.id
AND c.title IN ('Title D','Title E')
WHERE a.customer_id = 1
EDIT 1: Following @dnoeth's comment, I agree that we should probably use an INNER JOIN when joining table_c. With a LEFT JOIN, the title condition in the ON clause no longer removes rows from the sum:
SELECT SUM(a.hours) AS total
FROM table_a a
LEFT JOIN table_b b
ON a.table_b_id = b.id
INNER JOIN table_c c
ON b.table_c_id = c.id
AND c.title IN ('Title D','Title E')
WHERE a.customer_id = 1
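To see why the join type matters once the title filter sits in the ON clause, here is a minimal sketch in Python's sqlite3. The rows, and the helper names BASE, TITLES and total, are invented for illustration; it checks results only and says nothing about the timing reported in the question.

```python
import sqlite3

# Hypothetical rows: customer 1 has 5 hours under 'Title D' and 3 under 'Title X'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_c (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, table_c_id INTEGER);
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, customer_id INTEGER,
                          table_b_id INTEGER, hours REAL);
    INSERT INTO table_c VALUES (1, 'Title D'), (2, 'Title X');
    INSERT INTO table_b VALUES (1, 1), (2, 2);
    INSERT INTO table_a VALUES (1, 1, 1, 5.0),   -- reaches 'Title D'
                               (2, 1, 2, 3.0);   -- reaches 'Title X'
""")

BASE = """
    SELECT SUM(a.hours)
    FROM table_a a
    LEFT JOIN table_b b ON a.table_b_id = b.id
    {join} JOIN table_c c ON b.table_c_id = c.id {on_filter}
    WHERE a.customer_id = 1 {where_filter}
"""
TITLES = "AND c.title IN ('Title D', 'Title E')"

def total(join, on_filter="", where_filter=""):
    """Run the SUM query with the title filter placed in ON or in WHERE."""
    sql = BASE.format(join=join, on_filter=on_filter, where_filter=where_filter)
    return conn.execute(sql).fetchone()[0]

where_total = total("LEFT", where_filter=TITLES)    # original query: filter in WHERE
left_on_total = total("LEFT", on_filter=TITLES)     # filter in ON, LEFT JOIN
inner_on_total = total("INNER", on_filter=TITLES)   # filter in ON, INNER JOIN

print(where_total, left_on_total, inner_on_total)   # 5.0 8.0 5.0
```

In this toy data the LEFT JOIN variant also sums the 'Title X' hours, because unmatched rows of table_a are kept with NULLs for table_c; the INNER JOIN variant matches the original WHERE version.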
Use LEFT JOIN in both places and add this composite index:
INDEX(customer_id, title)
I have a query in my application that is performing poorly. I think it can be optimized, but my SQL skills are failing me. Here's the query in a sort of meta-SQL:
SELECT A.Value, count(*)
FROM B
JOIN A ON B.A_ID = A.ID
JOIN C ON C.ID = B.C_ID
WHERE B.C_ID IN (
SELECT B.C_ID
FROM C
JOIN B ON B.C_ID = C.ID
JOIN A ON B.A_ID = A.ID
WHERE A.VALUE IN ('string literal')
)
GROUP BY A.VALUE
C is a table of vacancies, B is a table of properties of the vacancies and A is a table of property values. The tables have 1 to N relationships. We need to find a list of all other property values (and the number of times they occur) of vacancies that have a certain fixed property value related to it.
Please help in optimizing the query for efficiency.
Thanks in advance!
You don't need to join in C in either query, unless that is being used for filtering (that is, non matches are being filtered out). Try this:
SELECT A.Value, count(*)
FROM B JOIN
A
ON B.A_ID = A.ID
WHERE EXISTS (SELECT 1
FROM B b2 JOIN
A a2
ON b2.A_ID = a2.ID
WHERE a2.VALUE = 'string literal' AND b2.C_ID = B.C_ID
)
GROUP BY A.VALUE;
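Here is a small runnable sketch of the EXISTS version, using Python's sqlite3. The vacancies and property values are hypothetical, and table C is left out entirely since the query never touches it.

```python
import sqlite3

# Hypothetical data: a holds property values, b links vacancies (c_id) to values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE b (c_id INTEGER, a_id INTEGER);
    INSERT INTO a VALUES (1, 'remote'), (2, 'full-time'), (3, 'part-time');
    INSERT INTO b VALUES (1, 1), (1, 2),   -- vacancy 1: remote, full-time
                         (2, 1), (2, 3),   -- vacancy 2: remote, part-time
                         (3, 2);           -- vacancy 3: full-time only
""")

# Count the property values of every vacancy that has the fixed value.
# The fixed value itself is included in the output, as in the query above.
counts = conn.execute("""
    SELECT a.value, COUNT(*)
    FROM b
    JOIN a ON b.a_id = a.id
    WHERE EXISTS (SELECT 1
                  FROM b b2
                  JOIN a a2 ON b2.a_id = a2.id
                  WHERE a2.value = ? AND b2.c_id = b.c_id)
    GROUP BY a.value
    ORDER BY a.value
""", ("remote",)).fetchall()

print(counts)   # [('full-time', 1), ('part-time', 1), ('remote', 2)]
```

Vacancy 3 does not have the 'remote' value, so its 'full-time' row is not counted.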
So I have two tables like this:
create table A
(
id int,
...
);
create table B
(
id int,
a_id int,
t timestamp,
...
);
A is one-to-many with B
I want to:
SELECT * FROM A LEFT JOIN B ON A.id = B.a_id ???
But I want to return exactly one row for each entry in A, joined to the B row with the newest t field (or null for B's fields if it has no B entry).
That is rather than returning all A-B pairs, I want to only select the newest one with respect to A (or A-null if no B entry).
Is there some way to express this in SQL? (I'm using MySQL 5.5)
LEFT JOIN is only concerned with ensuring every row in A is returned, even if there is no corresponding joined row in B.
Getting just one row requires another condition. MySQL is limited in its options, but one could be:
SELECT
*
FROM
A
LEFT JOIN
B
ON B.a_id = A.id
AND B.t = (SELECT MAX(lookup.t) FROM B AS lookup WHERE lookup.a_id = A.id)
Another could be...
SELECT
*
FROM
A
LEFT JOIN
(
SELECT a_id, MAX(t) AS t FROM B GROUP BY a_id
)
AS lookup
ON lookup.a_id = A.id
LEFT JOIN
B
ON B.a_id = lookup.a_id
AND B.t = lookup.t
You could do the following:
SELECT A.*, B.*
FROM
A
LEFT JOIN
(SELECT B.a_id, MAX(t) as t FROM B GROUP BY B.a_id) BMax
ON A.id = BMax.a_id
LEFT JOIN B
ON B.a_id = BMax.a_id AND B.t = BMax.t
You first need to get the newest t per a_ID from tableB in a subquery, then join it with tableA and tableB.
SELECT a.*, c.*
FROM tableA a
LEFT JOIN
(
SELECT a_ID, max(t) maxT
FROM tableB
GROUP BY a_ID
) b on a.id = b.a_ID
LEFT JOIN tableB c
ON b.a_ID = c.a_ID AND
b.maxT = c.t
Try this:
SELECT *
FROM tableA A LEFT JOIN
(select a_id, max(t) as max_t
from tableB
group by a_id) b
on A.id = b.a_id
LEFT JOIN tableB B2
on B2.a_id = b.a_id
and B2.t = b.max_t
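The derived-table approach can be checked on a toy data set. The sketch below uses Python's sqlite3 rather than MySQL 5.5, stores t as ISO-formatted text so that MAX compares correctly, and uses invented rows and an alias (newest) that is not from the question.

```python
import sqlite3

# Hypothetical rows: A=1 has two B rows, A=2 has one, A=3 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY);
    CREATE TABLE B (id INTEGER PRIMARY KEY, a_id INTEGER, t TEXT);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES (10, 1, '2020-01-01 00:00:00'),
                         (11, 1, '2020-06-01 00:00:00'),   -- newest for A=1
                         (20, 2, '2019-03-01 00:00:00');
""")

# Step 1: find the newest t per a_id. Step 2: LEFT JOIN back to B on (a_id, t).
# Both joins are LEFT JOINs so that A rows without any B row survive.
rows = conn.execute("""
    SELECT A.id, B.id, B.t
    FROM A
    LEFT JOIN (SELECT a_id, MAX(t) AS t FROM B GROUP BY a_id) AS newest
           ON newest.a_id = A.id
    LEFT JOIN B
           ON B.a_id = newest.a_id AND B.t = newest.t
    ORDER BY A.id
""").fetchall()

for row in rows:
    print(row)
# (1, 11, '2020-06-01 00:00:00')
# (2, 20, '2019-03-01 00:00:00')
# (3, None, None)
```

If two B rows for the same a_id share the same newest t, this pattern returns both of them, so a tie-breaker (for example on B.id) would be needed to guarantee exactly one row.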