Get number of duplicate rows resulting from a DISTINCT query - mysql

I have a table with rows where a, b, and c are commonly the same.
I have a query that gives me each unique record. I'm trying to get the count of the duplicate records for each distinct record returned.
SELECT DISTINCT
a,
b,
c,
COUNT(id) as counted
FROM
table
The COUNT here returns the count for all the records. What I was looking for was the count of records identical to the unique record.

SELECT a,b,c,COUNT(*) FROM table GROUP BY a,b,c
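A minimal sketch of the GROUP BY approach, run against an in-memory SQLite database through Python's sqlite3 module rather than MySQL. The table name records and the sample rows are assumptions made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, a TEXT, b TEXT, c TEXT)")

# Hypothetical sample data: the (x, y, z) combination appears twice.
conn.executemany(
    "INSERT INTO records (a, b, c) VALUES (?, ?, ?)",
    [("x", "y", "z"), ("x", "y", "z"), ("x", "y", "w"), ("p", "q", "r")],
)

# One output row per distinct (a, b, c), with the number of rows in that group.
grouped = conn.execute(
    "SELECT a, b, c, COUNT(*) AS counted "
    "FROM records GROUP BY a, b, c ORDER BY a, b, c"
).fetchall()

print(grouped)
```

GROUP BY already collapses identical (a, b, c) combinations, so DISTINCT is not needed here.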

SELECT DISTINCT
a,
b,
c,
(
SELECT
COUNT(id)
FROM
table_name t1
WHERE
t2.a = t1.a
AND t2.b = t1.b
AND t2.c = t1.c
) AS counted
FROM
table_name t2
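The subquery above is known as an inline subquery. In its WHERE clause, t1 and t2 are treated by the query as different tables, even though they are a single table in the database. For each row of t2, the subquery evaluates the equality condition against t1 and counts the matching rows. Because DISTINCT is applied to the outer query, each distinct row is returned only once, together with its count.
I hope this explains it.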

Ah, figured this one out from a duplicate as I was writing the question - I figured I'd share my results as they were different enough from the answer I got mine from.
I have to use a subquery to query the non-distinct records. Then I can use values from the outer query in the subquery's WHERE clause.
SELECT DISTINCT
a,
b,
c,
(
SELECT
COUNT(id)
FROM
table_name t1
WHERE
t2.a = t1.a
AND t2.b = t1.b
AND t2.c = t1.c
) AS counted
FROM
table_name t2
This works. Let me know if there are gaps in my understanding.
With help from this answer: https://stackoverflow.com/a/14110336/1270996
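A rough sketch of the correlated-subquery idea, using Python's sqlite3 module with an in-memory SQLite database instead of MySQL. The sample rows are assumptions. Here the subquery compares a, b, and c, so the count refers to fully identical combinations, and the result is checked against a plain GROUP BY.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (id INTEGER PRIMARY KEY, a TEXT, b TEXT, c TEXT)")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO table_name (a, b, c) VALUES (?, ?, ?)",
    [("x", "y", "z"), ("x", "y", "z"), ("x", "y", "w"), ("p", "q", "r")],
)

# For every outer row (t2), count the inner rows (t1) with the same a, b and c;
# DISTINCT then collapses the repeated outer rows.
correlated = conn.execute(
    """
    SELECT DISTINCT t2.a, t2.b, t2.c,
           (SELECT COUNT(t1.id)
              FROM table_name t1
             WHERE t1.a = t2.a AND t1.b = t2.b AND t1.c = t2.c) AS counted
      FROM table_name t2
     ORDER BY t2.a, t2.b, t2.c
    """
).fetchall()

# The same counts computed with GROUP BY, for comparison.
grouped = conn.execute(
    "SELECT a, b, c, COUNT(*) FROM table_name GROUP BY a, b, c ORDER BY a, b, c"
).fetchall()

print(correlated == grouped)
```

Both forms give the same rows on this data; the GROUP BY version is shorter.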

Related

Create a new variable in SQL by groupby

I have 2 SQL tables as follows:
First table t1:
Second table t2:
I need to calculate the count of "Number" column based on "Name" column from t1 and merge it with t2.
I wrote the following code, but it does not seem to work:
select *
from (
select Name, count(Number) as count
from t1
group by Name ) as a
join ( select *
from t2 ) as b
on a.Name = b.Name;
Can anyone figure out what is wrong? Thank you very much.
I think you want to use SUM() instead of COUNT().
SUM() adds up the integer values, while COUNT() counts the number of occurrences.
And as also stated in the comments, multiple columns with the same name will create conflicts, so you have to select the wanted columns explicitly (which is usually a good idea anyway).
You could reach your end goal with this query:
select
SUM(Number),
t1.Name,
(select val1 FROM t2 WHERE t2.Name = t1.Name LIMIT 1) as val1
FROM t1
GROUP BY t1.Name
Example in sqlfiddle: http://sqlfiddle.com/#!9/04dddf/7
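To make the COUNT() versus SUM() difference concrete, here is a small sketch using Python's sqlite3 module (SQLite, not MySQL). The original tables are not shown in the question, so the rows below and the val1 values are invented for illustration. The sketch also assumes that Name is unique in t2.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (Name TEXT, Number INTEGER);
    CREATE TABLE t2 (Name TEXT, val1 TEXT);
    -- Hypothetical rows; the question's real data is not available.
    INSERT INTO t1 VALUES ('A', 10), ('A', 20), ('B', 5);
    INSERT INTO t2 VALUES ('A', 'first'), ('B', 'second');
    """
)

# Columns are selected explicitly, so the two Name columns cannot clash.
rows = conn.execute(
    """
    SELECT t1.Name,
           COUNT(t1.Number) AS number_count,  -- how many rows per Name
           SUM(t1.Number)   AS number_sum,    -- total of Number per Name
           t2.val1
      FROM t1
      JOIN t2 ON t2.Name = t1.Name
     GROUP BY t1.Name, t2.val1
     ORDER BY t1.Name
    """
).fetchall()

print(rows)
```

For Name 'A' the count is 2 while the sum is 30, which is the distinction the answer points at.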

Select Count from Two Tables = Multiplication in SQL?

I randomly tried running a query like:
select count(*) from table1, table2
The result was essentially multiplication of the actual row count of the two tables, i.e. the result was 645792 rows based on the fact that table1 had 868 rows, and table2 had 744 rows.
Is this expected behaviour? I checked the documentation but could not get a better understanding of it.
This is your from clause:
from table1, table2
This is equivalent to:
from table1 cross join table2
This is a cartesian product of both tables, which generates a resultset containing 868 * 744 rows. Then count(*) just counts the number of resulting rows, hence the result that you are getting.
If you wanted to sum the number of rows in each table, you would compute two separate counts:
select
(select count(*) from table1)
+ (select count(*) from table2) total_no_rows
Your current query:
select count(*) from table1, table2
is using the old-school implicit join syntax. As there are no join criteria in a WHERE clause (there is no WHERE clause at all), the join defaults to a cross join. This is just the cross product between records in the two tables, which is what you are currently seeing. A better way to write your query would be:
SELECT COUNT(*)
FROM table1
CROSS JOIN table2;
'INNER JOIN and , (comma) are semantically equivalent in the absence of a join condition: both produce a Cartesian product' https://dev.mysql.com/doc/refman/8.0/en/join.html
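The arithmetic can be checked with a small sketch. It uses Python's sqlite3 module and an in-memory SQLite database as a stand-in for MySQL; the single id column is an assumption, and only the row counts (868 and 744) come from the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER)")
conn.execute("CREATE TABLE table2 (id INTEGER)")

# Same row counts as in the question: 868 and 744.
conn.executemany("INSERT INTO table1 VALUES (?)", [(i,) for i in range(868)])
conn.executemany("INSERT INTO table2 VALUES (?)", [(i,) for i in range(744)])

# Comma syntax and an explicit CROSS JOIN both build the Cartesian product.
comma_count = conn.execute("SELECT COUNT(*) FROM table1, table2").fetchone()[0]
cross_count = conn.execute(
    "SELECT COUNT(*) FROM table1 CROSS JOIN table2"
).fetchone()[0]

# Adding the two row counts needs two separate scalar subqueries.
total_no_rows = conn.execute(
    "SELECT (SELECT COUNT(*) FROM table1) + (SELECT COUNT(*) FROM table2)"
).fetchone()[0]

print(comma_count, cross_count, total_no_rows)
```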

Explaining MySQL query with multiple tables listed in FROM

Tables a and b are not directly related.
How does listing both a and b affect the results?
select * from a,b where b.id in (1,2,3)
Can you explain this SQL?
Since you haven't specified a relationship between a and b, this produces a cross product. It's equivalent to:
SELECT *
FROM a
CROSS JOIN b
WHERE b.id IN (1, 2, 3)
It will combine every row in a with the selected rows from b. If a has 100 rows and b has exactly one row for each of the ids 1, 2, and 3, the result will be 300 rows.
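A quick sketch of that row count, using Python's sqlite3 module (SQLite rather than MySQL). The table contents are assumptions: a gets 100 rows, and b gets five rows with ids 1 to 5, so exactly three of them pass the filter.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INTEGER)")
conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY)")

# Hypothetical data: 100 rows in a, ids 1..5 in b.
conn.executemany("INSERT INTO a VALUES (?)", [(i,) for i in range(100)])
conn.executemany("INSERT INTO b VALUES (?)", [(i,) for i in range(1, 6)])

# Every row of a is paired with each b row that passes the filter.
implicit = conn.execute(
    "SELECT COUNT(*) FROM a, b WHERE b.id IN (1, 2, 3)"
).fetchone()[0]
explicit = conn.execute(
    "SELECT COUNT(*) FROM a CROSS JOIN b WHERE b.id IN (1, 2, 3)"
).fetchone()[0]

print(implicit, explicit)
```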
What you are using is a multitable SELECT.
Multitable SELECT (M-SELECT) is similar to the join operation. You
select values from different tables, use WHERE clause to limit the
rows returned and send the resulting single table back to the
originator of the query.
The difference with M-SELECT is that it would return multiple tables
as the result set. For more details: https://dev.mysql.com/worklog/task/?id=358
In other words, your query is:
SELECT *
FROM a
CROSS JOIN b
WHERE b.id in (1,2,3)

mysql get all records when looking for duplicates

I want to make a report of all the entries in a table where one column has duplicate entries. Let's assume we have a table like this:
customer_name | some_number
Tom 1
Steve 3
Chris 4
Tim 3
...
I want to show all the records that have some_number as a duplicate. I have used a query like this to show all the duplicate records:
select customer_name, some_number from table where some_number in (select some_number from table group by some_number having count(*) > 1) order by some_number;
This works for a small table, but the one I actually need to operate on is fairly large. 30,000 + rows and it is taking FOREVER! Does someone have a better way to do this?
Thanks!
Try this query:
SELECT t1.*
FROM (SELECT some_number, COUNT(*) AS nb
FROM your_table
GROUP BY some_number
HAVING nb>1
) t2, your_table t1
WHERE t1.some_number=t2.some_number
The query first uses GROUP BY to find duplicate records, then joins with the table to retrieve all fields.
Since HAVING is used, it will return only the records you are interested in, then do the join with your_table.
Be sure your table has an index on some_number if you want the query to be fast.
Does this perform better? It joins on a table of some_number counts and then filters to include only those with a count > 1.
SELECT t.customer_name, t.some_number
FROM my_table t
INNER JOIN (
SELECT some_number, COUNT(*) AS ct
FROM my_table
GROUP BY some_number ) dup ON t.some_number = dup.some_number
WHERE dup.ct > 1
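As an illustration of the join-on-counts approach, here is a sketch using Python's sqlite3 module with an in-memory SQLite database instead of MySQL. It reuses the four sample rows from the question; the index name idx_some_number is an assumption, and the sketch says nothing about timing on a large MySQL table.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (customer_name TEXT, some_number INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("Tom", 1), ("Steve", 3), ("Chris", 4), ("Tim", 3)],
)

# Index on the column used for grouping and joining (hypothetical name).
conn.execute("CREATE INDEX idx_some_number ON my_table (some_number)")

# Join each row to the per-number counts and keep numbers seen more than once.
duplicates = conn.execute(
    """
    SELECT t.customer_name, t.some_number
      FROM my_table t
      INNER JOIN (SELECT some_number, COUNT(*) AS ct
                    FROM my_table
                   GROUP BY some_number) dup
              ON t.some_number = dup.some_number
     WHERE dup.ct > 1
     ORDER BY t.some_number, t.customer_name
    """
).fetchall()

print(duplicates)
```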

Selecting from two tables, with different columns, where one needs a count

I have two tables, TableA and TableB.
I need to select one count value from TableA, based on a where condition.
I need to select two values from TableB.
I'd like all the values in one result set. There will never be more than one row in the result set.
Here's what I have now:
SELECT count(id) FROM TableA WHERE ($some_where_statement) SELECT owner, owner_ID from TableB
I know this should be simple, but this is throwing an error. Any suggestions?
You can cross join to join rows from two unrelated tables:
SELECT T1.cnt, T2.owner, T2.owner_ID
FROM (SELECT count(id) AS cnt FROM TableA WHERE ($some_where_statement)) AS T1
CROSS JOIN (SELECT owner, owner_ID from TableB) AS T2
To have only one row in the result set, it is assumed that both subqueries only return one row. I suspect that this is not the case for the second subquery. You are probably missing a where clause.
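A small sketch of the cross join between two one-row subqueries, using Python's sqlite3 module (SQLite, not MySQL). The status column and the condition status = 'open' are hypothetical stand-ins for $some_where_statement, and TableB is given a single row so that the result has one row.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE TableA (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE TableB (owner TEXT, owner_ID INTEGER);
    -- Hypothetical rows for illustration.
    INSERT INTO TableA (status) VALUES ('open'), ('open'), ('closed');
    INSERT INTO TableB VALUES ('alice', 7);
    """
)

# The count is aliased as cnt so the outer query can refer to T1.cnt.
result = conn.execute(
    """
    SELECT T1.cnt, T2.owner, T2.owner_ID
      FROM (SELECT COUNT(id) AS cnt FROM TableA WHERE status = 'open') AS T1
      CROSS JOIN (SELECT owner, owner_ID FROM TableB) AS T2
    """
).fetchall()

print(result)
```

If TableB held several rows, the cross join would return one row per TableB row, each repeating the same count.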