Find duplicate records with different values - mysql

I know how to write a query that finds duplicate records based on one or more fields, e.g.
Select Customer,Count(*) from Table1 group by Customer,Month having count(*)>1
which would give me a list of all customers who ordered more than once in a given month.
However, from that select I'd like to:
Refine the group to show only dupes where the product is DIFFERENT. I know that if I wanted dupes where the product is the same I'd simply add ',Product' to the group by, but in my case it is Product != Product and I'm not sure how to express that in the group.
Instead of getting a list of just which customers ordered more than one product in a given month, get a list of all those orders. In other words, instead of this type of list from the group:
Bob,December
Mary,June
I am trying to return:
Bob,Widget,December
Bob,Pipes,December
Mary,Books,June
Mary,Cars,June

If the product field is in the same table, you can use COUNT with DISTINCT on the product field to get the number of distinct products:
Select Customer, Month, Count(distinct product)
from Table1
group by Customer, Month
having count(distinct product)>1
If you want to know what they ordered, then join it back as a subquery to your main table:
select distinct t1.customer, t1.month, t1.product from table1 t1
inner join
(Select Customer, Month, Count(distinct product)
from Table1
group by Customer, Month
having count(distinct product)>1
) t2 on t1.customer=t2.customer and t1.month=t2.month
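Whether you need the distinct in the outer select depends on your exact needs: keep it to list each product once per customer and month, or drop it to return every order row.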
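As a way to check the join-back query above, here is a minimal sketch that runs it from Python against an in-memory SQLite database. SQLite stands in for MySQL here, and the sample rows are made up for illustration (Ann is a hypothetical customer who ordered the same product twice, so she should not appear in the result).

```python
import sqlite3

# SQLite is used only as a stand-in for MySQL; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Customer TEXT, Product TEXT, Month TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [
        ("Bob", "Widget", "December"),
        ("Bob", "Pipes", "December"),
        ("Mary", "Books", "June"),
        ("Mary", "Cars", "June"),
        ("Ann", "Widget", "May"),  # duplicate order, but the same product
        ("Ann", "Widget", "May"),
    ],
)

# Inner query: customer/month pairs with more than one distinct product.
# Outer query: join back to list the orders behind those pairs.
rows = conn.execute(
    """
    SELECT DISTINCT t1.Customer, t1.Product, t1.Month
    FROM Table1 t1
    INNER JOIN (
        SELECT Customer, Month
        FROM Table1
        GROUP BY Customer, Month
        HAVING COUNT(DISTINCT Product) > 1
    ) t2 ON t1.Customer = t2.Customer AND t1.Month = t2.Month
    ORDER BY t1.Customer, t1.Product
    """
).fetchall()

for row in rows:
    print(row)
```

Only Bob's and Mary's orders come back, because Ann's two orders share one product.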

Related

Count Distinct on multiple values within same column in SQL Aggregation

Objective:
I want to show the number of distinct IDs for any selected combination.
In the example below, I have data at a granular level: ID-level data.
For this, I use count distinct which will give me '1' for the below combinations.
But suppose I want to find the number of IDs that made both E-commerce and Face to face transactions. If I just use this data, I would be showing the sum of E-comm and Face to face, and the result would be '2' instead of '1'.
And this is not limited to Ecom/Face to face. I want to apply the same logic to all columns.
Please let me know if you have an alternative approach to address this issue.
First aggregate in your table to get the distinct ids for each TranType:
SELECT TranType, COUNT(DISTINCT id) counter_distinct
FROM tablename
GROUP BY TranType
and then join to the table:
SELECT t.*, g.counter_distinct
FROM tablename t
INNER JOIN (
SELECT TranType, COUNT(DISTINCT id) counter_distinct
FROM tablename
GROUP BY TranType
) g ON g.TranType = t.TranType
Or use a correlated subquery:
SELECT t1.*,
(SELECT COUNT(DISTINCT t2.id) FROM tablename t2 WHERE t2.TranType = t1.TranType) counter_distinct
FROM tablename t1
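To illustrate the aggregate-then-join approach above, here is a small sketch run from Python against an in-memory SQLite database. SQLite is only a stand-in for the asker's database, and the four sample rows are hypothetical.

```python
import sqlite3

# Hypothetical ID-level data; SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER, TranType TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [(1, "Ecomm"), (1, "Ecomm"), (2, "Ecomm"), (1, "Face to face")],
)

# Count distinct ids per TranType, then attach that count to every row.
rows = conn.execute(
    """
    SELECT t.id, t.TranType, g.counter_distinct
    FROM tablename t
    INNER JOIN (
        SELECT TranType, COUNT(DISTINCT id) AS counter_distinct
        FROM tablename
        GROUP BY TranType
    ) g ON g.TranType = t.TranType
    ORDER BY t.TranType, t.id
    """
).fetchall()

for row in rows:
    print(row)
```

Every Ecomm row carries the count 2 (ids 1 and 2) even though id 1 appears twice, and the Face to face row carries 1.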
To find the IDs that made both E-commerce and Face to face transactions, you can get the list of ids using:
select id
from t
where tran_type in ('Ecomm', 'Face to face')
group by id
having count(distinct tran_type) = 2;
You can get the count using a subquery:
select count(*)
from (select id
from t
where tran_type in ('Ecomm', 'Face to face')
group by id
having count(distinct tran_type) = 2
) i;
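The two queries above can be tried out with a short sketch like the following, which uses Python's sqlite3 module as a stand-in database. The table contents are hypothetical, including the extra 'ATM' transaction type used to show that other types are ignored.

```python
import sqlite3

# Hypothetical data; SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, tran_type TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        (1, "Ecomm"), (1, "Face to face"),              # both types
        (2, "Ecomm"), (2, "Ecomm"),                     # one type, twice
        (3, "Face to face"),                            # one type only
        (4, "Ecomm"), (4, "Face to face"), (4, "ATM"),  # both, plus another
    ],
)

both_sql = """
    SELECT id
    FROM t
    WHERE tran_type IN ('Ecomm', 'Face to face')
    GROUP BY id
    HAVING COUNT(DISTINCT tran_type) = 2
"""

# List of ids that have both transaction types.
ids = sorted(r[0] for r in conn.execute(both_sql))

# Number of such ids, by wrapping the same query in a subquery.
count = conn.execute(f"SELECT COUNT(*) FROM ({both_sql}) i").fetchone()[0]

print(ids, count)
```

COUNT(DISTINCT tran_type) is what keeps id 2 out: it has two rows but only one distinct type.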

Select entries appearing in all results from another query

I have a single table.
This table has 2 fields, product IDs and Store IDs. The same product ID can exist with many different Store IDs.
I need to find the products (if any) that are common across all the stores.
I'm having difficulty constructing the correct query, any advice?
You can count the distinct store IDs for each product ID. The products whose distinct store count equals the total number of stores are the ones you want.
SELECT productID, COUNT(DISTINCT StoreID) AS stores FROM [Table name] GROUP BY productID
HAVING COUNT(DISTINCT StoreID) = (SELECT COUNT(DISTINCT StoreID) FROM [Table name]);
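A minimal sketch of this check, run from Python against an in-memory SQLite database. The table name stock and the sample rows are assumptions made for illustration; the original question does not name its table.

```python
import sqlite3

# "stock" is a hypothetical table name; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (productID TEXT, StoreID INTEGER)")
conn.executemany(
    "INSERT INTO stock VALUES (?, ?)",
    [
        ("A", 1), ("A", 2), ("A", 3),  # carried by every store
        ("B", 1), ("B", 2),            # missing from store 3
        ("C", 3), ("C", 3),            # listed twice, but in one store only
    ],
)

# A product is common to all stores when its distinct store count
# equals the total number of distinct stores in the table.
common = [
    r[0]
    for r in conn.execute(
        """
        SELECT productID
        FROM stock
        GROUP BY productID
        HAVING COUNT(DISTINCT StoreID) = (SELECT COUNT(DISTINCT StoreID) FROM stock)
        ORDER BY productID
        """
    )
]

print(common)
```

Note that this treats "all stores" as all stores that appear in the table; a store with no products at all would not be counted.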
I'm sure you'll get many better answers, but it sounds like you want the reverse of the DISTINCT clause. I'm not sure this will work, though:
SELECT NOT DISTINCT [Product_ID]
FROM TABLENAMEHERE
-- Note: SELECT NOT DISTINCT is not valid SQL, so this query fails with a syntax error.
You could use count(distinct store):
select productID
from my_table
group by productID
having count(distinct store) = (
select count(distinct store)
from my_table )

MYSQL - Count and GROUP from table1 and get info from table2

I am struggling with a MySQL query which I can't get to work as I want.
In table1 I have co_id, name, code, product, logindate.
In table2 I have pr_id, productname, productno, price.
I want to count and group the PRODUCT from table1, so I can see how many people have picked, for example, product 1, 2, 3, etc.
But when I list the result on the page I will need productname and productno for each id number in the GROUP search. table1.product is joined with table2.pr_id.
This is what I have so far, but I think I am missing something with INNER JOIN or similar, right?
SELECT
codes.pickedgift,
products.productno,
products.productname,
COUNT(codes.pickedgift) as num
FROM
codes,
products
GROUP BY codes.pickedgift
ORDER BY codes.pickedgift
You are missing the join condition. When you join two tables, you should link the primary key in one table to its foreign key in the other, so your query can be:
SELECT
codes.pickedgift,
products.productno,
products.productname,
COUNT(codes.pickedgift) as num
FROM
codes INNER JOIN products ON codes.pickedgift = products.pr_id
GROUP BY codes.pickedgift
ORDER BY codes.pickedgift
You should use a sub-select for this query.
-- assuming I have your table structure correct.
SELECT p.productno, p.productname, num
FROM (SELECT codes.pickedgift, COUNT(codes.pickedgift) as num
FROM codes
GROUP BY codes.pickedgift) g
JOIN products p ON p.pr_id = g.pickedgift
ORDER BY g.pickedgift
The other thing to make sure of when using GROUP BY is that the fields in your select are either fields in the GROUP BY or aggregates. MySQL lets you include columns that are neither (when the ONLY_FULL_GROUP_BY SQL mode is disabled), but it then becomes ambiguous which productno and productname values are returned, which is why I opted for a sub-select instead.
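The sub-select approach can be sketched as follows, run from Python against an in-memory SQLite database standing in for MySQL. The table and column names (codes.pickedgift, products.pr_id) are taken from the question's query and description, so they are assumptions about the real schema, and the sample rows are made up.

```python
import sqlite3

# Schema inferred from the question; data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (pr_id INTEGER PRIMARY KEY, productno TEXT, productname TEXT)")
conn.execute("CREATE TABLE codes (pickedgift INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, "P-001", "Mug"), (2, "P-002", "Pen"), (3, "P-003", "Bag")],
)
conn.executemany(
    "INSERT INTO codes VALUES (?)",
    [(1,), (1,), (2,), (1,), (2,)],
)

# Count per picked product first, then join the counts to the product
# details, so no non-aggregated column sits next to a GROUP BY.
rows = conn.execute(
    """
    SELECT p.productno, p.productname, g.num
    FROM (
        SELECT pickedgift, COUNT(pickedgift) AS num
        FROM codes
        GROUP BY pickedgift
    ) g
    JOIN products p ON p.pr_id = g.pickedgift
    ORDER BY g.pickedgift
    """
).fetchall()

for row in rows:
    print(row)
```

Product 3 does not appear because nobody picked it; a LEFT JOIN starting from products would be needed to list it with a count of zero.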

MySQL - Group and total, but return all rows in each group

I'm trying to write a query that finds every occurrence of the same person in my table within a specific date range. It then groups by this person and totals their spending for that range. If their total spending is greater than X, it should return each and every row for this person within the date range, not just the grouped total. This is what I have so far:
SELECT member_id,
SUM(amount) AS total
FROM `sold_items`
GROUP BY member_id
HAVING total > 50
This is retrieving the correct total and returning members spending over $50, but not each and every row. Just the total for each member and their grand total. I'm currently querying the whole table, I didn't add in the date ranges yet.
JOIN this subquery with the original table:
SELECT si1.*
FROM sold_items AS si1
JOIN (SELECT member_id
FROM sold_items
GROUP BY member_id
HAVING SUM(amount) > 50) AS si2
ON si1.member_id = si2.member_id
The general rule is that the subquery groups by the same column(s) that it selects, and then you join it with the original table on those same columns.
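A minimal sketch of this group-then-join-back pattern, run from Python against an in-memory SQLite database as a stand-in for MySQL. The sample amounts are hypothetical.

```python
import sqlite3

# Hypothetical sales rows; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sold_items (member_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sold_items VALUES (?, ?)",
    [(1, 30), (1, 25), (2, 10), (2, 15), (3, 60)],
)

# The subquery picks members whose total exceeds 50; joining back on
# member_id returns every individual row for those members.
rows = conn.execute(
    """
    SELECT si1.member_id, si1.amount
    FROM sold_items AS si1
    JOIN (
        SELECT member_id
        FROM sold_items
        GROUP BY member_id
        HAVING SUM(amount) > 50
    ) AS si2 ON si1.member_id = si2.member_id
    ORDER BY si1.member_id, si1.amount
    """
).fetchall()

for row in rows:
    print(row)
```

Member 2 totals 25 and is dropped, while both of member 1's rows are returned even though neither exceeds 50 on its own.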
SELECT member_id, amount
FROM sold_items si
INNER JOIN (SELECT member_id,
SUM(amount) AS total
FROM `sold_items`
GROUP BY member_id
HAVING total > 50) spenders USING (member_id)
The query you have already built can be used as a temporary table to join with. If member_id is not indexed on the table, this will become slow as the table grows.
The word spenders is a table alias; you can use any valid alias in its place.
There are a few syntaxes that will get the result you are looking for. Here is one using an implicit inner join to ensure that all rows returned have a member_id in the list returned by the group by, and that the total is repeated on every row a given member has:
SELECT si.*, gb.total from sold_items as si, (SELECT member_id as mid,
SUM(amount) AS total
FROM `sold_items`
GROUP BY member_id
HAVING total > 50) as gb where gb.mid=si.member_id;
I think that this might help:
SELECT
member_id,
SUM(amount) AS amount_value,
'TOTAL' as amount_type
FROM
`sold_items`
GROUP BY
member_id
HAVING
SUM(amount) > 50
UNION ALL
SELECT
`sold_items`.member_id,
amount AS amount_value,
'DETAILED' as amount_type
FROM
`sold_items`
INNER JOIN
(
SELECT
A.member_id,
SUM(amount) AS total
FROM
`sold_items` A
GROUP BY
member_id
HAVING
total <= 50
) AS A
ON `sold_items`.member_id = A.member_id
Results of the above query should be like the following:
member_id   amount_value   amount_type
======================================
1           55             TOTAL
2           10             DETAILED
2           15             DETAILED
2           10             DETAILED
The amount_type column distinguishes the two member groups.
You could use a subquery with EXISTS as an alternative:
select *
from sold_items t1
where exists (
select 1 from sold_items t2
where t1.member_id=t2.member_id
group by t2.member_id
having sum(t2.amount)>50
)
ref: http://dev.mysql.com/doc/refman/5.7/en/exists-and-not-exists-subqueries.html
In case you need to group by multiple columns, you can use a composite identifier built with concat, in combination with a group by subquery. Note that key and group are reserved words in MySQL and must be quoted with backticks.
select id, `key`, language, `group`
from translation
-- query all key-language entries by composite identifier...
where concat(`key`, '_', language) in (
-- by lookup of all key-language combinations...
select concat(`key`, '_', language)
from translation
group by `key`, language
-- that occur more than once
having count(*) > 1
)
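One caveat with a concatenated identifier is that different pairs can produce the same string (for example key 'a_b' with language 'c' and key 'a' with language 'b_c'). As an alternative sketch, not the approach from the answer above, the grouped subquery can be joined on both columns instead. The example runs from Python against an in-memory SQLite database; the sample rows are hypothetical, and the group column is left out to keep it short.

```python
import sqlite3

# Hypothetical translation rows; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE translation (id INTEGER, "key" TEXT, language TEXT)')
conn.executemany(
    "INSERT INTO translation VALUES (?, ?, ?)",
    [
        (1, "hello", "en"),
        (2, "hello", "en"),  # same key and language as id 1
        (3, "hello", "de"),
        (4, "bye", "en"),
    ],
)

# Find key/language pairs that occur more than once, then join on both
# columns to return every row belonging to such a pair.
ids = [
    r[0]
    for r in conn.execute(
        """
        SELECT t.id
        FROM translation t
        JOIN (
            SELECT "key", language
            FROM translation
            GROUP BY "key", language
            HAVING COUNT(*) > 1
        ) d ON d."key" = t."key" AND d.language = t.language
        ORDER BY t.id
        """
    )
]

print(ids)
```

In SQLite the identifier is quoted with double quotes; in MySQL the equivalent would be backticks around key.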

Alternative for a loop

I am trying to generate a simple report that displays the number of customers owning each number of distinct brands. The following query I wrote generates the desired numbers one at a time. I tried writing a loop, but it takes forever. Is there an alternative?
SELECT COUNT(DISTINCT customer_id)
FROM
(
SELECT customer_id,COUNT(DISTINCT brand) AS no_of_customers
FROM table_A
WHERE brand_id != 10
GROUP BY customer_id
HAVING COUNT(DISTINCT brand) =1
ORDER BY customer_id) as t1;
What this does is to give me a count of customers with a total count of distinct brands =1. I change the count of brands to 2,3 and so on. Please let me know if there is a way to automate this.
Thanks a lot.
Use a second level of GROUP BY to get them all in one query, rather than looping.
SELECT no_of_brands, COUNT(*) no_of_customers
FROM (SELECT customer_id, COUNT(DISTINCT brand) no_of_brands
FROM Table_A
WHERE brand_id != 10
GROUP BY customer_id) x
GROUP BY no_of_brands
You also don't need DISTINCT in your outer query, since the inner query's grouping guarantees that the customer IDs will be distinct.
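The two-level GROUP BY can be checked with a small sketch like this one, run from Python against an in-memory SQLite database. The column layout (customer_id, brand, brand_id) follows the question's query, and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical ownership rows; SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_A (customer_id INTEGER, brand TEXT, brand_id INTEGER)")
conn.executemany(
    "INSERT INTO table_A VALUES (?, ?, ?)",
    [
        (1, "X", 1), (1, "X", 1),                # one distinct brand
        (2, "X", 1), (2, "Y", 2),                # two distinct brands
        (3, "Y", 2),                             # one distinct brand
        (4, "Z", 10),                            # only the excluded brand
        (5, "X", 1), (5, "Y", 2), (5, "Z", 10),  # two after excluding Z
    ],
)

# Inner level: distinct brands per customer (brand_id 10 excluded).
# Outer level: how many customers share each brand count.
rows = conn.execute(
    """
    SELECT no_of_brands, COUNT(*) AS no_of_customers
    FROM (
        SELECT customer_id, COUNT(DISTINCT brand) AS no_of_brands
        FROM table_A
        WHERE brand_id != 10
        GROUP BY customer_id
    ) x
    GROUP BY no_of_brands
    ORDER BY no_of_brands
    """
).fetchall()

for row in rows:
    print(row)
```

Customer 4 owns only the excluded brand, so the WHERE clause removes them before grouping and they are not counted in any bucket.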