How to count duplicated items? - mysql

No customer should have a duplicated code. As the result below shows, Customer A has a duplicated Code of 22 and Customer D has a duplicated Code of 44.
I would like to run a query that returns the number of duplications; for the result below it should be 4. I have tried GROUP BY Code with HAVING, but without much luck.
customer Code
------ ---------
A 11
A 22
A 22
B 33
C 22
D 44
D 44
D 44
D 22

We can group by customer and code, and keep only the combinations that occur on more than one row:
create table t(
customer char(1),
Code int);
insert into t values
('A', 11),
('A', 22),
('A', 22),
('B', 33),
('C', 22),
('D', 44),
('D', 44),
('D', 44),
('D', 22);
SELECT
  customer,
  code,
  COUNT(*) AS "number"
FROM t
GROUP BY
  customer,
  code
HAVING
  COUNT(*) > 1;
customer | code | number
:------- | ---: | -----:
A | 22 | 2
D | 44 | 3
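To get a single number rather than one row per duplicated pair, the grouped query can be wrapped in an outer aggregate. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the column aliases (duplicated_pairs, rows_involved, surplus_rows) are made up here. It shows three possible readings of "number of duplications". None of them gives 4 for the sample shown, so which definition is wanted would need confirming.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (customer TEXT, code INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("A", 11), ("A", 22), ("A", 22), ("B", 33), ("C", 22),
     ("D", 44), ("D", 44), ("D", 44), ("D", 22)],
)

# Inner query: one row per (customer, code) pair that occurs more than once.
# Outer query: collapse those rows into three candidate "duplicate counts".
summary = conn.execute("""
    SELECT COUNT(*)   AS duplicated_pairs,  -- pairs that repeat
           SUM(n)     AS rows_involved,     -- every row belonging to such a pair
           SUM(n - 1) AS surplus_rows       -- rows beyond the first of each pair
    FROM (
        SELECT COUNT(*) AS n
        FROM t
        GROUP BY customer, code
        HAVING COUNT(*) > 1
    ) AS d
""").fetchone()

print(summary)  # (2, 5, 3)
```

The same outer SELECT should work unchanged in MySQL, since it only relies on a derived table and standard aggregates.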

Related

Using time interval in table for select in another

I am using a MySQL database with 2 tables:
In one table I have BatchNum and Time_Stamp. In another I have ErrorCode and Time_Stamp.
My goal is to use timestamps in one table as the beginning and end of an interval within which I'd like to select in another table. I would like to select the beginning and end of intervals within which the BatchNum is constant.
CREATE TABLE Batch (BatchNum INT, Time_Stamp DATETIME);
INSERT INTO Batch VALUES (1,'2020-12-17 07:29:36'), (1, '2020-12-17 08:31:56'), (1, '2020-12-17 08:41:56'), (2, '2020-12-17 09:31:13'), (2, '2020-12-17 10:00:00'), (2, '2020-12-17 10:00:57'), (2, '2020-12-17 10:01:57'), (3, '2020-12-17 10:47:59'), (3, '2020-12-17 10:48:59'), (3, '2020-12-17 10:50:59');
CREATE TABLE Errors (ErrorCode INT, Time_Stamp DATETIME);
INSERT INTO Errors VALUES (10, '2020-12-17 07:29:35'), (11, '2020-12-17 07:30:00'), (12, '2020-12-17 07:30:35'), (10, '2020-12-17 07:30:40'), (22, '2020-12-17 10:01:45'), (23, '2020-12-17 10:48:00');
In my example, I would like something like SELECT BatchNum, ErrorCode, Errors.Time_Stamp WHERE Errors.Time_Stamp BETWEEN beginning_of_batch AND end_of_batch:
+----------+-----------+---------------------+
| BatchNum | ErrorCode | Errors.Time_Stamp |
+----------+-----------+---------------------+
| 1 | 11 | 2020-12-17 07:30:00 |
| 1 | 12 | 2020-12-17 07:30:35 |
| 1 | 10 | 2020-12-17 07:30:40 |
| 2 | 22 | 2020-12-17 10:01:45 |
| 3 | 23 | 2020-12-17 10:48:00 |
+----------+-----------+---------------------+
I am using an answer from a previous question:
Select on value change
to find BatchNum changes but I don't know how to include this in another select to get the ErrorCodes happening within the interval defined by BatchNum changes.
I think you want the following (window functions such as lead() need MySQL 8.0 or later):
select b.*, e.ErrorCode, e.time_stamp as error_timestamp
from (
    select b.*,
        lead(time_stamp) over(order by time_stamp) lead_time_stamp
    from batch b
) b
inner join errors e
    on e.time_stamp >= b.time_stamp
    and (e.time_stamp < b.lead_time_stamp or b.lead_time_stamp is null)
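As a quick check of the lead() approach against the sample data, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (the bundled SQLite must be 3.25 or newer for window functions), stores the timestamps as text, and uses an illustrative alias next_time_stamp that is not part of the original answer.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption); timestamps are stored as
# ISO-8601 text, which compares correctly as plain strings.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Batch (BatchNum INT, Time_Stamp TEXT);
    INSERT INTO Batch VALUES
        (1, '2020-12-17 07:29:36'), (1, '2020-12-17 08:31:56'),
        (1, '2020-12-17 08:41:56'), (2, '2020-12-17 09:31:13'),
        (2, '2020-12-17 10:00:00'), (2, '2020-12-17 10:00:57'),
        (2, '2020-12-17 10:01:57'), (3, '2020-12-17 10:47:59'),
        (3, '2020-12-17 10:48:59'), (3, '2020-12-17 10:50:59');
    CREATE TABLE Errors (ErrorCode INT, Time_Stamp TEXT);
    INSERT INTO Errors VALUES
        (10, '2020-12-17 07:29:35'), (11, '2020-12-17 07:30:00'),
        (12, '2020-12-17 07:30:35'), (10, '2020-12-17 07:30:40'),
        (22, '2020-12-17 10:01:45'), (23, '2020-12-17 10:48:00');
""")

# Each Batch row opens an interval that ends at the next Batch row's timestamp;
# an error is attributed to the BatchNum of the interval it falls into.
rows = conn.execute("""
    SELECT b.BatchNum, e.ErrorCode, e.Time_Stamp
    FROM (
        SELECT BatchNum, Time_Stamp,
               LEAD(Time_Stamp) OVER (ORDER BY Time_Stamp) AS next_time_stamp
        FROM Batch
    ) AS b
    JOIN Errors AS e
      ON e.Time_Stamp >= b.Time_Stamp
     AND (e.Time_Stamp < b.next_time_stamp OR b.next_time_stamp IS NULL)
    ORDER BY e.Time_Stamp
""").fetchall()

for row in rows:
    print(row)
```

The error logged at 07:29:35 is dropped because it precedes the first Batch row, which matches the expected output in the question.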

SQL - get distinct values and its frequency count of occurring in a group

I have table data as follows:
CREATE TABLE SERP (
id INT(6) UNSIGNED AUTO_INCREMENT PRIMARY KEY,
s_product_id INT,
search_product_result VARCHAR(255)
);
INSERT INTO SERP(s_product_id, search_product_result)
VALUES
(0, 'A'),
(0, 'B'),
(0, 'C'),
(0, 'D'),
(1, 'A'),
(1, 'E'),
(2, 'A'),
(2, 'B'),
(3, 'D'),
(3, 'E'),
(3, 'D');
The data set is as follows:
s_product_id | search_product_result
___________________________________________
0 | A
0 | B
0 | C
0 | D
-------------------------------------------
1 | A
1 | E
-------------------------------------------
2 | A
2 | B
-------------------------------------------
3 | D
3 | E
3 | D
I need to list all distinct search_product_result values and count frequencies of these values occurring in s_product_id.
Required Output result-set:
DISTINCT_SEARCH_PRODUCT | s_product_id_frequency_count
------------------------------------------------------------
A | 3
B | 2
C | 1
D | 2 [occurred twice in 3, but counted only once.]
E | 2
Here, A occurs in three s_product_id : 0, 1, 2, B in two : 0, 2, and so on.
D occurred twice in the same group 3, but is counted only once for that group.
I tried grouping by search_product_result, but this counts D twice in the same group.
select search_product_result, count(*) as Total from serp group by search_product_result
Output:
search_product_result | Total
------------------------------------
A | 3
B | 2
C | 1
D | 3 <---
E | 2
You can try the query below, which uses count(distinct s_product_id):
select search_product_result, count(distinct s_product_id) as Total
from serp group by search_product_result
Use count(distinct ...):
select search_product_result, count(distinct s_product_id, search_product_result) as Total
from SERP
group by search_product_result
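To see the difference between count(*) and count(distinct s_product_id) on the sample data, here is a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL; the table definition is adapted to SQLite syntax and the alias Total follows the query above.

```python
import sqlite3

# SQLite stands in for MySQL (assumption); the schema is adapted accordingly.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE SERP (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        s_product_id INT,
        search_product_result TEXT
    )
""")
conn.executemany(
    "INSERT INTO SERP (s_product_id, search_product_result) VALUES (?, ?)",
    [(0, "A"), (0, "B"), (0, "C"), (0, "D"), (1, "A"), (1, "E"),
     (2, "A"), (2, "B"), (3, "D"), (3, "E"), (3, "D")],
)

# COUNT(DISTINCT s_product_id) counts each group once per result value,
# so the two 'D' rows in group 3 contribute only one to the total.
rows = conn.execute("""
    SELECT search_product_result,
           COUNT(DISTINCT s_product_id) AS Total
    FROM SERP
    GROUP BY search_product_result
    ORDER BY search_product_result
""").fetchall()

print(rows)  # [('A', 3), ('B', 2), ('C', 1), ('D', 2), ('E', 2)]
```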

Update a table with conditions in specific order

I have searched the web for my problem and tested some subqueries and derived-table approaches with CASE statements, but didn't get the result. Perhaps you can help? Thanks.
The tables below are just an example.
# generate the table as it is
DROP TABLE IF EXISTS `IN`;
CREATE TABLE `IN`
(`Part` CHAR(1),
`Warehouse` INT(1),
`Percentage` INT(1),
`Update` INT(1));
#some values for the table
INSERT INTO `IN`
(Part, Warehouse, Percentage)
VALUES
('A' , 1, 80),
('A', 2, 100),
('A', 3, 50),
('B', 1, 100),
('B', 2, 50),
('B', 3, 100);
# generate table as it should be
DROP TABLE IF EXISTS `OUT`;
CREATE TABLE `OUT`
(`Part` CHAR(1),
`Warehouse` INT(1),
`Percentage` INT(1),
`Update` INT(1));
# values for the table
INSERT INTO `OUT`
(Part, Warehouse, Percentage, `Update`)
VALUES
('A' , 1, 80, 2),
('A', 2, 100, 2),
('A', 3, 50, 2),
('B', 1, 100, 3),
('B', 2, 50, 3),
('B', 3, 100, 3);
I would like to write the warehouse number into the Update column for each part when that warehouse's percentage is 100.
The warehouse value should be filled into every row for that part.
The Update column should be filled in a specific order of precedence.
First check whether warehouse 3 has 100 and, if so, take that value. If warehouse 3 only has 50, check whether warehouse 2 has 100.
Thank you very much!
Here's one way...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(Part CHAR(1)
,Warehouse INT NOT NULL
,Percentage TINYINT NOT NULL
);
INSERT INTO my_table
(Part, Warehouse, Percentage)
VALUES
('A' , 1, 80),
('A', 2, 100),
('A', 3, 50),
('B', 1, 100),
('B', 2, 50),
('B', 3, 100);
SELECT w1.*, COALESCE(w3.warehouse,w2.warehouse,w1.warehouse) warehouse
FROM my_table w1
LEFT
JOIN my_table w2
ON w2.part = w1.part
AND w2.warehouse = 2
AND w2.percentage = 100
LEFT
JOIN my_table w3
ON w3.part = w1.part
AND w3.warehouse = 3
AND w3.percentage = 100;
+------+-----------+------------+-----------+
| Part | Warehouse | Percentage | warehouse |
+------+-----------+------------+-----------+
| A | 1 | 80 | 2 |
| A | 2 | 100 | 2 |
| A | 3 | 50 | 2 |
| B | 1 | 100 | 3 |
| B | 2 | 50 | 3 |
| B | 3 | 100 | 3 |
+------+-----------+------------+-----------+
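Since the question asks for an UPDATE rather than a SELECT, here is one possible sketch of that step. It rests on several assumptions: Python's built-in sqlite3 module stands in for MySQL, the table and column names (parts, upd) are hypothetical, and "check warehouse 3, then 2" is read as "take the highest-numbered warehouse that has 100". Parts with no warehouse at 100 are left NULL, unlike the COALESCE fallback above.

```python
import sqlite3

# SQLite stands in for MySQL (assumption); 'parts' and 'upd' are made-up names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parts (part TEXT, warehouse INT, percentage INT, upd INT);
    INSERT INTO parts (part, warehouse, percentage) VALUES
        ('A', 1, 80), ('A', 2, 100), ('A', 3, 50),
        ('B', 1, 100), ('B', 2, 50), ('B', 3, 100);
""")

# For every row, look up the highest-numbered warehouse of the same part
# whose percentage is 100 and store it in the upd column.
conn.execute("""
    UPDATE parts
    SET upd = (
        SELECT MAX(p2.warehouse)
        FROM parts AS p2
        WHERE p2.part = parts.part
          AND p2.percentage = 100
    )
""")

rows = conn.execute(
    "SELECT part, warehouse, percentage, upd FROM parts ORDER BY part, warehouse"
).fetchall()
for row in rows:
    print(row)
```

MySQL generally rejects a subquery that reads the table being updated, so there the same idea is usually written as an UPDATE joined to a derived table that computes the MAX(warehouse) per part.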

Left join that will exclude certain rows

When I left join the following tables, I get results for all the ids. I need to exclude the results for any messageid that has no entry at all in the sms table.
So the expected output is as follows:
+-----------+-----------+
| messageid | mobilenos |
+-----------+-----------+
| a | 12 |
| c | 31 |
+-----------+-----------+
2 rows in set (0.00 sec)
The messageid "d" should not be displayed in the output because there is not a single entry for "d" in the sms table.
I would like to know if the following query is correct or if there is a better way:
select a.* from splitvalues as a
left join sms as b on a.messageid = b.batchid and a.mobilenos = b.destination
left join (select a.messageid from splitvalues as a left join sms as b on a.messageid = b.batchid where b.batchid is null) as dt on dt.messageid = a.messageid where dt.messageid is null and b.destination is null;
Following are the table details:
splitvalues
messageid mobilenos
a 10
a 11
a 12
b 20
b 21
b 22
b 23
b 24
c 30
c 31
d 40
d 41
d 42
d 43
sms
batchid destination
a 10
a 11
b 20
b 21
b 22
b 23
b 24
c 30
drop table if exists splitvalues;
drop table if exists sms;
create table if not exists splitvalues (messageid varchar(255), mobilenos int);
create table if not exists sms (batchid varchar(255), destination int);
insert into splitvalues values ('a', 10), ('a', 11), ('a', 12), ('b', 20), ('b', 21), ('b', 22), ('b', 23), ('b', 24), ('c', 30), ('c', 31), ('d', 40), ('d', 41), ('d', 42), ('d', 43);
insert into sms values ('a', 10), ('a', 11), ('b', 20), ('b', 21), ('b', 22), ('b', 23), ('b', 24), ('c', 30);
mysql> select a.* from splitvalues as a left join sms as b on a.messageid = b.batchid and a.mobilenos = b.destination where b.destination is null;
+-----------+-----------+
| messageid | mobilenos |
+-----------+-----------+
| a | 12 |
| c | 31 |
| d | 40 |
| d | 41 |
| d | 42 |
| d | 43 |
+-----------+-----------+
6 rows in set (0.00 sec)
Try this:
select a.* from [dbo].[splitvalues] a join [dbo].[sms] b on a.messageid=b.batchid
Or
select a.* from [dbo].[splitvalues] a ,[dbo].[sms] b where a.messageid=b.batchid
Try an inner join; it produces the rows that exist in both tables:
select * from splitvalues as a
inner join sms as b on a.messageid = b.batchid and a.mobilenos = b.destination
select * from
splitvalues
where mobilenos not in(select destination from sms) limit 2;
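Another way to express the requirement, offered as a sketch rather than a definitive answer: keep the anti-join from the question (rows of splitvalues with no matching sms row) and add an EXISTS check that the messageid appears in sms at least once. The example assumes Python's built-in sqlite3 module as a stand-in for MySQL; the alias s is illustrative.

```python
import sqlite3

# SQLite stands in for MySQL (assumption); data copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE splitvalues (messageid TEXT, mobilenos INT);
    CREATE TABLE sms (batchid TEXT, destination INT);
    INSERT INTO splitvalues VALUES
        ('a', 10), ('a', 11), ('a', 12),
        ('b', 20), ('b', 21), ('b', 22), ('b', 23), ('b', 24),
        ('c', 30), ('c', 31),
        ('d', 40), ('d', 41), ('d', 42), ('d', 43);
    INSERT INTO sms VALUES
        ('a', 10), ('a', 11),
        ('b', 20), ('b', 21), ('b', 22), ('b', 23), ('b', 24),
        ('c', 30);
""")

# LEFT JOIN ... IS NULL keeps numbers that were never sent;
# EXISTS drops message ids that have no sms row at all (here: 'd').
rows = conn.execute("""
    SELECT a.messageid, a.mobilenos
    FROM splitvalues AS a
    LEFT JOIN sms AS b
           ON a.messageid = b.batchid
          AND a.mobilenos = b.destination
    WHERE b.destination IS NULL
      AND EXISTS (SELECT 1 FROM sms AS s WHERE s.batchid = a.messageid)
    ORDER BY a.messageid, a.mobilenos
""").fetchall()

print(rows)  # [('a', 12), ('c', 31)]
```

Unlike the plain inner joins, this returns the unmatched numbers, and unlike the NOT IN ... LIMIT 2 variant it does not depend on how many rows happen to come first.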

Detecting near duplicates above a threshold

I want to be able to query a table for records I suspect may be nearly duplicates.
I've racked my brains but can't think where to begin with this one, so I've simplified the problem as much as possible, and came to ask here!
Here's my simplified table:
CREATE TABLE sales
(
`id1` int auto_increment primary key,
`amount` decimal(6,2),
`date` datetime
);
Here's some test values:
INSERT INTO sales
(`amount`, `date`)
VALUES
(10, '2013-05-15T11:11:00'),
(11, '2013-05-15T11:11:11'),
(20, '2013-05-15T11:22:00'),
(3, '2013-05-15T12:12:00'),
(4, '2013-05-15T12:12:12'),
(45, '2013-05-15T12:22:00'),
(4, '2013-05-15T12:24:00'),
(8, '2013-05-15T13:00:00'),
(9, '2013-05-15T13:01:00'),
(10, '2013-05-15T14:00:00');
The problem
I want to return sales above amount Y that have neighbouring sales above Y recorded within X minutes of each other.
i.e., from this data:
amt, date
(10, '2013-05-15T11:11:00'),
(11, '2013-05-15T11:11:11'),
(20, '2013-05-15T11:22:00'),
(3, '2013-05-15T12:12:00'),
(4, '2013-05-15T12:12:12'),
(45, '2013-05-15T12:22:00'),
(4, '2013-05-15T12:24:00'),
(8, '2013-05-15T13:00:00'),
(9, '2013-05-15T13:01:00'),
(10, '2013-05-15T14:00:00');
where @yVal = 5 and @xMins = 10
expected result would be:
(10, '2013-05-15T11:11:00'),
(11, '2013-05-15T11:11:11'),
(20, '2013-05-15T11:22:00'),
(8, '2013-05-15T13:00:00'),
(9, '2013-05-15T13:01:00'),
I've put the above into a fiddle: http://sqlfiddle.com/#!2/cf8fe
Any help will be greatly appreciated!
Try something like this:
SELECT DISTINCT s1.* FROM sales s1
LEFT JOIN sales s2
ON (s1.id1 != s2.id1
    AND s1.amount >= s2.amount - @xVal AND s1.amount <= s2.amount + @xVal
    AND s1.date >= DATE_SUB(s2.date, INTERVAL @xMins minute) AND s1.date <= DATE_ADD(s2.date, INTERVAL @xMins minute)
)
WHERE
s2.id1 is not null
Update: fixed some errors.
The result for your data looks like:
+-----+--------+---------------------+
| id1 | amount | date |
+-----+--------+---------------------+
| 1 | 10.00 | 2013-05-15 11:11:00 |
| 2 | 11.00 | 2013-05-15 11:11:11 |
| 4 | 3.00 | 2013-05-15 12:12:00 |
| 5 | 4.00 | 2013-05-15 12:12:12 |
| 8 | 8.00 | 2013-05-15 13:00:00 |
| 9 | 9.00 | 2013-05-15 13:01:00 |
+-----+--------+---------------------+
Update 2
SELECT DISTINCT s1.* FROM sales s1
LEFT JOIN sales s2
ON (s1.id1 != s2.id1
    AND s2.amount >= @yVal
    AND s1.date >= DATE_SUB(s2.date, INTERVAL @xMins minute) AND s1.date <= DATE_ADD(s2.date, INTERVAL @xMins minute)
)
WHERE
s2.id1 is not null
AND s1.amount >= @yVal
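To check the threshold-plus-time-window idea against the sample data, here is a minimal sketch. It makes several assumptions: Python's built-in sqlite3 module stands in for MySQL, the self-join with DISTINCT is swapped for an equivalent EXISTS, the time difference is computed with SQLite's strftime('%s', ...) instead of DATE_SUB/DATE_ADD, and the parameter names y_val and x_mins are made up to mirror the variables in the question.

```python
import sqlite3

# SQLite stands in for MySQL (assumption); data copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id1 INTEGER PRIMARY KEY AUTOINCREMENT, amount NUMERIC, date TEXT)"
)
conn.executemany(
    "INSERT INTO sales (amount, date) VALUES (?, ?)",
    [(10, "2013-05-15T11:11:00"), (11, "2013-05-15T11:11:11"),
     (20, "2013-05-15T11:22:00"), (3, "2013-05-15T12:12:00"),
     (4, "2013-05-15T12:12:12"), (45, "2013-05-15T12:22:00"),
     (4, "2013-05-15T12:24:00"), (8, "2013-05-15T13:00:00"),
     (9, "2013-05-15T13:01:00"), (10, "2013-05-15T14:00:00")],
)

params = {"y_val": 5, "x_mins": 10}  # hypothetical names for the two thresholds

# A sale qualifies if it is at or above the amount threshold and some OTHER
# sale at or above the threshold lies within x_mins minutes of it.
rows = conn.execute("""
    SELECT s1.id1, s1.amount, s1.date
    FROM sales AS s1
    WHERE s1.amount >= :y_val
      AND EXISTS (
          SELECT 1
          FROM sales AS s2
          WHERE s2.id1 <> s1.id1
            AND s2.amount >= :y_val
            AND ABS(strftime('%s', s1.date) - strftime('%s', s2.date)) <= :x_mins * 60
      )
    ORDER BY s1.id1
""", params).fetchall()

print([r[0] for r in rows])  # [1, 2, 8, 9]
```

With a strict 10-minute window the sale of 20 at 11:22:00 is not returned, because it is 10 minutes 49 seconds after the sale at 11:11:11; a slightly wider window would be needed to reproduce the expected result listed in the question.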