Remove duplicates based on three columns in MySQL

I am trying to remove duplicate rows based on three columns. The query should not look for duplicates in a single column; it should concatenate the three columns and then remove duplicate rows based on the merged column.
I would appreciate it if someone could share the easiest way of achieving this. I know this is not the appropriate way, but I tried the following and it is not working:
Select concat(col1, col2, col3,) as newCol distinct newcCol from Table2
I know how to remove duplicates based on multiple columns using Excel VBA, but I do not know how to achieve this in MySQL:
Sub DelDupl()
Range("A1").CurrentRegion.RemoveDuplicates Columns:=Array(1, 2, 3), Header:=xlYes
End Sub
Table name is Table2 in Mysql
Sample Data
CREATE TABLE Table2(
col1 INT,
col2 varchar(10),
col3 INT,
col4 varchar(10),
col5 varchar(10),
col6 varchar(10),
col7 varchar(10));
INSERT INTO Table2 (col1,col2,col3,col4,col5,col6,col7)
VALUES ('1','A','123456','data1','data1','data1','data1'),
('2','B','78910','data2','data2','data2','data2'),
('3','C','45698','data3','data3','data3','data3'),
('1','A','123456','data1','data1','data1','data1'),
('2','B','78910','data2','data2','data2','data2'),
('3','C','45698','data3','data3','data3','data3'),
('4','D','85969','data5','data5','data5','data5');

The problem is that there is no way to establish which row to keep, so I suggest you set up a staging table with a compound key on (col1, col2, col3), load it from your existing table, truncate the existing table, and reload it from the staging table.

How about this:
SELECT *
FROM table2 AS t
GROUP BY t.col1, t.col2, t.col3;
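GROUP BY is normally used with aggregate functions (COUNT, SUM, MAX, etc.), but it will do the job for your purpose. Note that with the ONLY_FULL_GROUP_BY SQL mode enabled (the default since MySQL 5.7.5), SELECT * with this GROUP BY is rejected, because col4 to col7 are neither grouped nor aggregated.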
Edit:
OK, since you actually need to delete the duplicates, that's a bit more complicated, but it's possible. First we need some way to tell duplicate rows apart, so we temporarily add a primary key. Then we execute a DELETE statement that joins the table to itself to find the duplicated rows. Lastly, we drop the primary key column we added.
Add a primary key column, so that duplicates can be told apart:
ALTER TABLE `table2` ADD COLUMN id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY;
Join the table to itself to find the duplicates and delete them:
DELETE t1
FROM `table2` t1
INNER JOIN `table2` t2
ON t1.col1 = t2.col1 AND t1.col2 = t2.col2 AND t1.col3 = t2.col3
WHERE t1.id < t2.id;
Drop the primary key column:
ALTER TABLE `table2` DROP COLUMN id;
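Note that the join is done on the columns that define a duplicate (col1, col2, col3 in your case). Because of the condition t1.id < t2.id, the row with the highest id in each group of duplicates is the one that is kept. You can execute all three statements at once; just be sure you really want this data gone.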
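The delete step can be tried out without a MySQL server. The following is only a minimal sketch, assuming SQLite (through Python's built-in sqlite3 module) as a stand-in for MySQL. SQLite has no multi-table DELETE, so the sketch swaps the self-join for a NOT IN subquery over SQLite's implicit rowid, which plays the role of the temporary id column. The function name dedupe_on_three_columns is made up for illustration, and only col1 to col4 of the sample table are reproduced.

```python
import sqlite3


def dedupe_on_three_columns(conn):
    """Keep one row per (col1, col2, col3) in Table2 and delete the rest.

    Assumption: SQLite stands in for MySQL. rowid is SQLite's built-in row
    identifier and replaces the temporary AUTO_INCREMENT id column.
    """
    conn.execute(
        """
        DELETE FROM Table2
        WHERE rowid NOT IN (
            -- MAX(rowid) keeps the newest row, like WHERE t1.id < t2.id
            SELECT MAX(rowid) FROM Table2 GROUP BY col1, col2, col3
        )
        """
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table2 (col1 INT, col2 TEXT, col3 INT, col4 TEXT)")
rows = [
    (1, "A", 123456, "data1"),
    (2, "B", 78910, "data2"),
    (3, "C", 45698, "data3"),
    (1, "A", 123456, "data1"),
    (2, "B", 78910, "data2"),
    (3, "C", 45698, "data3"),
    (4, "D", 85969, "data5"),
]
conn.executemany("INSERT INTO Table2 VALUES (?, ?, ?, ?)", rows)

dedupe_on_three_columns(conn)
remaining = conn.execute(
    "SELECT col1, col2, col3 FROM Table2 ORDER BY col1"
).fetchall()
print(remaining)
```

With the sample data, four rows should remain, one for each distinct (col1, col2, col3) combination.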

Related

Remove duplicate records in mysql

I have a table called leads with duplicate records
Leads:
*account_id
*campaign_id
I want to remove every duplicate account_id row where campaign_id is equal to "51".
For example, if account_id = 1991 appears twice in the table, then remove the row with campaign_id = "51" and keep the other one.
You could use a delete join:
DELETE t1
FROM yourTable t1
INNER JOIN yourTable t2
ON t2.account_id = t1.account_id AND
t2.campaign_id <> 51
WHERE
t1.campaign_id = 51;
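To see which rows such a delete removes, here is a small sketch that again assumes SQLite (Python's sqlite3 module) in place of MySQL. SQLite cannot run the DELETE ... JOIN form, so the same condition is written as a correlated EXISTS subquery; MySQL itself generally rejects a subquery on the table being deleted from, which is one reason to use the join there. The function name and the sample rows are made up for illustration.

```python
import sqlite3


def delete_campaign_duplicates(conn, campaign_id=51):
    """Delete rows with the given campaign_id whose account_id also appears
    under a different campaign. Returns the number of deleted rows."""
    cur = conn.execute(
        """
        DELETE FROM leads
        WHERE campaign_id = ?
          AND EXISTS (
              SELECT 1 FROM leads AS other
              WHERE other.account_id = leads.account_id
                AND other.campaign_id <> ?
          )
        """,
        (campaign_id, campaign_id),
    )
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE leads (account_id INT, campaign_id INT)")
conn.executemany(
    "INSERT INTO leads VALUES (?, ?)",
    [(1991, 51), (1991, 7), (2000, 51), (2001, 8)],
)

deleted = delete_campaign_duplicates(conn)
left = conn.execute(
    "SELECT account_id, campaign_id FROM leads ORDER BY account_id, campaign_id"
).fetchall()
print(deleted, left)
```

Account 2000 keeps its campaign 51 row because it has no other campaign; only the 1991/51 row is removed.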
There is no problem with deleting from a table, provided that:
You use the correct syntax.
You have done a backup of the table BEFORE you do any deleting.
However, I would suggest a different method:
Create a new table based on the existing table:
CREATE TABLE mytable_new LIKE mytable;
Add unique constraint (or PRIMARY KEY) on column(s) you don't want to have duplicates:
ALTER TABLE mytable_new ADD UNIQUE(column1,[column2]);
Note: if you want to identify a combination of two (or more) columns as unique, place all the column names in the UNIQUE() separated by comma. Maybe in your case, the constraint would be UNIQUE(account_id, campaign_id).
Insert data from original table to new table:
INSERT IGNORE INTO mytable_new SELECT * FROM mytable;
Note: with IGNORE, rows that would violate the UNIQUE() constraint are skipped instead of raising an error, so only non-duplicate rows are inserted. If you have an app that runs a MySQL INSERT query against the table, you have to update that query by adding IGNORE.
Check data consistency and once you're satisfied, rename both tables:
RENAME TABLE mytable TO mytable_old;
RENAME TABLE mytable_new TO mytable;
The best thing about this approach is that if you see anything wrong with the new table, you still have the original table.
Renaming the tables takes less than a second; the likely issue is that the INSERT IGNORE might take a while if you have a lot of data.
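The steps above can be sketched end to end as follows. This is an illustration under stated assumptions rather than the MySQL statements themselves: SQLite (Python's sqlite3 module) stands in for MySQL, INSERT OR IGNORE is SQLite's spelling of INSERT IGNORE, ALTER TABLE ... RENAME TO replaces RENAME TABLE, and the two-column layout of mytable is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Original table containing duplicates (hypothetical sample data)
    CREATE TABLE mytable (account_id INT, campaign_id INT);
    INSERT INTO mytable VALUES (1, 51), (1, 51), (2, 51), (2, 7), (2, 7);

    -- New table with the same columns plus the unique constraint
    CREATE TABLE mytable_new (
        account_id INT,
        campaign_id INT,
        UNIQUE (account_id, campaign_id)
    );

    -- Rows that would violate the constraint are silently skipped
    INSERT OR IGNORE INTO mytable_new SELECT * FROM mytable;

    -- Swap the tables; the original survives as mytable_old
    ALTER TABLE mytable RENAME TO mytable_old;
    ALTER TABLE mytable_new RENAME TO mytable;
    """
)

deduped = conn.execute(
    "SELECT account_id, campaign_id FROM mytable ORDER BY 1, 2"
).fetchall()
original_count = conn.execute("SELECT COUNT(*) FROM mytable_old").fetchone()[0]
print(deduped, original_count)
```

After the swap, mytable holds three distinct rows while mytable_old still holds all five original rows.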

Copy content of one table to another, leaving the primary element

I have two tables with the same columns; the only difference is their ids, which are auto-increment primary keys.
Table1 | Table2
id1(PK)| id2(PK)
col1 | col1
col2 | col2
col3 | col3
I know some quick ways to do that, like:
INSERT INTO table2 SELECT * FROM table1 where id1 = 2
With this method the row in table2 gets id2 = 2, because all the fields are copied directly from table1 to table2. To avoid that,
I can also use:
INSERT INTO table2(col1,col2,col3) SELECT col1,col2,col3 FROM table1 WHERE id1 = 2
This is fine for small tables, but my table has a lot of columns.
I need a quick way to copy all the columns from table1 to table2 except the primary key column id2, since it is auto-incremented.
In other words, I want to copy a specified row from table1 to table2 with a different id2 (which will be generated, since it is auto-incremented).
Is there any way to do this?
If you do not want to list the column names but still want to copy all of them, copy the data into a temp table, drop the pk_id field from it, copy the remaining fields into the desired table, and finally drop the temp table.
Refer to one of my answers to a similar question:
Mysql: Copy row but with new id
We can use a temporary table to buffer the rows from the first table, drop the pk field from the temp table, and copy all the other fields to the second table.
With reference to the answer by Tim Ruehsen in a referred posting:
CREATE TEMPORARY TABLE tmp_table SELECT * from first_table WHERE ...;
ALTER TABLE tmp_table drop pk_id; # drop autoincrement field
# UPDATE tmp_table SET ...; # just needed to change other unique keys
INSERT INTO second_table SELECT 0, tmp_table.* FROM tmp_table;
DROP TABLE tmp_table;
To copy all rows from table1 except those whose id already exists in table2:
INSERT INTO table2(col1, col2, col3)
SELECT t1.col1, t1.col2, t1.col3
FROM table1 t1
WHERE t1.id1 not in (SELECT id2 FROM table2);
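When the table has many columns, the explicit column list can also be generated instead of typed. The sketch below is only an illustration, assuming SQLite (Python's sqlite3 module) in place of MySQL: it reads the column names with PRAGMA table_info, where MySQL would use information_schema.columns or SHOW COLUMNS. The helper copy_row_without_pk is hypothetical, and the table and column names passed to it are assumed to be trusted, since identifiers cannot be bound as query parameters.

```python
import sqlite3


def copy_row_without_pk(conn, src, dst, src_pk, pk_value):
    """Copy one row from src to dst, leaving dst's auto-increment key unset.

    Hypothetical helper; src, dst and src_pk must be trusted identifiers.
    """
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
    cols = [
        row[1]
        for row in conn.execute(f"PRAGMA table_info({src})")
        if row[1] != src_pk
    ]
    col_list = ", ".join(cols)
    conn.execute(
        f"INSERT INTO {dst} ({col_list}) "
        f"SELECT {col_list} FROM {src} WHERE {src_pk} = ?",
        (pk_value,),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (id1 INTEGER PRIMARY KEY AUTOINCREMENT,
                         col1 TEXT, col2 TEXT, col3 TEXT);
    CREATE TABLE table2 (id2 INTEGER PRIMARY KEY AUTOINCREMENT,
                         col1 TEXT, col2 TEXT, col3 TEXT);
    INSERT INTO table1 (col1, col2, col3) VALUES ('a', 'b', 'c'), ('x', 'y', 'z');
    INSERT INTO table2 (col1, col2, col3)
        VALUES ('p', 'q', 'r'), ('p', 'q', 'r'), ('p', 'q', 'r');
    """
)

copy_row_without_pk(conn, "table1", "table2", "id1", 2)
copied = conn.execute(
    "SELECT id2, col1, col2, col3 FROM table2 ORDER BY id2 DESC LIMIT 1"
).fetchone()
print(copied)
```

The copied row receives a freshly generated id2 (4 here, because table2 already had three rows) rather than the id1 value 2.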

mySQL synchronise certain rows from two tables - copy data and update

I have two tables, certain rows of which need to be synchronised at different times.
What is the cleanest way to copy rows from one table to another while preserving the primary keys of both tables?
At present I'm using the two queries shown below but I'm occasionally getting errors like this: Duplicate entry '465' for key 1
DELETE FROM t2 WHERE instanceID='10';
INSERT INTO t2 (SELECT * FROM t1 WHERE instanceID='10');
Use the ON DUPLICATE KEY UPDATE clause to copy the columns when there's a duplicate.
INSERT INTO t2
SELECT * FROM t1 WHERE <condition>
ON DUPLICATE KEY UPDATE col1 = t1.col1, col2 = t1.col2, col3 = t1.col3, ...
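The effect of such an insert-or-update can be checked with a small sketch. It assumes SQLite 3.24 or later (through Python's sqlite3 module) as a stand-in for MySQL; SQLite spells the clause ON CONFLICT ... DO UPDATE and refers to the incoming row as excluded, which is not MySQL syntax. The three-column table layout and the sample values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, instanceID INT, col1 TEXT);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, instanceID INT, col1 TEXT);
    INSERT INTO t1 VALUES (465, 10, 'new'), (466, 10, 'added'), (467, 11, 'other');
    -- Row 465 already exists in t2, which is what caused the duplicate error
    INSERT INTO t2 VALUES (465, 10, 'old');
    """
)

# Insert new rows; where the primary key already exists, overwrite instead.
conn.execute(
    """
    INSERT INTO t2 (id, instanceID, col1)
    SELECT id, instanceID, col1 FROM t1 WHERE instanceID = 10
    ON CONFLICT(id) DO UPDATE SET
        instanceID = excluded.instanceID,
        col1 = excluded.col1
    """
)
conn.commit()

synced = conn.execute("SELECT id, instanceID, col1 FROM t2 ORDER BY id").fetchall()
print(synced)
```

Row 465 is updated in place, row 466 is inserted, and no DELETE is needed beforehand.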

How to load values of one table to another automatically in MySQL..?

There is an existing table A. Suppose I want to add all or specific column values of table A to table B using a foreign key; how do I do it in MySQL?
Also, if there is any new insert or update in table A, it should automatically be applied to table B as well.
Automatically updating Table B from changes in Table A would require triggers, which MySQL supports but phpMyAdmin does not. If instead you're looking to insert rows into one table from the other on an ad-hoc basis, then that's simple:
INSERT INTO TABLEA (COL1, COL2, COL3)
SELECT COL1, COL2, COL3 FROM TABLEB
WHERE (SELECT COUNT(*) FROM TABLEA WHERE TABLEA.COL1 = TABLEB.COL1) = 0;
The above SQL does a simple copy from TableB into TableA. The WHERE clause ensures only records which don't already exist are inserted.
You can use insert/update trigger for that.
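A rough sketch of what such triggers could look like, assuming SQLite (Python's sqlite3 module) in place of MySQL. MySQL's CREATE TRIGGER additionally needs FOR EACH ROW and, in most clients, a DELIMITER change, so the statements below are not copy-paste MySQL. The table layout and the trigger names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, col1 TEXT);
    -- table_b references table_a through a foreign key column
    CREATE TABLE table_b (
        a_id INTEGER PRIMARY KEY REFERENCES table_a(id),
        col1 TEXT
    );

    -- Mirror every new row of table_a into table_b
    CREATE TRIGGER a_after_insert AFTER INSERT ON table_a
    BEGIN
        INSERT INTO table_b (a_id, col1) VALUES (NEW.id, NEW.col1);
    END;

    -- Propagate later changes to the mirrored row
    CREATE TRIGGER a_after_update AFTER UPDATE ON table_a
    BEGIN
        UPDATE table_b SET col1 = NEW.col1 WHERE a_id = NEW.id;
    END;

    INSERT INTO table_a VALUES (1, 'first'), (2, 'second');
    UPDATE table_a SET col1 = 'changed' WHERE id = 2;
    """
)

mirrored = conn.execute("SELECT a_id, col1 FROM table_b ORDER BY a_id").fetchall()
print(mirrored)
```

Nothing is ever written to table_b directly; both of its rows, including the updated value, come from the triggers.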
Why would you want to do that?
Why don't you just reference the data in table A using the foreign key in table B that you mentioned?
insert into tableb(columns) select columns from tablea

DELETE Difference NOT IN vs NOT EXISTS

I have two scenarios, shown below. SCENARIO 1 works, and so does SCENARIO 2, but do both scenarios achieve the same objective? Note that in both scenarios otherTbl is static.
SCENARIO 1
CREATE TABLE `tbl`(
col1 VARCHAR(255),
PRIMARY KEY(col1)
) ENGINE='InnoDb';
Here is the set of queries that I ran previously; they make sense and run fine.
# Create an exact copy of `tbl`
CREATE TEMPORARY TABLE `tmp_tbl` LIKE `tbl`;
# Add grouped records from another table into `tmp_tbl`
INSERT INTO tmp_tbl SELECT col1 FROM otherTbl GROUP BY col1;
# Delete the rows that no longer exist in `otherTbl`
DELETE FROM tbl WHERE tbl.col1 NOT IN (SELECT col1 FROM tmp_tbl);
SCENARIO 2
In this scenario the only difference is the columns; as you can see, all of them are part of the primary key.
CREATE TABLE `tbl`(
col1 VARCHAR(255),
col2 VARCHAR(255),
col3 VARCHAR(255),
PRIMARY KEY(col1, col2, col3)
) ENGINE='InnoDb';
Here is the new set of queries:
# Create an exact copy of `tbl`
CREATE TEMPORARY TABLE `tmp_tbl` LIKE `tbl`;
# Add grouped records from another table into `tmp_tbl`
INSERT INTO tmp_tbl
SELECT col1, col2, col3 FROM otherTbl GROUP BY col1, col2, col3;
# Delete the rows that no longer exist in `otherTbl`
DELETE FROM tbl WHERE NOT EXISTS(SELECT col1, col2, col3 FROM `tmp_tbl`);
The question is simply: do they achieve the same result? In other words, if we replace NOT IN with NOT EXISTS in the delete query of SCENARIO 1, will it still work the same way?
******SIMPLE VERSION******
Is:
DELETE FROM `tbl` WHERE tbl.col1 NOT IN (SELECT col1 FROM tmp_tbl);
Equal to:
DELETE FROM `tbl` WHERE NOT EXISTS(SELECT col1 FROM `tmp_tbl`);
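I haven't tested it, but they are most likely not equivalent. The NOT EXISTS form only makes sense with a correlated subquery. Your subquery doesn't contain any reference to the outer query, so it is evaluated once for the whole statement: if tmp_tbl contains at least one row, NOT EXISTS is false for every row of tbl and the second form won't delete any rows at all; if tmp_tbl is empty, it deletes every row.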
Also, presence of NULLs in the table may make these two forms act very differently.
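Both points can be checked with a small experiment. The sketch below is only an illustration, assuming SQLite (Python's sqlite3 module) in place of MySQL; the three-valued logic behind NOT IN and NOT EXISTS is standard SQL, but it has not been run against MySQL here. The helper rows_left and the sample values are made up, and tmp_tbl is deliberately created without the primary key so that a NULL can be inserted.

```python
import sqlite3


def rows_left(delete_sql, tmp_values):
    """Run delete_sql against tbl = {a, b, c} and return the surviving rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (col1 TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE tmp_tbl (col1 TEXT)")  # no key, so NULL is allowed
    conn.executemany("INSERT INTO tbl VALUES (?)", [("a",), ("b",), ("c",)])
    conn.executemany("INSERT INTO tmp_tbl VALUES (?)", [(v,) for v in tmp_values])
    conn.execute(delete_sql)
    return [r[0] for r in conn.execute("SELECT col1 FROM tbl ORDER BY col1")]


NOT_IN = "DELETE FROM tbl WHERE tbl.col1 NOT IN (SELECT col1 FROM tmp_tbl)"
NOT_EXISTS_UNCORRELATED = (
    "DELETE FROM tbl WHERE NOT EXISTS (SELECT col1 FROM tmp_tbl)"
)
NOT_EXISTS_CORRELATED = (
    "DELETE FROM tbl WHERE NOT EXISTS "
    "(SELECT 1 FROM tmp_tbl WHERE tmp_tbl.col1 = tbl.col1)"
)

# tmp_tbl = {a}: NOT IN removes b and c, the uncorrelated form removes nothing
not_in_result = rows_left(NOT_IN, ["a"])
uncorrelated_result = rows_left(NOT_EXISTS_UNCORRELATED, ["a"])
correlated_result = rows_left(NOT_EXISTS_CORRELATED, ["a"])

# tmp_tbl = {a, NULL}: NOT IN evaluates to NULL for b and c, so nothing is deleted
not_in_with_null = rows_left(NOT_IN, ["a", None])
correlated_with_null = rows_left(NOT_EXISTS_CORRELATED, ["a", None])

print(not_in_result, uncorrelated_result, correlated_result)
print(not_in_with_null, correlated_with_null)
```

Only the correlated NOT EXISTS matches NOT IN on NULL-free data, and it keeps deleting b and c even when tmp_tbl contains a NULL, whereas NOT IN then deletes nothing.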
These two queries should, to my knowledge, achieve the same results (since the query checks for the same data - only the second one does it in a more elegant manner maybe).