How to delete duplicate records in a MySQL table

I'm having an issue finding and deleting duplicate records. I have a table with an ID column called CallDetailRecordID that I need to scan, deleting the duplicate rows. The reason there are duplicates is that I'm exporting data to a special archiving engine that works with MySQL but doesn't support indexing.
I tried using SELECT DISTINCT, but it doesn't work. Is there another way? I'm hoping I can create a stored procedure and have it run weekly to perform the cleanup.
Your help is highly appreciated.
Thank you

CREATE TABLE tmp_table LIKE table;
INSERT INTO tmp_table SELECT * FROM table GROUP BY CallDetailRecordID;
RENAME TABLE table TO old_table, tmp_table TO table;
Drop the old table if you want, and add a LOCK TABLES statement at the beginning to avoid lost inserts. (Renaming both tables in a single RENAME TABLE statement swaps them atomically, so no insert can fall into the gap between the two renames.)
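Since the asker wants this to run weekly, the steps above could be wrapped in a stored procedure and scheduled with MySQL's event scheduler. A sketch, with the table name `calls` and the procedure/event names as placeholders; it assumes the event scheduler is enabled (`SET GLOBAL event_scheduler = ON;`) and that `ONLY_FULL_GROUP_BY` is not in the SQL mode (otherwise rewrite the GROUP BY using `ANY_VALUE()` on 5.7+):

```sql
DELIMITER $$

-- Sketch: weekly deduplication of a table named `calls` (placeholder name)
-- that has duplicate CallDetailRecordID values.
CREATE PROCEDURE dedupe_calls()
BEGIN
  CREATE TABLE tmp_calls LIKE calls;
  -- Keep one row per CallDetailRecordID (requires ONLY_FULL_GROUP_BY off).
  INSERT INTO tmp_calls SELECT * FROM calls GROUP BY CallDetailRecordID;
  -- Swap atomically so no insert lands while the table is missing.
  RENAME TABLE calls TO calls_old, tmp_calls TO calls;
  DROP TABLE calls_old;
END$$

CREATE EVENT dedupe_calls_weekly
  ON SCHEDULE EVERY 1 WEEK
  DO CALL dedupe_calls()$$

DELIMITER ;
```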

Related

MySQL renaming and create table at the same time

I need to rename MySQL table and create a new MySQL table at the same time.
There is a critical live table with a large number of records; master_table is constantly being inserted into by scripts.
I need to back up the master table and create another master table with the same name at the same time.
The general SQL is like this:
RENAME TABLE master_table TO backup_table;
CREATE TABLE master_table (id INT, value INT);
INSERT INTO master_table (id, value) VALUES ('1', '5000');
Is there a possibility of losing records during the execution of the above queries?
Any way to avoid missing records? Lock the master table, etc.?
What I do is the following. It results in no downtime, no data loss, and nearly instantaneous execution.
CREATE TABLE mytable_new LIKE mytable;
...possibly update the AUTO_INCREMENT of the new table...
RENAME TABLE mytable TO mytable_old, mytable_new TO mytable;
By renaming both tables in one statement, they are swapped atomically. There is no chance for any data to be written "in between" while there is no table to receive the write. If you don't do this atomically, some writes may fail.
RENAME TABLE is virtually instantaneous, no matter how large the table. You don't have to wait for data to be copied.
If the table has an auto-increment primary key, I like to make sure the new table starts with an id value greater than the current id in the old table. Do this before swapping the table names.
SELECT AUTO_INCREMENT FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA='mydatabase' AND TABLE_NAME='mytable';
I like to add a comfortable margin to that value. You want to make sure that the id values inserted into the old table won't exceed the value you queried from INFORMATION_SCHEMA.
Change the new table to use this new value for its next auto-increment:
ALTER TABLE mytable_new AUTO_INCREMENT=<increased value>;
Then promptly execute the RENAME TABLE to swap them. As soon as new rows are inserted to the new, empty table, it will use id values starting with the increased auto-increment value, which should still be greater than the last id inserted into the old table, if you did these steps promptly.
Instead of renaming the master_table and recreating it, you could just create a backup_table with the data from the master_table for the first backup run:
CREATE TABLE backup_table AS
SELECT * FROM master_table;
If you must add a primary key to the backup table, run this just once, for the first backup:
ALTER TABLE backup_table ADD CONSTRAINT pk_backup_table PRIMARY KEY (id);
For future backups do:
INSERT INTO backup_table
SELECT * FROM master_table;
Then you can delete from the master_table all the rows that already exist in the backup_table. Note that MySQL's multi-table DELETE syntax requires naming the target table right after DELETE:
DELETE A FROM master_table A
JOIN backup_table B ON A.id = B.id;
Then you can add data to the master_table with this query:
INSERT INTO master_table (`value`) VALUES ('5000'); -- assuming the id column is AUTO_INCREMENT
I think this should work even without locking the master table, and with no missed records.

After copying with INSERT...SELECT, I have more records than before

I have a strange occurrence: I copied ~4.7m records from one table to another in MySQL 5.6.14, using INSERT INTO tabl1 (col1,...) SELECT (col2...) FROM tbl2..., and I ended up with more records than before. 640 more, to be exact.
I checked by doing a SELECT COUNT(*) on both tables and subtracting the new table's count from the old table's (which gave me -640).
Any ideas? I'd like to know where the extra 640 records came from.
Both are InnoDB; the old table uses the latin1 charset, the new one uses utf8. I doubt that's part of the equation, but someone with much more MySQL experience might know.
SQL statement example:
INSERT INTO `table1` (`col1`,`col2`,`col3`) SELECT (`colA`,`colB`,`colC`) FROM `table2`;
The table receiving the records is new, has 0 records in it, and never had any. Also, it's not a production environment, so nothing should be adding records to it except this one statement.
Try this:
First, remove the table if it already exists in the database. This way you know for sure that tabl1 won't have any extra data.
DROP TABLE IF EXISTS tabl1;
Then recreate the table as a copy of tabl2 using a CREATE TABLE ... SELECT statement:
CREATE TABLE tabl1 SELECT * FROM tabl2;
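One caveat worth noting: CREATE TABLE ... SELECT copies only the data and basic column definitions, not indexes, the primary key, or the AUTO_INCREMENT attribute. A sketch of a copy that preserves the full structure, using the same table names as the answer above:

```sql
-- Copy the structure first (indexes, primary key, AUTO_INCREMENT),
-- then the data, instead of CREATE TABLE ... SELECT.
DROP TABLE IF EXISTS tabl1;
CREATE TABLE tabl1 LIKE tabl2;
INSERT INTO tabl1 SELECT * FROM tabl2;
```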

Deduplicate entries in large table that share same values in multiple columns

We've got this big 'favorites' table, and we ran into an issue that uncovered the fact that we don't have a unique constraint on user, favorite_type, and favorite_id. I've written a migration that adds an index on those three columns, but it won't run because existing data contains duplicate sets of entries. There's other data in there too (updated_at, created_at, id) that is fine to lose, but it makes the rows an imperfect match.
Is there a way in Rails (3.2.x) to do this, or a way in (My)SQL?
I know I could pull all of them, group them, and map over a delete of the extra elements, but it is a very large table (1M+ rows) and we can't have long-running migrations.
Copy the table structure to a new table, add the unique constraint, then insert all the records using INSERT IGNORE; rows that violate the constraint are skipped instead of aborting the whole insert.
CREATE TABLE tableTmp LIKE table;
Add the constraint, then insert all the records into the temporary table:
INSERT IGNORE INTO tableTmp SELECT * FROM table;
Verify the entries, then drop and rename:
DROP TABLE table;
RENAME TABLE tableTmp TO table;
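Using the column names from the question, the whole procedure might look like the following sketch. The table name `favorites` and the index name are placeholders; the atomic RENAME at the end is a variation on the drop-and-rename above that avoids a window with no table:

```sql
CREATE TABLE favorites_tmp LIKE favorites;

-- The unique constraint the migration was going to add.
ALTER TABLE favorites_tmp
  ADD UNIQUE KEY uniq_user_fav (user, favorite_type, favorite_id);

-- INSERT IGNORE keeps the first row seen for each
-- (user, favorite_type, favorite_id) triple and skips the duplicates.
INSERT IGNORE INTO favorites_tmp SELECT * FROM favorites;

-- After verifying the counts, swap the tables atomically.
RENAME TABLE favorites TO favorites_old, favorites_tmp TO favorites;
DROP TABLE favorites_old;
```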

SQL Entries Duped

I messed up when trying to create a test database and accidentally duplicated everything inside a certain table. Basically there are now two of every entry there was before. Is there a simple way to fix this? (Using InnoDB tables.)
Yet another good reason to use auto incrementing primary keys. That way, the rows wouldn't be total duplicates.
Probably the fastest way is to copy the data into another table, truncate the first table, and re-insert it:
create temporary table tmp as
select distinct *
from test;
truncate table test;
insert into test
select *
from tmp;
As a little note: in almost all cases, I recommend using the complete column list in an INSERT statement. This is the one case where omitting it is defensible; after all, you are putting all the columns into another table and just putting them back a statement later.
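For completeness, the explicit-column-list form of the same re-insert would look like this; `col1` and `col2` are placeholder names standing in for whatever columns the `test` table actually has:

```sql
-- Same re-insert with the column list spelled out.
-- col1/col2 are placeholders; list every column of the real table.
INSERT INTO test (col1, col2)
SELECT col1, col2
FROM tmp;
```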

mySQL: duplicating multiple records via temporary table, how to preserve autoincrement index?

I wish to duplicate a selection of records in a mySQL table.
The pk of the table is an autoincremented int.
I want to do this with one set of mysql queries (for performance reasons).
It seems like the fastest way to do this is to put the results of the selection into a temporary table,
make any changes needed, and reinsert the records back to the original table, like this:
CREATE TEMPORARY TABLE temp1234 ENGINE=MEMORY SELECT * FROM a_table WHERE column='my selection';
# do updates in temp1234; (altering FK's mainly)
INSERT INTO a_table SELECT * FROM temp1234;
But when I try to do this, I get an error for duplicate PKs.
Now, I realise that I could alter the INSERT ... SELECT query to exclude the pk/ID column, but as I am procedurally generating these queries across multiple tables for a large data-copying function, I want to avoid having to supply column names.
What is the best way around this problem?
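One common workaround, offered here as a sketch rather than an answer from the original thread: zero out the primary-key column in the temporary table before re-inserting. MySQL generates a fresh AUTO_INCREMENT value whenever 0 (or NULL) is inserted into an auto-increment column, provided the NO_AUTO_VALUE_ON_ZERO SQL mode is not enabled. The primary-key column name `id` is a placeholder, and `column` is backquoted because it is a reserved word:

```sql
CREATE TEMPORARY TABLE temp1234 ENGINE=MEMORY
  SELECT * FROM a_table WHERE `column` = 'my selection';

-- Placeholder column name `id`: with NO_AUTO_VALUE_ON_ZERO disabled,
-- inserting 0 makes MySQL assign a new auto-increment value.
UPDATE temp1234 SET id = 0;

-- do other updates in temp1234 (altering FKs, etc.)

INSERT INTO a_table SELECT * FROM temp1234;
DROP TEMPORARY TABLE temp1234;
```

The only column name this needs is the primary key itself, which can be read generically from INFORMATION_SCHEMA.KEY_COLUMN_USAGE when generating the queries procedurally.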