I have a table with 60 columns in it. I would like to delete duplicate entries. All 60 columns have to be compared for a record to be considered a duplicate.
I tried setting all 60 columns to UNIQUE in MySQL, but I get this error:
#1070 - Too many key parts specified; max 16 parts allowed
Any other solutions out there?
If your new table should have exactly the same schema as the old one:
CREATE TABLE new_table LIKE old_table;
To INSERT all distinct rows into new_table, use:
INSERT INTO new_table
SELECT DISTINCT * FROM old_table;
Then you can DROP TABLE old_table and RENAME TABLE new_table TO old_table or whatever.
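The copy-distinct-then-swap steps above can be sketched end to end. This is only an illustration under assumptions: SQLite (via Python's sqlite3 module) stands in for MySQL, a three-column table stands in for the 60-column one, and the column names c1, c2, c3 are made up. SQLite has no CREATE TABLE ... LIKE, so the schema is repeated by hand.

```python
import sqlite3

# SQLite stands in for MySQL; c1..c3 are hypothetical stand-ins for the 60 columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE old_table (c1 INTEGER, c2 TEXT, c3 TEXT)")
conn.executemany(
    "INSERT INTO old_table VALUES (?, ?, ?)",
    [(1, "a", "x"), (1, "a", "x"), (1, "a", "y"), (2, "b", "z"), (2, "b", "z")],
)

# MySQL: CREATE TABLE new_table LIKE old_table; (repeated by hand for SQLite)
conn.execute("CREATE TABLE new_table (c1 INTEGER, c2 TEXT, c3 TEXT)")

# DISTINCT * compares every column, so only fully identical rows collapse.
conn.execute("INSERT INTO new_table SELECT DISTINCT * FROM old_table")

# MySQL: DROP TABLE old_table; RENAME TABLE new_table TO old_table;
conn.execute("DROP TABLE old_table")
conn.execute("ALTER TABLE new_table RENAME TO old_table")

deduplicated_rows = sorted(conn.execute("SELECT * FROM old_table").fetchall())
print(deduplicated_rows)
```

Rows that differ in even one column, such as (1, "a", "x") and (1, "a", "y"), are both kept.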
I would suggest trying something like this:
SELECT col1, ..., col60 FROM mytable GROUP BY col1, ..., col60 HAVING COUNT(*) > 1
This will list every row that occurs more than once. Once you have that list, you can delete the duplicate rows.
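To see what the GROUP BY ... HAVING query returns, here is a small sketch. It assumes SQLite (through Python's sqlite3) in place of MySQL and a made-up three-column table; the extra COUNT(*) AS copies column is my addition to show how many copies each group has.

```python
import sqlite3

# SQLite stands in for MySQL; three hypothetical columns stand in for col1..col60.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 INTEGER, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "a", "x"), (1, "a", "x"), (1, "a", "x"),
     (2, "b", "y"),
     (3, "c", "z"), (3, "c", "z")],
)

# One output row per group of identical rows that occurs more than once.
duplicate_groups = conn.execute(
    """
    SELECT col1, col2, col3, COUNT(*) AS copies
    FROM mytable
    GROUP BY col1, col2, col3
    HAVING COUNT(*) > 1
    ORDER BY col1
    """
).fetchall()
print(duplicate_groups)
```

The row (2, "b", "y") occurs once, so it does not appear in the result.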
I have a table that contains more than 500 million records in a MySQL database.
I need to remove the duplicates from it.
I tried this query on a table containing 20 million records and it was OK, but for the 500 million it takes a very long time:
-- Create temporary table
CREATE TABLE temp_table LIKE names_tbles;
-- Add constraint
ALTER TABLE temp_table ADD UNIQUE(name , family);
-- Copy data
INSERT IGNORE INTO temp_table SELECT * FROM names_tbles;
Is there a better solution?
One option is aggregation rather than insert ignore. That way, there is no need for the database to manage rejected records:
insert into temp_table(id, name, family)
select min(id), name, family
from names_tbles
group by name, family;
I would go one step further and suggest adding the unique constraint only after the table is populated, so that the database does not have to check for duplicates during the insert (the query already guarantees there are none). That should speed up the insert statement.
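A minimal sketch of this approach, assuming SQLite (via Python's sqlite3) as a stand-in for MySQL and a handful of made-up rows: the rows are grouped by the two columns that define a duplicate (name and family), the lowest id of each group is kept, and the unique index is created only after the load. The index name uq_name_family is hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL; the sample rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names_tbles (id INTEGER PRIMARY KEY, name TEXT, family TEXT)")
conn.executemany(
    "INSERT INTO names_tbles VALUES (?, ?, ?)",
    [(1, "ann", "lee"), (2, "ann", "lee"), (3, "bob", "ray"),
     (4, "ann", "kim"), (5, "bob", "ray")],
)

# Target table without the unique constraint, so the bulk insert does no checks.
conn.execute("CREATE TABLE temp_table (id INTEGER PRIMARY KEY, name TEXT, family TEXT)")
conn.execute(
    """
    INSERT INTO temp_table (id, name, family)
    SELECT MIN(id), name, family
    FROM names_tbles
    GROUP BY name, family
    """
)

# Add the constraint afterwards; it succeeds because the query removed duplicates.
conn.execute("CREATE UNIQUE INDEX uq_name_family ON temp_table (name, family)")

kept_rows = conn.execute("SELECT id, name, family FROM temp_table ORDER BY id").fetchall()
print(kept_rows)
```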
I need to transfer all the data from one table to another archive table.
My goal is to keep the table ready for daily transactions, while previous data is moved to another table that stores every day's data.
I need the MySQL syntax for this. Thank you in advance for your support and help.
You can try these queries:
This query will copy the data and structure, but the indexes are not included:
CREATE TABLE new_table SELECT * FROM old_table;
To copy the structure including the indexes and the primary key, run the queries below. Note that CREATE TABLE ... LIKE does not copy foreign key constraints or triggers:
CREATE TABLE new_table LIKE old_table;
INSERT INTO new_table SELECT * FROM old_table;
To insert data into an existing table, use this:
INSERT INTO table2 SELECT * FROM table1
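For the "move" part of the question, one possible sketch is to copy the rows into the existing archive table and then empty the daily table inside a single transaction. This assumes SQLite (via Python's sqlite3) in place of MySQL; the amount column is made up, and the DELETE step is my assumption about what "moved" should mean, since the answers above only cover the copy.

```python
import sqlite3

# SQLite stands in for MySQL; table1 is the daily table, table2 the archive.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("CREATE TABLE table2 (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(1, 9.5), (2, 20.0), (3, 3.25)])

# "with conn" commits if both statements succeed and rolls back if either fails,
# so rows are never deleted without having been copied first.
with conn:
    conn.execute("INSERT INTO table2 SELECT * FROM table1")
    conn.execute("DELETE FROM table1")

archived_count = conn.execute("SELECT COUNT(*) FROM table2").fetchone()[0]
remaining_count = conn.execute("SELECT COUNT(*) FROM table1").fetchone()[0]
print(archived_count, remaining_count)
```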
One of my tables contains data (numbers) that I would like to copy to another table. The problem is that the data is not unique: there can be 2 or more rows with the same value, and I need to copy each number only once. The table has around 3 million records. Is there an efficient way to do this?
Would this work for you?
INSERT INTO destination_table (`the_value_field`) SELECT DISTINCT `the_value_field` FROM origin_table
Suppose there are two columns, a and b, in your table:
INSERT INTO new_table (a, b) SELECT
a, b FROM old_table GROUP BY
a, b;
You can extend this to more columns.
This can be a slow process on a large table.
Alternatively, copy all values into new_table using
INSERT INTO new_table SELECT * FROM old_table;
and then delete the duplicate records from new_table. This can be relatively faster.
You can use SELECT DISTINCT to select only the unique values.
https://www.w3schools.com/sql/sql_distinct.asp
SELECT DISTINCT `val` FROM `table_name`
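As a quick check of the SELECT DISTINCT copy, here is a sketch using the table and column names from the earlier answer. It assumes SQLite (via Python's sqlite3) rather than MySQL, and the sample numbers are invented.

```python
import sqlite3

# SQLite stands in for MySQL; the values are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE origin_table (the_value_field INTEGER)")
conn.execute("CREATE TABLE destination_table (the_value_field INTEGER)")
conn.executemany(
    "INSERT INTO origin_table VALUES (?)",
    [(5,), (3,), (5,), (7,), (3,), (3,)],
)

# Each distinct number is copied exactly once, however often it occurs.
conn.execute(
    "INSERT INTO destination_table (the_value_field) "
    "SELECT DISTINCT the_value_field FROM origin_table"
)

copied_values = sorted(
    row[0] for row in conn.execute("SELECT the_value_field FROM destination_table")
)
print(copied_values)
```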
I believe my question was asked on SO, but I didn't find the answer.
There is a mysql table mytable with one column mycolumn.
What is the mysql query to remove duplicates from a table?
Is there only one column, with no primary key or other column that you can use to tell the rows apart?
If so, that is bad practice. Consider adding a numeric column (col_id below) and filling in an id for every record; then you can try this query:
DELETE t FROM mytable t
JOIN (SELECT mycolumn, MIN(col_id) AS keep_id, COUNT(mycolumn) AS counter
FROM mytable GROUP BY mycolumn
) inner_query
ON inner_query.mycolumn = t.mycolumn
WHERE inner_query.counter > 1 AND t.col_id > inner_query.keep_id;
Then you can add a primary key.
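The idea of adding an id column and deleting every copy except the one with the lowest id can be sketched as follows. This assumes SQLite (via Python's sqlite3) as a stand-in, and the DELETE is written in a form SQLite accepts: MySQL rejects a subquery that reads the table being deleted from, so there the same logic needs a join against a derived table. The index name uq_mycolumn is hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL; col_id is the added id column, values are sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col_id INTEGER PRIMARY KEY, mycolumn TEXT)")
conn.executemany(
    "INSERT INTO mytable (mycolumn) VALUES (?)",
    [("a",), ("b",), ("a",), ("c",), ("b",), ("a",)],
)

# Keep the lowest col_id for each value and delete every other copy.
conn.execute(
    """
    DELETE FROM mytable
    WHERE col_id NOT IN (SELECT MIN(col_id) FROM mytable GROUP BY mycolumn)
    """
)

# With the duplicates gone, a unique index keeps new ones from being inserted.
conn.execute("CREATE UNIQUE INDEX uq_mycolumn ON mytable (mycolumn)")

remaining_rows = conn.execute("SELECT col_id, mycolumn FROM mytable ORDER BY col_id").fetchall()
print(remaining_rows)
```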
Here is one way to go about it as long as there are no triggers or foreign keys. Not tested because I'm on my phone, but should work. After this, maybe create a unique index on mycolumn to keep from getting duplicates.
Create table _mytable
Select distinct mycolumn from mytable;
delete from mytable;
Insert into mytable(mycolumn)
Select mycolumn from _mytable;
Drop table _mytable;
I have read many articles about this. I want to hear from you.
My problem is:
A table: ID (INT, unique, auto-increment), Title (varchar), Content (text), Keywords (varchar)
My PHP code always inserts new records, but it must not accept a record that duplicates an existing Title or Keywords. The Title and Keywords columns can't be the primary key. My PHP code needs to check for existing rows and insert about 10-20 records at a time.
So, I check like this:
SELECT * FROM TABLE WHERE TITLE=XXX
And if it returns nothing, then I do the INSERT.
I read some other posts, and one person says:
INSERT IGNORE INTO Table values()
Another person suggests:
SELECT COUNT(ID) FROM TABLE
If it returns 0, then do the INSERT.
I don't know which of those queries is faster.
And I have one more question: what is the difference between these queries, and which is faster?
SELECT COUNT(ID) FROM ..
SELECT COUNT(0) FROM ...
SELECT COUNT(1) FROM ...
SELECT COUNT(*) FROM ...
All of them show me the total number of records in the table, but I don't know whether MySQL treats the number 0 or 1 as my ID field. Even if I run SELECT COUNT(1000), I still get the total number of records, while my table only has 4 columns.
I'm using MySQL Workbench. Is there any option for testing query speed in this app?
I would use the INSERT ... ON DUPLICATE KEY UPDATE statement. One important comment in the documentation states that: "...if there is a single multiple-column unique index on the table, then the update uses (seems to use) all columns (of the unique index) in the update query."
So if there is a UNIQUE(Title,Keywords) constraint on the table in the example, then you would use:
INSERT INTO table (Title,Content,Keywords) VALUES ('blah_title','blah_content','blah_keywords')
ON DUPLICATE KEY UPDATE Content='blah_content';
It should work, and it takes only one query to the database.
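Here is a rough sketch of the single-statement upsert. It assumes SQLite (via Python's sqlite3), where ON CONFLICT ... DO UPDATE (available from SQLite 3.24) plays the role of MySQL's ON DUPLICATE KEY UPDATE; the table name articles is made up, and the columns follow the question.

```python
import sqlite3

# SQLite stands in for MySQL; "articles" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE articles (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        Title TEXT,
        Content TEXT,
        Keywords TEXT,
        UNIQUE (Title, Keywords)
    )
    """
)

# SQLite spelling of the upsert; MySQL would use ON DUPLICATE KEY UPDATE instead.
upsert_sql = """
    INSERT INTO articles (Title, Content, Keywords) VALUES (?, ?, ?)
    ON CONFLICT (Title, Keywords) DO UPDATE SET Content = excluded.Content
"""
conn.execute(upsert_sql, ("blah_title", "first version", "blah_keywords"))
# Same Title and Keywords: no new row, the existing Content is overwritten.
conn.execute(upsert_sql, ("blah_title", "second version", "blah_keywords"))
conn.execute(upsert_sql, ("other_title", "other content", "other_keywords"))

stored_rows = conn.execute(
    "SELECT Title, Content, Keywords FROM articles ORDER BY ID"
).fetchall()
print(stored_rows)
```

There is no separate SELECT before the INSERT: the unique index does the existence check.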
SELECT COUNT(*) FROM ... is faster than SELECT COUNT(ID) FROM ..., or you can skip the check and build something like this:
INSERT INTO table (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE c=3;
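On the COUNT(0) / COUNT(1) / COUNT(1000) part of the question: the number is not a column position. COUNT(expr) counts the rows where expr is not NULL, and a constant is never NULL, so every row is counted. A small sketch, assuming SQLite (via Python's sqlite3) in place of MySQL and a made-up posts table with one NULL Title:

```python
import sqlite3

# SQLite stands in for MySQL; "posts" and its rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (ID INTEGER PRIMARY KEY, Title TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [(1, "first"), (2, None), (3, "third")],
)

# Run COUNT over each expression and collect the results.
counts = {}
for expression in ("*", "0", "1", "1000", "ID", "Title"):
    query = f"SELECT COUNT({expression}) FROM posts"
    counts[expression] = conn.execute(query).fetchone()[0]
print(counts)
```

Only COUNT(Title) differs, because one Title is NULL and is therefore skipped.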