Remove duplicate rows from table with 1 column only? - mysql

I believe my question was asked on SO, but I didn't find the answer.
There is a MySQL table mytable with a single column, mycolumn.
What is the MySQL query to remove duplicate rows from this table?

Is there really only one column, with no primary key or other column that you can use to tell the rows apart?
If so, this is bad practice. Consider adding a numeric column (col_id below) and populating it with an id for every record; then you can try this query:
DELETE t FROM mytable t
JOIN (SELECT mycolumn, MIN(col_id) AS keep_id, COUNT(mycolumn) AS counter
FROM mytable GROUP BY mycolumn
) inner_query
ON inner_query.mycolumn = t.mycolumn
WHERE inner_query.counter > 1 AND t.col_id <> inner_query.keep_id;
Then you can add a primary key.

Here is one way to go about it as long as there are no triggers or foreign keys. Not tested because I'm on my phone, but should work. After this, maybe create a unique index on mycolumn to keep from getting duplicates.
Create table _mytable
Select distinct mycolumn from mytable;
delete from mytable;
Insert into mytable(mycolumn)
Select mycolumn from _mytable;
Drop table _mytable;
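The copy-and-reload steps above can be tried out without touching a real server. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs anywhere); the sample values and the index name uq_mycolumn are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (mycolumn TEXT)")
conn.executemany(
    "INSERT INTO mytable (mycolumn) VALUES (?)",
    [("a",), ("b",), ("a",), ("c",), ("b",)],
)

# Same steps as above: copy distinct values aside, empty the table, reload.
conn.execute("CREATE TABLE _mytable AS SELECT DISTINCT mycolumn FROM mytable")
conn.execute("DELETE FROM mytable")
conn.execute("INSERT INTO mytable (mycolumn) SELECT mycolumn FROM _mytable")
conn.execute("DROP TABLE _mytable")

# A unique index (hypothetical name) keeps new duplicates out afterwards.
conn.execute("CREATE UNIQUE INDEX uq_mycolumn ON mytable (mycolumn)")

remaining = sorted(row[0] for row in conn.execute("SELECT mycolumn FROM mytable"))
print(remaining)
```

Once the unique index exists, a further insert of an existing value is rejected, which is the behaviour the answer is after.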

Related

delete all rows after 100 in a table in mysql (without auto_increment)

I want to delete all rows after the 100th row in a MySQL table. For some reason this table does not have any column with AUTO_INCREMENT or a primary key. How can I delete all rows after the 100th row? What would the query be?
I think I have to add another column with AUTO_INCREMENT to the table, delete everything after the 100th row, and then remove that column again. Or is there a better suggestion?
The actual problem arose when I was copying all rows from one table to another and mistakenly duplicated all rows in the same table through an SQL dump file. So now I want to delete all rows after the 100th row.
This deletes roughly half of the rows, each chosen at random with probability 0.5 (it does not delete exactly every second row, and it can remove both copies of a duplicated row):
DELETE FROM table_name WHERE rand() < 0.5
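If the table has a sequential id column (which your table would need to gain first), this deletes all rows after the id you set; row_number is a placeholder for that id, 100 in your case: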
DELETE FROM table_name WHERE id > row_number
I hope this is helpful
You can swap the data through a temporary table.
Say your table name is mytable:
CREATE TEMPORARY TABLE mytable_temp LIKE mytable;
INSERT INTO mytable_temp SELECT * FROM mytable LIMIT 100;
DELETE FROM mytable;
INSERT INTO mytable SELECT * FROM mytable_temp;
DROP TABLE mytable_temp;
Note: In the general case, when the duplicate rows are unordered or you don't know where the original rows end, you can put a unique index on all of the swap table's columns and use REPLACE instead of INSERT. Then there is no need for LIMIT.
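The variant from the note (unique index on the swap table plus REPLACE) can be sketched as follows. This is an illustration only: it uses Python's sqlite3 module in place of MySQL, and the two columns name and city and their values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.execute("CREATE TABLE mytable (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("ann", "oslo"), ("bob", "rome"), ("ann", "oslo"),
     ("bob", "rome"), ("cid", "bern")],
)

# Swap table with a unique constraint over all of its columns.
conn.execute(
    "CREATE TEMPORARY TABLE mytable_temp "
    "(name TEXT, city TEXT, UNIQUE (name, city))"
)

# REPLACE overwrites a conflicting row instead of failing, so each distinct
# (name, city) pair ends up in the swap table exactly once. No LIMIT needed.
conn.execute("REPLACE INTO mytable_temp SELECT * FROM mytable")
conn.execute("DELETE FROM mytable")
conn.execute("INSERT INTO mytable SELECT * FROM mytable_temp")
conn.execute("DROP TABLE mytable_temp")

deduped = sorted(conn.execute("SELECT name, city FROM mytable").fetchall())
print(deduped)
```

One caveat worth keeping in mind: unique indexes treat NULLs as distinct from each other, so rows that differ only by containing NULL in the same column would not be collapsed by this approach.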

How do I delete all duplicate rows without a primary key?

I have a table that does not have any unique key or primary key. It has 50 columns, and any or all of these columns can contain duplicates. How do I delete all duplicate rows but keep the first occurrence?
The generic SQL approach is to store the data, truncate the table, and reinsert the data. The syntax varies a bit by database, but here is an example:
create table TempTable as
select distinct * from MyTable;
truncate table MyTable;
insert into MyTable
select * from TempTable;
There are other approaches that don't require a temporary table, but they are even more database-dependent.
If you are using a MySQL database, use the following command:
ALTER IGNORE TABLE tablename ADD UNIQUE INDEX (field1,field2,field3...)
The IGNORE keyword lets the unique index be added even though duplicate entries exist; the duplicates are removed in the process. Note that ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this only works on older versions.
If you are using an Oracle database, use the following command (group by the columns that define a duplicate):
DELETE FROM tablename WHERE rowid NOT IN (SELECT MIN(rowid) FROM tablename GROUP BY col1,col2,col3.....)
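The rowid technique can be tried out in SQLite, which happens to expose an implicit rowid as well. The sketch below is an illustration under that assumption (SQLite via Python's sqlite3, not Oracle), with two hypothetical columns col1 and col2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite's rowid stands in for Oracle's
conn.execute("CREATE TABLE tablename (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [("x", 1), ("y", 2), ("x", 1), ("x", 3), ("y", 2)],
)

# Keep the row with the smallest rowid in each group of identical rows
# and delete every other member of the group.
conn.execute(
    "DELETE FROM tablename WHERE rowid NOT IN "
    "(SELECT MIN(rowid) FROM tablename GROUP BY col1, col2)"
)

kept = conn.execute(
    "SELECT rowid, col1, col2 FROM tablename ORDER BY rowid"
).fetchall()
print(kept)
```

Because the smallest rowid per group survives, this keeps the first occurrence of each duplicate, which is what the question asks for.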

#1062 - Duplicate entry '' for key 'unique_id' When Trying to add UNIQUE KEY (MySQL)

I've got an error on MySQL while trying to add a UNIQUE KEY. Here's what I'm trying to do. I've got a column called 'unique_id' which is VARCHAR(100). There are no indexes defined on the table. I'm getting this error:
#1062 - Duplicate entry '' for key 'unique_id'
when I try to add a UNIQUE key through phpMyAdmin.
Here is the MySQL query that's generated by phpMyAdmin:
ALTER TABLE `wind_archive` ADD `unique_id` VARCHAR( 100 ) NOT NULL FIRST ,
ADD UNIQUE (
`unique_id`
)
I've had this problem in the past and never resolved it so I just rebuilt the table from scratch. Unfortunately in this case I cannot do that as there are many entries in the table already. Thanks for your help!
The error says it all:
Duplicate entry ''
So run the following query:
SELECT unique_id,COUNT(unique_id)
FROM yourtblname
GROUP BY unique_id
HAVING COUNT(unique_id) >1
This query will also show you the problem
SELECT *
FROM yourtblname
WHERE unique_id=''
This will show you where there are values that have duplicates. You are trying to create a unique index on a field with duplicates. You will need to resolve the duplicate data first then add the index.
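To see the diagnosis in action, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The reading column and the sample values are hypothetical, and the empty-string default is declared explicitly to mimic what happens to existing rows when a NOT NULL VARCHAR column is added in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.execute(
    "CREATE TABLE wind_archive "
    "(reading REAL, unique_id TEXT NOT NULL DEFAULT '')"
)
# Three rows that never received a unique_id, plus one that did.
conn.executemany(
    "INSERT INTO wind_archive (reading) VALUES (?)", [(1.5,), (2.0,), (3.25,)]
)
conn.execute("INSERT INTO wind_archive (reading, unique_id) VALUES (4.0, 'abc')")

# The GROUP BY / HAVING query from above lists the values that collide.
dupes = conn.execute(
    "SELECT unique_id, COUNT(unique_id) FROM wind_archive "
    "GROUP BY unique_id HAVING COUNT(unique_id) > 1"
).fetchall()
print(dupes)

# Adding the unique index fails for as long as those duplicates exist.
try:
    conn.execute("CREATE UNIQUE INDEX uq_unique_id ON wind_archive (unique_id)")
    index_created = True
except sqlite3.IntegrityError:
    index_created = False
print(index_created)
```

The single colliding value is the empty string, matching the `Duplicate entry ''` message in the question.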
This is the third time I have looked for a solution to this problem, so I am posting the answer here for reference.
Depending on the data, we may use the IGNORE keyword with the ALTER command. If IGNORE is specified, only the first row is used of rows with duplicates on a unique key. The other conflicting rows are deleted. Incorrect values are truncated to the closest matching acceptable value.
The IGNORE keyword extension to MySQL seems to have a bug with InnoDB on some versions of MySQL.
You could always convert to MyISAM, add the index with IGNORE, and then convert back to InnoDB:
ALTER TABLE table ENGINE MyISAM;
ALTER IGNORE TABLE table ADD UNIQUE INDEX dupidx (field);
ALTER TABLE table ENGINE InnoDB;
Note, if you have Foreign Key constraints this will not work, you will have to remove those first, and add them back later.
Change unique_id from NOT NULL to NULL and it will solve your problem.
select ID from wind_archive
where ID not in (select max(ID) from wind_archive group by unique_id)
These are the rows you should remove from the table before you can successfully add the unique key.
This also works when adding a unique key over two or more columns.
For example:
delete from wind_archive
where ID in (
select * from (select ID from wind_archive where ID not in (
select max(ID) from wind_archive group by lastName, firstName
) ORDER BY ID
) AS p
);
Because your query declares unique_id as NOT NULL, every existing row receives the same default value (the empty string) when the column is added. You also ask for the column to be unique, but after the column is added several rows hold the same value, so the unique key cannot be created. Change unique_id NOT NULL to unique_id NULL in your query.
I was getting the same error (Duplicate entry '' for key 'unique_id') when trying to add a new column as unique "after" I had already created a table containing just names of museums. I wanted to go back and add a unique code for each museum name, with the intention of inserting the code values one at a time. Poor table planning on my part.
My solution was to add the new column without making it unique, enter the code for each row one at a time, and then change the column definition to make it unique for future entries. Luckily there were only 10 rows.

mySQL find dupes and remove them

I am wondering if there is a way to do this through one query.
It seems that when I was initially populating my DB with 10k records of dummy data to work with, somewhere in the mess of it all the script dumped an extra 1,044 rows that are duplicates. I determined this using
SELECT x.ID, x.firstname FROM info x
INNER JOIN (SELECT ID FROM info
GROUP BY ID HAVING count(id) > 1) d ON x.ID = d.ID
What I am trying to figure out is through this single query can I add another piece to it that will remove one of the matching dupes from each dupe found?
also I realize the ID column should have been set to auto increment, but it wasn't
My favorite way of removing duplicates would be:
ALTER IGNORE TABLE info ADD UNIQUE (ID);
To explain a bit further:
UNIQUE - you are adding unique index to ID column.
IGNORE - is a MySQL extension to standard SQL. It controls how ALTER TABLE works if there are duplicates on unique keys in the new table or if warnings occur when strict mode is enabled. If IGNORE is not specified, the copy is aborted and rolled back if duplicate-key errors occur. If IGNORE is specified, only the first row is used of rows with duplicates on a unique key. The other conflicting rows are deleted. Incorrect values are truncated to the closest matching acceptable value.
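The query that I generally use is something like:
DELETE FROM table_name WHERE id IN (
SELECT id FROM (
SELECT MAX(id) AS id FROM table_name
GROUP BY DUPFIELD
HAVING COUNT(*) > 1) AS d)
The extra derived table (d) is needed because MySQL does not let a DELETE select from its own target table directly in a subquery. You have to run this several times, since each run removes only one duplicate row per group, but it's fast.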
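The "run it several times" part can be automated by repeating the statement until it stops deleting rows. Below is a minimal sketch, assuming a table info with a unique id and a column dupfield that defines what counts as a duplicate. It uses Python's sqlite3 module as a stand-in for MySQL, and the helper name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.execute("CREATE TABLE info (id INTEGER PRIMARY KEY, dupfield TEXT)")
conn.executemany(
    "INSERT INTO info (dupfield) VALUES (?)",
    [("a",), ("a",), ("a",), ("b",), ("b",), ("c",)],
)

def delete_one_duplicate_per_group(connection):
    """Delete the highest-id row of every group that still has duplicates."""
    cursor = connection.execute(
        "DELETE FROM info WHERE id IN ("
        " SELECT MAX(id) FROM info GROUP BY dupfield HAVING COUNT(*) > 1)"
    )
    return cursor.rowcount  # number of rows removed in this pass

# Repeat until a pass removes nothing.
deleted_per_pass = []
while True:
    deleted = delete_one_duplicate_per_group(conn)
    if deleted == 0:
        break
    deleted_per_pass.append(deleted)

survivors = conn.execute("SELECT id, dupfield FROM info ORDER BY id").fetchall()
print(deleted_per_pass, survivors)
```

The number of passes equals the size of the largest duplicate group minus one, so the loop stays short unless some value is repeated many times.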
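An efficient way is to do it in the steps below:
Step 1: Copy one row per duplicate group (the unique tuples) into a new table. Selecting * with GROUP BY on a single column relies on the ONLY_FULL_GROUP_BY SQL mode being disabled.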
CREATE TABLE new_table as
SELECT * FROM old_table WHERE 1 GROUP BY [column to remove duplicates by];
Step 2: Delete the old table. We no longer need the table with all the duplicate entries, so drop it!
DROP TABLE old_table;
Step 3: rename the new_table to the name of the old_table
RENAME TABLE new_table TO old_table;

Which is faster for check-and-skip insert if a row exists in SQL / MySQL?

I have read many articles about this, but I want to hear from you.
My problem is:
A table: ID (INT, unique, auto-increment), Title (varchar), Content (text), Keywords (varchar)
My PHP code always inserts new records, but must not accept records that duplicate an existing Title or Keywords. So Title and Keywords can't be the primary key. My PHP code needs to do the existence check and insert for about 10-20 records at a time.
So, I check like this:
SELECT * FROM TABLE WHERE TITLE=XXX
If it returns nothing, then I do the INSERT.
I read some other posts, and one person says:
INSERT IGNORE INTO Table values()
Another person suggests:
SELECT COUNT(ID) FROM TABLE
If it returns 0, then do the INSERT.
I don't know which of those queries is faster.
I have one more question: what is the difference between these queries, and which is faster?
SELECT COUNT(ID) FROM ..
SELECT COUNT(0) FROM ...
SELECT COUNT(1) FROM ...
SELECT COUNT(*) FROM ...
All of them show me the total number of records in the table, but I don't know whether MySQL treats the number 0 or 1 as my ID field. Even if I do SELECT COUNT(1000), I still get the total number of records, while my table only has 4 columns.
I'm using MySQL Workbench. Does it have any option for testing query speed?
I would use the INSERT ... ON DUPLICATE KEY UPDATE command. One important comment from the documentation states that: "...if there is a single multiple-column unique index on the table, then the update uses (seems to use) all columns (of the unique index) in the update query."
So if there is a UNIQUE(Title,Keywords) constraint on the table in the example, then, you would use:
INSERT INTO table (Title,Content,Keywords) VALUES ('blah_title','blah_content','blah_keywords')
ON DUPLICATE KEY UPDATE Content='blah_content';
It should work, and it is only one query to the database.
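The single-statement idea can be sketched with SQLite's closest equivalent, INSERT ... ON CONFLICT ... DO UPDATE. This is an analogue rather than MySQL's ON DUPLICATE KEY UPDATE syntax, it assumes SQLite 3.24 or newer behind Python's sqlite3 module, and the table name articles is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.execute(
    "CREATE TABLE articles ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " title TEXT, content TEXT, keywords TEXT,"
    " UNIQUE (title, keywords))"
)

# One statement per record: insert it, or update the content if the
# (title, keywords) pair already exists. No separate SELECT is needed.
upsert = (
    "INSERT INTO articles (title, content, keywords) VALUES (?, ?, ?) "
    "ON CONFLICT (title, keywords) DO UPDATE SET content = excluded.content"
)
conn.execute(upsert, ("blah_title", "first", "blah_keywords"))
conn.execute(upsert, ("blah_title", "second", "blah_keywords"))   # same key
conn.execute(upsert, ("other_title", "third", "blah_keywords"))   # new key

rows = conn.execute("SELECT title, content FROM articles ORDER BY id").fetchall()
print(rows)
```

The second call hits the unique constraint and updates the existing row instead of adding a new one, so the table ends up with two rows.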
SELECT COUNT(*) FROM ... is generally at least as fast as SELECT COUNT(ID) FROM ..., since COUNT(ID) also has to skip rows where ID is NULL. Alternatively, build something like this:
INSERT INTO table (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE c=3;
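On the COUNT variants from the question: a constant such as 0, 1, or 1000 is not a column position, it is just a non-NULL expression evaluated for every row, so it counts all rows exactly like COUNT(*). COUNT(column) differs only in that it skips rows where that column is NULL. A small sketch, using Python's sqlite3 module and a hypothetical table t to show the semantics (which are standard SQL and the same in MySQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, title TEXT)")
# Three rows; one has a NULL title and one has a NULL id.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, None), (None, "c")]
)

counts = conn.execute(
    "SELECT COUNT(*), COUNT(1), COUNT(0), COUNT(1000), COUNT(id), COUNT(title) "
    "FROM t"
).fetchone()
print(counts)
```

This sketch says nothing about speed; it only shows why the constant forms and COUNT(*) always agree while COUNT(id) can return a smaller number.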