This question already has answers here:
Deleting duplicate rows from a table
(3 answers)
Closed 8 years ago.
I have a table that does not have any unique key or primary key. It has 50 columns, and any or all of these columns can contain duplicates. How do I delete all duplicate rows but keep the first occurrence?
The generic SQL approach is to store the data, truncate the table, and reinsert the data. The syntax varies a bit by database, but here is an example:
create table TempTable as
select distinct * from MyTable;
truncate table MyTable;
insert into MyTable
select * from TempTable;
There are other approaches that don't require a temporary table, but they are even more database-dependent.
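As a rough illustration of the store, empty, and reinsert pattern above, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database. The helper name `dedupe_via_temp_table`, the scratch table `dedupe_tmp`, and the sample rows are made up for the example, and SQLite has no TRUNCATE, so a plain DELETE is used instead.

```python
import sqlite3

def dedupe_via_temp_table(conn, table):
    """Copy the distinct rows aside, empty the table, then copy them back."""
    cur = conn.cursor()
    # Table names cannot be bound as parameters, so `table` must be trusted input.
    cur.execute(f"CREATE TEMP TABLE dedupe_tmp AS SELECT DISTINCT * FROM {table}")
    cur.execute(f"DELETE FROM {table}")  # SQLite has no TRUNCATE TABLE
    cur.execute(f"INSERT INTO {table} SELECT * FROM dedupe_tmp")
    cur.execute("DROP TABLE dedupe_tmp")
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (a INTEGER, b TEXT)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?)",
                 [(1, "x"), (1, "x"), (2, "y"), (2, "y"), (3, "z")])

dedupe_via_temp_table(conn, "MyTable")
rows = sorted(conn.execute("SELECT a, b FROM MyTable").fetchall())
print(rows)
```

Indexes and constraints on the original table survive, because the table itself is never dropped; only its rows are replaced.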
If you are using MySQL, use the following command:
ALTER IGNORE TABLE tablename ADD UNIQUE INDEX (field1,field2,field3...)
Adding a unique index normally fails when the table already contains duplicate entries; the IGNORE keyword makes MySQL drop the duplicate rows instead. Note that ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this only works on older versions.
If you are using Oracle, use the following command:
DELETE FROM tablename WHERE rowid NOT IN (SELECT MIN(rowid) FROM tablename GROUP BY col1, col2, col3.....)
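SQLite also exposes an implicit rowid, so the keep-the-lowest-rowid idea can be tried out locally. The sketch below uses Python's sqlite3 module; the two-column table and its rows are invented for the example, and Oracle's rowid is a different kind of value, so treat this only as an illustration of the query shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (col1 INTEGER, col2 TEXT)")
conn.executemany("INSERT INTO tablename VALUES (?, ?)",
                 [(1, "a"), (1, "a"), (2, "b"), (1, "a")])

# Keep the row with the smallest rowid in each group of identical values
# and delete every other row.
conn.execute("""
    DELETE FROM tablename
    WHERE rowid NOT IN (
        SELECT MIN(rowid) FROM tablename GROUP BY col1, col2
    )
""")
conn.commit()

kept = conn.execute(
    "SELECT rowid, col1, col2 FROM tablename ORDER BY rowid").fetchall()
print(kept)
```

The GROUP BY list has to name every column that defines a duplicate, which for a 50-column table means all 50.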
This question already has answers here:
How to insert new row to database with AUTO_INCREMENT column without specifying column names?
(3 answers)
Closed 2 years ago.
This is a silly little problem. I have a large table with hundreds of columns, so I don't want to write out each column name individually. Say table1 is the source table with 200 columns, and table2 is the destination table with 201 columns, where the last column of table2 is an extra auto-increment (primary key) column. The idea is that I can simply do
insert into table2 select * from table1 where row = ##;
and I would expect all the data to be copied while the auto-increment column just does its job. However, I get this pesky error message:
Error Code: 1136. Column count doesn't match value count at row 1
Anyone have a simple solution to this?
My recommendation is to generate the column names with a SQL query and just cut-and-paste.
But you can also use the temporary table approach:
create table temp_table1 as
select * from table1 where row = ##;
alter table temp_table1 drop column row;
Then you can use temp_table1 with *. Of course, this assumes that all the other columns line up! I also recommend listing all the columns for the insert . . . and you are back to the recommendation at the beginning of the answer.
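The recommendation to generate the column list with a query can be sketched as follows. This version uses Python's sqlite3 module and SQLite's `PRAGMA table_info`; in MySQL the column names would come from `information_schema.columns` instead. The three-column tables and the helper `column_names` are hypothetical stand-ins for the 200-column case.

```python
import sqlite3

def column_names(conn, table):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (a INTEGER, b TEXT, c REAL)")
conn.execute("CREATE TABLE table2 (a INTEGER, b TEXT, c REAL, "
             "id INTEGER PRIMARY KEY AUTOINCREMENT)")
conn.execute("INSERT INTO table1 VALUES (1, 'x', 2.5)")

# Build the explicit column list once instead of typing it by hand.
cols = ", ".join(column_names(conn, "table1"))
insert_sql = f"INSERT INTO table2 ({cols}) SELECT {cols} FROM table1"
conn.execute(insert_sql)

copied = conn.execute("SELECT a, b, c, id FROM table2").fetchall()
print(insert_sql)
print(copied)
```

Because the auto-increment column is left out of the column list, the database fills it in, and the column counts on both sides of the INSERT match.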
The simplest solution may be to create a backup of table2, drop the auto-increment column, and then insert whatever you want, since the column counts now match. Afterwards you can add the auto-increment column back at the end.
This question already has answers here:
How to reset the auto increment number/column in a MySql table
(4 answers)
Closed 8 years ago.
I have a table named users_items. It has 3 columns, one of which is called id. There are about 100,000 to 150,000 rows in this table. id is set to AUTO_INCREMENT.
I want to reset all ids to 0 and then renumber them 1, 2, 3, 4, 5, 6 and so on.
You won't be able to make all of them 0 at the same time, as you can't have duplicates for the PK.
Create a copy of the table using phpMyAdmin or any tool you want or using SQL queries.
Then delete all data from the original table using:
DELETE FROM users_items
Or:
TRUNCATE users_items
Then reset auto increment using:
ALTER TABLE users_items AUTO_INCREMENT = 1
If you used TRUNCATE then you won't have to reset the auto increment counter.
After this you can use SELECT and INSERT to get the data from the copied table back to this one:
INSERT INTO users_items (col2, col3...) SELECT col2, col3,... FROM users_items_copy
(Note: the id column was not touched while selecting and inserting rows.)
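The copy, empty, reset, and reinsert steps can be tried end to end with Python's sqlite3 module. This is only a sketch: SQLite stores the AUTOINCREMENT counter in its internal `sqlite_sequence` table, so clearing that row stands in for MySQL's `ALTER TABLE ... AUTO_INCREMENT = 1`, and the sample ids and values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users_items ("
             "id INTEGER PRIMARY KEY AUTOINCREMENT, col2 TEXT, col3 TEXT)")
conn.executemany("INSERT INTO users_items (id, col2, col3) VALUES (?, ?, ?)",
                 [(7, "a", "x"), (12, "b", "y"), (40, "c", "z")])

# 1. Copy the data aside.
conn.execute("CREATE TABLE users_items_copy AS SELECT * FROM users_items")
# 2. Empty the original table.
conn.execute("DELETE FROM users_items")
# 3. Reset the counter (SQLite-specific stand-in for AUTO_INCREMENT = 1).
conn.execute("DELETE FROM sqlite_sequence WHERE name = 'users_items'")
# 4. Reinsert everything except id, so fresh ids are assigned in the old order.
conn.execute("INSERT INTO users_items (col2, col3) "
             "SELECT col2, col3 FROM users_items_copy ORDER BY id")
conn.execute("DROP TABLE users_items_copy")
conn.commit()

renumbered = conn.execute(
    "SELECT id, col2, col3 FROM users_items ORDER BY id").fetchall()
print(renumbered)
```

Any other table that refers to the old id values would have to be updated separately; this sketch assumes there are none.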
To start from one simply do the following:
ALTER TABLE tablename AUTO_INCREMENT = 1
If you need a reference, use the following links:
Altering a table:
http://dev.mysql.com/doc/refman/5.1/en/alter-table.html
Auto increment / reset primary key:
Reorder / reset auto increment primary key
MySQL reset auto increment:
http://www.mysqltutorial.org/mysql-reset-auto-increment
I believe my question was asked on SO, but I didn't find the answer.
There is a mysql table mytable with one column mycolumn.
What is the mysql query to remove duplicates from a table?
Is there only one column, with no primary key or other column you could use to tell the rows apart?
If so, this is bad practice. Consider adding a new numeric column (col_id here) and filling it with a distinct id for every record; then you can try this query:
delete t
from mytable t
join (select mycolumn, min(col_id) keep_id
from mytable group by mycolumn
having count(*) > 1
) inner_query
on inner_query.mycolumn = t.mycolumn and t.col_id > inner_query.keep_id;
Then you can add a primary key.
Here is one way to go about it as long as there are no triggers or foreign keys. Not tested because I'm on my phone, but should work. After this, maybe create a unique index on mycolumn to keep from getting duplicates.
Create table _mytable
Select distinct mycolumn from mytable;
delete from mytable;
Insert into mytable(mycolumn)
Select mycolumn from _mytable;
Drop table _mytable;
I have a table that has some duplicate results. For example:
`person_url` `movie_url`
1 2
1 2
2 3
Would become -->
`person_url` `movie_url`
1 2
2 3
I know how to do it by creating a new table,
create table tmp_credits (select distinct * from name);
However, it is a pretty large table and I have a couple indexes on it which will need to be re-created. How would I do this transformation in place, that is, without creating a new table?
You can add a UNIQUE index over your table's columns using the IGNORE keyword:
ALTER IGNORE TABLE name ADD UNIQUE INDEX (person_url, movie_url);
As stated in the manual:
IGNORE is a MySQL extension to standard SQL. It controls how ALTER TABLE works if there are duplicates on unique keys in the new table or if warnings occur when strict mode is enabled. If IGNORE is not specified, the copy is aborted and rolled back if duplicate-key errors occur. If IGNORE is specified, only the first row is used of rows with duplicates on a unique key. The other conflicting rows are deleted. Incorrect values are truncated to the closest matching acceptable value.
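This will also prevent duplicates from being added in the future. Be aware that the IGNORE clause of ALTER TABLE was removed in MySQL 5.7.4, so on newer servers the duplicates have to be deleted first and the unique index added afterwards.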
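To see how a unique index keeps duplicates out once the table is clean, here is a small sketch with Python's sqlite3 module. SQLite has no ALTER IGNORE, so the duplicates are deleted in place first (using SQLite's implicit rowid) and the index is created afterwards; the index name `uq_name` is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE name (person_url INTEGER, movie_url INTEGER)")
conn.executemany("INSERT INTO name VALUES (?, ?)", [(1, 2), (1, 2), (2, 3)])

# Step 1: remove duplicates in place, keeping the first row of each group.
conn.execute("""
    DELETE FROM name WHERE rowid NOT IN (
        SELECT MIN(rowid) FROM name GROUP BY person_url, movie_url)
""")
# Step 2: with the duplicates gone, the unique index can be built.
conn.execute("CREATE UNIQUE INDEX uq_name ON name (person_url, movie_url)")

# A plain INSERT of an existing pair is now rejected...
try:
    conn.execute("INSERT INTO name VALUES (1, 2)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
# ...while INSERT OR IGNORE skips it silently.
conn.execute("INSERT OR IGNORE INTO name VALUES (1, 2)")

final_rows = conn.execute(
    "SELECT person_url, movie_url FROM name ORDER BY person_url").fetchall()
print(rejected, final_rows)
```

No second table is created at any point, which matches the in-place requirement of the question.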
create table temp
(col1 varchar(20),col2 varchar(20));
INSERT INTO temp VALUES
('1','one'),('2','two'),('2','two');
select col1,col2 from temp
union
select col1,col2 from temp;
Have you considered just putting a semantic layer/view on top of the table that de-dups?
select person_url, movie_url
from name
group by person_url, movie_url
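A de-duplicating view of this kind leaves the base table untouched. The sketch below, using Python's sqlite3 module, wraps the GROUP BY query in a view; the view name `name_dedup` is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE name (person_url INTEGER, movie_url INTEGER)")
conn.executemany("INSERT INTO name VALUES (?, ?)", [(1, 2), (1, 2), (2, 3)])

# The view hides duplicates from readers without deleting anything.
conn.execute("""
    CREATE VIEW name_dedup AS
    SELECT person_url, movie_url
    FROM name
    GROUP BY person_url, movie_url
""")

view_rows = sorted(conn.execute("SELECT * FROM name_dedup").fetchall())
base_count = conn.execute("SELECT COUNT(*) FROM name").fetchone()[0]
print(view_rows, base_count)
```

The trade-off is that the grouping runs on every read, so this suits cases where cleaning the table itself is not an option.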
I have read many articles about this, but I would like to hear from you.
My problem is:
A table: ID (INT, unique, auto-increment), Title (varchar), Content (text), Keywords (varchar)
My PHP code always inserts new records, but it must not accept duplicate records based on Title or Keywords. So Title or Keywords can't be the primary field. My PHP code needs to check for existing rows and insert about 10-20 records at a time.
So, I check like this:
SELECT * FROM TABLE WHERE TITLE=XXX
If it returns nothing, I do the INSERT.
I read some other posts, and one person suggests:
INSERT IGNORE INTO Table values()
Another person suggests:
SELECT COUNT(ID) FROM TABLE
If it returns 0, then do the INSERT.
I don't know which of these queries is faster.
I have one more question: what is the difference between these queries, and which is faster?
SELECT COUNT(ID) FROM ..
SELECT COUNT(0) FROM ...
SELECT COUNT(1) FROM ...
SELECT COUNT(*) FROM ...
All of them give me the total number of records in the table, but I don't know whether MySQL treats the number 0 or 1 as my ID field. Even with SELECT COUNT(1000) I still get the total number of records, although my table only has 4 columns.
I'm using MySQL Workbench; does it have any option for testing query speed?
I would use the INSERT ... ON DUPLICATE KEY UPDATE statement. One important comment in the documentation states: "...if there is a single multiple-column unique index on the table, then the update uses (seems to use) all columns (of the unique index) in the update query."
So if there is a UNIQUE(Title,Keywords) constraint on the table in the example, then, you would use:
INSERT INTO table (Title,Content,Keywords) VALUES ('blah_title','blah_content','blah_keywords')
ON DUPLICATE KEY UPDATE Content='blah_content';
It should work, and it is a single query to the database.
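The single-query insert-or-update idea can be tried locally with Python's sqlite3 module. SQLite spells it `ON CONFLICT ... DO UPDATE` (available from SQLite 3.24) rather than MySQL's `ON DUPLICATE KEY UPDATE`, so this is an analogous sketch, not the MySQL statement itself; the table name `articles` and the sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE articles (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        Title TEXT, Content TEXT, Keywords TEXT,
        UNIQUE (Title, Keywords)
    )
""")

# One statement: insert the row, or update Content if (Title, Keywords) exists.
upsert = """
    INSERT INTO articles (Title, Content, Keywords) VALUES (?, ?, ?)
    ON CONFLICT (Title, Keywords) DO UPDATE SET Content = excluded.Content
"""
conn.execute(upsert, ("blah_title", "first", "blah_keywords"))
conn.execute(upsert, ("blah_title", "second", "blah_keywords"))   # updates
conn.execute(upsert, ("other_title", "third", "blah_keywords"))   # inserts
conn.commit()

articles = conn.execute(
    "SELECT Title, Content FROM articles ORDER BY ID").fetchall()
print(articles)
```

Note that the uniqueness here is on the pair (Title, Keywords), as in the answer; rejecting a row when either column alone matches would need two separate unique constraints.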
SELECT COUNT(*) FROM ... is faster than SELECT COUNT(ID) FROM ... Alternatively, build something like this:
INSERT INTO table (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE c=3;
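On the COUNT variants from the question: a constant such as 0, 1, or 1000 is not a column position. COUNT(expr) counts the rows where the expression is not NULL, and a constant is never NULL, so it behaves like COUNT(*). COUNT(column) differs only when the column contains NULLs. The sketch below shows this with Python's sqlite3 module on an invented three-row table; it says nothing about speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER, Title TEXT)")
# One row has a NULL Title and another has a NULL ID.
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(1, "a"), (2, None), (None, "c")])

counts = conn.execute("""
    SELECT COUNT(*), COUNT(0), COUNT(1), COUNT(1000), COUNT(ID), COUNT(Title)
    FROM t
""").fetchone()
print(counts)
```

With an ID column that is declared NOT NULL, COUNT(ID) and COUNT(*) return the same number.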