Here's my table with some sample data:
a_id | b_id
------------
1    | 225
2    | 494
3    | 589
When I run this query:
INSERT IGNORE INTO table_name (a_id, b_id) VALUES ('4', '230'), ('2', '494');
It inserts both of those rows, when it's supposed to ignore the second value pair (2, 494).
No indexes are defined, and neither of those columns is a primary key.
What don't I know?
From the docs:
If you use the IGNORE keyword, errors that occur while executing the INSERT statement are treated as warnings instead. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted. With IGNORE, the row still is not inserted, but no error is issued.
(my italics).
Your row is not duplicating "an existing UNIQUE index or PRIMARY KEY value" since you have neither a primary key nor any unique constraints.
If, as you mention in one of your comments, you want neither field to be unique but you do want the combination to be unique, you need a composite primary key across both columns (get rid of any duplicates first):
alter table MYTABLE add primary key (a_id,b_id)
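A minimal sketch of that sequence, assuming the table from the question is called table_name, holds only the two columns a_id and b_id, and contains no NULLs in either column. The helper table table_name_dedup is a hypothetical name introduced only for this illustration:

```sql
-- 1. See which (a_id, b_id) pairs occur more than once.
SELECT a_id, b_id, COUNT(*) AS n
FROM table_name
GROUP BY a_id, b_id
HAVING COUNT(*) > 1;

-- 2. Keep one copy of each pair (table_name_dedup is a hypothetical helper table).
CREATE TABLE table_name_dedup AS
  SELECT DISTINCT a_id, b_id FROM table_name;
DELETE FROM table_name;
INSERT INTO table_name (a_id, b_id)
  SELECT a_id, b_id FROM table_name_dedup;
DROP TABLE table_name_dedup;

-- 3. With the duplicates gone, the composite key can be added.
ALTER TABLE table_name ADD PRIMARY KEY (a_id, b_id);

-- 4. Now (2, 494) collides with an existing row and is skipped with a warning;
--    only (4, 230) is inserted.
INSERT IGNORE INTO table_name (a_id, b_id) VALUES (4, 230), (2, 494);
```

Running SHOW WARNINGS right after step 4 should list the duplicate-key warning for the skipped row.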
If you don't define a UNIQUE constraint or set a PRIMARY KEY, MySQL won't know that your new entry is a duplicate.
If there is no primary key, there can't be a duplicate key to ignore. You should always set a primary key, so please do that. If you have additional columns that shouldn't contain duplicates, define them as UNIQUE.
If I understand you correctly, after you run the insert command your table looks like this
1 225
2 494
3 589
4 230
2 494
If so, then the answer is because your table design allows duplicates.
If you want to prevent the second record from being inserted, you'll need to define the a_id column as a primary key or give it a unique index. If you do, the INSERT IGNORE statement will work as you expect: it inserts the records and ignores errors such as an attempt to add a duplicate record.
Related
Hello I am using the "INSERT ON DUPLICATE KEY UPDATE" sql statement to update my database.
All was working fine, since I always inserted a unique id like this:
INSERT INTO devices(uniqueId,name)
VALUES (4,'Printer')
ON DUPLICATE KEY UPDATE name = 'Central Printer';
But now I need to insert elements without supplying a unique id. I only insert or update the values, like this:
INSERT INTO table (a,b,c,d,e,f,g)
VALUES (2,3,4,5,6,7,8)
ON DUPLICATE KEY
UPDATE a=a, b=b, c=c, d=d, e=e, f=f, g=g;
Note that an auto-increment primary key is generated every time I insert a row.
My problem is that now the inserted rows are duplicated since I don't insert the primary key or unique id explicitly within the sql statement.
What am I supposed to do?
For example, maybe I need to insert the primary key explicitly? I would like to work with this primary autoincremented key.
On Gordon's recommendation, I am adding a sample case, which you can see in the following image:
Rows Output
In this case I add the first three rows, and then I try to update those three rows again with different information. OK, now I see the error: there is no key to compare against.
Thanks for your answers.
If you want to prevent columns from being duplicated, then create a unique index or constraint on them. For instance:
create unique index unq_table_7 on table(a, b, c, d, e, f, g);
This will guarantee that the 7 columns -- in combination -- are unique.
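As a rough illustration of how such an index changes the behaviour of the statement in the question, here is a sketch on a hypothetical table named measurements, cut down to three columns. The table name, index name, and column types are assumptions, not taken from the question:

```sql
-- Hypothetical table: an auto-increment key plus the columns that must be unique together.
CREATE TABLE measurements (
  id INT AUTO_INCREMENT PRIMARY KEY,
  a INT NOT NULL,
  b INT NOT NULL,
  c INT NOT NULL
);
CREATE UNIQUE INDEX unq_measurements_abc ON measurements (a, b, c);

-- First run: no matching row exists, so a row is inserted.
INSERT INTO measurements (a, b, c) VALUES (2, 3, 4)
ON DUPLICATE KEY UPDATE a = a;

-- Second run: (2, 3, 4) already exists, so the UPDATE branch runs instead.
-- Because a = a changes nothing, the table still holds a single row.
INSERT INTO measurements (a, b, c) VALUES (2, 3, 4)
ON DUPLICATE KEY UPDATE a = a;

SELECT COUNT(*) FROM measurements;  -- 1
```

Without the unique index, both statements would insert, because the only key would be the auto-generated id, which never collides.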
I copied fields A, B, C, D from one table into another, believing I had created a unique index on A, B, C, D to prevent duplicates. However, I had somehow created only a normal index on them, so duplicates were inserted. It is a 20 million record table.
If I change my existing index from normal to unique, or simply add a new unique index on A, B, C, D, will the duplicates be removed, or will adding the index fail because duplicate records exist? I'd test it, but the table has 30 million records, and I don't want to either mess it up or duplicate it.
If you have duplicates in your table and you use
ALTER TABLE mytable ADD UNIQUE INDEX myindex (A, B, C, D);
the query will fail with Error 1062 (duplicate key).
But if you use IGNORE
-- (only works before MySQL 5.7.4)
ALTER IGNORE TABLE mytable ADD UNIQUE INDEX myindex (A, B, C, D);
the duplicates will be removed. But the documentation doesn't specify which row will be kept:
IGNORE is a MySQL extension to standard SQL. It controls how ALTER TABLE works if there are duplicates on unique keys in the new table or
if warnings occur when strict mode is enabled. If IGNORE is not
specified, the copy is aborted and rolled back if duplicate-key errors
occur. If IGNORE is specified, only one row is used of rows with
duplicates on a unique key. The other conflicting rows are deleted.
Incorrect values are truncated to the closest matching acceptable
value.
As of MySQL 5.7.4, the IGNORE clause for ALTER TABLE is removed and
its use produces an error.
(ALTER TABLE Syntax)
If your version is 5.7.4 or later, you can:
Copy the data into a temporary table (it doesn't technically need to be temporary).
Truncate the original table.
Create the UNIQUE INDEX.
And copy the data back with INSERT IGNORE (which is still available).
CREATE TABLE tmp_data SELECT * FROM mytable;
TRUNCATE TABLE mytable;
ALTER TABLE mytable ADD UNIQUE INDEX myindex (A, B, C, D);
INSERT IGNORE INTO mytable SELECT * from tmp_data;
DROP TABLE tmp_data;
If you use the IGNORE modifier, errors that occur while executing the
INSERT statement are ignored. For example, without IGNORE, a row that
duplicates an existing UNIQUE index or PRIMARY KEY value in the table
causes a duplicate-key error and the statement is aborted. With
IGNORE, the row is discarded and no error occurs. Ignored errors
generate warnings instead.
(INSERT Syntax)
Also see: INSERT ... SELECT Syntax and Comparison of the IGNORE Keyword and Strict SQL Mode
If there are duplicates, adding the unique index will fail.
First, check which duplicates exist:
select * from
(select a,b,c,d,count(*) as n from table_name group by a,b,c,d) x
where x.n > 1
This may be an expensive query on 20M rows, but it will return all the duplicate keys that would prevent you from adding the unique index.
You could split this up into smaller chunks by adding a WHERE clause to the subquery, for example: where a='some_value'
For the records retrieved, you will have to change something to make the rows unique. Once that is done (the query returns 0 rows), you should be safe to add the unique index.
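If deleting the extra copies is acceptable, rather than changing them to make them unique, one possible cleanup is sketched below. It assumes the table has a unique id column, which the question does not confirm. On a table of this size the self-join may be slow, so treat it as a starting point and try it on a copy first:

```sql
-- The duplicate check, written with HAVING instead of a derived table.
SELECT a, b, c, d, COUNT(*) AS n
FROM table_name
GROUP BY a, b, c, d
HAVING COUNT(*) > 1;

-- Assumption: table_name has a unique `id` column.
-- For each group of identical (a, b, c, d), keep the row with the lowest id
-- and delete the others.
DELETE t1
FROM table_name AS t1
JOIN table_name AS t2
  ON  t1.a = t2.a
  AND t1.b = t2.b
  AND t1.c = t2.c
  AND t1.d = t2.d
  AND t1.id > t2.id;
```

Rows with NULL in any of the four columns are not matched by this join. A unique index in MySQL also allows multiple NULLs, so those rows would not block the index either.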
Instead of IGNORE you can use ON DUPLICATE KEY UPDATE, which will give you control over which values should prevail.
To answer your question: adding a UNIQUE constraint on a column that has duplicate values will throw an error.
For example, you can try the following script:
CREATE TABLE `USER` (
`USER_ID` INT NOT NULL,
`USERNAME` VARCHAR(45) NOT NULL,
`NAME` VARCHAR(45) NULL,
PRIMARY KEY (`USER_ID`));
INSERT INTO USER VALUES(1,'apple', 'woz'),(2,'apple', 'jobs'),
(3,'google', 'sergey'),(4,'google', 'larry');
ALTER TABLE `USER`
ADD UNIQUE INDEX `USERNAME_UNIQUE` (`USERNAME` ASC);
/*
Operation failed: There was an error while applying the SQL script to the database.
ERROR 1062: Duplicate entry 'apple' for key 'USERNAME_UNIQUE'
*/
I've been reading up on how to use MySQL insert on duplicate key to see if it will allow me to avoid Selecting a row, checking if it exists, and then either inserting or updating. As I've read the documentation however, there is one area that confuses me. This is what the documentation says:
If you specify ON DUPLICATE KEY UPDATE, and a row is inserted that would cause a duplicate value in a UNIQUE index or PRIMARY KEY, an UPDATE of the old row is performed
The thing is, I don't know if this will work for my problem, because the 'condition' I have for not inserting a new row is the existence of a row that has two columns equal to certain values, not necessarily that the primary key is the same. Right now the syntax I'm imagining is this, but I don't know if it will always insert instead of update:
INSERT INTO attendance (event_id, user_id, status) VALUES(some_event_number, some_user_id, some_status) ON DUPLICATE KEY UPDATE status=1
The thing is, event_id and user_id aren't primary keys, but if a row in the table 'attendance' already has those columns with those values, I just want to update it. Otherwise I would like to insert it. Is this even possible with ON DUPLICATE? If not, what other method might I use?
The quote includes "a duplicate value in a UNIQUE index". So, your values do not need to be the primary key:
create unique index attendance_eventid_userid on attendance(event_id, user_id);
Presumably, you want to update the existing record because you don't want duplicates. If you want duplicates sometimes, but not for this particular insert, then you will need another method.
If I were you, I would make a primary key out of event_id and user_id. That will make this extremely easy with ON DUPLICATE.
create table attendance (
event_id int,
user_id int,
status varchar(100),
primary key(event_id, user_id)
);
Then with ease:
insert into attendance (event_id, user_id, status) values(some_event_number, some_user_id, some_status)
on duplicate key
update status = values(status);
Maybe you can try to write a trigger that checks if the pair (event_id, user_id) exists in the table before inserting, and if it exists just update it.
To the broader question of "Will INSERT ... ON DUPLICATE respect a UK even if the PK changes", the answer is yes: SQLFiddle
In this SQLFiddle I insert a new record, with a new PK id, but its values would violate the UK. It performs the ON DUPLICATE and the original PK id is preserved, but the non-UK ON DUPLICATE KEY UPDATE value changes.
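The behaviour described above can be reproduced with a sketch along these lines. The table name attendance_demo, the surrogate id column, and the sample values are all assumptions made for illustration:

```sql
CREATE TABLE attendance_demo (
  id INT AUTO_INCREMENT PRIMARY KEY,   -- assumed surrogate key
  event_id INT NOT NULL,
  user_id INT NOT NULL,
  status INT NOT NULL,
  UNIQUE KEY attendance_eventid_userid (event_id, user_id)
);

-- No row for (10, 20) yet: a new row is inserted with status 0.
INSERT INTO attendance_demo (event_id, user_id, status) VALUES (10, 20, 0)
ON DUPLICATE KEY UPDATE status = 1;

-- (10, 20) now violates the unique key, so the existing row is updated instead.
INSERT INTO attendance_demo (event_id, user_id, status) VALUES (10, 20, 0)
ON DUPLICATE KEY UPDATE status = 1;

-- One row remains. It keeps the id assigned by the first insert; status is now 1.
SELECT id, event_id, user_id, status FROM attendance_demo;
```

The auto-increment counter may still advance on the second statement, so later ids can have gaps, but the id of the existing row does not change.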
I am trying to insert rows into a table that has no unique field or primary key. How can I write a query that will simply ignore the insert if there already exists a row with the exact same values on all fields -- a duplicate row?
Thanks.
You must have a primary key or unique key defined on some column or columns in the table for uniqueness to have any meaning. Every mechanism for detecting duplicates automatically relies on this being true.
You can't do the SELECT COUNT(*)... solution because it's subject to race conditions. That is, someone could insert a duplicate row in the moment after you select and before you insert. The only way around this is to lock the table with SELECT ... FOR UPDATE or LOCK TABLES.
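A sketch of the key-based approach for a table where every field takes part in the uniqueness check. The table page_hits and its columns are hypothetical. The columns are declared NOT NULL because a MySQL unique index allows multiple rows containing NULL:

```sql
-- Hypothetical table with no key at all.
CREATE TABLE page_hits (
  visitor_id INT NOT NULL,
  page_id INT NOT NULL,
  hit_date DATE NOT NULL
);

-- Declare that the combination of all fields must be unique.
ALTER TABLE page_hits
  ADD UNIQUE KEY uq_page_hits_all (visitor_id, page_id, hit_date);

INSERT IGNORE INTO page_hits VALUES (1, 7, '2024-01-01');  -- inserted
INSERT IGNORE INTO page_hits VALUES (1, 7, '2024-01-01');  -- skipped with a warning
```

Because the server enforces the check as part of the insert, there is no gap between checking and inserting for another session to slip a duplicate into.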
Uh, why not make a primary key?
Otherwise, you basically have to run SELECT COUNT(*) FROM table WHERE field1=value AND ... AND fieldN=value before EVERY insert.
I do file parsing in Perl and insert into a table in a MySQL database. For example, I have the following fields:
S.No | PCID | USERNAME | TIME  | INFORMATION
1    | 203  | JANE     | 22:08 | updation
2    | 203  | JANE     | 22:09 | deletion
3    | 203  | JANE     | 22:10 | insertion
In this table I wanted to have the PCID to be unique, USERNAME to be unique. S.No is unique, since I have set it to autonumbering and it's the primary key. Now my question is:
If I add PCID and USERNAME as a composite primary key, I still find duplicates in the table. Nothing changes; the output is the same. What should be done to remove the duplicates? Should I write Perl code to check for duplicates before insertion?
Please guide and provide assistance. Thanks in advance.
You want the S.No to remain the primary key and PCID + USERNAME to be unique, so close to what Hammerite said:
ALTER TABLE MyTable
ADD PRIMARY KEY (`S.No`),
ADD UNIQUE KEY `PCID_USER_uk` (`PCID`, `USERNAME`);
I'm assuming that you want each column to be unique, rather than each composite key to be unique. Use a unique constraint on the columns that should be unique:
http://www.java2s.com/Code/SQL/Select-Clause/Altertableaddunique.htm
http://www.java2s.com/Code/SQL/Select-Clause/SettingaUniqueConstraint.htm
I'm not sure what happens if you add a unique constraint to MySQL on a column that doesn't have unique values already. You might have to perform manual cleanup before it will let you add the constraint.
You definitely shouldn't do this in Perl. Good data driven apps are about getting all of the logic of the app as close to the data model as possible. This one belongs database side.
ALTER TABLE MyTable DROP PRIMARY KEY;
ALTER TABLE MyTable
ADD PRIMARY KEY (`S.No`),
ADD UNIQUE KEY `PCID_uk` (`PCID`),
ADD UNIQUE KEY `USERNAME_uk` (`USERNAME`);
If the file you're importing from contains duplicate values and you want the duplicate values to be discarded, use the IGNORE keyword. If you're using LOAD DATA INFILE then this is achieved using syntax like this:
LOAD DATA INFILE 'file_name' IGNORE INTO TABLE ...
See this documentation page.
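For illustration only, a fuller statement might look like the sketch below. The file path, the comma-separated layout, and the header line are assumptions. The column names come from the question, and the statement only discards duplicates if a unique key such as the one on (PCID, USERNAME) already exists:

```sql
-- Assumptions: a comma-separated file at a hypothetical path with one header line,
-- and a unique key already defined on the columns that must not repeat.
LOAD DATA INFILE '/tmp/pc_log.csv'
IGNORE INTO TABLE MyTable            -- IGNORE here: skip rows that hit a duplicate key
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES                       -- a different IGNORE: skip the header line
(PCID, USERNAME, `TIME`, INFORMATION);
```

S.No is left out of the column list so that the auto-numbering fills it in.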