When to use different never-rejected ("insert or update") MySQL statements? - mysql

A standard problem in applications is to insert a record if one doesn't exist, or update it if it does. In cases where the PRIMARY KEY is unknown, this is usually solved by issuing a SELECT and then running either an UPDATE if the record was found or an INSERT if it was not.
However, there seem to be at least three ways to insert a record into a database even when a record already exists. Personally, I would rather drop the new insert request if one already exists, but there might be cases where you would rather drop the record in the database and use the new one.
CREATE TABLE IF NOT EXISTS `table` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`foo` int(10) unsigned NOT NULL,
`bar` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `row` (`foo`,`bar`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Here are the three methods:
INSERT IGNORE INTO `table` (foo, bar) VALUES (2,3);
INSERT INTO `table` (foo, bar) VALUES (2,3) ON DUPLICATE KEY UPDATE;
REPLACE INTO `table` (foo, bar) VALUES (2,3);
At what times should each of these methods be used?
Can someone give some examples of correct usage scenarios?

INSERT should be used when you just want to insert a new row.
Let's say you are storing log entries: you'll want to log every event, so use INSERT.
INSERT IGNORE should be used when you just want a specific key to exist in the table; it doesn't matter whether the current insert creates it or it is already present.
Let's say you have a table of phone numbers and the number of uses. You find a new phone number that you are not sure exists in the table, but you want it to be there.
You use INSERT IGNORE to make sure that it's there.
REPLACE INTO should be used when you want to make sure that a specific key exists in the table and, if it already exists, you'd like the new values to be used instead of those present.
You have another table with phone numbers; this time you find a new number and a name to associate it with.
You use REPLACE INTO to replace the complete record if it exists, or just insert the new information.
INSERT INTO ... ON DUPLICATE KEY UPDATE ...
Please note that this is not an alternative way of writing REPLACE INTO. Use it whenever you'd like to make sure that a specific key exists but, if it already does, update only some of the columns rather than all of them.
For example, if you are storing the number of visits from a certain IP and the first page the user ever visited:
INSERT INTO visitors (ip,visits,first_page) VALUES (<ip>,1,<current_page>) ON DUPLICATE KEY UPDATE visits = visits + 1;
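A minimal sketch of the visit counter described above. The answer does not show the visitors schema, so the table definition, column types, and sample values here are assumptions for illustration only; the one thing that matters is that ip carries a PRIMARY or UNIQUE key.

```sql
-- Hypothetical schema: the answer gives no table definition.
CREATE TABLE IF NOT EXISTS visitors (
  ip         VARCHAR(45)  NOT NULL,
  visits     INT UNSIGNED NOT NULL DEFAULT 0,
  first_page VARCHAR(255) NOT NULL,
  PRIMARY KEY (ip)
) ENGINE=InnoDB;

-- First request from this address: no key clash, so the row is inserted.
INSERT INTO visitors (ip, visits, first_page)
VALUES ('203.0.113.7', 1, '/home')
ON DUPLICATE KEY UPDATE visits = visits + 1;

-- Later request: the key exists, so only visits is changed.
INSERT INTO visitors (ip, visits, first_page)
VALUES ('203.0.113.7', 1, '/pricing')
ON DUPLICATE KEY UPDATE visits = visits + 1;

-- first_page still holds '/home' and visits is now 2.
SELECT ip, visits, first_page FROM visitors;
```

Running the same pair of statements with REPLACE INTO instead would overwrite first_page with '/pricing' and reset visits to 1, which is the difference the answer is pointing at.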

In your question, you have
INSERT INTO `table` (foo, bar) VALUES (2,3) ON DUPLICATE KEY UPDATE;
That can only work if the row index was UNIQUE:
CREATE TABLE IF NOT EXISTS `table` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`foo` int(10) unsigned NOT NULL,
`bar` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `row` (`foo`,`bar`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Otherwise, why have just an index?
Also, the ON DUPLICATE KEY UPDATE clause allows you to update non-indexed columns.
As for REPLACE, keep in mind that REPLACE is actually DELETE and INSERT under the hood.
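To see the DELETE-plus-INSERT behaviour for yourself, a small sketch along these lines should do. The table name replace_demo and the index name uq_foo_bar are made up for the example; the columns mirror the table in the question.

```sql
-- Hypothetical demo table modelled on the question's schema.
CREATE TABLE IF NOT EXISTS replace_demo (
  id  INT UNSIGNED NOT NULL AUTO_INCREMENT,
  foo INT UNSIGNED NOT NULL,
  bar INT UNSIGNED NOT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY uq_foo_bar (foo, bar)
) ENGINE=InnoDB;

INSERT INTO replace_demo (foo, bar) VALUES (2, 3);
-- Note the id assigned to the row.
SELECT id, foo, bar FROM replace_demo;

-- Same (foo, bar) pair: the old row is removed and a new one is inserted.
REPLACE INTO replace_demo (foo, bar) VALUES (2, 3);
-- There is still one row, but it should now carry a new AUTO_INCREMENT id.
SELECT id, foo, bar FROM replace_demo;
```

Because the id changes, any other table that stored the old id no longer points at a live row, which is worth keeping in mind before choosing REPLACE.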

Related

How to implement conditional unique constraint

I have a table that needs a unique constraint on 3 columns, but if the "date" column for that insert transaction is newer than the current record's date, then I want to update that record (so the unique constraint is still true for the table).
Postgres has the concept of deferrable constraints, MySQL does not.
I do want to implement it with the SQL object tools available, though.
Here is my table DDL with column names obfuscated:
CREATE TABLE `apixio_results_test_sefath` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`number` varchar(20) DEFAULT NULL,
`insert_date` datetime DEFAULT NULL,
`item_id` int(5) DEFAULT NULL,
`rule` tinyint(4) DEFAULT NULL,
`another_column` varchar(20) DEFAULT NULL,
`another_column1` varchar(20) DEFAULT NULL,
PRIMARY KEY (`ID`),
KEY `insert_date_index` (`insert_date`),
KEY `number` (`number`)
) ENGINE=InnoDB AUTO_INCREMENT=627393 DEFAULT CHARSET=latin1
and here is the unique constraint statement
ALTER TABLE dbname.table ADD CONSTRAINT my_unique_constraint UNIQUE (number, item_id, rule);
but I can not add a condition here in this constraint (unless there is a way I'm not aware of?)
The logic I need to run before inserts are blocked by the constraint is to check if the three values: number, item_id, and rule are unique in the table, and if they aren't, then I want to compare the existing record's insert_date with the insert_date from the transaction, and only keep the record with the newest insert_date.
This could be achieved with a trigger I suppose, although I've heard triggers are only to be used if really needed. And on every insert, this trigger would be quite computationally taxing on the DB. Any advice? Any other sql tricks I can use? Or anything to help point me to how to make this trigger?
I tried the unique constraint statement
ALTER TABLE dbname.table ADD CONSTRAINT my_unique_constraint UNIQUE (number, item_id, rule);
But it will never update with the newer insert_date.
You can do this with an insert statement like:
insert into apixio_results_test_sefath (number, item_id, rule, insert_date, another_column, another_column1)
values (?,?,?,?,?,?)
on duplicate key update
another_column=if(insert_date>values(insert_date),another_column,values(another_column)),
another_column1=if(insert_date>values(insert_date),another_column1,values(another_column1)),
insert_date=greatest(insert_date,values(insert_date))
Repeat this for each column other than the unique ones and insert_date: test whether the existing insert_date is greater than the value supplied with the insert, and keep the existing value or take the new one accordingly. Update insert_date last, so that the earlier comparisons still see the old value, and only raise it if the new one is greater.
MySQL 8 has an alternative syntax that it prefers over the VALUES() function, but VALUES() still works.
If you want this to happen automatically for all inserts, you would need to use a trigger.
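For reference, here is roughly how the same upsert might look with the row-alias syntax that MySQL 8.0.19 and later prefer over VALUES(). This is a sketch, not a tested drop-in: the alias name new, the n_ prefixed column aliases, and the sample values are all made up, and it assumes the unique key on (number, item_id, rule) has already been added.

```sql
INSERT INTO apixio_results_test_sefath
  (number, item_id, rule, insert_date, another_column, another_column1)
VALUES
  ('A-100', 7, 1, '2024-05-01 00:00:00', 'x', 'y')
  -- Column aliases get distinct names so that unqualified names below
  -- always mean the row already stored in the table.
  AS new (n_number, n_item_id, n_rule, n_insert_date, n_col, n_col1)
ON DUPLICATE KEY UPDATE
  another_column  = IF(insert_date > n_insert_date, another_column,  n_col),
  another_column1 = IF(insert_date > n_insert_date, another_column1, n_col1),
  -- Assigned last: assignments run left to right, so the IFs above
  -- still compare against the old insert_date.
  insert_date     = GREATEST(insert_date, n_insert_date);
```

Note that insert_date is nullable in the DDL above; a NULL on either side makes the comparison false and makes GREATEST return NULL, so you may want to rule that case out first.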

How to perform multiple updates with a unique index in MySQL

I have the following table with a unique index by field "position_in_list":
CREATE TABLE `planned_operation` (
`id` bigint(20) NOT NULL,
`position_in_list` bigint(20) NOT NULL,
`name` varchar(255) not null
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
ALTER TABLE `planned_operation`
ADD PRIMARY KEY (`id`),
ADD UNIQUE KEY `position_in_list` (`position_in_list`);
ALTER TABLE `planned_operation`
MODIFY `id` bigint(20) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=3;
INSERT INTO `planned_operation` (`id`, `position_in_list`, `name`) VALUES
(1, 1, 'first'),
(2, 2, 'second');
Then I have a trivial task: changing positions when the list is updated. When a record is inserted, every item that comes after it has to be shifted. To avoid running thousands of updates, I execute one query:
update planned_operation
set position_in_list = case position_in_list
when 2 then 3
when 1 then 2
end
where position_in_list in (1, 2)
But executing it raises an error:
#1062 - Duplicate entry '1' for key 'position_in_list'
Is there any way to avoid an error? Without disabling the unique index
You want deferrable constraints.
Unfortunately, MySQL does not implement deferrable constraint checks, a part of standard SQL that few database engines implement.
As far as I know, only PostgreSQL and Oracle (partially) implement them.
In simple words, this means that MySQL checks the unique constraint on every single row change inside an UPDATE statement. With deferrable constraints you could defer this check to the end of the statement, or even to the end of the database transaction.
To defer constraint checks to the end of the statement (as you seem to want), you would need to switch to PostgreSQL or Oracle. I guess that's well out of scope for you, but it's a theoretical option.
For a more in-depth discussion, see the Deferrable Constraints answer.
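One workaround that is not part of the answer above, but that is often used in MySQL for exactly this kind of shift: a single-table UPDATE accepts ORDER BY, so you can make MySQL process the rows in an order that never produces a temporary duplicate. A sketch against the planned_operation table from the question:

```sql
-- Shift positions 1 and 2 up by one. Highest position first, so each
-- row moves into a slot that has already been vacated.
UPDATE planned_operation
SET position_in_list = position_in_list + 1
WHERE position_in_list IN (1, 2)
ORDER BY position_in_list DESC;

-- 'first' should now be at position 2 and 'second' at position 3.
SELECT id, position_in_list, name
FROM planned_operation
ORDER BY position_in_list;
```

This only helps when a safe processing order exists (a uniform shift up or down). An arbitrary permutation, such as swapping two positions, still needs an intermediate value or a different approach.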

What is the fastest procedure to remove duplicates from a big table in MySQL

I have a table in MySQL (50 million rows) into which new data is inserted periodically.
This table has the following structure:
CREATE TABLE `values` (
id double NOT NULL AUTO_INCREMENT,
channel_id int(11) NOT NULL,
val text NOT NULL,
date_time datetime NOT NULL,
PRIMARY KEY (id),
KEY channel_date_index (channel_id,date_time)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Two rows must never have the same channel_id and date_time, but if such an insert occurs it is important to keep the newest value.
Is there a procedure to check for duplicates in real time before the insert, or should I keep inserting all data and check for duplicates periodically in a separate cycle?
Realtime speed is important here, because 100 inserts occur per second.
To prevent future duplicates:
Change KEY channel_date_index (channel_id,date_time) to UNIQUE (channel_id,date_time)
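Change the INSERT to INSERT ... ON DUPLICATE KEY UPDATE ... to update the existing row when that pair already exists.
To fix the existing table, you could do ALTER IGNORE TABLE ... ADD UNIQUE(...). Note that the IGNORE modifier for ALTER TABLE was removed in MySQL 5.7.4, so this only works on older servers. However, that would not give you the latest timestamps.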
For minimum downtime (not maximum speed), use pt-online-schema-change.
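Put together, the two "prevent future duplicates" steps might look like the sketch below. The sample values are invented, the table name is backtick-quoted because values is a reserved word, and the ALTER will fail if duplicate (channel_id, date_time) pairs are still present, so existing duplicates have to be cleaned up first.

```sql
-- Swap the plain index for a unique one on the same columns.
ALTER TABLE `values`
  DROP INDEX channel_date_index,
  ADD UNIQUE KEY channel_date_index (channel_id, date_time);

-- An insert for an existing (channel_id, date_time) pair now overwrites
-- val instead of creating a second row, so the newest value wins.
INSERT INTO `values` (channel_id, val, date_time)
VALUES (42, '17.5', '2024-05-01 12:00:00')
ON DUPLICATE KEY UPDATE val = VALUES(val);
```

The duplicate check happens inside the index lookup the insert performs anyway, so there is no separate SELECT per row.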

On Duplicate Key Update same as insert

I've searched around but didn't find if it's possible.
I've this MySQL query:
INSERT INTO table (id,a,b,c,d,e,f,g) VALUES (1,2,3,4,5,6,7,8)
Field id has a "unique index", so there can't be two of them. Now if the same id is already present in the database, I'd like to update it. But do I really have to specify all these field again, like:
INSERT INTO table (id,a,b,c,d,e,f,g) VALUES (1,2,3,4,5,6,7,8)
ON DUPLICATE KEY UPDATE a=2,b=3,c=4,d=5,e=6,f=7,g=8
Or:
INSERT INTO table (id,a,b,c,d,e,f,g) VALUES (1,2,3,4,5,6,7,8)
ON DUPLICATE KEY UPDATE a=VALUES(a),b=VALUES(b),c=VALUES(c),d=VALUES(d),e=VALUES(e),f=VALUES(f),g=VALUES(g)
I've specified everything already in the insert...
An extra note: I'd like to use the workaround to get the ID too!
id=LAST_INSERT_ID(id)
I hope somebody can tell me what the most efficient way is.
The UPDATE statement is given so that older fields can be updated to new value. If your older values are the same as your new ones, why would you need to update it in any case?
For eg. if your columns a to g are already set as 2 to 8; there would be no need to re-update it.
Alternatively, you can use:
INSERT INTO table (id,a,b,c,d,e,f,g)
VALUES (1,2,3,4,5,6,7,8)
ON DUPLICATE KEY
UPDATE a=a, b=b, c=c, d=d, e=e, f=f, g=g;
How you get the id from LAST_INSERT_ID depends on the backend application you're using, so you would need to specify that.
For LuaSQL, a conn:getlastautoid() fetches the value.
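On the SQL side, the id=LAST_INSERT_ID(id) workaround mentioned in the question could be sketched as follows. The upsert_demo table and its columns are hypothetical; the point is that passing an expression to LAST_INSERT_ID() makes the next LAST_INSERT_ID() call return that value, even when the statement updated a row rather than inserting one.

```sql
-- Hypothetical table: auto-increment id plus a separate unique key.
CREATE TABLE IF NOT EXISTS upsert_demo (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  a  INT NOT NULL,
  b  INT NOT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY uq_a (a)
) ENGINE=InnoDB;

-- Inserts the row the first time; on a clash, updates b and records
-- the existing row's id for LAST_INSERT_ID().
INSERT INTO upsert_demo (a, b) VALUES (2, 3)
ON DUPLICATE KEY UPDATE id = LAST_INSERT_ID(id), b = VALUES(b);

-- Returns the id of the row that was inserted or updated.
SELECT LAST_INSERT_ID();
```

Client libraries that expose a "last insert id" call generally read this same per-connection value.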
There is a MySQL specific extension to SQL that may be what you want - REPLACE INTO
However, it does not work quite the same as ON DUPLICATE KEY UPDATE.
It deletes the old row that clashes with the new row and then inserts the new row. So long as you don't have a primary key on the table that would be fine, but if you do, and any other table references that primary key, those references are affected by the delete.
You can't reference the values in the old rows so you can't do an equivalent of
INSERT INTO mytable (id, a, b, c) values ( 1, 2, 3, 4)
ON DUPLICATE KEY UPDATE
id=1, a=2, b=3, c=c + 1;
I'd like to use the work around to get the ID to!
That should work — last_insert_id() should have the correct value so long as your primary key is auto-incrementing.
However as I said, if you actually use that primary key in other tables, REPLACE INTO probably won't be acceptable to you, as it deletes the old row that clashed via the unique key.
As someone else suggested earlier, you can reduce some typing by doing:
INSERT INTO `tableName` (`a`,`b`,`c`) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE `a`=VALUES(`a`), `b`=VALUES(`b`), `c`=VALUES(`c`);
There is no other way; I have to specify everything twice: first for the insert, and again for the update case.
Here is a solution to your problem:
I tried to solve a problem like yours, and I suggest testing it with a simple case first.
Follow these steps:
Step 1: Create a table schema using this SQL Query:
CREATE TABLE IF NOT EXISTS `user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`username` varchar(30) NOT NULL,
`password` varchar(32) NOT NULL,
`status` tinyint(1) DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `no_duplicate` (`username`,`password`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1;
Step 2: If the table was created without the unique key shown in Step 1, add a unique index on the two columns to prevent duplicate data:
ALTER TABLE `user` ADD UNIQUE INDEX no_duplicate (`username`, `password`);
Alternatively, create the same unique index on the two columns from a GUI tool.
Step 3: Update if exist, insert if not using following queries:
INSERT INTO `user`(`username`, `password`) VALUES ('ersks','Nepal') ON DUPLICATE KEY UPDATE `username`='master',`password`='Nepal';
INSERT INTO `user`(`username`, `password`) VALUES ('master','Nepal') ON DUPLICATE KEY UPDATE `username`='ersks',`password`='Nepal';
Just in case you are able to utilize a scripting language to prepare your SQL queries, you could reuse field=value pairs by using SET instead of (a,b,c) VALUES(a,b,c).
An example with PHP:
$pairs = "a=$a,b=$b,c=$c";
$query = "INSERT INTO $table SET $pairs ON DUPLICATE KEY UPDATE $pairs";
Example table:
CREATE TABLE IF NOT EXISTS `tester` (
`a` int(11) NOT NULL,
`b` varchar(50) NOT NULL,
`c` text NOT NULL,
UNIQUE KEY `a` (`a`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
I know it's late, but I hope this answer helps someone.
INSERT INTO t1 (a,b,c) VALUES (1,2,3),(4,5,6)
ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);
You can read the tutorials here:
https://mariadb.com/kb/en/library/insert-on-duplicate-key-update/
http://www.mysqltutorial.org/mysql-insert-or-update-on-duplicate-key-update/
You may want to consider using REPLACE INTO syntax, but be warned, upon duplicate PRIMARY / UNIQUE key, it DELETES the row and INSERTS a new one.
You won't need to re-specify all the fields. However, you should consider the possible performance reduction (depends on your table design).
Caveats:
If you have AUTO_INCREMENT primary key, it will be given a new one
Indexes will probably need to be updated
With MySQL v8.0.19 and above you can do this:
mysql doc
INSERT INTO mytable(fielda, fieldb, fieldc)
VALUES("2022-01-01", 97, "hello")
AS NEW(newfielda, newfieldb, newfieldc)
ON DUPLICATE KEY UPDATE
fielda=newfielda,
fieldb=newfieldb,
fieldc=newfieldc;
SIDENOTE: If you want a conditional in the ON DUPLICATE KEY UPDATE part, there is a twist in MySQL. Assignments are evaluated left to right, so if you update fielda first and then refer to it inside the IF for fieldb, fieldb sees the already-updated value. Move the fielda assignment to the end, or adjust the comparison. Say fielda is a date, as in the example, and you want to update only if the new date is later than the stored one:
INSERT INTO mytable(fielda, fieldb)
VALUES("2022-01-01", 97)
AS NEW(newfielda, newfieldb)
ON DUPLICATE KEY UPDATE
fielda=IF(fielda<STR_TO_DATE(newfielda,'%Y-%m-%d %H:%i:%s'),newfielda,fielda),
fieldb=IF(fielda<STR_TO_DATE(newfielda,'%Y-%m-%d %H:%i:%s'),newfieldb,fieldb);
In this case fieldb would never be updated: by the time its IF runs, fielda already holds the new value, so the < comparison is false. Either move the assignment of fielda below it, or compare with <= or = instead:
INSERT INTO mytable(fielda, fieldb)
VALUES("2022-01-01", 97)
AS NEW(newfielda, newfieldb)
ON DUPLICATE KEY UPDATE
fielda=IF(fielda<STR_TO_DATE(newfielda,'%Y-%m-%d %H:%i:%s'),newfielda,fielda),
fieldb=IF(fielda=STR_TO_DATE(newfielda,'%Y-%m-%d %H:%i:%s'),newfieldb,fieldb);
This works as expected with =, since fielda has already been updated to its new value before the IF clause of fieldb is reached. Personally, I like <= the most in such a case, because it keeps working if you ever rearrange the statement.
You can use INSERT IGNORE for such a case; it will skip the row if it would create a duplicate record.
INSERT IGNORE
... ; -- without ON DUPLICATE KEY

What could cause duplicate ids on a auto increment primary key field (mysql)?

RESOLVED
From the developer: the problem was that a previous version of the code was still writing to the table which used manual ids instead of the auto increment. Note to self: always check for other possible locations where the table is written to.
We are getting duplicate keys in a table. They are not inserted at the same time (6 hours apart).
Table structure:
CREATE TABLE `table_1` (
`sales_id` int(10) unsigned NOT NULL auto_increment,
`sales_revisions_id` int(10) unsigned NOT NULL default '0',
`sales_name` varchar(50) default NULL,
`recycle_id` int(10) unsigned default NULL,
PRIMARY KEY (`sales_id`),
KEY `sales_revisions_id` (`sales_revisions_id`),
KEY `sales_id` (`sales_id`),
KEY `recycle_id` (`recycle_id`)
) ENGINE= MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=26759 ;
The insert:
insert into `table_1` ( `sales_name` ) VALUES ( "Blah Blah" )
We are running MySQL 5.0.20 with PHP5 and using mysql_insert_id() to retrieve the insert id immediately after the insert query.
I have had a few duplicate key errors suddenly appear in MySQL databases in the past, even though the primary key was defined and auto_increment. Each and every time it was because the table had become corrupted.
If it is corrupt, checking the table should expose the problem. You can do this by running:
CHECK TABLE tbl_name
If it comes back as corrupt in any way (it will usually say the size is bigger than it actually should be), then just run the following to repair it:
REPAIR TABLE tbl_name
Does the sales_id field have a primary (or unique) key? If not, then something else is probably making inserts or updates that is re-using existing numbers. And by "something else" I don't just mean code; it could be a human with access to the database doing it accidentally.
As the others said, with your example it's not possible.
It's unrelated to your question, but you don't have to make a separate KEY for the primary key column -- it's just adding an extra not-unique index to the table when you already have the unique (primary) key.
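If you want to tidy that up, something like the following should remove the redundant index. The index name sales_id is taken from the table definition in the question; check the SHOW INDEX output first to confirm it.

```sql
-- List the indexes: sales_id appears both as PRIMARY and as a
-- separate non-unique index named sales_id.
SHOW INDEX FROM table_1;

-- Drop only the redundant secondary index; the primary key stays.
ALTER TABLE table_1 DROP INDEX sales_id;
```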
We are getting duplicate keys in a table.
Do you mean you are getting errors as you try to insert, or do you mean you have some values stored in the column more than once?
Auto-increment only kicks in when you omit the column from your INSERT, or try to insert NULL or zero. Otherwise, you can specify a value in an INSERT statement, over-riding the auto-increment mechanism. For example:
INSERT INTO table_1 (sales_id) VALUES (26759);
If the value you specify already exists in the table, you'll get an error.
Please post the results of this query:
SELECT `sales_id`, COUNT(*) AS `num`
FROM `table_1`
GROUP BY `sales_id`
HAVING `num` > 1
ORDER BY `num` DESC
If you have a unique key on other fields, that could be the problem.
If you have reached the highest value for your auto_increment column MySQL will keep trying to re-insert it. For example, if sales_id was a tinyint column, you would get duplicate key errors after you reached id 127.
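A small way to reproduce the exhausted-counter case on a scratch database. The tiny_demo table is invented for the example, and the AUTO_INCREMENT=127 table option is only there to start the counter at the signed TINYINT maximum; the exact error text may vary with server version and storage engine.

```sql
-- Hypothetical table whose counter starts at the TINYINT maximum.
CREATE TABLE IF NOT EXISTS tiny_demo (
  id   TINYINT NOT NULL AUTO_INCREMENT,
  name VARCHAR(20) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=MyISAM AUTO_INCREMENT=127;

-- Takes the last available id, 127.
INSERT INTO tiny_demo (name) VALUES ('last free id');

-- No higher id exists, so this insert is expected to be rejected;
-- the answer above describes it as a duplicate key error.
INSERT INTO tiny_demo (name) VALUES ('no id left');
```

The sales_id column in the question is an unsigned INT, so this particular cause is unlikely there, but it is cheap to rule out by comparing MAX(sales_id) with the column's range.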