I'm moving an MS SQL database to MySQL. In my MS SQL database I have stored procedures that use an "upsert" (update; if @@rowcount = 0, insert) type of thing.
I want to do the same kind of thing in MySQL. I have found a couple of options:
1) INSERT xxxx ON DUPLICATE KEY UPDATE x = ..., y = ..., etc.
2) REPLACE INTO table_name (col_name, ...) VALUES (value, ...)
Which one is more efficient? I'm leaning towards the 2nd one since I will be doing a lot of updating, rather than inserting. Also, I believe that the insert on duplicate key will keep bumping the auto_increment values even when it ends up being an update.
Another note: Each account record will be updated EVERY night. Occasionally a new account record will be inserted, but again, primarily the accounts will be updated.
Are there other/better options that I'm overlooking? Am I on the right track?
MySQL has several facilities for this:
REPLACE has the effect of a DELETE if the row exists, followed by an INSERT. This means it cannot perform partial updates on the data: any fields that are not specified revert to their defaults.
ON DUPLICATE KEY UPDATE is an option on an INSERT that can handle key collisions, including those on a PRIMARY KEY. If a duplicate is found, the UPDATE statement you specify is executed instead.
For example:
INSERT INTO people (id, name) VALUES (1, 'Jeremy')
ON DUPLICATE KEY UPDATE name=VALUES(name)
You can use VALUES() to refer to the value given in the INSERT part without having to repeat it.
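As a side note, newer MySQL releases (8.0.19 and later, as far as I know) also accept a row alias in place of VALUES(), and the manual marks VALUES() in this position as deprecated from 8.0.20. A minimal sketch against the same people table; the alias name "new" is arbitrary:

```sql
-- Requires MySQL 8.0.19 or later; "new" is an arbitrary alias for the row being inserted.
INSERT INTO people (id, name) VALUES (1, 'Jeremy') AS new
ON DUPLICATE KEY UPDATE name = new.name;
```

On older servers only the VALUES(name) form shown above is available.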
It's important to remember that NULL values can be duplicated in a UNIQUE index, since NULL is never considered equal to anything, not even another NULL: NULL = NULL evaluates to NULL rather than true. Only non-NULL values are enforced unique; declare the column NOT NULL to avoid this.
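The NULL caveat can be sketched as follows. The contacts table and its index name are made up for illustration:

```sql
-- Hypothetical table: email is UNIQUE but nullable.
CREATE TABLE contacts (
  id    INT AUTO_INCREMENT PRIMARY KEY,
  email VARCHAR(100) NULL,
  UNIQUE KEY uq_contacts_email (email)
);

-- Both rows are accepted: NULLs never collide in a UNIQUE index.
INSERT INTO contacts (email) VALUES (NULL), (NULL);

-- No duplicate is detected, so this inserts a third row instead of updating.
INSERT INTO contacts (email) VALUES (NULL)
ON DUPLICATE KEY UPDATE email = VALUES(email);

SELECT COUNT(*) FROM contacts;  -- 3
```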
Related
I have a simple query like so:
INSERT INTO myTable (col1, col2) VALUES
(1,2),
(1,3),
(2,2)
I need to check that no duplicate values are added, BUT the check needs to happen across both columns: if a row already exists with the same values in col1 AND col2, I don't want to insert. If the value matches in only one of those columns but not both, then the insert should go through.
In other words let's say we have the following table:
+------+------+
| col1 | col2 |
+------+------+
|    1 |    2 |
|    1 |    3 |
|    2 |    2 |
+------+------+
Inserting values like (2,3) and (1,1) would be allowed, but (1,3) would not be allowed.
Is it possible to do a WHERE NOT EXISTS check a single time? I may need to insert 1000 values at one time and I'm not sure whether doing a WHERE check on every single insert row would be efficient.
EDIT:
To add to the question - if there's a duplicate value across both columns, I'd like the query to ignore this specific row and continue onto inserting other values rather than throwing an error.
What you might want to use is either a primary key or a unique index across those columns. Afterwards, you can use either replace into or just insert ignore:
create table myTable
(
a int,
b int,
primary key (a,b)
);
-- Variant 1
replace into myTable(a,b) values (1, 2);
-- Variant 2
insert ignore into myTable(a,b) values (1,2);
See Insert Ignore and Replace Into
Using the latter variant has the advantage that you don't change any record if it already exists (thus no need to rebuild any index) and would best match your needs regarding your question.
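Applied to the multi-row case from the question, a sketch could look like this. It assumes myTable(a, b) with the composite primary key created above and already holding the rows (1,2), (1,3) and (2,2):

```sql
-- One statement, many rows; duplicates of an existing (a, b) pair are skipped.
INSERT IGNORE INTO myTable (a, b) VALUES
  (2, 3),   -- new pair: inserted
  (1, 1),   -- new pair: inserted
  (1, 3);   -- existing pair: skipped without aborting the statement

-- Number of rows actually inserted by the statement above.
SELECT ROW_COUNT();
```

The uniqueness check is done by the key itself, so no per-row WHERE NOT EXISTS is needed.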
If, however, there are other columns that need to be updated when inserting a record violating a unique constraint, you can either use replace into or insert into ... on duplicate key update.
Replace into performs a real deletion prior to inserting the new record, whereas insert into ... on duplicate key update performs an update instead. The end result may look the same, so why are there two statements? The answer lies in the side effects:
Replace into will delete the old record before inserting the new one. This causes the index to be updated twice, delete and insert triggers get executed (if defined) and, most important, if you have a foreign key constraint (with on delete restrict or on delete cascade) defined, your constraint will behave exactly the same way as if you deleted the record manually and inserted the new version later on. This means: Either your operation fails because the restriction is in place or the delete operation gets cascaded to the target table (i.e. deleting related records there, although you just changed some column data).
On the other hand, when using on duplicate key update, update triggers will get fired, the indexes on changed columns will be rewritten once and, if a foreign key is defined on update cascade for one of the columns being changed, this operation is performed as well.
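A small sketch of the foreign key side effect described above. The customers and orders tables are hypothetical, and InnoDB is assumed since it enforces foreign keys:

```sql
-- Hypothetical parent/child tables.
CREATE TABLE customers (
  id   INT PRIMARY KEY,
  name VARCHAR(50) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE orders (
  id          INT PRIMARY KEY,
  customer_id INT NOT NULL,
  FOREIGN KEY (customer_id) REFERENCES customers (id) ON DELETE CASCADE
) ENGINE=InnoDB;

INSERT INTO customers (id, name) VALUES (1, 'Ann');
INSERT INTO orders (id, customer_id) VALUES (100, 1);

-- Updates the customer in place: order 100 is left alone.
INSERT INTO customers (id, name) VALUES (1, 'Ann Lee')
ON DUPLICATE KEY UPDATE name = VALUES(name);

-- Deletes customer 1 before re-inserting it, so the cascade
-- is expected to remove order 100 as well.
REPLACE INTO customers (id, name) VALUES (1, 'Ann Lee');

SELECT COUNT(*) FROM orders;
```

With ON DELETE RESTRICT instead of CASCADE, the REPLACE would be expected to fail rather than delete the order.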
To answer your question in the comments, as stated in the manual:
If you use the IGNORE modifier, errors that occur while executing the INSERT statement are ignored. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted. With IGNORE, the row is discarded and no error occurs. Ignored errors may generate warnings instead, although duplicate-key errors do not.
So, all violations are treated as warnings rather than errors, causing the insert to complete. Otherwise, the insert would be applied partially (except when using transactions). Violations of duplicate key, however, do not even produce such a warning. Nonetheless, all records violating any constraint won't get inserted at all, but ignore will ensure all valid records get inserted (given that there is no system failure or out-of-memory condition).
I've already seen some questions regarding this, like the one below:
MySQL “good” way to insert a row if not found, or update it if it is found
Now I have a summary table which gets updated with the qty every time, say, a sale occurs. So out of 1000 sales of an item, the insert executes only the first time and the rest of the time it would be an update. My understanding is that INSERT ... ON DUPLICATE KEY UPDATE tries to insert first and updates if that fails, so all 999 times the insert is not successful.
1) Is there a method to try the update first and, if nothing was updated, insert, in a single statement?
2) Which of the methods below would be preferable, considering that in most cases the update will be successful?
a) Using INSERT ... ON DUPLICATE KEY UPDATE
b) Call UPDATE; if no rows are affected, call INSERT
Right now I am using the second option (b). Performance is very important here, and I'm also testing the first option. I'll post the results here once done.
INSERT ON DUPLICATE KEY UPDATE is the way to go. It does not perform a full insert first, as you put it.
For this statement to work there has to be a primary key or unique key on the table and the corresponding columns have to be involved in the statement.
Before it's decided whether an insert or an update statement has to be done, the said keys are checked. This is usually really fast.
With your "update first" approach you gain nothing; in fact, it gets even worse. The primary key lookup has to be done anyway. In the worst case you waste time by looking up the primary key twice: first for the update statement (which may turn out not to apply), then for the insert statement.
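For the summary-table case in the question, a minimal sketch could look like this. The sales_summary table and its columns are assumptions, not taken from the question:

```sql
-- Hypothetical summary table keyed by item.
CREATE TABLE sales_summary (
  item_id INT PRIMARY KEY,
  qty     INT NOT NULL
);

-- The first sale of item 42 inserts the row; every later sale adds to qty.
INSERT INTO sales_summary (item_id, qty) VALUES (42, 3)
ON DUPLICATE KEY UPDATE qty = qty + VALUES(qty);
```

The same statement is issued for every sale, so the application never needs to know whether the row already exists.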
I'm trying to make 2 values unique, like if I have the values (5, 10) the same values can't be added again.
I'm currently selecting the values x and y from the table and checking whether they both exist together; if they don't exist, I insert them. In other words:
"Select * from location where x=? and y=?"
If no result is returned, it will continue to insert the values.
This is typically accomplished by creating a unique index on both columns combined (a multi-column index).
Then, MySQL will prevent you from inserting duplicates. You can go ahead and try to insert the record, and if you get a duplicate key error, you know it already exists.
Alternatively, you can use INSERT IGNORE, so that no error occurs if you try to insert a duplicate row. The row still won't be inserted, so you simply check ROW_COUNT() afterwards to see whether the insert was successful.
Using a unique index and catching the failure on the insert is more performant than selecting then trying to insert because in the case you do insert, MySQL only has to perform one search, rather than two.
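A sketch of that approach, assuming a location table with integer columns x and y as in the question; the index name is made up:

```sql
-- Enforce uniqueness of the (x, y) pair at the database level.
ALTER TABLE location ADD UNIQUE INDEX uq_location_xy (x, y);

-- Attempt the insert directly, without a prior SELECT.
INSERT IGNORE INTO location (x, y) VALUES (5, 10);

-- Must be read right after the INSERT on the same connection.
SELECT ROW_COUNT();  -- 1 if the pair was new, 0 if it already existed
```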
Kinda strange to put it into words that short, heh.
Anyway, what I want is basically to update an entry in a table if it does exist, otherwise to create a new one filling it with the same data.
I know that's easy, but I'm relatively new to MySQL in terms of how much I've used it :P
A lot of developers still execute a query to check if a field is present in a table and then execute an insert or update query according to the result of the first query.
Try using the ON DUPLICATE KEY syntax; this is a lot faster and better than executing 2 queries. More info can be found here
INSERT INTO table (a,b,c) VALUES (4,5,6)
ON DUPLICATE KEY UPDATE c=9;
If you want to keep the same value for c, you can do an update with the same value:
INSERT INTO table (a,b,c) VALUES (4,5,6)
ON DUPLICATE KEY UPDATE c=6;
the difference between 'replace' and 'on duplicate key':
replace: inserts, or deletes and inserts
on duplicate key: inserts or updates
if your table doesn't have a primary key or unique key, the replace doesn't make any sense.
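To illustrate the last point, here is a sketch with a hypothetical table that has neither a primary key nor a unique index:

```sql
-- Hypothetical table with no PRIMARY KEY and no UNIQUE index.
CREATE TABLE notes (
  id   INT,
  body VARCHAR(100)
);

REPLACE INTO notes (id, body) VALUES (1, 'first');
REPLACE INTO notes (id, body) VALUES (1, 'second');

-- Nothing could collide, so REPLACE behaved like a plain INSERT.
SELECT COUNT(*) FROM notes;  -- 2
```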
You can also use the VALUES function to avoid having to specify the actual values twice. E.g. instead of
INSERT INTO table (a,b,c) VALUES (4,5,6) ON DUPLICATE KEY UPDATE c=6;
you can use
INSERT INTO table (a,b,c) VALUES (4,5,6) ON DUPLICATE KEY UPDATE c=VALUES(c);
Where VALUES(c) will evaluate to the value specified previously (6).
Use 'REPLACE INTO':
REPLACE INTO table SET id = 42, foo = 'bar';
See more in the MySQL documentation
As the others have said, REPLACE is the way to go. Just be careful using it though, since it actually does a DELETE and INSERT on the table. This is fine most of the time, but if you have foreign keys with constraints like ON DELETE CASCADE, it can cause some big problems.
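The DELETE-then-INSERT behaviour also means that columns left out of the REPLACE are reset. A minimal sketch, using a made-up accounts table:

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE accounts (
  id      INT PRIMARY KEY,
  name    VARCHAR(50) NOT NULL,
  balance DECIMAL(10,2) NOT NULL DEFAULT 0.00
);

INSERT INTO accounts (id, name, balance) VALUES (1, 'Jeremy', 150.00);

-- balance is not listed, so the replacement row gets the column default.
REPLACE INTO accounts (id, name) VALUES (1, 'Jeremy B.');

SELECT id, name, balance FROM accounts WHERE id = 1;
-- balance is now 0.00, not 150.00
```

If the existing balance has to survive, INSERT ... ON DUPLICATE KEY UPDATE name = VALUES(name) would be the safer choice.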
Look up REPLACE in the MySQL manual.
REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted. See Section 12.2.5, “INSERT Syntax”.
REPLACE is a MySQL extension to the SQL standard. It either inserts, or deletes and inserts. For another MySQL extension to standard SQL, that either inserts or updates, see Section 12.2.5.3, “INSERT ... ON DUPLICATE KEY UPDATE Syntax”.
If you have the following INSERT query:
INSERT INTO table (id, field1, field2) VALUES (1, 23, 24)
This is the REPLACE query you should run:
REPLACE INTO table (id, field1, field2) VALUES (1, 23, 24)
I want to update a table value on MySQL 5, but if the key does not exist, create it.
The way I found to do it is by:
INSERT yyy ON DUPLICATE KEY UPDATE field;
The question is : is the format above less efficient than other ways to do it (As the insert will happen only once and update will happen very often)?
for example:
$result = UPDATE field;
if (num_rows_effected($result)==0) INSERT yyy
Furthermore: Is there a better way to do this in Mysql: for example a kind of:
UPDATE value IF NO SUCH ROW INSERT yyy;
Update: For those who suggested REPLACE, here is an extension to my question:
"Thanks! I need to increase a counter that is already in the table (if it exists). If not create a table row with value 1 for this column. How can I do update with this format (REPLACE)? "
There is a REPLACE also.
INSERT ON DUPLICATE KEY UPDATE will fire UPDATE triggers when it stumbles upon a duplicate key and, in the update case, won't violate FKs.
REPLACE will fire DELETE and INSERT triggers, and will violate FK's referencing the row being REPLACE'd.
If you don't have any triggers or FK's, then use INSERT ON DUPLICATE KEY UPDATE, it's most efficient.
You seem to be looking for this query:
INSERT
INTO table (key, counter)
VALUES (@key, 1)
ON DUPLICATE KEY UPDATE
counter = counter + 1
You cannot do this with REPLACE unless you have selected previous value of the counter before running the query.
P. S. REPLACE appeared in MySQL before ON DUPLICATE KEY UPDATE and is being kept only for compatibility. There is no performance increase from using it.
Yes, you can use the 'replace' syntax:
REPLACE INTO table1 (key, col1, col2) values (1, 'val1','val2');
This is a feature specific to MySQL and is not necessarily implemented in other databases.
As for efficiency, my guess is that a straight update will be faster, since MySQL essentially catches the duplicate key error and handles it accordingly. However, unless you are doing large amounts of insert/updates, the performance impact will be fairly small.
Look at the REPLACE command, it meets your requirements.