duplicate entry for key primary errors while replicating a table - mysql

I have some CRM data in an MS SQL Server database that I must move to MySQL daily. I've got some python-pandas read_sql() and to_sql() scripts that move the tables. I'm running into duplicate primary key errors after doing some upsert logic. I have the GUID from CRM as the primary key for the table; in MySQL it is a varchar(64) datatype. I'm unsure what's triggering the duplicate warning.
mysql_table:
GUID-PK Name favorite_number modifiedon
00000B9D... Ben 10 '2017-01-01'
000A82A5... Harry 9 '2017-05-15'
000A9896... Fred 5 '2017-12-19'
(The GUIDs are longer; I'm shortening them for the example.)
I pull all the new records from MS SQL into a temporary table in MySQL based on modified dates that are greater than those in my current table. Some of these could be new records; some could be records that already exist in my current table but have been updated.
mysql_temp_table:
GUID-PK Name favorite_number modifiedon
00000B9D... Ben 15 '2018-01-01'
000A82BB... John 3 '2018-03-15'
000A4455... Ray 13 '2018-04-01'
I want to replace any modified records, straight up, so I delete all the common records from the mysql_table. In this example, I want to remove Ben from the mysql_table, so that it can be replaced by Ben from the mysql_temp_table:
DELETE FROM mysql_table WHERE GUID-PK IN (SELECT GUID-PK FROM mysql_temp_table)
Then I want to just move the whole temp table into the replicated table with:
INSERT INTO mysql_table (SELECT * FROM temp_table)
But that gives me an error:
"Duplicate entry '0' for key 'PRIMARY'") [SQL: 'INSERT INTO mysql_table SELECT * FROM mysql_temp_table'
I can see that many of the GUIDs start with '000'; it seems like this is being interpreted as '0'. Shouldn't this be caught by the DELETE ... IN statement above? I'm stuck on where to go next. Thanks in advance.

I suspect that the DELETE statement operation is failing with an error.
That's because the dash character isn't a valid character in an identifier. If the column name is really GUID-PK, then that needs to be properly escaped in the SQL text, either by enclosing it in backticks (the normal pattern in MySQL), or if sql_mode includes ANSI_QUOTES, then the identifiers can be enclosed in double quotes.
Another possibility is that temp_table does not have a PRIMARY or UNIQUE KEY constraint defined on the GUID-PK column, and there are multiple rows in temp_table that have the same value for GUID-PK, leading to a duplicate key exception on the INSERT into mysql_table.
Another guess (since we're not seeing the definition of the temp_table) is that the columns are in a different order, such that SELECT * FROM temp_table isn't returning columns in the order expected in mysql_table. I'd address that issue by explicitly listing the columns, of both the target table for the INSERT, and in the SELECT list.
Given that the GUID-PK column is a unique key, I would tend to avoid two separate statements (a DELETE followed by an INSERT), and just use a single INSERT ... ON DUPLICATE KEY UPDATE statement.
INSERT INTO mysql_table (`guid-pk`, `name`, `favorite_number`, `modifiedon` )
SELECT s.`guid-pk`, s.`name`, s.`favorite_number`, s.`modifiedon`
FROM temp_table s
ORDER
BY s.`guid-pk`
ON DUPLICATE KEY
UPDATE `name` = VALUES( `name` )
, `favorite_number` = VALUES( `favorite_number` )
, `modifiedon` = VALUES( `modifiedon` )
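Since the question drives MySQL from Python, here is a minimal sketch of how the statement above could be generated with every identifier backtick-quoted and every column listed explicitly. The helpers quote_ident and build_upsert are hypothetical names made up for this illustration; the table and column names are the ones used in this answer. The sketch only builds the SQL text, and executing it is left to whichever driver you already use.

```python
def quote_ident(name: str) -> str:
    """Backtick-quote a MySQL identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_upsert(target: str, source: str, key: str, columns: list[str]) -> str:
    """Build INSERT ... SELECT ... ON DUPLICATE KEY UPDATE with explicit columns.

    Hypothetical helper: `key` is the unique key column, `columns` are the
    remaining columns that should be overwritten when the key already exists.
    """
    all_cols = [key] + columns
    col_list = ", ".join(quote_ident(c) for c in all_cols)
    select_list = ", ".join("s." + quote_ident(c) for c in all_cols)
    updates = ", ".join(
        f"{quote_ident(c)} = VALUES({quote_ident(c)})" for c in columns
    )
    return (
        f"INSERT INTO {quote_ident(target)} ({col_list}) "
        f"SELECT {select_list} FROM {quote_ident(source)} s "
        f"ON DUPLICATE KEY UPDATE {updates}"
    )


sql = build_upsert(
    "mysql_table", "temp_table", "guid-pk",
    ["name", "favorite_number", "modifiedon"],
)
print(sql)
```

Quoting in one place means a column name containing a dash, such as guid-pk, cannot end up unescaped in the generated text.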

You may have AUTOCOMMIT disabled.
If you are performing both actions in the same TRANSACTION and do not have AUTOCOMMIT enabled your second READ COMMITTED statement will fail. INSERTS, UPDATES, and DELETES are executed using the READ COMMITTED Isolation Level
Your INSERT is being performed on the data set as it appeared before your DELETE. You need to either:
A. Explicitly COMMIT your DELETE within the TRANSACTION
or
B. Split the two statements into individual TRANSACTIONs
or
C. Re-enable AUTOCOMMIT
If this is not the case you will need to investigate your data sets for your DELETE and INSERT statements, because a DELETE will not just fail silently.

Related

How to update a row and insert one if it doesn't exist, without wrongly raising auto_increment [duplicate]

I have a table structure like this
When I insert a row into the table I'm using this query:
INSERT INTO table_blah ( material_item, ... hidden ) VALUES ( data, ... data ) ON DUPLICATE KEY UPDATE id = id, material_item = data, ... hidden = data;
When I first insert data without triggering ON DUPLICATE KEY, the id increments fine:
But when ON DUPLICATE KEY triggers and I then insert a new row, the id looks odd to me:
How can I keep the auto-increment incrementing properly even when ON DUPLICATE KEY triggers?
This behavior is documented (paragraph in parentheses):
If you specify ON DUPLICATE KEY UPDATE, and a row is inserted that
would cause a duplicate value in a UNIQUE index or PRIMARY KEY, MySQL
performs an UPDATE of the old row. For example, if column a is
declared as UNIQUE and contains the value 1, the following two
statements have similar effect:
INSERT INTO table (a,b,c) VALUES (1,2,3) ON DUPLICATE KEY UPDATE c=c+1;
UPDATE table SET c=c+1 WHERE a=1;
(The effects are not identical for
an InnoDB table where a is an auto-increment column. With an
auto-increment column, an INSERT statement increases the
auto-increment value but UPDATE does not.)
Here is a simple explanation. MySQL attempts the insert first. This is when the id gets auto-incremented. Once incremented, it stays. Then the duplicate is detected and the update happens instead, so the incremented value is never used and leaves a gap.
You should not depend on auto_increment having no gaps. If that is a requirement, the overhead on the updates and inserts is much larger. Essentially, you need to put a lock on the entire table, and renumber everything that needs to be renumbered, typically using a trigger. A better solution is to calculate incremental values on output.
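As a rough illustration of calculating incremental values on output, the sketch below numbers rows at read time instead of relying on the stored id. The sample rows and the with_row_numbers helper are made up for this example; only the idea of numbering on output comes from the answer.

```python
# Rows as they might come back from the table; the ids have a gap (3..6 missing).
rows = [
    {"id": 1, "material_item": "steel"},
    {"id": 2, "material_item": "copper"},
    {"id": 7, "material_item": "brass"},
]


def with_row_numbers(rows):
    """Return copies of the rows, ordered by id, with a gap-free row_number added."""
    ordered = sorted(rows, key=lambda r: r["id"])
    return [{**r, "row_number": n} for n, r in enumerate(ordered, start=1)]


numbered = with_row_numbers(rows)
for r in numbered:
    print(r["row_number"], r["id"], r["material_item"])
```

The stored id stays a stable identifier, while the displayed sequence is always contiguous.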
This question is fairly old, but I'll answer it in case it helps someone. To solve the auto-increment problem, run the following code before the INSERT ... ON DUPLICATE KEY UPDATE part and execute everything together:
SET @NEW_AI = (SELECT MAX(`the_id`)+1 FROM `table_blah`);
SET @ALTER_SQL = CONCAT('ALTER TABLE `table_blah` AUTO_INCREMENT =', @NEW_AI);
PREPARE NEWSQL FROM @ALTER_SQL;
EXECUTE NEWSQL;
Put together and run as one batch, it should look something like this:
SET @NEW_AI = (SELECT MAX(`the_id`)+1 FROM `table_blah`);
SET @ALTER_SQL = CONCAT('ALTER TABLE `table_blah` AUTO_INCREMENT =', @NEW_AI);
PREPARE NEWSQL FROM @ALTER_SQL;
EXECUTE NEWSQL;
INSERT INTO `table_blah` (`the_col`) VALUES("the_value")
ON DUPLICATE KEY UPDATE `the_col` = "the_value";
I had the same frustration with gaps in the auto-increment, but I found a way to avoid it.
Regarding the previously discussed "overheads": when I first wrote my DB query code, it ran so many separate queries that it took 5 hours. Once I added
"ON DUPLICATE KEY UPDATE"
it got down to about 50 seconds. Amazing! Anyway, I solved the gap problem by using 2 queries, which doubles the time to about 2 minutes; that is still fine.
First I ran an SQL query that writes all the data (updates and inserts), but with "IGNORE" included, so it bypasses the updates and only inserts the new rows. Assuming your auto_increment previously had no gaps, it will still have none, because only new records are added. I believe it is the updates that cause the gaps. So for inserts:
"INSERT IGNORE INTO mytablename(stuff,stuff2) VALUES "
Next I ran the "ON DUPLICATE KEY UPDATE" variation of that SQL query. It keeps the IDs intact because all the records being updated already have IDs. The only thing it breaks is the auto_increment value, which gets incremented when a new record is added (or updated). So the solution is to patch this auto_increment value back to what it was before, once you have applied the updates.
To patch the auto-increment value, use this SQL in your PHP:
"ALTER TABLE mytablename AUTO_INCREMENT = " . ($TableCount + 1);
This works because the updates do not increase the number of records. Therefore we can use the table count to know what the next ID should be: set $TableCount to the table's row count, add 1, and that is the next auto-increment number.
This is cheap and dirty but it seems to work. It could be unsafe to use while something else is writing to the DB, though.
Changing the database engine from InnoDB to MyISAM will resolve your issue.
I often deal with this by creating a temporary table, recording in the temporary table whether the record is new or not, doing an UPDATE only on the rows that are not new, and doing an INSERT with the new rows. Here's a complete example:
## THE SETUP
# This is the table we're trying to insert into
DROP TABLE IF EXISTS items;
CREATE TABLE items (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100) UNIQUE,
price INT
);
# Put a few rows into the table
INSERT INTO items (name, price) VALUES
("Bike", 200),
("Basketball", 10),
("Fishing rod", 25)
;
## THE INSERT/UPDATE
# Create a temporary table to help with the update
DROP TEMPORARY TABLE IF EXISTS itemUpdates;
CREATE TEMPORARY TABLE itemUpdates (
name VARCHAR(100) UNIQUE,
price INT,
isNew BOOLEAN DEFAULT(true)
);
# Change the price of the Bike and Basketball and add a new Tent item
INSERT INTO itemUpdates (name, price) VALUES
("Bike", 150),
("Basketball", 8),
("Tent", 100)
;
# For items that already exist, set isNew false
UPDATE itemUpdates
JOIN items
ON items.name = itemUpdates.name
SET isNew = false;
# UPDATE the already-existing items
UPDATE items
JOIN itemUpdates
ON items.name = itemUpdates.name
SET items.price = itemUpdates.price
WHERE itemUpdates.isNew = false;
# INSERT the new items
INSERT INTO items (name, price)
SELECT name, price
FROM itemUpdates
WHERE itemUpdates.isNew = true;
# Check the results
SELECT * FROM items;
# Results:
# ID | Name | Price
# 1 | Bike | 150
# 2 | Basketball | 8
# 3 | Fishing rod | 25
# 4 | Tent | 100
The INSERT IGNORE INTO approach is simpler, but it ignores any error, which isn't what I want. And I agree that this is strange behavior on the part of MySQL but it's what we've got to work with.
I just thought I'd add this, as I was trying to find an answer to my own problem.
I could not stop the duplicate warning and found it was because the id column was a TINYINT, whose maximum (signed) value is 127; changing it to SMALLINT, MEDIUMINT, or BIGINT allows for many more rows.
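To make the 127 limit concrete, here is a small sketch that derives the largest value each MySQL integer type can hold from its storage size. The byte sizes are MySQL's documented storage sizes; the max_value helper is a name invented for this illustration.

```python
# Storage size in bytes of MySQL's integer types.
MYSQL_INT_BYTES = {"TINYINT": 1, "SMALLINT": 2, "MEDIUMINT": 3, "INT": 4, "BIGINT": 8}


def max_value(type_name, unsigned=False):
    """Largest value the given integer type can store."""
    bits = 8 * MYSQL_INT_BYTES[type_name]
    # A signed type spends one bit on the sign.
    return 2 ** bits - 1 if unsigned else 2 ** (bits - 1) - 1


for name in MYSQL_INT_BYTES:
    print(name, max_value(name), max_value(name, unsigned=True))
```

Once an AUTO_INCREMENT column reaches its maximum, further inserts keep trying to reuse that same value, which surfaces as a duplicate-key error.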
I don't think this is a problem with MySQL 5.6. See this example.
ON DUPLICATE KEY UPDATE id=LAST_INSERT_ID(id)
This is less of a direct answer and more of a fix for the end result.
If you don't use your auto-increment as an identification field within your application (and you really shouldn't; a UUID or something of that nature is better practice), and, of course, if you don't have billions of rows, you can reset your auto-increment field fairly easily.
SET SQL_SAFE_UPDATES = 0;
SET @num := 0;
UPDATE my_table SET id = @num := (@num+1);
ALTER TABLE my_table AUTO_INCREMENT =1;
I kinda hate that this is a thing when doing an INSERT UPDATE in MySQL.
This is not my code. I got it somewhere on SO, but it was so long ago...
As an additional note, this is not really an answer to this issue. It's more to help fix an out-of-control auto-increment field.
INSERT INTO table_blah ( material_item, ... hidden ) VALUES ( data, ... data ) ON DUPLICATE KEY UPDATE material_item = data, ... hidden = data
Yes, remove the id = id, as it will automatically add WHERE PRIMARY KEY = PRIMARY KEY...

MySQL: Insert multiple values if they don't exist, but need a multiple column check

I have a simple query like so:
INSERT INTO myTable (col1, col2) VALUES
(1,2),
(1,3),
(2,2)
I need to check that no duplicate values are added, BUT the check needs to happen across both columns: if a value pair already exists in col1 AND col2 then I don't want to insert it. If the value exists in only one of those columns but not both, then the insert should go through.
In other words let's say we have the following table:
+------+------+
| col1 | col2 |
+------+------+
|    1 |    2 |
|    1 |    3 |
|    2 |    2 |
+------+------+
Inserting values like (2,3) and (1,1) would be allowed, but (1,3) would not be allowed.
Is it possible to do a WHERE NOT EXISTS check a single time? I may need to insert 1000 values at one time and I'm not sure whether doing a WHERE check on every single insert row would be efficient.
EDIT:
To add to the question - if there's a duplicate value across both columns, I'd like the query to ignore this specific row and continue onto inserting other values rather than throwing an error.
What you might want to use is either a primary key or a unique index across those columns. Afterwards, you can use either replace into or just insert ignore:
create table myTable
(
a int,
b int,
primary key (a,b)
);
-- Variant 1
replace into myTable(a,b) values (1, 2);
-- Variant 2
insert ignore into myTable(a,b) values (1,2);
See Insert Ignore and Replace Into
Using the latter variant has the advantage that you don't change any record if it already exists (thus no need to rebuild any index) and would best match your needs regarding your question.
If, however, there are other columns that need to be updated when inserting a record violating a unique constraint, you can either use replace into or insert into ... on duplicate key update.
Replace into will perform a real deletion prior to inserting a new record, whereas insert into ... on duplicate key update will perform an update instead. Although one might think that the result will be same, so why is there a statement for both operations, the answer can be found in the side-effects:
Replace into will delete the old record before inserting the new one. This causes the index to be updated twice, delete and insert triggers get executed (if defined) and, most important, if you have a foreign key constraint (with on delete restrict or on delete cascade) defined, your constraint will behave exactly the same way as if you deleted the record manually and inserted the new version later on. This means: Either your operation fails because the restriction is in place or the delete operation gets cascaded to the target table (i.e. deleting related records there, although you just changed some column data).
On the other hand, when using on duplicate key update, update triggers will get fired, the indexes on changed columns will be rewritten once and, if a foreign key is defined on update cascade for one of the columns being changed, this operation is performed as well.
To answer your question in the comments, as stated in the manual:
If you use the IGNORE modifier, errors that occur while executing the INSERT statement are ignored. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted. With IGNORE, the row is discarded and no error occurs. Ignored errors may generate warnings instead, although duplicate-key errors do not.
So, all violations are treated as warnings rather than errors, causing the insert to complete. Otherwise, the insert would be applied partially (except when using transactions). Violations of duplicate key, however, do not even produce such a warning. Nonetheless, all records violating any constraint won't get inserted at all, but ignore will ensure all valid records get inserted (given that there is no system failure or out-of-memory condition).
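The effect of a composite key plus an ignoring insert can be tried without a MySQL server. The sketch below uses Python's built-in sqlite3 module purely as a stand-in: SQLite spells the statement INSERT OR IGNORE, whereas MySQL uses INSERT IGNORE as shown above, and other details of the two engines differ. The table and values are the ones from the question.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (col1 INTEGER, col2 INTEGER, PRIMARY KEY (col1, col2))"
)
# Existing rows from the question.
conn.executemany("INSERT INTO myTable VALUES (?, ?)", [(1, 2), (1, 3), (2, 2)])

# (2,3) and (1,1) are new pairs; (1,3) already exists and should be skipped.
new_rows = [(2, 3), (1, 1), (1, 3)]
conn.executemany(
    "INSERT OR IGNORE INTO myTable (col1, col2) VALUES (?, ?)", new_rows
)
conn.commit()

rows = conn.execute(
    "SELECT col1, col2 FROM myTable ORDER BY col1, col2"
).fetchall()
print(rows)
```

The duplicate pair is dropped silently while the other rows are inserted, which is the behaviour asked for in the edit to the question.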

MYSQL primary key combination of 2 unique and not the same values

I have a simple MYSQL table with following columns:
first | second
first and second are integers. The primary key for the table is
PRIMARY KEY (`first`,`second`)
So this allows only a unique combination of values like:
first | second
1 | 2
2 | 1
But this key also accepts the same value for both columns. For example:
first | second
1 | 1
Is there a way to force both values to be different using MYSQL. I can do a check with PHP before inserting into the database but I'm wondering if there is a way in MYSQL to achieve it?
This restriction can't be enforced by a PRIMARY KEY or UNIQUE constraint.
Unfortunately, MySQL versions before 8.0.16 parse CHECK constraints but do not enforce them; a CHECK constraint is what we would likely use in other databases.
To get MySQL to enforce a constraint like this, you would need to implement a BEFORE INSERT and a BEFORE UPDATE trigger.
The "trick" in the trigger body would be to detect this condition you want to restrict, e.g.
IF (NEW.first = NEW.second) THEN
And then have the trigger throw an error. More recent versions of MySQL provide the SIGNAL statement for raising an exception. In older versions of MySQL, you'd run a statement that throws an error (for example, performing a SELECT against a table name that is known not to exist).
FOLLOWUP
The IF statement is valid only within the context of a MySQL stored program (for example, a PROCEDURE, FUNCTION, or TRIGGER).
To get this kind of restriction applied by an INSERT statement itself, without a constraint or trigger, we'd need to use the INSERT ... SELECT form of an INSERT statement.
For example:
INSERT INTO `mytable` (`first`, `second`)
SELECT t.first, t.second
FROM ( SELECT '1' AS `first`, '1' AS `second`) t
WHERE t.first <> t.second
Since the SELECT statement returns no rows, no rows are inserted to the table.
Note that this approach applies the restriction only to this statement; it doesn't prevent some other session from performing an INSERT that doesn't enforce this restriction. To get this restriction enforced as a constraint "by the database", you'd need to implement the BEFORE INSERT and BEFORE UPDATE triggers I described earlier in the answer.

How to swap values of two rows in MySQL without violating unique constraint?

I have a "tasks" table with a priority column, which has a unique constraint.
I'm trying to swap the priority value of two rows, but I keep violating the constraint. I saw this statement somewhere in a similar situation, but it wasn't with MySQL.
UPDATE tasks
SET priority =
CASE
WHEN priority=2 THEN 3
WHEN priority=3 THEN 2
END
WHERE priority IN (2,3);
This will lead to the error:
Error Code: 1062. Duplicate entry '3' for key 'priority_UNIQUE'
Is it possible to accomplish this in MySQL without using bogus values and multiple queries?
EDIT:
Here's the table structure:
CREATE TABLE `tasks` (
`id` int(11) NOT NULL,
`name` varchar(200) DEFAULT NULL,
`priority` varchar(45) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `priority_UNIQUE` (`priority`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Is it possible to accomplish this in MySQL without using bogus values and multiple queries?
No. (none that I can think of).
The problem is how MySQL processes updates. MySQL, unlike other DBMSs that implement UPDATE properly, processes updates in a broken manner. It checks UNIQUE (and other) constraints after every single row update and not, as it should, after the whole UPDATE statement completes. That's why you don't have this issue with (most) other DBMSs.
For some updates (like increasing all or some ids, id=id+1), this can be solved by using - another non-standard feature - an ORDER BY in the update.
For swapping the values from two rows, that trick can't help. You'll have to use NULL or a bogus value (that doesn't exist but is allowed in your column) and 2 or 3 statements.
You could also temporarily remove the unique constraint but I don't think that's a good idea really.
So, if the unique column is a signed integer and there are no negative values, you can use 2 statements wrapped up in a transaction:
START TRANSACTION ;
UPDATE tasks
SET priority =
CASE
WHEN priority = 2 THEN -3
WHEN priority = 3 THEN -2
END
WHERE priority IN (2,3) ;
UPDATE tasks
SET priority = - priority
WHERE priority IN (-2,-3) ;
COMMIT ;
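The two-statement swap above can be exercised end to end with Python's built-in sqlite3 module. SQLite is only a stand-in here, not MySQL. The sketch also assumes an integer priority column, whereas the table in the question declares priority as varchar(45). The swap_priorities helper and the sample rows are invented for the illustration; the two UPDATE statements are the ones from this answer.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, name TEXT, priority INTEGER UNIQUE)"
)
conn.executemany(
    "INSERT INTO tasks (id, name, priority) VALUES (?, ?, ?)",
    [(1, "write", 2), (2, "review", 3), (3, "ship", 4)],
)
conn.commit()


def swap_priorities(conn, a, b):
    """Swap two positive priorities by passing through their negated values."""
    with conn:  # commits both updates together, rolls back if either fails
        # Step 1: move both rows to negative values that cannot collide.
        conn.execute(
            "UPDATE tasks SET priority = CASE "
            "WHEN priority = ? THEN ? WHEN priority = ? THEN ? END "
            "WHERE priority IN (?, ?)",
            (a, -b, b, -a, a, b),
        )
        # Step 2: flip the sign back, completing the swap.
        conn.execute(
            "UPDATE tasks SET priority = -priority WHERE priority IN (?, ?)",
            (-a, -b),
        )


swap_priorities(conn, 2, 3)
result = conn.execute("SELECT id, priority FROM tasks ORDER BY id").fetchall()
print(result)
```

Because no row ever holds a value that another row currently has, the unique constraint is satisfied after every single row update.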
I bumped into the same issue. Had tried every possible single-statement query using CASE WHEN and TRANSACTION - no luck whatsoever. I came up with three alternative solutions. You need to decide which one makes more sense for your situation.
In my case, I'm processing a reorganized collection (array) of small objects returned from the front-end, new order is unpredictable (this is not a swap-two-items deal), and, on top of everything, change of order (usually made in English version) must propagate to 15 other languages.
1st method: Completely DELETE existing records and repopulate entire collection using the new data. Obviously this can work only if you're receiving from the front-end everything that you need to restore what you just deleted.
2nd method: This solution is similar to using bogus values. In my situation, my reordered collection also includes each item's original position before it moved. Also, I had to preserve the original index value in some way while the UPDATEs are running. The trick was to manipulate bit 15 of the index column, which is UNSIGNED SMALLINT in my case. If you have a (signed) INT/SMALLINT data type you can just negate the value of the index instead of using bitwise operations.
The first UPDATE must run only once per call. This query sets bit 15 (value 32768) of the current index fields (I have an unsigned smallint). The lower 15 bits still reflect the original index value, which is never going to come close to the 32K range.
UPDATE *table* SET `index`=(`index` | 32768) WHERE *condition*;
Then iterate your collection extracting original and new index values, and UPDATE each record individually.
foreach( ... ) {
UPDATE *table* SET `index`=$newIndex WHERE *same_condition* AND `index`=($originalIndex | 32768);
}
This last UPDATE must also run only once per call. This query clears bit 15 of the index fields, effectively restoring the original index value for records where it hasn't changed, if any.
UPDATE *table* SET `index`=(`index` & 32767) WHERE *same_condition* AND `index` > 32767;
Third method would be to move relevant records into temporary table that doesn't have a primary key, UPDATE all indexes, then move all records back to first table.
Bogus value option:
Okay, so my query is similar and I've found a way to update in "one" query. My id column is PRIMARY and position is part of a UNIQUE group. This is my original query that doesn't work for swapping:
INSERT INTO `table` (`id`, `position`)
VALUES (1, 2), (2, 1)
ON DUPLICATE KEY UPDATE `position` = VALUES(`position`);
.. but position is an unsigned integer and it's never 0, so I changed the query to the following:
INSERT INTO `table` (`id`, `position`)
VALUES (2, 0), (1, 2), (2, 1)
ON DUPLICATE KEY UPDATE `position` = VALUES(`position`);
.. and now it works! Apparently, MySQL processes the value groups in order.
Perhaps this would work for you (not tested and I know almost nothing about MYSQL):
UPDATE tasks
SET priority =
CASE
WHEN priority=3 THEN 0
WHEN priority=2 THEN 3
WHEN priority=0 THEN 2
END
WHERE priority IN (2,3,0);
Good luck.
I had a similar problem.
I wanted to swap 2 ids that were unique AND were a FK from another table.
The fastest solution for me to swap two unique entries was:
Create a ghost entry in my FK table.
Go back to my table where I want to switch the ids.
Turn off the FK check: SET FOREIGN_KEY_CHECKS=0;
Set my first (A) id to the ghost (X) FK (frees A).
Set my second (B) id to A (frees B).
Set the ghost (X) id to B (frees X).
Delete the ghost record and turn checks back on: SET FOREIGN_KEY_CHECKS=1;
Not sure if this would violate the constraints, but I have been trying to do something similar and eventually came up with this query by combining a few of the answers I found:
UPDATE tasks as T1,tasks as T2 SET T1.priority=T2.priority,T2.priority=T1.priority WHERE (T1.task_id,T2.task_id)=($T1_id, $T2_id)
The column I was swapping did not use a unique, so I am unsure if this will help...
You can swap your values with the update statement you mentioned above, with a slight change to your key indexes.
CREATE TABLE `tasks` ( `id` int(11) NOT NULL, `name` varchar(200) DEFAULT NULL, `priority` varchar(45) DEFAULT NULL, PRIMARY KEY (`id`,`priority`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
This will have a primary key index as a combination of id and priority. You can then swap values.
UPDATE tasks
SET priority =
CASE
WHEN priority=2 THEN 3
WHEN priority=3 THEN 2
END
WHERE priority IN (2,3);
I don't see any need for user variables or temp variables here.
Hope this solves your issue :)

Mysql losing 11 records on insert

I download an XML file containing 1048 records, and then I successfully create a table($today) in my DB, and load the XML data into the MySQL table.
I then run a second script which contains this query:
INSERT INTO
t1
(
modelNumber,
salePrice
)
SELECT modelNumber,salePrice
FROM `'.$today.'`
ON DUPLICATE KEY UPDATE t1.modelNumber=`'.$today.'`.modelNumber,
t1.salePrice=`'.$today.'`.salePrice
");
It works, but I'm losing 11 records. The total count is 1037, while the $today table has the exact amount of records contained in the XML file (1048).
How can I correct this problem?
Run some queries on the $today table to find your 11 duplicates.
The ON DUPLICATE KEY clause will suppress these 11 records.
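If it is easier to check the parsed XML records before loading them, a sketch like the following counts how many records share a key. It assumes that modelNumber is the unique key on t1, which the question implies but does not state. The sample records and the duplicate_keys helper are made up for the illustration.

```python
from collections import Counter


def duplicate_keys(records, key="modelNumber"):
    """Return {key value: occurrence count} for values that appear more than once."""
    counts = Counter(r[key] for r in records)
    return {k: n for k, n in counts.items() if n > 1}


# Hypothetical records standing in for the parsed XML file.
records = [
    {"modelNumber": "A100", "salePrice": 19.99},
    {"modelNumber": "B200", "salePrice": 5.00},
    {"modelNumber": "A100", "salePrice": 17.49},
    {"modelNumber": "C300", "salePrice": 42.00},
    {"modelNumber": "A100", "salePrice": 18.00},
]

dupes = duplicate_keys(records)
# Every occurrence after the first becomes an update instead of a new row.
lost = sum(n - 1 for n in dupes.values())
print(dupes, lost, len(records) - lost)
```

If t1 starts out empty, the number of "lost" rows computed this way should match the difference between the file's record count and the final row count.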
If there is a duplicate key in your file, you update the old row.
ON DUPLICATE KEY UPDATE
means that if the insert fails because of a duplicate key, the update specified after that clause is performed instead.
There are probably 11 entries that have duplicate keys, and they update rather than insert. I would change it to this (a bit of a hack, but the quickest way I can think of, without any more info, to find the culprits):
INSERT INTO
t1
(
modelNumber,
salePrice
)
SELECT modelNumber,salePrice
FROM `'.$today.'`
ON DUPLICATE KEY UPDATE t1.modelNumber=`'.$today.'`.modelNumber,
t1.salePrice= '999999999'
");
Then you can look for entries with that salePrice of 999999999, and you at least know which duplicate keys (if any) you need to look for in your XML.