Related
I have a table in MySQL (50 million rows), and new data keeps being inserted periodically.
This table has the following structure:
CREATE TABLE values (
id double NOT NULL AUTO_INCREMENT,
channel_id int(11) NOT NULL,
val text NOT NULL,
date_time datetime NOT NULL,
PRIMARY KEY (id),
KEY channel_date_index (channel_id,date_time)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Two rows must never have duplicate channel_id and date_time, but if such insert occurs it is important to keep the newest value.
Is there a way to check for duplicates in real time before the insert, or should I keep inserting all data and run periodic duplicate checks in a separate cycle?
Realtime speed is important here, because 100 inserts occur per second.
To prevent future duplicates:
Change KEY channel_date_index (channel_id,date_time) to UNIQUE (channel_id,date_time)
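Change the INSERT to INSERT ... ON DUPLICATE KEY UPDATE ... so that val is overwritten with the newest value when that (channel_id, date_time) pair already exists.
To fix the existing table, you could do ALTER IGNORE TABLE ... ADD UNIQUE(...). However, that would not necessarily keep the newest value of each duplicate pair, and ALTER IGNORE was removed in MySQL 5.7.4.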
For minimum downtime (not maximum speed), use pt-online-schema-change.
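As a rough sketch of the steps above: it assumes the table from the question (quoted as `values` because VALUES is a reserved word) and assumes that a higher id means a newer row. The sample channel and value are made up.

```sql
-- 1. Remove existing duplicates, keeping the row with the highest id
--    (assumption: higher id = newer value). On 50 million rows this
--    can be slow, so try it on a copy first.
DELETE older
FROM `values` AS older
JOIN `values` AS newer
  ON  newer.channel_id = older.channel_id
  AND newer.date_time  = older.date_time
  AND newer.id         > older.id;

-- 2. Swap the plain index for a unique one (fails if duplicates remain).
ALTER TABLE `values`
  DROP INDEX channel_date_index,
  ADD UNIQUE KEY channel_date_index (channel_id, date_time);

-- 3. Insert, overwriting val when the pair already exists.
INSERT INTO `values` (channel_id, val, date_time)
VALUES (7, '21.4', '2024-01-01 12:00:00')
ON DUPLICATE KEY UPDATE val = VALUES(val);
```

Note that new duplicates can still arrive between steps 1 and 2 while inserts keep running, so step 1 may need to be repeated if step 2 fails.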
I want to write a program that adds a new item to a table. The item has a unique key, name, and it can be created by any one of 100 threads, so I need to make sure that it is inserted only once.
I have two ideas:
Use insert ignore
Fetch it from the database via SELECT, then insert it into the table if no row is returned.
Which option is better? Is there an even more superior idea?
Late to the party, but I'm pondering something similar.
I created the following table to track active users on a license per day:
CREATE TABLE `license_active_users` (
`license_active_user_id` int(11) NOT NULL AUTO_INCREMENT,
`license_id` int(11) NOT NULL,
`user_id` int(11) NOT NULL,
`date` date NOT NULL,
PRIMARY KEY (`license_active_user_id`),
UNIQUE KEY `license_id` (`license_id`,`user_id`,`date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
In other words, 1 primary key and 1 unique index across the remaining 3 columns.
I then inserted 1 million unique rows into the table.
Attempting to re-insert a subset (10,000 rows) of the same data yielded the following results:
INSERT IGNORE: 38 seconds
INSERT ... ON DUPLICATE KEY UPDATE: 40 seconds
if (!rowExists("SELECT ...")) INSERT: <2 seconds
If those 10,000 rows aren't already present in the table:
INSERT IGNORE: 34 seconds
INSERT ... ON DUPLICATE KEY UPDATE: 41 seconds
if (!rowExists("SELECT ...")) INSERT: 21 seconds
So the conclusion must be if (!rowExists("SELECT ...")) INSERT is fastest by far - at least for this particular table configuration.
The missing test is if (rowExists("SELECT ...")){ UPDATE } else { INSERT }, but I'll assume INSERT ... ON DUPLICATE KEY UPDATE is faster for this operation.
For your particular case, however, I would go with INSERT IGNORE because (as far as I'm aware) it's an atomic operation and that'll save you a lot of trouble when working with threads.
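A minimal sketch of that INSERT IGNORE approach, assuming a hypothetical items table (the table name, column sizes and sample value are not from the question):

```sql
-- Hypothetical table: the UNIQUE key on name is what lets
-- INSERT IGNORE skip a duplicate instead of raising error 1062.
CREATE TABLE IF NOT EXISTS items (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
  name VARCHAR(100) NOT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY uq_items_name (name)
) ENGINE=InnoDB;

-- Every thread can run this; only the first one actually inserts the row.
INSERT IGNORE INTO items (name) VALUES ('example-name');

-- Run on the same connection right after the INSERT:
-- 1 if this connection inserted the row, 0 if it already existed.
SELECT ROW_COUNT();
```

Keep in mind that IGNORE also downgrades some other errors (such as data truncation) to warnings, so it is worth checking warnings while testing.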
SELECT + INSERT -- two round trips to the server, hence slower.
INSERT IGNORE -- requires a PRIMARY or UNIQUE key to decide whether to toss the new INSERT. If this works for you, it is probably the best.
REPLACE -- is a DELETE + an INSERT. This is rarely the best.
INSERT ... ON DUPLICATE KEY UPDATE -- This lets you either INSERT (if the PRIMARY/UNIQUE key(s) are not found) or UPDATE. This is the one to use if you have things you need to update in existing rows.
"Burning ids" -- Only the "select+insert" avoids a potential problem: running out of AUTO_INCREMENT ids (I call it "burning ids"). All the other techniques will allocate the next id before deciding whether it is needed.
If you have several names to conditionally insert into a normalization, then a 2-query technique can batch them quite efficiently, and not burn ids: http://mysql.rjweb.org/doc.php/staging_table#normalization
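The 2-query idea could look roughly like this. The staging and names tables and their columns are hypothetical, chosen only to illustrate the pattern:

```sql
-- Hypothetical tables:
--   staging(name, name_id)  holds the incoming batch
--   names(id AUTO_INCREMENT, name UNIQUE)  is the normalization table

-- 1. Insert only the names that are not there yet, so the INSERT
--    never touches names that already exist.
INSERT INTO names (name)
SELECT DISTINCT s.name
FROM staging AS s
LEFT JOIN names AS n ON n.name = s.name
WHERE n.id IS NULL;

-- 2. Pull the ids (old and new) back into the staging rows.
UPDATE staging AS s
JOIN names AS n ON n.name = s.name
SET s.name_id = n.id;
```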
Best: SELECT + INSERT IGNORE.
The SELECT is used for the check, and it does not need to lock the table or a row in it.
Every INSERT needs a lock, so skipping unnecessary INSERTs avoids reducing performance under concurrent inserts.
I've searched around but didn't find if it's possible.
I've this MySQL query:
INSERT INTO table (id,a,b,c,d,e,f,g) VALUES (1,2,3,4,5,6,7,8)
Field id has a "unique index", so there can't be two rows with the same id. Now if the same id is already present in the database, I'd like to update that row. But do I really have to specify all these fields again, like:
INSERT INTO table (id,a,b,c,d,e,f,g) VALUES (1,2,3,4,5,6,7,8)
ON DUPLICATE KEY UPDATE a=2,b=3,c=4,d=5,e=6,f=7,g=8
Or:
INSERT INTO table (id,a,b,c,d,e,f,g) VALUES (1,2,3,4,5,6,7,8)
ON DUPLICATE KEY UPDATE a=VALUES(a),b=VALUES(b),c=VALUES(c),d=VALUES(d),e=VALUES(e),f=VALUES(f),g=VALUES(g)
I've specified everything already in the insert...
An extra note: I'd like to use the workaround to get the ID too!
id=LAST_INSERT_ID(id)
I hope somebody can tell me what the most efficient way is.
The UPDATE clause exists so that existing fields can be updated to new values. If your old values are the same as your new ones, why would you need to update them at all?
For example, if your columns a to g are already set to 2 to 8, there is no need to update them again.
Alternatively, you can use:
INSERT INTO table (id,a,b,c,d,e,f,g)
VALUES (1,2,3,4,5,6,7,8)
ON DUPLICATE KEY
UPDATE a=a, b=b, c=c, d=d, e=e, f=f, g=g;
How you read the id from LAST_INSERT_ID depends on the backend application or client library you are using.
In LuaSQL, for example, conn:getlastautoid() fetches the value.
There is a MySQL specific extension to SQL that may be what you want - REPLACE INTO
However, it does not work quite the same as ON DUPLICATE KEY UPDATE.
It deletes the old row that clashes with the new row and then inserts the new row. As long as nothing depends on the table's primary key that would be fine, but if any other table references that primary key, those references can be broken when the old row is deleted.
You also can't reference the values in the old row, so you can't do an equivalent of
INSERT INTO mytable (id, a, b, c) values ( 1, 2, 3, 4)
ON DUPLICATE KEY UPDATE
id=1, a=2, b=3, c=c + 1;
"I'd like to use the workaround to get the ID too!"
That should work: LAST_INSERT_ID() should have the correct value as long as your primary key is auto-incrementing.
However as I said, if you actually use that primary key in other tables, REPLACE INTO probably won't be acceptable to you, as it deletes the old row that clashed via the unique key.
As someone else suggested earlier, you can reduce some typing by doing:
INSERT INTO `tableName` (`a`,`b`,`c`) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE `a`=VALUES(`a`), `b`=VALUES(`b`), `c`=VALUES(`c`);
There is no other way, I have to specify everything twice. First for the insert, second in the update case.
Here is a solution to your problem:
I've tried to solve a problem like yours, and I suggest testing it with a simple case first.
Follow these steps to learn from a simple solution.
Step 1: Create a table schema using this SQL Query:
CREATE TABLE IF NOT EXISTS `user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`username` varchar(30) NOT NULL,
`password` varchar(32) NOT NULL,
`status` tinyint(1) DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `no_duplicate` (`username`,`password`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1;
Step 2: If the table was created without it, add a unique index on the two columns to prevent duplicate data (a plain INDEX would not reject duplicates):
ALTER TABLE `user` ADD UNIQUE INDEX no_duplicate (`username`, `password`);
Alternatively, create the two-column unique index from a GUI tool.
Step 3: Update if exist, insert if not using following queries:
INSERT INTO `user`(`username`, `password`) VALUES ('ersks','Nepal') ON DUPLICATE KEY UPDATE `username`='master',`password`='Nepal';
INSERT INTO `user`(`username`, `password`) VALUES ('master','Nepal') ON DUPLICATE KEY UPDATE `username`='ersks',`password`='Nepal';
Just in case you are able to utilize a scripting language to prepare your SQL queries, you could reuse field=value pairs by using SET instead of (a,b,c) VALUES(a,b,c).
An example with PHP:
// $a, $b and $c must already be escaped for SQL; raw interpolation is open to SQL injection.
$pairs = "a=$a,b='$b',c='$c'";
$query = "INSERT INTO $table SET $pairs ON DUPLICATE KEY UPDATE $pairs";
Example table:
CREATE TABLE IF NOT EXISTS `tester` (
`a` int(11) NOT NULL,
`b` varchar(50) NOT NULL,
`c` text NOT NULL,
UNIQUE KEY `a` (`a`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
I know it's late, but I hope this answer helps someone.
INSERT INTO t1 (a,b,c) VALUES (1,2,3),(4,5,6)
ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);
You can read more in these tutorials:
https://mariadb.com/kb/en/library/insert-on-duplicate-key-update/
http://www.mysqltutorial.org/mysql-insert-or-update-on-duplicate-key-update/
You may want to consider using REPLACE INTO syntax, but be warned, upon duplicate PRIMARY / UNIQUE key, it DELETES the row and INSERTS a new one.
You won't need to re-specify all the fields. However, you should consider the possible performance reduction (depends on your table design).
Caveats:
If you have AUTO_INCREMENT primary key, it will be given a new one
Indexes will probably need to be updated
With MySQL 8.0.19 and above you can use a row alias (see the MySQL documentation for INSERT ... ON DUPLICATE KEY UPDATE):
INSERT INTO mytable(fielda, fieldb, fieldc)
VALUES("2022-01-01", 97, "hello")
AS NEW(newfielda, newfieldb, newfieldc)
ON DUPLICATE KEY UPDATE
fielda=newfielda,
fieldb=newfieldb,
fieldc=newfieldc;
Side note: if you want a conditional in the ON DUPLICATE KEY UPDATE part, there is a twist in MySQL. Assignments are applied in order, so if you update fielda first and then refer to fielda inside the IF clause for fieldb, it will already hold the new value. Move the fielda assignment to the end or adjust the comparison. Say fielda is a date, as in the example, and you want to update only if the new date is newer than the stored one:
INSERT INTO mytable(fielda, fieldb)
VALUES("2022-01-01", 97)
AS NEW(newfielda, newfieldb)
ON DUPLICATE KEY UPDATE
fielda=IF(fielda<STR_TO_DATE(newfielda,'%Y-%m-%d %H:%i:%s'),newfielda,fielda),
fieldb=IF(fielda<STR_TO_DATE(newfielda,'%Y-%m-%d %H:%i:%s'),newfieldb,fieldb);
In this case fieldb would never be updated, because by the time its < comparison runs, fielda already holds the new value. You need to move the update of fielda below it, or compare with <= or = instead:
INSERT INTO mytable(fielda, fieldb)
VALUES("2022-01-01", 97)
AS NEW(newfielda, newfieldb)
ON DUPLICATE KEY UPDATE
fielda=IF(fielda<STR_TO_DATE(newfielda,'%Y-%m-%d %H:%i:%s'),newfielda,fielda),
fieldb=IF(fielda=STR_TO_DATE(newfielda,'%Y-%m-%d %H:%i:%s'),newfieldb,fieldb);
This works as expected with = because fielda has already been updated to its new value before the IF clause of fieldb is reached. Personally I prefer <= in such a case, since it keeps working if you ever rearrange the statement.
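The "move it to the end" variant could be sketched like this. It assumes MySQL 8.0.19 or later and a hypothetical unique key on an id column, which is not part of the original example:

```sql
-- Hypothetical: mytable has a unique key on id; fielda is a DATE, fieldb an INT.
INSERT INTO mytable (id, fielda, fieldb)
VALUES (1, '2022-01-01', 97) AS new (newid, newfielda, newfieldb)
ON DUPLICATE KEY UPDATE
  -- fieldb first: fielda still holds the stored (old) date here
  fieldb = IF(fielda < newfielda, newfieldb, fieldb),
  -- fielda last, so the comparison above is not affected by this assignment
  fielda = IF(fielda < newfielda, newfielda, fielda);
```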
You can use INSERT IGNORE for such a case; it skips the row if it would create a duplicate:
INSERT IGNORE
... ; -- without ON DUPLICATE KEY
I am using MySQL 5.1.56, MyISAM. My table looks like this:
CREATE TABLE IF NOT EXISTS `my_table` (
`number` int(11) NOT NULL,
`name` varchar(50) NOT NULL,
`money` int(11) NOT NULL,
PRIMARY KEY (`number`,`name`)
) ENGINE=MyISAM;
It contains these two rows:
INSERT INTO `my_table` (`number`, `name`, `money`) VALUES
(1, 'S. Name', 150), (2, 'Another Name', 284);
Now I am trying to insert another row:
INSERT INTO `my_table` (`number`, `name`, `money`) VALUES
(2, 'S. Name', 240);
And MySQL just won't insert it while telling me this:
#1062 - Duplicate entry '2-S. Name' for key 'PRIMARY'
I really don't understand it. The primary key is on the first two columns (both of them), so the row I am trying to insert HAS a unique primary key, doesn't it?
I tried to repair the table, I tried to optimize the table, all to no avail. Also please note that I cannot change from MyISAM to InnoDB.
Am I missing something or is this a bug of MySQL or MyISAM? Thanks.
To summarize and point out where I think is the problem (even though there shouldn't be):
Table has primary key on two columns. I am trying to insert a row with a new combination of values in these two columns, but value in column one is already in some row and value in column two is already in another row. But they are not anywhere combined, so I believe this is supposed to work and I am very confused to see that it doesn't.
Your code and schema are OK. You are probably running against an earlier version of the table.
http://sqlfiddle.com/#!2/9dc64/1/0
The table as posted has no other UNIQUE key, and the new row does not clash with its primary key, so that error should be impossible on that table.
Backup data from that table, drop it and re-create.
Maybe you ran that CREATE TABLE IF NOT EXISTS while an old version of the table already existed. In that case the table was not re-created, and there was no error because of IF NOT EXISTS.
You may run SQL like this to see current table structure:
DESCRIBE my_table;
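Two other standard statements give a fuller picture than DESCRIBE, in particular which keys really exist on the table:

```sql
SHOW CREATE TABLE my_table;   -- full definition, including keys, engine and collation
SHOW INDEX FROM my_table;     -- one row per indexed column, with Key_name and Non_unique
```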
Edit - added later:
Try to run this:
DROP TABLE `my_table`; -- make a backup first: this deletes the table
CREATE TABLE `my_table` (
`number` int(11) NOT NULL,
`name` varchar(50) NOT NULL,
`money` int(11) NOT NULL,
PRIMARY KEY (`number`,`name`),
UNIQUE (`number`, `name`) -- added unique on the 2 columns
) ENGINE=MyISAM;
I know this wasn't the problem in this case, but I had a similar issue of "Duplicate Entry" when creating a composite primary key:
ALTER TABLE table ADD PRIMARY KEY(fieldA,fieldB);
The error was something like:
#1062 Duplicate entry 'valueA-valueB' for key 'PRIMARY'
So I searched:
select * from table where fieldA='valueA' and fieldB='valueB'
And the output showed just 1 row, no duplicate!
After some time I found out that you get this error if you have NULL values in these fields. In the end, the error message was rather misleading.
I had a similar issue, but in my case it turned out that I used case insensitive collation - utf8_general_ci.
Thus, when I tried to insert two strings that were different in a case-sensitive comparison but the same in a case-insensitive one, MySQL raised the error, and I couldn't understand what the problem was because I had checked with a case-sensitive search.
The solution is to change the collation of the table or column to a case-sensitive one; I used utf8_bin. (Note that MySQL does not ship a utf8_general_cs collation.)
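A sketch of how to check and change the collation, using the my_table and name column from the question as a stand-in and assuming the column uses the utf8 character set:

```sql
-- Show the collation of each column.
SHOW FULL COLUMNS FROM my_table;

-- Make one column compare byte by byte, which is case-sensitive.
ALTER TABLE my_table
  MODIFY `name` VARCHAR(50) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL;
```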
In case this helps anyone besides the OP, I had a similar problem using InnoDB.
For me, what was really going on was a foreign key constraint failure. I was referencing a foreign key that did not exist.
In other words, the error was completely off. The primary key was fine, and inserting the foreign key first fixed the problem. No idea why MySQL got this wrong suddenly.
This is a less common case, but keep in mind that, according to the documentation (https://dev.mysql.com/doc/refman/5.6/en/innodb-online-ddl-limitations.html):
When running an online ALTER TABLE operation, the thread that runs the ALTER TABLE operation will apply an “online log” of DML operations that were run concurrently on the same table from other connection threads. When the DML operations are applied, it is possible to encounter a duplicate key entry error (ERROR 1062 (23000): Duplicate entry), even if the duplicate entry is only temporary and would be reverted by a later entry in the “online log”. This is similar to the idea of a foreign key constraint check in InnoDB in which constraints must hold during a transaction.
In my case the error was caused by an outdated schema: one column was originally varchar(50), but the dump I was trying to import was created from a modified version of the schema that has varchar(70) for that column (and some of the entries in that field were using more than 50 chars).
During the import some keys were truncated and the truncated version was not unique anymore. Took a while to figure that out, I was like "but this supposedly duplicated key doesn't even exist!".
Try with auto increment:
CREATE TABLE IF NOT EXISTS `my_table` (
`number` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(50) NOT NULL,
`money` int(11) NOT NULL,
PRIMARY KEY (`number`,`name`)
) ENGINE=MyISAM;
Your code works well in this demo:
http://sqlfiddle.com/#!8/87e10/1/0
I think you are running the second query (insert...) twice. Try
select * from my_table
before inserting the new row, and you will see whether your data already exists or not.
I have just tried this: if you have data and re-creating the table doesn't work, just alter the table to InnoDB and try again. That fixed the problem for me.
In case anyone else finds this thread with my problem -- I was using an "integer" column type in MySQL. The row I was attempting to insert had a primary key with a value larger than allowed by integer. Switching to "bigint" fixed the problem.
As per your code, "number" and "name" are the primary key, and you are inserting 'S. Name' in both rows, so it causes a conflict. A primary key is used to access a complete row; here you can't access the row through the primary key column 'name' alone.
I'm a beginner, but I think that might be the cause of the error.
In my case the error was very misleading. The problem was that PHPMyAdmin uses "ALTER TABLE" when you click on the "make unique" button instead of "ALTER IGNORE TABLE", so I had to do it manually, like in:
-- IGNORE is only accepted before MySQL 5.7.4, where it was removed
ALTER IGNORE TABLE mytbl ADD UNIQUE (columnName);
This problem is often created when adding a column or using an existing column as a primary key. It is not created due to a primary key existing that was never actually created or due to damage to the table.
What the error actually denotes is that a pending key value is blank.
The solution is to populate the column with unique values and then try to create the primary key again. There can be no blank, null or duplicate values, or this misleading error will appear.
For me, a no-op rebuild of the table was enough (it was already InnoDB):
ALTER TABLE $tbl ENGINE=InnoDB;
tl;dr: my view showed my table was empty but the view excluded existing rows.
I had the same problem but mine was because I was inserting the same test rows I had used before. When I checked to see if my table was empty, I used a view that excluded different tenants so the search came back empty. When I checked the actual table, the previous records were still there.
Once I had deleted the existing records, the insert worked. Only half a day of frustration lost to this one...
I had this error when adding a composite primary key, that is, ADD PRIMARY KEY (column1, column2, ...). The combination of values in those columns must not be duplicated in any other row.
For Example:
You do ADD PRIMARY KEY (name, country, number)
name    country  number
collin  Uk       5
collin  Uk       5
This will throw the error #1062 - Duplicate entry 'collin-UK-5' for key 'PRIMARY' because the combined columns contain a duplicate.
So if you see an error in this format, check that the columns you want to combine into a composite primary key don't contain duplicate combinations.
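One way to run that check before adding the key is a GROUP BY over the candidate columns. The table name people is hypothetical; the columns follow the example above:

```sql
-- Lists every combination that occurs more than once.
SELECT name, country, number, COUNT(*) AS copies
FROM people
GROUP BY name, country, number
HAVING COUNT(*) > 1;
```

The grouping uses each column's collation, so with a case-insensitive collation 'Uk' and 'UK' count as the same value, just as they would for the primary key.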
Another reason you may be getting this error is that the same restriction exists in another related table, and the key name on the related table is exactly the same. I've had this happen once and it was quite difficult to identify.
i.e. if you have a trigger that inserts data to a different table (the "related" table) with the same restriction and same Keyname, MySQL will not include the name of the table throwing the error, only the Keyname.
Looking at your error, #1062 - Duplicate entry '2-S. Name' for key 'PRIMARY', it says that you have a primary key on your number field, which is why it reports a duplicate error on that field.
So remove this primary key, and then it will insert duplicates as well.
I need to import data from one MySQL table into another. The old table has a different, outdated structure (which isn't terribly relevant). That said, I'm appending a field to the new table called "import_id", which saves the original id from the old table in order to prevent duplicate imports of the old records.
My question now is, how do I actually prevent duplicates? Due to the parallel rollout of the new system with the old, the import will unfortunately need to be run more than once. I can't make the "import_id" field PK/UNIQUE because it will have null values for fields that do not come from the old table, thereby throwing an error when adding new fields. Is there a way to use some type of INSERT IGNORE on the fly for an arbitrary column that doesn't natively have constraints?
The more I think about this problem, the more I think I should handle it in the initial SELECT. However, I'd be interested in quality mechanisms by which to handle this in general.
Best.
You should be able to create a unique key on the import_id column and still specify that column as nullable. It is only primary key columns that must be specified as NOT NULL.
That said, on the new table you could specify a unique key on the nullable import_id column and then handle any duplicate key errors when inserting from the old table into the new table using ON DUPLICATE KEY UPDATE.
Here's a basic worked example of what I'm driving at:
create table your_table
(id int unsigned primary key auto_increment,
someColumn varchar(50) not null,
import_id int null,
UNIQUE KEY `importIdUidx1` (import_id)
);
insert into your_table (someColumn,import_id) values ('someValue1',1) on duplicate key update someColumn = 'someValue1';
insert into your_table (someColumn) values ('someValue2');
insert into your_table (someColumn) values ('someValue3');
insert into your_table (someColumn,import_id) values ('someValue4',1) on duplicate key update someColumn = 'someValue4';
where the first and last inserts represent inserts from the old table and the 2nd and 3rd represent inserts from elsewhere.
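For a repeatable bulk import, the same idea can be written as a single INSERT ... SELECT. This is only a sketch: old_table and some_old_column are hypothetical names standing in for the real source table.

```sql
-- Hypothetical source: old_table(id, some_old_column).
-- Re-running this does not create duplicates, because import_id is unique;
-- rows imported earlier are refreshed instead.
INSERT INTO your_table (someColumn, import_id)
SELECT o.some_old_column, o.id
FROM old_table AS o
ON DUPLICATE KEY UPDATE someColumn = VALUES(someColumn);
```

If previously imported rows should be left untouched rather than refreshed, INSERT IGNORE ... SELECT without the ON DUPLICATE KEY UPDATE clause is the other option.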
Hope this helps and good luck!