Duplicate Key Entry for key PRIMARY

I am new to MySQL but learning fast. I am following a tutorial and all is well until I add a couple of records. I am doing this tutorial in Workbench 6.1.
This is what the tutorial asks me to do:
After creating a very simple table with emp_no, first_name and last_name where emp_no is the PK we insert three records:
no fn ln
--------------------
1 Daniel Lamarche
2 Paul Smith
3 Bobz Youruncle
Then the tutorial asks us to UPDATE the third record to:
5 Alan Youruncle
All is well. Then it asks us to confirm that LAST_INSERT_ID() is still equal to 3.
The table now looks like the following:
no fn ln
--------------------
1 Daniel Lamarche
2 Paul Smith
5 Alan Youruncle
Here is where I have a problem that the tutorial does not address, because it stops there. Adventurous as I am, I wonder what will happen if I add three records. Since LAST_INSERT_ID() = 3, will emp_no take 3, 4 and then 6, I ask myself.
So when I insert three records with:
INSERT INTO employees (first_name, last_name)
VALUES ('Paul', 'Lalo'), ('Claude', 'Baker'), ('Alan', 'Brown');
I get the error: Error Code: 1062. Duplicate entry '5' for key PRIMARY.
Now I do perfectly understand why there is an error. Can anyone help me understand how to deal with this? Is there a way to insert new records and skip any value that is already taken?
Now I also understand that maybe it is not good practice to do this that way or whatever. But let's pretend that this is a real life situation and not just a fun tutorial for beginners like me.
Just in case someone wants it, the tutorial is at: http://www.mysqltutorial.org/mysql-sequence/
Thanks,
Daniel

In most real life situations you would never update or insert an auto_increment primary key value, you would just update or insert the other column values. It is only there as a pointer to the row.
When the tutorial asks you to UPDATE the third record to:
5 Alan Youruncle
It is only highlighting a point about the behaviour of LAST_INSERT_ID(), but should point out that this is not an UPDATE that you should generally run.
If you want to completely change a row, you would generally do a delete followed by an insert.
If you must, you can change the current auto_increment value on a table to one higher than the current maximum. However, this only becomes necessary if you have done something unusual:
ALTER TABLE employees AUTO_INCREMENT=6;

Basically, when you perform an INSERT MySQL will update the table's AUTO_INCREMENT. When you perform an update on an "auto increment" column MySQL won't update the table's AUTO_INCREMENT.
Therefore,
INSERT INTO users(id, name) VALUES(20, 'John'); will update the AUTO_INCREMENT to 21, so the next insert will get ID 21.
But if you perform an update such as UPDATE users SET id = 40 WHERE id = 20, the AUTO_INCREMENT will still be 21, not 41, and the next insert will get ID 21. If you keep inserting, eventually you'll reach ID 40 again and MySQL will raise a duplicate primary key error.
Also, FWIW, AUTO_INCREMENT updates are calculated and performed after inserts and not before.
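The behaviour described above can be sketched with a toy model in Python. This is not MySQL: `ToyTable`, `update_id` and `set_auto_increment` are made-up names for illustration, and the model does not try to reproduce what MySQL does to the counter after a failed insert. It only shows why an UPDATE of the key leaves a "trap" for a later INSERT, using the employees data from the question.

```python
class ToyTable:
    """Toy model of a table with an auto-increment primary key (not real MySQL)."""

    def __init__(self):
        self.rows = {}      # id -> row data
        self.next_id = 1    # stands in for the table's AUTO_INCREMENT value

    def insert(self, data, row_id=None):
        # An explicit id is used as given; otherwise the counter supplies one.
        new_id = self.next_id if row_id is None else row_id
        if new_id in self.rows:
            raise ValueError(f"Duplicate entry '{new_id}' for key 'PRIMARY'")
        self.rows[new_id] = data
        # Inserts push the counter past the largest id they stored.
        self.next_id = max(self.next_id, new_id + 1)
        return new_id

    def update_id(self, old_id, new_id):
        # Updates move the row but leave the counter untouched.
        self.rows[new_id] = self.rows.pop(old_id)

    def set_auto_increment(self, value):
        # Loosely mirrors ALTER TABLE ... AUTO_INCREMENT = value.
        self.next_id = value


employees = ToyTable()
for name in ["Daniel Lamarche", "Paul Smith", "Bobz Youruncle"]:
    employees.insert(name)          # ids 1, 2, 3; counter is now 4

employees.update_id(3, 5)           # counter stays at 4
first_new_id = employees.insert("Paul Lalo")

try:
    employees.insert("Claude Baker")    # counter hands out 5, which is taken
    error = None
except ValueError as exc:
    error = str(exc)

employees.set_auto_increment(6)     # the fix suggested in the first answer
id_after_fix = employees.insert("Claude Baker")
```

In this model the first new row gets id 4, the second collides with the row that was moved to 5, and after the counter is set to 6 the insert goes through.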

Related

mysql unique constraint fails with null values? [duplicate]

This question requires some hypothetical background. Let's consider an employee table that has columns name, date_of_birth, title, salary, using MySQL as the RDBMS. Since if any given person has the same name and birth date as another person, they are, by definition, the same person (barring amazing coincidences where we have two people named Abraham Lincoln born on February 12, 1809), we'll put a unique key on name and date_of_birth that means "don't store the same person twice." Now consider this data:
id name date_of_birth title salary
1 John Smith 1960-10-02 President 500,000
2 Jane Doe 1982-05-05 Accountant 80,000
3 Jim Johnson NULL Office Manager 40,000
4 Tim Smith 1899-04-11 Janitor 95,000
If I now try to run the following statement, it should and will fail:
INSERT INTO employee (name, date_of_birth, title, salary)
VALUES ('Tim Smith', '1899-04-11', 'Janitor', '95,000')
If I try this one, it will succeed:
INSERT INTO employee (name, title, salary)
VALUES ('Jim Johnson', 'Office Manager', '40,000')
And now my data will look like this:
id name date_of_birth title salary
1 John Smith 1960-10-02 President 500,000
2 Jane Doe 1982-05-05 Accountant 80,000
3 Jim Johnson NULL Office Manager 40,000
4 Tim Smith 1899-04-11 Janitor 95,000
5 Jim Johnson NULL Office Manager 40,000
This is not what I want but I can't say I entirely disagree with what happened. If we talk in terms of mathematical sets,
{'Tim Smith', '1899-04-11'} = {'Tim Smith', '1899-04-11'} <-- TRUE
{'Tim Smith', '1899-04-11'} = {'Jane Doe', '1982-05-05'} <-- FALSE
{'Tim Smith', '1899-04-11'} = {'Jim Johnson', NULL} <-- UNKNOWN
{'Jim Johnson', NULL} = {'Jim Johnson', NULL} <-- UNKNOWN
My guess is that MySQL says, "Since I don't know that Jim Johnson with a NULL birth date isn't already in this table, I'll add him."
My question is: How can I prevent duplicates even though date_of_birth is not always known? The best I've come up with so far is to move date_of_birth to a different table. The problem with that, however, is that I might end up with, say, two cashiers with the same name, title and salary, different birth dates and no way to store them both without having duplicates.
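The behaviour in the question can be reproduced in a few lines. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL, on the assumption (true as far as I know) that both treat NULLs in a UNIQUE key as distinct from each other. The table is a cut-down version of the one in the question, and `try_insert` is a helper invented for this example.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " date_of_birth TEXT,"
    " UNIQUE (name, date_of_birth))"
)


def try_insert(name, dob):
    """Return True if the row was stored, False if the unique key rejected it."""
    try:
        conn.execute(
            "INSERT INTO employee (name, date_of_birth) VALUES (?, ?)",
            (name, dob),
        )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("Tim Smith", "1899-04-11"),  # first copy is stored
    try_insert("Tim Smith", "1899-04-11"),  # exact duplicate is rejected
    try_insert("Jim Johnson", None),        # unknown birth date is stored
    try_insert("Jim Johnson", None),        # and stored again
]

# Comparing NULL with NULL yields NULL (unknown), not true.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]
```

The last query shows the three-valued logic the question describes: the comparison is neither true nor false, so the unique key never sees the two Jim Johnson rows as equal.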

A fundamental property of a unique key is that
it must be unique. Making part of that key Nullable destroys this property.
There are two possible solutions to your problem:
One way, the wrong way, would be to use some magic date to represent unknown. This just gets you past
the DBMS "problem" but does not solve the problem in a logical sense.
Expect problems with two "John Smith" entries having unknown dates
of birth. Are these guys one and the same or are they unique individuals?
If you know they are different then you are back to the same old problem -
your Unique Key just isn't unique. Don't even think about assigning a whole range of magic dates
to represent "unknown" - this is truly the road to hell.
A better way is to create an EmployeeId attribute as a surrogate key. This is just an
arbitrary identifier that you assign to individuals that you know are unique. This
identifier is often just an integer value.
Then create an Employee table to relate the EmployeeId (unique, non-nullable
key) to what you believe are the dependent attributes, in this case
Name and Date of Birth (any of which may be nullable). Use the EmployeeId surrogate key everywhere that you
previously used the Name/Date-of-Birth. This adds a new table to your system but
solves the problem of unknown values in a robust manner.

I think MySQL does it right here. Some other databases (for example Microsoft SQL Server) treat NULL as a value that can only be inserted once into a UNIQUE column, but personally I find this to be strange and unexpected behaviour.
However, since this is what you want, you can use some "magic" value instead of NULL, such as a date a long time in the past.

I recommend creating an additional table column, checksum, which will contain the MD5 hash of name and date_of_birth. Drop the unique key (name, date_of_birth) because it doesn't solve the problem. Create one unique key on checksum.
ALTER TABLE employee
ADD COLUMN checksum CHAR(32) NOT NULL;
UPDATE employee
SET checksum = MD5(CONCAT(name, IFNULL(date_of_birth, '')));
ALTER TABLE employee
ADD UNIQUE (checksum);
This solution creates a small technical overhead, because for every inserted pair you need to generate the hash (and the same for every search query). As a further improvement, you can add a trigger that will generate the hash for you on every insert:
CREATE TRIGGER before_insert_employee
BEFORE INSERT ON employee
FOR EACH ROW
IF new.checksum IS NULL THEN
SET new.checksum = MD5(CONCAT(new.name, IFNULL(new.date_of_birth, '')));
END IF;

Your problem of not having duplicates based on name is not solvable because you do not have a natural key. Putting a fake date in for people whose date of birth is unknown will not solve your problem. John Smith born 1900/01/01 is still going to be a different person than John Smith born 1960/03/09.
I work with name data from large and small organizations every day and I can assure you they have two different people with the same name all the time. Sometimes with the same job title. Birthdate is no guarantee of uniqueness either; there are plenty of John Smiths born on the same date. Heck, when we work with physicians' office data we often have two doctors with the same name, address and phone number (father and son combinations).
Your best bet is to have an employee ID if you are inserting employee data, to identify each employee uniquely. Then check for a unique name in the user interface and, if there are one or more matches, ask the user if he meant one of them; if he says no, insert the record. Then build a deduping process to fix problems if someone gets assigned two ids by accident.

There is another way to do it: add a non-nullable column to represent the string value of the date_of_birth column. The new column's value would be "" (empty string) if date_of_birth is null.
We name the column date_of_birth_str and create a unique constraint on employee(name, date_of_birth_str). So when two records come in with the same name and a null date_of_birth value, the unique constraint still works.
But the effort of maintaining two columns with the same meaning, and the performance cost of the new column, should be considered carefully.

You can add a generated column where the NULL value is replaced by an unused constant, e.g. zero. Then you can put the unique constraint on the name together with this column:
CREATE TABLE employee (
name VARCHAR(50) NOT NULL,
date_of_birth DATE,
uq_date_of_birth DATE AS (IFNULL(date_of_birth, '0000-00-00')),
UNIQUE (name, uq_date_of_birth)
);

The perfect solution would be support for function-based unique keys, but that becomes more complex as MySQL would also then need to support function-based indexes. This would prevent the need to use "fake" values in place of NULL, while also allowing developers to decide how to treat NULL values in unique keys. Unfortunately, MySQL doesn't currently support such functionality that I am aware of, so we're left with workarounds.
CREATE TABLE employee(
name CHAR(50) NOT NULL,
date_of_birth DATE,
title CHAR(50),
UNIQUE KEY idx_name_dob (name, IFNULL(date_of_birth,'0000-00-00 00:00:00'))
);
(Note the use of the IFNULL() function in the unique key definition)
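To show what such a function-based unique key would do, here is a minimal sketch using Python's sqlite3 module, since SQLite accepts expressions in a unique index. It is a stand-in, not MySQL; the empty-string placeholder and the `try_insert` helper are choices made for this example. To my knowledge, MySQL 8.0.13 and later also accept functional key parts when the expression is wrapped in its own pair of parentheses, but that is not tested here.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (name TEXT NOT NULL, date_of_birth TEXT, title TEXT)"
)
# The index sees '' wherever date_of_birth is NULL; the column itself is unchanged.
conn.execute(
    "CREATE UNIQUE INDEX idx_name_dob "
    "ON employee (name, IFNULL(date_of_birth, ''))"
)


def try_insert(name, dob):
    """Return True if the row was stored, False if the unique index rejected it."""
    try:
        conn.execute(
            "INSERT INTO employee (name, date_of_birth) VALUES (?, ?)",
            (name, dob),
        )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("Jim Johnson", None),          # stored
    try_insert("Jim Johnson", None),          # rejected: same name, both unknown
    try_insert("Jim Johnson", "1970-01-01"),  # stored: known date differs
]
stored_dob = conn.execute(
    "SELECT date_of_birth FROM employee ORDER BY rowid LIMIT 1"
).fetchone()[0]
```

The table still stores a real NULL for the unknown date; only the index works with the substituted value, which is the advantage over writing a "magic" date into the column.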

I had a similar problem to this, but with a twist. In your case, every employee has a birthday, although it may be unknown. In that case, it makes logical sense for the system to assign two values for employees with unknown birthdays but otherwise identical information. NealB's accepted answer is very accurate.
However, the problem I encountered was one in which the data field did not necessarily have a value. For example, if you added a 'name_of_spouse' field to your table, there wouldn't necessarily be a value for each row of the table. In that case, NealB's first bullet point (the 'wrong way') actually makes sense. In this case, a string 'None' should be inserted in the column name_of_spouse for each row in which there was no known spouse.
The situation where I ran into this problem was in writing a program with database to classify IP traffic. The goal was to create a graph of IP traffic on a private network. Each packet was put into a database table with a unique connection index based on its ip source and dest, port source and dest, transport protocol, and application protocol. However, many packets simply don't have an application protocol. For example, all TCP packets without an application protocol should be classed together, and should occupy one unique entry in the connections index. This is because I want those packets to form a single edge of my graph. In this situation, I took my own advice from above, and stored a string 'None' in the application protocol field to ensure that these packets formed a unique group.

I was looking for a solution, and the one Alexander Yancharuk suggested was a good idea for me. But in my case the columns are foreign keys and employee_id can be null.
I have this structure:
+----+---------+-------------+
| id | room_id | employee_id |
+----+---------+-------------+
| 1 | 1 | NULL |
| 2 | 2 | 1 |
+----+---------+-------------+
A room_id with a NULL employee_id must not be duplicated.
I solved it by adding a BEFORE INSERT trigger, like this:
DELIMITER $$
USE `db`$$
CREATE DEFINER=`root`@`%` TRIGGER `db`.`room_employee` BEFORE INSERT ON `room_employee` FOR EACH ROW
BEGIN
-- Reject a second row for the same room when neither row has an employee
IF EXISTS (
SELECT room_id, employee_id
FROM room_employee
WHERE (NEW.room_id = room_employee.room_id AND NEW.employee_id IS NULL AND room_employee.employee_id IS NULL)
) THEN
-- Raise a user-defined error so the INSERT is aborted
SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'The room can not be duplicated on room employee table';
END IF;
END$$
DELIMITER ;
I also added a unique constraint on room_id and employee_id.

I think the fundamental question here is what you actually mean with
INSERT INTO employee (name, title, salary) VALUES ('Jim Johnson', 'Office Manager', '40,000')
Your own definition of a person is name AND birth date, so what does this statement mean in that context? I'd say that the solution to your problem is to prohibit inserting half identities, like the one above, by adding NOT NULL on both your name and date_of_birth columns. That way, the statement will fail and force you to enter complete identities and the unique key will do its job to prevent you from entering the same person twice.

In simple words, the role of a unique constraint is to make the field or column unique.
NULL destroys this property, as the database treats NULL as unknown.
In order to avoid duplicates and still allow NULL:
Make the unique key the primary key.

Insert into on duplicate key - auto increment id skipped

I currently have an SQL execution script which updates the row on duplicate key which looks like this.
$stmt = $dbCon->prepare("INSERT INTO videos_rating (videos_rating_video_fk, "
. " videos_rating_user_fk, "
. " videos_rating_rating) "
. " VALUES (:video_id, "
. " :user_id, "
. " :video_rating) "
. " ON DUPLICATE KEY UPDATE videos_rating_rating = :video_rating");
The script works fine, but is there a way to prevent the auto increment column from getting out of sync?
Let's assume we start with an empty table. I then rate a video, which creates a row that gets the id 1. Then the user executes the SQL again by rating the same video with a lower or higher rating, and the row is updated because it's now a duplicate key. Sure, no problem.
The problem is this.
Next time another user rates a new video, the row will get id 3 and not 2?
The table will then look like this
id | videos_rating_user_fk | videos_rating_rating
1 | 1 | 4
3 | 2 | 5
I was not able to find a similar question, even though I find it highly unlikely that no one else has been bothered by this. If there is one, please refer me to that post.
I know ids are not supposed to 'look good', but it is very annoying that ids jump from 30 - 51 - 82 - 85 - 89 etc. And would there not be a problem at some point when the maximum UNSIGNED BIGINT value is reached? I'm not saying I will ever go that high, but still.

I assume that you are using the default InnoDB engine. In that case the "problem" is that the engine will "reserve" the id before it knows if it's a duplicate or not. Once the id is "reserved" it cannot be released, because another thread (another user) might perform an insert into the same table at the "same" time. There are also other ways to get gaps in the AUTO_INCREMENT column without deleting any rows. One is when you roll back a transaction.
You can try to "reset" the next AUTO_INCREMENT value after every insert with
alter table videos_rating auto_increment = 1;
But I can't say what problems you might run in executing this statement in a running live environment. And I'm not going to find that out.
Note that this is usually not an issue, because tables on which you run IODKU (INSERT ... ON DUPLICATE KEY UPDATE) statements usually don't need an AUTO_INCREMENT column. As Cid wrote in his answer, you can just drop the id column and define your unique key as the primary key.

Let's assume your table is built this way :
videos_rating_video_fk | videos_rating_user_fk | videos_rating_rating
-----------------------+-----------------------+----------------------
The first key videos_rating_video_fk should be a foreign key and not a primary key with autoincrement.
If users 1 and 2 vote for the video that has the id 1, your table should look like this :
videos_rating_video_fk | videos_rating_user_fk | videos_rating_rating
-----------------------+-----------------------+----------------------
1 | 1 | 4
1 | 2 | 5
For that kind of table, the primary key should be the combination of both foreign keys, which will be unique. A user can vote only once for a video (unique vote = unique key). A video can be voted on by multiple users, and users can vote for multiple videos.
I suggest you take a look at the Merise Method for building tables with integrity constraints and creating primary keys.

Live with the "burning" of ids. AUTO_INCREMENT guarantees not to allow duplicate values, but does not provide any other guarantees.
There are about 4 other ways where ids may be 'burned': REPLACE, Multi-Master / Galera, IGNORE, DELETE, and possibly more.
IODKU quickly grabbed an id before discovering that the statement would turn into an UPDATE and not need the id. To do otherwise would probably be a significant performance hit.

To confirm, Paul Spiegel's answer helped me to resolve the issue. I had an 'Upsert' SQL query that used ON DUPLICATE KEY UPDATE to determine whether to create a new row or update an existing row. Where a row was updated frequently, the jumps in assigned ids were large.
the "problem" is that the engine will "reserve" the id before it knows if it's a duplicate or not.
I resolved the problem by breaking the SQL code into separate INSERT and UPDATE statements. I'm no longer seeing the issue.

Do not insert a record if a key is already present

I am a newbie to mysql. Please help.
I have a table people like this. The only primary key of people is id
id name age sex
1. John 16 M
2. Peter 18 K
I would like to write some SQL to insert some rows into people, but if the name already exists
in the table, I do not want to insert a new row. For example, if I insert rows with the names John and
Peter, no rows should be inserted.
I have a variable named var_name.
I have searched the web for a very long time.
I use the following SQL recommended on the web:
INSERT into People(name) values(var_name) where not exists(SELECT name from People
where name = var_name)
But a SQL syntax error comes up. Why does this happen? And is there a fast way to achieve my goal?

The best way to do this is to create a unique index on name:
create unique index idx_people_name on people(name);
Then, when you insert, use on duplicate key update:
INSERT into People(name)
values(var_name)
on duplicate key update name = values(name);
The update piece does nothing -- it is a "no-op". But this puts the logic in the database and enforces that names need to be unique.
For your query to work, you need insert . . . select. The values clause doesn't take a where clause:
INSERT into People(name)
select var_name
from dual
where not exists(SELECT name from People where name = var_name);
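The INSERT ... SELECT ... WHERE NOT EXISTS pattern can be tried out with Python's sqlite3 module. This is a sketch against SQLite rather than MySQL: SQLite has no `dual` table and lets the SELECT stand without a FROM clause, and `insert_if_absent` and `count_people` are helper names made up for this example.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, sex TEXT)"
)
conn.executemany(
    "INSERT INTO people (name, age, sex) VALUES (?, ?, ?)",
    [("John", 16, "M"), ("Peter", 18, "K")],
)


def insert_if_absent(name):
    """Insert a row for `name` only when no row with that name exists yet."""
    conn.execute(
        "INSERT INTO people (name) "
        "SELECT ? WHERE NOT EXISTS (SELECT 1 FROM people WHERE name = ?)",
        (name, name),
    )


def count_people():
    return conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]


insert_if_absent("John")        # already present, nothing is inserted
count_after_existing = count_people()

insert_if_absent("Mary")        # new name, one row is inserted
count_after_new = count_people()
```

Unlike the unique index, this check lives only in the query, so any other code path that inserts into the table has to repeat it.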

If you have a unique constraint on the name, I believe you can use:
INSERT IGNORE People(name) VALUES (var_name);

Increment a database field by 1

With MySQL, if I have a field, of say logins, how would I go about updating that field by 1 within a sql command?
I'm trying to create an INSERT query, that creates firstName, lastName and logins. However if the combination of firstName and lastName already exists, increment the logins by 1.
so the table might look like this..
firstName----|----lastName----|----logins
John Jones 1
Steve Smith 3
I'm after a command that when run, would either insert a new person (i.e. Tom Rogers) or increment logins if John Jones was the name used..

Updating an entry:
A simple increment should do the trick.
UPDATE mytable
SET logins = logins + 1
WHERE id = 12
Insert new row, or Update if already present:
If you would like to update a previously existing row, or insert it if it doesn't already exist, you can use the REPLACE syntax or the INSERT...ON DUPLICATE KEY UPDATE option (As Rob Van Dam demonstrated in his answer).
Inserting a new entry:
Or perhaps you're looking for something like INSERT...MAX(logins)+1? Essentially you'd run a query much like the following - perhaps a bit more complex depending on your specific needs:
INSERT into mytable (logins)
SELECT max(logins) + 1
FROM mytable

If you can safely make (firstName, lastName) the PRIMARY KEY or at least put a UNIQUE key on them, then you could do this:
INSERT INTO logins (firstName, lastName, logins) VALUES ('Steve', 'Smith', 1)
ON DUPLICATE KEY UPDATE logins = logins + 1;
If you can't do that, then you'd have to fetch whatever that primary key is first, so I don't think you could achieve what you want in one query.

This is more a footnote to a number of the answers above which suggest the use of ON DUPLICATE KEY UPDATE. Beware that this is NOT always replication safe, so if you ever plan on growing beyond a single server, you'll want to avoid this and use two queries: one to verify the existence, and then a second to either UPDATE when a row exists, or INSERT when it does not.
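A two-statement version can be sketched as follows, using Python's sqlite3 module as a stand-in for a MySQL client. It is a variant of the idea rather than a literal transcription: instead of a separate existence check, it runs the UPDATE first and falls back to an INSERT when no row was affected. The table layout follows the question, and `record_login` is a name invented for this example.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logins ("
    " firstName TEXT, lastName TEXT, logins INTEGER,"
    " PRIMARY KEY (firstName, lastName))"
)


def record_login(first, last):
    """Increment the counter for an existing person, or create it with 1."""
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            "UPDATE logins SET logins = logins + 1 "
            "WHERE firstName = ? AND lastName = ?",
            (first, last),
        )
        if cur.rowcount == 0:  # nobody matched, so this is a new person
            conn.execute(
                "INSERT INTO logins (firstName, lastName, logins) VALUES (?, ?, 1)",
                (first, last),
            )


record_login("John", "Jones")
record_login("John", "Jones")
record_login("Tom", "Rogers")

totals = dict(
    conn.execute("SELECT firstName || ' ' || lastName, logins FROM logins")
)
```

With several clients writing at once, two of them could both find no row and both try to insert. The primary key would then reject one of them, so real code would need to catch that error and retry the UPDATE.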

You didn't say what you're trying to do, but you hinted at it well enough in the comments to the other answer. I think you're probably looking for an auto increment column
create table logins (userid int auto_increment primary key,
username varchar(30), password varchar(30));
then no special code is needed on insert. Just
insert into logins (username, password) values ('user','pass');
The MySQL API has functions to tell you what userid was created when you execute this statement in client code.

I am not an expert in MySQL, but you should probably look at triggers, e.g. BEFORE INSERT.
In the trigger you can run a select query on your original table and, if it finds something, just update the row's 'logins' instead of inserting new values.
But all this depends on the version of MySQL you are running.

How to handle fragmentation of auto_increment ID column in MySQL

I have a table with an auto_increment field and sometimes rows get deleted so auto_increment leaves gaps. Is there any way to avoid this or if not, at the very least, how to write an SQL query that:
Alters the auto_increment value to be the max(current value) + 1
Return the new auto_increment value?
I know how to write part 1 and 2 but can I put them in the same query?
If that is not possible:
How do I "select" (return) the auto_increment value or auto_increment value + 1?

Renumbering will cause confusion. Existing reports will refer to record 99, and yet if the system renumbers it may renumber that record to 98, now all reports (and populated UIs) are wrong. Once you allocate a unique ID it's got to stay fixed.
Using ID fields for anything other than simple unique numbering is going to be problematic. Having a requirement for "no gaps" is simply inconsistent with the requirement to be able to delete. Perhaps you could mark records as deleted rather than delete them. Then there are truly no gaps. Say you are producing numbered invoices: you would have a zero value cancelled invoice with that number rather than delete it.

There is a way to manually insert the id even in an autoinc table. All you would have to do is identify the missing id.
However, don't do this. It can be very dangerous if your database is relational. It is possible that the deleted id was used elsewhere. When removed, it would not present much of an issue, perhaps it would orphan a record. If replaced, it would present a huge issue because the wrong relation would be present.
Consider that I have a table of cars and a table of people
car
carid
ownerid
name
person
personid
name
And that there is some simple data
car
1 1 Van
2 1 Truck
3 2 Car
4 3 Ferrari
5 4 Pinto
person
1 Mike
2 Joe
3 John
4 Steve
and now I delete person John.
person
1 Mike
2 Joe
4 Steve
If I added a new person, Jim, into the table, and he got an id which filled the gap, then he would end up getting id 3
1 Mike
2 Joe
3 Jim
4 Steve
and by relation, would be the owner of the Ferrari.

I generally agree with the wise people on this page (and duplicate questions) advising against reusing auto-incremented id's. It is good advice, but I don't think it's up to us to decide the rights or wrongs of asking the question, let's assume the developer knows what they want to do and why.
The answer is, as mentioned by Travis J, you can reuse an auto-increment id by including the id column in an insert statement and assigning the specific value you want.
Here is a point to put a spanner in the works: MySQL itself (at least 5.6 InnoDB) will reuse an auto-increment ID in the following circumstance:
delete any number of rows with the highest auto-increment id
Stop and start MySQL
insert a new row
The inserted row will have an id calculated as max(id)+1, it does not continue from the id that was deleted.

As djna said in her/his answer, it's not good practice to alter database tables in such a way, and there is no need to do that if you have chosen the right schema and data types. By the way, regarding this part of your question:
I have a table with an auto_increment field and sometimes rows get deleted so auto_increment leaves gaps. Is there any way to avoid this?
If your table has too many gaps in its auto-increment column, probably as a result of many test INSERT queries,
and if you want to keep the id values from growing too large by removing the gaps,
and also if the id column is just a counter and has no relation to any other column in your database,
then this may be the thing you (or any other person looking for such a thing) are looking for:
SOLUTION
remove the original id column
add it again using auto_increment on
But if you just want to reset the auto_increment to the first available value:
ALTER TABLE `table_name` AUTO_INCREMENT=1

Not sure if this will help, but in SQL Server you can reseed the identity fields. It seems there's an ALTER TABLE statement in MySQL to achieve this, e.g. to set the id to continue at 59446:
ALTER TABLE table_name AUTO_INCREMENT = 59446;
I'm thinking you should be able to combine a query to get the largest value of auto_increment field, and then use the alter table to update as needed.
