MySQL "Insert ... On Duplicate Key" with more than one unique key

I've been reading up on how to use MySQL insert on duplicate key to see if it will allow me to avoid Selecting a row, checking if it exists, and then either inserting or updating. As I've read the documentation however, there is one area that confuses me. This is what the documentation says:
If you specify ON DUPLICATE KEY UPDATE, and a row is inserted that would cause a duplicate value in a UNIQUE index or PRIMARY KEY, an UPDATE of the old row is performed
The thing is, I don't know if this will work for my problem, because the 'condition' I have for not inserting a new row is the existence of a row that has two columns equal to certain values, not necessarily that the primary key is the same. Right now the syntax I'm imagining is this, but I don't know if it will always insert instead of update:
INSERT INTO attendance (event_id, user_id, status) VALUES(some_event_number, some_user_id, some_status) ON DUPLICATE KEY UPDATE status=1
The thing is, event_id and user_id aren't primary keys, but if a row in the table 'attendance' already has those columns with those values, I just want to update it. Otherwise I would like to insert it. Is this even possible with ON DUPLICATE? If not, what other method might I use?

The quote includes "a duplicate value in a UNIQUE index". So, your values do not need to be the primary key:
create unique index attendance_eventid_userid on attendance(event_id, user_id);
Presumably, you want to update the existing record because you don't want duplicates. If you want duplicates sometimes, but not for this particular insert, then you will need another method.
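A minimal sketch of the composite unique index in action. To keep it runnable without a server it uses Python's built-in sqlite3 module as a stand-in for MySQL, so the upsert is spelled ON CONFLICT (...) DO UPDATE (SQLite 3.24 or later) instead of ON DUPLICATE KEY UPDATE. The helper name upsert_attendance and the sample values are made up for illustration.

```python
import sqlite3


def upsert_attendance(conn, event_id, user_id, status):
    """Insert a row, or update status if (event_id, user_id) already exists."""
    conn.execute(
        """
        INSERT INTO attendance (event_id, user_id, status)
        VALUES (?, ?, ?)
        ON CONFLICT (event_id, user_id) DO UPDATE SET status = excluded.status
        """,
        (event_id, user_id, status),
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attendance (event_id INT, user_id INT, status TEXT)")
# Same composite unique index as in the answer; neither column is a primary key.
conn.execute(
    "CREATE UNIQUE INDEX attendance_eventid_userid ON attendance (event_id, user_id)"
)

upsert_attendance(conn, 10, 7, "invited")    # no matching pair: inserts
upsert_attendance(conn, 10, 7, "attending")  # same pair: updates in place
upsert_attendance(conn, 10, 8, "invited")    # different user: inserts

rows = conn.execute(
    "SELECT event_id, user_id, status FROM attendance ORDER BY user_id"
).fetchall()
print(rows)  # [(10, 7, 'attending'), (10, 8, 'invited')]
```

In MySQL the equivalent statement would be the INSERT ... ON DUPLICATE KEY UPDATE from the question, once the unique index exists.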

If I were you, I would make a primary key out of event_id and user_id. That will make this extremely easy with ON DUPLICATE.
SQLFiddle
create table attendance (
event_id int,
user_id int,
status varchar(100),
primary key(event_id, user_id)
);
Then with ease:
insert into attendance (event_id, user_id, status) values(some_event_number, some_user_id, some_status)
on duplicate key
update status = values(status);

Maybe you can try to write a trigger that checks if the pair (event_id, user_id) exists in the table before inserting, and if it exists just update it.

To the broader question of "Will INSERT ... ON DUPLICATE respect a UK even if the PK changes", the answer is yes: SQLFiddle
In this SQLFiddle I insert a new record, with a new PK id, but its values would violate the UK. It performs the ON DUPLICATE and the original PK id is preserved, but the non-UK ON DUPLICATE KEY UPDATE value changes.
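The behavior described for the fiddle can be reproduced locally with a small sketch. This one uses Python's sqlite3 as a stand-in for MySQL (SQLite writes the clause as ON CONFLICT (...) DO UPDATE, available from SQLite 3.24); the table layout and the id values 1 and 99 are assumptions for illustration, not the fiddle's actual contents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE attendance (
        id INTEGER PRIMARY KEY,
        event_id INT,
        user_id INT,
        status TEXT,
        UNIQUE (event_id, user_id)
    )
    """
)
conn.execute("INSERT INTO attendance VALUES (1, 10, 7, 'invited')")

# The new row carries a fresh primary key (99), but its (event_id, user_id)
# pair collides with the existing row, so the update branch runs instead.
conn.execute(
    """
    INSERT INTO attendance (id, event_id, user_id, status)
    VALUES (99, 10, 7, 'attending')
    ON CONFLICT (event_id, user_id) DO UPDATE SET status = excluded.status
    """
)

result = conn.execute(
    "SELECT id, event_id, user_id, status FROM attendance"
).fetchall()
print(result)  # [(1, 10, 7, 'attending')]
```

The original id 1 survives and id 99 never reaches the table; only the non-key column changes.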

Related

Error: Duplicate entry '1' for key 'students.PRIMARY' Error Code: ER_DUP_ENTRY

CREATE TABLE IF NOT EXISTS students (
student_id INT,
name VARCHAR(24),
major VARCHAR(24),
PRIMARY KEY(student_id)
);
SELECT * FROM students;
INSERT INTO students VALUES(1,'Jack','Biology');
You're specifying the primary key (student_id) and from the error it already exists. You have a few options:
Don't specify the primary key. It should be set to autoincrement anyway, assuming that this is the primary table that students are entered into, and from the name of the table (students) it seems like it is. Once student_id is declared AUTO_INCREMENT, list the remaining columns explicitly and the query will be:
INSERT INTO students (name, major) VALUES('Jack','Biology');
and then the table will assign the next auto-increment value as the primary key.
Use INSERT IGNORE. This will silently fail if you try to insert a student ID that already exists (or on any query that violates unique keys).
INSERT IGNORE INTO students VALUES(1, 'Jack','Biology');
This will not cause table changes, but it will also not cause an error that interrupts the script, and it will insert any rows that don't fail, say if you had multiple values inserted. The plain INSERT will fail for the entire list, not just the erroneous value.
Use ON DUPLICATE KEY UPDATE. This will update a list of values if it encounters a duplicate key.
INSERT INTO students VALUES(1, 'Jack','Biology')
ON DUPLICATE KEY UPDATE name = values(name), major = values(major);
Here, the row that matches the key is changed: whichever student has student_id 1 will have its name and major updated to the supplied values. For instance, if Jack changed his major to Chemistry, inserting (1, 'Jack', 'Chemistry') this way would update student_id 1 to Jack, Chemistry and reflect his new major.
Use REPLACE INTO. I avoid this one. It is similar to ON DUPLICATE KEY UPDATE, but it deletes the old row and inserts a new one; if the match was on a unique key rather than on an auto-increment primary key, the new row gets a new ID. This can cause you problems with foreign keys, and if your primary key column has a small integer type and you constantly replace into it, the auto-increment value can run past the maximum the column can hold.
Your student_id is the primary key, and the table clearly already exists with a row where student_id=1, hence you cannot insert another row with the same primary key value.
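To see how the options above differ on the same duplicate key, here is a rough sketch using Python's sqlite3 as a stand-in for MySQL. SQLite spells the statements slightly differently (INSERT OR IGNORE for INSERT IGNORE, ON CONFLICT (...) DO UPDATE for ON DUPLICATE KEY UPDATE, SQLite 3.24 or later), and the variable names are only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT, major TEXT)"
)
conn.execute("INSERT INTO students VALUES (1, 'Jack', 'Biology')")

# Plain INSERT: the duplicate primary key raises an error.
try:
    conn.execute("INSERT INTO students VALUES (1, 'Jack', 'Biology')")
    plain_insert_failed = False
except sqlite3.IntegrityError:
    plain_insert_failed = True

# INSERT OR IGNORE (MySQL: INSERT IGNORE): duplicate skipped, row unchanged.
conn.execute("INSERT OR IGNORE INTO students VALUES (1, 'Jack', 'Chemistry')")
after_ignore = conn.execute("SELECT * FROM students").fetchall()

# Upsert (MySQL: ON DUPLICATE KEY UPDATE): existing row updated in place.
conn.execute(
    "INSERT INTO students VALUES (1, 'Jack', 'Chemistry') "
    "ON CONFLICT (student_id) DO UPDATE "
    "SET name = excluded.name, major = excluded.major"
)
after_upsert = conn.execute("SELECT * FROM students").fetchall()

# REPLACE: the old row is deleted and a new one inserted, so a column that
# is not supplied (major) falls back to its default instead of being kept.
conn.execute("REPLACE INTO students (student_id, name) VALUES (1, 'Jack')")
after_replace = conn.execute("SELECT * FROM students").fetchall()

print(plain_insert_failed)  # True
print(after_ignore)         # [(1, 'Jack', 'Biology')]
print(after_upsert)         # [(1, 'Jack', 'Chemistry')]
print(after_replace)        # [(1, 'Jack', None)]
```

The last line is the reason to be careful with REPLACE: it is a delete followed by an insert, not an update.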

Sql Statement: Insert On Key Update is not working as expected when primary key is not specified in the fields to insert

Hello, I am using the "INSERT ... ON DUPLICATE KEY UPDATE" SQL statement to update my database.
All was working fine, since I always inserted a unique id like this:
INSERT INTO devices(uniqueId,name)
VALUES (4,'Printer')
ON DUPLICATE KEY UPDATE name = 'Central Printer';
But now I need to insert rows without supplying a unique id; I only insert or update the values like this:
INSERT INTO table (a,b,c,d,e,f,g)
VALUES (2,3,4,5,6,7,8)
ON DUPLICATE KEY
UPDATE a=a, b=b, c=c, d=d, e=e, f=f, g=g;
I should add that an auto-increment primary key is always generated when I insert a row.
My problem is that the inserted rows are now duplicated, since I don't insert the primary key or unique id explicitly within the SQL statement.
What am I supposed to do?
For example, maybe I need to insert the primary key explicitly? I would like to keep working with this auto-incremented primary key.
On Gordon's recommendation I am adding a sample case, shown in this image:
[image: Rows Output]
In this case I add the first three rows, and then I try to update those three rows again with different information... OK, I see the error now: there is no key to compare against.
Thanks for your answers.
If you want to prevent columns from being duplicated, then create a unique index or constraint on them. For instance:
create unique index unq_table_7 on table(a, b, c, d, e, f, g);
This will guarantee that the 7 columns -- in combination -- are unique.
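A small sketch of why the index matters when the auto-increment id is never supplied. It uses Python's sqlite3 as a stand-in for MySQL, with INSERT OR IGNORE in place of the question's ON DUPLICATE KEY UPDATE and only two columns instead of seven; the table name t, the index name unq_t_a_b and the function name are hypothetical.

```python
import sqlite3


def count_after_two_inserts(with_unique_index):
    """Insert the same (a, b) values twice and return the resulting row count."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, a INT, b INT)"
    )
    if with_unique_index:
        conn.execute("CREATE UNIQUE INDEX unq_t_a_b ON t (a, b)")
    for _ in range(2):
        # id is omitted, so it is always fresh and can never collide; only a
        # unique index on (a, b) gives the database a key to compare against.
        conn.execute("INSERT OR IGNORE INTO t (a, b) VALUES (2, 3)")
    count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return count


print(count_after_two_inserts(with_unique_index=False))  # 2
print(count_after_two_inserts(with_unique_index=True))   # 1
```

Without the index the second insert is just another row; with it, the duplicate is detected.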

MYSQL: Getting existing primary key when inserting record with duplicate unique key?

I've got a MySQL database with a table that has both an auto-increment primary key and a unique string-valued key (a SHA-1 hash).
If I try to add a record that has the same SHA-1 hash as an existing record, I just want to get the primary key of the existing record. I can use something like "INSERT ... ON DUPLICATE KEY UPDATE" or "INSERT IGNORE" to prevent an exception when trying to insert a record with an existing hash value.
However, when that happens, I need to retrieve the primary key of the existing record. I can't find a way to do that with a single SQL statement. If it matters, my code is in Java and I'm using JDBC.
Alternatively, I can do it with two statements (either a query followed by an insertion if not found, or an insertion followed by a query if a duplicate key exists). But I presume a single statement would be more efficient.
If I try to add a record that has the same sha-1 hash as an existing
record, I just want to get the primary key of the existing record. I
can use something like "INSERT ... ON DUPLICATE KEY UPDATE" or "INSERT
IGNORE" to prevent an exception when trying to insert a record with a
existing hash value.
If you have a UNIQUE index on a column, then no matter what you try, the RDBMS will not allow duplicates in that column (except for NULL values).
As you said, there are ways to prevent an error if this happens. Probably INSERT IGNORE in your case.
Anyway, INSERT and UPDATE modify the database; they do not return rows. The only way to read rows back from your DB is to use a SELECT statement.
Here the "workaround" is simple, since you have an UNIQUE column:
INSERT IGNORE INTO tbl (pk, sha_key) VALUES ( ... ), ( ... );
SELECT pk, sha_key FROM tbl WHERE sha_key IN ( ... );
-- ^^^
-- Here the list of the sha1 keys you *tried* to insert
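The insert-then-select workaround above can be wrapped in one helper. This is a sketch only: it uses Python's sqlite3 as a stand-in for MySQL (INSERT OR IGNORE instead of INSERT IGNORE), and the function name get_or_create_id and the sample payloads are invented for the example.

```python
import hashlib
import sqlite3


def get_or_create_id(conn, payload: bytes) -> int:
    """Return the primary key for payload's SHA-1, inserting the row if needed."""
    sha_key = hashlib.sha1(payload).hexdigest()
    # Step 1: insert, silently skipping the row if the hash already exists.
    conn.execute("INSERT OR IGNORE INTO tbl (sha_key) VALUES (?)", (sha_key,))
    # Step 2: the row exists either way now, so read its primary key back.
    row = conn.execute(
        "SELECT pk FROM tbl WHERE sha_key = ?", (sha_key,)
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl ("
    " pk INTEGER PRIMARY KEY AUTOINCREMENT,"
    " sha_key TEXT NOT NULL UNIQUE)"
)

first = get_or_create_id(conn, b"hello")
second = get_or_create_id(conn, b"world")
again = get_or_create_id(conn, b"hello")  # duplicate hash: existing pk returned
print(first, second, again)  # 1 2 1
```

It costs two statements per call, but the unique index on sha_key makes the SELECT a single index lookup.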
Actually, INSERT...ON DUPLICATE KEY UPDATE is exactly the right statement to use in your situation. When you use ON DUPLICATE, if the insert happens without duplicate, JDBC returns count of 1 and the ID of the newly inserted row. If the action taken is an update due to duplicate, JDBC returns count of 2 and both the ID of the original row AND the newly generated ID, even though the new ID is never actually inserted into the table.
You can get the correct key by calling PreparedStatement.getGeneratedKeys(). The first key is pretty much always the one you are interested in. For this statement:
INSERT INTO table (a,b,c) VALUES (1,2,3) ON DUPLICATE KEY UPDATE c=3;
You can get the inserted or updated ID by calling:
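// The statement must have been prepared with Statement.RETURN_GENERATED_KEYS.
Long key = null;
try (ResultSet keys = preparedStatement.getGeneratedKeys()) {
    if (keys.next()) {
        key = keys.getLong("GENERATED_KEY"); // first key: the inserted or updated row
    }
}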

Maintaining a large table of unique values in MySQL

This is probably a common situation, but I couldn't find a specific answer on SO or Google.
I have a large table (>10 million rows) of friend relationships on a MySQL database that is very important and needs to be maintained such that there are no duplicate rows. The table stores the user's uids. The SQL for the table is:
CREATE TABLE possiblefriends(
id INT NOT NULL AUTO_INCREMENT,
PRIMARY KEY(id),
user INT,
possiblefriend INT)
The way the table works is that each user has around 1000 or so "possible friends" that are discovered and need to be stored, but duplicate "possible friends" need to be avoided.
The problem is, due to the design of the program, over the course of a day, I need to add 1 million rows or more to the table that may or may not be duplicate row entries. The simple answer would seem to be to check each row to see if it is a duplicate, and if not, then insert it into the table. But this technique will probably get very slow as the table size increases to 100 million rows, 1 billion rows or higher (which I expect it to soon).
What is the best (i.e. fastest) way to maintain this unique table?
I don't need to have a table with only unique values always on hand. I just need it once-a-day for batch jobs. In this case, should I create a separate table that just inserts all the possible rows (containing duplicate rows and all), and then at the end of the day, create a second table that calculates all the unique rows in the first table?
If not, what is the best way for this table long-term?
(If indexes are the best long-term solution, please tell me which indexes to use)
Add a unique index on (user, possiblefriend) then use one of:
INSERT ... ON DUPLICATE KEY UPDATE ...
INSERT IGNORE
REPLACE
to ensure that you don't get errors when you try to insert a duplicate row.
You might also want to consider if you can drop your auto-incrementing primary key and use (user, possiblefriend) as the primary key. This will decrease the size of your table and also the primary key will function as the index, saving you from having to create an extra index.
See also:
“INSERT IGNORE” vs “INSERT … ON DUPLICATE KEY UPDATE”
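A rough sketch of the composite primary key suggestion combined with a batched, duplicate-skipping insert. Python's sqlite3 stands in for MySQL here (INSERT OR IGNORE instead of INSERT IGNORE), and the sample batch is invented; a real daily load would be far larger and sent in chunks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No surrogate id: the (user, possiblefriend) pair itself is the primary key,
# so the key's index doubles as the uniqueness check.
conn.execute(
    "CREATE TABLE possiblefriends ("
    ' "user" INT NOT NULL,'
    " possiblefriend INT NOT NULL,"
    ' PRIMARY KEY ("user", possiblefriend))'
)

# A batch that contains repeats, as the incoming data may.
batch = [(1, 2), (1, 3), (1, 2), (2, 1), (1, 3)]

with conn:  # one transaction for the whole batch
    conn.executemany(
        'INSERT OR IGNORE INTO possiblefriends ("user", possiblefriend) '
        "VALUES (?, ?)",
        batch,
    )

stored = conn.execute(
    'SELECT "user", possiblefriend FROM possiblefriends ORDER BY 1, 2'
).fetchall()
print(stored)  # [(1, 2), (1, 3), (2, 1)]
```

The database does the duplicate check through the key's index, so there is no per-row SELECT before each insert.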
A unique index guarantees that each (user, possiblefriend) pair is stored only once. You can add one like so:
CREATE TABLE possiblefriends(
id INT NOT NULL AUTO_INCREMENT,
user INT,
possiblefriend INT,
PRIMARY KEY (id),
UNIQUE INDEX DefUserID_UNIQUE (user ASC, possiblefriend ASC))
This will also speed up lookups on those columns significantly.
Your other issue, the mass insert, is a little more tricky. You could use the built-in ON DUPLICATE KEY UPDATE clause, as below:
INSERT INTO table (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE c=c+1;
If column a is declared UNIQUE and already contains the value 1, this has the same effect as:
UPDATE table SET c=c+1 WHERE a=1;

In MySQL, how do I write a query to skip a duplicate row while inserting, when there's no unique field (primary key)?

I am trying to insert rows into a table that has no unique field or primary key. How can I write a query that will simply ignore the insert if there already exists a row with the exact same values on all fields -- a duplicate row?
Thanks.
You must have a primary key or unique key defined on some column or columns in the table for uniqueness to have any meaning. Every mechanism for detecting duplicates automatically relies on this being true.
You can't rely on the SELECT COUNT(*)... solution on its own, because it's subject to race conditions. That is, someone could insert a duplicate row in the moment after you select and before you insert. Without a unique key, the only way around this is to take a lock first, with SELECT ... FOR UPDATE or LOCK TABLES.
Uh, why not make a primary key?
Otherwise, you basically have to do SELECT COUNT(*) FROM table WHERE field1=value AND ... AND fieldN=value before EVERY insert.