Manually increment primary key - Transaction and race condition - MySQL

This may not be a real world issue but is more like a learning topic.
I use PHP, MySQL and PDO, and I know all about auto_increment and lastInsertId(). Suppose that the primary key has no auto_increment attribute, so we have to use something like SELECT MAX(id) FROM table to retrieve the last id, increment it manually, and then INSERT INTO table (id) VALUES (:lastIdPlusOne). The whole thing is wrapped in beginTransaction and commit.
Is this approach safe? If users A and B load this script at the same time, what happens in the end? Will both transactions fail? Or will both succeed (for instance, if the last id was 10, A inserts 11 and B inserts 12)?
Note that since I am a PHP and MySQL developer, I am mostly interested in MySQL's behavior in this case.

If both read the same max, then the one that inserts first will succeed, and the other(s) will fail with a duplicate-key error.
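The outcome described above can be reproduced with a small sketch. It is not MySQL: Python's built-in sqlite3 module stands in for the database, a single connection plays both users, and the interleaving is simulated simply by ordering the statements. The table name `items` and the helper names are hypothetical.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; "items" is a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO items (id) VALUES (10)")

def read_max_id(db):
    """Return the current MAX(id), as each client would before inserting."""
    return db.execute("SELECT MAX(id) FROM items").fetchone()[0]

def try_insert(db, new_id):
    """Attempt the INSERT; return True on success, False on a duplicate key."""
    try:
        db.execute("INSERT INTO items (id) VALUES (?)", (new_id,))
        return True
    except sqlite3.IntegrityError:
        return False

# Both clients read the maximum before either one has inserted.
max_seen_by_a = read_max_id(conn)
max_seen_by_b = read_max_id(conn)

result_a = try_insert(conn, max_seen_by_a + 1)  # first attempt to insert id 11
result_b = try_insert(conn, max_seen_by_b + 1)  # second attempt to insert id 11
```

Because both reads happen before either insert, both clients compute the same new id, and only the primary key constraint stops the second row from being stored.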
To overcome this issue without using auto_increment fields, you may use a BEFORE INSERT trigger that does the job (setting new.id from the current max), i.e. the same logic, but in a trigger, so that the DB server is the one that controls it.
Not sure though if this is 100% safe in a master-master replication environment in case of a server failure.

This is @eggyal's comment, which I quote here:
You must ensure that you use a locking read to fetch the MAX() in the first (select) query; it will then block until the transaction is committed. However, this is very poor design and should not be used in a production system.

Related

Are there any disadvantages of a unique column in MySQL?

I'd like to ask a question regarding unique columns in MySQL.
I would like to ask the experts which is the better way to approach this problem, and what the advantages or disadvantages are, if there are any.
Set a varchar column as unique
Do a SQL INSERT IGNORE
If affected rows > 0 proceed with running the code
versus
Leave a varchar column as not-unique
Do a search query to look for identical value
If no rows are returned by the query, do a SQL INSERT
proceed with running the code
Neither of the 2 approaches is good.
You don't do INSERT IGNORE nor do you search. The searching part is also unreliable, because it fails at concurrency and compromises the integrity. Imagine this scenario: you and I try to insert the same info into the database. We connect at the same time. Code in question determines that there's no such record in the database, for both of us. We both insert the same data. Now your column isn't unique, therefore we'll end up with 2 records that are the same - your integrity now fails.
What you do is set the column to unique, insert and catch the exception in the language of your choice.
MySQL will reject the INSERT if the record is a duplicate, and any proper db driver for MySQL will surface this as an exception.
Since you haven't mentioned what the language is, it's difficult to move forward with examples.
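As an illustration only, here is a minimal sketch of "set the column to unique, insert, and catch the exception". It assumes Python with the built-in sqlite3 module as a stand-in for MySQL; the table `users`, the column `email`, and the function `insert_if_new` are hypothetical names. With a MySQL driver the exception class would differ, but the shape of the code would be the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "users" and "email" are hypothetical names used only for illustration.
conn.execute("CREATE TABLE users (email VARCHAR(100) NOT NULL UNIQUE)")

def insert_if_new(db, email):
    """Insert the value; return True if stored, False if it already existed."""
    try:
        with db:  # commits on success, rolls back if the INSERT raises
            db.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        # The unique index rejected the duplicate.
        return False

first = insert_if_new(conn, "a@example.com")
second = insert_if_new(conn, "a@example.com")
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

The return value plays the role of "affected rows > 0" from the question: the calling code proceeds only when the insert actually stored a new row.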
Defining a column as a unique index has a few advantages. First of all, when you define it as a "unique index", MySQL can optimize the index for unique values (the same as for a primary key): it doesn't have to check whether there are more rows with the same value, so it can use an optimized algorithm for lookups.
You are also assured that there will never be a duplicate entry in your database, instead of having to handle this in multiple places in your code.
When you don't define it as UNIQUE, you first need to check whether a record exists in your table and then insert it, which requires 2 queries (and even a full table lock) instead of 1. This decreases performance and is more error-prone.
http://dev.mysql.com/doc/refman/5.0/en/constraint-primary-key.html
I'm leaving aside the fact that you would use INSERT IGNORE, which ignores the error when the entry already exists in the database (you could still use it for high-performance operations in some special case). A normal INSERT will tell you whether an entry already exists.
Putting a constraint like UNIQUE on the column is better for query performance and data reliability, but there is a trade-off when it comes to writes, so it's up to you which you prefer. In your case, since you are effectively doing an "insert if not exists" query anyway, I think it's better to just use the constraint.

SQL standard UPSERT call

I'm looking for a standard SQL "UPSERT" statement: a single call that inserts, or updates if the row already exists.
I'm looking for a working, efficient and cross-platform call.
I've seen MERGE, UPSERT, REPLACE, and INSERT .. ON DUPLICATE UPDATE, but none of them meets my needs.
BTW, I use MySQL, and HSQLDB for unit tests. I understand that HSQLDB is limited and may not cover what I need, but I couldn't find a standard way even without it.
A statement that works only on MySQL and HSQLDB would also be enough for now.
I've been looking around for a while and couldn't get an answer.
My table:
CREATE TABLE MY_TABLE (
MY_KEY varchar(50) NOT NULL ,
MY_VALUE varchar(50) DEFAULT NULL,
TIME_STAMP bigint NOT NULL,
PRIMARY KEY (MY_KEY)
);
Any idea?
The only solution that is supported by both MySQL and HSQLDB is to query the rows you intend to replace, and conditionally either INSERT or UPDATE. This means you have to write more application code to compensate for the differences between RDBMS implementations.
START TRANSACTION.
SELECT ... FOR UPDATE.
If the SELECT finds rows, then UPDATE.
Else, INSERT.
COMMIT.
MySQL doesn't support the ANSI SQL MERGE statement. It supports REPLACE and INSERT...ON DUPLICATE KEY UPDATE. See my answer to "INSERT IGNORE" vs "INSERT ... ON DUPLICATE KEY UPDATE" for more on that.
Re comments: Yes, another approach is to just try the INSERT and see if it succeeds. Otherwise, do an UPDATE. If you attempt the INSERT and it hits a duplicate key, it'll generate an error, which turns into an exception in some client interfaces. The disadvantage of doing this in MySQL is that it generates a new auto-increment ID even if the INSERT fails. So you end up with gaps. I know gaps in auto-increment sequence are not ordinarily something to worry about, but I helped a customer last year who had gaps of 1000-1500 in between successful inserts because of this effect, and the result was that they exhausted the range of an INT in their primary key.
As @baraky says, one could instead attempt the UPDATE first, and if that affects zero rows, then do the INSERT instead. My comment on this strategy is that UPDATEing zero rows is not an exception -- you'll have to check for "number of rows affected" after the UPDATE to know whether it "succeeded" or not.
But querying the number of rows affected returns you to the original problem: you have to use different queries in MySQL versus HSQLDB.
HSQLDB:
CALL DIAGNOSTICS(ROW_COUNT);
MySQL:
SELECT ROW_COUNT();
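The UPDATE-first strategy discussed above can be sketched from application code, where the driver usually reports the affected-row count itself, so no vendor-specific row-count query is needed. This is a sketch under assumptions, not a definitive implementation: Python's built-in sqlite3 module stands in for MySQL/HSQLDB, `cursor.rowcount` is the Python DB-API attribute, and the function name `upsert` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MY_TABLE ("
    " MY_KEY varchar(50) NOT NULL,"
    " MY_VALUE varchar(50) DEFAULT NULL,"
    " TIME_STAMP bigint NOT NULL,"
    " PRIMARY KEY (MY_KEY))"
)

def upsert(db, key, value, time_stamp):
    """UPDATE first; INSERT only if no row was affected. Returns the action taken."""
    with db:  # one transaction around both statements
        cur = db.execute(
            "UPDATE MY_TABLE SET MY_VALUE = ?, TIME_STAMP = ? WHERE MY_KEY = ?",
            (value, time_stamp, key),
        )
        if cur.rowcount > 0:
            return "updated"
        db.execute(
            "INSERT INTO MY_TABLE (MY_KEY, MY_VALUE, TIME_STAMP) VALUES (?, ?, ?)",
            (key, value, time_stamp),
        )
        return "inserted"
```

Two caveats apply. If another session inserts the same key between the UPDATE and the INSERT, the INSERT fails with a duplicate-key error that the caller has to catch or retry. Also, MySQL by default counts rows actually changed rather than rows matched, so an UPDATE that sets identical values can report zero affected rows.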
The syntax for doing an upsert in a single command varies by RDBMS.
MySQL: INSERT … ON DUPLICATE KEY UPDATE
HSQLDB: MERGE
Postgres: INSERT … ON CONFLICT …
See Wikipedia for more.
If you want a cross platform solution, then you'll need to use multiple commands. First check for the existing row, then conditionally insert or update as appropriate.

MySQL and implementing something close to sequences?

I am in the process of moving from Oracle to MySQL and would like some advice on whether the way I am implementing something similar to sequences in MySQL is a good one.
My current plan is to have a separate table in MySQL for each Oracle sequence, with a single column representing last_number, and to increment that column whenever I insert a new row. Another way would be to create a single table with one row per sequence and increment the relevant row whenever I do an insert.
A simpler way would be to just do a select max()+1 on the relevant column when inserting data.
I'm thinking of switching to the select max()+1 option as it seems simpler to implement, but I would like some advice on which of these options is best, and on any pitfalls of select max()+1 that I am not currently aware of.
Also, the reason I am not using auto_increment and the function last_insert_id() is that I want to follow the ANSI standard.
Thanks.
First of all: The max()+1 version is NOT guaranteed to give you a sequence, if you use transactions in a high isolation level.
The way we typically implement sequences (if we can't avoid them) is to create a table with an AUTO_INCREMENT column, INSERT INTO it, SELECT last_insert_id(), then DELETE FROM table WHERE field<$LASTINSERTID. This is of course done in a stored procedure.
There is a read consistency problem, in that two sessions both running ...
insert into ... select max(..)+1 from ...
... at the same time both see the same value of max(...), hence they both try to insert the same new value.
You have the same problem with your table-of-maxima method, and you have to use a locking mechanism to prevent multiple sessions from reading the same value. This leads to a concurrency problem where inserts to the table are serialised.
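A minimal sketch of the single-table variant with locking follows. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, where `BEGIN IMMEDIATE` plays the role of the lock; in MySQL with InnoDB, the UPDATE itself takes a row lock that is held until commit, which is why the counter is incremented before it is read. The names `sequences`, `orders_seq` and `next_value` are hypothetical.

```python
import sqlite3

# isolation_level=None lets the code issue BEGIN/COMMIT explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE sequences (name VARCHAR(50) PRIMARY KEY, last_number BIGINT NOT NULL)"
)
conn.execute("INSERT INTO sequences (name, last_number) VALUES ('orders_seq', 0)")

def next_value(db, name):
    """Increment the named counter and return the new value in one transaction."""
    db.execute("BEGIN IMMEDIATE")  # take the write lock before touching the row
    try:
        cur = db.execute(
            "UPDATE sequences SET last_number = last_number + 1 WHERE name = ?",
            (name,),
        )
        if cur.rowcount != 1:
            raise KeyError(name)  # no such sequence row
        value = db.execute(
            "SELECT last_number FROM sequences WHERE name = ?", (name,)
        ).fetchone()[0]
        db.execute("COMMIT")
        return value
    except Exception:
        db.execute("ROLLBACK")
        raise
```

Every caller has to wait for the lock, which is exactly the serialisation cost mentioned above.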

Is there any alternative to "last_update_ID()" for mySQL?

I am currently working on a big web project using ASP and MySQL.
When inserting into multiple tables I've been using last_update_ID(), but after some research I've found that this SQL statement isn't safe.
So, the problem:
I use two different computers, with different internet connections.
Both computers are logged onto the system I am currently building. I have made a page that prints the connection_id(), and last_update_id.
If I update any table with one of the computers the other one also gets that last_update_ID.
Both computers have the same connection_ID.
What can I do to get around this?
I don't want to (if it's not necessary) do a select statement after the first INSERT; to search for the row that I inserted, to get the correct ID of that row.
It's not my server I am using so I can't make any large changes of the database.
I guess that this problem occurs because the webpages use the same loginName & password to connect to the database, is that true?
Is there any other alternative to get the last update ID? that is totally safe..
I close every connection at the end of the asp page. but that doesn't change the connection_ID.
The connection ID stays the same for a few minutes even though I open different web pages on the server.
I believe LAST_INSERT_ID() is correct for the current session, so each session receives its own correct value. Either I don't understand your question, or you think you have a problem but you don't.
I am not aware of any LAST_UPDATE_ID() function. For an update, you can easily retrieve the affected rows by SELECTing them with the same WHERE clause (before the update).
reference: http://dev.mysql.com/doc/refman/5.0/en/getting-unique-id.html
For LAST_INSERT_ID(), the most recently generated ID is maintained in the server on a per-connection basis. It is not changed by another client. It is not even changed if you update another AUTO_INCREMENT column with a nonmagic value (that is, a value that is not NULL and not 0). Using LAST_INSERT_ID() and AUTO_INCREMENT columns simultaneously from multiple clients is perfectly valid. Each client will receive the last inserted ID for the last statement that client executed.
If you want to retrieve the LAST_INSERT_ID from an INSERT query with an ON DUPLICATE KEY UPDATE clause, you can also use the LAST_INSERT_ID() function to retrieve the value of the AUTO_INCREMENT column that was updated:
reference: http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html
If a table contains an AUTO_INCREMENT column and INSERT ... UPDATE inserts a row, the LAST_INSERT_ID() function returns the AUTO_INCREMENT value. If the statement updates a row instead, LAST_INSERT_ID() is not meaningful. However, you can work around this by using LAST_INSERT_ID(expr). Suppose that id is the AUTO_INCREMENT column. To make LAST_INSERT_ID() meaningful for updates, insert rows as follows:
INSERT INTO table (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE id=LAST_INSERT_ID(id), c=3;
Your server appears to have connection pooling turned on. What this means is that the database connection is held open after a script finishes, and the next script that comes along uses it, and thus can see any variables that were set on that connection, including LAST_INSERT_ID().
What can't happen is two script instances sharing a connection at the same time. Thus, if your server is busy enough to need to run two script instances at exactly the same time, it will simply create a second database connection, with its own separate LAST_INSERT_ID() variable, and won't interfere with the first.
In short, as long as the INSERT and the LAST_INSERT_ID() request happen within the same script (and you don't somehow close the database connection between them), they're completely safe, as your script has exclusive use of that connection.

Easy mysql question regarding primary keys and an insert

In MySQL, how do I get the primary key used for an insert operation when it is auto-incrementing?
Basically, I want the new auto-incremented value to be returned when the statement completes.
Thanks!
Your clarification comment says that you're interested in making sure that LAST_INSERT_ID() doesn't give the wrong result if another concurrent INSERT happens. Rest assured that it is safe to use LAST_INSERT_ID() regardless of other concurrent activity. LAST_INSERT_ID() returns only the most recent ID generated during the current session.
You can try it yourself:
Open two shell windows, run mysql client in each and connect to database.
Shell 1: INSERT into a table with an AUTO_INCREMENT key.
Shell 1: SELECT LAST_INSERT_ID(), see result.
Shell 2: INSERT into the same table.
Shell 2: SELECT LAST_INSERT_ID(), see result different from shell 1.
Shell 1: SELECT LAST_INSERT_ID() again, see a repeat of earlier result.
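The same experiment can be mimicked in a script. This is only an analogy, not MySQL itself: it uses Python's built-in sqlite3 module, where `last_insert_rowid()` is SQLite's per-connection counterpart of LAST_INSERT_ID(), and two connections to one temporary database file play the two shells. The table `t` and the helper names are hypothetical.

```python
import os
import sqlite3
import tempfile

# A file-backed database so two separate connections can share one table.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
session_1 = sqlite3.connect(db_path)
session_2 = sqlite3.connect(db_path)

session_1.execute("CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, v TEXT)")
session_1.commit()

def insert_row(db, value):
    """Insert one row and commit so the other connection can write next."""
    db.execute("INSERT INTO t (v) VALUES (?)", (value,))
    db.commit()

def last_id(db):
    """Ask the database for the last ID generated on this connection."""
    return db.execute("SELECT last_insert_rowid()").fetchone()[0]

insert_row(session_1, "from session 1")
id_1_first = last_id(session_1)
insert_row(session_2, "from session 2")
id_2 = last_id(session_2)
id_1_again = last_id(session_1)  # not affected by session 2's insert
```

Session 1 keeps seeing its own ID even after session 2 has inserted a later row, which is the behaviour the steps above are meant to show.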
If you think about it, this is the only way that makes sense. All databases that support auto-incrementing key mechanisms must act this way. If the result depends on a race condition with other clients possibly INSERTing concurrently, then there would be no dependable way to get the last inserted ID value in your current session.
MySQL's LAST_INSERT_ID()
The MySQL Docs describe the function: LAST_INSERT_ID()
[select max(primary_key_column_name) from table_name]
Ahhh, not necessarily. I am not a MySQL guy, but there are specific ways to get the last inserted id for the last completed action that are a little more robust than this. What if an insert has happened between you writing to the table and querying it? I know about this because it stung me many moons ago (so yeah, it does happen).
If all else fails read the manual: http://dev.mysql.com/doc/refman/5.0/en/getting-unique-id.html