MySQL and PDO: Could PDO::lastInsertId theoretically fail?

I have been pondering on this for a while.
Consider a web application of huge proportions, where, let's say, millions of SQL queries are performed every second.
I run my code:
$q = $db->prepare('INSERT INTO Table
(First,Second,Third,Fourth)
VALUES (?,?,?,?)');
$q->execute(array($first,$second,$third,$fourth));
Then immediately after, I want to fetch the auto incremented ID of this last query:
$id = $db->lastInsertId();
Is it possible for lastInsertId to fail, i.e. fetch the ID of some SQL insert query that was executed between my two code blocks?
Secondary:
If it can fail, what would be the best way to plug this possible leak?
Would it be safer to create another SQL query to fetch the proper ID from the database, just to be sure?

It will always be safe provided that the PDO implementation is not doing something really bone-headed. The following is from the MySQL information on last_insert_id:
The ID that was generated is maintained in the server on a per-connection basis. This means that the value returned by the function to a given client is the first AUTO_INCREMENT value generated for most recent statement affecting an AUTO_INCREMENT column by that client. This value cannot be affected by other clients, even if they generate AUTO_INCREMENT values of their own. This behavior ensures that each client can retrieve its own ID without concern for the activity of other clients, and without the need for locks or transactions.

No. lastInsertId is per-connection, and doesn't require a request to the server - MySQL always sends it back in its response packet.
So if the execute method doesn't throw an exception, then you are guaranteed to have the right value in lastInsertId.
It won't ever give you the insert ID of anything else, unless your query failed for some reason (e.g. invalid syntax) in which case it might give you the insert ID from the previous one on the same connection. But not anybody else's.
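The per-connection behaviour described above can be sketched with a small experiment. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL/PDO (SQLite's last_insert_rowid() is also tracked per connection), and the table name t and function name demo_per_connection_ids are made up for the example.

```python
import os
import sqlite3
import tempfile


def demo_per_connection_ids():
    """Two connections insert into the same table; each keeps its own last ID."""
    # A file-backed database so that two separate connections can share it.
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    conn_a = sqlite3.connect(path)
    conn_b = sqlite3.connect(path)

    conn_a.execute(
        "CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, who TEXT)"
    )
    conn_a.commit()

    # Connection A inserts one row ...
    conn_a.execute("INSERT INTO t (who) VALUES ('a')")
    conn_a.commit()

    # ... then connection B inserts three rows before A asks for its ID.
    for _ in range(3):
        conn_b.execute("INSERT INTO t (who) VALUES ('b')")
        conn_b.commit()

    # Each connection reports the ID generated by its own last insert.
    id_a = conn_a.execute("SELECT last_insert_rowid()").fetchone()[0]
    id_b = conn_b.execute("SELECT last_insert_rowid()").fetchone()[0]

    conn_a.close()
    conn_b.close()
    return id_a, id_b
```

Connection A still sees the ID of its own row even though B inserted afterwards, which is the same guarantee the MySQL manual describes for LAST_INSERT_ID().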

Related

MYSQL last_insert_id() and concurrency

I have a simple MySQL question. If I run a query that contains LAST_INSERT_ID() right after an INSERT query, on a web page where many concurrent users are accessing other pages that also perform INSERT operations, could the value of LAST_INSERT_ID() be adulterated/corrupted?
No, it will return the insert id from the current connection. As long as your script hasn't made any other inserts, you will get the one you want.
Also be aware that this will only return a generated ID (e.g. an auto-increment). If you are creating your own IDs, it won't return them to you.

Is it possible to get LAST_INSERT_ID() from different database?

Suppose, that we have 2 databases: a and b, and tables a.test1 and b.test2.
If I need to insert a row into table a.test1, and return LAST_INSERT_ID() to insert into b.test2, will LAST_INSERT_ID() return value from another database? Is it reliable?
I didn't find anything in the manual, but @@IDENTITY depends on the client session, so it should be portable between two databases, shouldn't it?
LAST_INSERT_ID() always gives you the id of the row inserted by the last INSERT statement you executed on the current connection, irrespective of what table (and what database!) that row went into.
From the MySQL documentation: The ID that was generated is maintained in the server on a per-connection basis. This means that the value returned by the function to a given client is the first AUTO_INCREMENT value generated for most recent statement affecting an AUTO_INCREMENT column by that client. This value cannot be affected by other clients, even if they generate AUTO_INCREMENT values of their own. This behavior ensures that each client can retrieve its own ID without concern for the activity of other clients, and without the need for locks or transactions.
In short, both LAST_INSERT_ID() and mysql_insert_id() work as advertised, i.e. they will retrieve the last ID inserted into any table during the current session/connection.

MySQL performance: letting a UNIQUE field generate an error or manually checking it

Theoretical question about the impact on performance.
One of the fields in my table is unique. For instance, email_address in the Users table.
What has less of an impact on performance? Attempting to add an already existing email address and getting the error, or doing a search on the email field?
The UNIQUE field will probably be faster.
If you tell MySQL that a certain field is unique, it may perform some optimizations.
Additionally, if you want to insert the record if it isn't in the table already you might run into some concurrency issues. Assume there are two people trying to register with the same email address. Now, if you perform the uniqueness check yourself something like so:
bool exists = userAlreadyExists(email);
if (exists)
    showWarning();
else
    insertUser(email);
something like the following might happen:
User 1 executes userAlreadyExists("foo@example.com") // returns false
User 2 executes userAlreadyExists("foo@example.com") // returns false
User 1 executes insertUser("foo@example.com")
User 2 executes insertUser("foo@example.com") // which is now a duplicate
If you let MySQL perform the uniqueness check, the above won't happen.
If you check then update, you have to query the database twice, and in turn it will check the table index twice. You pay both network overhead and database processing overhead.
My point of view is that you have to be optimistic: update, and gracefully handle the potential failure if there are duplicate values.
The two-step approach has one other drawback: don't forget there will be concurrent access to your database. Depending on your database setup (isolation level, database engine), the DB may have been modified by another connection between the SELECT and your UPDATE.
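The optimistic approach recommended above (attempt the write and handle the duplicate error) might look like the following minimal sketch. It assumes Python's built-in sqlite3 module in place of MySQL; the users table and the helper names make_users_db and register are hypothetical. In MySQL the same situation surfaces as a duplicate-entry error (error code 1062) rather than sqlite3.IntegrityError.

```python
import sqlite3


def make_users_db():
    """Create an in-memory database with a UNIQUE email column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)"
    )
    return conn


def register(conn, email):
    """Try to insert; return False instead of raising if the email exists."""
    try:
        # The connection context manager commits on success
        # and rolls back if the statement raises.
        with conn:
            conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        # The database rejected the row because of the UNIQUE constraint.
        return False
```

There is no separate existence check here, so there is no window between a check and the insert in which another client could slip in the same address.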

Manually increment primary key - Transaction and race condition

This may not be a real world issue but is more like a learning topic.
Using PHP, MySQL and PDO, I know all about auto_increment and lastInsertId(). Suppose that the primary key has no auto_increment attribute and we have to use something like SELECT MAX(id) FROM table to retrieve the last id, increment it manually, and then INSERT INTO table (id) VALUES (:lastIdPlusOne). The whole code is wrapped in beginTransaction and commit.
Is this approach safe? If users A and B load this script at the same time, what will happen in the end? Will both transactions fail? Or will both succeed (for instance, if the last id was 10, A will insert 11 and B will insert 12)?
Note that since I am a PHP & MySQL developer, I am more interested in MySQL behavior in this case.
If both got the same max, then the one that inserts first will succeed, and other(s) will fail.
To overcome this issue without using auto_increment fields, you may use a BEFORE INSERT trigger that does the job (new.id=max), i.e. the same logic, but in a trigger, so the DB server is the one that controls it.
Not sure though if this is 100% safe in a master-master replication environment in case of a server failure.
This is @eggyal's comment, which I quote here:
You must ensure that you use a locking read to fetch the MAX() in the first (select) query; it will then block until the transaction is committed. However, this is very poor design and should not be used in a production system.
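The failure mode described above, where both clients read the same MAX(id) and the second insert fails, can be replayed deterministically. This is a sketch under assumptions: it uses Python's sqlite3 module and a single connection to imitate the interleaving, so it shows the ordering problem rather than real concurrency or MySQL's locking behaviour. The items table and the function name are invented for the example.

```python
import sqlite3


def simulate_max_plus_one_race():
    """Replay the interleaving: both clients read MAX(id) before either inserts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
    conn.execute("INSERT INTO items (id) VALUES (10)")

    # Step 1: A and B both compute "last id + 1" from the same snapshot.
    next_a = conn.execute("SELECT MAX(id) FROM items").fetchone()[0] + 1
    next_b = conn.execute("SELECT MAX(id) FROM items").fetchone()[0] + 1

    # Step 2: each tries to insert the id it computed.
    outcomes = []
    for who, new_id in (("A", next_a), ("B", next_b)):
        try:
            conn.execute("INSERT INTO items (id) VALUES (?)", (new_id,))
            outcomes.append((who, new_id, "ok"))
        except sqlite3.IntegrityError:
            # The primary key rejects the second insert of the same id.
            outcomes.append((who, new_id, "duplicate key"))

    conn.close()
    return outcomes
```

Both clients compute 11; whichever inserts first wins and the other gets a primary key violation, which matches the answer above.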

Is there any alternative to "last_update_ID()" for mySQL?

I am currently working on a big web project using ASP and MySQL.
When inserting into multiple tables I've been using last_update_ID(), but after some research I've found that that SQL statement isn't safe.
So, the problem:
I use two different computers, with different internet connections.
Both computers are logged onto the system I am currently building. I have made a page that prints the connection_id(), and last_update_id.
If I update any table with one of the computers the other one also gets that last_update_ID.
Both computers have the same connection_ID.
What can I do to get around this?
I don't want to (if it's not necessary) run a SELECT statement after the first INSERT to search for the row that I inserted, just to get the correct ID of that row.
It's not my server I am using so I can't make any large changes of the database.
I guess that this problem occurs because the webpages use the same loginName & password to connect to the database, is that true?
Is there any other alternative for getting the last update ID that is totally safe?
I close every connection at the end of the ASP page, but that doesn't change the connection_ID.
The connection ID stays the same for a few minutes, even though I open different web pages on the server.
I believe LAST_INSERT_ID() is correct for the current session, so each session receives its own correct value. Either I don't understand your question or you think you have a problem but you don't.
I am not aware of any LAST_UPDATE_ID() function. For an update, you can easily retrieve the affected rows by SELECTing them with the same WHERE clause (before the update).
reference: http://dev.mysql.com/doc/refman/5.0/en/getting-unique-id.html
For LAST_INSERT_ID(), the most recently generated ID is maintained in the server on a per-connection basis. It is not changed by another client. It is not even changed if you update another AUTO_INCREMENT column with a nonmagic value (that is, a value that is not NULL and not 0). Using LAST_INSERT_ID() and AUTO_INCREMENT columns simultaneously from multiple clients is perfectly valid. Each client will receive the last inserted ID for the last statement that client executed.
If you want LAST_INSERT_ID() to be meaningful after an INSERT query with an ON DUPLICATE KEY UPDATE clause, you can use LAST_INSERT_ID(expr) so that it also returns the AUTO_INCREMENT value of a row that was updated:
reference: http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html
If a table contains an AUTO_INCREMENT column and INSERT ... UPDATE inserts a row, the LAST_INSERT_ID() function returns the AUTO_INCREMENT value. If the statement updates a row instead, LAST_INSERT_ID() is not meaningful. However, you can work around this by using LAST_INSERT_ID(expr). Suppose that id is the AUTO_INCREMENT column. To make LAST_INSERT_ID() meaningful for updates, insert rows as follows:
INSERT INTO table (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE id=LAST_INSERT_ID(id), c=3;
Your server appears to have connection pooling turned on. What this means is that the database connection is held open after a script finishes, and the next script that comes along uses it, and thus can see any variables that were set on that connection, including LAST_INSERT_ID().
What can't happen is two script instances sharing a connection at the same time. Thus, if your server is busy enough to need to run two script instances at exactly the same time, it will simply create a second database connection, with its own separate LAST_INSERT_ID() variable, and won't interfere with the first.
In short, as long as the INSERT and the LAST_INSERT_ID() request happen within the same script (and you don't somehow close the database connection between them), they're completely safe, as your script has exclusive use of that connection.
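The pooling effect described above, where a later script sees a leftover value because it inherited the same connection, can be imitated in a few lines. This is a rough sketch, assuming Python's sqlite3 module as a stand-in for a pooled MySQL connection; script_one and script_two are hypothetical functions standing in for two page requests that run one after the other.

```python
import sqlite3


def script_one(conn):
    """First request: inserts a row and reads its own generated ID."""
    conn.execute("INSERT INTO t (v) VALUES ('x')")
    return conn.execute("SELECT last_insert_rowid()").fetchone()[0]


def script_two(conn):
    """Later request: reads the last ID without inserting anything itself."""
    return conn.execute("SELECT last_insert_rowid()").fetchone()[0]


def pooled_connection_demo():
    # One long-lived connection, reused sequentially like a pooled connection.
    pooled = sqlite3.connect(":memory:")
    pooled.execute(
        "CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, v TEXT)"
    )
    own_id = script_one(pooled)
    leftover_id = script_two(pooled)
    pooled.close()
    return own_id, leftover_id
```

The second script sees the first script's ID only because it never ran an INSERT of its own; a script that inserts and then immediately reads the ID on the same connection always gets its own value.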