How to test MySQL transactions?

I have a question about testing the queries in a transaction. I've been using MySQL transactions for quite some time now, and every time I do this, I use something like:
$doCommit = true;
$error = "";
mysql_query("BEGIN");

/* repeat this part with the different queries in the transaction;
   this often involves updating of and inserting in multiple tables */
$query = "SELECT, UPDATE, INSERT, etc";
$result = mysql_query($query);
if (!$result) {
    $error .= mysql_error() . " in " . $query . "<BR>";
    $doCommit = false;
}
/* end of repeating part */

if ($doCommit) {
    mysql_query("COMMIT");
} else {
    echo $error;
    mysql_query("ROLLBACK");
}
Now, it often happens that I want to test my transaction, so I change mysql_query("COMMIT"); to mysql_query("ROLLBACK");, but I can imagine this is not a very good way to test this kind of thing. It's usually not really feasible to copy every table to a temp_table, update and insert into those tables, and delete them afterwards (for instance because the tables may be very large). Of course, when the code goes into production, relevant error handling (instead of just printing the error) is put into place.
What's the best way to do stuff like this?

First of all, there is a bug in your implementation. If a query errors out, the current transaction is automatically rolled back and then closed. So as you continue to execute queries, they will not be within a transaction (they will be committed to the DB). Then, when you execute ROLLBACK, it will silently fail. From the MySQL docs:
Rolling back can be a slow operation that may occur implicitly without the user
having explicitly asked for it (for example, when an error occurs).
The explicit command ROLLBACK should only be used if you determine in the application that you need to roll back (for reasons other than a query error). For example, if you're deducting funds from an account, you'd explicitly roll back if you found out the user didn't have enough funds to complete the exchange...
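A minimal sketch of that kind of application-level rollback using PDO ($db is assumed to be an existing PDO connection; the accounts table and variable names are just illustrations):

$db->beginTransaction();
try {
    // lock the row so the balance cannot change under us
    $stmt = $db->prepare("SELECT balance FROM accounts WHERE id = ? FOR UPDATE");
    $stmt->execute([$accountId]);
    $balance = (float) $stmt->fetchColumn();

    if ($balance < $amount) {
        // business rule failed: the queries are fine, we just do not want the change
        $db->rollBack();
    } else {
        $db->prepare("UPDATE accounts SET balance = balance - ? WHERE id = ?")
           ->execute([$amount, $accountId]);
        $db->commit();
    }
} catch (PDOException $e) {
    $db->rollBack(); // a query failed, undo everything
    throw $e;
}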
As far as testing the transactions, I do copy the database. I create a new database and install a set of "dummy data". Then I run all the tests using an automated tool. The tool will actually commit the transactions and force rollbacks, and check that the expected database state is maintained throughout the tests. Since it's harder to programmatically know the end state of a transaction if you have an unknown input to the transaction, testing off of live (or even copied-from-live) data is not going to be easy. You can do it (and should), but don't depend upon those results for determining if your system is working. Use those results to build new test cases for the automated tester...
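A rough sketch of what one such automated check could look like (PHPUnit style; the transferFunds() function, test database, and table layout are assumptions for illustration):

class TransferTest extends PHPUnit\Framework\TestCase
{
    public function testFailedTransferLeavesBalancesUntouched(): void
    {
        $db = new PDO('mysql:host=localhost;dbname=myapp_test', 'test', 'test');
        $db->exec("UPDATE accounts SET balance = 100 WHERE id IN (1, 2)"); // known dummy state

        transferFunds($db, 1, 2, 500); // more than account 1 holds, so it should roll back

        $balance = $db->query("SELECT balance FROM accounts WHERE id = 1")->fetchColumn();
        $this->assertEquals(100, $balance, 'a rolled-back transfer must not change the balance');
    }
}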

Maybe you could refactor your first example and use some DB access wrapper class?
In that wrapper class you can have a variable $normalCommit = true;
and a method SetCommitMode() which sets that $normalCommit variable.
And you have a method Commit() which only commits if ($normalCommit == true).
Or even have a flag $failTransaction which makes the wrapper call mysql_query("ROLLBACK"); instead, if you wish (so you could pass/fail many sequential tests).
Then when you run the test, you can set somewhere in the test code file:
$myDBClass->SetCommitMode(false);
or
$myDBClass->RollBackNextOperation(true);
before the operation which you wish to fail, and it will just fail. That way the code you are testing will not contain those fail/commit checks; only the DB class will contain them.
And normally ONLY the test code (especially if you do unit testing) should call those SetCommitMode and RollBackNextOperation methods, so you do not accidentally leave those calls in the production code.
Or you could pass some deliberately bad data to the method you are testing, like negative values to be saved in UNSIGNED fields; the transaction should then always fail, as long as your code does not commit after such an SQL error (which it should not).
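A minimal sketch of such a wrapper, pieced together from the method names used above (illustrative only, not a finished class):

class MyDBClass
{
    private $normalCommit = true;  // commit normally unless a test turns this off
    private $rollBackNext = false; // force a rollback for the next operation only

    public function SetCommitMode($mode)         { $this->normalCommit = (bool) $mode; }
    public function RollBackNextOperation($flag) { $this->rollBackNext = (bool) $flag; }

    public function Begin() { mysql_query("BEGIN"); }

    public function Commit()
    {
        if ($this->normalCommit && !$this->rollBackNext) {
            mysql_query("COMMIT");
        } else {
            mysql_query("ROLLBACK");     // test mode: throw the changes away
            $this->rollBackNext = false; // only skip this one operation
        }
    }
}

Production code just calls Begin() and Commit(); only the test code flips SetCommitMode(false) or RollBackNextOperation(true) first, exactly as in the calls above.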

Generally I use something like this (I use PDO for my example):
$db->beginTransaction();
try {
    $db->exec('INSERT/DELETE/UPDATE');
    $db->commit();
}
catch (PDOException $e) {
    $db->rollBack();
    // rethrow the exception or handle it here
}
Or, if you have your own exception handler, use a special clause for your PDOExceptions in which you roll back the execution. Example:
function my_exception_handler($exception) {
    if ($exception instanceof PDOException) {
        // assuming you have a registry class
        Registry::get('database')->rollBack();
    }
}
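For completeness, the handler above only runs if it is registered as the global exception handler:

set_exception_handler('my_exception_handler');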

Related

Cannot Persist Further Operations After Rolling Back a Dry Run Transaction

I have an artisan command in which I am cleaning up some data that has gone bad. Before I actually delete the data, I want to do a dry run and show some of the implications that deleting that data may present.
The essence of my command is:
public function handle()
{
    ...
    $this->dryRun($modelsToDelete); // Prints info to user
    if ($this->confirm('Are you sure you want to delete?')) {
        $modelsToDelete->each->forceDelete();
    }
    ...
}

public function dryRun($modelsToDelete)
{
    ...
    DB::connection($connection)->beginTransaction();

    $before = $this->findAllOrphans($models);

    $modelsToDelete->each(function ($record) use ($bar) {
        $record->forceDelete();
    });

    $after = $this->findAllOrphans($models);

    DB::connection($connection)->rollBack();

    // Print info about diff
    ...
}
The problem is that when I do the dry run and confirm the delete, the actual operation does not persist in the database. If I comment out the dry run and run the command, the operation does persist. I have checked DB::transactionLevel() before and after the dry run and the real operation, and everything seems correct.
I have also tried using DB::connection($connection)->pretend(...), but still the same issue. I also tried doing DB::purge($connection) and DB::reconnect($connection) after rolling back.
Does anyone have any thoughts as to what is going on?
(Using Laravel v6.20.14)
After digging into the source code, I found out that Laravel sets the "exists" property to false after you call delete on a model instance, and it will not perform the delete query again. You can reference:
https://github.com/laravel/framework/blob/9edd46fc6dcd550e4fd5d081bea37b0a43162165/src/Illuminate/Database/Eloquent/Model.php#L1173
https://github.com/laravel/framework/blob/9edd46fc6dcd550e4fd5d081bea37b0a43162165/src/Illuminate/Database/Eloquent/Model.php#L1129
To make the model instances deletable again after dryRun, you should pass a deep copy to dryRun, for example:
$this->dryRun(unserialize(serialize($modelsToDelete)));
Note: don't use PHP clone because it creates a shallow copy.
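For illustration, the difference in a nutshell (variable name taken from the question):

$shallow = clone $modelsToDelete;                   // new collection object, but the same model instances inside
$deep    = unserialize(serialize($modelsToDelete)); // new model instances, so deleting them leaves the originals' "exists" flag alone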
Turn on MySQL's "General log".
Run the experiment that is giving you trouble.
Turn off that log.
The problem may be obvious in the log; if not, show us the log.
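For reference, the general log can be toggled at runtime from any sufficiently privileged connection; a minimal PDO sketch ($db is an assumed existing connection):

$db->exec("SET GLOBAL general_log = 'ON'");  // start logging every statement the server receives
// ... run the dry run / delete experiment here ...
$db->exec("SET GLOBAL general_log = 'OFF'"); // stop logging again
// the log destination is reported by: SHOW VARIABLES LIKE 'general_log%'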

During a transaction, am I able to use the updated (but not yet committed) values?

I am most likely overcomplicating this. I am fairly confident with MySQL but have never used transactions before. I know the concept is begin(), do stuff, then commit(), or rollback() on a failure, and I am pretty sure I can structure that with ease.
What I want to find out is whether, during a transaction, I can update a table and then use that updated value in another query within the same transaction. Here's an outline:
begin()
INSERT
SELECT FROM INSERT
UPDATE BASED ON SELECT
commit()
Obviously I have slimmed down the code here and this on its own means nothing. I would like to know if this concept works before I go too deep into transactions and find it doesn't work.
My actual transaction is going to be about 5 times larger and parts of it rely on other parts of the unfinished transaction as above.
I am using Laravel, so my code uses DB::beginTransaction(), DB::commit() and DB::rollback(), if this makes any difference to the question.
So, the simple answer to this is yes.
I ran through a number of tests with the code I had been developing, and this is the state of things, at least in my case.
DB::beginTransaction()
Now we do our thing. ALL STATEMENTS HERE ARE RELATIVE TO EACH OTHER AND PERFORMED IN ORDER.
DB::commit()
This then runs all the statements in order, as long as the simulated run-through was a success.
DB::rollback()
This is called in ALL methods where changes could potentially fail, inside a try/catch block.
So let me provide a working (simplified) Laravel example to demonstrate the basics:
public function store(Request $request)
{
    DB::beginTransaction();
    try
    {
        if (!$this->setAuthorisation($request)) {
            throw new Exception('Failed to set authorisation');
        }

        DB::commit();
        return response()->json(['status' => 'OK'], 200);
    }
    catch (Exception $exception)
    {
        DB::rollBack();
        return response()->json(['status' => 'Failed', 'error' => $exception->getMessage()], 500);
    }
}
Now from here we can access the changes made in the setAuthorisation() method, say in a getAuthorisation() method, and then apply the data from $data = getAuthorisation() to another piece of code, say for example setData($data).
The entire transaction works in order of simulated events before committing, so it "applies" the changes before "actually applying" them. It's really hard to explain any more than that, so I hope this answers my own question.
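To illustrate the underlying MySQL behaviour outside Laravel: within a single connection, a transaction always sees its own uncommitted changes. A minimal PDO sketch (the orders table is an assumption):

$db->beginTransaction();

$db->prepare("INSERT INTO orders (customer_id, total) VALUES (?, ?)")->execute([7, 0]);
$orderId = $db->lastInsertId();             // id of the row inserted above, not yet committed

$stmt = $db->prepare("SELECT total FROM orders WHERE id = ?");
$stmt->execute([$orderId]);                 // this SELECT sees the uncommitted row
$total = $stmt->fetchColumn();

$db->prepare("UPDATE orders SET total = ? WHERE id = ?")->execute([$total + 9.99, $orderId]);

$db->commit();                              // only now do other connections see any of it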

Use multiple goroutines for different operations on MySQL

I have a piece of Go code with 3 different functions: insertIntoMysql, updateRowMysql and deleteRowmysql. I check the operation type and run one of these functions as needed.
I want to convert my normal functions into goroutines to be able to handle more operations.
But here is the issue:
If I convert them into goroutines, I will lose the sequence of operations.
For example, the insert operations are much more frequent than the delete operations. While insert operations are being queued in the insert channel and the delete channel is empty, it is possible for my code to try to delete a row before it gets inserted (e.g. a row is inserted and then deleted 1 second later).
Any ideas on how to make sure the sequence of my operations on MySQL is the same as that of the received operations?
Here is the code:
go insertIntoMysql(insertChan, fields, db, config.DestinationTable)
go updatefieldMysql(updateChan, fields, db, config.DestinationTable)
go deleteFromMysql(deleteChan, fields, db, config.DestinationTable)

for !opsDone {
    select {
    case op, open := <-mysqlChan:
        if !open {
            opsDone = true
            break
        }
        switch op.Operation {
        case "i":
            //fmt.Println("got insert operation")
            insertChan <- op
            break
        case "u":
            fmt.Println("got update operation")
            updateChan <- op
            break
        case "d":
            fmt.Println("got delete operation")
            deleteChan <- op
            break
        case "c":
            fmt.Println("got command operation")
            //no logic yet
            break
        }
        break
    }
}

close(done)
close(insertChan)
close(deleteChan)
close(updateChan)
}
Looking at your code, your requirement is: stay in the order in which the original Go channel delivered the data.
Staying in order while still firing off multiple goroutines to process the data could be done like this:
func processStatements(mysqlChan chan *instructions) {
    var wg sync.WaitGroup
    prevOp := "" // Previous operation type
    for {
        op, open := <-mysqlChan
        if !open {
            break
        }
        if prevOp != op.Operation {
            // Next operation type, unknown side effects on previous operations,
            // wait for the previous ones to finish before continuing
            wg.Wait()
        }
        switch op.Operation {
        case "i":
            wg.Add(1)
            go processInsert(op, &wg)
        case "u":
            wg.Add(1)
            go processUpdate(op, &wg)
        case "d":
            wg.Add(1)
            go processDelete(op, &wg)
        case "c":
            // no logic yet
        }
        prevOp = op.Operation
    }
    // Previous goroutines might still be running, wait till done
    wg.Wait()
    // Close any channels
}

func processInsert(op *instructions, wg *sync.WaitGroup) {
    defer wg.Done()
    // Do the actual insert etc.
}
The main differences from your program are:
Processing stays in order, which is safe against your base scenario.
Processing of the next operation type waits until the previous operation type has finished.
Multiple operations of the same type run in parallel (whereas in your code only a single operation per type runs at a time). Depending on the data distribution, either approach can be faster or slower (i, i, i, d, u = 3 waits in this code, and also 3 waits in your code; however, in your code the 3 inserts would run sequentially, while here they run in parallel). Look into insert-or-update for MySQL (INSERT ... ON DUPLICATE KEY UPDATE), since your inserts could now effectively become updates depending on your data.
How would the one that calls delete know about the row in the first place? From a SQL perspective: if it is not committed, it does not exist. Something calls your code to insert. Your code should return only after the transaction is completed, at which point the caller could start a select, update or delete.
For example, in a web server: only the client that called insert knows of the existence of the record initially. Other clients will only learn of its existence after running a select. That select will only return the new record if it was committed in the transaction that inserted it. Now they might decide to upsert or delete.
Many SQL databases take care of proper row locking, which will ensure data integrity during concurrent access. So if you use goroutines to write to the same record, they will become sequential (in no particular order) on the DB side. Concurrent reads could still happen.
Please note that net/http already runs a goroutine for every request, and so do most other server front-ends for Go. If you are writing something different altogether, like a custom TCP listener, you could initiate your own goroutine for each request.
As far as I understand the problem, the following can be done as a resolution:
Call insert, delete or update as goroutines; they will run concurrently and perform all these operations.
Ensure you have row-level locking in your MySQL DB (InnoDB provides that).
For your delete and update operations, add an existence check to verify that the row you are updating/deleting exists before touching it; this allows for some level of control.
You can implement a retry mechanism in your channel (preferably with jitter) so that even if a delete operation arrives before the corresponding insert, it will fail the existence check and can be retried (after 0.5 or 1 second, or some configured delay); this ensures the insert has already been performed before the delete or update is retried.
Hope this helps.

Do MySQL database transactions break Laravel PHPUnit tests using RefreshDatabase or DatabaseTransactions?

I'm belatedly writing some PHPUnit feature tests for a project. One of these hits a number of routes which modify the database - and until now, I've been using database transactions in some of these routes.
If I use either the RefreshDatabase or DatabaseTransactions trait in my test, it fails with a strange error:
Illuminate\Database\Eloquent\ModelNotFoundException: No query results for model [App\Models\ClassM]
If I remove the traits (so the data will persist), the test passes.
I went through and removed all my database transactions from the relevant routes, and the test now does pass with RefreshDatabase.
My suspicion is that the problem is that MySQL doesn't support nested transactions, and both traits use transactions. But that would mean that if I want to run tests with RefreshDatabase (by far the best option!), I can't use transactions at all in my code.
Is that correct? It seems a major limitation.
I've found a workaround - and it seems a little strange. I was mostly using the "manual" transaction methods - e.g.
DB::beginTransaction();
try {
    $user = User::create([...]); etc.
    DB::commit();
}
catch (...) {
    DB::rollBack();
}
I simply switched to the closure method:
DB::transaction(function () {
    $user = User::create([...]); etc.
});
No idea why that would have fixed it!

Does last_insert_id return the correct auto_increment id in a multiprocessing environment?

Here's some simplified code in my web application:
sub insert {
    my $pid = fork();
    if ($pid > 0) {
        return;
    }
    else {
        &insert_to_mysql();
        my $last_id = &get_last_inserted(); # call mysql last_insert_id
        exit(0);
    }
}

for my $i (1..10) {
    &insert();
}
Since insert is called in a multiprocessing environment, the order of get_last_inserted might be uncertain. Will it always return the correct last id corresponding to the insert_to_mysql subroutine? I read some documents saying that as long as the processes don't share the same MySQL connection, the returned id will always be the right one. However, these processes are spawned from the same session, so I'm not sure whether they share the MySQL connection or not. Thanks in advance.
these processes are spawned from the same session
Are you saying you're forking and using the same connection in more than one process? That doesn't work at all, never mind LAST_INSERT_ID(). You can't have two processes reading and writing from the same connection! The response for one could end up in the other, assuming the two clients didn't clobber each other's request.
Does last_insert_id return the correct auto_increment id in a multiprocessing environment?
According to MySQL's documentation for LAST_INSERT_ID(),
The ID that was generated is maintained in the server on a per-connection basis.
It would be useless otherwise. Since connections can't be shared across processes, yes, it's safe.
I don't know about MySQL and Perl, but in PHP it's quite the same issue, since it depends on the environment and not on the language. In PHP, last_insert_id expects one parameter: the current connection! As long as multiple instances do not share the same connection resource, passing the connection resource of the current MySQL session should do the trick.
That's what I've found googling around: http://www.xinotes.org/notes/note/179/
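A quick PDO sketch of that per-connection behaviour (table name and credentials are assumptions): each connection reports only the id it generated itself.

$a = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$b = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');

$a->exec("INSERT INTO items (name) VALUES ('from connection A')");
$b->exec("INSERT INTO items (name) VALUES ('from connection B')");

echo $a->lastInsertId(); // id of A's row, unaffected by B's insert
echo $b->lastInsertId(); // id of B's row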