How to simply handle a (very) short MySQL replication lag

I have a PHP application running with MySQL 5.5 (one master and one slave), and I dispatch reads and writes between the master and the slave.
When I create a new record (a user, for example), I write it to the master, and when I reload the page, I load it from the slave.
Example:
...
if ($_GET['id'])
{
    # Load the user
    $user = $sql->load('user', $_GET['id']);
    if ($user == false)
    {
        throw new Exception('User not found');
    }
}
else if ($_POST['create'])
{
    # Create a new user
    $user_id = $sql->insert('user', $_POST);
    $mvc->reload('?id=' . $user_id);
    exit();
}
...
But when the master is fast (quick insert) and the replication is not (lag = 0.3 - 1 s), the reload does not find the new record...
What are the best practices to handle that?
Some solutions:
Database optimisation to reduce the lag (very difficult)
sleep(1) before reading or after writing ... not very elegant

First, you need to decide whether your application can work with a lag or not.
If not, then you need to ensure that the data you want to fetch is already available on the slave. For example, fetch the last id from the slave and compare it with the id you are about to fetch; or try the slave first and, if the row is not there, fall back to the master (but that will put extra load on the master with requests for new data).
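For illustration, a minimal sketch of the slave-first / master-fallback idea, assuming two plain PDO connections rather than the $sql adapter from the question:
<?php
// Sketch only: try the slave first; if the row has not replicated yet,
// ask the master once. $slave and $master are assumed PDO connections.
function loadUser(PDO $slave, PDO $master, $id)
{
    $stmt = $slave->prepare('SELECT * FROM user WHERE id = ?');
    $stmt->execute([$id]);
    $user = $stmt->fetch(PDO::FETCH_ASSOC);

    if ($user === false) {
        // The row may not have replicated yet: fall back to the master.
        $stmt = $master->prepare('SELECT * FROM user WHERE id = ?');
        $stmt->execute([$id]);
        $user = $stmt->fetch(PDO::FETCH_ASSOC);
    }
    return $user; // still false if the id really does not exist
}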
Usually a web application can work with stale data. There is no problem if other visitors see a new post 10 seconds later. But, as you mentioned, it is bad if the author of the post doesn't see it immediately. So you can act differently based on the data/reason you are fetching (for example, cache the info about the recent post in the session and, in that case, fetch from the master).
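And a rough sketch of the session-based variant: remember that this visitor just wrote something and route their reads to the master for a short window. The helper names and the 2-second window are assumptions, not something from the question.
<?php
// Sketch only: after a write, set a session flag; reads within the next
// couple of seconds go to the master, everything else goes to the slave.
session_start();

function markJustWrote()
{
    $_SESSION['last_write_at'] = microtime(true);
}

function connectionForRead(PDO $master, PDO $slave)
{
    $recentWrite = isset($_SESSION['last_write_at'])
        && (microtime(true) - $_SESSION['last_write_at']) < 2.0;

    return $recentWrite ? $master : $slave;
}

// After the INSERT on the master:
// markJustWrote();
// On the next page load:
// $db = connectionForRead($master, $slave);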

Related

Is there a way to store database modifications with a versioning feature (for later version comparison)?

I'm working on a project where users can upload Excel files into a MySQL database. Those files are the main source of our data, as they come directly from the contractors working with the company. They contain a large number of rows (23,000 on average per file) and 100 columns per row!
The problem I am currently facing is that the same file can be changed by someone (either the contractor or the company), and when it is re-uploaded my system should detect the changes, update the actual data, and record the action (the fact that a cell went from one value to another :: oldValue -> newValue) so we can later run a comparison between versions (e.g. 3 re-uploads === 3 versions; oldValue in version 1 vs newValue in version 5).
I developed a small mechanism for saving the changes: I have a table to save import data (each time a user imports a file, a new row is inserted into this table) and another table for saving the actual changes.
Versioning data
I save the id of the row that has changed, as well as the id and the name of the table where the actual data was modified (uploading a file results in insertions into multiple tables, so whenever a change occurs I need to know in which table it happened). I also save the new value and the old value, which will help me restore the archived data.
To restore a version: SELECT * FROM Archive WHERE idImport = ${versionNumber}
To restore a version for one row: SELECT * FROM Archive WHERE idImport = ${versionNumber} AND rowId = ${rowId}
To restore all versions for one row: SELECT * FROM Archive WHERE rowId = ${rowId}
To restore the versions for one table: SELECT * FROM Archive WHERE tableName = ${table}
Etc.
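For illustration, a minimal sketch of how such an Archive table could be read back to rebuild one version; the column names follow the description above, but the exact schema and the PDO connection are assumptions:
<?php
// Sketch only: fetch all archived changes for one import (= one version)
// and group them per table/row so the values can be re-applied or compared.
function loadVersion(PDO $pdo, $versionNumber)
{
    $stmt = $pdo->prepare(
        'SELECT tableName, rowId, fieldName, oldValue, newValue
           FROM Archive
          WHERE idImport = ?'
    );
    $stmt->execute([$versionNumber]);

    $changes = [];
    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
        $changes[$row['tableName']][$row['rowId']][$row['fieldName']] = [
            'oldValue' => $row['oldValue'],
            'newValue' => $row['newValue'],
        ];
    }
    return $changes;
}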
Now with this structure, I'm struggling to restore a version or to run a comparison between two versions, which makes me think that I've come up with the wrong approach, since it makes the job hard! I would like to know whether anyone has done this before and what a good approach would look like.
Cases where things get really messy:
The rows that changed in one version might not have changed in the other version (I am working on a "time machine" to search other versions when this happens).
The rows changed in both versions, but not the same fields. (Say we have a user table and the data of the user with id 15 changed in the 2nd and 5th uploads. In the second version only the name was changed, but in the fifth version the address was changed! When comparing these two versions we run into a problem constructing our data array: name went from "some" -> NULL (the name was never null; there was simply no name change in the 5th version) and address went from NULL -> "some", which is obviously wrong.)
My current approach (PHP)
<?php
// Join the two record sets and compare them
foreach ($firstRecord as $frecord) {
    // Retrieve the first record's fields that have changed
    $fFields = $frecord->fieldName;
    // Check whether the same record has changed in the second version as well
    $sId = array_search($frecord->idRecord, $secondRecord);
    if ($sId !== false) {
        $srecord = $secondRecord[$sId];
        // Retrieve the second record's fields that have changed
        $sFields = $srecord->fieldName;
        // Compare the fields of the two records
        foreach ($fFields as $fField) {
            $sfId = array_search($fField, $sFields);
            // The same field of the same record was changed in both versions (perfect case)
            if ($sfId !== false) {
                $deltaRow[$fField]["oldValue"] = $frecord->deltaValue;
                $deltaRow[$fField]["newValue"] = $srecord->deltaValue;
                // Remove the checked field from the second version to avoid re-checking it
                unset($sFields[$sfId]);
            }
            // The changed field in V1 was not found in V2 -> look up a value
            else {
                $deltaRow[$fField]["oldValue"] = $frecord->deltaValue;
                $deltaRow[$fField]["newValue"] = $this->valueLookUp();
            }
        }
        $dataArray[] = $deltaRow;
        // Remove the checked record from the second version set to avoid re-checking it
        unset($secondRecord[$sId]);
    }
}
I don't know how to deal with that. As I said, I am working on a value lookup algorithm, so when no data is found in a version I will try to find it in the versions between these two so I can construct my data array. I would be very happy if anyone could give some hints, ideas, or improvements so I can go further with this.
Thank you!
Is there a way to store database modifications with a versioning feature (for later version comparison)?
What constitutes versioning depends on the database itself and how you make use of it.
As far as a relational database is concerned (e.g. MariaDB), this boils down to the so-called normal forms, which come in numbered levels.
In the article Database Normalization: 5th Normal Form and Beyond you can find the following guidance:
Beyond 5th normal form you enter the heady realms of domain key normal form, a kind of theoretical ideal. Its practical use to a database designer os [sic!] similar to that of infinity to a bookkeeper - i.e. it exists in theory but is not going to be used in practice. Even the most demanding owner is not going to expect that of the bookkeeper!
One strategy to step into these realms is to reach the 5th normal form first (do this just in theory, by going through all the normal forms, and study database normalization).
Additionally, you can implement versioning outside of, and in addition to, the database itself, e.g. by creating your own versioning system. Reading about what you can do with normalization will help you find better ways to structure and handle the database data for your versioning needs.
However, as written above, it depends on what you want and need, so no straightforward "code" answer can be given to such a general question.

Issue with concurrent requests in CakePHP 2.0

Thanks in advance for attempting to assist me with this issue.
I'm using CakePHP 2 (2.10.22).
I have a system which creates applications. Each application that gets created has a unique application number. The MySQL database column that stores this application number is set to 'Not null' and 'Unique'. I'm using CakePHP to get the last used application number from the database and then build the next application number for the new application that needs to be created. The process I have written works without any problem when a single request is received at a given point in time.
The problem arises when two requests are received to create an application at exactly the same time. The behaviour I have observed is that the request that gets picked up first reads the last application number - e.g. ABC001233 - and assigns ABC001234 as the application number for the new application it needs to create. It successfully saves this application to the database. The second request, which is running concurrently, also reads ABC001233 as the last application number and tries to create a new application with ABC001234 as the application number. The MySQL database returns an error saying that the application number is not unique.
I then put the second request to sleep for 2 seconds, by which time the first application has successfully saved to the database, and re-attempt the application creation process. The re-attempt first gets the last application number, which should now be ABC001234, but each database read keeps returning ABC001233 even though the first request has long been completed. Both requests have transactions in the controller. What I have noticed is that when I remove these transactions, the process works correctly: for the second request, after the first attempt fails, the second attempt works, as the system correctly gets ABC001234 as the last application number and assigns ABC001235 as the new application number. I want to know what I need to do to ensure the process works correctly even with the transaction directives in the controller.
Please find below some basic information on how the code is structured -
Database
The last application number is ABC001233
Controller file
function create_application(){
    $db_source->begin(); // The process works correctly if I remove this line.
    $result = $Application->create_new();
    if($result === true){
        $db_source->commit();
    }else{
        $db_source->rollback();
    }
}
Application model file
function get_new_application_number(){
    $application_record = $this->find('first', [
        'order' => [
            $this->name.'.application_number DESC'
        ],
        'fields' => [
            $this->name.'.application_number'
        ]
    ]);
    $old_application_number = $application_record[$this->name]['application_number'];
    $new_application_number = $old_application_number + 1;
    return $new_application_number;
}
The above is where I feel the problem originates. For the first request that gets picked up, this find correctly determines that ABC001233 is the last application number, and the function then returns ABC001234 as the next application number. The second request also picks up ABC001233 as the last application number, but fails when it tries to save ABC001234, because the first request has already saved an application with that number. As part of the second attempt for the second request (which occurs because of the do/while loop), this find is run again, but instead of returning ABC001234 as the last application number (per the successful save of the first request), it keeps returning ABC001233, resulting in a failure to save correctly. If I remove the transaction from the controller, this then works correctly and returns ABC001234 on the second attempt. I couldn't find any documentation explaining why that is and what can be done about it, which is where I need some assistance. Thank you!
function create_new(){
    $new_application_number = $this->get_new_application_number();
    $save_attempts = 0;
    do{
        $save_exception = false;
        try{
            $result = $this->save([$this->name => ['application_number' => $new_application_number]], [
                'atomic' => false
            ]);
        }catch(Exception $e){
            $save_exception = true;
            sleep(2);
            $new_application_number = $this->get_new_application_number();
        }
    }while($save_exception === true && $save_attempts++ < 5);
    return !$save_exception;
}
You just have to lock the row holding the previous number inside the transaction using SELECT ... FOR UPDATE. It's much better than locking the whole table, as suggested in the comments.
According to the documentation (https://book.cakephp.org/2/en/models/retrieving-your-data.html), you just have to add 'lock' => true to the find call in get_new_application_number:
function get_new_application_number(){
    $application_record = $this->find('first', [
        'order' => [
            $this->name.'.application_number DESC'
        ],
        'fields' => [
            $this->name.'.application_number'
        ],
        'lock' => true
    ]);
    $old_application_number = $application_record[$this->name]['application_number'];
    $new_application_number = $old_application_number + 1;
    return $new_application_number;
}
How does it work:
The second transaction will block on that query until the first transaction has ended (committed or rolled back), so it then reads the newly saved number.
P.S. According to the documentation, the lock option was added in CakePHP 2.10.0.
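For completeness, a rough sketch of the same idea as a raw query from the model, in case the lock option is not available; the table name and the shape of the result array are assumptions, not something from the answer:
function get_new_application_number_raw(){
    // Sketch only: `applications` is an assumed table name.
    // FOR UPDATE makes a concurrent transaction block on this read
    // until the other transaction commits or rolls back.
    $rows = $this->query(
        "SELECT application_number FROM applications
         ORDER BY application_number DESC LIMIT 1 FOR UPDATE"
    );
    // For MySQL, raw results come back keyed by table name, roughly:
    $old_application_number = $rows[0]['applications']['application_number'];
    return $old_application_number + 1;
}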

OrmLite (ServiceStack): Only use temporary db-connections (use 'using'?)

For the last 10+ years or so, I have always opened a connection to the database (MySQL) and kept it open until the application closed. All queries were executed on that connection.
Now, when I look at the examples on the ServiceStack web page, I always see the using block being used, like:
using (var db = dbFactory.Open())
{
    if (db.CreateTableIfNotExists<Poco>())
    {
        db.Insert(new Poco { Id = 1, Name = "Seed Data"});
    }
    var result = db.SingleById<Poco>(1);
    result.PrintDump(); //= {Id: 1, Name:Seed Data}
}
In my current test-project, I got OrmLite to work in my normal way (one db-connection, no using-statements), so I basically had a class-wide _db, like this:
_dbFactory = new OrmLiteConnectionFactory($"Uid={dbAccount.Username};Password={dbAccount.Password};Server={dbAccount.Address};Port={dbAccount.Port};Database={dbAccount.Database}", MySqlDialect.Provider);
_db = _dbFactory.Open(); // var kept in memory, and used for all queries
It worked in the beginning, but now I suddenly got the Exception:
There is already an open DataReader associated with this Connection which must be closed first
Some code might run a SELECT here and there, and if I understand it correctly, this error appears when a SELECT and an INSERT occur at the same time?
If so, is it best practice to always open a new connection for every single query (say, inside a using statement)? Isn't that a big overhead, doing that for every query?
A single DB connection is not thread-safe, so holding on to one connection is only an option if at most one thread accesses it.
Most ADO.NET providers enable connection pooling by default, so it's more efficient to close the connection when you're done with it: the connection is returned to the pool, which reduces the number of active connections in use.

Distribute records on different MySQL databases - MySQL Proxy alternative

My scenario is the following:
Right now I am using one big MySQL database with multiple tables to store user data. Many tables contain auto increment columns.
I would like to split this into 2 or more databases. The distribution should be done by user_id and is predetermined (it cannot be randomized); e.g. users 1 and 2 should be on database1, user 3 on database2, and user 4 on database3.
Since I don't want to change my whole frontend, I would like to still use one db adapter and kind of add a layer between the query generation (frontend) and the query execution (on the right database). This layer should distribute the queries to the right database based on the user_id.
I have found MySQL Proxy, which sounds like exactly what I need. Unfortunately, it's in alpha and not recommended for use in a production environment.
For PHP there is the MySQL Native Driver Plugin API, which sounds promising, but then I would need a layer that supports at least PHP and Java.
Is there any other way I can achieve my objectives? Thanks!
This site seems to offer the service you're looking for (for a price).
http://www.sqlparser.com/
It lets you parse and modify queries and results. However, what you're looking to do seems like it will only require a couple of lines of code to distinguish between different user ids, so even though mysql-proxy is still in alpha, your needs are simple enough that I would just use the proxy.
Alternatively, you could use whatever server-side language you're working with to grab the user id, and then create a MySQL connection to the appropriate database based on that. Here's some PHP I threw together which, in spirit, does what I think you're looking to do.
<?php
// grab user.id from wherever you store it
$userID = get_user_id($clientUserName);
$userPass = get_user_pass($clientUserName);
if ($userID % 4 == 0) { // every 4th user
    $db = new mysqli('localhost', $clientUserName, $userPass, 'db4');
}
else if ($userID % 3 == 0) { // every 3rd user
    $db = new mysqli('localhost', $clientUserName, $userPass, 'db3');
}
else if ($userID % 2 == 0) { // every 2nd user
    $db = new mysqli('localhost', $clientUserName, $userPass, 'db2');
}
else { // every other user
    $db = new mysqli('localhost', $clientUserName, $userPass, 'db1');
}
$db->query('SELECT * FROM ...;');
?>

Auto update prices in database, mysql

I am currently getting products from one site, storing them in a database, and then having their prices displayed on another site. I am trying to get the prices from the first site to update daily in my database so the updated prices can be displayed on my other site.
Right now I am getting the products using an item number but have to manually go in and update any prices that have changed.
I am guessing I am going to have to use some kind of cron job, but I'm not sure how to do this. I have no experience with cron jobs and am a noob with PHP.
Any ideas?
Thanks!
I have done some reading on the foreach loop and have written some code. But my foreach loop only runs once, for the first item number. The foreach loop runs, then goes to the "api.php" page, but then stops. It doesn't continue looping for each item number. How do I tell it to go through all of the item numbers in my database?
Also if you see anything else wrong in my code please let me know.
Thanks
....
$itemnumber = array("".$result['item_number']."");
foreach ($itemnumber as $item_number) {
    echo "<form method=\"post\" action=\"api.php\" name=\"ChangeSubmit\" id=\"ChangeSubmit\">";
    echo "<input type=\"text\" name=\"item_number\" value=\"{$item_number}\" />";
    echo "<script type=\"text/javascript\">
        function myfunc () {
            var frm = document.getElementById(\"ChangeSubmit\");
            frm.submit();
        }
        window.onload = myfunc;
    </script></form>";
}
}
If you already retrieve the product data from an external site and store it in a local database, updating the prices from the same source should be no problem for you. Just retrieve the data, iterate through it in a foreach loop or similar, and update the prices in the database based on the item number.
Once you have created the update script and run it manually, adding it as a cron job is as simple as running the command `crontab -e` and adding this row to execute your script every midnight:
0 0 * * * /usr/local/bin/php /path/to/your/script.php
Don't forget to use the correct PHP path for your system; running `which php` in the shell will tell you the path.
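For illustration, a minimal sketch of what such an update script could look like; the table and column names, the fetch_price_from_source() helper and the PDO credentials are placeholders, not something from the question:
<?php
// update_prices.php - run by cron once a day.
// All names below (table, columns, helper, credentials) are assumptions.
$pdo = new PDO('mysql:host=localhost;dbname=shop', 'user', 'pass');

$items  = $pdo->query('SELECT item_number FROM products')->fetchAll(PDO::FETCH_COLUMN);
$update = $pdo->prepare('UPDATE products SET price = ? WHERE item_number = ?');

foreach ($items as $itemNumber) {
    // fetch_price_from_source() stands in for however the price
    // is pulled from the other site (API call, scrape, etc.).
    $price = fetch_price_from_source($itemNumber);
    if ($price !== null) {
        $update->execute([$price, $itemNumber]);
    }
}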
If you have cron jobs on your server, the solution is straightforward: make a PHP script that does the update and throw it in a daily cron job.
However, I do it this way:
Method 1: At the beginning of every page request, check the last "update" time (you choose how to store it). If it's been more than a day, do the update and set the "update" time to the current time.
This way, every time someone loads a page and it's been a day since the last update, it updates for them. However, this means it's slower for one random user, once a day. If this isn't acceptable, there's a small change:
Method 2: If you need to update (via the above method of checking), start an asynchronous request for the data, handle the rest of the page, flush it to the user, then in a while loop wait until the request finishes and update it.
The downside of method 2 is that the user won't see the updated values on that request, but the benefit is that it doesn't add any wait for them.
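A rough sketch of Method 1, assuming the last-update timestamp is kept in a small meta table; the table name and the update_all_prices() helper are placeholders:
<?php
// Sketch of Method 1: piggy-back the daily update on a normal page request.
// The `meta` table, the 'last_price_update' key and update_all_prices()
// are assumptions for the sketch, not part of the answer.
function maybe_update_prices(PDO $pdo)
{
    $stmt = $pdo->query("SELECT value FROM meta WHERE name = 'last_price_update'");
    $last = (int) $stmt->fetchColumn();

    if (time() - $last >= 86400) { // more than a day since the last update
        update_all_prices($pdo);   // the same logic the cron script would run
        $pdo->prepare("UPDATE meta SET value = ? WHERE name = 'last_price_update'")
            ->execute([time()]);
    }
}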