No significant read performance difference between LAMP and MEAN stacks - mysql

This is the code I am running on AWS t2.medium (2 core / 4 GB) instances. Both stacks were benchmarked with the same configuration, on separate instances in the same subnet. I used JMeter for the performance measurements. The results show roughly the same throughput and response time for both stacks, with MEAN slightly slower than LAMP: tested up to 600 concurrent requests, throughput was around 35 req/sec for MEAN and 37 req/sec for LAMP. Why am I getting such low throughput, and why is MEAN performing worse for plain reads?
LAMP stack code:
<?php
$con = mysql_connect("localhost", "root", "root");
if (!$con) die('connection failed');
$db = mysql_select_db("mms-php", $con);

$result = mysql_query("SELECT * FROM comments LIMIT 10", $con);
$rows = array();
while ($row = mysql_fetch_assoc($result)) {
    array_push($rows, $row);
}

header("Content-type: application/json");
echo json_encode($rows);
//print_r($rows);
//$result = mysql_query("INSERT INTO msgs (msg) VALUES ('" . rand() . time() . "')", $con);

mysql_close($con);
?>
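(Side note: the mysql_* API used above is deprecated and was removed in PHP 7. A minimal mysqli equivalent of the same read, assuming the same local credentials and schema, might look like the sketch below; it is not a fix for the benchmark numbers, just the modern API.)
<?php
// Sketch only: same query via mysqli; credentials and table assumed from the snippet above.
$con = mysqli_connect("localhost", "root", "root", "mms-php");
if (!$con) {
    die('connection failed');
}

$result = mysqli_query($con, "SELECT * FROM comments LIMIT 10");
$rows = array();
while ($row = mysqli_fetch_assoc($result)) {
    $rows[] = $row;
}

header("Content-type: application/json");
echo json_encode($rows);

mysqli_close($con);
?>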
MySQL has one table, comments, with 1000 entries; MongoDB similarly has one collection with 1000 entries.
MEAN stack code:
var MongoClient = require('mongodb').MongoClient,
    assert = require('assert');
var http = require('http');
var cluster = require('cluster');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
    // Fork one worker per CPU core.
    for (var i = 0; i < numCPUs; i++) {
        cluster.fork();
    }
    cluster.on('exit', function(worker) {
        console.log('worker ' + worker.process.pid + ' died');
    });
} else {
    // Connection URL
    var url = 'mongodb://localhost:27017/test';
    // Use the connect method to connect to the server
    MongoClient.connect(url, function(err, db) {
        console.log("Connected correctly to server");
        http.globalAgent.maxSockets = 10;
        http.createServer(function (req, res) {
            res.writeHead(200, {'Content-Type': 'text/plain'});
            db.collection('comments').find({}, function(err, docs) {
                if (err) {
                    console.log(err);
                    res.write('an error occurred');
                    res.end();
                } else {
                    // Stream each document back as JSON.
                    docs.each(function(err, doc) {
                        if (doc)
                            res.write(JSON.stringify(doc));
                        else
                            res.end();
                    });
                }
            });
        }).listen(1337, '0.0.0.0');
    });
}

In summary, synthetic benchmarks are mostly for marketing. For read-only work (analytics and search, where the write step is typically once per day/overnight) I've used MySQL with all the data sitting in a 60 GB+ squashfs RAM-disk (LAMP), and Elasticsearch (JavaScript and PHP clients on Linux). The latter was a workable option for real/near-real-time updates (I like Elasticsearch despite a Java prejudice), coupled to a MySQL database for the transactional part (and its normalised model to permit later ETL for analytics).
I would guess that you aren't exactly benchmarking a MEAN vs LAMP implementation of your application. More likely you are benchmarking the start-up, printing (JSON encoding) a thousand-ish lines, and tear-down costs (establishing a process/thread instance, making a connection to the database and so on) rather than the database transaction cost proper. I suggest this is the case because you have a very small database (1000 elements) that will be cached; you're not asking the database to do anything other than return a cursor and the entire table (or its equivalent); and you are then just regurgitating it to the output stream with no transformation (unless your client side is meant to do that). Lastly, as others hint, you're not controlling the platform on which you are performing the comparison. You could run LAMP and MEAN on the most meagre of computers, where you can have more certainty of a fair comparison.
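One quick way to test that claim is to time the database call separately from the rest of the request in the PHP version (a rough sketch reusing the question's mysql_* calls; the extra timing fields are only for diagnosis):
<?php
// Sketch: separate connection/query time from the rest of the request.
$t0 = microtime(true);
$con = mysql_connect("localhost", "root", "root");
mysql_select_db("mms-php", $con);
$t1 = microtime(true);

$result = mysql_query("SELECT * FROM comments LIMIT 10", $con);
$rows = array();
while ($row = mysql_fetch_assoc($result)) {
    $rows[] = $row;
}
$t2 = microtime(true);

header("Content-type: application/json");
echo json_encode(array(
    'connect_ms' => round(($t1 - $t0) * 1000, 2),
    'query_ms'   => round(($t2 - $t1) * 1000, 2),
    'rows'       => $rows,
));
mysql_close($con);
?>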
Your benchmark could also easily be cheated with, say, the Apache configuration (as is possible with most other stacks) caching results.
Have you tried a variety of clients to exercise the two implementations? Apache's ab is trivial to run but otherwise meaningless for the applications I've been involved in historically: it's great for discovering how the system behaves at saturation, but not as informative as, say, httperf, which I've successfully used to stretch the application services and database layers (where it won't all fit into RAM and I can create a request distribution that matches real scenarios).
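If you want something between ab and a full JMeter plan, even a small curl_multi script can fire concurrent requests at both endpoints. A rough sketch, with a placeholder URL (not a substitute for a proper load generator):
<?php
// Rough concurrency sketch with curl_multi; the URL below is a placeholder.
$url = 'http://your-server:1337/';   // hypothetical endpoint under test
$concurrency = 50;

$mh = curl_multi_init();
$handles = array();
for ($i = 0; $i < $concurrency; $i++) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}

$start = microtime(true);
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);
} while ($running > 0);
$elapsed = microtime(true) - $start;

foreach ($handles as $ch) {
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);

printf("%d requests in %.2fs (%.1f req/s)\n", $concurrency, $elapsed, $concurrency / $elapsed);
?>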

Related

Laravel file upload timing out - adding chunk to updateOrCreate?

I'm trying to upload a user's CSV file and pass the data into a database, but because of the size of the CSV (number of rows) it keeps timing out. The data must be checked to see if it's already in the database, then updated or created.
I have applied a chunk to the CSV for reading the data, but I don't know if it's possible to also chunk the "upload to database" section?
Here is my function
public function import(Request $request) {
    if ($request->file('imported-file')) {
        $path = $request->file('imported-file')->getRealPath();
        $data = Excel::filter('chunk')->load($path)->chunk(200, function($results) {
            foreach ($results as $row) {
                if (!empty($row['postcode'])) {
                    $url = "https://maps.googleapis.com/maps/api/geocode/xml?address=" . urlencode($row['postcode']) . "&region=uk&key=";
                    $tmp = file_get_contents($url);
                    $xml = simplexml_load_string($tmp);
                    $lat = 0; // defaults in case geocoding fails
                    $lng = 0;
                    if ((string)$xml->status == 'OK' && isset($xml->result[0])) {
                        if (isset($xml->result[0]->geometry->location->lat)) {
                            $lat = (string)$xml->result[0]->geometry->location->lat;
                        }
                        if (isset($xml->result[0]->geometry->location->lng)) {
                            $lng = (string)$xml->result[0]->geometry->location->lng;
                        }
                    }
                    Import::updateOrCreate(
                        [
                            'sitecode' => $row['sitecode']
                        ],
                        [
                            'sitecode' => $row['sitecode'],
                            'sitename' => $row['sitename'],
                            'address_1' => $row['address_1'],
                            'address_2' => $row['address_2'],
                            'address_town' => $row['address_town'],
                            'address_postcode' => $row['postcode'],
                            'charity' => $row['charity'],
                            'latitude' => $lat,
                            'longitude' => $lng,
                            'approved' => 1
                        ]
                    );
                } else {
                    // Postcode not valid!!!
                }
            } // endforeach
            Session::flash('sucess', 'Import was sucessful.');
            return redirect()->route('locations');
        });
    } else {
        Session::flash('error', 'Please select a file to upload!');
        return back();
    }
}
Your problem is related to the configuration of your server: many things can go wrong when you run long tasks inside a live web request.
If you are using an Nginx/PHP-FPM setup, you need to look at the Nginx, PHP and PHP-FPM config files.
PHP configuration
Let's start with PHP. Open the /etc/php/<phpversion>/fpm/php.ini file and search for max_execution_time. You should find something like
max_execution_time = 30
which means that each request can last no longer than 30 seconds. If you need more time, increase this number, e.g.
max_execution_time = 300
for 5 minutes.
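(If you can't or don't want to change php.ini globally, the same limit can be raised per request from PHP itself; a small sketch, and note that request_terminate_timeout below still applies on top of it.)
<?php
// Raise the execution limit for this script only (seconds).
set_time_limit(300);
// Equivalent: ini_set('max_execution_time', 300);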
Then let's examine the PHP-FPM configuration. Open your pool configuration, such as /etc/php/<phpversion>/fpm/pool.d/www.conf and search for request_terminate_timeout. In my configuration, I have it disabled:
; Default Value: 0
;request_terminate_timeout = 0
Default value is 0 (disabled), but if you have it enabled you should increase the number, e.g.
request_terminate_timeout = 400
for 400 seconds before PHP-FPM kills a child process. If you do set a number here, use something higher than max_execution_time, otherwise your process will be killed by PHP-FPM regardless of the maximum execution time.
Nginx configuration
Finally, look at the Nginx configuration in /etc/nginx/sites-available/yoursite.conf. There you should find the section that configures the communication between Nginx and PHP-FPM, including fastcgi_read_timeout, which is the maximum time Nginx will wait for PHP-FPM to return some data:
location ~ \.php$ {
# ...
fastcgi_read_timeout 300;
# ...
}
If after 300 seconds PHP-FPM hasn't returned anything, Nginx kills the connection. In your case you only send data back to the webserver after the long-running process, so the whole thing cannot take longer than 300 seconds. Change this number to something compatible with the numbers you put in the PHP configuration.
To sum up
If you think your processing could take up to 30 minutes, use numbers like these:
in /etc/php/<phpversion>/fpm/php.ini:
max_execution_time = 1800 ; 1800 secs = 30 minutes
in /etc/php/<phpversion>/fpm/pool.d/www.conf:
request_terminate_timeout = 0 ; no timeout, or greater than 1800
in /etc/nginx/sites-available/yoursite.conf:
fastcgi_read_timeout 2000; # 2000 secs, or at least greater than the previous two
With this combination max_execution_time is the limit that actually rules, and you know your process has 30 minutes of running time, because the PHP-FPM and Nginx timeouts only kick in after the PHP one.
Don't forget the client-side
If you are using an AJAX uploading library, check its configuration too, because it can impose yet another timeout on the full AJAX upload request.
For example, dropzone.js uses a 30-second timeout by default. Your server could run for ages, but after that short amount of time your JavaScript library will kill the connection.
Usually you can change that value. With Dropzone (whose timeout option is given in milliseconds) a 2100-second timeout looks like:
var myDropzone = new Dropzone("#uploader", {
    // ...
    timeout: 2100 * 1000, // 2100 seconds, expressed in milliseconds
    // ...
});
Again, use a higher value than the one on Nginx.
Long running tasks: the right way
However, your approach is quick and dirty, and even though I'm sure it's OK for your case, it would be better to follow another path:
don't do the CSV processing just after the upload
instead, upload the file, put it in a queue and notify the client to check back later
do the CSV processing in background with a queue worker
Please check the Laravel documentation on queues (https://laravel.com/docs/5.5/queues).
With this approach your user gets immediate feedback, and you won't have timeout problems anymore! (Well, in theory: background jobs from queues can time out too, but that's another story.)
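A minimal sketch of that approach, assuming a hypothetical ProcessCsvImport job class (the names and the row handling are placeholders, not your actual code):
<?php
// app/Jobs/ProcessCsvImport.php (hypothetical job class)
namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class ProcessCsvImport implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    protected $path;

    public function __construct($path)
    {
        $this->path = $path; // path to the uploaded CSV, stored beforehand
    }

    public function handle()
    {
        // Do the chunked reading, geocoding and updateOrCreate calls here,
        // free from web-request timeouts (queue workers have their own limits).
    }
}

// In the controller: store the file, dispatch the job, respond immediately.
// $path = $request->file('imported-file')->store('imports');
// ProcessCsvImport::dispatch($path);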
I hope this can help :)

phpMyAdmin mysql database has huge traffic

So I'm doing a project for which I have made an Android application that communicates with a MySQL database I manage through phpMyAdmin. The Android app sends requests to PHP files, which in turn talk to the database; these PHP files are on the same server (000webhost). I have a bit of a problem though: sometimes the connection doesn't work (e.g. the data doesn't come through, or I get a time-out error or something).
I have looked at the data traffic in phpMyAdmin and I think it is abnormally high (I'm not sure). The app is used by about 10 people and the total should be about 1000 queries a day, all simple SELECT and UPDATE statements. Yet when I look at the data traffic (Status) in phpMyAdmin it says there has been 2.2 TiB of traffic in total, which I think is ridiculous. phpMyAdmin also shows a lot of warnings like "the rate of reading the first index entry is high" and "the rate of opening tables is high", and the monitor shows random peaks of 150 MiB suddenly being sent. Everything is incredibly high.
this is an example of a query I send to the database via php:
<?php
$con = mysqli_connect("*************");

$ID = $_POST["ID"];
$naam = $_POST["NAME"];
$inbier = $_POST["INBIER"];
$outbier = $_POST["OUTBIER"];
$pof = $_POST["POF"];
$adjust = $_POST["ADJUST"];

$statement = mysqli_prepare($con, "SELECT * FROM Bierlijst WHERE NAME = ?");
mysqli_stmt_bind_param($statement, "s", $naam);
mysqli_stmt_execute($statement);
mysqli_stmt_store_result($statement);
mysqli_stmt_bind_result($statement, $ID, $naam, $inbier, $outbier, $pof, $adjust);

$response = array();
$response["success"] = false;

while (mysqli_stmt_fetch($statement)) {
    $response["success"] = true;
    $response["NAME"] = $naam;
    $response["INBIER"] = $inbier;
    $response["OUTBIER"] = $outbier;
    $response["POF"] = $pof;
    $response["ADJUST"] = $adjust;
}

echo json_encode($response);
?>
Is it possible that I'm running bad queries which are failing and looping and stuff?
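(One way to sanity-check the traffic figures rather than guessing: MySQL's Bytes_sent / Bytes_received session counters can be read before and after a query. A rough sketch, reusing the connection style above; it only measures, it doesn't diagnose.)
<?php
// Sketch: read MySQL's per-session traffic counters around a query.
$con = mysqli_connect("*************");   // same (redacted) credentials as above

function traffic_counters($con) {
    $out = array();
    $res = mysqli_query($con, "SHOW SESSION STATUS LIKE 'Bytes_%'");
    while ($row = mysqli_fetch_assoc($res)) {
        $out[$row['Variable_name']] = (int) $row['Value'];
    }
    return $out;
}

$before = traffic_counters($con);
// ... run the query you want to measure here ...
$after = traffic_counters($con);

printf("sent: %d bytes, received: %d bytes\n",
    $after['Bytes_sent'] - $before['Bytes_sent'],
    $after['Bytes_received'] - $before['Bytes_received']);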

Too many SQL connections error due to long polling

I have designed a coding platform, just like SPOJ and Codeforces, for competitions to be organised in my college over LAN.
I have used long polling there so that any announcement from the admin can be broadcast to all users as a JavaScript alert message. The admin also gets a notification when anything is posted on the forum.
But with just 16 users (including the 1 admin) accessing the site, the server went down with a "too many SQL connections" error. I restarted my laptop (the server) and it worked for a while, then went down again with the same error message.
When I removed both long-poll processes, everything ran smoothly.
Server-side code for long-poll:
include 'dbconnect.php';

$old_ann_id = $_GET['old_ann_id'];

$resultann = mysqli_query($con, "SELECT cmntid FROM announcements ORDER BY cmntid DESC LIMIT 1");
while ($rowann = mysqli_fetch_array($resultann)) {
    $last_ann_id = $rowann['cmntid'];
}

while ($last_ann_id <= $old_ann_id) {
    usleep(10000000); // 10 seconds
    clearstatcache();
    $resultann = mysqli_query($con, "SELECT cmntid FROM announcements ORDER BY cmntid DESC LIMIT 1");
    while ($rowann = mysqli_fetch_array($resultann)) {
        $last_ann_id = $rowann['cmntid'];
    }
}

$response = array();
$response['msg'] = 'new';
$response['old_ann_id'] = $last_ann_id;

$resultann = mysqli_query($con, "SELECT announcements FROM announcements WHERE cmntid = $last_ann_id");
while ($rowann = mysqli_fetch_array($resultann)) {
    $response['announcement'] = $rowann['announcements'];
}

echo json_encode($response);
max_connections is a defined server limit. I think the default is 100 or 151 connections depending on the version of MySQL. You can see the value under "Server variables and settings" in phpMyAdmin (or directly by executing show variables like "max_connections";).
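(A quick way to check it from PHP as well; just a sketch, assuming you can reuse the $con that dbconnect.php provides:)
<?php
include 'dbconnect.php';   // provides $con, as in the question

// Show the connection limit and how many threads are currently connected.
$res = mysqli_query($con, "SHOW VARIABLES LIKE 'max_connections'");
$row = mysqli_fetch_assoc($res);
echo $row['Variable_name'] . ' = ' . $row['Value'] . "\n";

$res = mysqli_query($con, "SHOW STATUS LIKE 'Threads_connected'");
$row = mysqli_fetch_assoc($res);
echo $row['Variable_name'] . ' = ' . $row['Value'] . "\n";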
If that is set to something very low (say 10) and you have (say) 15 users, you will hit the limit rapidly. You are giving each long-polling script its own connection, and that connection probably sits open until the long-polling script ends. You could reduce this by having the script disconnect after each time it checks the database, then reconnect the next time it checks (i.e. if your long-polling script checks the db every 5 seconds, you probably have well over 4.5 of those 5 seconds where a connection to the db is open but not being used).
Even with a larger number of connections, if you trigger the AJAX polling multiple times per user, each user could hold several simultaneous connections. This is quite easy to do with a minor bug in your JavaScript.
Possibly worse, if you are using persistent connections you might leave connections open after the user has left the page that calls the long-polling script.
EDIT - update based on your script.
Note I am not sure exactly what your dbconnect.php include does. It might be possible to simply call connect/disconnect functions from that include, but in this example code I have just used the mysqli_close and mysqli_connect functions directly.
<?php
include 'dbconnect.php';

$old_ann_id = $_GET['old_ann_id'];

$resultann = mysqli_query($con, "SELECT MAX(cmntid) AS cmntid FROM announcements");
if ($rowann = mysqli_fetch_array($resultann)) {
    $last_ann_id = $rowann['cmntid'];
}

$timeout = 0;
while ($last_ann_id <= $old_ann_id and $timeout < 6) {
    $timeout++;
    // Close the connection while sleeping so it is not held open unused.
    mysqli_close($con);
    usleep(10000000); // 10 seconds
    clearstatcache();
    $con = mysqli_connect("myhost", "myuser", "mypassw", "mybd");
    $resultann = mysqli_query($con, "SELECT MAX(cmntid) AS cmntid FROM announcements");
    if ($rowann = mysqli_fetch_array($resultann)) {
        $last_ann_id = $rowann['cmntid'];
    }
}

if ($last_ann_id > $old_ann_id) {
    $response = array();
    $response['msg'] = 'new';
    $response['old_ann_id'] = $last_ann_id;
    $resultann = mysqli_query($con, "SELECT cmntid, announcements FROM announcements WHERE cmntid > $old_ann_id ORDER BY cmntid");
    while ($rowann = mysqli_fetch_array($resultann)) {
        $response['announcement'][] = $rowann['announcements'];
        $response['old_ann_id'] = $rowann['cmntid'];
    }
    mysqli_close($con);
    echo json_encode($response);
} else {
    echo "No announcements - resubmit";
}
?>
I have added a count to the main loop, so it drops out of the loop, whether or not anything is found, once it has executed 6 times. This way, even if someone leaves the page, the script will only keep running for a short time afterwards (a minute at most). You will have to amend your JavaScript to catch the "resubmit" response and resubmit the AJAX call.
I have also changed the announcement in the response to an array, so that if there are several announcements while the script is running, all of them are returned.

Reducing number of MySQL processes

I need to understand how MySQL processes/connections work. I have googled and don't see anything explained in layman's terms, so I'm asking here. Here is the situation.
Our host is giving us grief over "too many MySQL processes". We are on a shared server and are allowed 0.2 of the server's MySQL processes, which they claim is 50 connections, and they say we are using 0.56.
From the technical support representative:
"Number of MySQL procs (average) - 0.59 meant that you were using
0.59% of the total MySQL connections available on the shared server. The acceptable value is 0.20 which is 50 connections. "
Here is what we are running:
Zen Cart 1.5.1, 35K products. Auto-updating of 1-20 products every 10 hours via cron.
PHP version 5.3.16
MySQL version 5.1.62-cll
Architecture i686
Operating system linux
We generally get about 5000 hits per day on the site, and Googlebot loves to visit even though I have the crawl rate set to minimum in Google Webmaster Tools.
I'm hoping someone can explain MySQL processes to me in terms of what this host is talking about. Every time I ask them I get a vague, obfuscated answer. Is a new MySQL process created every time a visitor visits the site? That does not seem right.
According to the tech, we were using 150 connections at that particular time.
EDIT:
Here is the connection function in Zen Cart:
function connect($zf_host, $zf_user, $zf_password, $zf_database, $zf_pconnect = 'false', $zp_real = false) {
    $this->database = $zf_database;
    $this->user = $zf_user;
    $this->host = $zf_host;
    $this->password = $zf_password;
    $this->pConnect = $zf_pconnect;
    $this->real = $zp_real;
    if (!function_exists('mysql_connect')) die('Call to undefined function: mysql_connect(). Please install the MySQL Connector for PHP');
    $connectionRetry = 10;
    while (!isset($this->link) || ($this->link == FALSE && $connectionRetry != 0)) {
        $this->link = @mysql_connect($zf_host, $zf_user, $zf_password, true);
        $connectionRetry--;
    }
    if ($this->link) {
        if (@mysql_select_db($zf_database, $this->link)) {
            if (defined('DB_CHARSET') && version_compare(@mysql_get_server_info(), '4.1.0', '>=')) {
                @mysql_query("SET NAMES '" . DB_CHARSET . "'", $this->link);
                if (function_exists('mysql_set_charset')) {
                    @mysql_set_charset(DB_CHARSET, $this->link);
                } else {
                    @mysql_query("SET CHARACTER SET '" . DB_CHARSET . "'", $this->link);
                }
            }
            $this->db_connected = true;
            if (getenv('TZ') && !defined('DISABLE_MYSQL_TZ_SET')) @mysql_query("SET time_zone = '" . substr_replace(date("O"), ":", -2, 0) . "'", $this->link);
            return true;
        } else {
            $this->set_error(mysql_errno(), mysql_error(), $zp_real);
            return false;
        }
    } else {
        $this->set_error(mysql_errno(), mysql_error(), $zp_real);
        return false;
    }
}
I wonder if it is a problem with connection pooling. Try changing this line:
$this->link = @mysql_connect($zf_host, $zf_user, $zf_password, true);
to this:
$this->link = @mysql_connect($zf_host, $zf_user, $zf_password);
The manual is useful here: the fourth parameter (new_link) is false by default, but your code forces it to true, which creates a new connection even if an existing one is already open. Leaving it at the default reuses the existing link (a simple form of connection pooling), which saves creating new connections unnecessarily, i.e. saves both time and memory.
I would offer a caveat though: modifying core code in a third-party system always needs to be done carefully. There may be a reason for the behaviour they've chosen, though there's not much in the way of comments to tell. It may be worth asking via their support channels why it works this way, and whether they might consider changing it.
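(If you want to see for yourself what the host is counting, MySQL can list the connections currently open. A quick sketch using the same old mysql_* API as the shop; the credentials are placeholders, and on shared hosting you will normally only see your own account's connections.)
<?php
// Sketch: list the connections currently open under this account.
$link = mysql_connect('localhost', 'db_user', 'db_password'); // placeholder credentials

$result = mysql_query('SHOW PROCESSLIST', $link);
while ($row = mysql_fetch_assoc($result)) {
    // Each row is one connection: who opened it, from where, and what it is doing.
    echo $row['Id'] . ' ' . $row['User'] . ' ' . $row['Host'] . ' ' . $row['Command'] . ' ' . $row['Time'] . "\n";
}
mysql_close($link);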

Mysql Server has gone away error on PHP script

I've written a script to batch-process domains and retrieve data on each one. It connects to a remote page via cURL and retrieves the data for 30 domains at a time.
This page typically takes 2-3 minutes to load and return the cURL result, at which point the details are parsed and placed into an array (the page_rank_tools function).
When running this script via cron, I keep getting the error "MySQL server has gone away".
Can anyone tell me if I'm missing something obvious that could be causing this?
// script dies after 4 mins in time for next cron to start
set_time_limit(240);

include('../include_prehead.php');

$sql = "SELECT id, url FROM domains WHERE (provider_id = 9 OR provider_id = 10) AND google_page_rank IS NULL LIMIT 30";
$result = mysql_query($sql);
$row = mysql_fetch_assoc($result);
do {
    $url_list[$row['id']] = $row['url'];
} while ($row = mysql_fetch_assoc($result));

// curl domain information page - typically takes about 3 minutes
$pr = page_rank_tools($url_list);

foreach ($pr AS $p) {
    // each domain
    if (isset($p['google_page_rank']) && isset($p['alexa_rank']) && isset($p['links_in_yahoo']) && isset($p['links_in_google'])) {
        $sql = "UPDATE domains SET google_page_rank = '" . $p['google_page_rank'] . "', alexa_rank = '" . $p['alexa_rank'] . "', links_in_yahoo = '" . $p['links_in_yahoo'] . "', links_in_google = '" . $p['links_in_google'] . "' WHERE id = '" . $p['id'] . "'";
        mysql_query($sql) or die(mysql_error());
    }
}
Thanks
CJ
This happens because the MySQL connection has its own timeout, and while you are parsing your pages it expires. You can try to increase this timeout with
ini_set('mysql.connect_timeout', 300);
ini_set('default_socket_timeout', 300);
(as mentioned in MySQL server has gone away - in exactly 60 seconds)
Or just call mysql_connect() again.
Because the cURL call takes so long, you could consider reconnecting to your database before entering the update loop.
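A minimal sketch of that reconnect approach, using the same mysql_* API as the question (mysql_ping checks whether the link is still alive before the UPDATE loop; credentials and database name are placeholders):
// After the long cURL step, make sure the connection is still alive before updating.
if (!mysql_ping()) {
    // The server dropped the idle connection; open a fresh one.
    mysql_close();
    $link = mysql_connect('localhost', 'db_user', 'db_password'); // placeholder credentials
    mysql_select_db('your_database', $link);                      // placeholder database name
}

foreach ($pr AS $p) {
    // ... run the UPDATE queries as before ...
}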
There are many reasons why this error occurs. See the list here; it may be something you can fix quite easily:
MySQL Server has gone away