WordPress site -- Error connecting to database - only on high traffic - MySQL

I have a wordpress 3.5.1 site that's getting a little bit of traffic (80 concurrent visitors, ~5 clicks/pageviews per second). The site runs fine until it spikes to ~85 visitors. There are usually 4 admins logged in at the same time. The site has 8 custom post types--Posts has 3500 posts, and Press Releases (a custom post type) has 13000 posts. All of this is paginated on the site, so there are never more than 20 posts showing on any one page.
The only plugins I'm using are wp-pagenavi and w3totalcache.
I set WP_DEBUG_LOG to true and logged the errors. The main error I'm getting (besides various notices and warnings unrelated to this issue) is that MySQL has reached the max_user_connections limit.
My current max_user_connections is set to 75. I tried to set it higher, but the CPU can't handle the load (4 quad-core CPUs at 2.17GHz each and 4GB of RAM).
What could be causing so many connections?
I've viewed the MySQL processes running when the errors happened, and there are many connections (probably 15-20) that show "SLEEPING" or "SLEEP". I also noticed via the process logs that httpd is restarting quite often during the peak traffic times.
Any ideas on how to resolve the issue?
NOTE: Please do not respond if your answer is to check my wp-config.php file and make sure my username, password, and host are correct. This is NOT the issue. The site works fine under normal traffic.

The database server is dropping out due to high loads. What's in the error logs? What's the size of the database? Have you used mysqltuner.pl to check the query load and caching settings for the MySQL server in my.cnf? https://github.com/rackerhacker/MySQLTuner-perl
WordPress can really bog down the database with large numbers of post/page revisions. Delete them with a plugin such as http://wordpress.org/extend/plugins/revision-control/ -- I've dropped databases to 10% of their original size this way, with a huge resulting increase in performance.
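If you'd rather clean the revisions up directly in MySQL, here is a minimal sketch (it assumes the default wp_ table prefix and that you have a backup; related postmeta rows can be cleaned up separately):
-- Delete all stored post revisions (default wp_ prefix assumed)
DELETE FROM wp_posts WHERE post_type = 'revision';
-- Reclaim the freed space
OPTIMIZE TABLE wp_posts;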

Related

Applications down due to heavy MySQL server load

We have a 2GB DigitalOcean server dedicated to running MySQL for two other PHP servers. We are using Percona MySQL Server 5.6 on it. We configured MySQL replication, and that configuration is working fine.
Our issue is that our site monitoring tools sometimes report that some of the URLs hosted against this server are down (maybe once every week or two). When I check, I can see that the MySQL master server's load is very high (maybe 35-40), so the MySQL server is not responding. At that point I usually restart the MySQL service; the restart brings the server load back to normal, and the sites start working again.
This is the back-end MySQL database server for 20-25 PHP applications (WordPress, Drupal, and some custom applications).
Here are my questions:
Why does the server load go back down on its own after a spike?
Is there a way to tell whether the database is causing the issue, so that I can identify the offending application as well?
How can I identify the root cause of this issue?
Depending upon your working dataset, a 2GB server providing access for 20-25 PHP applications (WordPress, Drupal, and some custom applications) could itself be the issue.
For example, if you have a 1.4GB buffer pool (assuming all tables are InnoDB) and 10GB of data, then your various applications could end up competing for resources such as I/O, buffer pool pages, the Adaptive Hash Index, and the query cache. They could also, assuming caching is used, be invalidating their caches within a similar timeframe, thus sending expensive queries to the database.
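A quick way to see how your data size actually compares to the buffer pool (a sketch; the 1.4GB/10GB figures above were only an example):
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
-- Total data + index size per schema, in MB
SELECT table_schema,
       ROUND(SUM(data_length + index_length) / 1024 / 1024) AS total_mb
FROM information_schema.tables
WHERE engine = 'InnoDB'
GROUP BY table_schema;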
Whilst a load of 50 is something that you would normally want to avoid, the load average is not something you should worry about when viewed in isolation.
The use of the uninterruptible state has since grown in the Linux kernel, and nowadays includes uninterruptible lock primitives. If the load average is a measure of demand in terms of running and waiting threads (and not strictly threads wanting hardware resources), then they are still working the way we want them to.
http://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html
If the issue is happening once per week then it is starting to sound like a batch process, or cache expiration issue - too much happening at once for the resources available.
The best thing to do is to monitor and look for the cause. Since you are already using Percona Server, PMM should give you the insight you need to find it (it also works with Oracle MySQL, MariaDB, Aurora, etc.). You can try a demo to see the kind of insight you can gain:
https://pmmdemo.percona.com. The software is open source and free to use.
You can look in QAN to find the most expensive queries, whilst looking at the Prometheus data for insight into the host itself. There are some recommendations for getting the most from PMM, depending upon your flavour of MySQL.

Lost connection to MySQL server during query on random simple queries

FINAL UPDATE: We fixed this problem by finding a way to accomplish our goals without forking. But forking was the cause of the problem.
---Original Post---
I'm running a Ruby on Rails stack. Our MySQL server is separate, but housed at the same site as our app servers (we've tried swapping it out for a different MySQL server with double the specs, but saw no improvement).
During business hours we get a handful of these errors, from no particular query:
ActiveRecord::StatementInvalid: Mysql2::Error: Lost connection to MySQL server during query
Most of the queries that fail are really simple, and there seems to be no pattern from one query to the next. This all started when I upgraded from Rails 4.1 to 4.2.
I'm at a loss as to what to try. Our database server stays under 5% CPU throughout the day. I get bug reports from users whose random interactions fail because of this, so it's not queries that have been running for hours or anything like that; and of course, when they retry the exact same thing, it works.
Our servers are configured by cloud66.
So, in short: our MySQL server is going away for some reason, but it's not for lack of resources, and it's a brand new server, as we migrated from another server when this problem started.
This also happens to me on localhost while developing features sometimes, so I don't believe it's a load issue.
We're running the following:
ruby 2.2.5
rails 4.2.6
mysql2 0.4.8
UPDATE: per the first answer below I increased our max_connections variable to 500 last night, and confirmed the increase via
show global variables like 'max_connections';
I'm still getting dropped connections; the first one today happened only a few minutes ago...
ActiveRecord::StatementInvalid: Mysql2::Error: Lost connection to MySQL server during query
I ran select * from information_schema.processlist; and I got 36 rows back. Does this mean my app servers were running 36 connections at that moment? or can a process be multiple connections?
UPDATE: I just set net_read_timeout = 60 (it was 30 before). I'll see if that helps.
UPDATE: It didn't help, I'm still looking for a solution...
Here's my database.yml with credentials removed:
production:
  adapter: mysql2
  encoding: utf8
  host: localhost
  database:
  username:
  password:
  port: 3306
  reconnect: true
The connection to MySQL can be disrupted by a number of means, but I would recommend revisiting Mario Carrion's answer, since it's a very wise one.
It seems likely that the connection is disrupted because it's being shared with other processes, causing communication protocol errors...
...this could easily happen if the connection pool is process-bound, which I believe it is in ActiveRecord, meaning that the same connection could be "checked out" a number of times simultaneously in different processes.
The solution is that database connections must be established only AFTER the fork statement in the application server.
I'm not sure which server you're using, but if you're using a warmup feature - don't.
If you're running any database calls before the first network request - don't.
Either of these actions could potentially initialize the connection pool before forking occurs, causing the MySQL connection pool to be shared between processes while the locking system isn't.
I'm not saying this is the only possible reason for the issue; as stated by #sloth-jr, there are other options... but most of them seem less likely given your description.
Sidenote:
I ran select * from information_schema.processlist; and I got 36 rows back. Does this mean my app servers were running 36 connections at that moment? or can a process be multiple connections?
Each process could hold a number of connections. In your case, you might have up to 500 × 36 connections (see edit).
In general, the number of connections in the pool can often be the same as the number of threads in each process (it shouldn't be less than the number of threads, or contention will slow you down). Sometimes it's good to add a few more, depending on your application.
EDIT:
I apologize for ignoring the fact that the process count was referencing the MySQL data and not the application data.
The process count you showed is the MySQL server's data, and MySQL appears to use a thread-per-connection IO scheme, so the "process" list actually counts active connections rather than actual processes or threads (although it should translate to the number of threads as well).
This means that out of the possible 500 connections per application process (i.e., if you're using 8 processes for your application, that would be 8 × 500 = 4,000 allowed connections), your application has only opened 36 connections so far.
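If you want to see where those 36 connections actually come from, grouping the processlist by user and host shows which application (or app server) owns each of them; a small sketch:
SELECT user, host, COUNT(*) AS connections
FROM information_schema.processlist
GROUP BY user, host
ORDER BY connections DESC;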
This indicates a timeout error. It's usually a general resource or connection error.
I would check your MySQL config for max connections on MySQL console:
show global variables like 'max_connections';
And ensure the number of pooled connections used by Rails database.yml is less than that:
pool: 10
Note that database.yml reflects the number of connections that will be pooled by a single Rails process. If you have multiple processes, or other servers like Sidekiq, you'll need to add them all together.
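For example (hypothetical numbers): 4 application processes each using pool: 10, plus one Sidekiq process with a concurrency of 25, can open up to 4 × 10 + 25 = 65 connections, so max_connections needs comfortable headroom above that.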
Increase max_connections if necessary in your MySQL server config (my.cnf), assuming your kit can handle it.
[mysqld]
max_connections = 100
Note that other things might be blocking too, e.g. open files, but looking at connections is a good starting point.
You can also monitor active queries:
select * from information_schema.processlist;
as well as monitoring the MySQL slow log.
One issue may be a long-running UPDATE. If you have a slow command that affects a lot of records (e.g. a whole table), it can block even the simplest queries. That means you may see random queries time out, but if you check the MySQL status, the real cause is a different long-running query.
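To catch such a blocker in the act, you can switch on the slow query log at runtime and list the statements that have been running longest (a sketch; tune the 1-second threshold to taste):
-- Log anything slower than 1 second from now on
SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 1;  -- picked up by new connections
-- Show what is executing right now, longest first
SELECT id, user, db, time, state, LEFT(info, 100) AS query
FROM information_schema.processlist
WHERE command <> 'Sleep'
ORDER BY time DESC
LIMIT 10;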
Things you did not mention but should take a look at:
Are you using Unicorn? If so, are you reconnecting and disconnecting in your after_fork and before_fork hooks?
Is reconnect: true set in your database.yml configuration?
Well, at first glance this sounds like your web server is keeping MySQL sessions open, and sometimes a user runs into a timeout. Try disabling whatever is keeping those MySQL sessions alive (persistent connections).
It will be more of a resource hog, but you're only using 5% anyway...
Other tips:
Enable the MySQL slow query log and take a look at it.
Write a short script that pulls and logs the MySQL processlist every minute, and cross-check the log against the timeouts.
Look at the pool size in your DB connection configuration, or set one!
http://guides.rubyonrails.org/configuring.html#database-pooling
It should line up with the max_connections MySQL is configured to allow!
Good luck!
Find out whether your database is limited in the number of simultaneous connections it allows, because normally a SQL database is supposed to support more than one active connection.
(Contact your network provider.)
Would you mind posting some of your queries? The MySQL documentation has this to say about it:
https://dev.mysql.com/doc/refman/5.7/en/error-lost-connection.html
TL;DR:
Network problems: are any of your boxes renewing leases periodically, or experiencing other network connection errors (netstat / ss), firewall timeouts, etc.? Not sure how managed your hosts are by cloud66...
Query timed out: this can happen if you've got commands backed up behind blocking statements (e.g. ALTERs or locking backups on MyISAM tables). How simple are your queries? No Cartesian products in play? An EXPLAIN on the query could help.
Exceeding max_allowed_packet: are you storing pictures, video content, etc.? (See the quick check below.)
There are lots of possibilities here, and without more information it will be difficult to pinpoint the cause.
I would look first at mysql_error.log, then work your way from the DB server back to your application.
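As for the max_allowed_packet case mentioned above, the relevant packet and timeout settings are quick to check (a sketch):
SHOW VARIABLES LIKE 'max_allowed_packet';
SHOW VARIABLES LIKE 'net_%_timeout';
SHOW VARIABLES LIKE 'wait_timeout';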
UPDATE: this didn't work.
Here's the solution. Special thanks to #Myst for pointing out that forking can cause issues; I had no idea to look at this particular code, and the errors seemed random because we forked in this fashion in several places.
It turns out that when I was forking processes, Rails was using the same database connection for all the forked processes. This created a situation where, when one of the processes (the parent process?) terminated the database connection, the remaining processes would have their connection interrupted.
The solution was to change this code:
def recalculate_completion
  Process.fork do
    if self.course
      self.course.user_groups.includes(user: [:events]).each do |ug|
        ug.recalculate_completion
      end
    end
  end
end
into this code:
def recalculate_completion
  # Give up the parent's connection before forking so the child doesn't share it
  ActiveRecord::Base.remove_connection
  Process.fork do
    # The child opens its own connection...
    ActiveRecord::Base.establish_connection
    if self.course
      self.course.user_groups.includes(user: [:events]).each do |ug|
        ug.recalculate_completion
      end
    end
    # ...and releases it when it is done
    ActiveRecord::Base.remove_connection
  end
  # The parent reconnects for itself after the fork
  ActiveRecord::Base.establish_connection
end
Making this change stopped the errors from our servers, and everything appears to be working well now. If anyone has more information as to why this worked, I would be happy to hear it, as I would like a deeper understanding of it.
Edit: it turns out this didn't work either... we still got dropped connections, but not as often.
If you have the query cache enabled, reset it and it should work:
RESET QUERY CACHE;

MediaWiki randomly cannot access the database

I have MediaWiki running on a VPS with 256MB of RAM. From time to time I get the following page:
Sorry! This site is experiencing technical difficulties.
Try waiting a few minutes and reloading.
(Cannot access the database: Connection refused (localhost))
Would this be because of low RAM? I turned on PHP error reporting in .htaccess but don't get any further info.
I upgraded the VPS's RAM from 256MB to 768MB and no longer see the error.

MySQL hanging in Writing to Net

I have a problem where MySQL threads sometimes get stuck in the "Writing to net" status.
I have 4 Apache (2.4) servers (requests are load-balanced across them) and 1 MySQL server (MariaDB 10). Apache is executing PHP 5.6. All Apache servers have the same configuration. All servers run CentOS 7. SELinux is disabled on the Apache servers for debugging reasons. There are no problems in the audit logs on the DB server. All servers are virtual and located on the same cluster (VMware).
The problem appears only on specific pages and specific queries to the DB.
Usually there are around 100-200 separate queries on a page, and most of them take 0.0001-0.0010 s. But then I have one query that takes around 1-2 seconds, even though the query itself takes much less time (around 0.0045 s).
The problematic query returns around 8984 rows, and when executed from the CLI by a debug script it runs as fast as expected.
The strange thing is that at times some Apache servers execute that page quickly and some slowly, and this changes during the day. I also tried removing one Apache server from the cluster and then sending the same request; if the server is not under any load, it usually responds fast.
All servers have enough resources (CPU and RAM), so it is definitely not a load issue. They usually have around 4-10 active Apache workers (prefork) and have capacity for 100 active workers.
I tried debugging with tcpdump, and when requesting the page I can see the packet flow for the fast queries, then it stops for a while and resumes. I'm not sure whether the problem is on the MySQL server or on the Apache servers.
My guess is that I'm hitting some kind of limit, but I have no idea which one.
The solution is quite odd.
First few more details:
All Apache servers have the same application data (PHP files, images, etc.) mounted from NFS. The NFS share was working fine (low latency, no data corruption).
Solution:
When I was desperate, I went through every possible log. Then I noticed that iptables was dropping some packets from the NFS server. I told myself I should probably fix that, even though it didn't seem related.
But after I allowed all traffic from the NFS server to my Apache servers, the MySQL "Writing to net" status disappeared and all websites started to respond quickly.
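For anyone chasing a similar stall, one quick way to confirm whether threads really are piling up in that state on the MySQL side (rather than in Apache) is to watch the processlist for it; a minimal sketch:
SELECT id, user, host, db, time, LEFT(info, 100) AS query
FROM information_schema.processlist
WHERE state = 'Writing to net'
ORDER BY time DESC;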

MySQL database is loading very slow

I have a WordPress installation with over 5000 posts in it. For the last few days, the database has been loading very slowly. I have used a database optimization plugin and also optimized the DB tables from the back end, but the issue still exists; after I restart the MySQL server, everything seems to work fine again. The same issue exists on my other website, a Joomla installation on another server (Amazon) with over 50,000 articles. There, too, I regularly do database optimization, but the pages still load very slowly, sometimes taking over a minute. Both sites use page caching, yet I am still getting this issue. There are other websites running on the same server, but they are relatively new, with less content than these two, and they load very fast. The problem is with these two websites only.
Check out:
The connection pool. Maybe you're running out of connections and getting bottlenecked.
The server's caches.
A slow-query audit.
... these are just hints; to get real help you should provide more information about your system's performance (a few quick checks are sketched below).
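A handful of server-side checks along those lines (a sketch; run them while the site is slow):
SHOW GLOBAL STATUS LIKE 'Threads_connected';
SHOW GLOBAL STATUS LIKE 'Max_used_connections';
SHOW GLOBAL STATUS LIKE 'Slow_queries';
SHOW GLOBAL VARIABLES LIKE 'max_connections';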
I'm not sure about the WordPress site's configuration settings, but for Joomla, if you are not already using the MySQLi driver, I would recommend enabling it in the Global Configuration. Try clearing the site cache, and if you are running Joomla 1.5 then, if possible, upgrade to Joomla 2.5, as it is a little faster when running queries. In the end, it might just be because you are on a shared host, so if you are willing, a VPS would speed things up.
There can be many reasons for this, for example: a hosting server issue (use a dedicated server for such sites), not using a good cache plugin (good ones are available for both Joomla and WordPress), or not using DB optimization plugins.
You can also refer to the following sites:
http://www.sparringmind.com/speed-up-wordpress/
http://wpmu.org/too-many-wordpress-plugins/
http://www.joomlaperformance.com/articles/performance/so_you_want_to_speed_up_joomla_3_14.html