I have a SELECT query that takes around 2 minutes to run, and it is causing our app to hang on the new cloud DB we migrated to. The new cloud DB has only 3.5 GB of memory and 1 vCPU.
On our old VM DB, which has around 16 GB of memory, the same query takes only 0.6 seconds.
The SELECT query sometimes drives CPU usage to 100%, and it looks like other queries don't get executed while this long-running query is running.
569 rows in set (1 min 52.23 sec)
Is there anything I can configure in my.cnf to get better results and, mainly, to prevent the app from hanging? These are the only settings I have right now:
open_files_limit = 102400
max_connections = 5000
innodb_flush_log_at_trx_commit = 0
innodb_thread_concurrency = 8
log_bin_trust_function_creators=1
innodb_buffer_pool_size=2800M
innodb_log_file_size=600M
innodb_rollback_on_timeout=ON
innodb_log_buffer_size=16M
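For what it's worth, a couple of these values look oversized for a 3.5 GB / 1 vCPU instance. A hedged sketch of what I would try instead, assuming a dedicated database box (the values are illustrative, and none of this replaces the index rebuild described in the update below):
max_connections = 200                # 5000 is far more than 1 vCPU can serve; each connection costs memory
innodb_buffer_pool_size = 2400M      # leave headroom for per-connection buffers and the OS
innodb_flush_log_at_trx_commit = 1   # 0 can lose about a second of commits on a crash; keep 0 only if that's acceptable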
It's a query that returns the number of friends. Some users might have around 600 friends, and fetching that list is what's causing the issue. We can't change the query at the moment since it's hardcoded into the app, but looking at the query, it seems optimized.
Update: I had to rebuild the indexes, and that fixed the issue. After the dump was imported I ran mysqlcheck database_name -p --optimize, and the issue was resolved.
The query is now performing well, CPU usage has decreased, memory is being used as expected, and query caching is also working.
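For reference, mysqlcheck --optimize issues OPTIMIZE TABLE against each table; on InnoDB that rebuilds the table and its indexes and refreshes the index statistics. A per-table equivalent from the MySQL console (the table name is illustrative):
OPTIMIZE TABLE friends;
-- or, to refresh the optimizer statistics without a full rebuild:
ANALYZE TABLE friends;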
FINAL UPDATE: Forking was the cause of the problem. We fixed it by finding a way to accomplish our goals without forking.
---Original Post---
I'm running a Ruby on Rails stack; our MySQL server is separate, but housed at the same site as our app servers. (We've tried swapping it out for a different MySQL server with double the specs, but no improvement was seen.)
During business hours we get a handful of these, from no particular query:
ActiveRecord::StatementInvalid: Mysql2::Error: Lost connection to MySQL server during query
Most of the queries that fail are really simple, and there seems to be no pattern between one query and another. This all started when I upgraded from Rails 4.1 to 4.2.
I'm at a loss as to what to try. Our database server stays below 5% CPU throughout the day. I do get bug reports from users who have random interactions fail due to this, so it's not queries that have been running for hours or anything like that; of course, when they retry the exact same thing, it works.
Our servers are configured by cloud66.
So in short: our MySQL server is going away for some reason, but it's not for lack of resources. It's also a brand-new server; we migrated from another server when this problem started.
This also happens to me on localhost while developing features sometimes, so I don't believe it's a load issue.
We're running the following:
ruby 2.2.5
rails 4.2.6
mysql2 0.4.8
UPDATE: per the first answer below I increased our max_connections variable to 500 last night, and confirmed the increase via
show global variables like 'max_connections';
I'm still getting dropped connections; the first one today was dropped only a few minutes ago...
ActiveRecord::StatementInvalid: Mysql2::Error: Lost connection to MySQL server during query
I ran select * from information_schema.processlist; and I got 36 rows back. Does this mean my app servers were running 36 connections at that moment, or can a process hold multiple connections?
UPDATE: I just set net_read_timeout = 60 (it was 30 before). I'll see if that helps.
UPDATE: It didn't help, I'm still looking for a solution...
Here's my database.yml with credentials removed:
production:
  adapter: mysql2
  encoding: utf8
  host: localhost
  database:
  username:
  password:
  port: 3306
  reconnect: true
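Note there's no pool: entry here, so Rails falls back to its default of 5 connections per process; to make it explicit (the value below is just that default):
  pool: 5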
The connection to MySQL can be disrupted by a number of means, but I would recommend revisiting Mario Carrion's answer, since it's very sound advice.
It seems likely that the connection is disrupted because it's being shared with other processes, causing communication protocol errors...
...this could easily happen if the connection pool is process-bound, which I believe it is in ActiveRecord, meaning that the same connection could be "checked out" a number of times simultaneously in different processes.
The solution is that database connections must be established only AFTER the fork statement in the application server.
I'm not sure which server you're using, but if you're using a warmup feature - don't.
If you're running any database calls before the first network request - don't.
Either of these actions could potentially initialize the connection pool before forking occurs, causing the MySQL connection pool to be shared between processes while the locking system isn't.
I'm not saying this is the only possible reason for the issue; as stated by @sloth-jr, there are other options... but most of them seem less likely according to your description.
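To illustrate connecting only after the fork: if your app server preloads the app and then forks (for example, Puma with preload_app! -- an assumption, since the question doesn't name the server), the usual shape is a pair of hooks like this:
# config/puma.rb -- a minimal sketch, assuming Puma with preload_app!
before_fork do
  # drop the preloaded pool so no child inherits the parent's sockets
  ActiveRecord::Base.connection_pool.disconnect!
end

on_worker_boot do
  # each worker opens its own fresh connections after forking
  ActiveRecord::Base.establish_connection
end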
Sidenote:
I ran select * from information_schema.processlist; and I got 36 rows back. Does this mean my app servers were running 36 connections at that moment? or can a process be multiple connections?
Each process could hold a number of connections. In your case, you might have up to 500X36 connections. (see edit)
In general, the number of connections in the pool can often be the same as the number of threads in each process (it shouldn't be less than the number of threads, or contention will slow you down). Sometimes it's good to add a few more, depending on your application.
EDIT:
I apologize for ignoring the fact that the process count was referencing the MySQL data and not the application data.
The process count you showed is the MySQL server data, which seems to use a thread per connection IO scheme. The "Process" data actually counts active connections and not actual processes or threads (although it should translate to the number of threads as well).
This means that out of a possible 500 connections per application process (i.e., if you're using 8 processes for your application, that would be 8 x 500 = 4,000 allowed connections), your application has only opened 36 connections so far.
This indicates a timeout error. It's usually a general resource or connection error.
I would check your MySQL server's max_connections setting from the MySQL console:
show global variables like 'max_connections';
And ensure that the number of pooled connections used in your Rails database.yml is less than that:
pool: 10
Note that database.yml reflects the number of connections that will be pooled by a single Rails process. If you have multiple processes or other servers like Sidekiq, you'll need to add them together.
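A quick worked example, with assumed numbers: two app servers, each running 4 Rails processes with pool: 10, plus one Sidekiq process with concurrency 25, can open up to
2 servers x 4 processes x 10 pooled connections = 80
1 Sidekiq process x 25 threads = 25
80 + 25 = 105 connections in the worst case
so max_connections should sit comfortably above that, leaving room for console sessions and monitoring.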
Increase max_connections if necessary in your MySQL server config (my.cnf), assuming your kit can handle it.
[mysqld]
max_connections = 100
Note that other things might be blocking too, e.g. open files, but looking at connections is a good starting point.
You can also monitor active queries:
select * from information_schema.processlist;
as well as monitoring the MySQL slow log.
One issue may be a long-running UPDATE command. If you have a slow-running command that affects a lot of records (e.g. a whole table), it might be blocking even the simplest queries. This means you could see random queries time out, but if you check the MySQL status, the real cause is another long-running query.
Things you did not mention but should take a look at:
Are you using Unicorn? If so, are you reconnecting and disconnecting in your after_fork and before_fork hooks?
Is reconnect: true set in your database.yml configuration?
Well, at first glance this sounds like your webserver is keeping MySQL sessions open and sometimes a user runs into a timeout. Try disabling the keep-alive for MySQL sessions.
It will be a hog, but you only use 5%...
Other tips:
Enable the MySQL slow query log and take a look.
Write a short script that pulls and logs the MySQL processlist every minute, and cross-check that log with the timeouts (a sketch follows after this list).
Look at the pool size in your DB connection, or set one!
http://guides.rubyonrails.org/configuring.html#database-pooling
It should be equal to the max_connections MySQL likes to have!
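For the processlist-logging tip above, a minimal sketch using the mysql2 gem already in the stack (credentials, log path, and interval are assumptions):
# log_processlist.rb -- append the current processlist once a minute
require "mysql2"

client = Mysql2::Client.new(host: "localhost", username: "monitor", password: "secret")
loop do
  rows = client.query("SHOW FULL PROCESSLIST").to_a
  File.open("/var/log/mysql-processlist.log", "a") do |f|
    f.puts "#{Time.now} #{rows.inspect}"
  end
  sleep 60
end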
Good luck!
Find out if your database is limited in terms of simultaneous connections, because normally an SQL database is supposed to support more than one active connection.
(Contact your network provider.)
Would you mind posting some of your queries? The MySQL documentation has this to say about it:
https://dev.mysql.com/doc/refman/5.7/en/error-lost-connection.html
TL;DR:
Network problems: are any of your boxes renewing DHCP leases periodically, or experiencing other network connection errors (netstat / ss), firewall timeouts, etc.? Not sure how managed your hosts are by cloud66...
Query timed out: this can happen if you've got commands backed up behind blocking statements (e.g. ALTERs or locking backups on MyISAM tables). How simple are your queries? No Cartesian products in play? EXPLAINing the query could help (see the checks after this list).
Exceeding max_allowed_packet: are you storing pictures, video content, etc.?
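For the last two points, a couple of quick checks from the MySQL console (the query is illustrative):
EXPLAIN SELECT * FROM users WHERE email = 'user@example.com';
SHOW GLOBAL VARIABLES LIKE 'max_allowed_packet';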
There are lots of possibilities here, and without more information it will be difficult to pinpoint.
I would look first at mysql_error.log, then work from the DB server back to your application.
UPDATE: this didn't work.
Here's the solution. Special thanks to @Myst for pointing out that forking can cause issues; I had no idea to look at this particular code. The errors seemed random because we forked in this fashion in several places.
It turns out that when I was forking processes, Rails was using the same database connection for all forked processes. This created a situation where, when one of the processes (the parent process?) terminated the database connection, the remaining processes would have their connection interrupted.
The solution was to change this code:
def recalculate_completion
  Process.fork do
    if self.course
      self.course.user_groups.includes(user: [:events]).each do |ug|
        ug.recalculate_completion
      end
    end
  end
end
into this code:
def recalculate_completion
  # Hand back the parent's connection so the child cannot inherit the same socket
  ActiveRecord::Base.remove_connection
  Process.fork do
    # The child opens its own, separate connection
    ActiveRecord::Base.establish_connection
    if self.course
      self.course.user_groups.includes(user: [:events]).each do |ug|
        ug.recalculate_completion
      end
    end
    # Close the child's connection cleanly before the forked process exits
    ActiveRecord::Base.remove_connection
  end
  # The parent reconnects once the child has been forked off
  ActiveRecord::Base.establish_connection
end
Making this change stopped the errors from our servers, and everything appears to be working well now. If anyone has more info as to why this worked, I would be happy to hear it, as I would like to have a deeper understanding of this.
Edit: it turns out this didn't work either... we still got dropped connections, though not as often.
If you have the query cache enabled, please reset it and it should work.
RESET QUERY CACHE;
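For what it's worth, RESET QUERY CACHE empties the cache entirely, while FLUSH QUERY CACHE only defragments it without dropping entries:
RESET QUERY CACHE;   -- remove every cached result
FLUSH QUERY CACHE;   -- defragment, keeping cached results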
I host my CakePHP 1.3.x application on a shared host (HostMonster). I received DNS errors from Google's Webmaster Tools, and on contacting my host's technical support, they indicated that CPU throttling was occurring on my account; they pointed me to this document about CPU throttling.
Following that document, I checked tmp/mysql_slow_queries and found some queries taking more than 2 seconds, some of them as simple as:
# Sat Dec 14 02:00:38 2013
# Query_time: 3.286778 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
use twoindex_quran;
SET timestamp=1387011638;
SET NAMES utf8
I need to know why CakePHP issues a query such as SET timestamp and how I can prevent CakePHP from making such a query. I'd also like to know what makes such a simple query slow.
There are a few things that may be worth noting:
You may need to upgrade CakePHP to use the PDO style of PHP MySQL connections, since the older mysql_* connect functions are being deprecated.
Check the version of PHP your host is using. Make sure there is nothing strange between the host's PHP version and CakePHP's PHP version requirements. Did the PHP version change recently, causing these issues where they didn't exist before? If so, can you alter the .htaccess and use the previous version, or did they cease to support it?
A three-second query should not cause a throttle condition on the host. The query information you are listing does not look like a CakePHP-specific query; it looks like the PHP connection connecting to a specific database. There is nowhere in the code I know of that calls SET timestamp or SET NAMES. Maybe someone can enlighten us on that?
Recently, I moved from standard MySQL to Percona, and used the Percona Wizard to generate my.cnf.
However, I can see that, by default, the generated my.cnf uses query_cache_type = 0 (the query cache is disabled).
The only thing I run on the server is a WordPress blog. My questions are:
May I enable the query cache?
There are some WordPress plugins that offer database caching. Is the result similar to enabling the query cache?
The MySQL query cache is a cache mechanism that stores the text of a query (e.g. SELECT * FROM users WHERE deleted = 0) together with its result in memory. Please check this link to learn how to enable the MySQL query cache on your server.
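A minimal sketch of enabling it in my.cnf, assuming MySQL or Percona Server 5.6 or earlier (the sizes are illustrative, and note the query cache was removed entirely in MySQL 8.0):
[mysqld]
query_cache_type  = 1     # cache cacheable SELECTs unless SQL_NO_CACHE is given
query_cache_size  = 64M   # total memory for cached result sets
query_cache_limit = 1M    # skip result sets larger than this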
The WordPress DB cache plugins, on the other hand, decrease the number of queries to the DB by caching query results in temp files (check your cache directory, wp-content/tmp/, for cache files).
The two paragraphs above show that the WordPress DB cache and the MySQL query cache are different things.
You should enable the MySQL query cache ONLY IF your site does more MySQL reads than writes. Since yours is a WordPress site, YES, you can try enabling the MySQL query cache.
Hope I answered your 2 questions.
Recently we changed the app server of our Rails website from Mongrel to Passenger [with REE and Rails 2.3.8]. The production setup has 6 machines pointing to a single MySQL server and a memcached server. Before, each machine had 5 Mongrel instances; now we have 45 Passenger instances per machine, as each machine has 16 GB of RAM and two 4-core CPUs. Once we deployed this Passenger setup to production, the website became very slow, all the requests started to queue up, and eventually we had to roll back.
Now we suspect that the cause is the increased load on the MySQL server. Before, there were only 30 MySQL connections; now we have 275. The MySQL server has a similar setup to our website machines, but all the configs were left at their default limits. The buffer pool size is only 8 MB even though we have 16 GB of RAM, and the number of concurrent threads is 8.
Would this increase in simultaneous connections to MySQL have caused it to respond more slowly than when we had only 30 connections? If so, how can we make MySQL perform better with 275 simultaneous connections in place?
Any advice greatly appreciated.
UPDATE:
More information on the mysql server:
RAM: 16 GB. CPU: two processors, each with 4 cores.
The tables are InnoDB, with only default InnoDB config values.
Thanks
An idle MySQL connection uses up a stack and a network buffer on the server. That is worth about 200 KB of memory and zero CPU.
On a database server using only InnoDB, you should edit /etc/sysctl.conf to include vm.swappiness = 0 to delay swapping out processes as long as possible. You should then increase innodb_buffer_pool_size to about 80% of the system's memory, assuming a dedicated database server machine. Make sure the box does not swap; that is, VSIZE should not exceed system RAM.
innodb_thread_concurrency can be set to 0 (unlimited), or to 32-64 if you are a bit paranoid, assuming MySQL 5.5. The limit is lower in 5.1, and around 4-8 in MySQL 5.0. It is not recommended to use such outdated versions of MySQL on a machine with 8 or 16 cores; there are huge improvements with respect to concurrency in MySQL 5.5 with InnoDB 1.1.
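Putting that advice into my.cnf form, a sketch assuming a dedicated 16 GB box on MySQL 5.5 (the values are illustrative):
[mysqld]
innodb_buffer_pool_size   = 12G   # roughly 80% of RAM on a dedicated server
innodb_thread_concurrency = 0     # unlimited on 5.5; 32-64 if you are a bit paranoid

# and in /etc/sysctl.conf:
# vm.swappiness = 0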
The variable thread_concurrency has no meaning on a current Linux. It is used to call pthread_setconcurrency(), which does nothing on Linux; it used to have a function on older Solaris/SunOS.
Without further information, the cause of your performance problems cannot be determined with any certainty, but the above general advice may help. More general advice geared at my limited experience with Ruby can be found at http://mysqldump.azundris.com/archives/72-Rubyisms.html . That article is the summary of a consulting job I once did for an early version of a very popular Facebook application.
UPDATE:
According to http://pastebin.com/pT3r6A9q , you are running 5.0.45-community-log, which is awfully old and does not perform well under concurrent load. Use a current 5.5 build, it should perform way better than what you have there.
Also, fix innodb_buffer_pool_size. You are going nowhere with only 8 MB of pool here.
While you are at it, innodb_file_per_table should be ON.
Do not switch on innodb_flush_log_at_trx_commit = 2 without understanding what it means (the log is then flushed to disk only about once a second, so an OS crash can lose up to a second of committed transactions), but it may help you temporarily, depending on your persistence requirements. It is not a permanent solution to your problems in any way, though.
If you have any substantial amount of writes going on, you need to review innodb_log_file_size and innodb_log_buffer_size as well.
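As a rough illustration of those write-path settings (the values are assumptions, not a recommendation):
innodb_log_file_size   = 256M   # on 5.0/5.5, change this only after a clean shutdown, with the old ib_logfile* moved away
innodb_log_buffer_size = 16M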
If that installation is earning money, you dearly need professional help. I am no longer doing this as a profession, but I can recommend people. Contact me outside of Stack Overflow if you want.
UPDATE:
According to your processlist, you have very many queries in the state Sending data. MySQL is in this state while a query is being executed, that is, the main interior join loop/query execution loop is busy. SHOW ENGINE INNODB STATUS\G will show you something like:
...
--------------
ROW OPERATIONS
--------------
3 queries inside InnoDB, 0 queries in queue
...
If that number is larger than say 4-8 (inside InnoDB), 5.0.x is going to have trouble. 5.5.x will perform a lot better here.
Regarding the my.cnf: See my previous comments on your InnoDB. See also my comments on thread_concurrency (without innodb_ prefix):
# On Linux, this does exactly nothing.
thread_concurrency = 8
You are missing any InnoDB configuration at all. Assuming that you ARE using InnoDB tables, you are not performing well, no matter what you do.
As far as I know, it's unlikely that merely maintaining/opening the connections would be the problem. Are you seeing this issue even when the site is idle?
I'd try http://www.quest.com/spotlight-on-mysql/ or similar to see if it's really your database that's the bottleneck here.
In the past, I've seen basic networking craziness lead to behaviour similar to what you describe; in one case, someone had set up the new machines with an incorrect netmask.
Have you looked at any of the machine statistics on the database server? Memory/CPU/disk IO stats? Is the database server struggling?