Application details:
Rails 3.1.0
Ruby 1.9.2
unicorn 4.2.0
resque 1.20.0
nginx/1.0.14
redis 2.4.8
I am using the active_admin gem. All URLs return a 200 response, but this one URL gives a 502 error in production.
rake routes:
admin_links GET /admin/links(.:format) {:action=>"index", :controller=>"admin/links"}
It works locally (in development).
Localhost log (response code 200):
Started GET "/admin/links" for 127.0.0.1 at 2013-02-12 11:05:21 +0530
Processing by Admin::LinksController#index as */*
Parameters: {"link"=>{}}
Geokit is using the domain: localhost
AdminUser Load (0.2ms) SELECT `admin_users`.* FROM `admin_users` WHERE `admin_users`.`id` = 3 LIMIT 1
(0.1ms) SELECT 1 FROM `links` LIMIT 1 OFFSET 0
(0.1ms) SELECT COUNT(*) FROM `links`
(0.2ms) SELECT COUNT(count_column) FROM (SELECT 1 AS count_column FROM `links` LIMIT 10 OFFSET 0) subquery_for_count
CACHE (0.0ms) SELECT COUNT(count_column) FROM (SELECT 1 AS count_column FROM `links` LIMIT 10 OFFSET 0) subquery_for_count
Link Load (0.6ms) SELECT `links`.* FROM `links` ORDER BY `links`.`id` desc LIMIT 10 OFFSET 0
Link Load (6677.2ms) SELECT `links`.* FROM `links`
Rendered /usr/local/rvm/gems/ruby-1.9.2-head/gems/activeadmin-0.4.2/app/views/active_admin/resource/index.html.arb (14919.0ms)
Completed 200 OK in 15663ms (Views: 8835.0ms | ActiveRecord: 6682.8ms | Solr: 0.0ms)
Production log (502 response):
Started GET "/admin/links" for 103.9.12.66 at 2013-02-12 05:25:37 +0000
Processing by Admin::LinksController#index as */*
Parameters: {"link"=>{}}
Nginx error log:
2013/02/12 07:36:16 [error] 32401#0: *1948 upstream prematurely closed connection while reading response header from upstream
I don't know what's happening; could somebody help me out?
You have a timeout problem.
Tackling it
HTTP/1.1 502 Bad Gateway
This indicates that nginx had a problem talking to its configured upstream.
http://en.wikipedia.org/wiki/List_of_HTTP_status_codes#502
2013/02/12 07:36:16 [error] 32401#0: *1948 upstream prematurely closed connection while reading response header from upstream
The nginx error log tells you that nginx was actually able to connect to the configured upstream, but the upstream process closed the connection before the answer was (fully) received.
Your development environment:
Completed 200 OK in 15663ms
Apparently you need around 15 seconds to generate the response on your development machine.
In contrast to proxy_connect_timeout, this timeout will catch a server
that puts you in its connection pool but does not respond to you with
anything beyond that. Be careful though not to set this too low, as
your proxy server might take a longer time to respond to requests on
purpose (e.g. when serving you a report page that takes some time to
compute). You are able though to have a different setting per
location, which enables you to have a higher proxy_read_timeout for
the report page's location.
http://wiki.nginx.org/HttpProxyModule#proxy_read_timeout
On the nginx side, proxy_read_timeout is at its default of 60 seconds, so that's safe.
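If a particular action legitimately needs longer than that, the timeout can be raised for just its location, roughly like this (a minimal sketch; the upstream name, the location and the 120s value are assumptions, not taken from your config):

    # inside the server {} block that proxies to unicorn
    location /admin/links {
        proxy_pass         http://unicorn_app;   # hypothetical upstream name
        proxy_read_timeout 120s;                 # default is 60s
    }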
I have no idea how Ruby (on Rails) works internally; check its error log, because the timeout happens in that part of your stack.
Related
I have created a web application using JSP/Tiles/Struts/MySQL/Tomcat. I created a new project on the OpenShift 3 console (OpenShift Online), https://console.preview.openshift.com/console/, then added Tomcat/MySQL. I was getting a 503 error sometimes, while at other times the same page worked as expected. The 503 error came randomly for any page in my project. When I get a 503 error, I refresh a few times and it goes away, and my page is displayed correctly.
Error that I see is:
"503 Service Unavailable
No server is available to handle this request. "
I did some research:
What I understand from this openshift 2 link:
https://blog.openshift.com/how-to-host-your-java-ee-application-with-auto-scaling/
is that to correct the 503 error, you should:
SSH into your application gear using rhc ssh --app <app_name>
Change directory to haproxy/conf
Change option httpchk GET / to option httpchk GET /api/v1/ping in haproxy.cfg (see the excerpt after these steps)
Restart the HAProxy cartridge from your local machine using rhc cartridge-restart --cartridge haproxy
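For reference, after step 3 the relevant lines in haproxy.cfg would look roughly like this (the backend name is only illustrative; it differs per setup):

    # haproxy.cfg excerpt (backend name is hypothetical)
    backend express
        option httpchk GET /api/v1/ping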
I don't know if this is also applicable to OpenShift 3. In OpenShift 3, where are haproxy.log, haproxy.cfg and haproxy/conf, or is it slightly different in OpenShift 3? (But thanks to Warren's comments, yes, he saw a 503 error in OpenShift related to HAProxy.)
Now, one week after posting this question:
I am getting a Quota Reached error. I am able to build my project, but all deployments are failing. I wonder if the 503 error I was getting earlier was (completely or partially) related to the quota being reached. How should I proceed now?
curl -i localhost:8080/GEA
HTTP/1.1 302 Found
Server: Apache-Coyote/1.1
Location: http://localhost:8080/GEA/
Transfer-Encoding: chunked
Date: Tue, 11 Apr 2017 18:03:25 GMT
Tomcat logs do not show any application error.
Will readiness and liveness probes help me? I have not set them yet,
nor do I know how to set them.
Will scaling help me? (I don't know how to set that up either.)
Do I have to set memory and the other resources to the maximum allowed to ensure the project runs smoothly?
I had a similar situation: sometimes getting 503s and sometimes getting my actual page. The reason is that HAProxy sits on the frontend handling the requests. Depending on your setup you may even have a few HAProxy pods, and your request could be funneled to any one of them. So, as in my case, one pod was working and the other was not.
So basically
oc get pods -n default
NAME READY STATUS RESTARTS AGE
docker-registry-7-i02rh 1/1 Running 0 75d
registry-console-12-wciib 1/1 Running 0 67d
router-1-533cg 1/1 Running 3 76d
router-1-9utld 1/1 Running 1 76d
router-1-uwf64 1/1 Running 1 76d
As you can see in my output, the default namespace is where my router (HAProxy) pods live. If I change to that namespace
oc project default
Then run
oc logs -f router-1-533cg
on each of the pods, you will most likely find a specific pod that is misbehaving. You can simply delete it, and the replication controller will create a new one.
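For example, using the pod name from the listing above:

    # delete the misbehaving router pod; its replication controller recreates it
    oc delete pod router-1-533cg -n default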
I am developing an application on Rails and am having trouble with performance. Below I have described and provided all the technologies, configurations and methods I used. I tried many approaches to solve this problem, but nothing helped; this is also described below. Can anyone see what the problem is? Thank you!
The problem
Let's open the site several times and look at the log:
INFO -- : Completed 200 OK in 171ms (Views: 149.3ms | ActiveRecord: 7.3ms)
INFO -- : Completed 200 OK in 217ms (Views: 183.7ms | ActiveRecord: 8.9ms)
INFO -- : Completed 200 OK in 221ms (Views: 188.2ms | ActiveRecord: 11.7ms)
INFO -- : Completed 200 OK in 165ms (Views: 143.3ms | ActiveRecord: 7.1ms)
Fine.
Now let's generate load with wrk -t10 -c10 -d10s http://example.com and look at the log again:
INFO -- : Completed 200 OK in 178ms (Views: 157.6ms | ActiveRecord: 8.1ms)
INFO -- : Completed 200 OK in 270ms (Views: 241.6ms | ActiveRecord: 8.8ms)
INFO -- : Completed 200 OK in 505ms (Views: 460.7ms | ActiveRecord: 12.0ms)
INFO -- : Completed 200 OK in 501ms (Views: 350.8ms | ActiveRecord: 136.0ms)
INFO -- : Completed 200 OK in 777ms (Views: 468.1ms | ActiveRecord: 269.1ms)
INFO -- : Completed 200 OK in 936ms (Views: 624.7ms | ActiveRecord: 285.1ms)
...
INFO -- : Completed 200 OK in 881ms (Views: 617.4ms | ActiveRecord: 251.6ms)
...
INFO -- : Completed 200 OK in 3289ms (Views: 2614.2ms | ActiveRecord: 326.4ms)
INFO -- : Completed 200 OK in 3369ms (Views: 2624.0ms | ActiveRecord: 409.4ms)
INFO -- : Completed 200 OK in 3203ms (Views: 2369.2ms | ActiveRecord: 382.1ms)
...
INFO -- : Completed 200 OK in 431ms (Views: 281.1ms | ActiveRecord: 126.3ms)
...
What are these terrible slowdowns?
wrk -t10 -c10 -d10s http://example.com
Running 10s test @ http://example.com
10 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.26s 96.38ms 1.39s 75.00%
Req/Sec 0.00 0.00 0.00 100.00%
36 requests in 10.09s, 0.97MB read
Socket errors: connect 0, read 0, write 0, timeout 32
Requests/sec: 3.57
Transfer/sec: 98.62KB
Rails
I am using the latest Rails version, 4.2.4.
bundle list output - github gist
application.rb - github gist
production.rb - github gist
database.yml - github gist
I am using acts_as_paranoid for all my models. (WHERE deleted_at IS NULL)
All N+1 queries are eliminated (analyzed them manually and with bullet gem).
All tables are InnoDB and have a lot of indices. 99% of the tables have 50 rows or fewer, the remaining 1% have 200 or fewer. That's a very small amount of data.
The page I am benchmarking requires around 30-35 SQL queries and renders around 30 partials.
Queries using...
- COUNT - 0
- GROUP BY - 1
- ORDER BY - 10
- LIMIT - 10
- JOIN - 1 (joins one table)
- Use of MAX, MIN, CONCAT and other functions - NO
My templates are written in Slim.
Server configuration
CPU - one Intel Sandy Bridge core
RAM - 1GB
Disk Type - SSD
Operating System
Ubuntu 14.04 LTS 64bit
Swap 1GB. Optimized swap settings:
cat /proc/sys/vm/swappiness => 10
cat /proc/sys/vm/vfs_cache_pressure => 50
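For reference, these can be made persistent like this (a sketch of the usual way, not my exact config):

    # /etc/sysctl.conf
    vm.swappiness = 10
    vm.vfs_cache_pressure = 50
    # apply without a reboot: sudo sysctl -p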
Database
MySQL 5.5. Tried to optimize with mysqltuner and tuning-primer.
Current configuration: github gist
Latest mysqltuner report: perl mysqltuner.pl --buffers --dbstat --outputfile result_mysqltuner.txt github gist
Latest tuning-primer report: ./tuning-primer.sh github gist
I have noticed the warning "Temporary tables created on disk". That's because I have TEXT columns in MySQL. Googling, I found that mounting MySQL's temporary directory in memory is a good idea, so I did it: tmpdir /var/mysqltmp. Followed this guide.
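Roughly, that setup looks like this (a sketch of the usual approach, not my exact entries; adjust the size, uid and gid to your system):

    # /etc/fstab - keep MySQL's temp directory in RAM
    tmpfs /var/mysqltmp tmpfs rw,uid=106,gid=110,size=256M,nr_inodes=10k,mode=0700 0 0

    # /etc/mysql/my.cnf
    [mysqld]
    tmpdir = /var/mysqltmp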
mytop output:
Queries: 37.4k qps: 0 Slow: 1.9k Se/In/Up/De(%): 91/00/00/00
Sorts: 0 qps now: 1 Slow qps: 0.0 Threads: 4 ( 1/ 2) 00/00/00/00
Cache Hits: 29.0k Hits/s: 0.4 Hits now: 0.0 Ratio: 85.5%
Ratio now: 0.0%
Key Efficiency: 99.4% Bps in/out: 75.8/ 1.9k Now in/out: 22.6/ 2.1k
Id User Host/IP DB Time Cmd State Query
-- ---- ------- -- ---- --- ----- ----------
1444 root localhost site 73737 Sleep
2337 root localhost site 658 Sleep
2340 root localhost site 322 Sleep
2342 root localhost 0 Query show full processlist
Ruby interpreter
Rbenv: rbenv -v => rbenv 0.4.0-154-g9e664b5
Ruby: ruby -v => ruby 2.2.3p173 (2015-08-18 revision 51636) [x86_64-linux]
My ruby is compiled with jemalloc (better memory allocation):
RbConfig::CONFIG['LIBS'] => "-lpthread -ljemalloc -ldl -lcrypt -lm "
Followed this guide.
I have tried Ruby with both the default and the custom memory allocator; Ruby with jemalloc seems to be faster.
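The build itself was along these lines (a sketch of the usual rbenv/ruby-build invocation; the exact packages and flags depend on the system):

    # Ubuntu: install jemalloc, then compile Ruby against it
    sudo apt-get install libjemalloc-dev
    RUBY_CONFIGURE_OPTS="--with-jemalloc" rbenv install 2.2.3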
Web server
nginx -v => nginx version: nginx/1.8.0
Current nginx.conf (simplified): github gist
Current nginx site conf (simplified): github gist
Application server
I have tried both Passenger and Puma, but without result.
Phusion Passenger
passenger -v => Phusion Passenger version 5.0.18
Configuration: github gist
Puma
bundle exec puma -v => puma version 2.13.4
I have also tried Puma in both cluster (1 worker) and threaded modes.
Here is the latest config for Puma in threaded mode: github gist
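For context, a threaded-mode puma.rb is essentially just this (a minimal illustrative sketch, not the contents of my gist; the thread counts and socket path are made up):

    # config/puma.rb (single process, threaded)
    environment "production"
    threads 0, 16
    bind "unix:///path/to/app/tmp/puma.sock"   # hypothetical socket path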
Our Rails 4.0 application (Ruby 2.1.2) is running on Nginx with Puma 2.9.0.
I recently noticed that all requests to our application hang after a while (usually 1 or 2 days).
When checking the log, which is set to debug level, I noticed the following lines stacking up:
[2014-10-11T00:02:31.727382 #23458] INFO -- : Started GET "/" for ...
This means that requests actually hit the Rails app, but somehow they aren't processed any further, while normally the log would read:
I, [2014-10-11T00:02:31.727382 #23458] INFO -- : Started GET "/" for ....
I, [2014-10-11T00:02:31.729393 #23458] INFO -- : Processing by HomeController#index as HTML
My puma config is the following:
threads 16,32
workers 4
Our application is only for internal use for now, so the RPM is very low, and none of the requests take longer than 2 seconds.
What reasons could lead to this problem (Puma config, database connection, etc.)?
Thank you in advance.
Update:
After installing the rack_timer gem to log the time spent in each middleware, I realized that our requests had been getting stuck at ActiveRecord::QueryCache when the hang occurred, with a huge amount of time spent there:
Rack Timer (incoming) -- ActiveRecord::QueryCache: 925626.7731189728 ms
I removed this middleware for now and it seems to be back to normal. However, I understand the purpose of this middleware is to increase the performance, so removing it is just a temporary solution. Please help me find out the possible cause of this issue.
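For the record, removing it was done with the standard middleware API (a sketch; this line goes inside the application class in config/application.rb):

    # config/application.rb - temporarily drop the query cache middleware
    config.middleware.delete ActiveRecord::QueryCache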
FYI, we're using mysql (5.1.67) with adapter mysql2 (0.3.13)
It could be a symptom of RAM starvation due to the query cache getting too big. We saw this in one of our apps running on Heroku. The default query cache is set to 1000. Lowering the limit eased the RAM usage for us with no noticeable performance degradation:
database.yml:
default: &default
  adapter: postgresql
  pool: <%= ENV["DB_POOL"] || ENV['MAX_THREADS'] || 5 %>
  timeout: 5000
  port: 5432
  host: localhost
  statement_limit: <%= ENV["DB_STATEMENT_LIMIT"] || 200 %>
However, searching for "activerecord querycache slow" turns up other causes, such as outdated versions of Ruby, Puma or rack-timeout: https://stackoverflow.com/a/44158724/126636
Or maybe too large a value for read_timeout: https://stackoverflow.com/a/30526430/126636
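With the mysql2 adapter that timeout lives in database.yml; a sketch (the 10-second value is only an example, not a recommendation):

    # config/database.yml (mysql2)
    production:
      adapter: mysql2
      read_timeout: 10   # seconds; a very large value here can hide a hung connection for a long time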
I have placed an after_commit callback in the RequestToken model that outputs "Committed Request Token xx". You can see in the log I included below that the token record is committed, yet on the next request the lookup for the object says it cannot be found. The issue occurs intermittently, and if I refresh the page the record is found and the request goes through.
Environment
AWS EC2 + RDS, Ubuntu 10.04, Rails 3.2.8, MySQL2 0.3.11 gem, apache2 2.2.14, phusion passenger 3.0.11
Has anyone seen this before? Any suggestions?
Committed Request Token S8j311QckvEjnDftNW0e7FPHsavGWTelONcsE3X1
Rendered text template (0.0ms)
Completed 200 OK in 28ms (Views: 0.6ms | ActiveRecord: 21.8ms | Sphinx: 0.0ms)
Started GET "/oauth/authorize?oauth_token=S8j311QckvEjnDftNW0e7FPHsavGWTelONcsE3X1" for 96.236.148.63 at 2012-10-15 22:07:32 +0000
Processing by OauthController#authorize as HTML
Parameters: {"oauth_token"=>"S8j311QckvEjnDftNW0e7FPHsavGWTelONcsE3X1"}
Completed 500 Internal Server Error in 5ms
ActiveRecord::RecordNotFound (Couldn't find RequestToken with token = S8j311QckvEjnDftNW0e7FPHsavGWTelONcsE3X1):
A 200 doesn't mean it saved. It probably failed a validation.
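One way to confirm that is to check the return value of save (or use save!, which raises on failure) where the token is created; a sketch, with hypothetical creation code:

    # hypothetical creation code; save returns false when a validation fails
    token = RequestToken.new(token_attributes)
    if token.save
      logger.info "Committed Request Token #{token.token}"
    else
      logger.error "Token not saved: #{token.errors.full_messages.join(', ')}"
    end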
I am developing a simple iOS app, which uses a Rails app as a server backend. I'm using the Restkit framework for the management of the server networking / requests and on the iOS side all seems to be OK.
When I make a PUT request, I get a 200 response back, and the logs in Xcode seem to suggest all is well. The logs for the Rails app also seem to suggest all is well, with the following output:
2011-12-19T18:15:17+00:00 app[web.1]: Started PUT "/lists/3/tasks/6" for 109.156.183.65 at 2011-12-19 18:15:17 +0000
2011-12-19T18:15:17+00:00 app[web.1]: Parameters: {"created_at"=>"2011-12-12 22:37:00 +0000", "id"=>"6", "updated_at"=>"2011-12-12 22:37:00 +0000", "description"=>"Create a home page", "list_id"=>"3", "completed"=>"1"}
2011-12-19T18:15:17+00:00 app[web.1]: Task Load (4.3ms) SELECT "tasks".* FROM "tasks" WHERE "tasks"."id" = $1 LIMIT 1 [["id", "6"]]
2011-12-19T18:15:17+00:00 app[web.1]: Processing by TasksController#update as JSON
2011-12-19T18:15:17+00:00 app[web.1]: (4.7ms) BEGIN
2011-12-19T18:15:17+00:00 app[web.1]: (1.5ms) COMMIT
2011-12-19T18:15:17+00:00 app[web.1]: Completed 200 OK in 48ms (Views: 1.1ms | ActiveRecord: 16.0ms)
2011-12-19T18:15:17+00:00 heroku[nginx]: 109.156.183.65 - - [19/Dec/2011:10:15:17 -0800] "PUT /lists/3/tasks/6 HTTP/1.1" 200 154 "-" "TaskM8/1.0 CFNetwork/485.13.9 Darwin/11.2.0" taskm8.com
2011-12-19T18:15:17+00:00 heroku[router]: PUT taskm8.com/lists/3/tasks/6 dyno=web.1 queue=0 wait=0ms service=114ms status=200 bytes=154
However, when I make another GET request, or use the standard web views to look at the data, the change I was expecting from the PUT request (completed = 1, which is a boolean field) has not been made.
I can see from the Rails log that my iOS app is passing the correct parameters, so it seems to be something on the Rails side. I've already been through the loop of overcoming the CSRF error message, so I don't think it's that.
On a local version of the Rails app, I've also enabled general query logging on the MySQL database to monitor the queries being run, trying to see if the PUT does anything at all, or anything that would fail. In the log, you don't see anything other than:
BEGIN
COMMIT
The same as the rails log.
So, does anyone have any idea about why the PUT is not making the changes to the data, or how I can debug the PUT further?
Apologies if this is a really simple question; I'm slowly getting back into development and am somewhat rusty!
Are you using any kind of automated tests? If not, start there.
Another way (though rubbish) would be to call your controller action from a webpage and see if it works.
You can also add logger.debug calls in your Rails code to add traces.
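A sketch of what that might look like in the update action (the controller internals here are assumptions about your app, not your actual code):

    # app/controllers/tasks_controller.rb
    def update
      @task = Task.find(params[:id])
      logger.debug "PUT params: #{params[:task].inspect}"   # if this is nil, nothing will be updated
      if @task.update_attributes(params[:task])
        logger.debug "Task #{@task.id} saved, completed=#{@task.completed}"
      else
        logger.debug "Task not saved: #{@task.errors.full_messages.inspect}"
      end
      head :ok
    end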
If you have control over the MySQL server, I would suggest that you enable the general log (aka the query log); that way you can see what really happens.
SET GLOBAL general_log_file = '/tmp/query.log';
SET GLOBAL general_log = 'ON';
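Remember to turn it off again when you are done, since the general log grows quickly:

    SET GLOBAL general_log = 'OFF';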