Node.js server becomes unresponsive after a certain time period - mysql

I've recently been having problems with my server, which becomes unresponsive after a certain period of time.
Basically, after a certain amount of usage and time, my Node.js app stops responding to requests. I don't even see routes being fired in my console, and the HTTP calls from my client (an Android app) no longer reach the server. But after I restart my Node.js app server, everything starts working again, until things inevitably stop again. The app never crashes; it just stops responding to requests.
I'm not getting any errors, and I've made sure to handle and log all DB connection errors, so I'm not sure where to start.
Any clue as to what might be happening and how I can solve this problem?
Here's my stack:
Node.js on a DigitalOcean server with Ubuntu 14.04 and Nginx (using Express 4.15.2 + PM2 2.4.6)
Database running MySQL (using node-mysql)
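
One pattern worth ruling out, since the app hangs silently instead of crashing: if connections are checked out of a pool and never released, every request after the pool is exhausted blocks forever waiting for a free connection, which produces exactly these symptoms (no crash, no errors, no routes firing). Below is a minimal sketch of pool usage with node-mysql; the credentials, table, and getUser helper are placeholders, not code from the question.

// Pool-based access with node-mysql. Once connectionLimit connections
// are checked out and never released, every later request blocks
// forever waiting for a free connection. Placeholder credentials/query.
const mysql = require('mysql');

const pool = mysql.createPool({
  host: 'localhost',
  user: 'app',
  password: 'secret',
  database: 'appdb',
  connectionLimit: 10,
});

function getUser(id, callback) {
  pool.getConnection((err, connection) => {
    if (err) return callback(err); // pool-level errors surface here

    connection.query('SELECT * FROM users WHERE id = ?', [id], (queryErr, rows) => {
      connection.release(); // release on success AND on error paths
      callback(queryErr, rows);
    });
  });
}

For one-off statements, pool.query() is a safer shorthand, since it acquires and releases the connection internally.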

Related

Connect Timeout Error on CloudHub: Mule version 4.2.2

I am trying to hit an HTTPS client API that works fine in Postman (it responds in 800 ms) and in my local Mule flow, but it is not working on CloudHub, where I get a Connect Timeout error. It tries connecting for 30 seconds (as per the logs) and then gives an HTTP:CONNECTIVITY error.
failed: Connect timeout.
errorType=HTTP:CONNECTIVITY
cause=org.mule.extension.http.api.error.HttpRequestFailedException
The Response Timeout I have set is 5 minutes.
The flow was working fine when deployed on CloudHub before. It stopped working a few days ago, though I didn't make any changes to my code. I am unable to debug this issue as it is not reproducible in my local environment (it works perfectly). Any help would be appreciated.
Mule HTTP calls offer four general types of timeouts, each with its own behavior:
Connection Idle Timeout
Response Timeout
Max Idle Timeout
Query or Transaction Timeout (applies to DB connectors)
Since you are getting an HTTP:CONNECTIVITY error, applying a 5-minute Response Timeout doesn't help. The Response Timeout (which covers a server taking too long to respond) only matters once the connection handshake has been established; your problem is with the connection itself.
The most promising fix is to apply a Connection Idle Timeout and a reconnection strategy with some spacing between retries.
Since you are sure about your local tests, I suggest the following two steps:
1. Use the same HTTP connector configuration in a separate new Mule app, with a simple listener and the failing requester. Also add a call to a freely available online REST service in an extra flow. Test both and see which one works and which one fails. This will tell you whether it is a real HTTP connectivity problem or something else, such as a Mule bug.
2. Check your configuration once again and make sure you are hitting the same endpoint in the CloudHub version.
Finally, make sure you did not accidentally put any proxy configuration in the local version.
If it was working before, there was probably a networking change on the other side that now prevents access from the CloudHub application. You didn't share the URL, so it is not clear whether it is an internal host or a public host. We also don't know if there is some kind of whitelisting on the server side.
You can test connectivity to the HTTP host and port using the Network Tools application, to see if it is accessible from your CloudHub environment.
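
For a second opinion outside the Network Tools application, a raw TCP probe against the host and port shows whether the connection handshake itself succeeds, which is exactly what HTTP:CONNECTIVITY is about. A minimal Node.js sketch, purely illustrative; HOST and PORT are placeholders, not values from the question:

// Minimal TCP reachability probe. Replace HOST and PORT with the
// endpoint you are trying to reach; both are placeholders.
const net = require('net');

const HOST = 'api.example.com'; // hypothetical endpoint
const PORT = 443;

const socket = net.createConnection({ host: HOST, port: PORT, timeout: 30000 });

socket.on('connect', () => {
  console.log(`TCP connection to ${HOST}:${PORT} succeeded`);
  socket.end();
});

socket.on('timeout', () => {
  console.error(`TCP connection to ${HOST}:${PORT} timed out`);
  socket.destroy();
});

socket.on('error', (err) => {
  console.error(`TCP connection to ${HOST}:${PORT} failed: ${err.message}`);
});

If this succeeds from one network but times out from another, the problem is network-level (firewall, whitelist, routing) rather than anything in the Mule configuration.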

Azure "MySQL server has gone away" for one minute only

I'm using Azure App Services to run about 15 PHP web apps. Most of these apps connect to my 'Azure Database for MySQL server' instance. This is a Basic-tier instance (1 vCore and 2 GB of memory).
The MySQL instance hosts about 30 small databases (ranging from 1 to 100 MB in size).
The load on the MySQL instance is stable and low: CPU is constantly under 20%, memory is constantly under 50%, and IO does not even show up in the metrics in the Azure Portal.
My problem is this:
Every once in a while, the server goes offline for about 1 or 2 minutes (5 minutes at most). I see that client applications try to connect, hang for a while, and finally get the error:
SQLSTATE[HY000] [2006] MySQL server has gone away
It seems to happen randomly, sometimes a few times a week or even a few times a day, but sometimes not for weeks.
What's noticeable, though, is that when it happens I see a downward spike in memory and an upward spike in CPU in the metrics graph on the portal.
Does anyone experience the same issue on Azure Database for MySQL? And did anyone find a solution?
I'm starting to think that it's caused by resource movement on the Azure side, but I don't have any evidence to back that up. If so, shouldn't that happen without any downtime?
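
One way to pin down the timing and duration of these outages, independent of the portal graphs, is a small standalone probe that attempts a connection on a fixed interval and logs timestamped results. The apps in the question are PHP, so this Node.js sketch using the mysql package is purely illustrative; host and credentials are placeholders.

// Availability probe: attempts a MySQL connection every 30 seconds and
// logs timestamped results. All connection settings are placeholders.
const mysql = require('mysql');

const config = {
  host: 'myserver.mysql.database.azure.com', // hypothetical server name
  user: 'probe@myserver',
  password: 'secret',
  database: 'probe_db',
  connectTimeout: 10000,
};

function probe() {
  const connection = mysql.createConnection(config);
  const started = Date.now();
  connection.connect((err) => {
    const elapsed = Date.now() - started;
    if (err) {
      console.error(`${new Date().toISOString()} FAIL after ${elapsed} ms: ${err.code}`);
    } else {
      console.log(`${new Date().toISOString()} OK in ${elapsed} ms`);
    }
    connection.destroy();
  });
}

setInterval(probe, 30000);
probe();

Correlating the FAIL timestamps against the portal's CPU and memory spikes would confirm whether the two always coincide.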
Scaling up from the Basic 1-core tier with Compute Gen 4 to the Basic 2-core tier with Compute Gen 5 seemed to resolve the problem.
I'm not sure what was causing the issue, though.
I started experiencing this error in May 2019.
If I happen to be connected to the MariaDB server over SSH when it occurs, with htop running, I can see rsyslog suddenly going crazy. It bogs down the CPU, and the network connection becomes unresponsive. The CPU and network activity don't show up in Azure, but running w in the SSH session after the network recovers shows that the CPU load was definitely very high during the last 15 minutes.
I traced it back to the OMS agent. When that service is killed on the MariaDB server, the server runs without any problem. As soon as the OMS agent is started, "MySQL server has gone away" pops up on the clients within 24 hours because the network connection with the server machine becomes unresponsive.
It is possible to uninstall the OMS agent from the Azure portal, but it comes back within 48 hours.
The only way I found of getting rid of the OMS agent is to also stop walinuxagent on the Linux server.
Scaling the server up may solve the problem, as you get more CPU power to absorb the extra load induced by the OMS agent, but I prefer to kill the OMS agent and walinuxagent instead of spending more money on a more expensive server.
Edit:
It turns out OMS is installed because the VM is part of a Log Analytics workspace (search for "Log Analytics workspaces" in the portal search bar). Removing the VM from the workspace immediately uninstalls OMS. There is no need to stop walinuxagent.

Azure App Service - Outbound IPs Changed?

I have a site running on an Azure App Service. It connects to a MySQL DB on Google CloudSQL.
All of a sudden I am getting an error when I hit a page on my site. The error is:
Configuration Error - Reading from the stream has failed.
I know this is related to MySQL and an attempt to read from the DB.
The DB itself is fine - minimal connections, no stress.
The site runs fine from my local Visual Studio instance connecting to the same database.
This makes me think I have hit some kind of 'outbound' connection limit on Azure. Can anyone confirm?
The Azure site is up and running but as soon as it tries to connect to the DB it falls over.
Thanks for any help you can give!
Update - IP Changed??
It appears that the App Service outbound IP addresses changed at some point yesterday, so our external MySQL DB started blocking the connection attempts. Has anyone experienced this? Every single outbound IP changed, yet nothing was changed in the setup of the app (no scaling, etc.).

Intermittent Errors connecting to MySQL server

My app seems to run fine sometimes, and other times it says it cannot connect to any of the MySQL servers.
I started out with the MySQL server hosted in Azure as well, but I moved it external due to connectivity issues.
I finally moved the ASP.NET app to a real VM instead of hosting it as a website. When the MySQL server became unresponsive, I tested a manual connection to it, and that failed too. A trace route to the server also failed.
Is this a known issue? Is this a duplicate of: Classic ASP site on Azure web site, remote mysql database

Web server and MySQL server on different machines, causing latency on websites

I am currently running a virtualized environment for my web and DB servers. When I access the web server or the MySQL server individually, they are both fast. The websites on the web server that do not require the DB server also all load quickly. However, when I access a hosted website that requires the web server to query the DB server, there is about a 5-7 second latency on every page load. This has been confirmed with both a very simple site and a WordPress setup. Here is the config:
Web server - CentOS 6.5, Apache 2.2.15
DB server - CentOS 6.5, MySQL 5.1.73
My question is: are the servers authenticating with one another on every single DB call (and thus causing latency)? If so, does anyone know how to permanently authenticate between the two?
I might be way off with this assumption, and authentication could have nothing to do with it. I am completely open to any and all ideas at this point. Thank you very much.
V/R,
Tony
To me it seems to be a network issue.
And yes, the DB server does need to authenticate every new connection, so each page load that opens a fresh connection pays that handshake cost; see the timing sketch below.
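
One way to test that theory is to time the connection handshake separately from an actual query: if the handshake dominates, per-connection setup (TCP, authentication, and any reverse DNS lookup the server performs) is the bottleneck rather than the queries themselves. The stack in the question is Apache/PHP, so this Node.js sketch using node-mysql is purely illustrative; host and credentials are placeholders.

// Times connection setup and a trivial query separately, to show where
// the latency lives. Host and credentials are placeholders.
const mysql = require('mysql');

const connection = mysql.createConnection({
  host: 'db.example.internal', // hypothetical DB server
  user: 'app',
  password: 'secret',
  database: 'appdb',
});

const t0 = Date.now();
connection.connect((err) => {
  if (err) throw err;
  console.log(`connect took ${Date.now() - t0} ms`); // handshake + auth

  const t1 = Date.now();
  connection.query('SELECT 1', (queryErr) => {
    if (queryErr) throw queryErr;
    console.log(`query took ${Date.now() - t1} ms`);
    connection.end();
  });
});

If connect time dominates, two common fixes on MySQL 5.1 are adding skip-name-resolve to my.cnf (after converting any hostname-based grants to IP-based ones) so the server stops doing a reverse DNS lookup for every new client, and using persistent connections from the web tier so the handshake cost is paid once per worker rather than once per page load.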