Google Compute Engine health checks failing - google-compute-engine

I have a Node.js app on two VM instances that I'm trying to load balance with network load balancing. To test that my servers are up and serving, the health check requests '/health.txt' on my app's internal listening port. The two instances are configured identically, with the same tags, firewall rules, etc., but the health check continuously fails for one of them. I can run the same check with curl from my internal network or from outside and it works fine on both instances, yet the network load balancer always reports one instance as down.
Running ngrep on the healthy instance, I see:
T 169.254.169.254:65374 -> my.pub.ip.addr:3000 [S]
#
T my.pub.ip.addr:3000 -> 169.254.169.254:65374 [AS]
#
T 169.254.169.254:65374 -> my.pub.ip.addr:3000 [A]
#
T 169.254.169.254:65374 -> my.pub.ip.addr:3000 [AP]
GET /health.txt HTTP/1.1.
Host: my.pub.ip.addr:3000.
.
#
T my.pub.ip.addr:3000 -> 169.254.169.254:65374 [A]
#
T my.pub.ip.addr:3000 -> 169.254.169.254:65374 [AP]
HTTP/1.1 200 OK.
X-Powered-By: NitroPCR.
Accept-Ranges: bytes.
Date: Fri, 14 Nov 2014 20:00:40 GMT.
Cache-Control: public, max-age=86400.
Last-Modified: Thu, 24 Jul 2014 17:58:46 GMT.
ETag: W/"2198506076".
Content-Type: text/plain; charset=UTF-8.
Content-Length: 13.
Connection: keep-alive.
.
#
T 169.254.169.254:65374 -> my.pub.ip.addr:3000 [AR]
But on the instance GCE claims is unhealthy, I see this:
T 169.254.169.254:61179 -> my.pub.ip.addr:3000 [S]
#
T 169.254.169.254:61179 -> my.pub.ip.addr:3000 [S]
#
T 169.254.169.254:61180 -> my.pub.ip.addr:3000 [S]
#
T 169.254.169.254:61180 -> my.pub.ip.addr:3000 [S]
#
T 169.254.169.254:61180 -> my.pub.ip.addr:3000 [S]
But if I curl the same file from my healthy instance to the unhealthy one, the 'unhealthy' instance responds fine.
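The difference between the two captures is that the unhealthy instance never answers the health checker's SYN with a SYN-ACK. A minimal sketch for spotting that pattern in saved ngrep output follows; the function name unanswered_syns and the regular expression are illustrative assumptions based only on the line format shown above, not part of ngrep itself.

```python
import re

# Matches ngrep TCP header lines such as:
#   T 169.254.169.254:61179 -> my.pub.ip.addr:3000 [S]
HEADER = re.compile(r"^T (\S+) -> (\S+) \[([A-Z]+)\]$")


def unanswered_syns(ngrep_lines):
    """Return the client endpoints that sent a SYN but never received a SYN-ACK."""
    syn_sent = set()
    syn_acked = set()
    for line in ngrep_lines:
        match = HEADER.match(line.strip())
        if not match:
            continue  # payload lines and '#' separators
        src, dst, flags = match.groups()
        if flags == "S":
            syn_sent.add(src)
        elif flags == "AS":
            # The SYN-ACK is addressed to the endpoint that sent the SYN.
            syn_acked.add(dst)
    return syn_sent - syn_acked
```

An empty result means every SYN in the capture was answered; any endpoint that is returned points at a problem below the application, such as routing or a firewall, rather than at the Node.js app.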

I got this working again after contacting the Google Compute Engine team. There is a service process on a GCE VM that needs to start on boot and keep running while the VM is alive. The process is named google-address-manager. It should run at runlevels 0-6. For some reason this service had stopped and does not start when one of my VMs boots or reboots. Starting the service manually worked. Here are the steps we went through to determine what was wrong (this is a Debian VM):
sudo ip route list table all
This will display your route table. In the table, there should be a route to your Load Balancer Public IP:
local lb.pub.ip.addr dev eth0 table local proto 66 scope host
If there is not, check that google-address-manager is running:
sudo service google-address-manager status
If it is not running, start it:
sudo service google-address-manager start
If it starts ok, check your route table, and you should now have a route to your load balancer IP. You can also manually add this route:
sudo /sbin/ip route add to local lb.pub.ip.addr/32 dev eth0 proto 66
We still have not resolved why the address manager stopped and does not start on boot, but at least the LB pool is healthy.
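The route check described above can be scripted. This is a minimal sketch that only parses text in the format shown in this answer; the function name has_local_route is hypothetical, and the caller is assumed to supply the output of "ip route list table all" as a string.

```python
def has_local_route(route_table_text, lb_ip):
    """Return True if the route table text contains a 'local <lb_ip>' entry."""
    for line in route_table_text.splitlines():
        fields = line.split()
        # Expected shape: local <ip> dev eth0 table local proto 66 scope host
        if len(fields) >= 2 and fields[0] == "local" and fields[1] in (lb_ip, lb_ip + "/32"):
            return True
    return False
```

If it returns False for the load balancer's public IP, that matches the failure described here: the address manager has not installed the route, so the instance does not answer packets addressed to the load balancer IP.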

Tinyproxy not forwarding requests. Getting Unauthorized connection from <IP>

I have installed tinyproxy on a CentOS 7 machine and changed the port to 8080 in tinyproxy.conf.
Whenever I send a request, I get the following logs in tinyproxy.log:
CONNECT Mar 15 08:14:42 [22148]: Connect (file descriptor 6): <IP> [<IP>]
NOTICE Mar 15 08:14:42 [22148]: Unauthorized connection from "<IP>" [<IP>].
INFO Mar 15 08:14:42 [22148]: Read request entity of 1200 bytes
My request reaches the proxy, but the proxy does not forward it to the destination.
In the Tinyproxy config file (/etc/tinyproxy/tinyproxy.conf) you can use the Allow directive to explicitly specify the host(s) that will be connecting to the proxy. You can also comment out or remove all Allow <host> lines to allow connections from all hosts. See the description below from the config file (here I've commented out Allow 127.0.0.1, and since there are no other entries, all connections will be allowed):
# Allow: Customization of authorization controls. If there are any
# access control keywords then the default action is to DENY. Otherwise,
# the default action is ALLOW.
#
# The order of the controls are important. All incoming connections are
# tested against the controls based on order.
#
#Allow 127.0.0.1
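To see which default a given config ends up with, the rule quoted above can be sketched as a small check. This assumes that Allow and Deny are the access control keywords in question and that they are written with that capitalisation; the function name access_policy is hypothetical and the parsing is far simpler than tinyproxy's own.

```python
def access_policy(config_text):
    """Return ('allow-all', []) when no access rules are active, else ('deny-by-default', rules)."""
    rules = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and commented-out directives are inactive
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] in ("Allow", "Deny"):
            rules.append((parts[0], parts[1]))
    return ("deny-by-default" if rules else "allow-all", rules)
```

With "#Allow 127.0.0.1" commented out and no other rules, the result is allow-all. As soon as one active Allow line exists, any client not matched by a rule gets the "Unauthorized connection" notice.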

CakePHP 3 - Enable SSL on development server [duplicate]

OS: Ubuntu 12.04 64-bit
PHP version: 5.4.6-2~precise+1
When I test an https page I am writing through the built-in webserver (php5 -S localhost:8000), Firefox (16.0.1) says "Problem loading: The connection was interrupted", while the terminal tells me "::1:37026 Invalid request (Unsupported SSL request)".
phpinfo() tells me:
Registered Stream Socket Transports: tcp, udp, unix, udg, ssl, sslv3, tls
[curl] SSL: Yes
SSL Version: OpenSSL/1.0.1
openssl:
OpenSSL support: enabled
OpenSSL Library Version OpenSSL 1.0.1 14 Mar 2012
OpenSSL Header Version OpenSSL 1.0.1 14 Mar 2012
Yes, http pages work just fine.
Any ideas?
See the manual section on the built-in webserver shim:
http://php.net/manual/en/features.commandline.webserver.php
It doesn't support SSL encryption. It's for plain HTTP requests. The openssl extension and function support is unrelated. It does not accept requests or send responses over the stream wrappers.
If you want SSL to run over it, try a stunnel wrapper:
php -S localhost:8000 &
stunnel3 -d 443 -r 8000
It's just for toying anyway.
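For context on the "Unsupported SSL request" message: a browser opening an https:// URL starts with a TLS handshake record rather than an HTTP request line, so a plain HTTP server receives bytes it cannot parse. The sketch below is only a heuristic illustration of that difference, not how PHP's built-in server detects it; the function name is hypothetical.

```python
def looks_like_tls_handshake(first_bytes):
    """Heuristic: a TLS handshake record starts with content type 0x16 and major version 0x03."""
    return len(first_bytes) >= 2 and first_bytes[0] == 0x16 and first_bytes[1] == 0x03
```

A plain request such as b"GET / HTTP/1.1" starts with ASCII letters instead, which is what the built-in server expects on its port.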
It's been three years since the last update; here's how I got it working in 2021 on macOS (as an extension to mario's answer):
# Install stunnel
brew install stunnel
# Find the configuration directory
cd /usr/local/etc/stunnel
# Copy the sample conf file to actual conf file
cp stunnel.conf-sample stunnel.conf
# Edit conf
vim stunnel.conf
Modify stunnel.conf so it looks like this:
(all other options can be deleted)
; **************************************************************************
; * Global options *
; **************************************************************************
; Debugging stuff (may be useful for troubleshooting)
; Enable foreground = yes to make stunnel work with Homebrew services
foreground = yes
debug = info
output = /usr/local/var/log/stunnel.log
; **************************************************************************
; * Service definitions (remove all services for inetd mode) *
; **************************************************************************
; ***************************************** Example TLS server mode services
; TLS front-end to a web server
[https]
accept = 443
connect = 8000
cert = /usr/local/etc/stunnel/stunnel.pem
; "TIMEOUTclose = 0" is a workaround for a design flaw in Microsoft SChannel
; Microsoft implementations do not use TLS close-notify alert and thus they
; are vulnerable to truncation attacks
;TIMEOUTclose = 0
This accepts HTTPS / SSL at port 443 and connects to a local webserver running at port 8000, using stunnel's default bogus cert at /usr/local/etc/stunnel/stunnel.pem. Log level is info and log outputs are written to /usr/local/var/log/stunnel.log.
Start stunnel:
brew services start stunnel # Different for Linux
Start the webserver:
php -S localhost:8000
Now you can open https://localhost:443 to reach your webserver.
There should be a cert error and you'll have to click through a browser warning but that gets you to the point where you can hit your localhost with HTTPS requests, for development.
I've been learning nginx and Laravel recently, and this error has come up many times. It's hard to diagnose because you need to align nginx with Laravel and also the SSL settings in your operating system at the same time (assuming you are making a self-signed cert).
If you are on Windows, it is even more difficult because you have to fight Unix versus Windows line endings when dealing with SSL certs. Sometimes you can go through the steps correctly and still get ruined by cert validation issues. I find the trick is to make the certs on Ubuntu or a Mac and email them to yourself, or to use the Linux subsystem.
In my case, I kept running into an issue where I declare HTTPS somewhere but php artisan serve only works on HTTP.
I just caused this Invalid request (Unsupported SSL request) error again after SSL was hooked up fine. It turned out to be that I was using Axios to make a POST request to https://. Changing it to POST http:// fixed it.
My recommendation to anyone would be to take a look at where and how HTTP/HTTPS is being used.
The textbook definition is probably something like php artisan serve only works over HTTP but requires underlying SSL layer.
Use Ngrok
Expose your server's port like so:
ngrok http <server port>
Browse using ngrok's secure public address (the one with https).
Note: Though it works like a charm, it seems like overkill since it requires an internet connection; I would appreciate better recommendations.

Open Splunk forwarder/receiver TCP port on OpenShift

I need to open TCP port 9997 on OpenShift so Splunk is able to listen for incoming data from forwarders on other servers.
I've set up Splunk using this guide: http://www.kelvinism.com/2013/11/free-splunk-hosting.html but I can't figure out how to add another TCP port to the manifest.yml file. I tried the following for a new OpenShift instance, but with no luck.
- Private-IP-Name: IP
  Private-Port-Name: PORT_FORWARDER
  Private-Port: 9997
  Public-Port-Name: PROXY_PORT_FORWARDER
  Options: { "ssl_to_gear": true }
Do I need to configure other parts of the cartridge to read my new port and set up some configuration elsewhere?
You will only be able to listen publicly on ports 80/443/8000/8443; no other TCP or UDP ports are allowed in (except 22 for ssh/scp/sftp). The private port that you have configured is for internal access only (either on the same gear, or installed on its own gear as part of a scaled application). Having remote agents connect to your application on port 9997 just won't work.
Alternatively, you can write a very simple Splunk add-on to listen on that port, which is very straightforward.
Splunk has an SDK that you can use from various languages. Here is a framework for Python. For more information, you can see a full example of a UDP receiver: link to the example. It's not an English post, but you can read the code from there.
import sys

from splunklib.modularinput import *


class MyScript(Script):
    def get_scheme(self):
        # Returns the scheme that describes this input to Splunk.
        # "My Script" is a placeholder title.
        return Scheme("My Script")

    def validate_input(self, validation_definition):
        # Validates input. Raise an exception here to reject a configuration.
        pass

    def stream_events(self, inputs, ew):
        # Splunk Enterprise calls the modular input,
        # streams XML describing the inputs to stdin,
        # and waits for XML on stdout describing events.
        # TODO: implement a socket to listen and receive the
        # message, then send it with Event()
        pass


if __name__ == "__main__":
    sys.exit(MyScript().run(sys.argv))
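The socket part left as a TODO in stream_events could look roughly like the following. This is a standard-library sketch only: the names open_listener and read_messages are hypothetical, messages are assumed to be newline-terminated UTF-8 text, and handing each message to Splunk through the SDK is deliberately left out.

```python
import socket


def open_listener(host="127.0.0.1", port=0):
    """Create a listening TCP socket; port 0 lets the OS pick a free port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(1)
    return srv


def read_messages(srv):
    """Accept one connection and yield each newline-terminated message it sends."""
    conn, _addr = srv.accept()
    with conn, conn.makefile("r", encoding="utf-8") as stream:
        for line in stream:
            yield line.rstrip("\r\n")
```

Inside stream_events, each yielded message would be wrapped in an event and written through ew. Note that on OpenShift the listener would still only be reachable internally, as the first answer explains.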

GCE + Load Balancer + Instance without public IP

I have an instance that, on purpose, does not have a public IP.
I have a GCE Network Load Balancer that uses the above instance in its target pool.
Everything works great.
Then I wanted my instance to communicate with the internet, so I followed this documentation: https://cloud.google.com/compute/docs/networking#natgateway (Configuring a NAT gateway)
The instance can communicate with the internet fine, but the load balancer cannot communicate with my instance anymore.
I think that these steps create the issue with the load balancer:
$ gcloud compute routes create no-ip-internet-route --network gce-network \
--destination-range 0.0.0.0/0 \
--next-hop-instance nat-gateway \
--next-hop-instance-zone us-central1-a \
--tags no-ip --priority 800
user@nat-gateway:~$ sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
Do you know what can be done to make both things work together?
I have recreated the environment you've described and did not run into any issues.
$ gcloud compute routes create no-ip-internet-route --network gce-network \
--destination-range 0.0.0.0/0 \
--next-hop-instance nat-gateway \
--next-hop-instance-zone us-central1-a \
--tags no-ip --priority 800
The only thing the above command does is create a route so that instances with no external IP address (those tagged no-ip) send any outbound traffic through the NAT gateway. This will not affect the LB's ability to reach your instance.
In my test, I followed the exact guide you referenced, which results in:
1 new network
1 firewall rule to allow SSH on port 22
1 firewall rule to allow all internal traffic
1 new instance to act as a NAT Gateway
1 new internal instance with no external IP address
I also added the internal instance to a 'TargetPool' and created an LB for the purpose of the test.
My internal instance was accessible both via the LB's external address and internally via the NAT Gateway. The internal instance was also able to communicate with the Internet due to the NAT Gateway's configuration. Everything works as expected.
My recommendation for you and other readers (as this post is now rather old) is to try again. Make sure that you do not have any other conflicting rules, routes or software.
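To illustrate why the tagged route only changes where tagged instances send outbound traffic, here is a simplified model of route selection. It assumes the most specific destination range wins, that ties go to the lowest priority value, and that a route with tags applies only to instances carrying one of them; the Route class, the pick_route function, and the default-internet route with priority 1000 are illustrative assumptions, not a GCE API.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    dest_range: str
    priority: int
    tags: frozenset = frozenset()  # empty means the route applies to every instance


def pick_route(routes, instance_tags, dest_ip):
    """Pick the applicable route: longest prefix first, then lowest priority value."""
    dest = ipaddress.ip_address(dest_ip)
    candidates = []
    for route in routes:
        network = ipaddress.ip_network(route.dest_range)
        applies = not route.tags or bool(route.tags & set(instance_tags))
        if applies and dest in network:
            candidates.append((-network.prefixlen, route.priority, route))
    if not candidates:
        return None
    return min(candidates, key=lambda c: (c[0], c[1]))[2]
```

In this model an instance tagged no-ip picks no-ip-internet-route (priority 800) for internet-bound packets, while untagged instances keep using the default route. Nothing in the selection concerns inbound packets arriving from the load balancer.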

I have a SysV init script on Fedora 18. How can I make it start after the network is ready?

I have a SysV init script on Fedora 18. Fedora 18 uses systemd (and apparently, there is no way to switch back to SysV).
My script requires the network to be ready. Currently, at the time the script runs, the network is not ready. How can I make sure that my SysV init script runs after the network is up?
The beginning of my script looks like this:
#!/bin/bash
#
# chkconfig: 345 99 01
# description: starts the xyz boot service
OK, after trying several things, I tried adding an LSB header:
### BEGIN INIT INFO
# Required-Start: $network $local_fs $named
# Default-Start: 3 4 5
# Default-Stop: 0 1 2 6
# Short-Description: Starts/stops the foo service
# Description: Starts/stops the foo service
### END INIT INFO
This worked! The script now runs after the network is initialized. I guess the systemd implementation reads the LSB header.
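To double-check what an init script declares, the LSB block can be read with a few lines of Python. This is only a sketch of the comment format shown above, with a hypothetical function name; it is not how systemd itself parses the header.

```python
def parse_lsb_header(script_text):
    """Return the fields found between BEGIN INIT INFO and END INIT INFO."""
    fields = {}
    inside = False
    for line in script_text.splitlines():
        stripped = line.strip()
        if stripped == "### BEGIN INIT INFO":
            inside = True
        elif stripped == "### END INIT INFO":
            break
        elif inside and stripped.startswith("#") and ":" in stripped:
            key, value = stripped.lstrip("#").split(":", 1)
            # Values are split on whitespace, e.g. ['$network', '$local_fs', '$named'].
            fields[key.strip()] = value.split()
    return fields
```

Checking that "$network" appears under Required-Start confirms that the script declares the dependency that fixed the ordering here.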
To run a script when the network is ready, in the [Unit] section of your systemd service file, add the following:
After=network-online.target
Wants=network-online.target
The network is defined as ready when the network management software considers the network to be up (that generally means an IP address is configured and routable). NetworkManager queries D-Bus to get this information.
References
Running Services After the Network is up
systemd.unit(5) man page
nm-online source code, function check_online