Do all backend servers have to be up for HAProxy startup?

I'm trying to start an HAProxy load balancer with the following configuration:
global
    log 127.0.0.1 local0
    log 127.0.0.1 local0 notice

resolvers docker
    nameserver dnsmasq 1.2.3.4:53

defaults
    mode http
    log global
    option httplog
    option dontlognull

frontend ft_radix_real
    bind *:61616
    maxconn 6000
    acl is_websocket hdr(Upgrade) -i WebSocket
    acl is_websocket hdr_beg(Host) -i ws
    use_backend bk_radix_real if is_websocket

backend bk_radix_real
    balance roundrobin
    option forwardfor
    server radix-real-1 1.2.3.4:1884 check resolvers docker resolve-prefer ipv4
    server radix-real-2 1.2.3.4:1884 check resolvers docker resolve-prefer ipv4
    server radix-real-3 1.2.3.4:1884 check resolvers docker resolve-prefer ipv4
    server radix-real-4 1.2.3.4:1884 check resolvers docker resolve-prefer ipv4

listen stats
    mode http
    option httplog
    option dontlognull
    bind *:1936
    stats enable
    stats scope ft_radix_real
    stats scope bk_radix_real
    stats uri /
    stats realm Haproxy\ Statistics
    stats auth admin:admin
This configuration works when all backend servers are up. However, I would like to be able to start HAProxy even if some (but not all) of the backend servers are not running. I checked the configuration documentation but couldn't find a solution. Is it possible?

Since 1.7 you can start HAProxy without resolving all the hosts on startup:
defaults
    # never fail on address resolution
    default-server init-addr none
https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#init-addr
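The documentation also describes a fallback chain if you'd rather keep a resolved address when one is available; a minimal sketch of that variant, with the methods taken from the 1.7 docs:
defaults
    # try the last known address, then a libc lookup,
    # and only then start the server with no address
    default-server init-addr last,libc,none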

I don't see any problem; the checks are there for exactly this. Servers will be checked, the dead ones will be marked down, and only the remaining valid ones will handle the traffic. You probably need to describe exactly what type of issue you're facing.
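If the concern is how quickly dead servers are detected, the check timing is tunable per server. A hypothetical sketch based on the backend above (the inter/fall/rise values are illustrative, not from the question):
backend bk_radix_real
    balance roundrobin
    # probe every 2s; mark down after 3 failed checks, back up after 2 good ones
    server radix-real-1 1.2.3.4:1884 check inter 2s fall 3 rise 2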

Related

Asterisk Realtime Crashing on load when using HAProxy to Galera Cluster

It works fine under light load on our test bench, but once we move it to production the whole thing crashes and we are unable to get Asterisk to function correctly. It is almost as if there is a lag or delay in accessing the MariaDB cluster.
Our architecture and configs are below:
Asterisk 13 Realtime with HAProxy (1.5.18) --> 6 x MariaDB (10.4.11) in independent datacentres with Galera syncing them (one acts only as backup)
Galera sync is working fine, and other services are able to read/write via the HAProxy without issue.
It only seems to become an issue when we add load, reload the dialplan, restart Asterisk, etc.
[haproxy.cfg]
global
    user haproxy
    group haproxy

defaults
    mode http
    log global
    retries 2
    timeout connect 3000ms
    timeout server 10h
    timeout client 10h

listen stats
    bind *:8404
    stats enable
    stats hide-version
    stats uri /stats

listen mysql-cluster
    bind 127.0.0.1:3306
    mode tcp
    option mysql-check user haproxy_check
    balance roundrobin
    server mysql_server1 10.0.0.1:3306 check
    server mysql_server2 10.0.0.2:3306 check
    server mysql_server3 10.0.0.3:3306 check
    server mysql_server4 10.0.0.4:3306 check
    server mysql_server5 10.0.0.5:3306 check
    server mysql_server6 10.0.0.6:3306 check backup
Really, we would like to know firstly whether Asterisk 13 Realtime will work via HAProxy, and if so, whether there are config changes we need to make to get it working.
We can provide more info if required.
Try using Realtime -> ODBC -> HAProxy.
If that doesn't help, use debugging, for example gdb traces.
There is no way to determine what issue you have from this alone; more logs and configs are needed.
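One concrete thing worth verifying in the config above: option mysql-check needs the haproxy_check user to actually exist on every MariaDB node (no password or privileges required), or the health checks themselves will mark servers down. A minimal sketch, run on each node:
# allow the check user from anywhere; in production, tighten the host
# pattern to the HAProxy machine's IP (the '%' wildcard is an assumption)
mysql -u root -p -e "CREATE USER 'haproxy_check'@'%'; FLUSH PRIVILEGES;"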

MX record error when running diagnostics on hMailServer

I have downloaded and installed hMailServer. I want to run it on my local machine and send email from a local web application, which is also running on my local machine. My web application couldn't send email (it got the error "couldn't connect to mydomain:25"), so I thought I would first run hMailServer's diagnostic tool to test things.
But when I run diagnostics on it, I see the error:
Test: Collect server details
hMailServer version: hMailServer 5.6.7-B2425
Database type: MSSQL Compact
Test: Test IPv6
IPv6 support is available in operating system.
Test: Test outbound port
SMTP relayer not in use. Attempting mail.hmailserver.com:25...
Trying to connect to host mail.hmailserver.com...
Trying to connect to TCP/IP address 5.189.183.138 on port 25.
Received: 220 mail.hmailserver.com ESMTP.
Connected successfully.
Test: Test backup directory
ERROR: Backup directory has not been specified.
Test: Test MX records
Trying to resolve MX records for mydomain.com...
ERROR: MX records for domain mydomain.com could not be resolved
Test: Test local connect
Connecting to TCP/IP address in MX records for local domain mydomain.com...
ERROR: MX records for local domain mydomain.com could not be resolved
Test: Test message file locations
Relative message paths are stored in the database for all messages.
Test: Test IP range configuration
No problems were found in the IP range configuration.
To be honest, I don't know what an MX record is or how to set one up. Things I have done so far:
Installed hMailServer (obviously!)
Added a domain (mydomain.com)
Added an account in mydomain.com (signup@mydomain.com)
In Settings -> Protocols -> SMTP, I added localhost in the Delivery of email -> Local host name section
In the c:\Windows\System32\Drivers\etc\hosts file, I added the entry 127.0.0.1 mydomain.com #for play application
When I run netstat -a, I see that hMailServer is listening on port 25 (I tested it using net stop hmailserver and net start hmailserver)
Proto Local Address Foreign Address State
TCP 0.0.0.0:25 DESKTOP-6PLQOLJ:0 LISTENING
Have I made a mistake?
This error message is shown if your Windows operating system's DNS resolver points to a DNS server that doesn't know about the domain and cannot forward the query to the DNS server responsible for the specified domain and its MX record.
In short: without your DNS domain correctly configured, hMailServer (or any other SMTP server) can't work.
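A quick way to see what the OS resolver actually returns is to query the MX record directly from a command prompt; comparing against a domain known to have MX records (gmail.com here, as an arbitrary control) shows whether the resolver works at all:
nslookup -type=MX mydomain.com
nslookup -type=MX gmail.com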

mysql farm with haproxy

I want to configure a DB farm on a single node with containers. My idea is to reach each of these DBs via a subdomain, for example mysql1.example.com:3306, mysql2.example.com:3306, mysql3.example.com:3306.
I'm trying to implement this model with HAProxy. The first time I connect to one database through HAProxy it seems to work, but when I reconnect I get:
ERROR 2013 (HY000): Lost connection to MySQL server at 'reading initial communication packet', system error: 0
The template I use in HAProxy is:
global
    maxconn 256
    debug

defaults
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

listen www
    bind *:3306
    mode tcp
    acl host_mysql hdr(host) -i mysql1.example.com
    server mysql_db_1 172.31.20.75:3307
    acl host_mysql hdr(host) -i mysql2.example.com
    server mysql_db_2 172.31.20.75:3308
    acl host_mysql hdr(host) -i mysql3.example.com
    server mysql_db_3 172.31.20.75:3309
Answering my own question: it's not possible to build this implementation, because MySQL speaks a plain TCP protocol that does not include the hostname in any header. For this reason HAProxy can't route to the correct server.
I'm thinking of implementing this environment using virtual IPs assigned to each database. Another implementation would be to run all the databases on the same server on different ports, as sketched below.
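The different-ports idea maps onto one listen block per database; a minimal sketch reusing the addresses from the question (clients would then connect to the HAProxy host on ports 3307-3309 instead of using hostnames):
listen mysql1
    bind *:3307
    mode tcp
    server mysql_db_1 172.31.20.75:3307 check

listen mysql2
    bind *:3308
    mode tcp
    server mysql_db_2 172.31.20.75:3308 check

listen mysql3
    bind *:3309
    mode tcp
    server mysql_db_3 172.31.20.75:3309 check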

unable to connect to AWS VPC RDS instance (mysql or postgres)

(I'm posting this question after the fact because of the time it took to find the root cause and solution. There's also a good chance other people will run into the same problem.)
I have an RDS instance (in a VPC) that I'm trying to connect to from an application running on a classic EC2 instance, connected via ClassicLink. Security groups and DNS aren't an issue.
I am able to establish socket connections to the RDS instance, but cannot connect with CLI tools (psql, mysql, etc.) or DB GUI tools like Toad or MySQL Workbench.
Direct socket connections with telnet or nc result in TCP connections in the "ESTABLISHED" state (output from netstat).
Connections from DB CLI tools, GUI tools, or applications result in timeouts and TCP connections that are stuck in the "SYN" state.
UPDATE: The root cause in my case was a problem with MTU size and EC2 ClassicLink. I've posted some general troubleshooting information below in an answer in case other people run into similar RDS connectivity issues.
Additional information for people who might run into similar issues trying to connect to RDS or Redshift:
1) Check security groups
Verify the security group for the RDS instance allows access from the security group your source server belongs to (or its IP added directly if external to AWS). The security group you should be looking at is the one specified in the RDS instance attributes from the RDS console UI (named "security group").
NOTE: Database security groups might be different from AWS EC2 security groups. If your RDS instance is in classic/public EC2, you should check in the "database security group" section of the RDS UI. For VPC users, the security group will be a normal VPC security group (the name sg-xxx will be listed in the RDS instance's attributes).
2) Confirm DNS isn't an issue.
Amazon uses split DNS, so a DNS lookup external to AWS will return the public IP while a lookup internal to AWS will return a private IP. If you suspect it is a DNS issue, have you confirmed different IPs are returned from different availability zones? If different AZs get different IPs, you will need to contact AWS support.
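One way to compare the two views is to resolve the endpoint from your workstation and from an EC2 instance and check whether the answers differ (the hostname below is a placeholder):
# outside AWS this should return a public IP; inside AWS, a private one
dig +short mydb.xxxxxxxx.us-east-1.rds.amazonaws.com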
3) Confirm network connectivity by establishing a socket connection.
Tools like tracepath and traceroute likely won't help since RDS currently drops ICMP traffic.
Test port connectivity by trying to establish a socket connection to the RDS instance on port 3306 (mysql, or 5432 for postgres). Start by finding the IP of the RDS instance and using either telnet or nc (be sure to use the internal/private IP if connecting from within AWS):
telnet x.x.x.x 3306
nc -vz x.x.x.x 3306
a) If your connection attempt immediately fails, the port is likely blocked or the remote host isn't running a service on that port. You may need to engage AWS support to troubleshoot further. If connecting from outside of AWS, try to connect from another instance inside AWS first (as your own firewall might be blocking those connections).
b) If your connection isn't successful and you get a timeout, packets are probably being dropped/ignored by a firewall, or packets are returning on a different network path. You can confirm this by running netstat -an | grep SYN (from a different ssh session while waiting for the telnet/nc command to time out).
Connections in the SYN state mean that you've sent a connection request, but haven't received anything back (SYN_ACK or reject/block). Usually this means a firewall or security group is ignoring or dropping packets.
It can also be a problem with NAT routing or multiple paths from multiple interfaces. Check to make sure you're not using iptables or a NAT gateway between your host and the RDS instance. If you're in a VPC, also make sure you allow egress/outbound traffic from the source host.
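If you want to double-check the egress rules without clicking through the console, the AWS CLI can dump them (the group ID below is hypothetical):
# list outbound rules for the source host's security group
aws ec2 describe-security-groups --group-ids sg-0123456789abcdef0 \
    --query 'SecurityGroups[].IpPermissionsEgress'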
c) If your socket connection test was successful, but you can't connect with a mysql client (CLI, workbench, app, etc.), take a look at the output of netstat to see what state the connection is in (replace x.x.x.x with the actual IP address of the RDS instance):
netstat -an | grep x.x.x.x
If you were getting a connection established when using telnet or nc, but you see the 'SYN' state when using a mysql client, you might be running into an MTU issue.
RDS, at the time this is written, may not support the ICMP packets used for PMTUD (https://en.wikipedia.org/wiki/Path_MTU_Discovery#Problems_with_PMTUD). This can be a problem if you're trying to access RDS or Redshift in a VPC from a classic EC2 instance via ClassicLink. Try lowering the MTU with the following, then test again:
sudo ip link show
# take note of the current MTU (likely 1500 or 9001)
sudo ip link set dev eth0 mtu 1400
If the lower MTU worked, be sure to follow up with AWS customer support for help and mention that you are seeing an MTU issue while trying to connect to your RDS instance. This can happen if TCP packets are wrapped with encapsulation for tunneling, resulting in a lower usable MTU for packet data / payload. Lowering the MTU on the source server allows the wrapped packets to still fit under the MTU limit while passing through the tunneling gateway.
If it didn't work, set your MTU back to its default and engage AWS support for further troubleshooting.
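Restoring the default is the same command with the original value; 1500 below is only a common default, so use whatever ip link reported before the change:
# revert to the MTU noted earlier (1500 is an assumption)
sudo ip link set dev eth0 mtu 1500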

Site to site OpenSWAN VPN tunnel issues with AWS

We have a VPN tunnel with Openswan between two AWS regions and our colo facility (we used AWS's guide: http://aws.amazon.com/articles/5472675506466066). Regular usage works OK (ssh, etc.), but we are having some MySQL issues over the tunnel between all areas. Whether using the mysql command-line client on a Linux server or connecting with MySQL Connector/J, it basically stalls: it seems to open the connection, but then gets stuck. It doesn't get denied or anything; it just hangs there.
After initial research I thought this was an MTU issue, but I've messed with that a lot with no luck.
Connecting to the server works fine, and we can choose a database to use and such, but with the Java connector it appears that the client isn't receiving any network traffic after the query is made.
When running a select in the MySQL client on Linux, we can get a maximum of 2 or 3 rows before it goes dead.
With this said, I also have a separate Openswan VPN on the AWS side for client (Mac and iOS) VPN connections. Everything works fantastically through the client VPN, and it seems more stable in general. The main difference I've noticed is that the static connection uses "tunnel" as the type while the client uses "transport"; but when I switch the static tunnel connection to transport, it reports around 30 open connections and doesn't work.
I'm very new to Openswan, so I'm hoping someone can help point me in the right direction to get the static tunnel working as well as the client VPN does.
As always, here are my config files:
ipsec.conf for BOTH static tunnel servers:
# basic configuration
config setup
    # Debug-logging controls: "none" for (almost) none, "all" for lots.
    # klipsdebug=none
    # plutodebug="control parsing"
    # For Red Hat Enterprise Linux and Fedora, leave protostack=netkey
    protostack=netkey
    nat_traversal=yes
    virtual_private=
    oe=off
    # Enable this if you see "failed to find any available worker"
    # nhelpers=0
# You may put your configuration (.conf) files in /etc/ipsec.d/ and uncomment this.
include /etc/ipsec.d/*.conf
VPC1-to-colo tunnel conf
conn vpc1-to-DT
    type=tunnel
    authby=secret
    left=%defaultroute
    leftid=54.213.24.xxx
    leftnexthop=%defaultroute
    leftsubnet=10.1.4.0/24
    right=72.26.103.xxx
    rightsubnet=10.1.2.0/23
    pfs=yes
    auto=start
colo-to-VPC1 tunnel conf
conn DT-to-vpc1
    type=tunnel
    authby=secret
    left=%defaultroute
    leftid=72.26.103.xxx
    leftnexthop=%defaultroute
    leftsubnet=10.1.2.0/23
    right=54.213.24.xxx
    rightsubnet=10.1.4.0/24
    pfs=yes
    auto=start
Client point VPN ipsec.conf
# basic configuration
config setup
    interfaces=%defaultroute
    klipsdebug=none
    nat_traversal=yes
    nhelpers=0
    oe=off
    plutodebug=none
    plutostderrlog=/var/log/pluto.log
    protostack=netkey
    virtual_private=%v4:10.1.4.0/24

conn L2TP-PSK
    authby=secret
    pfs=no
    auto=add
    keyingtries=3
    rekey=no
    type=transport
    forceencaps=yes
    right=%any
    rightsubnet=vhost:%any,%priv
    rightprotoport=17/0
    # Using the magic port of "0" means "any one single port". This is
    # a workaround required for Apple OSX clients that use a randomly
    # high port, but propose "0" instead of their port.
    left=%defaultroute
    leftprotoport=17/1701
    # Apple iOS doesn't send delete notify, so we need dead peer detection
    # to detect vanishing clients
    dpddelay=10
    dpdtimeout=90
    dpdaction=clear
Found the solution. I needed to add the following iptables rule on both ends:
iptables -t mangle -I POSTROUTING -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
This, along with an MTU of 1400, and we're looking very solid.
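To confirm the rule is installed and actually matching SYN packets, list the mangle table with counters:
# the packet/byte counters on the TCPMSS rule should increase as
# new connections are opened across the tunnel
iptables -t mangle -L POSTROUTING -v -n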
We had the same issue with a server connecting from the EU region to an RDS instance in the US. This appears to be a known issue: RDS instances do not respond to the ICMP packets needed to auto-discover the MTU. As a workaround, you'll need to configure a smaller MTU on the instance that is performing the query.
On the server that is making the connection to the RDS instance (not the VPN tunnel instances), run the following command to set an MTU of 1422 (which worked for us):
sudo ifconfig eth0 mtu 1422
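Note that an MTU set with ifconfig does not survive a reboot; for reference, the iproute2 equivalent of the same change is:
sudo ip link set dev eth0 mtu 1422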