Asterisk Realtime crashing under load when using HAProxy in front of a Galera cluster

This works fine under light load on our test bench, but once we move it to production the whole thing crashes and we cannot get Asterisk to function correctly. It is almost as if there is a lag or delay in accessing the MariaDB cluster.
Our architecture and configs are below:
Asterisk 13 Realtime with HAProxy(1.5.18) --> 6 x MariaDB(10.4.11) on independent Datacentres with Galera syncing them (1 only as backup)
Galera sync is working fine, and other services are able to read and write via HAProxy without any problems.
It only seems to become an issue when we add load, reload the dialplan, or restart Asterisk.
[haproxy.cfg]
global
    user haproxy
    group haproxy
defaults
    mode http
    log global
    retries 2
    timeout connect 3000ms
    timeout server 10h
    timeout client 10h
listen stats
    bind *:8404
    stats enable
    stats hide-version
    stats uri /stats
listen mysql-cluster
    bind 127.0.0.1:3306
    mode tcp
    option mysql-check user haproxy_check
    balance roundrobin
    server mysql_server1 10.0.0.1:3306 check
    server mysql_server2 10.0.0.2:3306 check
    server mysql_server3 10.0.0.3:3306 check
    server mysql_server4 10.0.0.4:3306 check
    server mysql_server5 10.0.0.5:3306 check
    server mysql_server6 10.0.0.6:3306 check backup
First, we would like to know whether Asterisk 13 Realtime will work via HAProxy at all and, if so, whether there are config changes we need to make to get it working.
We can provide more info if required.

Try using Realtime -> ODBC -> HAProxy.
If that does not help, debug the crash, for example with gdb backtraces.
There is no way to determine what the issue is from the information given. More logs and configs are needed.

Related

Mediasoup: Connection state changed to disconnected a few tens of seconds after connected

We use mediasoup to create our products. However, I am having problems with the transport connection.
The client transport connection state changes to disconnected a few tens of seconds after connecting.
The following log is output in the Chrome console:
mediasoup-client:Transport connection state changed to connected
However, the following log is output in the Chrome console a few tens of seconds later:
mediasoup-client:Transport connection state changed to disconnected
If the NewProducer is present before the disconnection, the above will not happen.
Do you know the possible causes?
Resolved. I changed the AWS security policy according to the topic below and it worked.
You’ll also need to configure your AWS Security Group to allow TCP/UDP on whatever port range you’re using.
https://mediasoup.discourse.group/t/docker-setup-with-listenips/2557/4

2 mySQL clusters in HAProxy

We use HAProxy (1.5) to proxy MySQL to 4 Galera nodes. We use roundrobin, and it works well for high availability and load balancing.
See /etc/haproxy/haproxy.cfg:
global
    user haproxy
    group haproxy
defaults
    mode http
    log global
    retries 2
    timeout connect 3000ms
    timeout server 10h
    timeout client 10h
listen stats
    bind *:8404
    stats enable
    stats hide-version
    stats uri /stats
listen mysql-cluster
    bind 127.0.0.1:3306
    mode tcp
    option mysql-check user haproxy_check
    balance roundrobin
    server dbcl_01_dc1 xx.xx.xx.xx:3306 check
    server dbcl_03_dc6 1xx.xx.xx.xx:3306 check
    server dbcl_04_do xx.xx.xx.xx:3306 check
    server dbcl_05_dc4 xx.xx.xx.xx:3306 check
This works great, but we fear the cluster failing us some day, and we would like HAProxy to roll over to another MySQL server should none of the above 4 Galera nodes be available. We would only want this last server used in a doomsday scenario, as its data is one hour behind the production cluster and, more importantly, it is a different dataset. The idea is that we automatically roll over to our non-clustered MySQL data from one hour behind and keep our customers operating.
Does anybody know if this is possible with HAProxy? That is, the first 4 servers in roundrobin and, if none of them are available, a single non-clustered database server as a last resort.
You can use the backup keyword on a server line to configure failover:
listen mysql-cluster
    bind 127.0.0.1:3306
    mode tcp
    option mysql-check user haproxy_check
    balance roundrobin
    server dbcl_01_dc1 xx.xx.xx.xx:3306 check
    server dbcl_03_dc6 xx.xx.xx.xx:3306 check
    server dbcl_04_dc2 xx.xx.xx.xx:3306 check
    server dbcl_05_dc4 xx.xx.xx.xx:3306 check
    # Solution: backup server, used only when none of the servers above are up
    server dbbk_01_dc1 xx.xx.xx.xx:3306 check backup
In this case, if all 4 servers in the cluster go down, traffic will be routed to the backup server.
However, you can also configure multiple backup servers:
listen mysql-cluster
    bind 127.0.0.1:3306
    mode tcp
    option mysql-check user haproxy_check
    balance roundrobin
    server dbcl_01_dc1 xx.xx.xx.xx:3306 check
    server dbcl_03_dc6 xx.xx.xx.xx:3306 check
    server dbcl_04_dc2 xx.xx.xx.xx:3306 check
    server dbcl_05_dc4 xx.xx.xx.xx:3306 check
    # Solution: two backup servers, used only when none of the servers above are up
    server dbbk_01_dc1 xx.xx.xx.xx:3306 check backup
    server dbbk_02_dc2 xx.xx.xx.xx:3306 check backup
In the above configuration, HAProxy uses the first backup server until it goes down, and only then fails over to the second backup server.
If there is a huge traffic surge and you want multiple backups to handle all your traffic, you can also set option allbackups, which balances traffic across all the backup servers.
The official documentation covers much more complex setups.
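As a rough sketch of the option allbackups variant mentioned above, the listen section might look like the following. The server names and masked addresses are carried over from the examples above; where the option line goes and the comments are illustrative, not taken from the original answer. Note that this only makes sense if every backup server holds the same data.

```
listen mysql-cluster
    bind 127.0.0.1:3306
    mode tcp
    option mysql-check user haproxy_check
    # Once all non-backup servers are down, balance across every backup
    # server instead of using only the first one that is up
    option allbackups
    balance roundrobin
    server dbcl_01_dc1 xx.xx.xx.xx:3306 check
    server dbcl_03_dc6 xx.xx.xx.xx:3306 check
    server dbcl_04_dc2 xx.xx.xx.xx:3306 check
    server dbcl_05_dc4 xx.xx.xx.xx:3306 check
    server dbbk_01_dc1 xx.xx.xx.xx:3306 check backup
    server dbbk_02_dc2 xx.xx.xx.xx:3306 check backup
```

For the doomsday scenario in the question, where there is a single fallback server with a different dataset, the plain backup keyword without option allbackups is probably the closer fit.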

MySQL stuck or network issue?

We have mysql-server (5.5.47) hosted on a physical server. It listens on an external internet interface (with restricted user access), and the MySQL server is used intensively from different places (we use different libraries to communicate with MySQL). But sometimes the whole MySQL server (or the network) gets stuck and stops accepting connections, and clients fail with etimedout (connect) or timeout (recv). Even a direct connection from the server itself with the mysql CLI does not work (it hangs without any response and seems to be trying to establish a connection).
Our first thought was that it was related to the TCP backlog, so the MySQL backlog was increased, but this did not help at all.
The issue is not reproducible on demand, so the last time it happened we sniffed the traffic, and this is what we got:
http://grab.by/STwq — screenshot
*.*.27.65 — it is client
*.*.20.80 — it is mysql server
From the session we can assume that the TCP connection was established, but the server retransmits SYN/ACK to the client (from the dump we see that the server received the ACK, so why retransmit?). In the normal case, MySQL should generate its init packet and send it to the client after the connection is established.
This is a screenshot of only 1 session, but all other sessions look mostly the same: SYN -> SYN/ACK -> ACK, and then the server retransmits SYN/ACK up to retries_count.
After restarting MySQL everything returns to normal immediately, so we are not sure whether it is related to the network or to MySQL.
Any thoughts would be appreciated.
Thank you!

AWS RDS Aborted Connection Haproxy

I created 1 master and 2 replicas in AWS RDS, and 1 EC2 instance with HAProxy:
listen rds-cluster
    bind 172.30.0.xxx:3306
    mode tcp
    option mysql-check user ha_check
    balance roundrobin
    server mysql-1 replica1.xxxx.ap-southeast-1.rds.amazonaws.com:3306 check weight 1 fall 2 fastinter 1000
    server mysql-2 replica2.xxxx.ap-southeast-1.rds.amazonaws.com:3306 check weight 1 fall 2 fastinter 1000
I can connect directly to a replica server using its endpoint,
but if I connect through HAProxy:
$ mysql -h172.30.0.xxx -uha_read -ppassword -e "show variables like 'server_id'"
ERROR 2013 (HY000): Lost connection to MySQL server at 'reading initial communication packet', system error: 0
I get that error.
I have already increased connect_timeout.
If I check
SHOW GLOBAL STATUS LIKE 'Aborted_connects';
it keeps increasing.
===============
This article solved my problem:
CUSTOM CONFIGURATION OF AMAZON RDS INSTANCES
By default, if you did not change the security group settings when launching RDS, only your IP is authorized to reach your databases. In your case you need to authorize your HAProxy node to reach your databases as well.
Go to RDS, select your instance, then security group, edit, and add a new rule allowing either the security group of your HAProxy (best practice) or the HAProxy IP (still good enough if this is an Elastic IP) to access the database on port 3306.
Hope this is clear enough :)
EDIT: I understand that you solved your issue, but for people reading later (or even for you, if you want to enhance security) I am adding a little information about what I said:
The RDS hostname is resolved to a private IP when the DNS query is made from an instance in the same VPC to the Amazon-provided DNS server in that VPC. Thus, in that case, your security group would have to allow either the subnet of your HAProxy or its private IP (not the public one).

How can I configure HAProxy to work with server sent events?

I'm trying to add an endpoint that sends Server-Sent Events to an existing application. There may often be no event for ~5 minutes. I'm hoping to configure that endpoint so the connection to my server is not cut off even when the response has not completed in ~1 min, while all other endpoints still time out if the server fails to respond.
Is there an easy way to support server sent events in HAProxy?
Here is my suggestion for HAProxy and SSE: HAProxy has plenty of custom timeout options, and two of them are interesting for you.
timeout tunnel specifies the timeout for tunnel connections, used for WebSockets, SSE or CONNECT. It overrides both the server and client timeouts.
timeout client handles the situation where a client loses its connection (network loss, disappearing before the ACK that ends the session, etc.).
In your haproxy.cfg, this is what you should do, first in your defaults section:
# Set the max time to wait for a connection attempt to a server to succeed
timeout connect 30s
# Set the maximum inactivity time on the client side
timeout client 50s
# Set the maximum inactivity time on the server side
timeout server 50s
Nothing special so far.
Now, still in the defaults section :
# handle the situation where a client suddenly disappears from the net
timeout client-fin 30s
Next, jump to your backend definition and add this:
timeout tunnel 10h
I suggest a high value; 10 hours seems OK.
You should also avoid using the default http-keep-alive option, since SSE does not use it. Instead, use http-server-close.
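Putting the pieces above together, a minimal sketch of the whole configuration could look like this. The frontend, the backend names (app_backend, sse_backend), the /events path prefix and the 127.0.0.1:8080 server address are all hypothetical, introduced only to show how the SSE endpoint can be routed to its own backend so that the long timeout does not apply to the other endpoints.

```
defaults
    mode http
    option http-server-close
    # Max time to wait for a connection attempt to a server to succeed
    timeout connect 30s
    # Max inactivity time on the client side
    timeout client 50s
    # Max inactivity time on the server side
    timeout server 50s
    # Handle clients that suddenly disappear from the net
    timeout client-fin 30s

frontend www
    bind *:80
    # Hypothetical path prefix for the SSE endpoint
    acl is_sse path_beg /events
    use_backend sse_backend if is_sse
    default_backend app_backend

backend app_backend
    # Regular endpoints keep the 50s timeouts from defaults
    server app1 127.0.0.1:8080

backend sse_backend
    # Long-lived event streams only
    timeout tunnel 10h
    server app1 127.0.0.1:8080
```

Depending on the HAProxy version and HTTP mode, an idle SSE stream may be governed by timeout server rather than timeout tunnel. If streams are still cut after about 50 seconds, raising timeout server inside sse_backend only is worth trying, and the behaviour should be checked against the documentation for the version in use.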