My node can't join the cluster.
I installed both servers on Debian 10 Linux. The first server bootstraps fine, but the second node can't join it.
I tried several settings and followed the guide, without success.
My server configuration is this:
# Template my.cnf for PXC
# Edit to your requirements.
[client]
socket=/var/run/mysqld/mysqld.sock
ssl-ca=/etc/mysql/certs/ca.pem
ssl-cert=/etc/mysql/certs/client-cert.pem
ssl-key=/etc/mysql/certs/client-key.pem
[mysqld]
server-id=1
datadir=/var/lib/mysql
socket=/var/run/mysqld/mysqld.sock
log-error=/var/log/mysql/error.log
pid-file=/var/run/mysqld/mysqld.pid
ssl-ca=/etc/mysql/certs/ca.pem
ssl-cert=/etc/mysql/certs/server-cert.pem
ssl-key=/etc/mysql/certs/server-key.pem
lower_case_table_names=1
#innodb_dedicated_server=1
innodb_buffer_pool_size=2500M
innodb_log_buffer_size=16M
#expire_logs_days = 3
#max_binlog_size = 300M
#default-storage-engine = InnoDB
collation-server = utf8_unicode_ci
init-connect='SET NAMES utf8'
character-set-server = utf8
# Binary log expiration period is 604800 seconds, which equals 7 days
binlog_expire_logs_seconds=604800
######## wsrep ###############
# Path to Galera library
wsrep_provider=/usr/lib/galera4/libgalera_smm.so
# Cluster connection URL contains IPs of nodes
#If no IP is found, this implies that a new cluster needs to be created,
#in order to do that you need to bootstrap this node
#wsrep_cluster_address=gcomm://
wsrep_cluster_address=gcomm://152.44.34.207,3.143.85.192
# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW
# Slave thread to use
wsrep_slave_threads=8
wsrep_log_conflicts
# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2
# Node IP address
wsrep_node_address=152.44.34.207
# Cluster name
wsrep_cluster_name=pxc-cluster
#If wsrep_node_name is not specified, then system hostname will be used
wsrep_node_name=pxc-cluster-node-1
#pxc_strict_mode allowed values: DISABLED,PERMISSIVE,ENFORCING,MASTER
pxc_strict_mode=ENFORCING
# SST method
wsrep_sst_method=xtrabackup-v2
wsrep_provider_options=”socket.ssl_key=server-key.pem;socket.ssl_cert=server-cert.pem;socket.ssl_ca=ca.pem”
[sst]
encrypt=4
ssl-key=server-key.pem
ssl-ca=ca.pem
ssl-cert=server-cert.pem
My node configuration is this:
# Template my.cnf for PXC
# Edit to your requirements.
[client]
socket=/var/run/mysqld/mysqld.sock
ssl-ca=/etc/mysql/certs/ca.pem
ssl-cert=/etc/mysql/certs/client-cert.pem
ssl-key=/etc/mysql/certs/client-key.pem
[mysqld]
server-id=1
datadir=/var/lib/mysql
socket=/var/run/mysqld/mysqld.sock
log-error=/var/log/mysql/error.log
pid-file=/var/run/mysqld/mysqld.pid
ssl-ca=/etc/mysql/certs/ca.pem
ssl-cert=/etc/mysql/certs/server-cert.pem
ssl-key=/etc/mysql/certs/server-key.pem
lower_case_table_names=1
#innodb_dedicated_server=1
innodb_buffer_pool_size=2500M
innodb_log_buffer_size=16M
#expire_logs_days = 3
#max_binlog_size = 300M
#default-storage-engine = InnoDB
collation-server = utf8_unicode_ci
init-connect='SET NAMES utf8'
character-set-server = utf8
# Binary log expiration period is 604800 seconds, which equals 7 days
binlog_expire_logs_seconds=604800
######## wsrep ###############
# Path to Galera library
wsrep_provider=/usr/lib/galera4/libgalera_smm.so
# Cluster connection URL contains IPs of nodes
#If no IP is found, this implies that a new cluster needs to be created,
#in order to do that you need to bootstrap this node
#wsrep_cluster_address=gcomm://
wsrep_cluster_address=gcomm://152.44.34.207,3.143.85.192
# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW
# Slave thread to use
wsrep_slave_threads=8
wsrep_log_conflicts
# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2
# Node IP address
wsrep_node_address=3.143.85.192
# Cluster name
wsrep_cluster_name=pxc-cluster
#If wsrep_node_name is not specified, then system hostname will be used
wsrep_node_name=pxc-cluster-node-2
#pxc_strict_mode allowed values: DISABLED,PERMISSIVE,ENFORCING,MASTER
pxc_strict_mode=ENFORCING
# SST method
wsrep_sst_method=xtrabackup-v2
wsrep_provider_options=”socket.ssl_key=server-key.pem;socket.ssl_cert=server-cert.pem;socket.ssl_ca=ca.pem”
[sst]
encrypt=4
ssl-key=server-key.pem
ssl-ca=ca.pem
ssl-cert=server-cert.pem
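One detail worth checking in both files: the wsrep_provider_options value is wrapped in typographic quotes (”...”) rather than ASCII double quotes, and the node log reports "Unrecognized parameter '”socket.ssl_key'". Below is a minimal sketch, assuming Python 3 is available, that flags such characters in a my.cnf-style file. The helper names find_smart_quotes and to_ascii_quotes are hypothetical and not part of any PXC tooling, and whether this is the cause of the failed join is not established by the logs alone.

```python
# Hypothetical helpers: flag typographic quotes in my.cnf-style text.
SMART_QUOTES = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
}


def find_smart_quotes(text):
    """Return (line_number, line) pairs for lines containing typographic quotes."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(ch in line for ch in SMART_QUOTES):
            hits.append((number, line))
    return hits


def to_ascii_quotes(text):
    """Replace typographic quotes with their ASCII equivalents."""
    return text.translate(str.maketrans(SMART_QUOTES))


# Sample text copied from the configuration above (only the relevant lines).
sample = (
    "[mysqld]\n"
    "wsrep_sst_method=xtrabackup-v2\n"
    "wsrep_provider_options=\u201dsocket.ssl_key=server-key.pem;"
    "socket.ssl_cert=server-cert.pem;socket.ssl_ca=ca.pem\u201d\n"
)

for number, line in find_smart_quotes(sample):
    print(f"line {number}: {line}")
```

To check a real file, read it with open(path, encoding="utf-8").read() and pass the text to find_smart_quotes.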
The server error log is:
2021-06-22T13:53:46.793403Z 0 [Note] [MY-000000] [Galera] SSL handshake successful, remote endpoint ssl://3.143.85.192:42862 local endpoint ssl://152.44.34.207:4567 cipher: TLS_AES_256_GCM_SHA384 compression: none
2021-06-22T13:53:46.815265Z 0 [Note] [MY-000000] [Galera] (34eb80d9-8ec1, 'ssl://0.0.0.0:4567') connection established to 43c3db4f-8b6f ssl://3.143.85.192:4567
2021-06-22T13:53:46.836932Z 0 [Note] [MY-000000] [Galera] (34eb80d9-8ec1, 'ssl://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
2021-06-22T13:53:47.289191Z 0 [Note] [MY-000000] [Galera] declaring 43c3db4f-8b6f at ssl://3.143.85.192:4567 stable
2021-06-22T13:53:47.310756Z 0 [Note] [MY-000000] [Galera] Node 34eb80d9-8ec1 state primary
2021-06-22T13:53:47.332131Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(PRIM,34eb80d9-8ec1,2)
memb {
34eb80d9-8ec1,0
43c3db4f-8b6f,0
}
joined {
}
left {
}
partitioned {
}
)
2021-06-22T13:53:47.332202Z 0 [Note] [MY-000000] [Galera] Save the discovered primary-component to disk
2021-06-22T13:53:47.335049Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2
2021-06-22T13:53:47.335191Z 0 [Note] [MY-000000] [Galera] STATE_EXCHANGE: sent state UUID: 4422c892-d361-11eb-91ba-5a82f4f6395e
2021-06-22T13:53:47.360122Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: sent state msg: 4422c892-d361-11eb-91ba-5a82f4f6395e
2021-06-22T13:53:47.381654Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: 4422c892-d361-11eb-91ba-5a82f4f6395e from 0 (pxc-cluster-node-1)
2021-06-22T13:53:47.768196Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: 4422c892-d361-11eb-91ba-5a82f4f6395e from 1 (pxc-cluster-node-2)
2021-06-22T13:53:47.768277Z 0 [Note] [MY-000000] [Galera] Quorum results:
version = 6,
component = PRIMARY,
conf_id = 1,
members = 1/2 (primary/total),
act_id = 33,
last_appl. = 32,
protocols = 2/10/4 (gcs/repl/appl),
vote policy= 0,
group UUID = 45dca0ba-d2cf-11eb-bec1-7eba7f7a7dfe
2021-06-22T13:53:47.768356Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [141, 141]
2021-06-22T13:53:47.768461Z 2 [Note] [MY-000000] [Galera] ####### processing CC 34, local, ordered
2021-06-22T13:53:47.768494Z 2 [Note] [MY-000000] [Galera] Maybe drain monitors from 33 upto current CC event 34 upto:33
2021-06-22T13:53:47.768506Z 2 [Note] [MY-000000] [Galera] Drain monitors from 33 up to 33
2021-06-22T13:53:47.768516Z 2 [Note] [MY-000000] [Galera] ####### My UUID: 34eb80d9-d361-11eb-8ec1-5a3b0be2fce9
2021-06-22T13:53:47.768524Z 2 [Note] [MY-000000] [Galera] Skipping cert index reset
2021-06-22T13:53:47.768531Z 2 [Note] [MY-000000] [Galera] REPL Protocols: 10 (5)
2021-06-22T13:53:47.768540Z 2 [Note] [MY-000000] [Galera] ####### Adjusting cert position: 33 -> 34
2021-06-22T13:53:47.768580Z 0 [Note] [MY-000000] [Galera] Service thread queue flushed.
2021-06-22T13:53:47.771208Z 2 [Note] [MY-000000] [Galera] ================================================
View:
id: 45dca0ba-d2cf-11eb-bec1-7eba7f7a7dfe:34
status: primary
protocol_version: 4
capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
final: no
own_index: 0
members(2):
0: 34eb80d9-d361-11eb-8ec1-5a3b0be2fce9, pxc-cluster-node-1
1: 43c3db4f-d361-11eb-8b6f-3e5ca8791b18, pxc-cluster-node-2
=================================================
2021-06-22T13:53:47.771239Z 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2021-06-22T13:53:47.775065Z 2 [Note] [MY-000000] [Galera] Recording CC from group: 34
2021-06-22T13:53:47.775089Z 2 [Note] [MY-000000] [Galera] Lowest cert index boundary for CC from group: 34
2021-06-22T13:53:47.775098Z 2 [Note] [MY-000000] [Galera] Min available from gcache for CC from group: 1
2021-06-22T13:53:48.811770Z 0 [Note] [MY-000000] [Galera] Member 1.0 (pxc-cluster-node-2) requested state transfer from '*any*'. Selected 0.0 (pxc-cluster-node-1)(SYNCED) as donor.
2021-06-22T13:53:48.811843Z 0 [Note] [MY-000000] [Galera] Shifting SYNCED -> DONOR/DESYNCED (TO: 34)
2021-06-22T13:53:48.811888Z 2 [Note] [MY-000000] [Galera] Detected STR version: 1, req_len: 64, req: STRv1
2021-06-22T13:53:48.811912Z 2 [Warning] [MY-000000] [Galera] Joiner didn't provide IST connection info - cert. index preload impossible, bailing out.
2021-06-22T13:53:48.833331Z 0 [Warning] [MY-000000] [Galera] 0.0 (pxc-cluster-node-1): State transfer to 1.0 (pxc-cluster-node-2) failed: -42 (No message of desired type)
2021-06-22T13:53:48.833367Z 0 [Note] [MY-000000] [Galera] Shifting DONOR/DESYNCED -> JOINED (TO: 34)
2021-06-22T13:53:48.854846Z 0 [Note] [MY-000000] [Galera] Member 0.0 (pxc-cluster-node-1) synced with group.
2021-06-22T13:53:48.854899Z 0 [Note] [MY-000000] [Galera] Shifting JOINED -> SYNCED (TO: 34)
2021-06-22T13:53:48.854968Z 16 [Note] [MY-000000] [Galera] Server pxc-cluster-node-1 synced with group
2021-06-22T13:53:49.855435Z 0 [Note] [MY-000000] [Galera] forgetting 43c3db4f-8b6f (ssl://3.143.85.192:4567)
2021-06-22T13:53:49.855498Z 0 [Note] [MY-000000] [Galera] (34eb80d9-8ec1, 'ssl://0.0.0.0:4567') turning message relay requesting off
2021-06-22T13:53:49.855528Z 0 [Note] [MY-000000] [Galera] Node 34eb80d9-8ec1 state primary
2021-06-22T13:53:49.855563Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(PRIM,34eb80d9-8ec1,3)
memb {
34eb80d9-8ec1,0
}
joined {
}
left {
}
partitioned {
43c3db4f-8b6f,0
}
)
2021-06-22T13:53:49.855575Z 0 [Note] [MY-000000] [Galera] Save the discovered primary-component to disk
2021-06-22T13:53:49.857453Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
2021-06-22T13:53:49.857546Z 0 [Note] [MY-000000] [Galera] forgetting 43c3db4f-8b6f (ssl://3.143.85.192:4567)
2021-06-22T13:53:49.857598Z 0 [Note] [MY-000000] [Galera] STATE_EXCHANGE: sent state UUID: 45a3abe8-d361-11eb-a845-5e19d7621cf0
2021-06-22T13:53:49.857618Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: sent state msg: 45a3abe8-d361-11eb-a845-5e19d7621cf0
2021-06-22T13:53:49.857634Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: 45a3abe8-d361-11eb-a845-5e19d7621cf0 from 0 (pxc-cluster-node-1)
2021-06-22T13:53:49.857663Z 0 [Note] [MY-000000] [Galera] Quorum results:
version = 6,
component = PRIMARY,
conf_id = 2,
members = 1/1 (primary/total),
act_id = 34,
last_appl. = 32,
protocols = 2/10/4 (gcs/repl/appl),
vote policy= 0,
group UUID = 45dca0ba-d2cf-11eb-bec1-7eba7f7a7dfe
2021-06-22T13:53:49.857699Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [100, 100]
2021-06-22T13:53:49.857801Z 12 [Note] [MY-000000] [Galera] ####### processing CC 35, local, ordered
2021-06-22T13:53:49.857835Z 12 [Note] [MY-000000] [Galera] Maybe drain monitors from 34 upto current CC event 35 upto:34
2021-06-22T13:53:49.857845Z 12 [Note] [MY-000000] [Galera] Drain monitors from 34 up to 34
2021-06-22T13:53:49.857854Z 12 [Note] [MY-000000] [Galera] ####### My UUID: 34eb80d9-d361-11eb-8ec1-5a3b0be2fce9
2021-06-22T13:53:49.857862Z 12 [Note] [MY-000000] [Galera] Skipping cert index reset
2021-06-22T13:53:49.857869Z 12 [Note] [MY-000000] [Galera] REPL Protocols: 10 (5)
2021-06-22T13:53:49.857877Z 12 [Note] [MY-000000] [Galera] ####### Adjusting cert position: 34 -> 35
2021-06-22T13:53:49.857940Z 0 [Note] [MY-000000] [Galera] Service thread queue flushed.
2021-06-22T13:53:49.859286Z 12 [Note] [MY-000000] [Galera] ================================================
View:
id: 45dca0ba-d2cf-11eb-bec1-7eba7f7a7dfe:35
status: primary
protocol_version: 4
capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
final: no
own_index: 0
members(1):
0: 34eb80d9-d361-11eb-8ec1-5a3b0be2fce9, pxc-cluster-node-1
=================================================
2021-06-22T13:53:49.859323Z 12 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2021-06-22T13:53:49.862140Z 12 [Note] [MY-000000] [Galera] Recording CC from group: 35
2021-06-22T13:53:49.862164Z 12 [Note] [MY-000000] [Galera] Lowest cert index boundary for CC from group: 35
2021-06-22T13:53:49.862173Z 12 [Note] [MY-000000] [Galera] Min available from gcache for CC from group: 1
2021-06-22T13:53:55.316854Z 0 [Note] [MY-000000] [Galera] cleaning up 43c3db4f-8b6f (ssl://3.143.85.192:4567)
The node log is:
2021-06-22T13:53:46.446447Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.23-14.1) starting as process 28549
2021-06-22T13:53:46.447416Z 0 [Warning] [MY-013242] [Server] --character-set-server: 'utf8' is currently an alias for the character set UTF8MB3, but will be an alias for UTF8MB4 in a future release. Please consider using UTF8MB4 in order to be unambiguous.
2021-06-22T13:53:46.447428Z 0 [Warning] [MY-013244] [Server] --collation-server: 'utf8_unicode_ci' is a collation of the deprecated character set UTF8MB3. Please consider using UTF8MB4 with an appropriate collation instead.
2021-06-22T13:53:46.450889Z 0 [Warning] [MY-010068] [Server] CA certificate /etc/mysql/certs/ca.pem is self signed.
2021-06-22T13:53:46.450949Z 0 [System] [MY-013602] [Server] Channel mysql_main configured to support TLS. Encrypted connections are now supported for this channel.
2021-06-22T13:53:46.450985Z 0 [Note] [MY-000000] [WSREP] New joining cluster node configured to use specified SSL artifacts
2021-06-22T13:53:46.451039Z 0 [Note] [MY-000000] [Galera] Loading provider /usr/lib/galera4/libgalera_smm.so initial position: 69c901dc-d2e9-11eb-ac90-5ec816a285fd:1
2021-06-22T13:53:46.451084Z 0 [Note] [MY-000000] [Galera] wsrep_load(): loading provider library '/usr/lib/galera4/libgalera_smm.so'
2021-06-22T13:53:46.451707Z 0 [Note] [MY-000000] [Galera] wsrep_load(): Galera 4.7(r752664d) by Codership Oy <info#codership.com> loaded successfully.
2021-06-22T13:53:46.451773Z 0 [Note] [MY-000000] [Galera] CRC-32C: using 64-bit x86 acceleration.
2021-06-22T13:53:46.452068Z 0 [ERROR] [MY-000000] [Galera] Unrecognized parameter '”socket.ssl_key'
2021-06-22T13:53:46.452458Z 0 [Note] [MY-000000] [Galera] Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 1
2021-06-22T13:53:46.452532Z 0 [Note] [MY-000000] [Galera] GCache DEBUG: opened preamble:
Version: 2
UUID: 45dca0ba-d2cf-11eb-bec1-7eba7f7a7dfe
Seqno: -1 - -1
Offset: -1
Synced: 0
2021-06-22T13:53:46.452548Z 0 [Note] [MY-000000] [Galera] Recovering GCache ring buffer: version: 2, UUID: 45dca0ba-d2cf-11eb-bec1-7eba7f7a7dfe, offset: -1
2021-06-22T13:53:46.452610Z 0 [Note] [MY-000000] [Galera] GCache::RingBuffer initial scan... 0.0% ( 0/134217752 bytes) complete.
2021-06-22T13:53:46.503384Z 0 [Note] [MY-000000] [Galera] GCache::RingBuffer initial scan...100.0% (134217752/134217752 bytes) complete.
2021-06-22T13:53:46.503424Z 0 [Note] [MY-000000] [Galera] Recovering GCache ring buffer: didn't recover any events.
2021-06-22T13:53:46.504534Z 0 [Note] [MY-000000] [Galera] Complete reset of the galera cache
2021-06-22T13:53:46.584064Z 0 [Note] [MY-000000] [Galera] Flushing memory map to disk...
2021-06-22T13:53:46.703025Z 0 [Note] [MY-000000] [Galera] Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 3.143.85.192; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 10; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 4; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.freeze_purge_at_seqno = -1; gcache.keep_pages_count = 0; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 100; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.recovery = true; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = PT30S; pc.weight = 1; protonet.backend = asio; protonet.version = 0; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.max_ws_size = 2147483647; repl.proto_max = 10; socket.checksum = 2; socket.recv_buf_size = auto; socket.send_buf_size = auto; socket.ssl_ca = /etc/mysql/certs/ca.pem; socket.ssl_cert = /etc/mysql/certs/server-cert.pem; socket.ssl_cipher = ; socket.ssl_compression = YES; socket.ssl_key = /etc/mysql/certs/server-key.pem;
2021-06-22T13:53:46.712223Z 0 [Note] [MY-000000] [WSREP] Starting replication
2021-06-22T13:53:46.712273Z 0 [Note] [MY-000000] [Galera] Connecting with bootstrap option: 0
2021-06-22T13:53:46.712294Z 0 [Note] [MY-000000] [Galera] Setting GCS initial position to 00000000-0000-0000-0000-000000000000:-1
2021-06-22T13:53:46.712356Z 0 [Note] [MY-000000] [Galera] protonet asio version 0
2021-06-22T13:53:46.712513Z 0 [Note] [MY-000000] [Galera] Using CRC-32C for message checksums.
2021-06-22T13:53:46.712534Z 0 [Note] [MY-000000] [Galera] initializing ssl context
2021-06-22T13:53:46.712764Z 0 [Note] [MY-000000] [Galera] backend: asio
2021-06-22T13:53:46.712847Z 0 [Note] [MY-000000] [Galera] gcomm thread scheduling priority set to other:0
2021-06-22T13:53:46.712932Z 0 [Warning] [MY-000000] [Galera] Fail to access the file (/var/lib/mysql//gvwstate.dat) error (No such file or directory). It is possible if node is booting for first time or re-booting after a graceful shutdown
2021-06-22T13:53:46.712949Z 0 [Note] [MY-000000] [Galera] Restoring primary-component from disk failed. Either node is booting for first time or re-booting after a graceful shutdown
2021-06-22T13:53:46.713090Z 0 [Note] [MY-000000] [Galera] GMCast version 0
2021-06-22T13:53:46.713214Z 0 [Note] [MY-000000] [Galera] (43c3db4f-8b6f, 'ssl://0.0.0.0:4567') listening at ssl://0.0.0.0:4567
2021-06-22T13:53:46.713228Z 0 [Note] [MY-000000] [Galera] (43c3db4f-8b6f, 'ssl://0.0.0.0:4567') multicast: , ttl: 1
2021-06-22T13:53:46.713509Z 0 [Note] [MY-000000] [Galera] EVS version 1
2021-06-22T13:53:46.713588Z 0 [Note] [MY-000000] [Galera] gcomm: connecting to group 'pxc-cluster', peer '152.44.34.207:,3.143.85.192:'
2021-06-22T13:53:46.761126Z 0 [Note] [MY-000000] [Galera] SSL handshake successful, remote endpoint ssl://152.44.34.207:4567 local endpoint ssl://172.31.36.42:42862 cipher: TLS_AES_256_GCM_SHA384 compression: none
2021-06-22T13:53:46.805155Z 0 [Note] [MY-000000] [Galera] (43c3db4f-8b6f, 'ssl://0.0.0.0:4567') connection established to 34eb80d9-8ec1 ssl://152.44.34.207:4567
2021-06-22T13:53:46.805259Z 0 [Note] [MY-000000] [Galera] (43c3db4f-8b6f, 'ssl://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
2021-06-22T13:53:47.278818Z 0 [Note] [MY-000000] [Galera] EVS version upgrade 0 -> 1
2021-06-22T13:53:47.278864Z 0 [Note] [MY-000000] [Galera] declaring 34eb80d9-8ec1 at ssl://152.44.34.207:4567 stable
2021-06-22T13:53:47.278886Z 0 [Note] [MY-000000] [Galera] PC protocol upgrade 0 -> 1
2021-06-22T13:53:47.300343Z 0 [Note] [MY-000000] [Galera] Node 34eb80d9-8ec1 state primary
2021-06-22T13:53:47.324984Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(PRIM,34eb80d9-8ec1,2)
memb {
34eb80d9-8ec1,0
43c3db4f-8b6f,0
}
joined {
}
left {
}
partitioned {
}
)
2021-06-22T13:53:47.325016Z 0 [Note] [MY-000000] [Galera] Save the discovered primary-component to disk
2021-06-22T13:53:47.328229Z 0 [Note] [MY-000000] [Galera] discarding pending addr without UUID: ssl://3.143.85.192:4567
2021-06-22T13:53:47.328250Z 0 [Note] [MY-000000] [Galera] discarding pending addr proto entry 0x55cd9e76eb50
2021-06-22T13:53:47.714199Z 0 [Note] [MY-000000] [Galera] gcomm: connected
2021-06-22T13:53:47.714266Z 0 [Note] [MY-000000] [Galera] Changing maximum packet size to 64500, resulting msg size: 32636
2021-06-22T13:53:47.714442Z 0 [Note] [MY-000000] [Galera] Shifting CLOSED -> OPEN (TO: 0)
2021-06-22T13:53:47.714463Z 0 [Note] [MY-000000] [Galera] Opened channel 'pxc-cluster'
2021-06-22T13:53:47.714585Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
2021-06-22T13:53:47.714664Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: Waiting for state UUID.
2021-06-22T13:53:47.714844Z 1 [Note] [MY-000000] [WSREP] Starting rollbacker thread 1
2021-06-22T13:53:47.714890Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: sent state msg: 4422c892-d361-11eb-91ba-5a82f4f6395e
2021-06-22T13:53:47.714925Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: 4422c892-d361-11eb-91ba-5a82f4f6395e from 0 (pxc-cluster-node-1)
2021-06-22T13:53:47.715009Z 2 [Note] [MY-000000] [WSREP] Starting applier thread 2
2021-06-22T13:53:47.736445Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: 4422c892-d361-11eb-91ba-5a82f4f6395e from 1 (pxc-cluster-node-2)
2021-06-22T13:53:47.736496Z 0 [Note] [MY-000000] [Galera] Quorum results:
version = 6,
component = PRIMARY,
conf_id = 1,
members = 1/2 (primary/total),
act_id = 33,
last_appl. = 32,
protocols = 2/10/4 (gcs/repl/appl),
vote policy= 0,
group UUID = 45dca0ba-d2cf-11eb-bec1-7eba7f7a7dfe
2021-06-22T13:53:47.736546Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [141, 141]
2021-06-22T13:53:47.736560Z 0 [Note] [MY-000000] [Galera] Shifting OPEN -> PRIMARY (TO: 34)
2021-06-22T13:53:47.736621Z 2 [Note] [MY-000000] [Galera] ####### processing CC 34, local, ordered
2021-06-22T13:53:47.736662Z 2 [Note] [MY-000000] [Galera] Maybe drain monitors from -1 upto current CC event 34 upto:-1
2021-06-22T13:53:47.736676Z 2 [Note] [MY-000000] [Galera] Drain monitors from -1 up to -1
2021-06-22T13:53:47.736691Z 2 [Note] [MY-000000] [Galera] Process first view: 45dca0ba-d2cf-11eb-bec1-7eba7f7a7dfe my uuid: 43c3db4f-d361-11eb-8b6f-3e5ca8791b18
2021-06-22T13:53:47.736717Z 2 [Note] [MY-000000] [Galera] Server pxc-cluster-node-2 connected to cluster at position 45dca0ba-d2cf-11eb-bec1-7eba7f7a7dfe:34 with ID 43c3db4f-d361-11eb-8b6f-3e5ca8791b18
2021-06-22T13:53:47.736733Z 2 [Note] [MY-000000] [WSREP] Server status change disconnected -> connected
2021-06-22T13:53:47.736753Z 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2021-06-22T13:53:47.736788Z 2 [Note] [MY-000000] [Galera] ####### My UUID: 43c3db4f-d361-11eb-8b6f-3e5ca8791b18
2021-06-22T13:53:47.736807Z 2 [Note] [MY-000000] [Galera] Cert index reset to 00000000-0000-0000-0000-000000000000:-1 (proto: 10), state transfer needed: yes
2021-06-22T13:53:47.736881Z 0 [Note] [MY-000000] [Galera] Service thread queue flushed.
2021-06-22T13:53:47.736925Z 2 [Note] [MY-000000] [Galera] ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:-1, protocol version: -1
2021-06-22T13:53:47.736943Z 2 [Note] [MY-000000] [Galera] State transfer required:
Group state: 45dca0ba-d2cf-11eb-bec1-7eba7f7a7dfe:34
Local state: 00000000-0000-0000-0000-000000000000:-1
2021-06-22T13:53:47.736954Z 2 [Note] [MY-000000] [WSREP] Server status change connected -> joiner
2021-06-22T13:53:47.736965Z 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2021-06-22T13:53:47.737466Z 0 [Note] [MY-000000] [WSREP] Initiating SST/IST transfer on JOINER side (wsrep_sst_xtrabackup-v2 --role 'joiner' --address '3.143.85.192' --datadir '/var/lib/mysql/' --basedir '/usr/' --plugindir '/usr/lib/mysql/plugin/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --parent '28549' --mysqld-version '8.0.23-14.1' '' )
2021-06-22T13:53:48.388054Z 0 [Warning] [MY-000000] [WSREP-SST] Found a stale sst_in_progress file: /var/lib/mysql//sst_in_progress
2021-06-22T13:53:48.757877Z 2 [Note] [MY-000000] [WSREP] Prepared SST request: xtrabackup-v2|3.143.85.192:4444/xtrabackup_sst//1
2021-06-22T13:53:48.757935Z 2 [Note] [MY-000000] [Galera] Check if state gap can be serviced using IST
2021-06-22T13:53:48.757963Z 2 [Note] [MY-000000] [Galera] Local UUID: 00000000-0000-0000-0000-000000000000 != Group UUID: 45dca0ba-d2cf-11eb-bec1-7eba7f7a7dfe
2021-06-22T13:53:48.757986Z 2 [Note] [MY-000000] [Galera] ####### IST uuid:00000000-0000-0000-0000-000000000000 f: 0, l: 34, STRv: 3
2021-06-22T13:53:48.758062Z 2 [Note] [MY-000000] [Galera] IST receiver addr using ssl://3.143.85.192:4568
2021-06-22T13:53:48.758114Z 2 [Note] [MY-000000] [Galera] IST receiver using ssl
2021-06-22T13:53:48.758421Z 2 [Note] [MY-000000] [Galera] State gap can't be serviced using IST. Switching to SST
2021-06-22T13:53:48.758438Z 2 [Warning] [MY-000000] [Galera] Failed to prepare for incremental state transfer: Failed to open IST listener at ssl://3.143.85.192:4568', asio error 'bind: Cannot assign requested address': 99 (Cannot assign requested address)
at galera/src/ist.cpp:prepare():376. IST will be unavailable.
2021-06-22T13:53:48.780145Z 0 [Note] [MY-000000] [Galera] Member 1.0 (pxc-cluster-node-2) requested state transfer from '*any*'. Selected 0.0 (pxc-cluster-node-1)(SYNCED) as donor.
2021-06-22T13:53:48.780196Z 0 [Note] [MY-000000] [Galera] Shifting PRIMARY -> JOINER (TO: 34)
2021-06-22T13:53:48.780246Z 2 [Note] [MY-000000] [Galera] Requesting state transfer: success, donor: 0
2021-06-22T13:53:48.780273Z 2 [Note] [MY-000000] [Galera] Resetting GCache seqno map due to different histories.
2021-06-22T13:53:48.780298Z 2 [Note] [MY-000000] [Galera] GCache history reset: 45dca0ba-d2cf-11eb-bec1-7eba7f7a7dfe:0 -> 45dca0ba-d2cf-11eb-bec1-7eba7f7a7dfe:34
2021-06-22T13:53:48.823302Z 0 [Warning] [MY-000000] [Galera] 0.0 (pxc-cluster-node-1): State transfer to 1.0 (pxc-cluster-node-2) failed: -42 (No message of desired type)
2021-06-22T13:53:48.823350Z 0 [ERROR] [MY-000000] [Galera] gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():1214: Will never receive state. Need to abort.
2021-06-22T13:53:48.823373Z 0 [Note] [MY-000000] [Galera] gcomm: terminating thread
2021-06-22T13:53:48.823410Z 0 [Note] [MY-000000] [Galera] gcomm: joining thread
2021-06-22T13:53:48.823481Z 0 [Note] [MY-000000] [Galera] gcomm: closing backend
2021-06-22T13:53:49.847630Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(NON_PRIM,34eb80d9-8ec1,2)
memb {
43c3db4f-8b6f,0
}
joined {
}
left {
}
partitioned {
34eb80d9-8ec1,0
}
)
2021-06-22T13:53:49.847692Z 0 [Note] [MY-000000] [Galera] (43c3db4f-8b6f, 'ssl://0.0.0.0:4567') turning message relay requesting off
2021-06-22T13:53:49.847714Z 0 [Note] [MY-000000] [Galera] PC protocol downgrade 1 -> 0
2021-06-22T13:53:49.847728Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view ((empty))
2021-06-22T13:53:49.847917Z 0 [Note] [MY-000000] [Galera] gcomm: closed
2021-06-22T13:53:49.847939Z 0 [Note] [MY-000000] [Galera] /usr/sbin/mysqld: Terminated.
2021-06-22T13:53:49.847964Z 0 [Note] [MY-000000] [WSREP] Initiating SST cancellation
2021-06-22T13:53:49.847975Z 0 [Note] [MY-000000] [WSREP] Terminating SST process
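The node log shows the IST listener failing to bind to ssl://3.143.85.192:4568 with "Cannot assign requested address", while the SSL handshake line reports a local endpoint of 172.31.36.42. One hypothesis consistent with this is that 3.143.85.192 is not configured on any local interface of the node (for example, a public address behind NAT). A minimal sketch, assuming Python 3 on the node, to test whether an address can be bound locally; can_bind is a hypothetical helper and the result only indicates local assignability, not Galera behaviour:

```python
import socket


def can_bind(address, port=0):
    """Return True if a TCP socket can be bound to `address` on this host.

    Port 0 asks the OS for any free port, so the check does not collide
    with a running mysqld. Any OSError (including "Cannot assign
    requested address") is reported as False.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((address, port))
            return True
        except OSError:
            return False


if __name__ == "__main__":
    # Addresses taken from the configuration and log above.
    for candidate in ("3.143.85.192", "172.31.36.42"):
        status = "bindable" if can_bind(candidate) else "not bindable"
        print(f"{candidate}: {status} on this host")
```

Running it on the joining node shows which of the two addresses the operating system will actually accept for a listening socket.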
Related
We have a three-node Galera cluster with IPs 172.21.100.23, 172.21.100.24 and 172.21.100.25. The cluster had been running successfully for a year.
Recently I changed my.cnf on 23 to add some binary log settings. After that, I restarted MySQL on 23; the start command 'systemctl start mysqld' hung and could not be stopped with 'systemctl stop mysqld', so I found the mysqld process ID and killed it.
After that I removed the added settings from my.cnf on 23, restoring it to its previous state, but I still cannot start MySQL.
I looked at grastate.dat in the datadir; it has the following content:
version 2.1
UUID: 00000000-0000-0000-0000-000000000000
seqno:-1
safe_to_bootstrap:0
I ran 'mysqld --wsrep-recover' on 23. It completed without errors, but it had no effect.
In mysqld.log, this line was generated after running 'mysqld --wsrep-recover':
Found saved state:00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap:0
This is the same state as in grastate.dat.
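For reference, the saved state pairs a cluster UUID with a sequence number; an all-zero UUID with seqno -1 means no position is recorded. A minimal sketch, assuming Python 3, that parses text in the shape pasted above and reports whether the position is undefined. The helper names parse_grastate and position_is_undefined are hypothetical, and the parser is deliberately lenient about the separator because the pasted content mixes "key value" and "key:value".

```python
import re

ZERO_UUID = "00000000-0000-0000-0000-000000000000"


def parse_grastate(text):
    """Parse grastate.dat-style lines into a dict with lowercase keys."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Accept "key: value", "key:value" and "key value".
        match = re.match(r"([A-Za-z_]+)\s*[:\s]\s*(\S+)", line)
        if match:
            state[match.group(1).lower()] = match.group(2)
    return state


def position_is_undefined(state):
    """True when the state holds the all-zero UUID and seqno -1."""
    return state.get("uuid") == ZERO_UUID and state.get("seqno") == "-1"


# Content as pasted in the post above.
pasted = """version 2.1
UUID: 00000000-0000-0000-0000-000000000000
seqno:-1
safe_to_bootstrap:0
"""

state = parse_grastate(pasted)
print(state)
print("undefined position:", position_is_undefined(state))
```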
The following is my.cnf on 23:
[mysqld]
datadir=/data/g_mysql_data/data
socket=/var/lib/mysql/mysql.sock
user=mysql
#bind-address=172.21.100.23
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
innodb_flush_log_at_trx_commit=0
innodb_buffer_pool_size=2000M
wsrep_provider=/usr/lib64/galera-3/libgalera_smm.so
wsrep_provider_options="gcache.size=5000M; gcache.page_size=300M"
wsrep_cluster_name="galera_cluster1"
wsrep_cluster_address="gcomm://172.21.100.23,172.21.100.24,172.21.100.25"
wsrep_sst_method=rsync
server_id=1
wsrep_node_address="172.21.100.23"
wsrep_node_name="gcm1"
log-error=/data/g_mysql_log/mysqld.log
pid-file=/data/g_mysql_log/mysqld.pid
sql_mode=STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
log_output=TABLE
max_connections=500
group_concat_max_len=102400
innodb_log_file_size=1024M
innodb_strict_mode=0
[mysql_safe]
The following is mysqld.log:
2022-01-13T03:42:22.176244Z 0 [Note] InnoDB: Log scan progressed past the checkpoint lsn 254788306500
2022-01-13T03:42:22.176267Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 254788308116
2022-01-13T03:42:22.176616Z 0 [Note] InnoDB: Database was not shutdown normally!
2022-01-13T03:42:22.176622Z 0 [Note] InnoDB: Starting crash recovery.
2022-01-13T03:42:22.190460Z 0 [Note] InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percent: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 03:42:22 UTC - mysqld got signal 11 ;
2022-01-13 11:42:22 0x7ff8f1da6700This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
InnoDB: Assertion failure in thread 140707186239232 in file log0recv.cc line 1930
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
InnoDB: Failing assertion: !page || (ibool)!!page_is_comp(page) == dict_table_is_comp(index->table)
key_buffer_size=8388608
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
read_buffer_size=131072
max_used_connections=0
max_threads=500
thread_count=0
connection_count=0
Fatal signal 6 while backtracing
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 207164 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
2022-01-13T03:42:22.559205Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2022-01-13T03:42:22.560595Z 0 [Note] /usr/sbin/mysqld (mysqld 5.7.30) starting as process 229812 ...
2022-01-13T03:42:22.563894Z 0 [Note] WSREP: Read nil XID from storage engines, skipping position init
2022-01-13T03:42:22.563910Z 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera-3/libgalera_smm.so'
2022-01-13T03:42:22.564682Z 0 [Note] WSREP: wsrep_load(): Galera 3.30(r4e1a604) by Codership Oy <info#codership.com> loaded successfully.
2022-01-13T03:42:22.564704Z 0 [Note] WSREP: CRC-32C: using hardware acceleration.
2022-01-13T03:42:22.565018Z 0 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 0
2022-01-13T03:42:22.566424Z 0 [Note] WSREP: Passing config to GCS: base_dir = /data/g_mysql_data/data/; base_host = 172.21.100.23; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /data/g_mysql_data/data/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /data/g_mysql_data/data//galera.cache; gcache.page_size = 300M; gcache.recover = no; gcache.size = 5000M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0
2022-01-13T03:42:22.589530Z 0 [Note] WSREP: GCache history reset: 60a4edc4-d099-11ea-ba22-c3903525a426:0 -> 00000000-0000-0000-0000-000000000000:-1
2022-01-13T03:42:22.589730Z 0 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
2022-01-13T03:42:22.589767Z 0 [Note] WSREP: wsrep_sst_grab()
2022-01-13T03:42:22.589776Z 0 [Note] WSREP: Start replication
2022-01-13T03:42:22.589800Z 0 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
2022-01-13T03:42:22.589962Z 0 [Note] WSREP: protonet asio version 0
2022-01-13T03:42:22.590183Z 0 [Note] WSREP: Using CRC-32C for message checksums.
2022-01-13T03:42:22.590241Z 0 [Note] WSREP: backend: asio
2022-01-13T03:42:22.590325Z 0 [Note] WSREP: gcomm thread scheduling priority set to other:0
2022-01-13T03:42:22.590497Z 0 [Warning] WSREP: access file(/data/g_mysql_data/data//gvwstate.dat) failed(No such file or directory)
2022-01-13T03:42:22.590510Z 0 [Note] WSREP: restore pc from disk failed
2022-01-13T03:42:22.590940Z 0 [Note] WSREP: GMCast version 0
2022-01-13T03:42:22.591482Z 0 [Note] WSREP: (d101bfe8, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2022-01-13T03:42:22.591499Z 0 [Note] WSREP: (d101bfe8, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2022-01-13T03:42:22.592211Z 0 [Note] WSREP: EVS version 0
2022-01-13T03:42:22.592444Z 0 [Note] WSREP: gcomm: connecting to group 'galera_cluster1', peer '172.21.100.23:,172.21.100.24:,172.21.100.25:'
2022-01-13T03:42:22.593730Z 0 [Note] WSREP: (d101bfe8, 'tcp://0.0.0.0:4567') Found matching local endpoint for a connection, blacklisting address tcp://172.21.100.23:4567
2022-01-13T03:42:22.594410Z 0 [Note] WSREP: (d101bfe8, 'tcp://0.0.0.0:4567') connection established to 1c9e42c9 tcp://172.21.100.24:4567
2022-01-13T03:42:22.594581Z 0 [Note] WSREP: (d101bfe8, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
2022-01-13T03:42:22.594628Z 0 [Note] WSREP: (d101bfe8, 'tcp://0.0.0.0:4567') connection established to 2bc6a038 tcp://172.21.100.25:4567
2022-01-13T03:42:23.095574Z 0 [Note] WSREP: declaring 1c9e42c9 at tcp://172.21.100.24:4567 stable
2022-01-13T03:42:23.095627Z 0 [Note] WSREP: declaring 2bc6a038 at tcp://172.21.100.25:4567 stable
2022-01-13T03:42:23.096046Z 0 [Note] WSREP: Node 1c9e42c9 state prim
2022-01-13T03:42:23.096446Z 0 [Note] WSREP: view(view_id(PRIM,1c9e42c9,105) memb {
1c9e42c9,0
2bc6a038,0
d101bfe8,0
} joined {
} left {
} partitioned {
})
2022-01-13T03:42:23.096484Z 0 [Note] WSREP: save pc into disk
2022-01-13T03:42:23.595866Z 0 [Note] WSREP: gcomm: connected
2022-01-13T03:42:23.595922Z 0 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
2022-01-13T03:42:23.595999Z 0 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2022-01-13T03:42:23.596010Z 0 [Note] WSREP: Opened channel 'galera_cluster1'
2022-01-13T03:42:23.596154Z 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3
2022-01-13T03:42:23.596228Z 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2022-01-13T03:42:23.596239Z 0 [Note] WSREP: Waiting for SST to complete.
2022-01-13T03:42:23.596294Z 0 [Note] WSREP: STATE EXCHANGE: sent state msg: d1514dd1-7422-11ec-a8c5-9ad193910923
2022-01-13T03:42:23.596312Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: d1514dd1-7422-11ec-a8c5-9ad193910923 from 0 (gcm2)
2022-01-13T03:42:23.596338Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: d1514dd1-7422-11ec-a8c5-9ad193910923 from 1 (gcm3)
2022-01-13T03:42:23.596883Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: d1514dd1-7422-11ec-a8c5-9ad193910923 from 2 (gcm1)
2022-01-13T03:42:23.596907Z 0 [Note] WSREP: Quorum results:
version = 6,
component = PRIMARY,
conf_id = 104,
members = 2/3 (joined/total),
act_id = 257246765,
last_appl. = -1,
protocols = 0/9/3 (gcs/repl/appl),
group UUID = 60a4edc4-d099-11ea-ba22-c3903525a426
2022-01-13T03:42:23.596922Z 0 [Note] WSREP: Flow-control interval: [28, 28]
2022-01-13T03:42:23.596933Z 0 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 257246765)
2022-01-13T03:42:23.597050Z 2 [Note] WSREP: State transfer required:
Group state: 60a4edc4-d099-11ea-ba22-c3903525a426:257246765
Local state: 00000000-0000-0000-0000-000000000000:-1
2022-01-13T03:42:23.597103Z 2 [Note] WSREP: REPL Protocols: 9 (4, 2)
2022-01-13T03:42:23.597130Z 2 [Note] WSREP: New cluster view: global state: 60a4edc4-d099-11ea-ba22-c3903525a426:257246765, view# 105: Primary, number of nodes: 3, my index: 2, protocol version 3
2022-01-13T03:42:23.597140Z 2 [Warning] WSREP: Gap in state sequence. Need state transfer.
2022-01-13T03:42:23.597287Z 0 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '172.21.100.23' --datadir '/data/g_mysql_data/data/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '229812' '' '
2022-01-13T03:42:23.740222Z 2 [Note] WSREP: Prepared SST request: rsync|172.21.100.23:4444/rsync_sst
2022-01-13T03:42:23.740273Z 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2022-01-13T03:42:23.740313Z 2 [Note] WSREP: Assign initial position for certification: 257246765, protocol version: 4
2022-01-13T03:42:23.740360Z 0 [Note] WSREP: Service thread queue flushed.
2022-01-13T03:42:23.740490Z 2 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (60a4edc4-d099-11ea-ba22-c3903525a426): 1 (Operation not permitted)
at galera/src/replicator_str.cpp:prepare_for_IST():463. IST will be unavailable.
2022-01-13T03:42:23.741249Z 0 [Note] WSREP: Member 2.0 (gcm1) requested state transfer from '*any*'. Selected 0.0 (gcm2)(SYNCED) as donor.
2022-01-13T03:42:23.741287Z 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 257246766)
2022-01-13T03:42:23.741323Z 2 [Note] WSREP: Requesting state transfer: success, donor: 0
2022-01-13T03:42:23.741352Z 2 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> 60a4edc4-d099-11ea-ba22-c3903525a426:257246765
2022-01-13T03:42:25.598167Z 0 [Note] WSREP: (d101bfe8, 'tcp://0.0.0.0:4567') turning message relay requesting off
Any help will be greatly appreciated.
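A note on reading the log above: the joiner reports a local state of 00000000-0000-0000-0000-000000000000:-1 against a group state of 60a4edc4-d099-11ea-ba22-c3903525a426:257246765, and the warning says IST is unavailable because the UUIDs differ. Below is a minimal Python sketch of that comparison. It assumes only the "Group state:" / "Local state:" line format shown in the log. The name transfer_kind and the labels it returns are hypothetical, and treating a local seqno of -1 as "needs a full SST" is an assumption about Galera's behaviour, not something the log states.

```python
import re

# Matches the "Group state: <uuid>:<seqno>" and "Local state: <uuid>:<seqno>"
# lines as they appear in the WSREP log.
STATE_RE = re.compile(r"(Group|Local) state:\s*([0-9a-f-]{36}):(-?\d+)")


def transfer_kind(log_text):
    """Classify the state transfer a joiner would need.

    Returns "SST", "IST-candidate", "none", or None when the two
    state lines are not both present in log_text.
    """
    states = {}
    for kind, uuid, seqno in STATE_RE.findall(log_text):
        states[kind] = (uuid, int(seqno))
    if "Group" not in states or "Local" not in states:
        return None
    (group_uuid, group_seq) = states["Group"]
    (local_uuid, local_seq) = states["Local"]
    # Different history (UUID mismatch) or undefined position (-1, assumed):
    # an incremental transfer has nothing to build on.
    if local_uuid != group_uuid or local_seq < 0:
        return "SST"
    if local_seq < group_seq:
        return "IST-candidate"
    return "none"
```

Feeding it the two state lines from the log above should classify the join as a full SST, which matches the wsrep_sst_rsync joiner being launched a few lines later.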
I'm having trouble initializing a Galera cluster: the second node always fails to start, with no error message in the log.
I have two nodes for the moment; I will install the third later. Here is my configuration:
node1 : 192.168.0.21 db01
node2 : 192.168.0.22 db02
The /etc/hosts files on both nodes are filled in with the hostname entries.
My galera.cnf looks like this on both nodes:
[mysqld]
#mysql settings
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0
#galera settings
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="my_wsrep_cluster"
wsrep_cluster_address="gcomm://192.168.0.21,192.168.0.22"
wsrep_node_address="192.168.0.21" # 192.168.0.22 on db02
wsrep_node_name="db01" # db02 on db02 server
wsrep_sst_method=rsync
log-error=/var/log/mysql/error.log
I can start the service on db01 with this command without any trouble:
service mysql start --wsrep-new-cluster
But when I start db02 with service mysql start, I get a failure message, and the service does not listen on port 3306 on db02. Here are my logs, which show that db02 detects the cluster, detects db01, and needs to synchronize, but the synchronization doesn't seem to start:
161009 13:33:02 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
161009 13:33:02 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.7VTkrM' --pid-file='/var/lib/mysql/db02-recover.pid'
161009 13:33:02 [Note] /usr/sbin/mysqld (mysqld 5.5.52-MariaDB-1~wheezy-wsrep) starting as process 14839 ...
161009 13:33:04 mysqld_safe WSREP: Recovered position f89d319e-8e08-11e6-8b25-ebe3bbf9c45b:2
161009 13:33:04 [Note] WSREP: wsrep_start_position var submitted: 'f89d319e-8e08-11e6-8b25-ebe3bbf9c45b:2'
161009 13:33:04 [Note] /usr/sbin/mysqld (mysqld 5.5.52-MariaDB-1~wheezy-wsrep) starting as process 14890 ...
161009 13:33:04 [Note] WSREP: Read nil XID from storage engines, skipping position init
161009 13:33:04 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
161009 13:33:04 [Note] WSREP: wsrep_load(): Galera 25.3.17(r3619) by Codership Oy <info#codership.com> loaded successfully.
161009 13:33:04 [Note] WSREP: CRC-32C: using hardware acceleration.
161009 13:33:04 [Note] WSREP: Found saved state: f89d319e-8e08-11e6-8b25-ebe3bbf9c45b:-1
161009 13:33:04 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 192.168.0.22; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false;
161009 13:33:05 [Note] WSREP: Service thread queue flushed.
161009 13:33:05 [Note] WSREP: Assign initial position for certification: 2, protocol version: -1
161009 13:33:05 [Note] WSREP: wsrep_sst_grab()
161009 13:33:05 [Note] WSREP: Start replication
161009 13:33:05 [Note] WSREP: Setting initial position to f89d319e-8e08-11e6-8b25-ebe3bbf9c45b:2
161009 13:33:05 [Note] WSREP: protonet asio version 0
161009 13:33:05 [Note] WSREP: Using CRC-32C for message checksums.
161009 13:33:05 [Note] WSREP: backend: asio
161009 13:33:05 [Note] WSREP: gcomm thread scheduling priority set to other:0
161009 13:33:05 [Note] WSREP: restore pc from disk successfully
161009 13:33:05 [Note] WSREP: GMCast version 0
161009 13:33:05 [Note] WSREP: (def7d829, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
161009 13:33:05 [Note] WSREP: (def7d829, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
161009 13:33:05 [Note] WSREP: EVS version 0
161009 13:33:05 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '192.168.0.21:,192.168.0.22:'
161009 13:33:05 [Note] WSREP: (def7d829, 'tcp://0.0.0.0:4567') connection established to def7d829 tcp://192.168.0.22:4567
161009 13:33:05 [Warning] WSREP: (def7d829, 'tcp://0.0.0.0:4567') address 'tcp://192.168.0.22:4567' points to own listening address, blacklisting
161009 13:33:05 [Note] WSREP: (def7d829, 'tcp://0.0.0.0:4567') connection established to def7d829 tcp://192.168.0.22:4567
161009 13:33:05 [Note] WSREP: (def7d829, 'tcp://0.0.0.0:4567') connection established to 51e5ab0b tcp://192.168.0.21:4567
161009 13:33:05 [Note] WSREP: (def7d829, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
161009 13:33:05 [Note] WSREP: declaring 51e5ab0b at tcp://192.168.0.21:4567 stable
161009 13:33:05 [Note] WSREP: re-bootstrapping prim from partitioned components
161009 13:33:05 [Note] WSREP: view(view_id(PRIM,51e5ab0b,19) memb {
51e5ab0b,0
def7d829,0
} joined {
} left {
} partitioned {
})
161009 13:33:05 [Note] WSREP: save pc into disk
161009 13:33:05 [Note] WSREP: clear restored view
161009 13:33:06 [Note] WSREP: gcomm: connected
161009 13:33:06 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
161009 13:33:06 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
161009 13:33:06 [Note] WSREP: Opened channel 'my_wsrep_cluster'
161009 13:33:06 [Note] WSREP: Waiting for SST to complete.
161009 13:33:06 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
161009 13:33:06 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
161009 13:33:06 [Note] WSREP: STATE EXCHANGE: sent state msg: 2528a615-8e14-11e6-9e93-ab2afde28393
161009 13:33:06 [Note] WSREP: STATE EXCHANGE: got state msg: 2528a615-8e14-11e6-9e93-ab2afde28393 from 0 (db01)
161009 13:33:06 [Note] WSREP: STATE EXCHANGE: got state msg: 2528a615-8e14-11e6-9e93-ab2afde28393 from 1 (db02)
161009 13:33:06 [Warning] WSREP: Quorum: No node with complete state:
Version : 4
Flags : 0x3
Protocols : 0 / 7 / 3
State : NON-PRIMARY
Desync count : 0
Prim state : SYNCED
Prim UUID : e33a5f6b-8e12-11e6-9981-7b9a76958a99
Prim seqno : 2
First seqno : -1
Last seqno : 3
Prim JOINED : 1
State UUID : 2528a615-8e14-11e6-9e93-ab2afde28393
Group UUID : 20836ccf-8e06-11e6-adf3-5330826fa72d
Name : 'db01'
Incoming addr: '192.168.0.21:3306'
Version : 4
Flags : 00
Protocols : 0 / 7 / 3
State : NON-PRIMARY
Desync count : 0
Prim state : NON-PRIMARY
Prim UUID : 00000000-0000-0000-0000-000000000000
Prim seqno : -1
First seqno : -1
Last seqno : 2
Prim JOINED : 0
State UUID : 2528a615-8e14-11e6-9e93-ab2afde28393
Group UUID : f89d319e-8e08-11e6-8b25-ebe3bbf9c45b
Name : 'db02'
Incoming addr: '192.168.0.22:3306'
161009 13:33:06 [Note] WSREP: Full re-merge of primary e33a5f6b-8e12-11e6-9981-7b9a76958a99 found: 1 of 1.
161009 13:33:06 [Note] WSREP: Quorum results:
version = 4,
component = PRIMARY,
conf_id = 2,
members = 1/2 (joined/total),
act_id = 3,
last_appl. = -1,
protocols = 0/7/3 (gcs/repl/appl),
group UUID = 20836ccf-8e06-11e6-adf3-5330826fa72d
161009 13:33:06 [Note] WSREP: Flow-control interval: [23, 23]
161009 13:33:06 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 3)
161009 13:33:06 [Note] WSREP: State transfer required:
Group state: 20836ccf-8e06-11e6-adf3-5330826fa72d:3
Local state: f89d319e-8e08-11e6-8b25-ebe3bbf9c45b:2
161009 13:33:06 [Note] WSREP: New cluster view: global state: 20836ccf-8e06-11e6-adf3-5330826fa72d:3, view# 3: Primary, number of nodes: 2, my index: 1, protocol version 3
161009 13:33:06 [Warning] WSREP: Gap in state sequence. Need state transfer.
161009 13:33:06 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '192.168.0.22' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '14890''
161009 13:33:08 [Note] WSREP: (def7d829, 'tcp://0.0.0.0:4567') turning message relay requesting off
All network communication seems OK. I've disabled the firewall for my tests.
SELinux is not enabled either.
After trying to start node2, I have these connections on db01:
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 7584/mysqld
tcp 0 0 0.0.0.0:4567 0.0.0.0:* LISTEN 7584/mysqld
tcp 0 0 192.168.0.21:4567 192.168.0.22:53335 ESTABLISHED 7584/mysqld
I don't know whether the rsync service should already be listening on node1; maybe that is why my second node can't sync with the cluster.
So what am I missing here? Are there any errors in my configuration?
P.S.: This is the first time I'm trying to install this. I followed the official installation page: https://mariadb.org/installing-mariadb-galera-cluster-on-debian-ubuntu/
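On the question of which side listens: in the first log of this thread the joiner prepares its SST request as rsync|<its own address>:4444/rsync_sst, which suggests the rsync receiver runs on the joining node and only while the transfer is in progress. A small Python sketch for probing the relevant TCP ports from the other node follows. The port list is an assumption (3306 and 4567 appear in the post, 4444 in the earlier log, 4568 is the usual IST default), and the helper names port_open and check_ports are hypothetical.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host, ports=(3306, 4567, 4568, 4444)):
    """Probe each port on host and map port number -> reachable (bool).

    Default ports are assumed Galera defaults: client, group
    communication, IST and SST respectively.
    """
    return {port: port_open(host, port) for port in ports}
```

Running check_ports("192.168.0.22") from db01 while db02 is attempting to join would show whether 4444 is reachable at that moment; a closed 4444 outside of an SST is expected and is not by itself a fault.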
In my my.cnf, I currently have the following
innodb-write-io-threads = 32
innodb-read-io-threads = 32
When I try to increase that to 48, native AIO fails to start up.
2016-08-22T05:02:35.933600Z 0 [Note] /usr/sbin/mysqld (mysqld 5.7.13-6-log) starting as process 23187 ...
2016-08-22T05:02:35.941699Z 0 [Note] InnoDB: PUNCH HOLE support available
2016-08-22T05:02:35.941736Z 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2016-08-22T05:02:35.941748Z 0 [Note] InnoDB: Uses event mutexes
2016-08-22T05:02:35.941755Z 0 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2016-08-22T05:02:35.941762Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.7
2016-08-22T05:02:35.941769Z 0 [Note] InnoDB: Using Linux native AIO
2016-08-22T05:02:35.942435Z 0 [Note] InnoDB: Number of pools: 1
2016-08-22T05:02:35.942597Z 0 [Note] InnoDB: Using CPU crc32 instructions
2016-08-22T05:02:35.951309Z 0 [Warning] InnoDB: io_setup() failed with EAGAIN. Will make 5 attempts before giving up.
2016-08-22T05:02:35.951329Z 0 [Warning] InnoDB: io_setup() attempt 1.
2016-08-22T05:02:36.451533Z 0 [Warning] InnoDB: io_setup() attempt 2.
2016-08-22T05:02:36.951724Z 0 [Warning] InnoDB: io_setup() attempt 3.
2016-08-22T05:02:37.451960Z 0 [Warning] InnoDB: io_setup() attempt 4.
2016-08-22T05:02:37.952196Z 0 [Warning] InnoDB: io_setup() attempt 5.
2016-08-22T05:02:38.452394Z 0 [ERROR] InnoDB: io_setup() failed with EAGAIN after 5 attempts.
2016-08-22T05:02:38.452437Z 0 [Note] InnoDB: You can disable Linux Native AIO by setting innodb_use_native_aio = 0 in my.cnf
2016-08-22T05:02:38.452776Z 0 [ERROR] InnoDB: Cannot initialize AIO sub-system
2016-08-22T05:02:38.452789Z 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2016-08-22T05:02:38.452799Z 0 [ERROR] Plugin 'InnoDB' init function returned error.
2016-08-22T05:02:38.452805Z 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2016-08-22T05:02:38.452812Z 0 [ERROR] Failed to initialize plugins.
2016-08-22T05:02:38.452817Z 0 [ERROR] Aborting
2016-08-22T05:02:38.452844Z 0 [Note] Binlog end
2016-08-22T05:02:38.452891Z 0 [Note] Shutting down plugin 'CSV'
2016-08-22T05:02:38.452899Z 0 [Note] Shutting down plugin 'MyISAM'
2016-08-22T05:02:38.454072Z 0 [Note] /usr/sbin/mysqld: Shutdown complete
Here's my full my.cnf:
[mysql]
# CLIENT #
port = 3306
socket = /var/lib/mysql/mysql.sock
[mysqld]
# GENERAL #
user = mysql
default-storage-engine = InnoDB
socket = /var/lib/mysql/mysql.sock
pid-file = /var/lib/mysql/mysql.pid
bind-address = 0.0.0.0
event_scheduler = on
# MyISAM #
key-buffer-size = 32M
#myisam-recover = FORCE,BACKUP
# SAFETY #
max-allowed-packet = 16M
max-connect-errors = 1000000
skip-name-resolve
sql-mode = STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZERO,NO_ENGINE_SUBSTITUTION,NO_ZERO_DATE,NO_ZERO_IN_DATE,ONLY_FULL_GROUP_BY
sysdate-is-now = 1
innodb = FORCE
# DATA STORAGE #
datadir = /var/lib/mysql/
# BINARY LOGGING #
log-bin = /var/lib/mysql/mysql-bin
expire-logs-days = 14
sync-binlog = 0
server-id = 1
# CACHES AND LIMITS #
tmp-table-size = 32M
max-heap-table-size = 32M
query-cache-type = 0
query-cache-size = 0
max-connections = 500
thread-cache-size = 50
open-files-limit = 65535
table-definition-cache = 1024
table-open-cache = 2048
# INNODB #
innodb-flush-method = O_DIRECT
innodb-log-files-in-group = 2
innodb-log-file-size = 512M
innodb-flush-log-at-trx-commit = 2
innodb-file-per-table = 1
innodb-buffer-pool-size = 160G
innodb-doublewrite = 0
innodb-adaptive-flushing = 1
innodb-thread-concurrency = 0
innodb-write-io-threads = 32
innodb-read-io-threads = 32
innodb-buffer-pool-instances = 64
innodb-flush-neighbors = 0
# LOGGING #
log-error = /var/lib/mysql/mysql-error.log
log-queries-not-using-indexes = 1
slow-query-log = 1
slow-query-log-file = /var/lib/mysql/mysql-slow.log
What could be causing this?
Thanks
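A note on the io_setup() EAGAIN error: on Linux it generally means the kernel-wide AIO slot limit (fs.aio-max-nr) would be exceeded, and the current usage is visible in fs.aio-nr. The Python sketch below compares the two. The figure of 256 slots per I/O thread is an assumption based on the documented "256 pending I/O requests per background thread"; the real reservation on a given build may be larger, and other processes on the host consume slots too. The function names are hypothetical.

```python
from pathlib import Path

# Assumption, not taken from the post: each InnoDB I/O thread
# reserves this many kernel AIO slots.
SLOTS_PER_IO_THREAD = 256


def aio_headroom(aio_nr, aio_max_nr):
    """Free kernel AIO slots: system-wide limit minus slots in use."""
    return aio_max_nr - aio_nr


def threads_fit(read_threads, write_threads, aio_nr, aio_max_nr,
                slots_per_thread=SLOTS_PER_IO_THREAD):
    """Return True if the requested I/O threads fit in the free slots."""
    needed = (read_threads + write_threads) * slots_per_thread
    return needed <= aio_headroom(aio_nr, aio_max_nr)


def read_kernel_counters(proc=Path("/proc/sys/fs")):
    """Read (aio-nr, aio-max-nr) from procfs on a Linux host."""
    aio_nr = int((proc / "aio-nr").read_text())
    aio_max_nr = int((proc / "aio-max-nr").read_text())
    return aio_nr, aio_max_nr
```

On the affected host, threads_fit(48, 48, *read_kernel_counters()) run while mysqld is stopped gives a rough indication of whether raising fs.aio-max-nr is worth trying before falling back to innodb_use_native_aio = 0.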
When I start the mysql service on Ubuntu, error.log keeps looping on the following messages.
150223 23:23:55 [Note] WSREP: Read nil XID from storage engines, skipping position init
150223 23:23:55 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/libgalera_smm.so'
150223 23:23:55 [Note] WSREP: wsrep_load(): Galera 2.8(r165) by Codership Oy <info#codership.com> loaded successfully.
150223 23:23:55 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
150223 23:23:55 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.
150223 23:23:55 [Note] WSREP: Passing config to GCS: base_host = 10.1.1.30; base_port = 4567; cert.log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
150223 23:23:55 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
150223 23:23:55 [Note] WSREP: wsrep_sst_grab()
150223 23:23:55 [Note] WSREP: Start replication
150223 23:23:55 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
150223 23:23:55 [Note] WSREP: protonet asio version 0
150223 23:23:55 [Note] WSREP: backend: asio
150223 23:23:55 [Note] WSREP: GMCast version 0
150223 23:23:55 [Note] WSREP: (ef9c60a1-bb84-11e4-ae0e-5f6dda01f2fc, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
150223 23:23:55 [Note] WSREP: (ef9c60a1-bb84-11e4-ae0e-5f6dda01f2fc, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
150223 23:23:55 [Note] WSREP: EVS version 0
150223 23:23:55 [Note] WSREP: PC version 0
150223 23:23:55 [Note] WSREP: gcomm: connecting to group 'cluster', peer '10.1.1.29:,10.1.1.30:'
150223 23:23:55 [Warning] WSREP: (ef9c60a1-bb84-11e4-ae0e-5f6dda01f2fc, 'tcp://0.0.0.0:4567') address 'tcp://10.1.1.30:4567' points to own listening address, blacklisting
150223 23:23:55 [Note] WSREP: (ef9c60a1-bb84-11e4-ae0e-5f6dda01f2fc, 'tcp://0.0.0.0:4567') address 'tcp://10.1.1.30:4567' pointing to uuid ef9c60a1-bb84-11e4-ae0e-5f6dda01f2fc is blacklisted, skipping
150223 23:23:56 [Note] WSREP: declaring 9cc1474d-bb83-11e4-af2d-4a569b62ed1c stable
150223 23:23:56 [Note] WSREP: Node 9cc1474d-bb83-11e4-af2d-4a569b62ed1c state prim
150223 23:23:56 [Note] WSREP: view(view_id(PRIM,9cc1474d-bb83-11e4-af2d-4a569b62ed1c,36) memb {
9cc1474d-bb83-11e4-af2d-4a569b62ed1c,
ef9c60a1-bb84-11e4-ae0e-5f6dda01f2fc,
} joined {
} left {
} partitioned {
})
150223 23:23:56 [Note] WSREP: gcomm: connected
150223 23:23:56 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
150223 23:23:56 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
150223 23:23:56 [Note] WSREP: Opened channel 'cluster'
150223 23:23:56 [Note] WSREP: Waiting for SST to complete.
150223 23:23:56 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
150223 23:23:56 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
150223 23:23:56 [Note] WSREP: STATE EXCHANGE: sent state msg: 53dffab6-bb8a-11e4-8a2b-e309e267ec98
150223 23:23:56 [Note] WSREP: STATE EXCHANGE: got state msg: 53dffab6-bb8a-11e4-8a2b-e309e267ec98 from 0 (Pluto)
150223 23:23:56 [Note] WSREP: STATE EXCHANGE: got state msg: 53dffab6-bb8a-11e4-8a2b-e309e267ec98 from 1 (mars)
150223 23:23:56 [Note] WSREP: Quorum results:
version = 2,
component = PRIMARY,
conf_id = 35,
members = 1/2 (joined/total),
act_id = 1964141,
last_appl. = -1,
protocols = 0/4/3 (gcs/repl/appl),
group UUID = a1ce21ad-b997-11e4-ad11-a6b52963ec45
150223 23:23:56 [Note] WSREP: Flow-control interval: [23, 23]
150223 23:23:56 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 1964141)
150223 23:23:56 [Note] WSREP: State transfer required:
Group state: a1ce21ad-b997-11e4-ad11-a6b52963ec45:1964141
Local state: 00000000-0000-0000-0000-000000000000:-1
150223 23:23:56 [Note] WSREP: New cluster view: global state: a1ce21ad-b997-11e4-ad11-a6b52963ec45:1964141, view# 36: Primary, number of nodes: 2, my index: 1, protocol version 3
150223 23:23:56 [Warning] WSREP: Gap in state sequence. Need state transfer.
150223 23:23:56 [Note] WSREP: Running: 'wsrep_sst_xtrabackup --role 'joiner' --address '10.1.1.30' --auth 'userass' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '3589''
150223 23:23:56 [ERROR] WSREP: Failed to read 'ready <addr>' from: wsrep_sst_xtrabackup --role 'joiner' --address '10.1.1.30' --auth 'userass' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '3589'
Read: '(null)'
150223 23:23:56 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'joiner' --address '10.1.1.30' --auth 'userass' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '3589': 2 (No such file or directory)
150223 23:23:56 [ERROR] WSREP: Failed to prepare for 'xtrabackup' SST. Unrecoverable.
150223 23:23:56 [ERROR] Aborting
150223 23:23:58 [Note] WSREP: Closing send monitor...
150223 23:23:58 [Note] WSREP: Closed send monitor.
150223 23:23:58 [Note] WSREP: gcomm: terminating thread
150223 23:23:58 [Note] WSREP: gcomm: joining thread
150223 23:23:58 [Note] WSREP: gcomm: closing backend
150223 23:23:59 [Note] WSREP: view(view_id(NON_PRIM,9cc1474d-bb83-11e4-af2d-4a569b62ed1c,36) memb {
ef9c60a1-bb84-11e4-ae0e-5f6dda01f2fc,
} joined {
} left {
} partitioned {
9cc1474d-bb83-11e4-af2d-4a569b62ed1c,
})
150223 23:23:59 [Note] WSREP: view((empty))
150223 23:23:59 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
150223 23:23:59 [Note] WSREP: gcomm: closed
150223 23:23:59 [Note] WSREP: Flow-control interval: [16, 16]
150223 23:23:59 [Note] WSREP: Received NON-PRIMARY.
150223 23:23:59 [Note] WSREP: Shifting PRIMARY -> OPEN (TO: 1964153)
150223 23:23:59 [Note] WSREP: Received self-leave message.
150223 23:23:59 [Note] WSREP: Flow-control interval: [0, 0]
150223 23:23:59 [Note] WSREP: Received SELF-LEAVE. Closing connection.
150223 23:23:59 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 1964153)
150223 23:23:59 [Note] WSREP: RECV thread exiting 0: Success
150223 23:23:59 [Note] WSREP: recv_thread() joined.
150223 23:23:59 [Note] WSREP: Closing replication queue.
150223 23:23:59 [Note] WSREP: Closing slave action queue.
150223 23:23:59 [Note] WSREP: Service disconnected.
150223 23:23:59 [Note] WSREP: rollbacker thread exiting
150223 23:24:00 [Note] WSREP: Some threads may fail to exit.
150223 23:24:00 [Note] /usr/sbin/mysqld: Shutdown complete
I ran the wsrep_sst_xtrabackup command line by hand and it seems OK.
user#home# wsrep_sst_xtrabackup --role 'joiner' --address '10.1.1.30' --auth 'userass' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '3474'
WSREP_SST: [INFO] Streaming with tar (20150223 23:43:10.344)
WSREP_SST: [INFO] Using socat as streamer (20150223 23:43:10.347)
WSREP_SST: [INFO] Evaluating socat -u TCP-LISTEN:4444,reuseaddr stdio | tar xfi - --recursive-unlink -h; RC=( ${PIPESTATUS[#]} ) (20150223 23:43:10.368)
ready 10.1.1.30:4444/xtrabackup_sst
Any idea what's going wrong?
It seems AppArmor was causing the problem.
Disabling it solved it.
Reference: http://www.percona.com/blog/2012/12/20/percona-xtradb-cluster-selinux-is-not-always-the-culprit/
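Before disabling AppArmor outright, it can help to confirm that it really is blocking mysqld. Denials are normally written to the kernel log as audit lines containing apparmor="DENIED" together with profile="..." and name="..." fields. The Python sketch below filters such lines for the mysqld profile. It assumes that field order (profile before name), and the function name mysqld_denials is hypothetical.

```python
import re

# Captures the confining profile and the denied path from an AppArmor
# audit line; assumes profile="..." appears before name="...".
DENIED_RE = re.compile(
    r'apparmor="DENIED".*?profile="([^"]+)".*?name="([^"]+)"'
)


def mysqld_denials(kernel_log_lines):
    """Return the paths AppArmor denied to any profile mentioning mysqld."""
    hits = []
    for line in kernel_log_lines:
        match = DENIED_RE.search(line)
        if match and "mysqld" in match.group(1):
            hits.append(match.group(2))
    return hits
```

Passing it the lines of the kernel log (for example the contents of /var/log/kern.log, if that is where the host writes them) lists what was blocked. If the SST script or one of its helpers shows up there, that would be consistent with the "No such file or directory" error in the log even though the script runs fine from a shell.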
I am using Percona XtraDB Cluster with 5 nodes. I am not able to start one of the nodes and am facing the following errors.
I used the command mysqld --console to trace and view this error.
WSREP: Failed to prepare for 'rsync' SST. Unrecoverable.
130320 18:49:37 [Warning] You need to use --log-bin to make --log-slave-updates work.
130320 18:49:37 [Note] WSREP: Read nil XID from storage engines, skipping position init
130320 18:49:37 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/libgalera_smm.so'
130320 18:49:37 [Note] WSREP: wsrep_load(): Galera 2.3(r143) by Codership Oy <info#codership.com> loaded succesfully.
130320 18:49:37 [ERROR] WSREP: Process completed with error: /sbin/ifconfig | grep -E '^[[:space:]]+inet addr:' | grep -m1 -v 'inet addr:127' | sed 's/:/ /' | awk '{ print $3 }': 2 (No such file or directory)
130320 18:49:37 [Warning] WSREP: Failed to guess base node address. Set it explicitly via wsrep_node_address.
130320 18:49:37 [Warning] WSREP: Guessing address for incoming client connections failed. Try setting wsrep_node_incoming_address explicitly.
130320 18:49:37 [Note] WSREP: Found saved state: 9ad38334-9082-11e2-0800-6edb31989fd4:-1
130320 18:49:37 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.
130320 18:49:37 [Note] WSREP: Passing config to GCS: base_port = 4567; cert.log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; pc.ignore_sb = true; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
130320 18:49:37 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
130320 18:49:37 [Note] WSREP: wsrep_sst_grab()
130320 18:49:37 [Note] WSREP: Start replication
130320 18:49:37 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
130320 18:49:37 [Note] WSREP: protonet asio version 0
130320 18:49:37 [Note] WSREP: backend: asio
130320 18:49:37 [Note] WSREP: GMCast version 0
130320 18:49:37 [Note] WSREP: (70fd13f2-91b0-11e2-0800-73cacdb90501, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
130320 18:49:37 [Note] WSREP: (70fd13f2-91b0-11e2-0800-73cacdb90501, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
130320 18:49:37 [Note] WSREP: EVS version 0
130320 18:49:37 [Note] WSREP: PC version 0
130320 18:49:37 [Note] WSREP: gcomm: connecting to group 'eclickz', peer '192.168.133.66:'
130320 18:49:37 [Note] WSREP: (70fd13f2-91b0-11e2-0800-73cacdb90501, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://192.168.133.68:4567
130320 18:49:37 [Note] WSREP: (70fd13f2-91b0-11e2-0800-73cacdb90501, 'tcp://0.0.0.0:4567') turning message relay requesting off
130320 18:49:37 [Note] WSREP: declaring 60a03c55-919b-11e2-0800-5b5a359d2cb3 stable
130320 18:49:37 [Note] WSREP: declaring c96d5052-918c-11e2-0800-210b57b08c72 stable
130320 18:49:38 [Note] WSREP: view(view_id(PRIM,60a03c55-919b-11e2-0800-5b5a359d2cb3,352) memb {
60a03c55-919b-11e2-0800-5b5a359d2cb3,
70fd13f2-91b0-11e2-0800-73cacdb90501,
c96d5052-918c-11e2-0800-210b57b08c72,
} joined {
} left {
} partitioned {
})
130320 18:49:38 [Note] WSREP: gcomm: connected
130320 18:49:38 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
130320 18:49:38 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
130320 18:49:38 [Note] WSREP: Opened channel 'eclickz'
130320 18:49:38 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 3
130320 18:49:38 [Note] WSREP: Waiting for SST to complete.
130320 18:49:38 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
130320 18:49:38 [Note] WSREP: STATE EXCHANGE: sent state msg: 45816f5a-91b0-11e2-0800-a4fd5a8bbd3c
130320 18:49:38 [Note] WSREP: STATE EXCHANGE: got state msg: 45816f5a-91b0-11e2-0800-a4fd5a8bbd3c from 0 (node03)
130320 18:49:38 [Note] WSREP: STATE EXCHANGE: got state msg: 45816f5a-91b0-11e2-0800-a4fd5a8bbd3c from 2 (node1)
130320 18:49:38 [Note] WSREP: STATE EXCHANGE: got state msg: 45816f5a-91b0-11e2-0800-a4fd5a8bbd3c from 1 (slave5.eclickz.com)
130320 18:49:38 [Note] WSREP: Quorum results:
version = 2,
component = PRIMARY,
conf_id = 346,
members = 2/3 (joined/total),
act_id = 832701,
last_appl. = -1,
protocols = 0/4/2 (gcs/repl/appl),
group UUID = 9ad38334-9082-11e2-0800-6edb31989fd4
130320 18:49:38 [Note] WSREP: Flow-control interval: [28, 28]
130320 18:49:38 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 832701)
130320 18:49:38 [Note] WSREP: State transfer required:
Group state: 9ad38334-9082-11e2-0800-6edb31989fd4:832701
Local state: 9ad38334-9082-11e2-0800-6edb31989fd4:-1
130320 18:49:38 [Note] WSREP: New cluster view: global state: 9ad38334-9082-11e2-0800-6edb31989fd4:832701, view# 347: Primary, number of nodes: 3, my index: 1, protocol version 2
130320 18:49:38 [Warning] WSREP: Gap in state sequence. Need state transfer.
130320 18:49:40 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '192.168.133.70:4567' --auth '' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --parent '38994''
130320 18:49:40 [ERROR] WSREP: Failed to read 'ready <addr>' from: wsrep_sst_rsync --role 'joiner' --address '192.168.133.70:4567' --auth '' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --parent '38994'
Read: '(null)'
130320 18:49:40 [ERROR] WSREP: Process completed with error: wsrep_sst_rsync --role 'joiner' --address '192.168.133.70:4567' --auth '' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --parent '38994': 2 (No such file or directory)
130320 18:49:40 [ERROR] WSREP: Failed to prepare for 'rsync' SST. Unrecoverable.
130320 18:49:40 [ERROR] Aborting
130320 18:49:42 [Note] WSREP: Closing send monitor...
130320 18:49:42 [Note] WSREP: Closed send monitor.
130320 18:49:42 [Note] WSREP: gcomm: terminating thread
130320 18:49:42 [Note] WSREP: gcomm: joining thread
130320 18:49:42 [Note] WSREP: gcomm: closing backend
130320 18:49:42 [Note] WSREP: view(view_id(NON_PRIM,60a03c55-919b-11e2-0800-5b5a359d2cb3,352) memb {
70fd13f2-91b0-11e2-0800-73cacdb90501,
} joined {
} left {
} partitioned {
60a03c55-919b-11e2-0800-5b5a359d2cb3,
c96d5052-918c-11e2-0800-210b57b08c72,
})
130320 18:49:42 [Note] WSREP: view((empty))
130320 18:49:42 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
130320 18:49:42 [Note] WSREP: gcomm: closed
130320 18:49:42 [Note] WSREP: Flow-control interval: [16, 16]
130320 18:49:42 [Note] WSREP: Received NON-PRIMARY.
130320 18:49:42 [Note] WSREP: Shifting PRIMARY -> OPEN (TO: 832701)
130320 18:49:42 [Note] WSREP: Received self-leave message.
130320 18:49:42 [Note] WSREP: Flow-control interval: [0, 0]
130320 18:49:42 [Note] WSREP: Received SELF-LEAVE. Closing connection.
130320 18:49:42 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 832701)
130320 18:49:42 [Note] WSREP: RECV thread exiting 0: Success
130320 18:49:42 [Note] WSREP: recv_thread() joined.
130320 18:49:42 [Note] WSREP: Closing slave action queue.
130320 18:49:42 [Note] WSREP: Service disconnected.
130320 18:49:42 [Note] WSREP: rollbacker thread exiting
130320 18:49:43 [Note] WSREP: Some threads may fail to exit.
One more thing: some of the other nodes do start, but they cannot bind to port 3306, which is weird.
I have not been able to resolve this. Please help me. Thanks.
I figured it out by installing libssl0.9.8.
Run the SST command from the log manually on the server and check its output:
WSREP: Running: 'wsrep_sst_rsync --role
Check whether rsync and lsof are installed on the server. I had the following error pop up:
'rsync' not found in PATH
and the equivalent message for lsof.
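A quick way to run that check on each node is sketched below. It is only an illustration: the helper name missing_tools is made up for this example, and it inspects the PATH of the user who runs it, which may differ from the environment mysqld runs in.

```python
import shutil


def missing_tools(tools=("rsync", "lsof")):
    """Return the names in `tools` that cannot be found on PATH.

    Hypothetical helper for illustration; the default tool list is taken
    from the two programs mentioned in this answer.
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        # Same situation as the "'rsync' not found in PATH" error above.
        print("Not found in PATH:", ", ".join(missing))
    else:
        print("rsync and lsof are both on PATH")
```

If anything is reported as missing, install it with the distribution's package manager and start the node again.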
Hope it helps