Percona XtraDB - unable to start nodes after long downtime - mysql

We have a Percona XtraDB Cluster set up with 3 nodes (SST method xtrabackup-v2).
Everything was working and in synchronisation when we shut down nodes 2 and 3, leaving only node 1. The nodes stayed down for a week, during which time the database grew by 100GB in size.
When we attempted to restart nodes 2 and 3, the startup failed during the initial SST after less than a minute. I have tried completely removing /var/lib/mysql and restarting, but it has the same effect.
The error logs appear to show an issue with the initial SST, possibly due to the volume of data that needs to be transferred for the initial startup. We have sufficient disk space, and the file permissions are correct. The xtrabackup package is installed and available (and worked previously anyway).
The logs show a 'No such file or directory' error.
Joiner logs show:
Dec 15 01:21:51 xm1adb05 mysqld: #011Group state: 67e7e56d-8e95-11e6-a9d2-ce8abe8f95bb:5766440
Dec 15 01:21:51 xm1adb05 mysqld: #011Local state: 00000000-0000-0000-0000-000000000000:-1
Dec 15 01:21:51 xm1adb05 mysqld: 2016-12-15 01:21:51 13029 [Note] WSREP: New cluster view: global state: 67e7e56d-8e95-11e6-a9d2-ce8abe8f95bb:5766440, view# 54: Primary, number of nodes: 2, my index: 1, protocol version 3
Dec 15 01:21:51 xm1adb05 mysqld: 2016-12-15 01:21:51 13029 [Warning] WSREP: Gap in state sequence. Need state transfer.
Dec 15 01:21:51 xm1adb05 mysqld: 2016-12-15 01:21:51 13029 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.23.40.115' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '13029' '' '
Dec 15 01:21:51 xm1adb05 mysqld: WSREP_SST: [INFO] Logging all stderr of SST/Innobackupex to syslog (20161215 01:21:51.575)
Dec 15 01:21:51 xm1adb05 -wsrep-sst-joiner: Streaming with xbstream
Dec 15 01:21:51 xm1adb05 -wsrep-sst-joiner: Using socat as streamer
...
Dec 15 01:21:51 xm1adb05 mysqld: 2016-12-15 01:21:51 13029 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (67e7e56d-8e95-11e6-a9d2-ce8abe8f95bb): 1 (Operation not permitted)
Dec 15 01:21:51 xm1adb05 mysqld: #011 at galera/src/replicator_str.cpp:prepare_for_IST():507. IST will be unavailable.
...
Dec 15 01:21:51 xm1adb05 mysqld: 2016-12-15 01:21:51 13029 [Note] WSREP: Member 1.0 (xm1adb05) requested state transfer from '*any*'. Selected 0.0 (xm1adb04)(SYNCED) as donor.
Dec 15 01:21:51 xm1adb05 mysqld: 2016-12-15 01:21:51 13029 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 5766440)
Dec 15 01:21:51 xm1adb05 mysqld: 2016-12-15 01:21:51 13029 [Note] WSREP: Requesting state transfer: success, donor: 0
Dec 15 01:21:51 xm1adb05 mysql-systemd: State transfer in progress, setting sleep higher
...
Dec 15 01:22:02 xm1adb05 -wsrep-sst-joiner: xtrabackup_checkpoints missing, failed innobackupex/SST on donor
Dec 15 01:22:02 xm1adb05 -wsrep-sst-joiner: Cleanup after exit with status:2
Dec 15 01:22:02 xm1adb05 mysqld: 2016-12-15 01:22:02 13029 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.23.40.115' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '13029' '' : 2 (No such file or directory)
Dec 15 01:22:02 xm1adb05 mysqld: 2016-12-15 01:22:02 13029 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
Dec 15 01:22:02 xm1adb05 mysqld: 2016-12-15 01:22:02 13029 [ERROR] WSREP: SST script aborted with error 2 (No such file or directory)
Dec 15 01:22:02 xm1adb05 mysqld: 2016-12-15 01:22:02 13029 [ERROR] WSREP: SST failed: 2 (No such file or directory)
Dec 15 01:22:02 xm1adb05 mysqld: 2016-12-15 01:22:02 13029 [ERROR] Aborting
Donor logs show:
Dec 15 01:22:02 xm1adb04 mysqld: 2016-12-15 01:22:02 6531 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup-v2 --role 'donor' --address '10.23.40.115:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' '' --gtid '67e7e56d-8e95-11e6-a9d2-ce8abe8f95bb:5766440'
Dec 15 01:22:02 xm1adb04 mysqld: 2016-12-15 01:22:02 6531 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'donor' --address '10.23.40.115:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' '' --gtid '67e7e56d-8e95-11e6-a9d2-ce8abe8f95bb:5766440': 22 (Invalid argument)
Dec 15 01:22:03 xm1adb04 mysqld: 2016-12-15 01:22:03 6531 [ERROR] WSREP: Command did not run: wsrep_sst_xtrabackup-v2 --role 'donor' --address '10.23.40.115:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' '' --gtid '67e7e56d-8e95-11e6-a9d2-ce8abe8f95bb:5766440'
Similar actions successfully started the secondary nodes on another (much smaller) database, so it would seem that the size may be the issue.
Can anyone give some help on how we can initialise and re-start the additional nodes?

Just switch to rsync.
wsrep_sst_method = rsync
Once the node is in sync, stop it, switch to xtrabackup and start again.
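A minimal sketch of that sequence, assuming a systemd-managed service named mysql and that wsrep_sst_method lives in /etc/my.cnf (adjust paths and unit names to your setup):
# on the joiner, set wsrep_sst_method = rsync under [mysqld] in /etc/my.cnf
systemctl start mysql      # the node joins via an rsync SST
# once SHOW STATUS LIKE 'wsrep_local_state_comment' reports Synced:
systemctl stop mysql
# set wsrep_sst_method = xtrabackup-v2 again, then:
systemctl start mysql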

XtraBackup failed in your case. You can get a lead on the XtraBackup failure by looking at the log files it generates.
In particular, check the XtraBackup log on the donor node.
As per the short snippet, XtraBackup reports 'Invalid argument'.
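On the donor, for example (assuming the default datadir of /var/lib/mysql, where the SST wrapper writes the innobackupex output):
tail -n 50 /var/lib/mysql/innobackup.backup.log
grep -i error /var/lib/mysql/innobackup.backup.log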

The solution to restart the nodes was to first restart the one remaining cluster member (node 1), then completely wipe /var/lib/mysql on the joiners (nodes 2 and 3) before trying to restart and rejoin. This forces a full SST, and all worked.
The problem seems to be that nodes 2 and 3 were seen as partitioned by node 1, and so it was not allowing the SST to complete (I think maybe the final IST was denied and so the SST rolled back). Restarting node 1 seems to reset the partitioning, after which the SST could complete.
We also had a rather small gcache.size, which didn't help, as there were a lot of writes going on in the database.
Later events showed that the SST actually failed due to issues with xtrabackup on the donor node: the xtrabackup process didn't like the my.cnf settings, where we had a line duplicated. Fixing this and restarting the donor (to end the partitioning) let things work.
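As an illustration of the gcache point, a larger ring buffer keeps more write-sets available so a returning node can use IST instead of a full SST; the value below is only an example, not the size we actually used:
# in my.cnf, under [mysqld]
wsrep_provider_options="gcache.size=2G"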

For anyone looking for an answer and seeing similar errors, especially on the donor: in my case I had to open firewall ports 4444, 4567 and 4568 across all the nodes.
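For example, with firewalld (a sketch only; use the equivalent iptables or ufw rules on other distributions, and the client port 3306 may also need to be open):
firewall-cmd --permanent --add-port=4444/tcp --add-port=4567/tcp --add-port=4568/tcp
firewall-cmd --reload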

Please take a backup of the most advanced node (node 1) first and restore it on the second and third nodes, then try to restart them; that should solve your problem.
While starting the second node, the donor does not receive the last LSN from the joiner because the second node's datadir is empty; there is no data to resume from.
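A rough sketch of that seeding procedure, assuming xtrabackup/innobackupex 2.x tooling, the default /var/lib/mysql datadir, and placeholder user, password and backup paths (whether the joiner can then avoid a full SST also depends on grastate.dat handling, which is omitted here):
# on node 1: take and prepare a backup (--galera-info records the cluster UUID and seqno)
innobackupex --galera-info --user=sst_user --password=PASSWORD /backup/
innobackupex --apply-log /backup/TIMESTAMP_DIR/
# copy it to node 2 and fix ownership
rsync -a /backup/TIMESTAMP_DIR/ node2:/var/lib/mysql/
ssh node2 'chown -R mysql:mysql /var/lib/mysql'
# then start MySQL on node 2 and let it catch up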

Related

Galera cluster not working with wsrep_sst_method=xtrabackup (-v2) until rsync is used first

I'm trying to set up a Galera cluster with MariaDB 10.2 and percona-xtrabackup-2.3.10-1.el7.x86_64.
If I bootstrap the donor with wsrep_sst_method=xtrabackup or xtrabackup-v2, the joiner is unable to join the cluster, and the error message complains about "no valid checkpoint".
However, if I first bring up the cluster using wsrep_sst_method=rsync, then change it to xtrabackup, then stop all nodes and bootstrap the donor again (with galera_new_cluster), the joiner is able to join OK (the sequence is sketched below).
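A sketch of that sequence, assuming MariaDB's systemd unit names:
# round 1: all nodes configured with wsrep_sst_method=rsync
galera_new_cluster            # bootstrap the first node
systemctl start mariadb       # on the joiner; it syncs via rsync
# switch every node to wsrep_sst_method=xtrabackup-v2, then:
systemctl stop mariadb        # on all nodes, the first node last
galera_new_cluster            # bootstrap the first node again
systemctl start mariadb       # on the joiner; only now does it join OK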
I suspect there was some data synchronized to the second node when using rsync.
Could you please give me some pointers about why xtrabackup doesn't work the first time?
Is it common practice to bootstrap the first time with rsync? Or is the cause something else entirely?
Any hints will be highly appreciated, and just let me know if you need more information.
Thank you for your help.
More details:
When bootstrapping the first time using xtrabackup, the joiner is unable to join, and its log says:
Jan 23 04:09:14 setsv-dr.local.example.com mysqld[6924]: WSREP_SST: [ERROR] xtrabackup_checkpoints missing, failed innobackupex/SST on donor (20210123 04:09:14.948)
Jan 23 04:09:14 setsv-dr.local.example.com mysqld[6924]: WSREP_SST: [ERROR] Cleanup after exit with status:2 (20210123 04:09:14.971)
and the donor logs say:
Jan 23 04:11:31 setsv mysqld: group UUID = a53cd166-5d68-11eb-aa0f-83c1fd168f1f
Jan 23 04:11:31 setsv mysqld: 2021-01-23 4:11:31 140325406353152 [Note] WSREP: Flow-control interval: [16, 16]
Jan 23 04:11:31 setsv mysqld: 2021-01-23 4:11:31 140325646681856 [Note] WSREP: REPL Protocols: 9 (4, 2)
Jan 23 04:11:31 setsv mysqld: 2021-01-23 4:11:31 140325646681856 [Note] WSREP: New cluster view: global state: a53cd166-5d68-11eb-aa0f-83c1fd168f1f:0, view# 35: Primary, number of nodes: 1, my index: 0, protocol version 3
Jan 23 04:11:31 setsv mysqld: 2021-01-23 4:11:31 140325646681856 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
Jan 23 04:11:31 setsv mysqld: 2021-01-23 4:11:31 140325646681856 [Note] WSREP: Assign initial position for certification: 0, protocol version: 4
Jan 23 04:11:31 setsv mysqld: 2021-01-23 4:11:31 140325655074560 [Note] WSREP: Service thread queue flushed.
Jan 23 04:11:32 setsv mysqld: WSREP_SST: [INFO] Streaming the backup to joiner at 192.168.56.71 4444 (20210123 04:11:32.454)
Jan 23 04:11:32 setsv mysqld: WSREP_SST: [INFO] Evaluating innobackupex --no-version-check $tmpopts $INNOEXTRA --galera-info --stream=$sfmt $itmpdir 2>${DATA}/innobackup.backup.log | socat -u stdio TCP:192.168.56.71:4444; RC=( ${PIPESTATUS[@]} ) (20210123 04:11:32.457)
Jan 23 04:11:32 setsv mysqld: 2021/01/23 04:11:32 socat[15384] E connect(6, AF=2 192.168.56.71:4444, 16): Connection refused
Jan 23 04:11:32 setsv mysqld: 2021-01-23 4:11:32 140325134952192 [Warning] Aborted connection 26 to db: 'unconnected' user: 'sst_user' host: 'localhost' (Got an error reading communication packets)
Jan 23 04:11:32 setsv mysqld: WSREP_SST: [ERROR] innobackupex finished with error: 1. Check /var/lib/mysql//innobackup.backup.log (20210123 04:11:32.467)
Jan 23 04:11:32 setsv mysqld: WSREP_SST: [ERROR] Cleanup after exit with status:22 (20210123 04:11:32.469)
Jan 23 04:11:32 setsv mysqld: WSREP_SST: [INFO] Cleaning up temporary directories (20210123 04:11:32.471)
Jan 23 04:11:32 setsv mysqld: 2021-01-23 4:11:32 140324111660800 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup-v2 --role 'donor' --address '192.168.56.71:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --gtid 'a53cd166-5d68-11eb-aa0f-83c1fd168f1f:0' --gtid-domain-id '0' --mysqld-args --basedir=/usr --wsrep-new-cluster --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1
Jan 23 04:11:32 setsv mysqld: 2021-01-23 4:11:32 140324111660800 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'donor' --address '192.168.56.71:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --gtid 'a53cd166-5d68-11eb-aa0f-83c1fd168f1f:0' --gtid-domain-id '0' --mysqld-args --basedir=/usr --wsrep-new-cluster --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1: 22 (Invalid argument)
Jan 23 04:11:32 setsv mysqld: 2021-01-23 4:11:32 140324111660800 [ERROR] WSREP: Command did not run: wsrep_sst_xtrabackup-v2 --role 'donor' --address '192.168.56.71:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --gtid 'a53cd166-5d68-11eb-aa0f-83c1fd168f1f:0' --gtid-domain-id '0' --mysqld-args --basedir=/usr --wsrep-new-cluster --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1
Jan 23 04:11:32 setsv mysqld: 2021-01-23 4:11:32 140325406353152 [Warning] WSREP: Could not find peer: b8dba97a-5d6b-11eb-8d6d-ae8d4ded2a34
Jan 23 04:11:32 setsv mysqld: 2021-01-23 4:11:32 140325406353152 [Warning] WSREP: 0.0 (setsv): State transfer to -1.-1 (left the group) failed: -22 (Invalid argument)
Jan 23 04:11:32 setsv mysqld: 2021-01-23 4:11:32 140325406353152 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 0)
Jan 23 04:11:32 setsv mysqld: 2021-01-23 4:11:32 140325406353152 [Note] WSREP: Member 0.0 (setsv) synced with group.
Jan 23 04:11:32 setsv mysqld: 2021-01-23 4:11:32 140325406353152 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
Jan 23 04:11:32 setsv mysqld: 2021-01-23 4:11:32 140325646681856 [Note] WSREP: Synchronized with group, ready for connections
Jan 23 04:11:32 setsv mysqld: 2021-01-23 4:11:32 140325646681856 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
Jan 23 04:11:36 setsv mysqld: 2021-01-23 4:11:36 140325414745856 [Note] WSREP: cleaning up b8dba97a (ssl://192.168.56.71:4567)
and the file /var/lib/mysql//innobackup.backup.log says:
210123 04:11:32 innobackupex: Starting the backup operation
IMPORTANT: Please check that the backup run completes successfully.
At the end of a successful backup run innobackupex
prints "completed OK!".
210123 04:11:32 Connecting to MySQL server host: localhost, user: sst_user, password: set, port: not set, socket: /var/lib/mysql/mysql.sock
Using server version 10.2.36-MariaDB
innobackupex version 2.3.10 based on MySQL server 5.6.24 Linux (x86_64) (revision id: bd0d4403f36)
xtrabackup: uses posix_fadvise().
xtrabackup: cd to /var/lib/mysql/
xtrabackup: open files limit requested 0, set to 16384
xtrabackup: using the following InnoDB configuration:
xtrabackup: innodb_data_home_dir = ./
xtrabackup: innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup: innodb_log_group_home_dir = ./
xtrabackup: innodb_log_files_in_group = 2
xtrabackup: innodb_log_file_size = 50331648
InnoDB: No valid checkpoint found.
InnoDB: If this error appears when you are creating an InnoDB database,
InnoDB: the problem may be that during an earlier attempt you managed
InnoDB: to create the InnoDB data files, but log file creation failed.
InnoDB: If that is the case, please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/error-creating-innodb.html

Reinstall mysql-server after upgrade to Ubuntu 18.04 - errors

MySQL needed to be reinstalled after I upgraded to Ubuntu 18.04. I did 'sudo service mysqld stop' before the upgrade, and then the cleanup step must have removed the MySQL server. When it wasn't found I did:
sudo apt-get remove mysql-server
sudo apt-get install mysql-server
...
Setting up mysql-server-5.7 (5.7.22-0ubuntu18.04.1) ...
update-alternatives: using /etc/mysql/mysql.cnf to provide /etc/mysql/my.cnf (my.cnf) in auto mode
mysqld: [Warning] World-writable config file '/etc/mysql/my.cnf' is ignored.
Please enable --log-error option or set appropriate redirections for standard output and/or standard error in daemon mode.
Warning: Unable to start the server. Please restart MySQL and run mysql_upgrade to ensure the database is ready for use.
mysqld: [Warning] World-writable config file '/etc/mysql/my.cnf' is ignored.
Please enable --log-error option or set appropriate redirections for standard output and/or standard error in daemon mode.
Warning: Unable to start the server.
...
I then ran 'systemctl status mysql.service' which reported this:
Jul 19 11:21:49 steve-VAIO mysqld[6106]: 2018-07-19T18:21:49.828124Z 0 [ERROR] InnoDB: Operating system error number 2 in a file oper
Jul 19 11:21:49 steve-VAIO mysqld[6106]: 2018-07-19T18:21:49.828132Z 0 [ERROR] InnoDB: The error means the system cannot find the pat
Jul 19 11:21:49 steve-VAIO mysqld[6106]: 2018-07-19T18:21:49.828140Z 0 [ERROR] InnoDB: Could not find a valid tablespace file for `my
Jul 19 11:21:49 steve-VAIO mysqld[6106]: 2018-07-19T18:21:49.828199Z 0 [Warning] InnoDB: Cannot calculate statistics for table `mysql
Jul 19 11:21:49 steve-VAIO mysqld[6106]: 2018-07-19T18:21:49.828232Z 0 [ERROR] Can't open and lock privilege tables: Tablespace is mi
Jul 19 11:21:49 steve-VAIO mysqld[6106]: 2018-07-19T18:21:49.846964Z 0 [Note] InnoDB: Buffer pool(s) load completed at 180719 11:21:4
Jul 19 11:21:49 steve-VAIO mysqld[6106]: 2018-07-19T18:21:49.934673Z 0 [Note] Event Scheduler: Loaded 0 events
Jul 19 11:21:49 steve-VAIO mysqld[6106]: 2018-07-19T18:21:49.934950Z 0 [Note] /usr/sbin/mysqld: ready for connections.
Jul 19 11:21:49 steve-VAIO mysqld[6106]: Version: '5.7.22-0ubuntu18.04.1' socket: '/var/run/mysqld/mysqld.sock' port: 3306 (Ubuntu
Jul 19 11:21:49 steve-VAIO systemd[1]: Started MySQL Community Server.
mysqld is running and my databases are available, but I'm sure this will bite me again when it updates. phpMyAdmin shows the mysql database, which seems healthy. At the top of the list of databases is one called '#mysql50#mysql.bkp'.
Any suggestions for cleaning this up?
After a suggestion to run mysql_upgrade, I did:
steve@steve-VAIO:~/workspace/JavascriptCourse$ mysql_upgrade -u root -p
mysql_upgrade: [Warning] World-writable config file '/etc/mysql/my.cnf' is ignored.
Enter password:
Checking if update is needed.
Checking server version.
Running queries to upgrade MySQL server.
mysql_upgrade: [ERROR] 1812: Tablespace is missing for table `mysql`.`plugin`.
So I'll hunt for a fix for this table's tablespace.

mariadb, add 4th galera node failed

I have a three node setup that has been running perfectly for the past months.
Recently I wanted to add another node in a different location, but somehow I keep getting errors.
At first, I was just following this tutorial (which I used when I set up the cluster a few months ago): https://www.howtoforge.com/tutorial/how-to-install-and-configure-galera-cluster-on-ubuntu-1604/ I did not start all the nodes again from the beginning; I just edited /etc/mysql/conf.d/galera.cnf on the other three nodes and added the new node's IP to each of them. So for the fourth node I had /etc/mysql/conf.d/galera.cnf set up like...
[mysqld]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0
# Galera Provider Configuration
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
# Galera Cluster Configuration
wsrep_cluster_name="galera_cluster"
wsrep_cluster_address="gcomm://node1_ip,node2_ip,node3_ip,node4_ip"
# Galera Synchronization Configuration
wsrep_sst_method=rsync
# Galera Node Configuration
wsrep_node_address="xx.xx.xxx.xxx"
wsrep_node_name="Node4"
Somehow I am getting this huge error:
Group state: e3ade7e7-e682-11e7-8d16-be7d28cda90e:36273
Local state: 00000000-0000-0000-0000-000000000000:-1
[Note] WSREP: New cluster view: global state: e3ade7e7-e682-11e7-8d16-be7d28cda90e:36273, view# 122: Primary, number of nodes: 4, my
[Warning] WSREP: Gap in state sequence. Need state transfer.
[Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address 'xxx.node.4.ip' --datadir '/var/lib/mysql/' --parent '22828' ''
rsyncd version 3.1.1 starting, listening on port 4444
[Note] WSREP: Prepared SST request: rsync|xxx.node.4.ip:4444/rsync_sst
[Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
[Note] WSREP: REPL Protocols: 7 (3, 2)
[Note] WSREP: Assign initial position for certification: 36273, protocol version: 3
[Note] WSREP: Service thread queue flushed.
[Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not
at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
[Note] WSREP: Member 0.0 (Node4) requested state transfer from '*any*'. Selected 1.0 (Node1)(SYNCED) as donor.
[Note] WSREP: Shifting PRIMARY -> JOINER (TO: 36273)
[Note] WSREP: Requesting state transfer: success, donor: 1
[Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> e3ade7e7-e682-11e7-8d16-be7d28cda90e:36273
[Note] WSREP: (7642cf37, 'tcp://0.0.0.0:4567') connection to peer 7642cf37 with addr tcp://xxx.node.4.ip:4567 timed out, no messages
[Note] WSREP: (7642cf37, 'tcp://0.0.0.0:4567') turning message relay requesting off
mariadb.service: Start operation timed out. Terminating.
Terminated
WSREP_SST: [INFO] Joiner cleanup. rsync PID: 22875
sent 0 bytes received 0 bytes total size 0
WSREP_SST: [INFO] Joiner cleanup done.
[ERROR] WSREP: Process was aborted.
[ERROR] WSREP: Process completed with error: wsrep_sst_rsync --role 'joiner' --address 'xxx.node.4.ip' --datadir '/var/lib/mysql/'
[ERROR] WSREP: Failed to read uuid:seqno and wsrep_gtid_domain_id from joiner script.
[ERROR] WSREP: SST failed: 2 (No such file or directory)
[ERROR] Aborting
Error in my_thread_global_end(): 1 threads didn't exit
mariadb.service: Main process exited, code=exited, status=1/FAILURE
Failed to start MariaDB 10.1.33 database server.
P.S. For the older 3 nodes the MariaDB version is 10.1.29, and the new node is 10.1.33.
Thanks in advance for any suggestions.

MySQL 5.7 Percona XtraDB Cluster - Can't start MySQL - Digital Ocean box

I rebooted a Digital Ocean box and now I can't start MySQL. When I run the start command I get:
Redirecting to /bin/systemctl restart mysql.service
Job for mysql.service failed because the control process exited with error code. See "systemctl status mysql.service" and "journalctl -xe" for details.
Result of systemctl status mysql.service
● mysql.service - Percona XtraDB Cluster
Loaded: loaded (/usr/lib/systemd/system/mysql.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Mon 2017-12-11 17:08:18 UTC; 5s ago
Process: 26300 ExecStopPost=/usr/bin/mysql-systemd stop-post (code=exited, status=0/SUCCESS)
Process: 26270 ExecStop=/usr/bin/mysql-systemd stop (code=exited, status=2)
Process: 25674 ExecStartPost=/usr/bin/mysql-systemd start-post $MAINPID (code=exited, status=1/FAILURE)
Process: 25673 ExecStart=/usr/bin/mysqld_safe --basedir=/usr (code=exited, status=0/SUCCESS)
Process: 25632 ExecStartPre=/usr/bin/mysql-systemd start-pre (code=exited, status=0/SUCCESS)
Main PID: 25673 (code=exited, status=0/SUCCESS)
Dec 11 17:08:18 server-name-hidden mysql-systemd[25674]: ERROR! mysqld_safe with PID 25673 has already exited: FAILURE
Dec 11 17:08:18 server-name-hidden systemd[1]: mysql.service: control process exited, code=exited status=1
Dec 11 17:08:18 server-name-hidden mysql-systemd[26270]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Dec 11 17:08:18 server-name-hidden mysql-systemd[26270]: ERROR! mysql already dead
Dec 11 17:08:18 server-name-hidden systemd[1]: mysql.service: control process exited, code=exited status=2
Dec 11 17:08:18 server-name-hidden mysql-systemd[26300]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Dec 11 17:08:18 server-name-hidden mysql-systemd[26300]: WARNING: mysql may be already dead
Dec 11 17:08:18 server-name-hidden systemd[1]: Failed to start Percona XtraDB Cluster.
Dec 11 17:08:18 server-name-hidden systemd[1]: Unit mysql.service entered failed state.
Dec 11 17:08:18 server-name-hidden systemd[1]: mysql.service failed.
/var/run/mysqld/ is owned by the mysql user, but it's empty. If I add a mysqld.pid file it gets removed when I run mysql start.
Does anyone know why a reboot would cause this, or can you give me any next steps? I have reviewed the mysqld.log file and can't see anything of use. Here are the last 30 lines:
2017-12-11T15:14:42.473164Z 0 [Note] WSREP: Received shutdown signal. Will sleep for 10 secs before initiating shutdown. pxc_maint_mode switched to SHUTDOWN
2017-12-11T15:14:52.474584Z 0 [Note] WSREP: Stop replication
2017-12-11T15:14:52.474691Z 0 [Note] WSREP: Closing send monitor...
2017-12-11T15:14:52.475160Z 0 [Note] WSREP: Closed send monitor.
2017-12-11T15:14:52.475268Z 0 [Note] WSREP: gcomm: terminating thread
2017-12-11T15:14:52.475305Z 0 [Note] WSREP: gcomm: joining thread
2017-12-11T15:14:52.475886Z 0 [Note] WSREP: gcomm: closing backend
2017-12-11T15:14:52.476079Z 0 [Note] WSREP: Current view of cluster as seen by this node
2017-12-11T15:14:52.476491Z 0 [Note] WSREP: gcomm: closed
2017-12-11T15:14:52.476532Z 0 [Note] WSREP: Received self-leave message.
2017-12-11T15:14:52.476559Z 0 [Note] WSREP: Flow-control interval: [0, 0]
2017-12-11T15:14:52.476595Z 0 [Note] WSREP: Trying to continue unpaused monitor
2017-12-11T15:14:52.476602Z 0 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2017-12-11T15:14:52.476608Z 0 [Note] WSREP: Shifting SYNCED -> CLOSED (TO: 13842228)
2017-12-11T15:14:52.476632Z 0 [Note] WSREP: RECV thread exiting 0: Success
2017-12-11T15:14:52.477098Z 0 [Note] WSREP: recv_thread() joined.
2017-12-11T15:14:52.477110Z 0 [Note] WSREP: Closing replication queue.
2017-12-11T15:14:52.477116Z 0 [Note] WSREP: Closing slave action queue.
2017-12-11T15:14:52.477123Z 0 [Note] Giving 63 client threads a chance to die gracefully
2017-12-11T15:14:52.478905Z 0 [Warning] /usr/sbin/mysqld: Forcing close of thread 21 user: 'hidden'
2017-12-11T15:14:52.478957Z 0 [Warning] /usr/sbin/mysqld: Forcing close of thread 22 user: 'hidden'
2017-12-11T15:14:52.478994Z 0 [Warning] /usr/sbin/mysqld: Forcing close of thread 6387 user: 'hidden'
2017-12-11T15:14:52.479034Z 0 [Warning] /usr/sbin/mysqld: Forcing close of thread 6367 user: 'hidden'
2017-12-11T15:14:52.479084Z 0 [Warning] /usr/sbin/mysqld: Forcing close of thread 6373 user: 'hidden'
2017-12-11T15:14:52.479130Z 0 [Warning] /usr/sbin/mysqld: Forcing close of thread 6368 user: 'hidden'
2017-12-11T15:14:54.479289Z 0 [Note] WSREP: Waiting for active wsrep applier to exit
I've also been told to try: systemctl start mysql@bootstrap. However, this fails with the same error. Here is the result from journalctl -xe:
Dec 11 17:24:58 dropletname systemd[1]: mysql@bootstrap.service: control process exited, code=exited status=1
Dec 11 17:24:58 dropletname mysql-systemd[28663]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Dec 11 17:24:58 dropletname mysql-systemd[28663]: ERROR! mysql already dead
Dec 11 17:24:58 dropletname systemd[1]: mysql@bootstrap.service: control process exited, code=exited status=2
Dec 11 17:24:58 dropletname mysql-systemd[28694]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Dec 11 17:24:58 dropletname mysql-systemd[28694]: WARNING: mysql may be already dead
Dec 11 17:24:58 dropletname systemd[1]: Failed to start Percona XtraDB Cluster with config /etc/sysconfig/mysql.bootstrap.
Dec 11 17:24:58 dropletname systemd[1]: Unit mysql@bootstrap.service entered failed state.
Dec 11 17:24:58 dropletname systemd[1]: mysql@bootstrap.service failed.
Dec 11 17:24:58 dropletname polkitd[510]: Unregistered Authentication Agent for unix-process:28009:649814 (system bus name :1.159, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_GB.UTF-8) (disconnected from bus)
Dec 11 17:25:02 dropletname sendmail[27970]: unable to qualify my own domain name (dropletname) -- using short name
Dec 11 17:25:02 dropletname sendmail[27970]: vBBHP2cx027970: from=hidden, size=1655, class=-60, nrcpts=1, msgid=<201712111725.vBBHP2cx027970@dropletname>, relay=hidden@localhost
Dec 11 17:25:02 dropletname sendmail[28728]: vBBHP26v028728: from=<hidden@dropletname>, size=1935, class=-60, nrcpts=1, msgid=<201712111725.vBBHP2cx027970@dropletname>, proto=ESMTP, daemon=MTA, relay=dropletname [127.0.0.1]
Dec 11 17:25:02 dropletname sendmail[27970]: vBBHP2cx027970: to=hidden, ctladdr=hidden (1000/1000), delay=00:00:00, xdelay=00:00:00, mailer=relay, pri=139655, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (vBBHP26v028728 Message accepted for del
Dec 11 17:25:02 dropletname sendmail[28729]: vBBHP26v028728: to=<hidden@dropletname>, ctladdr=<hidden@dropletname> (1000/1000), delay=00:00:00, xdelay=00:00:00, mailer=local, pri=140148, dsn=2.0.0, stat=Sent
Dec 11 17:25:02 dropletname systemd[1]: Removed slice User Slice of hidden
ps aux|grep mysql
root 11590 0.0 0.0 107924 608 pts/0 T 15:43 0:00 cat /var/log/mysqld.log
root 24666 0.0 0.0 107924 612 pts/0 T 16:52 0:00 cat /var/log/mysqld.log
root 32182 0.0 0.0 112664 972 pts/2 S+ 17:59 0:00 grep --color=auto mysq
I managed to resolve this in the end, however I'm not sure exactly what did it. I changed /etc/my.cnf to look like this:
!includedir /etc/my.cnf.d/
!includedir /etc/percona-xtradb-cluster.conf.d/
[client]
socket=/var/lib/mysql/mysql.sock
[mysqld]
server-id=1
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
log-bin
log_slave_updates
expire_logs_days=7
innodb_strict_mode=OFF
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
I created mysqld.pid and mysqld.lock files in /var/run/mysqld/, ensuring the owner and group were mysql:mysql:
touch /var/run/mysqld/mysqld.pid;
touch /var/run/mysqld/mysqld.lock;
chown -R mysql:mysql /var/run/mysqld/;
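For reference, on distributions where /var/run is a tmpfs this directory disappears at every reboot, which would explain the empty /var/run/mysqld; a tmpfiles.d entry (my assumption about the cause, not something I verified on this box) recreates it at boot with the right ownership:
echo 'd /var/run/mysqld 0755 mysql mysql -' > /etc/tmpfiles.d/mysqld.conf
systemd-tmpfiles --create /etc/tmpfiles.d/mysqld.conf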
I then ran the following two commands, which seemed to fail but may have helped:
service mysql start --wsrep-cluster-address="gcomm://";
systemctl start mysql --wsrep-new-cluster;
Finally I ran
systemctl start mysql@bootstrap.service
service mysql start
This had previously failed every time I tried. Subsequently I found that /var/lib/mysql/mysql.sock had been created, which previously did not exist.

MySQL fails to start

Having trouble diagnosing my issue (mysql fails to start). Any help would be greatly appreciated.
Also, this is on Ubuntu 16.04
Step 1 - service mysql start
Job for mysql.service failed because the control process exited with error code. See "systemctl status mysql.service" and "journalctl -xe" for details.
Step 2 - systemctl status mysql.service
mysql.service - MySQL Community Server
Loaded: loaded (/lib/systemd/system/mysql.service; enabled; vendor preset: enabled)
Active: activating (start-post) (Result: exit-code) since Mon 2016-10-31 22:40:22 UTC; 11s ago
Process: 9780 ExecStart=/usr/sbin/mysqld (code=exited, status=1/FAILURE)
Process: 9777 ExecStartPre=/usr/share/mysql/mysql-systemd-start pre (code=exited, status=0/SUCCESS)
Main PID: 9780 (code=exited, status=1/FAILURE); : 9781 (mysql-systemd-s)
CGroup: /system.slice/mysql.service
└─control
├─9781 /bin/bash /usr/share/mysql/mysql-systemd-start post
└─9806 sleep 1
Step 3 - journalctl -xe
Oct 31 22:42:32 sshd[10033]: Received disconnect from 121.18.238.109 port 34577:11: [preauth]
Oct 31 22:42:32 sshd[10033]: Disconnected from 121.18.238.109 port 34577 [preauth]
Oct 31 22:42:32 sshd[10033]: PAM 2 more authentication failures; logname= uid=0 euid=0 tty=ssh ruser= rhost=121.18.238.109 user=root
Oct 31 22:42:35 sshd[10090]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=116.31.116.23 user=root
Oct 31 22:42:37 sshd[10090]: Failed password for root from 116.31.116.23 port 52791 ssh2
Oct 31 22:42:39 sshd[10090]: Failed password for root from 116.31.116.23 port 52791 ssh2
Oct 31 22:42:41 sshd[10090]: Failed password for root from 116.31.116.23 port 52791 ssh2
Oct 31 22:42:41 sshd[10090]: Received disconnect from 116.31.116.23 port 52791:11: [preauth]
Oct 31 22:42:41 sshd[10090]: Disconnected from 116.31.116.23 port 52791 [preauth]
Oct 31 22:42:41 sshd[10090]: PAM 2 more authentication failures; logname= uid=0 euid=0 tty=ssh ruser= rhost=116.31.116.23 user=root
Oct 31 22:42:43 sshd[10084]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=121.18.238.109 user=root
Oct 31 22:42:45 sshd[10084]: Failed password for root from 121.18.238.109 port 40812 ssh2
Oct 31 22:42:48 sshd[10084]: Failed password for root from 121.18.238.109 port 40812 ssh2
Oct 31 22:42:50 sshd[10084]: Failed password for root from 121.18.238.109 port 40812 ssh2
Oct 31 22:42:50 sshd[10084]: Received disconnect from 121.18.238.109 port 40812:11: [preauth]
Oct 31 22:42:50 sshd[10084]: Disconnected from 121.18.238.109 port 40812 [preauth]
Oct 31 22:42:50 sshd[10084]: PAM 2 more authentication failures; logname= uid=0 euid=0 tty=ssh ruser= rhost=121.18.238.109 user=root
Oct 31 22:42:56 systemd[1]: Failed to start MySQL Community Server.
-- Subject: Unit mysql.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit mysql.service has failed.
--
-- The result is failed.
Oct 31 22:42:56 systemd[1]: mysql.service: Unit entered failed state.
Oct 31 22:42:56 systemd[1]: mysql.service: Failed with result 'exit-code'.
Oct 31 22:42:56 systemd[1]: mysql.service: Service hold-off time over, scheduling restart.
Oct 31 22:42:56 systemd[1]: Stopped MySQL Community Server.
-- Subject: Unit mysql.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit mysql.service has finished shutting down.
Oct 31 22:42:56 systemd[1]: Starting MySQL Community Server...
-- Subject: Unit mysql.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit mysql.service has begun starting up.
Oct 31 22:42:56 systemd[1]: mysql.service: Main process exited, code=exited, status=1/FAILURE
Oct 31 22:43:04 sshd[10141]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=121.18.238.109 user=root
Oct 31 22:43:06 sshd[10141]: Failed password for root from 121.18.238.109 port 56169 ssh2
Oct 31 22:43:11 sshd[10141]: Failed password for root from 121.18.238.109 port 56169 ssh2
Oct 31 22:43:13 sshd[10141]: Failed password for root from 121.18.238.109 port 56169 ssh2
Oct 31 22:43:14 sshd[10178]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=116.31.116.23 user=root
Oct 31 22:43:17 sshd[10178]: Failed password for root from 116.31.116.23 port 60094 ssh2
Oct 31 22:43:18 sshd[10178]: Failed password for root from 116.31.116.23 port 60094 ssh2
Oct 31 22:43:20 sshd[10178]: Failed password for root from 116.31.116.23 port 60094 ssh2
Oct 31 22:43:20 sshd[10178]: Received disconnect from 116.31.116.23 port 60094:11: [preauth]
Oct 31 22:43:20 sshd[10178]: Disconnected from 116.31.116.23 port 60094 [preauth]
Oct 31 22:43:20 sshd[10178]: PAM 2 more authentication failures; logname= uid=0 euid=0 tty=ssh ruser= rhost=116.31.116.23 user=root
Oct 31 22:43:21 sshd[10141]: Received disconnect from 121.18.238.109 port 56169:11: [preauth]
Oct 31 22:43:21 sshd[10141]: Disconnected from 121.18.238.109 port 56169 [preauth]
Oct 31 22:43:21 sshd[10141]: PAM 2 more authentication failures; logname= uid=0 euid=0 tty=ssh ruser= rhost=121.18.238.109 user=root
More info that could help: I tried mysql but received ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
Also, here is the error log:
2016-10-31T19:06:58.590888Z 0 [Note] Giving 0 client threads a chance to die gracefully
2016-10-31T19:06:58.591364Z 0 [Note] Shutting down slave threads
2016-10-31T19:06:58.592730Z 0 [Note] Forcefully disconnecting 0 remaining clients
2016-10-31T19:06:58.592792Z 0 [Note] Event Scheduler: Purging the queue. 0 events
2016-10-31T19:06:58.598944Z 0 [Note] Binlog end
2016-10-31T19:06:58.701931Z 0 [Note] Shutting down plugin 'validate_password'
2016-10-31T19:06:58.705709Z 0 [Note] Shutting down plugin 'ngram'
2016-10-31T19:06:58.705751Z 0 [Note] Shutting down plugin 'ARCHIVE'
2016-10-31T19:06:58.705790Z 0 [Note] Shutting down plugin 'partition'
2016-10-31T19:06:58.705812Z 0 [Note] Shutting down plugin 'BLACKHOLE'
2016-10-31T19:06:58.705844Z 0 [Note] Shutting down plugin 'INNODB_SYS_VIRTUAL'
2016-10-31T19:06:58.706964Z 0 [Note] Shutting down plugin 'INNODB_SYS_DATAFILES'
2016-10-31T19:06:58.707001Z 0 [Note] Shutting down plugin 'INNODB_SYS_TABLESPACES'
2016-10-31T19:06:58.707020Z 0 [Note] Shutting down plugin 'INNODB_SYS_FOREIGN_COLS'
2016-10-31T19:06:58.707037Z 0 [Note] Shutting down plugin 'INNODB_SYS_FOREIGN'
2016-10-31T19:06:58.707054Z 0 [Note] Shutting down plugin 'INNODB_SYS_FIELDS'
2016-10-31T19:06:58.707208Z 0 [Note] Shutting down plugin 'INNODB_SYS_COLUMNS'
2016-10-31T19:06:58.707235Z 0 [Note] Shutting down plugin 'INNODB_SYS_INDEXES'
2016-10-31T19:06:58.707255Z 0 [Note] Shutting down plugin 'INNODB_SYS_TABLESTATS'
2016-10-31T19:06:58.707271Z 0 [Note] Shutting down plugin 'INNODB_SYS_TABLES'
2016-10-31T19:06:58.707286Z 0 [Note] Shutting down plugin 'INNODB_FT_INDEX_TABLE'
2016-10-31T19:06:58.707306Z 0 [Note] Shutting down plugin 'INNODB_FT_INDEX_CACHE'
2016-10-31T19:06:58.707322Z 0 [Note] Shutting down plugin 'INNODB_FT_CONFIG'
2016-10-31T19:06:58.707340Z 0 [Note] Shutting down plugin 'INNODB_FT_BEING_DELETED'
2016-10-31T19:06:58.707358Z 0 [Note] Shutting down plugin 'INNODB_FT_DELETED'
2016-10-31T19:06:58.707411Z 0 [Note] Shutting down plugin 'INNODB_FT_DEFAULT_STOPWORD'
2016-10-31T19:06:58.707432Z 0 [Note] Shutting down plugin 'INNODB_METRICS'
2016-10-31T19:06:58.707449Z 0 [Note] Shutting down plugin 'INNODB_TEMP_TABLE_INFO'
2016-10-31T19:06:58.707467Z 0 [Note] Shutting down plugin 'INNODB_BUFFER_POOL_STATS'
2016-10-31T19:06:58.707484Z 0 [Note] Shutting down plugin 'INNODB_BUFFER_PAGE_LRU'
2016-10-31T19:06:58.707499Z 0 [Note] Shutting down plugin 'INNODB_BUFFER_PAGE'
2016-10-31T19:06:58.707516Z 0 [Note] Shutting down plugin 'INNODB_CMP_PER_INDEX_RESET'
2016-10-31T19:06:58.707534Z 0 [Note] Shutting down plugin 'INNODB_CMP_PER_INDEX'
2016-10-31T19:06:58.707566Z 0 [Note] Shutting down plugin 'INNODB_CMPMEM_RESET'
2016-10-31T19:06:58.707587Z 0 [Note] Shutting down plugin 'INNODB_CMPMEM'
2016-10-31T19:06:58.707603Z 0 [Note] Shutting down plugin 'INNODB_CMP_RESET'
2016-10-31T19:06:58.707619Z 0 [Note] Shutting down plugin 'INNODB_CMP'
2016-10-31T19:06:58.707636Z 0 [Note] Shutting down plugin 'INNODB_LOCK_WAITS'
2016-10-31T19:06:58.707654Z 0 [Note] Shutting down plugin 'INNODB_LOCKS'
2016-10-31T19:06:58.707672Z 0 [Note] Shutting down plugin 'INNODB_TRX'
2016-10-31T19:06:58.707698Z 0 [Note] Shutting down plugin 'InnoDB'
2016-10-31T19:06:58.711649Z 0 [Note] InnoDB: FTS optimize thread exiting.
2016-10-31T19:06:58.714623Z 0 [Note] InnoDB: Starting shutdown...
2016-10-31T19:06:58.819989Z 0 [ERROR] InnoDB: Operating system error number 13 in a file operation.
2016-10-31T19:06:58.820088Z 0 [ERROR] InnoDB: The error means mysqld does not have the access rights to the directory.
2016-10-31T19:06:58.820228Z 0 [Note] InnoDB: Dumping buffer pool(s) to /var/lib/mysql/ib_buffer_pool
2016-10-31T19:06:58.820416Z 0 [ERROR] InnoDB: Cannot open '/var/lib/mysql/ib_buffer_pool.incomplete' for writing: Permission denied
2016-10-31T19:07:00.692529Z 0 [Note] InnoDB: Shutdown completed; log sequence number 1333673446
2016-10-31T19:07:00.693394Z 0 [ERROR] InnoDB: Operating system error number 13 in a file operation.
2016-10-31T19:07:00.693505Z 0 [ERROR] InnoDB: The error means mysqld does not have the access rights to the directory.
2016-10-31T19:07:00.693546Z 0 [Note] Shutting down plugin 'MEMORY'
2016-10-31T19:07:00.693586Z 0 [Note] Shutting down plugin 'PERFORMANCE_SCHEMA'
2016-10-31T19:07:00.693933Z 0 [Note] Shutting down plugin 'MRG_MYISAM'
2016-10-31T19:07:00.695547Z 0 [Note] Shutting down plugin 'MyISAM'
2016-10-31T19:07:00.696705Z 0 [Note] Shutting down plugin 'CSV'
2016-10-31T19:07:00.696747Z 0 [Note] Shutting down plugin 'sha256_password'
2016-10-31T19:07:00.696758Z 0 [Note] Shutting down plugin 'mysql_native_password'
2016-10-31T19:07:00.696766Z 0 [Note] Shutting down plugin 'keyring_file'
2016-10-31T19:07:00.728590Z 0 [Note] Shutting down plugin 'binlog'
2016-10-31T19:07:00.737815Z 0 [Note] /usr/sbin/mysqld: Shutdown complete
Try using sudo service mysql start
to start the mysqld service. The issue seems to be access rights: "The error means mysqld does not have the access rights to the directory."
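If it is a permissions problem, restoring ownership of the data directory to the mysql user is the usual fix (the path below assumes the default /var/lib/mysql datadir):
sudo chown -R mysql:mysql /var/lib/mysql
sudo service mysql start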
I had a problem with that; in my case the problem was with the log file, because it had been deleted. I solved it by recreating '/var/log/mysql/error.log', and after that my server started normally.
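Something like this (mysql:adm is the usual Ubuntu ownership for /var/log/mysql; adjust if yours differs):
sudo mkdir -p /var/log/mysql
sudo touch /var/log/mysql/error.log
sudo chown -R mysql:adm /var/log/mysql
sudo service mysql start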
Your MySQL data directory has gone for a toss; it is where all the DBs are held, including the mysql system and authentication tables.
You need to reinitialize the MySQL database by making a fresh, empty base and data directory. You can follow the steps from the mysqld data directory initialization documentation.
You can use the following mysqld invocation to reset the password easily:
$ mysqld --initialize-insecure
to reset the password from your Linux variant's root login. Otherwise it's a little tougher to figure out the generated password, and I think you then need an init-file for the settings, because for me the process didn't show any random password.
I think both of the following options should be used. I cannot go over it again, but you can try whether it generates and shows a random password:
$ mysqld --initialize-insecure --initialize
The article says:
With --initialize-insecure (either with or without --initialize, because --initialize-insecure implies --initialize), the server does not generate a password or mark it expired, and writes a warning message.
To reset the password:
# mysql -u root --skip-password
....
mysql> ALTER USER 'root'@'localhost' IDENTIFIED BY 'new_password';