My cloud sql instance is stuck in Restart sate for a very long time.
In the operations pane, the status of the Restart is showing as pending, and there was also an export happening whose state is still Running .
Is there an way i can force the restart or cancel the restart or recover the data from the regular backup ?
No, there is no way. If you pay Google for a premier account, you will be able to log a support ticket and they'll look into it and probably fix it.
There's not many options you have, maybe try restarting again?
Related
I have a small instance running in GCE, had some troubles with the MongoDb so after some tries decided to reset the instance. But... it didn't seem to come back online. So i stopped the instance and restarted it.
It is an Bitnami MEAN stack which starts apache and stuff at startup.
But... i can't reach the instance! No SCP, no SSH, no webservice running. When i try to connect via SSH (in GCE) it times out, cant make connection on port 22. In the information it says 'The instance is booting up and sshd is not running yet', which is possible of course.... But i cant reach the instance in no possible manner not even after an hour wait :) Not sure what's happening if i cant connect to it somehow :(
There is some activity in the console... some CPU usage, mostly 0%, some incomming traffic but no outgoing...
I hope someone can give me a hint here!
Update 1
After the helpfull tip form Serhii... if found this in the logs...
Booting from Hard Disk 0...
[ 0.872447] piix4_smbus 0000:00:01.3: SMBus base address uninitialized - upgrade BIOS or use force_addr=0xaddr
/dev/sda1 contains a file system with errors, check forced.
/dev/sda1: Inodes that were part of a corrupted orphan linked list found.
/dev/sda1: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)
fsck exited with status code 4
The root filesystem on /dev/sda1 requires a manual fsck
Update 2...
So, i need to fsck the drive...
Created a snapshot, made a new disk from that snapshot, added the new disk as an extra disk to another instance. Now that instance wont boot with the same problem... removing the extra disk fixed it again. So adding the disk makes it crash even though it isn't the boot-disk?
First, have a look at the Compute Engine -> VM instances -> NAME_OF_YOUR_VM -> Logs -> Serial port 1 (console) and try to find errors and warnings that could be connected to lack of free space or SSH. It'll be helpful if you updated your post by providing this information. In case if your instance run out of free space follow this instructions.
You can try to connect to your VM via Serial console by following this guide, but keep in mind that:
The interactive serial console does not support IP-based access
restrictions such as IP whitelists. If you enable the interactive
serial console on an instance, clients can attempt to connect to that
instance from any IP address.
more details you can find in the documentation.
Have a look at the Troubleshooting SSH guide and Known issues for SSH in browser. In addition, Google provides a troubleshooting script for Compute Engine to identify issues with SSH login/accessibility of your Linux based instance.
If you still have a problem try to use your disk on a new instance.
EDIT It looks like your test VM is trying to boot from the disk that you created from the snapshot. Try to follow this guide.
If you still have a problem, you can try to recreate the boot disk from a snapshot to resize it.
I've had some of Google Cloud SQL MySQL 2nd Gen 5.7 instances with failover replications. Recently I noticed that the one of the instance overloaded with the storage overloaded with binlogs and old binlogs not deleted for some reason.
I tried restart this instance but it wont start since 17 March.
Normal process with binlogs on other server:
Problem server. Binlogs not clearing and server wont start and always under maintenance in the gcloud console.
Also I created one other server with same configuration and not binlogs never clearing. I have already 5326 binlogs here when on normal server I have 1273 binlogs and they are clearing each day.
What I tried with the problem server:
1 - delete it from the Google Cloud Platform frontend. Response: The instance id is currently unavailable.
2 - restart it with the gcloud command. Response: ERROR: (gcloud.sql.instances.restart) HTTPError 409: The instance or operation is not in an appropriate state to handle the request. Same response on any other command which I sent with the gcloud.
Also I tried to solve problem with binlogs to configure with expire_logs_days option, but it seems this option not support by google cloud sql instance.
After 3 days of digging I found a solution. Binlogs must cleared automatically when 7 days past. In 8 day it must clear binlogs. It still not deleted for me and still storage still climbing, but I trust it must clear shortly (today I guess)
As I told - SQL instance always in maintenance and can't be deleted from the gcloud console command or frontend. But this is interesting because I still can connect to the instance with the mysql command like mysql -u root -p -h 123.123.123.123. So, I just connected to the instance, deleted database which unused (or we can just use mysqldump to save current live database) and then I just deleted it. In the mysql logs (I'm using Stackdriver for this) I got a lot of messages like this: 2018-03-25T09:28:06.033206Z 25 [ERROR] Disk is full writing '/mysql/binlog/mysql-bin.034311' (Errcode: -255699248 - No space left on device). Waiting for someone to free space.... Let's me be this "someone".
When I deleted database it restarted and then it up. Viola. And now we have live instance. Now we can delete it/restore database on it/change storage for it.
I decided to play around with Google Could SQL and I setup a test sql instance, loaded it with some data and then setup replication on it in the google dev console. I did my testing and found out it all works great, the master/slave setup works as it should and my little POC was a success. So now I want to delete the POC sql instances but that's not going so well.
I deleted the replica instance fine (aka the 'slave') but for some reason the master instance still thinks there is a slave and therefore will not let me delete it. For example I run the following command in the gclound shell:
gcloud sql instances delete MY-INSTANCE-NAME
I get the following message:
ERROR: (gcloud.sql.instances.delete) The requested operation is not valid for a replication master instance.
This screenshot also shows that in the google dev console it clearly thinks there are no replicas attached to this instance (because I deleted them) but when I run:
gcloud sql instances describe MY-INSTANCE-NAME
It shows that there is a replica name still attached to the instance.
Any ideas on how to delete this for good? Kinda lame to keep on paying for this when it was just a POC that I want to delete (glad I didn't pick a high memory machine!)
Issue was on Google's side and they fixed it. Here were the sequence of events that led to the issue happening:
1) Change master's tier
2) Promote replica to master while the master tier change is in progress
Just had the same problem using GCloud. Deleting the failover replica first and then the master instance worked for me.
i need to update listen_addresses in postgres.conf file and want to avoid restart of postgres server process. I tried pg_ctl reload but its not working. Postgres documentation for this parameter says "This parameter can only be set at server start."
http://www.postgresql.org/docs/8.4/static/runtime-config-connection.html#GUC-LISTEN-ADDRESSES
It there any possible way to avoid the restart?
It there any possible way to avoid the restart?
No. That's why the documentation says it may only be set at restart.
If you can't afford the downtime for a simple database restart then you almost certainly need to have a connection pooling and failover system in place anyway. Start planning that so you can introduce it at the same time.
Also, 8.4 is old. If you're restarting a busy system anyway, consider planning an upgrade into the mix. Look at pg_upgrade.
I just created a free php gear...
Is the instance automatically configured to roll logs and delete old logs (to make sure we dont go over disk quota?)
Can you pls tell me how often logs are rolled and when old ones get deleted?
thanks
At this moment (April 2014), Apache RotateLogs does not seem to be used anymore. This commit seems to have changed to use logshifter, which reportedly seems to default to rotating every 10MB with a max of 10 log files.
So, to answer your question, it seems like things are automatically configured to roll logs and delete old logs to prevent us from going over disk quota.
BTW, the new logshifter setup combines the access_log and error_log into one log file instead of keeping them separate.
At this moment (Feb 2014), all OpenShift Apache-based cartridges use Apache RotateLogs program to rotate logs every midnight:
/usr/sbin/rotatelogs <gear-dir>/php/logs/access_log-%Y%m%d-%H%M%S-%Z 86400
The log files are not deleted automatically. However, you can delete them manually using rhc app-tidy <app> command. (Read more about rhc tools.)
If concerned about logs eating all your gear capacity, you might consider using monit community cartridge to trigger automatic email notifications when the app hits 80% of gear storage quota, or to tidy your app automatically. If you already created your app, you can add the monit cartridge with the following commands:
rhc env set MONIT_ALERT_EMAIL=my#email.com -a YOUR_APP
rhc cartridge-add http://goo.gl/jiIB2C -a YOUR_APP
And last but not least, feel free to open a new bug report or new feature request for OpenShift.