Google Cloud SQL Restart and Update - google-compute-engine

With no warning, we stopped being able to connect to the Google Cloud SQL database at almost precisely 8:30 am ET this morning.
We then tried to restart the instance and have been stuck for more than an hour with a similar situation to this question. It seems that this sort of freak accident has happened before on Google Cloud SQL.
The problem is that the instance is completely unresponsive to any commands - either via the GUI or the command line.
To make matters worse, there's no way to call support unless you pay hundreds of dollars per month to join a plan. I'm hoping that someone from Google might be trawling the SO threads with these tags, or that someone who has dealt with this before can offer some advice.

Just providing an update to this, for anyone who comes across it in the future...
The issue was with the Google Cloud SQL instance itself. Someone from tech support had to go in at a lower level and restart the entire instance. Basically, there's nothing you can do if you encounter the exact same situation.
This issue happened again just a few days ago (twice in the same 4 weeks), and again, there was nothing we could do.
NOTE: When this happens, you CANNOT access your db backups. This is serious cause for concern.
This seems very strange for a hosted db product, and I've come across similar cases documented by others.
We're still waiting on a post mortem after almost 2 weeks, despite upgrading to "gold level support" for $400 per month. In the meantime, we're migrating over to AWS, as we've never experienced issues with downtime on RDS.

Related

Is it recommended to change RDS instance type regularly to scale down at night?

I have an app that is heavily used during the daytime and almost never at night or on Sundays.
RDS MySQL costs are quite high, and I am wondering if I can just scale down at night and back up in the morning by changing the instance type from m5.4xlarge to m5.large and back.
I don't want to turn it off fully, since we have some crons running at night and some users might want to access the system outside business hours, but the DB usually runs at around 2% CPU during those hours.
The only recommendation I found about this kind of usage is to turn off instances, which I would like to avoid. I manually changed the instance type over the last few days, and it seems it should be no problem to automate this. The downtime of approx. 2 minutes for the instance change is completely acceptable for me, and I could schedule it for times when usage is already low.
This question is not about whether this is technically possible, but whether it is generally recommended, or whether there is any downside I didn't think of yet.
I am really wondering why I didn't find anything about this, since it seems quite a common use case for any business app that is used in only one timezone.
I did this manually and it seems to work fine. I searched the internet for hours but didn't find anything useful, pro or contra.
It's not something that people typically do. Amazon Aurora Serverless would be a much better way to handle this.
Aurora Serverless v2 is an on-demand, autoscaling configuration for Amazon Aurora. Aurora Serverless v2 helps to automate the processes of monitoring the workload and adjusting the capacity for your databases. Capacity is adjusted automatically based on application demand. You're charged only for the resources that your DB clusters consume. Thus, Aurora Serverless v2 can help you to stay within budget and avoid paying for compute resources that you don't use.
With the added benefit that Aurora can deliver up to five times the throughput of MySQL.
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless-v2.html#aurora-serverless-v2.advantages
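That said, if you do decide to automate the instance-class switch the question describes, the boto3 RDS API supports it via `modify_db_instance` with `ApplyImmediately=True` (which triggers the short downtime mentioned above). Here is a minimal sketch; the schedule, instance identifier, and cutover hours are hypothetical, so adjust them (and your timezone handling) to your own setup:

```python
# Hypothetical schedule: business hours run the big instance,
# nights and Sundays the small one.
BIG, SMALL = "db.m5.4xlarge", "db.m5.large"

def target_class(now_hour, is_sunday):
    """Return the instance class the DB should be running at this time."""
    if is_sunday or now_hour < 7 or now_hour >= 20:
        return SMALL
    return BIG

def resize(instance_id, now_hour, is_sunday):
    """Apply the scheduled class via the boto3 RDS API.

    ApplyImmediately=True makes the change happen right away, causing
    the ~2 minute downtime described in the question."""
    import boto3  # deferred so the pure scheduling logic is testable without AWS
    rds = boto3.client("rds")
    rds.modify_db_instance(
        DBInstanceIdentifier=instance_id,
        DBInstanceClass=target_class(now_hour, is_sunday),
        ApplyImmediately=True,
    )

if __name__ == "__main__":
    resize("my-db-instance", 21, False)  # hypothetical identifier
```

A cron job or EventBridge schedule could call this twice a day; keeping the class-selection logic in a pure function makes the schedule easy to test without touching AWS.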

Google cloud SQL - CPU at 100%

Earlier we noticed that our Master DB CPU started spiking:
There wasn't any unusual traffic volume/load. Also, if you look at the earlier spikes they coincide with the Google backups, but it looks like there wasn't one on the 19th despite it saying that it was run in the operations logs. I'm guessing that the Google backup went wrong on the server and it went out of control the next morning when it eventually ran.
I've cloned that server and moved the traffic across to the new one, and the CPU has now dropped to 10-20%, but this is still a lot higher than normal (1-5%).
Things that I've checked:
- Process list
- Traffic volumes
- DB/Table sizes
Any ideas on how to get to the bottom of what's causing the change, or how to fix it?
High CPU usage in a database can be caused by a bunch of different things. It might have been a heavy or inefficient query, a backup process gone wrong, or one of a few other likely suspects.
If your app can support downtime, you could try shutting it down and restarting to get a fresh state.
If you have the support package, you can also open a ticket and ask them to look into the spike further. If you don't, you can still open an issue on the Cloud SQL issue tracker, but the response time might not be as fast.
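One way to check whether the automated backup actually ran (as questioned above for the 19th) is to export the operations log with `gcloud sql operations list --instance=INSTANCE --format=json` and scan it. A minimal sketch, with the sample data trimmed to the fields used; the field names follow the Cloud SQL Admin API, but verify them against your own output:

```python
import json

# Trimmed sample of what `gcloud sql operations list --format=json`
# returns (field names assumed from the Cloud SQL Admin API).
sample = """[
  {"operationType": "BACKUP_VOLUME", "status": "DONE", "startTime": "2021-05-18T08:00:00Z"},
  {"operationType": "BACKUP_VOLUME", "status": "RUNNING", "startTime": "2021-05-19T08:00:00Z"},
  {"operationType": "RESTART", "status": "DONE", "startTime": "2021-05-19T14:00:00Z"}
]"""

def unfinished_backups(operations_json):
    """Return start times of backup operations that never reached DONE."""
    ops = json.loads(operations_json)
    return [op["startTime"]
            for op in ops
            if op.get("operationType") == "BACKUP_VOLUME"
            and op.get("status") != "DONE"]

print(unfinished_backups(sample))  # -> ['2021-05-19T08:00:00Z']
```

A backup stuck in a non-DONE state around the time the spike started would support the theory that the backup went wrong and misbehaved the next morning.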

upgrade mysql 5.0.x to 5.x on appserv windows

I want to upgrade my AppServ MySQL installation from 5.0.x to 5.x.
I have some tables and views related to various web projects and VB.NET applications in it.
Can anybody help me do this without data loss?
(Putting this in an answer as it's too long for a comment)
NB - I've not used AppServ so this answer is generic
The versions of the software within AppServ appear to be old. Very old. MySQL 5.0.51b, PHP 5.2.6, and Apache 2.2.8 are way behind with regard to security and features. The best thing you can do is replace the whole stack with a newer one.
If you do a quick Google search for WAMP installer, a plethora of available stacks are listed. The first one in the list uses MySQL 5.6.17, PHP 5.5.12, and Apache 2.4.9. Again, not the newest, but much more recent and feature rich. It's also available in 32- and 64-bit versions.
The first thing to do is to download a virtual machine system. (VirtualBox is a pretty simple one to get to grips with and runs on a variety of platforms). This is so that you can practise.
Spool up an instance of Windows (which is as close as possible to your live setup) and install your current version of AppServ and your applications which use it, take a snapshot (so you can roll back) and then work out slowly how to update to a new stack. Take lots of snapshots as you go.
You need to make note of your MySQL data directories and back up your Apache, MySQL and PHP configurations
It will take time to iron out the bugs and problems you find along the way. Do not be downhearted.
Once you have worked out how to update your stack without data loss, try your applications on the virtual machine. There is no point in upgrading your stack if your software is going to bomb out the second it starts to run.
Once you're satisfied that you know all the steps you need, roll back to the snapshot you took at the start and go through all the steps again. Then again. Keep restoring and upgrading until you are confident that you can do the update with a minimum of fuss and panic on the live system.
I would recommend doing your update over two sessions. For both sessions, choose a quiet time to do it. Essentially, out of office hours is the best, early morning (after a good sleep) is even better.
During the first session (SESSION-1), take the server offline, back up everything, then return the server to live. And when I say "back up everything", I mean EVERYTHING! Take this backup and restore it to a virtual machine. Go through the steps that you worked out before on this restored version to make sure everything is going to work. Make a note of anything that is different from the steps you worked out earlier.
When you've done your testing, you can do session two (SESSION-2). Again, take the server offline, run a differential backup on the system and a full backup of the MySQL databases. Update your WAMP stack (using the steps you worked out in SESSION-1) and bring it back online. Check that all your URLs and code still works.
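As a sketch, the full MySQL backup step in those sessions could be driven by a small script. The mysqldump flags below are standard (`--routines` and `--triggers` matter because the question mentions views and related objects), but the database name, user, and file path are hypothetical:

```python
import subprocess

def dump_command(user, database, out_file):
    """Build the mysqldump invocation for a full dump of one database.

    -p makes mysqldump prompt interactively for the password;
    --result-file avoids shell-redirection encoding issues on Windows."""
    return ["mysqldump", "-u", user, "-p",
            "--routines", "--triggers",
            "--result-file=" + out_file, database]

def run_dump(user, database, out_file):
    # check=True raises if mysqldump exits non-zero, so a failed
    # backup never goes unnoticed before you upgrade the stack.
    subprocess.run(dump_command(user, database, out_file), check=True)

if __name__ == "__main__":
    run_dump("root", "mydb", "mydb-backup.sql")  # hypothetical names
```

Keeping the command construction in its own function lets you verify exactly what will run before pointing it at the live server.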
After you've completed your checks, send triumphant emails to whoever needs to know, put a smug smile on your face for a job well done, pour yourself a large glass of whiskey (other drinks are available) and relax - you've earned it.
Sorry that I can't give you definitive steps but I use Linux for all my PHP stacks so these steps are what I would do if I was upgrading them. I spent 3 months practising upgrading my servers then did all of them in a single night (I have separate MySQL servers so it was only the Apache/PHP side I was updating - much easier and quicker)
Hopefully some of this helps. Good luck

My servers status suddenly got PROVISIONING in Google Compute Engine

After restarting some of my servers in Google Compute Engine and trying to connect to them via SSH, they have all been in PROVISIONING status for more than 4 hours!
According to google documentation:
https://cloud.google.com/compute/docs/instances#checkmachinestatus
PROVISIONING - Resources are being reserved for the instance. The
instance isn't running yet.
Well, they were working for more than one month.
I tried several times to turn them off via the gcloud command-line tool, but it didn't work.
I checked for any problems on the Google Cloud Status page; nothing is mentioned there for today:
https://status.cloud.google.com
Any idea?
In cases where "PROVISIONING" takes too long, depending on the region/zone/time you try to create your instances in, the issue could be related to (among other possibilities - I'm just giving some ideas):
Limited resources in that zone, at that time, could cause the instances to hang in the provisioning status. Google usually takes care of this pretty quickly, within a few days or something. I don't have exact information but they add more resources to the zone and the issue disappears. They also add new zones and such, so it could be worth moving to a new zone if more resources are readily available there.
Temporary issue that should resolve itself.
A few questions I have for you:
Is this still happening? Considering the fact that you posted about a month ago, I assume you've gotten past this issue and everything is working as expected at this time. If so, I'd recommend posting an answer yourself with details on what happened or what you did to fix it. You can then accept your own answer so that others can see what fixed it.
Have you tried creating instances in different zones to see if you have the problem everywhere or just within a specific one?
All in all, this is usually a transient issue, based on my experience with the Google Cloud Platform. If this is still causing trouble for you, give us some more information on what is currently happening and the community might be able to help better.
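To check whether the problem is zone-wide or instance-specific, you could export `gcloud compute instances list --format=json` and group the stuck instances by zone. A minimal sketch with trimmed sample data; the `name`, `status`, and `zone` fields match the Compute Engine API, but verify against your own output:

```python
import json

# Trimmed sample of `gcloud compute instances list --format=json` output.
sample = """[
  {"name": "web-1", "status": "RUNNING", "zone": "projects/p/zones/europe-west1-b"},
  {"name": "web-2", "status": "PROVISIONING", "zone": "projects/p/zones/europe-west1-b"},
  {"name": "db-1", "status": "PROVISIONING", "zone": "projects/p/zones/us-central1-a"}
]"""

def stuck_instances(instances_json):
    """Group instances stuck in PROVISIONING by zone,
    to spot whether one zone accounts for all of them."""
    by_zone = {}
    for inst in json.loads(instances_json):
        if inst["status"] == "PROVISIONING":
            zone = inst["zone"].rsplit("/", 1)[-1]  # keep just the zone name
            by_zone.setdefault(zone, []).append(inst["name"])
    return by_zone

print(stuck_instances(sample))
# -> {'europe-west1-b': ['web-2'], 'us-central1-a': ['db-1']}
```

If every stuck instance lands in one zone, that supports the limited-resources theory above and suggests trying a different zone.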

Is there any way to manually migrate a GCE VM in a maintenance window?

With no warning e-mail, it seems that europe-west1 Zone B has gone down for maintenance, for 16 days until the 1st April 2014. Given that GCE is a cloud-based service and that I have the automatic 'migrate on maintenance' setting enabled, I assumed that I had nothing to worry about. However, after the VM was terminated last night and I reread 'Designing Robust Systems', it seems that I was badly mistaken/misled! It will take 3 days' work to rebuild a new server, and I have 20 students with data locked up for two weeks in the middle of the semester. Does anybody have any suggestions?
I made the exact same mistake. I have been in contact with Google, and there is NOTHING we can do but wait for the maintenance window to end.
Also, my instance is now removed, so I will have to re-create the instance and attach to my persistent disk.