Why are instances permanently created and deleted in my project(s)? - google-compute-engine

For some reason I see under "Operations" in my "Compute Engine" the following:
I would like to understand why this is happening. What are these gae-default-* VMs (assuming they are actually VMs)? What are they actually doing?
If you know a lot of stuff about GAE and the Compute Engine please consider taking a look at this question "Deploying a GWT application to Google Compute Engine - What is happening here?" as well.
The CPU is also being utilized, even though nothing should be running:
If I manually delete those VMs they simply re-appear.

GAE stands for Google App Engine. It looks like you have some App Engine services deployed. If you use the flexible environment, App Engine manages GCE instances on your behalf, which would also explain why the VMs re-appear after you delete them. I would imagine you can find the running services and versions under App Engine in the web console.

Related

1gb database with only two records

I identified an issue with an infrastructure I created on the Google Cloud Platform and would like to ask the community for help.
I was charged unexpectedly, even though in theory it should be almost impossible for me to exceed the free tier limits. I noticed that my database is huge, at 1 GB and growing, and that there are some very heavy buckets.
My database is managed by a Django app, and according to the admin panel there are only 2 records in production. There were also several backups and similar things, but I did not create them.
Can anyone give me a guide on how to solve this?
I would assume that you manage the database yourself, i.e. it is not Cloud SQL. Databases pre-allocate files in larger chunks in order to be efficient on writes. You can check this yourself: write an additional 100k records, and most likely the size will not change.
Go to Cloud SQL and it will say how many sql instances you have. If you see the "create instance" window that means that you don't have any Google managed SQL instances and the one we're talking about is a standalone one.
You can also check Deployment Manager to see whether you deployed one from the Marketplace or whether it's a standalone VM with MySQL installed manually.
I made a simple experiment and deployed (from Marketplace) a MySQL 5.6 on top of Ubuntu 14.04 LTS. Initial DB size was over 110MB - but there are some records in it initially (schema, permissions etc) so it's not "empty" at all.
It is still weird that your DB is 1 GB already. Try deploying a new one (if your use case permits that). Maybe this is an old DB that was used for some other purpose, and all the content was deleted afterwards but some "trash" still remains and takes up a lot of space.
Or it may well be exactly as NeatNerd said: the DB allocates some space up front for performance and optimisation reasons.
If you provide more details, I can give you a better explanation.
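The "deleted content still takes space" idea can be illustrated with a minimal sketch. It uses SQLite from the Python standard library purely as a stand-in, not the asker's actual database; the table name, row count, and payload size are hypothetical. With auto-vacuum disabled, deleting rows leaves the freed pages inside the file, and only a rebuild (`VACUUM` in SQLite) returns the space.

```python
import os
import sqlite3
import tempfile


def measure_sizes(row_count=20000):
    """Return file sizes in bytes: after insert, after delete, after VACUUM."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical throwaway database file, used only for this demo.
        path = os.path.join(tmp, "demo.sqlite3")
        conn = sqlite3.connect(path)
        try:
            # Make the behaviour explicit: freed pages stay in the file.
            conn.execute("PRAGMA auto_vacuum = NONE")
            conn.execute(
                "CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)"
            )
            conn.executemany(
                "INSERT INTO records (payload) VALUES (?)",
                (("x" * 200,) for _ in range(row_count)),
            )
            conn.commit()
            after_insert = os.path.getsize(path)

            # Keep only two rows, mirroring the "two records" situation.
            conn.execute("DELETE FROM records WHERE id > 2")
            conn.commit()
            after_delete = os.path.getsize(path)

            # VACUUM rebuilds the file and drops the unused pages.
            conn.execute("VACUUM")
            after_vacuum = os.path.getsize(path)
        finally:
            conn.close()
    return after_insert, after_delete, after_vacuum


if __name__ == "__main__":
    inserted, deleted, vacuumed = measure_sizes()
    print("after insert:", inserted)
    print("after delete:", deleted)
    print("after VACUUM:", vacuumed)
```

MySQL behaves differently in the details, so treat this only as an analogy: the on-disk size of a database is not a reliable indicator of how many records it currently holds.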

Will GoogleBot's indexing cause CloudSQL to be expensive on a low traffic website (Afraid of Google's CloudSQL pricing)

here is the issue:
I used the CloudSQL price calculator to estimate the price of running a website. My website has 1000-2000 URLs, and each URL uses the DB in some way. I don't have more than 1GB of data, and I mostly deal with reads from a small 50k-record table, nothing super-complicated. I don't currently have very complex queries either, and I write to the DB only about once a week, a couple of records here and there. I've even considered SQLite, to be honest.
I don't currently have a lot of traffic, maybe people come to visit once a day. However, GoogleBot will continuously try to index the website via the sitemap, which sometimes causes a lot of requests on the server.
Currently, I have a normal php+mysql website which does the job on a DigitalOcean instance and doesn't take a lot of resources. However, I want to move to Cloud Run in order to try the Cloud Run technology, but running MySQL directly in the Cloud Run container is discouraged (as per this question: Should I run mysql on google cloud run? (or any database)).
So I'm kind of afraid of using CloudSQL and then having GoogleBot destroy my credit card by making lots of concurrent requests to the CloudSQL database during daily indexing.
Traffic doesn't scare me (I don't have any), but crawlers do.
Should I use CloudSQL for this use case?
Will my credit card be destroyed?
Are these valid concerns?
Any opinion from experienced CloudSQL Users would be greatly appreciated.
If you are considering a fully managed database instance, Google Cloud is definitely a good choice for you.
If you want to optimize GoogleBot crawling, you can do it from here
However, if you experience high server load from specific sites/services, you may consider blocking them or using Google Cloud CDN caching.
Please read this article, which explains how to deal with heavy bot load on a website.
Your concerns do not sound valid to me, since you can limit GoogleBot's crawl rate.
Since Cloud Run is a stateless container platform, it is not suited to running MySQL. If you want to install and manage your own MySQL server, you can do so on Compute Engine using a one-click solution from the Marketplace.
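To make the caching suggestion concrete, here is a minimal sketch of the general idea. It is an application-level, in-memory cache written in Python, not Cloud CDN itself and not the asker's PHP code. `TTLCache`, `load_page_from_db`, and `handle_request` are hypothetical names, and the loader stands in for a real database query. Repeated requests for the same URL within the time-to-live are served from memory, so a crawler re-fetching pages does not translate one-to-one into database queries.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so the expiry logic can be tested
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no call to the loader
        value = loader(key)
        self._entries[key] = (now + self.ttl_seconds, value)
        return value


# Records every simulated database hit so the effect can be observed.
db_calls = []


def load_page_from_db(url):
    # Hypothetical stand-in for a real database query.
    db_calls.append(url)
    return f"<html>content for {url}</html>"


def handle_request(cache, url):
    """Serve a page, touching the database only on a cache miss."""
    return cache.get_or_load(url, load_page_from_db)
```

With a cache created as `TTLCache(ttl_seconds=300)`, a burst of requests for the same URL would reach the database once per five minutes. Note that on Cloud Run each container instance would hold its own copy of such a cache, which is one reason a shared layer such as a CDN is the suggestion made above.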

Should I run mysql on google cloud run? (or any database)

I've been researching the new options for running Docker containers on Google Cloud Run, but there seems to be no advice on whether or not one should run MySQL on Cloud Run. I know it isn't a web service, and I understand that the official Google documentation for GCP would probably just tell people to use Cloud SQL (their SQL offering). I haven't found any advice online about "running mysql on cloud run", so I thought I'd ask here.
Will startup times from cold starts decrease performance of the solution? (assuming one uses a Bucket for storing the stuff)
Running a SQL database is not a good fit for Cloud Run.
First of all, the contract between the deployed container and Cloud Run is that the container needs to run an HTTP server on the port given by the PORT environment variable (8080 by default). That's not really the way MySQL works.
Second of all, the container is going to be limited to the filesystem that was included in the container image. This same image is going to be instantiated many times over as the service handles load. There will be no way to persist the data written to MySQL. You could have read-only data stored in that image that only changes when a new image is published, but that's not really what you would expect to use a relational database for.
Cloud Run is really good at operating HTTP/web services in a serverless and scalable way. These web services typically make use of other APIs and services deployed to Google Cloud, or third-party services. It's not really meant to offer persistent, scalable, ACID-compliant database services; this is a whole different sort of problem space.
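For contrast with a database server, the kind of workload described above can be sketched as follows. This is a minimal, hypothetical example using only the Python standard library: a stateless HTTP service that listens on the port taken from the `PORT` environment variable, falling back to 8080. `HelloHandler`, `build_server`, and `main` are illustrative names, not part of any Cloud Run API.

```python
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answers every GET with a fixed body; keeps no state between requests."""

    def do_GET(self):
        body = b"hello from a stateless container\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging for this sketch.
        pass


def build_server(port=None):
    """Create the server; the port defaults to $PORT, or 8080 if unset."""
    if port is None:
        port = int(os.environ.get("PORT", "8080"))
    return ThreadingHTTPServer(("0.0.0.0", port), HelloHandler)


def main():
    # Intended as the container entrypoint: blocks and serves until stopped.
    build_server().serve_forever()
```

A container entrypoint would call `main()`. Because nothing is written to local disk, any number of identical instances can be started or stopped without losing data, which is exactly the property a MySQL server does not have.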

Shrink a Dataproc worker boot disk

Due to some mix-up during planning we ended up with several worker nodes running 23TB drives which are now almost completely unused (we keep data on external storage). As the drives are only wasting money at the moment, we need to shrink them to a reasonable size.
Using weresync I was able to fully clone the drive to a much smaller one but apparently you can't swap the boot drive in GCE (which makes no sense to me). Is there a way to achieve that or do I need to create new workers using the images? If so, is there any other config I need to copy to the new instance in order for it to be automatically joined to the cluster?
Dataproc does not support VM configuration changes in running clusters.
I would advise you to delete the old cluster and create a new one with the worker disk size that you need.
I ended up creating a ticket with GCP support - https://issuetracker.google.com/issues/120865687 - to get an official answer to that question. Got an answer that this is not possible currently but should be available shortly (within months) in the beta GCP CLI, possibly in the Console at a later date as well.
Went ahead with a complete rebuild of the cluster.

Google Compute Engine - how many instances should I purchase in order to host two websites?

I want to host two websites, one using the main domain and the other using a subdomain.
I would like to know how many instances I should purchase in order to host the two websites mentioned above.
It always depends on the workload and host specifications.
If you have a static website, then I would suggest using Google App Engine instead of Google Compute Engine. With Compute Engine you certainly have more control over the components, but you are responsible for keeping things running. In contrast, Google App Engine manages the system for you. So if you have no idea about the workload or about managing VMs, you should go with Google App Engine.