When I run my Docker containers in Google Cloud Run, any disk space they use comes from the available memory.
I'm running several self-hosted GitHub Actions runners on a single local server, and over the past year they have worn out my SSD. The thing is, all the data they save is pointless. None of it needs to persist: it exists for a few minutes during a build and is then deleted.
Is it possible to instead run all these Docker containers using memory for their disk space? That would improve performance and stop putting unnecessary wear on the drive.
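For concreteness, the kind of thing I have in mind is a RAM-backed mount such as Docker's --tmpfs flag; a minimal sketch, assuming the runner keeps its scratch data under /home/runner/_work (the path, size, and image name here are just placeholders for my setup):

    # Back the runner's work directory with RAM instead of the SSD
    docker run -d --tmpfs /home/runner/_work:rw,size=4g my-runner-image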
I am trying to pack as many LAMP-stack containers running WordPress as possible onto a single Docker host. According to many sources it is possible to fit over 500 containers on one Docker server, for example:
Is there a maximum number of containers running on a Docker host?
However, they most likely discuss empty Docker containers rather than LAMP-stack ones (Apache + MySQL).
I have managed to fit 120 containers running WordPress without seeing a significant issue with RAM or CPU. I should be able to fit more, but beyond that point response times degrade noticeably.
A colleague has mentioned that MySQL performs significant I/O, which could run into limits on file descriptors / open files / fs.aio-max-nr.
Could this be the case? Could it be that the number of open files in each container limits the total number of containers?
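For reference, these are the knobs I have been looking at so far; a quick sketch of how to inspect them (the docker run line is just an illustration of where a per-container limit can be raised):

    cat /proc/sys/fs/file-nr            # allocated / unused / max file handles, system-wide
    sysctl fs.file-max fs.aio-max-nr    # kernel-wide open-file and async-IO limits
    ulimit -n                           # per-process limit in the current shell
    docker run --ulimit nofile=65536:65536 ...   # raise the limit for one container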
I run a fairly customized cluster for processing large amounts of scientific data, based on a basic LAMP design. In general, I run a separate MySQL server with around 128 GB of RAM and about 1 TB of storage. Separately, I run a head node that serves as an NFS mount point for the input data of my process, and as a web server to display results. Finally, I typically have a few compute nodes that get their jobs from a MySQL table, fetch the data from NFS, do some heavy lifting, and then put results into MySQL.
I have come across a dataset I would like to process which is pretty large (1 TB of input data), and I don't really have the hardware on hand to handle it. As a result, I began investigating Google Compute Engine and the prospect of scaling instances to process these data quickly, with the results stored in a MySQL instance. Upon completion, the MySQL tables could be dumped from the cloud and loaded locally for analysis. I would have no problem deploying a MySQL server, along with the rest of the LAMP pieces and the compute nodes, but I can't quite figure out how to do this in the cloud.
A major sticking point seems to be the lack of a read/write NFS share that would allow me to get the data onto several instances, crunch it, then push the results to MySQL. This is a necessary step for me: I could queue hundreds of jobs from the web server, then have the instances (as many as 50-100) connect to a centralized MySQL instance to find out which jobs they need to do and where the data is, process the data (there is a file conversion involved, which is what makes the write access necessary), and load the results into MySQL. I hope I'm explaining my situation clearly. This seems like a great example of a CPU-intensive process that would scale nicely in the cloud; I just can't seem to put all the pieces together. Any input is appreciated!
It sounds quite possible; I've been doing similar things in GCE for a while now.
NFS mount - you just need to configure it as you would normally: set up the NFS server on the head node, then configure the clients on the slave nodes to mount it. Here and here are some basic configuration instructions for CentOS 6 that I used to get NFS up and running.
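As a rough sketch (the /data path, head-node hostname, and 10.0.0.0/24 range are placeholders for your own setup), the two sides look something like this on CentOS 6:

    # On the head node (NFS server): export the data directory
    echo "/data 10.0.0.0/24(rw,sync,no_root_squash)" >> /etc/exports
    service nfs start
    exportfs -ra

    # On each compute node (NFS client): mount the export
    mkdir -p /data
    mount -t nfs head-node:/data /data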
Setting up a LAMP stack is very straightforward. These machines run pretty much vanilla Linux distros, so you can just use yum or apt-get to install components.
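For example, something along these lines gets the basic pieces in place (package names vary slightly between distros and versions):

    # CentOS / RHEL
    yum install -y httpd mysql-server php php-mysql
    # Debian / Ubuntu
    apt-get install -y apache2 mysql-server php5 php5-mysql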
For the cluster, you will probably end up with one image for the head node that you use once, and another image for the slave nodes that you replicate for each compute node.
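With the current gcloud tooling that replication step looks roughly like this (the image, disk, and instance names are placeholders, and the exact flags depend on your gcloud version):

    # Capture the configured slave node's disk as a reusable image
    gcloud compute images create slave-image --source-disk=slave-template-disk --source-disk-zone=us-central1-a
    # Stamp out compute nodes from that image
    gcloud compute instances create node-1 node-2 node-3 --image=slave-image --zone=us-central1-a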
For the scheduler, I've used Condor and SGE successfully, but I'm sure the others would work just as well.
Hope this helps.
I have gone over the pricing pages and documentation so many times but still do not understand the pricing...
I picked a bare minimum server setup (CPU, RAM, etc). I am using this server as a development server (eventually) so it will be actively used about 6-8 hours a day, 5 days per week...when I entered these values in their "cost calculator" the result was a few bucks a month...perfect!
However I have been up and running for less than a week and already the price is $0.65 with a usage of 2,880.00 Minutes?!?!?!
So I am not being billed only for activity but for server uptime, entirely??? So even if the server sits idle, I am getting charged? Is there a way I can disable the instance during non-work hours? Re-enabling it when I arrive in the morning?
EDIT | how to stop compute engine instance without terminating the instance?
This may have answered my questions...
As the other question answered, you are billed by the minute while your server is running, whether or not it is actively using the CPU.
At the moment, there's no way to leave a server shut down and restart it later; you can use a persistent boot disk to store your development state and create/delete an instance (with the same name) each day that you want to use your server.
To use a persistent boot disk like this, you'll want to make sure that the "Delete boot disk when instance is deleted" checkbox is UNCHECKED -- you want your boot disk to stick around after the instance is deleted. The next time you create your instance, select "Use existing disk", and select the same disk.
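On the command line the same daily routine looks roughly like this (instance and disk names are placeholders, and flags can differ between gcloud versions):

    # Evening: delete the instance but keep its boot disk
    gcloud compute instances delete dev-box --keep-disks=boot
    # Morning: recreate the instance on the same disk, without auto-delete
    gcloud compute instances create dev-box --disk=name=dev-box-disk,boot=yes,auto-delete=no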
You'll still pay $0.04/GB/month for the disk storage all the time, but you'll only pay for the instance running when you need it.
If you're concerned about forgetting to shut the machine down, you can add a cron job that runs every 10 minutes, checks that the load on the machine is below 0.05 and that no one is logged in, and then runs "shutdown -p now" to power it off.
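A rough sketch of such a cron entry, using the threshold and the "shutdown -p now" command mentioned above (adjust the path, schedule, and numbers to taste):

    # /etc/cron.d/auto-shutdown -- every 10 minutes, power off if the box is idle
    */10 * * * * root awk '{exit !($1 < 0.05)}' /proc/loadavg && [ -z "$(who)" ] && shutdown -p now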
I have created a Google Compute Engine instance with CentOS and added some stuff there, such as Apache, Webmin, ActiveCollab, Gitolite, etc.
The problem is that the VM is always running out of memory because the RAM is too low.
How do I change the assigned RAM in Google Compute Engine?
Do I have to copy the VM to another one with more RAM? If so, will it copy all the contents of my CentOS installation?
Can anyone give me some advice on how to get more RAM without having to reinstall everything?
Thanks
The recommended approach for manually managed instances is to boot from a Persistent root Disk. When your instance has been booted from Persistent Disk, you can delete the instance and immediately create a new instance from the same disk with a larger machine type. This is similar to shutting down a physical machine, installing faster processors and more RAM, and starting it back up again. This doesn't work with scratch disks because they come and go with the instance.
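In gcloud terms that resize amounts to something like the following (the instance name, boot-disk name, zone, and machine type are placeholders):

    # Delete the instance but keep its persistent boot disk
    gcloud compute instances delete my-vm --zone=us-central1-a --keep-disks=boot
    # Recreate it on the same disk with a larger machine type
    gcloud compute instances create my-vm --zone=us-central1-a --machine-type=n1-highmem-2 --disk=name=my-vm-disk,boot=yes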
Using Persistent Disks also enables snapshots, which allow you to take a point-in-time snapshot of the exact state of the disk and create new disks from it. You can use them as backups. Snapshots are also global resources, so you can use them to create Persistent Disks in any zone. This makes it easy to migrate your instance between zones (to prepare for a maintenance window in your current zone, for example).
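A sketch of that flow with gcloud (disk, snapshot, and zone names are placeholders):

    # Take a point-in-time snapshot of the boot disk
    gcloud compute disks snapshot my-vm-disk --zone=us-central1-a --snapshot-names=my-vm-backup
    # Materialize a new disk from it in a different zone
    gcloud compute disks create my-vm-disk-eu --source-snapshot=my-vm-backup --zone=europe-west1-b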
Never store state on scratch disks. If the instance stops for any reason, you've lost that data. For manually configured instances, boot them from a Persistent Disk. For application data, store it on Persistent Disk, or consider using a managed service for state, like Google Cloud SQL or Google Cloud Datastore.
In our application we use an EBS volume for a MySQL database. Eventually we ran out of space and had to move to a bigger volume for the DB.
We used the AWS console to create a new volume from a snapshot of the previous one.
MySQL was stopped during the switch, and now we are using the new, bigger EBS volume.
However, we have seen a noticeable degradation in database performance. We are not sure how this could happen, since in theory we are using the same MySQL configuration and the same database.
Is it possible that maybe we have to rebuild indexes or re-optimize tables? I am not sure if that would be worth anything, so we haven't tried it yet because we are afraid it could slow down the database even more, and our application cannot be stopped easily, as it runs 24/7.
Can anyone help please?
When you start using a new EBS volume (whether created from a snapshot or created empty) there is always a first-use performance penalty for each block. This will manifest itself as slow performance from your MySQL database using the new volume.
You can "dd" the EBS volume to /dev/null to make sure all blocks have been hit. Here's an article I wrote on how to do this: http://alestic.com/2010/03/ebs-volume-initialization-from-snapshot
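The pre-warm itself is a one-liner; a sketch, assuming the new volume is attached as /dev/xvdf (substitute your actual device):

    # Read every block once so EBS faults them in from the snapshot
    sudo dd if=/dev/xvdf of=/dev/null bs=1M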
There may also be a performance hit while the database is brought into memory through queries. This is a standard IO issue that would happen when restarting the database on any platform and is unrelated to EC2 or EBS.
If the performance stays slow after everything has warmed up and should be humming, then you may try things like:
Create a new EBS volume and test it just in case the slow one was using defective hardware at EC2.
Move your EC2 instance to new hardware just in case your neighbors on the current hardware are network heavy and interfering with your EBS IO. This can be done through a simple stop/start (I wrote about it here: http://alestic.com/2011/02/ec2-move-hardware )
Move your database to 4-8 EBS volumes configured in RAID-0. This is a common approach to smoothing out EBS IO volatility (a sketch follows this list).
Consider trying out Amazon RDS. Some people have found that they get better performance with Amazon taking care of this part of the infrastructure.
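For the RAID-0 option above, a minimal sketch with mdadm (the device names are placeholders for however your EBS volumes are attached, and the mount point assumes a default MySQL datadir):

    # Stripe four EBS volumes into one logical device
    sudo mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi
    sudo mkfs.ext4 /dev/md0
    sudo mount /dev/md0 /var/lib/mysql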
Note also that you may experience IO slowness while an EBS snapshot is being created from an EBS volume that is being written to heavily. One approach to alleviate this is to replicate the master MySQL database to a different server and create the snapshots on the second server's EBS volume(s).