Image push/pull very slow with OpenShift Origin 3.11

I'm setting up an OKD Cluster in a VM via oc cluster up. My VM has 6 GB of RAM and 2 CPUs. Interactions with the internal image registry are very, very slow (multiple minutes to pull or push an image, e.g. when building an application via S2I). At the same time, htop shows me a CPU utilization of 100% for both CPUs within the VM. Is there any way to avoid this issue?
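For reference, a few things that may be worth checking while a push or pull is running. This is only a rough diagnostic sketch and assumes the default oc cluster up layout, where the registry runs as the docker-registry deployment config in the default project, and that Docker is the container runtime:

$ docker info | grep -i "storage driver"            # vfs or loopback devicemapper storage is known to be very slow
$ docker stats --no-stream                          # shows which container is consuming the CPU
$ oc logs dc/docker-registry -n default --tail=50   # registry-side errors or retries during the push

If the storage driver turns out to be vfs, switching the Docker daemon to overlay2 usually helps considerably.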

Related

Why is GCE VM with COS image faster?

I have two GCE instances: one with a COS image running CentOS 7 in a container. Let's call it VM1. And another with a CentOS 7 image directly on it. Let's call it VM2. Both of them run the same PHP app (Laravel).
VM1
Image: COS container with CentOS 7
Type: n1-standard-1 (1 vCPU, 3.75 GB)
Disk: persistent disk 10 GB
VM2
Image: CentOS 7
Type: n1-standard-1 (2 vCPUs, 3.75 GB)
Disk: persistent disk 20 GB
As you can see, VM2 has a slightly better spec than VM1. So it should perform better, right?
That said, when I request a specific endpoint, VM1 responds in ~1.6 s while VM2 responds in ~10 s, roughly 6x slower. The endpoint does exactly the same thing on both VMs: it queries the database on a Cloud SQL instance and returns the results. Nothing abnormal.
So it's almost the same hardware, the same guest OS and the same app. The only difference is that VM1 is running the app via Docker.
I have searched and tried to debug many things, but I have no idea what is going on. Maybe I'm misunderstanding something.
My best guess is that the COS image has some optimization that makes the app run faster, but I don't know what exactly. At first I thought it could be a disk I/O problem, but disk utilization looks fine on VM2. Then I thought it could be some OS configuration, so I compared the sysctl settings of both VMs; there are a lot of differences there as well, but I'm not sure which one would be the key optimization.
My questions are: why is there such a difference, and what can I change to make VM2 as fast as VM1?
First of all, Container-Optimized OS is based on the open-source Chromium OS; it is not CentOS, it is essentially a different Linux distribution.
Having said that, you need to understand that this OS is optimized for running Docker containers.
This means that Container-Optimized OS instances come pre-installed with the Docker runtime and cloud-init, and that is basically all the OS contains, because it is a minimalistic container-optimized operating system.
So this OS doesn't waste resources on all the applications, libraries and packages that CentOS ships with, which can consume extra resources.
I installed both OSs in my own project to check the disk usage of each: Container-Optimized OS from Google uses only 740 MB, while CentOS consumes 2.1 GB.
$ hostnamectl | grep Operating
Operating System: Container-Optimized OS from Google
$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/root 1.2G 740M 482M 61% /
$ hostnamectl | grep Operating
Operating System: CentOS Linux 7 (Core)
$ df -h
/dev/sda2 20G 2.1G 18G 11% /
I wasn't able to use a smaller persistent disk with CentOS; the minimum is 20 GB.
On the other hand, containers let your apps run with fewer dependencies on the host virtual machine (VM), run independently from other containerized apps that you deploy to the same VM instance, and make better use of the available resources.
I don't have much experience with GCP (just Azure and AWS), but this problem may be about latency, so you need to confirm that all your assets are in the same region.
You can try to measure the response time from each VM to your database. With this information you will be able to tell whether or not this is a latency problem.
https://cloud.google.com/compute/docs/regions-zones/
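To make that check concrete, here is a minimal sketch: measure the network round trip and a trivial query from each VM, assuming a MySQL-based Cloud SQL instance at the hypothetical address 10.10.0.3 (use psql instead if your Laravel app talks to PostgreSQL, and substitute your own user name):

$ ping -c 5 10.10.0.3                                     # raw network round-trip time to the database
$ time mysql -h 10.10.0.3 -u appuser -p -e "SELECT 1;"    # connection plus trivial query time

If the times differ noticeably between VM1 and VM2, the problem is latency or networking rather than the guest OS.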

Horizontal pod autoscaling in OpenShift

Does OpenShift actively monitor CPU for all the running pods, or just the first one that was started?
I am running a service that is configured to use horizontal autoscaling.
I have Hawkular Metrics and Heapster set up and working as expected.
I have set my resource limits and the target CPU utilization at 50%.
Once the pod's CPU goes above 50%, I can see it scale up to 3 pods, which is the maximum in my configuration.
My question is: does it monitor all 3 pods at this point?
Also, is there a way to scale up in steps, i.e. bring up one more pod, and if CPU is still above 50% bring up another one, and so on? Or is the maximum configured number of pods brought up immediately once the CPU utilization on the first pod goes above 50%?
My understanding is that it's by pod.
Ex. Pod 1 goes over your CPU limit, so Pod 2 is deployed. Five minutes later, Pod 2 goes over your CPU limit, and Pod 3 is deployed.
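For completeness, a minimal sketch of how the autoscaler described in the question could be configured on OpenShift 3.x, assuming the service runs from a deployment config named myservice (the name is a placeholder):

$ oc autoscale dc/myservice --min 1 --max 3 --cpu-percent=50   # create the HorizontalPodAutoscaler
$ oc get hpa                                                   # shows current vs. target CPU and the replica count

The controller then recalculates the desired replica count periodically and keeps it between the configured minimum and maximum.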

OpenShift: combine 3 gears into 1 gear on the free plan

I have an OpenShift account and I am setting up my application, which requires more space than 1 GB. As stated in this link, each gear has 1 GB of disk space,
and a maximum of 3 gears is allowed. Is it possible to combine the 3 gears into 1 gear with 3 GB of space? Currently I am on the free plan with 1 gear of 1 GB, which holds two cartridges: JBoss and a PostgreSQL database. Combined, they take more than 1 GB, so I can't deploy the application due to the space constraint. Any direction would really help me.
Edit:
I have created the scaled application on the free plan. As per the OpenShift documentation, each gear can hold at most 1 GB of space. In my case there are 2 JBoss gears (scaled), a load balancer and 1 PostgreSQL database, so one gear holds JBoss + the load balancer combined in 1 GB, the 2nd gear holds PostgreSQL with 1 GB, and the 3rd gear holds JBoss with 1 GB (scalable).
Note: in the case above, the minimum number of gears allowed for scaling is 2 and not 3, since one gear is already allocated to the database. But the maximum number of gears allowed for scaling is 3, and I don't know how that works.
From the OpenShift admin panel:
JBoss Application Server 7 using 2
OpenShift is configured to scale this cartridge with the web proxy
HAProxy. OpenShift monitors the incoming web traffic to your
application and automatically adds or removes copies of your cartridge
(each running on their own gears) to serve requests as needed.
Control the number of gears OpenShift will use for your cartridge:
Minimum 2 (dropdown) and Maximum 3 (dropdown) small gears
Each scaled gear is created the same way - the normal post, pre, and
deploy hooks are executed. Each cartridge will have its own copy of
runtime data, so be sure to use a database if you need to share data
across your web cartridges.
If you deploy as a scaled application then the database would reside on a separate gear from your JBoss application, so the database would have 1GB of disk space all to itself. So you would basically have 1GB for your DB, and 1GB for JBoss. If that isn't enough then you would have to upgrade to a paid plan in order to have more disk space available on an individual gear.
I ran into the same issue and found that this was not well documented or at least not intuitively described, since 3*1GB initially seems to imply that you might just have 3GB total disk space, which is not quite the case.
Here is a quote from the documentation on scalable applications (if the application is not scalable you only have 1 gear anyway):
The HAProxy cartridge sits between your application and the public internet and routes web traffic to your web cartridges. When traffic increases, HAProxy notifies the OpenShift servers that it needs additional capacity. OpenShift checks that you have a free gear (out of your remaining account gears) and then creates another copy of your web cartridge on that new gear. The code in the git repository is copied to each new gear, but the data directory begins empty. When the new cartridge copy starts it will invoke your build hooks and then the HAProxy will begin routing web requests to it. If you push a code change to your web application all of the running gears will get that update.
Source: https://developers.openshift.com/en/managing-scaling.html (in the section "How Scaling Works")
To summarise: Git data is copied to all gears, so you have 3 x 1 GB of identical Git data. #mbaird pointed out that this is not true for user data, which is not replicated. Also, depending on your cartridge, in a scaled application your database might live on a separate gear.
For static content hosting, it seems that if you need more disk space or inodes you have to change to a different plan or spread your data across multiple applications.
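For anyone doing this with the rhc client, here is a rough sketch of creating a scalable application and controlling the gear range. The app and cartridge names are examples; check rhc cartridge list for what your account actually offers:

$ rhc app create myapp jbossas-7 postgresql-9.2 -s         # -s makes the app scalable, so the DB gets its own gear
$ rhc cartridge scale jbossas-7 -a myapp --min 2 --max 3   # gear range for the web cartridge
$ rhc app show myapp --gears                               # list the gears the app is actually using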

Google Compute Instance 100% CPU Utilisation

I am running an n1-standard-1 (1 vCPU, 3.75 GB memory) Compute Engine instance. Around 80 users of my Android app are online right now, the CPU utilisation of the instance is 99%, and my app has become less responsive. Kindly suggest a workaround. If I need to upgrade, can I do that with the same instance, or does a new instance need to be created?
Since your app is running already and users are connecting to it, you don't want to do the following process:
shut down the VM instance, keeping the boot disk and other disks
boot a more powerful instance, using the boot disk from step (1)
attach and mount any additional disks, if applicable
Instead, you might want to do the following:
create an additional VM instance with similar software/configuration
create a load balancer and add both the original and new VM to it as a backend
change your DNS name to point to the load balancer IP instead of the original VM instance
Now your users will be sent to whichever VM is less loaded when they access the application, and you can add more VMs if your traffic increases.
You did not describe your application in detail, so it's unclear whether each VM holds local state (e.g., runs a database) or whether the database runs externally. You will still need to figure out how to manage stateful systems such as the database or user-uploaded data across all the VM instances, which is hard to advise on given the little information in your question.
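As a concrete sketch of the load-balancer route (all names, the zone and the image are placeholders; this assumes the original VM is app-vm-1, that an image app-image was captured from its boot disk, and that the app serves HTTP on port 80):

$ gcloud compute instances create app-vm-2 --zone us-central1-a --machine-type n1-standard-1 --image app-image
$ gcloud compute instance-groups unmanaged create app-group --zone us-central1-a
$ gcloud compute instance-groups unmanaged add-instances app-group --zone us-central1-a --instances app-vm-1,app-vm-2
$ gcloud compute health-checks create http app-hc --port 80
$ gcloud compute backend-services create app-backend --protocol HTTP --health-checks app-hc --global
$ gcloud compute backend-services add-backend app-backend --instance-group app-group --instance-group-zone us-central1-a --global
$ gcloud compute url-maps create app-lb --default-service app-backend
$ gcloud compute target-http-proxies create app-proxy --url-map app-lb
$ gcloud compute forwarding-rules create app-fr --global --target-http-proxy app-proxy --ports 80

The forwarding rule's IP address is what your DNS record should then point to.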

KVM/QEMU maximum VM count limit

For a research project I am trying to boot as many VMs as possible, using the Python libvirt bindings, under KVM on Ubuntu Server 12.04. All the VMs are set to idle after boot and to use a minimal amount of memory. At most I was able to boot 1000 VMs on a single host, at which point the kernel (Linux 3.x) became unresponsive, even though both CPU and memory usage were nowhere near the limits (48 AMD cores, 128 GB of memory). Before that point, the booting process had already become successively slower after a couple of hundred VMs.
I assume this must be related to the KVM/QEMU driver, as the Linux kernel itself should have no problem handling this many processes. However, I did read that the QEMU driver is now multi-threaded. Any ideas what the cause of this slowness may be, or at least where I should start looking?
You are booting all the VMs using qemu-kvm, and after a few hundred VMs it becomes successively slower. When you notice the slowdown, try booting the next VMs with plain QEMU instead of KVM and check whether the same slowness appears. My guess is that after that many VMs the hardware virtualization support is exhausted, because KVM is essentially a thin software layer over a few added hardware registers. So KVM might be the culprit here.
Also, what is the purpose of this experiment?
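One quick way to test that guess is to boot the same minimal guest once with and once without KVM acceleration and compare. This is only a sketch; /tmp/tiny.img is a placeholder for any small test image, and it assumes the guest powers itself off after booting so that time measures a full boot:

$ time qemu-system-x86_64 -m 64 -nographic -drive file=/tmp/tiny.img,format=raw -enable-kvm
$ time qemu-system-x86_64 -m 64 -nographic -drive file=/tmp/tiny.img,format=raw

Without -enable-kvm, QEMU falls back to pure software emulation (TCG), so if the slowdown pattern is the same in both cases, KVM itself is probably not the bottleneck.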
The following virtual hardware limits for guests have been tested. We ensure host and VMs install and work successfully, even when reaching the limits and there are no major performance regressions (CPU, memory, disk, network) since the last release (SUSE Linux Enterprise Server 11 SP1).
Max. Guest RAM Size --- 512 GB
Max. Virtual CPUs per Guest --- 64
Max. Virtual Network Devices per Guest --- 8
Max. Block Devices per Guest --- 4 emulated (IDE), 20 para-virtual (using virtio-blk)
Max. Number of VM Guests per VM Host Server --- Limit is defined as the total number of virtual CPUs in all guests being no greater than 8 times the number of CPU cores in the host
For more limitations of KVM, please refer to this document link.
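If it helps to narrow down where the slowdown starts, a rough timing loop over transient domains could be used. This assumes minimal domain XML files vm-1.xml ... vm-1000.xml have already been generated, e.g. by the same Python script that drives the libvirt bindings:

$ for i in $(seq 1 1000); do /usr/bin/time -f "vm-$i started in %e s" virsh create vm-$i.xml > /dev/null; done
$ virsh list --all | wc -l        # how many domains libvirt is currently tracking

Plotting the per-VM start times should show whether the slowdown is gradual (resource exhaustion) or hits a sharp threshold (a fixed limit somewhere in KVM, libvirt or the kernel).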