KVM/QEMU maximum VM count limit

For a research project I am trying to boot as many VMs as possible, using the Python libvirt bindings, in KVM under Ubuntu Server 12.04. All the VMs are set to idle after boot and to use a minimal amount of memory. At most I was able to boot 1000 VMs on a single host, at which point the kernel (Linux 3.x) became unresponsive, even though both CPU and memory usage were nowhere near the limits (48-core AMD, 128 GB memory). Before that point, booting became progressively slower after a couple of hundred VMs.
I assume this must be related to the KVM/QEMU driver, as the Linux kernel itself should have no problem handling so few processes. However, I did read that the QEMU driver is now multi-threaded. Any ideas what the cause of this slowness may be, or at least where I should start looking?
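For reference, the boot loop looks roughly like this; a minimal sketch using the Python libvirt bindings (the domain XML, memory size, and image paths are illustrative, not my exact configuration):

# Boot N transient VMs and record per-VM boot latency, which is how
# the progressive slowdown described above shows up.
import time
import libvirt

DOMAIN_XML = """
<domain type='kvm'>
  <name>stress-{i}</name>
  <memory unit='MiB'>64</memory>
  <vcpu>1</vcpu>
  <os><type arch='x86_64'>hvm</type></os>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/minimal-{i}.qcow2'/>
      <target dev='vda' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""

conn = libvirt.open("qemu:///system")
for i in range(1000):
    start = time.monotonic()
    conn.createXML(DOMAIN_XML.format(i=i), 0)  # create and start a transient VM
    print(f"VM {i} booted in {time.monotonic() - start:.2f}s")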

You are booting all the VMs using qemu-kvm, and after a few hundred VMs booting becomes progressively slower. When you notice that, stop using KVM and boot with plain QEMU instead; if the slowness disappears, KVM is the likely culprit. My guess is that after that many VMs, the hardware support KVM relies on is exhausted, because KVM is essentially a thin software layer over a few added hardware registers. So KVM might be the culprit here.
Also, what is the purpose of this experiment?

The following virtual hardware limits for guests have been tested. We ensure that host and VMs install and work successfully, even when reaching the limits, and that there are no major performance regressions (CPU, memory, disk, network) since the last release (SUSE Linux Enterprise Server 11 SP1).
Max. Guest RAM Size --- 512 GB
Max. Virtual CPUs per Guest --- 64
Max. Virtual Network Devices per Guest --- 8
Max. Block Devices per Guest --- 4 emulated (IDE), 20 para-virtual (using virtio-blk)
Max. Number of VM Guests per VM Host Server --- Limit is defined as the total number of virtual CPUs in all guests being no greater than 8 times the number of CPU cores in the host
By that last rule, the 48-core host in the question would support at most 48 × 8 = 384 single-vCPU guests. For more KVM limitations, please refer to this document link.

Why is GCE VM with COS image faster?

I have two GCE instances: one with a COS image running CentOS 7 in a container. Let's call it VM1. And another with a CentOS 7 image directly on it. Let's call it VM2. Both of them run the same PHP app (Laravel).
VM1
Image: COS container with CentOS 7
Type: n1-standard-1 (1 vCPU, 3.75 GB)
Disk: persistent disk 10 GB
VM2
Image: CentOS 7
Type: n1-standard-1 (2 vCPU, 3.75 GB)
Disk: persistent disk 20 GB
As you can see, VM2 has a slightly better spec than VM1. So it should perform better, right?
That said, when I request a specific endpoint, VM1 responds in ~1.6 s, while VM2 responds in ~10 s, about 6x slower. The endpoint does exactly the same thing on both VMs: it queries the database on a GCP SQL instance and returns the results. Nothing abnormal.
So, it's almost the same hardware, it's the same guest OS and the same app. The only difference is that the VM1 is running the app via Docker.
I searched and tried to debug many things, but I have no idea what is going on. Maybe I'm misunderstanding something.
My best guess is that the COS image has some optimization that makes the app run faster, but I don't know what exactly. At first I thought it could be a disk I/O problem, but disk utilization is OK on VM2. Then I thought it could be some OS configuration, so I compared the sysctl settings of both VMs; there are a lot of differences there as well, but I'm not sure which setting could be the key to the optimization.
My questions are: why this difference, and what can I change to make VM2 as fast as VM1?
First of all, Container-Optimized OS is based on the open-source Chromium OS; it is not CentOS, but another Linux distribution entirely.
Having said that, you need to understand that this OS is optimized for running Docker containers.
That means Container-Optimized OS instances come pre-installed with the Docker runtime and cloud-init, and that is basically all the OS contains, because it is a minimalistic, container-optimized operating system.
So this OS doesn't waste resources on all the applications, libraries, and packages that CentOS ships with, which can consume extra resources.
I installed both OSs in my own project to check the disk usage of each: Container-Optimized OS from Google uses only 740 MB, while CentOS consumes 2.1 GB.
$ hostnamectl | grep Operating
Operating System: Container-Optimized OS from Google
$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/root       1.2G  740M  482M  61% /
$ hostnamectl | grep Operating
Operating System: CentOS Linux 7 (Core)
$ df -h
/dev/sda2        20G  2.1G   18G  11% /
I wasn't able to use a small persistent disk with CentOS; the minimum is 20 GB.
On the other hand, containers let your apps run with fewer dependencies on the host virtual machine (VM), run independently from other containerized apps you deploy to the same VM instance, and make better use of the available resources.
I don't have much experience with GCP (just Azure and AWS), but this problem may be about latency, so you need to confirm that all your assets are in the same region.
You can try measuring the response time from each VM to your database. With this information you will be able to tell whether or not the problem is latency.
https://cloud.google.com/compute/docs/regions-zones/
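One way to do that measurement, as a rough sketch in Python (the host and port are placeholders for your Cloud SQL instance, not real values):

# Measure TCP connect latency from this VM to the database a few times;
# a large gap between the two VMs' numbers would point at network latency.
import socket
import time

DB_HOST = "10.0.0.5"  # placeholder: your Cloud SQL instance IP
DB_PORT = 3306        # MySQL default; use 5432 for PostgreSQL

samples = []
for _ in range(10):
    start = time.monotonic()
    with socket.create_connection((DB_HOST, DB_PORT), timeout=5):
        pass
    samples.append((time.monotonic() - start) * 1000)

print(f"min/avg/max connect time: {min(samples):.1f}/"
      f"{sum(samples) / len(samples):.1f}/{max(samples):.1f} ms")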

The total amount of free network bandwidth an Always Free compute instance can use in a month or some other period of time

The Oracle Cloud Infrastructure Always Free services documentation site says this:
All tenancies get two Always Free Compute virtual machine (VM) instances.
You must create the Always Free Compute instances in your home region.
Details of the Always Free Compute instance shape:
Shape: VM.Standard.E2.1.Micro
Processor: 1/8th of an OCPU with the ability to use additional CPU resources
Memory: 1 GB
Networking: Includes one VNIC with one public IP address and up to 480 Mbps network bandwidth
Operating System: Your choice of one of the following Always Free-eligible operating systems: Oracle Linux (including Oracle Autonomous Linux), Canonical Ubuntu Linux, CentOS Linux
"One VNIC with one public IP address and up to 480 Mbps network bandwidth" describes the network speed, not an amount limit, from my point of view. So the question is: how much bandwidth can one Always Free compute instance use for free in a month or some other period of time?
Oracle's press release for the Always Free launch explicitly mentions:
The new Always Free program includes the essentials users need to build and test applications in the cloud: Oracle Autonomous Database, Compute VMs, Block Volumes, Object and Archive Storage, and Load Balancer. Specifications include:
...
1 Load Balancer, 10 Mbps bandwidth
10 TB/month Outbound Data Transfer

EVE-NG QEMU based nodes are not starting

Setup: Dell PowerEdge R620, 128 GB RAM, 12-core server.
VMware ESXi 6.5 based setup: one VM for EVE-NG (500 GB SSD + 32 GB allocated RAM).
A second VM for Windows Server 2016 (100 GB HDD + 16 GB RAM).
On the Windows client, I can access EVE-NG via Firefox and PuTTY. I have tried Cisco Dynamips images and those nodes start (I can telnet with PuTTY and change the config).
When I try to create nodes based on QEMU images (Cisco, Aruba, Palo Alto, etc.), the nodes do not start. I have followed the guidelines for qcow2 names and checked multiple sources. I have also edited the nodes and tried playing with all possible settings.
I have reinstalled EVE-NG on ESXi as well, but the issue remains the same.
Thanks a lot for your help and support.
I finally found an answer in the EVE-NG cookbook: https://www.eve-ng.net/wp-content/uploads/2020/06/EVE-Comm-BOOK-1.09-2020.pdf
Page 33, Step 6: IMPORTANT. Open the VM settings. Set the quantity of CPUs and the number of cores per socket. Set the Intel VT-x/EPT hardware virtualization engine to ON (checked).
Once I checked this field, all nodes started to work.
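If you want to confirm the setting took effect, here is a quick sketch (run inside the EVE-NG VM) that checks whether the hardware virtualization flags QEMU/KVM nodes need are exposed to the guest:

# QEMU/KVM nodes need vmx (Intel) or svm (AMD) in the guest's CPU flags;
# these only appear when ESXi passes hardware virtualization through.
with open("/proc/cpuinfo") as f:
    flags = f.read().split()

if "vmx" in flags or "svm" in flags:
    print("Hardware virtualization is exposed to this VM")
else:
    print("No vmx/svm flag: enable VT-x/EPT passthrough in the VM settings")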
In the CPU settings, also check "Enable Virtualized CPU Performance Counters".
This helped me a bit. I was using a Mac and struggling to get the consoles up. Posting the steps below in case it helps someone like me:
Shut down the VM using the command shutdown -h now
Go to the VM settings
Click Advanced -> Check "Remote display over VNC"
Check Enable IOMMU in this virtual machine

PM2 cluster mode: the ideal number of workers?

I am using PM2 to run my Node.js application.
When starting it in cluster mode with "pm2 start server -i 0", PM2 will automatically spawn as many workers as you have CPU cores.
What is the ideal number of workers to run and why?
Beware of the context switch
When running multiple processes on your machine, try to make sure each CPU core is kept busy by a single application thread at a time. As a general rule, you should look to spawn N-1 application processes, where N is the number of available CPU cores. That way, each process is guaranteed to get a good slice of one core, and there's one spare for the kernel scheduler to run other server tasks on. Additionally, try to make sure the server is running little or no work other than your Node.js application, so processes don't fight for CPU.
We made a mistake where we deployed two busy Node.js applications to our servers, both apps spawning N-1 processes each. The applications' processes started vehemently competing for CPU, resulting in CPU load and usage increasing dramatically. Even though we were running these on beefy 8-core servers, we were paying a noticeable penalty due to context switching. Context switching is the behaviour whereby the CPU suspends one task in order to work on another. When context switching, the kernel must suspend all state for one process while it loads and executes state for another. After simply reducing the number of processes the applications spawned, such that they each shared an equal number of cores, load dropped significantly.
https://engineering.gosquared.com/optimising-nginx-node-js-and-networking-for-heavy-workloads
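To apply the N-1 rule when launching, a minimal sketch (assuming pm2 is on the PATH and "server" is the app entry point from the question):

# Start PM2 with N-1 workers instead of one per core, leaving one core
# free for the kernel scheduler and other server tasks.
import os
import subprocess

workers = max(1, (os.cpu_count() or 1) - 1)  # the N-1 rule from the answer
subprocess.run(["pm2", "start", "server", "-i", str(workers)], check=True)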

How to prevent two CUDA programs from interfering

I've noticed that if two users try to run CUDA programs at the same time, it tends to lock up either the card or the driver (or both?). We need to either reset the card or reboot the machine to restore normal behavior.
Is there a way to get a lock on the GPU so other programs can't interfere while it's running?
Edit
OS is Ubuntu 11.10 running on a server. While there is no X Windows running, the card is used to display the text system console. There are multiple users.
If you are running on either Linux or Windows with the TCC driver, you can put the GPU into compute exclusive mode using the nvidia-smi utility.
Compute exclusive mode makes the driver refuse a context establishment request if another process already holds a context on that GPU. Any process trying to run on a busy compute exclusive GPU will receive a no device available error and fail.
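For example, a small sketch that sets the mode via nvidia-smi (requires root; on recent drivers the mode name is EXCLUSIVE_PROCESS, while the 2011-era drivers in the question used numeric flags such as nvidia-smi -g 0 -c 1):

# Put GPU 0 into exclusive-process compute mode, then read it back.
import subprocess

subprocess.run(["nvidia-smi", "-i", "0", "-c", "EXCLUSIVE_PROCESS"], check=True)
mode = subprocess.run(
    ["nvidia-smi", "-i", "0", "--query-gpu=compute_mode", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
).stdout.strip()
print(f"GPU 0 compute mode: {mode}")  # expect "Exclusive_Process"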
You can use something like Task Spooler to queue the programs and run them one at a time.
We use TORQUE Resource Manager, but it is harder to configure than ts. With TORQUE you can have multiple queues (e.g., one for CUDA jobs, two for CPU jobs) and assign a different job to each GPU.
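A queueing sketch with Task Spooler (the ts binary, packaged as tsp on some distros; the program paths are hypothetical):

# Enqueue two CUDA programs; ts runs one queued job at a time by default,
# so they never touch the GPU simultaneously.
import subprocess

for prog in ["./cuda_prog_a", "./cuda_prog_b"]:
    subprocess.run(["ts", prog], check=True)  # enqueues and returns a job ID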