Will a libvirt VM give better real-time performance than a QEMU VM? What are the reasons for that? - libvirt

I am comparing QEMU and libvirt for real-time (RT) performance.
I created an RT VM with plain QEMU, in which I pinned the CPUs and set the proper affinity.
I used the taskset command for pinning the CPUs.
In the same way, I defined a libvirt XML with CPU pinning and ran the same real-time workload.
I saw better performance from the libvirt VM. What could be the possible reason for that?
libvirt CPU pinning:
<cputune>
<vcpupin vcpu="0" cpuset="12"/>
<vcpupin vcpu="1" cpuset="13"/>
</cputune>
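One likely reason, offered as an assumption since the exact taskset invocation is not shown: running taskset against the whole QEMU process (e.g. taskset -c 12,13 qemu-system-x86_64 ...) gives every thread of that process (vCPU threads, the emulator thread, I/O threads) the same two-core affinity, whereas libvirt's <vcpupin> pins each vCPU thread individually to its own core. A rough sketch of reproducing the libvirt behaviour with plain QEMU, assuming a single KVM guest is running and its vCPU threads are named "CPU 0/KVM" as in recent QEMU builds (the TIDs below are placeholders):
# 1. List the threads of the running QEMU process and spot the vCPU threads
ps -T -o tid,comm -p $(pgrep -f qemu-system-x86_64) | grep 'CPU '
# 2. Pin each vCPU thread to its own isolated core, mirroring the XML above
taskset -cp 12 <tid_of_vcpu0>
taskset -cp 13 <tid_of_vcpu1>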

Related

Why do my google cloud compute instances always unexpectedly restart?

Help! Help! Help!
It is really annoying and I almost cannot bear it anymore! I'm using Google Cloud Compute Engine instances, but they often restart unexpectedly without any notification in advance. The restarts seem to happen randomly and I have no idea what's going wrong! I'm pretty sure that the instances are busy (CPU usage > 50% and all GPUs in use) when the restart happens. Could anyone please tell me how to solve this problem? Thanks in advance!
The issue is right here:
all GPUs are in use
If you check the official documentation about GPU:
GPU instances must terminate for host maintenance events, but can automatically restart. These maintenance events typically occur once per week, but can occur more frequently when necessary. You must configure your workloads to handle these maintenance events cleanly. Specifically, long-running workloads like machine learning and high-performance computing (HPC) must handle the interruption of host maintenance events. Learn how to handle host maintenance events on instances with GPUs.
This is because an instance that has a GPU attached cannot be live-migrated to another host for maintenance, as happens for the rest of the virtual machines. To get a physical GPU attached to the instance with bare-metal performance, you are using GPU passthrough, which sadly means that if the host has to go through maintenance, the VM goes down with it.
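If you want to confirm how an instance is configured to behave on host maintenance, a hedged check with the gcloud CLI (instance-name and the zone are placeholders for your own values):
gcloud compute instances describe instance-name --zone=us-central1-a --format="value(scheduling.onHostMaintenance)"
A value of TERMINATE instead of MIGRATE means the VM will indeed be stopped for host maintenance.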
This sounds like a preemptible VM instance.
Preemptible instances function like normal instances, but have the following limitations:
Compute Engine might terminate preemptible instances at any time due to system events. The probability that Compute Engine will terminate a preemptible instance for a system event is generally low, but might vary from day to day and from zone to zone depending on current conditions.
Compute Engine always terminates preemptible instances after they run for 24 hours.
To check whether your instance is preemptible using the gcloud CLI, just run
gcloud compute instances describe instance-name --format="(scheduling.preemptible)"
Result
scheduling:
preemptible: false
change "instance-name" to real name.
Or simply via UI, click on compute instance and scroll down:
To check for system operations performed on your instance, you can review it using following command:
gcloud compute operations list
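If the list is long, you can narrow it down to the affected instance; as a sketch (instance-name is a placeholder, and the ~ operator does a regex match on the operation's target URL):
gcloud compute operations list --filter="targetLink~instance-name"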

EC2 Instance is running very slow

I am running an EC2 instance with Ubuntu Server. Tomcat and MySQL are installed, and a Java web application has been deployed on it for about a month. It ran with great performance for almost a month, but now my application is responding very slowly.
Also, a point to note: earlier, when I logged into my Ubuntu server through PuTTY, it was quick, but now it takes a long time even when I enter the Ubuntu password.
Is there any solution?
I would start by checking memory/CPU/network availability to see whether one of them is the bottleneck.
Try following commands:
To check memory availability:
free -m
To check CPU usage:
top
To check network usage:
ntop
To check disk usage:
df -h
To check disk io operations:
iotop
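If you want a single, non-interactive snapshot of most of the above before digging deeper, something along these lines should work on a stock Ubuntu image (vmstat ships with the standard procps package):
free -m; uptime; df -h; vmstat 1 5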
Please also check whether you can log in to the machine quickly when your application is disabled. If login is still slow, you should contact EC2 support, report the poor performance, and ask for more resources to be assigned to that machine.
You can use the WAIT tool to diagnose what is wrong with your server or your application. The tool will gather information about CPU and memory utilization, running threads, etc.
In addition, I would definitely check the Tomcat application server with VisualVM or some other profiler. For configuring JMX for Tomcat, you can check the article here.
For network monitoring, the nload tool is worth your attention. You can launch it in screen so you can always check network utilization stats when the server is slow.
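As a sketch of that screen suggestion (the interface name eth0 is an assumption; adjust it to your NIC):
screen -dmS netmon nload eth0   # start nload detached in a named session
screen -r netmon                # reattach later to check the stats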
First, check whether any application is using too much CPU or memory. This can be done with the top command. Two simple shortcut keys may be helpful while using top: in the top output, pressing M sorts processes by memory usage, from highest to lowest, and pressing P sorts them by CPU usage, from highest to lowest.
If you are unable to find any suspicious application with top, you can use iotop, which shows disk I/O usage details.
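If you prefer non-interactive equivalents of those shortcuts (plain procps ps, so they can be run over a slow SSH session or from a script):
ps aux --sort=-%mem | head -n 10    # biggest memory consumers
ps aux --sort=-%cpu | head -n 10    # biggest CPU consumers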
I was facing the same issue; the solution that worked for me was to restart the EC2 instance.
Edit
Lately, I have figured out that this issue happens because fewer resources (memory, CPU) are available to the EC2 machine. So check the resources available to the EC2 machine.
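A quick, hedged way to compare what the instance type is supposed to provide with what the OS actually sees (the metadata URL is the standard EC2 IMDSv1 endpoint and may not answer if only IMDSv2 is allowed):
curl -s http://169.254.169.254/latest/meta-data/instance-type   # what AWS says you have
nproc                                                           # CPUs visible to the OS
free -m                                                         # memory visible to the OS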

Does QEMU use Tiny Code Generator even host and target(guest) are both the same architecture?

I know that QEMU generally uses a so-called dynamic translation technique: it translates instructions of the target machine into micro operations, and then translates these micro operations into host machine instructions via the Tiny Code Generator (TCG). That is:
instruction of target -> micro operations
micro operations -> TCG -> instruction of host
However, if the architectures of the target and host machine are the same, say both are x86, it theoretically does not need TCG to translate, since the instruction sets are the same. In this case, does QEMU still use TCG?
In addition to Robin's answer:
It is important to remember that unless you specify otherwise on the command line, QEMU will by default use TCG for translation.
For example, the command line below will start up QEMU in TCG mode, even when the host and target architectures are the same; in my case, x86_64:
./qemu-system-x86_64 -m 10G -machine pc-i440fx-2.5 -drive file=~/ubuntu16.04.server.qcow2,format=qcow2
If the QEMU start-up command were something like this:
./qemu-system-x86_64 -m 10G -machine pc-i440fx-2.5 -accel kvm -drive file=~/ubuntu16.04.server.qcow2,format=qcow2
wherein it is clearly specified that the accelerator choice is KVM, only then will QEMU start up in KVM mode.
However, it is true that if the target and host architectures are the same, QEMU can be allowed to run in KVM mode (if that is what you want) rather than TCG mode.
From what I read on this blog: qemu can use KVM in this case
KVM is a virtualization feature in the Linux kernel that lets a program like qemu safely execute guest code directly on the host CPU. This is only possible when the target architecture is supported by the host CPU; today that means x86-on-x86 virtualization only.
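If you want to verify which accelerator a given guest is actually using, one option is to attach a human monitor and ask QEMU directly; a sketch, reusing the disk image from the commands above and adding -monitor stdio:
./qemu-system-x86_64 -m 10G -monitor stdio -drive file=~/ubuntu16.04.server.qcow2,format=qcow2
# at the (qemu) prompt:
info kvm
# prints "kvm support: enabled" when KVM is in use, "disabled" under TCG
For libvirt-managed guests, virsh qemu-monitor-command --hmp <domain> 'info kvm' does the same, assuming monitor passthrough is permitted; <domain> is a placeholder for the guest name.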

How to prevent two CUDA programs from interfering

I've noticed that if two users try to run CUDA programs at the same time, it tends to lock up either the card or the driver (or both?). We need to either reset the card or reboot the machine to restore normal behavior.
Is there a way to get a lock on the GPU so other programs can't interfere while it's running?
Edit
OS is Ubuntu 11.10 running on a server. While there is no X Windows running, the card is used to display the text system console. There are multiple users.
If you are running on either Linux or Windows with the TCC driver, you can put the GPU into compute exclusive mode using the nvidia-smi utility.
Compute exclusive mode makes the driver refuse a context establishment request if another process already holds a context on that GPU. Any process trying to run on a busy compute exclusive GPU will receive a no device available error and fail.
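As a sketch of how that is done with current drivers (older nvidia-smi versions use numeric mode values instead of the named ones; GPU index 0 is just an example):
nvidia-smi -i 0 -c EXCLUSIVE_PROCESS        # set compute exclusive mode (requires root)
nvidia-smi -i 0 -q | grep "Compute Mode"    # verify the setting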
You can use something like Task Spooler to queue the programs and run them one at a time.
We use the TORQUE Resource Manager, but it is harder to configure than ts. With TORQUE you can have multiple queues (e.g. one for CUDA jobs, two for CPU jobs) and assign a different job to each GPU.
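A minimal Task Spooler sketch, assuming the binary is installed as ts (it is packaged as tsp on some Debian/Ubuntu systems) and that running one job at a time is enough; my_cuda_program is a placeholder:
ts -S 1                      # allow only one simultaneous job
ts ./my_cuda_program arg1    # queue a run
ts                           # list the queue and job states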

Kvm/Qemu maximum vm count limit

For a research project, I am trying to boot as many VMs as possible, using the Python libvirt bindings, in KVM under Ubuntu Server 12.04. All the VMs are set to idle after boot and to use a minimal amount of memory. At most, I was able to boot 1000 VMs on a single host, at which point the kernel (Linux 3.x) became unresponsive, even though both CPU and memory usage were nowhere near the limits (48-core AMD, 128 GB of memory). Before that, the booting process became successively slower after a couple of hundred VMs.
I assume this must be related to the KVM/QEMU driver, as the Linux kernel itself should have no problem handling this few processes. However, I did read that the QEMU driver is now multi-threaded. Any ideas what the cause of this slowness may be, or at least where I should start looking?
You are booting all the VMs using qemu-kvm, right, and after hundreds of VMs you see them becoming successively slower to boot. When that happens, stop using KVM and boot with plain QEMU; I expect you will see the same slowness. My guess is that after that many VMs the hardware support that KVM relies on is exhausted, since KVM is essentially a thin software layer over a limited set of hardware virtualization registers. So KVM might be the culprit here.
Also, what is the purpose of this experiment?
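A sketch of that comparison, assuming a small throwaway image called tiny.img: boot the same guest once with KVM and once without it (TCG is the default when -enable-kvm is omitted) and time how long each takes as the VM count grows:
qemu-system-x86_64 -enable-kvm -m 64 -drive file=tiny.img,format=raw -nographic
qemu-system-x86_64 -m 64 -drive file=tiny.img,format=raw -nographic   # pure TCG, no KVM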
The following virtual hardware limits for guests have been tested. We ensure that the host and VMs install and work successfully, even when reaching the limits, and that there are no major performance regressions (CPU, memory, disk, network) since the last release (SUSE Linux Enterprise Server 11 SP1).
Max. Guest RAM Size --- 512 GB
Max. Virtual CPUs per Guest --- 64
Max. Virtual Network Devices per Guest --- 8
Max. Block Devices per Guest --- 4 emulated (IDE), 20 para-virtual (using virtio-blk)
Max. Number of VM Guests per VM Host Server --- Limit is defined as the total number of virtual CPUs in all guests being no greater than 8 times the number of CPU cores in the host
For more limitations of KVM, please refer to this document (link).
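As a quick worked example of the last rule above, the suggested ceiling on the total number of guest vCPUs for a given host is simply eight times its core count:
echo "suggested max total guest vCPUs: $(( $(nproc) * 8 ))"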