CPU_ON on QEMU ARMv8A using PSCI from EL2/EL3 - qemu

I have a 4-core ARMv8-A (Cortex-A53) machine emulated on QEMU 6.2.0. The primary core (CPU#0) is running and I am able to debug it using GDB. I wanted to bring up the other cores, and used the following GDB commands to check their state. From the experiments I have conducted, my conclusion is that only CPU#0 is running and the other CPUs never start.
(gdb) thread 3
(gdb) info thread
Id Target Id Frame
1 Thread 1.1 (CPU#0 [running]) 0x0000000040000008 in ?? ()
2 Thread 1.2 (CPU#1 [halted ]) 0x0000000040000000 in ?? ()
* 3 Thread 1.3 (CPU#2 [halted ]) 0x0000000040000000 in ?? ()
4 Thread 1.4 (CPU#3 [halted ]) 0x0000000040000000 in ?? ()
(gdb) where
#0 0x0000000040000000 in ?? ()
Exploring further, I came across this thread about turning on a CPU using PSCI: Start a second core with PSCI on QEMU.
I also came across the SMC calling convention related to this, and I have gone through the documentation of both SMCCC and PSCI.
I am implementing a minimal hypervisor. The guest is Linux, and it boots. The Linux boot log shows:
[ 0.072225] psci: failed to boot CPU1 (-22)
...
Further debugging revealed that Linux raises a synchronous exception to the hypervisor using the "HVC" instruction, passing the parameters required by the specification.
If my understanding is correct, the PSCI implementation is vendor specific; that is, the code running at EL2/EL3 has to use some vendor-provided mechanism to turn on a CPU (core). Is this correct? And on a system without EL3, how does code running at EL2 turn on a CPU?
My QEMU command line is given below
$qemu-system-aarch64 -machine virt,gic-version=2,virtualization=on -cpu cortex-a53 -nographic -smp 4 -m 4096 -kernel hypvisor.elf -device loader,file=linux-5.10.155/arch/arm64/boot/Image,addr=0x80200000 -device loader,file=1gb_4core.dtb,addr=0x88000000
Any hint is greatly appreciated.

When the guest is not booting at EL3, the QEMU virt machine implements its own internal PSCI emulation. This is described in the DTB file passed to the guest, and will say that PSCI calls should be done via the SMC instruction (if the guest is starting at EL2) or the HVC instruction (if the guest is starting at EL1). Effectively, QEMU is emulating an EL3 firmware for you.
(If the guest does boot at EL3, then QEMU assumes that the EL3 guest code will be implementing PSCI; in that case it provides some simple emulated hardware that does the power on/off operation and which the EL3 guest's PSCI implementation will manipulate as part of its implementation of the CPU_ON and CPU_OFF calls. But that's not the case you're in.)
If you are running a hypervisor in your guest at EL2, then it is your hypervisor's job to implement PSCI for your EL1 guests (it's unlikely that you want to allow an EL1 guest to be able to directly shut down a CPU under your hypervisor's feet, for instance). So you want to pass your EL1 guest a different DTB that describes the view an EL1 guest has of its emulated hardware, and which says "PSCI via HVC". Then your hypervisor's HVC handling should emulate PSCI. Separately, your hypervisor's bootup code should be using the real PSCI-via-SMC to power up the secondary CPUs as part of its bootup sequence.
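As a rough illustration of both halves of that (a sketch only, under the assumptions in this answer; the helper names smc_call and handle_guest_psci are invented here, and 0xC4000003 is the 64-bit CPU_ON function ID from the PSCI spec), the hypervisor's boot code can call QEMU's emulated firmware via SMC, and its HVC handler can service a guest's CPU_ON request by forwarding it:

/* Sketch: forward a guest's PSCI CPU_ON (received via HVC) to the
 * firmware-level PSCI implementation via SMC. Assumes GCC/Clang for
 * aarch64 and that the trap handler has saved the guest's x0-x3. */
#include <stdint.h>

#define PSCI_CPU_ON_AARCH64 0xC4000003UL   /* SMCCC fast call: CPU_ON (64-bit) */
#define PSCI_NOT_SUPPORTED  ((uint64_t)-1)

static uint64_t smc_call(uint64_t fid, uint64_t a1, uint64_t a2, uint64_t a3)
{
    register uint64_t x0 asm("x0") = fid;  /* function ID */
    register uint64_t x1 asm("x1") = a1;   /* target MPIDR */
    register uint64_t x2 asm("x2") = a2;   /* entry point */
    register uint64_t x3 asm("x3") = a3;   /* context ID */
    asm volatile("smc #0"
                 : "+r"(x0), "+r"(x1), "+r"(x2), "+r"(x3)
                 :
                 : "memory");
    return x0;                             /* PSCI return code */
}

/* Called from the EL2 synchronous-exception handler when ESR_EL2.EC
 * indicates an HVC from the EL1 guest. */
uint64_t handle_guest_psci(uint64_t fid, uint64_t a1, uint64_t a2, uint64_t a3)
{
    switch (fid) {
    case PSCI_CPU_ON_AARCH64:
        /* A real hypervisor would substitute its own secondary-entry
         * stub for a2 and only later drop the new vCPU into the guest
         * at the guest's requested entry point; here we just pass through. */
        return smc_call(fid, a1, a2, a3);
    default:
        return PSCI_NOT_SUPPORTED;
    }
}

The same smc_call(PSCI_CPU_ON_AARCH64, target_mpidr, hyp_secondary_entry, 0) is what the hypervisor's own bootup code would use to power up the secondary CPUs for itself.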

Related

Errors thrown when trying to run basic.sh in sosumi

I was hoping that you could help me. I've been stuck on this problem for quite a while.
When I try to start up the clover boot loader or run the basic.sh file, I get these errors in the terminal:
qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.sse4.1 [bit 19]
qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.sse4.2 [bit 20]
qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.movbe [bit 22]
qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.aes [bit 25]
qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.xsave [bit 26]
qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.avx [bit 28]
qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.07H:EBX.bmi1 [bit 3]
qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.07H:EBX.avx2 [bit 5]
etc.
I have no idea what they mean. Could you please tell me a solution? I've tried uninstalling and reinstalling manually. It didn't work and it threw these errors at me again. I followed the instructions in the readme: https://github.com/foxlet/macOS-Simple-KVM
QEMU and everything it needs, all the dependencies, are installed on my computer.
When I run the Clover bootloader, it just shows a bunch of text and then brings me back to the menu. I hit enter again; last time I kept ending up in the shell, and I don't know why.
Why does it keep crashing? Could you please tell me how to fix it?
This is the second time I'm struggling with this, please help.
UPDATE: I tried using this repo: https://github.com/kholia/OSX-KVM and got the same errors. It's still not working.
The shell script you're running starts QEMU asking it to provide a guest CPU with various features (including SSE4, AVX and AVX2). With KVM, the only way we can give the guest a CPU with a feature like AVX is if the host CPU has it, because we run guest code directly on the host CPU. QEMU is warning you that you asked for something it can't do, because the host CPU you're running it on doesn't have those features. QEMU removes the features it can't provide from the set of things it tells the guest about via the CPUID registers.
If the guest OS really needs a CPU with AVX2 and all the rest of it, you need to run on a newer host CPU.
If the guest OS is happy to read the CPUID registers and adjust itself to avoid using features that aren't there, then you could adjust the -cpu options the script is passing to make it request something with fewer features, but all this will do is mean that QEMU won't print the warnings -- it won't change how the guest runs on that kind of CPU.
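As a quick way to see this for yourself (my own sketch using GCC's <cpuid.h>, not part of the macOS-Simple-KVM scripts), you can ask the host CPU directly which of the requested features it has; this is essentially the check QEMU performs before printing those warnings:

/* Print whether the host CPU supports a few of the features the
 * script requests. Build with: gcc -o cpucheck cpucheck.c */
#include <stdio.h>
#include <cpuid.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        printf("sse4.2: %s\n", (ecx & bit_SSE4_2) ? "yes" : "no");
        printf("avx:    %s\n", (ecx & bit_AVX)    ? "yes" : "no");
    }
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        printf("avx2:   %s\n", (ebx & bit_AVX2)   ? "yes" : "no");
    }
    return 0;
}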

qemu-system-aarch64 - 'virtio-vga-gl' is not a valid device model name

I configured and built QEMU 6.2.0 with the --enable-sdl --enable-opengl --enable-virglrenderer options, targeting qemu-system-aarch64, for an amd64 Ubuntu host. When I try to enable -device virtio-vga-gl, it tells me that it is not a valid device model name.
Did I miss something?
Regards.
I think the virtio-vga device is not compiled in by default for aarch64, because the intention is that it's only for machine types where there is legacy firmware that does not know about virtio-gpu but only about VGA (such as the x86 PC machine types). The recommended graphics type for the aarch64 'virt' board (according to the documentation) is virtio-gpu-pci. Your guest OS will obviously need support for that device type.
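For example, something along these lines should work (an illustrative invocation, not taken from the question; the kernel, disk and remaining options are whatever your setup already uses):
qemu-system-aarch64 -machine virt -cpu cortex-a53 -device virtio-gpu-pci -display sdl ...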

Ethminer Ubuntu 16 not using NVIDIA GPU

I have followed the instructions here and successfully built and set up geth.
Ethminer seems to work except it doesn't use the Titan X GPU and the mining rate is only 341022 H/s.
Also, when I try to use the -G option, ethminer says it is an invalid argument; the -G flag also doesn't appear in the ethminer help output.
Your GPU must have a minimum amount of memory to perform mining. Upgrade to a GPU with more memory (a minimum of 4 GB is preferable).
The current DAG size is above 2 GB. That means you can't mine with a GPU that has less than 2 GB of memory.

gdb: unknown target exception

When trying to run a program using gdb I get
[New Thread 4612.0x158c]
[New Thread 4612.0x1cb8]
[New Thread 4612.0x11e8]
[New Thread 4612.0x1190]
gdb: unknown target exception 0x406d1388 at 0x746623d2
Program received signal ?, Unknown signal.
0x746623d2 in RaiseException () from /cygdrive/c/WINDOWS/System32/KERNELBASE.dll
I researched this and found three possible causes: (1) path environment variable not set, (2) drive not mapped, and (3) using the wrong version of gdb (32-bit or 64-bit). So I added C:\cygwin\bin to the path environment variable, typed mount and got
C:/cygwin/bin on /usr/bin type ntfs (binary,auto)
C:/cygwin/lib on /usr/lib type ntfs (binary,auto)
C:/cygwin on / type ntfs (binary,auto)
C: on /cygdrive/c type ntfs (binary,posix=0,user,noumount,auto)
D: on /cygdrive/d type ntfs (binary,posix=0,user,noumount,auto)
When I type show configuration I get:
This GDB was configured as follows:
configure --host=i686-pc-cygwin --target=i686-pc-cygwin
--with-auto-load-dir=$debugdir:$datadir/auto-load
--with-auto-load-safe-path=$debugdir:$datadir/auto-load
--with-expat
--with-gdb-datadir=/usr/share/gdb (relocatable)
--with-jit-reader-dir=/usr/lib/gdb (relocatable)
--without-libunwind-ia64
--with-lzma
--with-python=/usr (relocatable)
--without-guile
--with-separate-debug-dir=/usr/lib/debug (relocatable)
--without-babeltrace
and my computer is 32-bit, so it appears to be the correct version.
gdb itself seems to work, e.g. I can type watch followed by an address and it will set a watchpoint; gcc and g++ work fine, and the program I am debugging will start if I run it from the command line but not from gdb.
What other things should I check?
This is a special technical exception that communicates a thread's name to a debugger that supports the convention (Delphi RAD Studio, Visual Studio, etc.). It is convenient to look at the thread list in the debugger and understand what is going on from the names. Threads raise this exception and immediately catch it, doing nothing in the handler. Until the recent introduction of SetThreadDescription, this was the only common way to set a thread name. SetThreadDescription takes a Unicode name, but it is not widely supported yet, so many libraries still use the exception-based method. The thread raising it can belong to IME, OLE, or whatever else spawns threads.
I guess gdb is aware of neither method. Just ignore this exception.
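For reference, the convention being described is the one Microsoft documents for setting a thread name from native code; the sketch below follows that documented pattern (it is not code from your program, and it needs MSVC-style __try/__except):

#include <windows.h>

#define MS_VC_EXCEPTION 0x406D1388   /* the "unknown target exception" gdb reports */

#pragma pack(push, 8)
typedef struct tagTHREADNAME_INFO {
    DWORD  dwType;      /* must be 0x1000 */
    LPCSTR szName;      /* pointer to the name (in user address space) */
    DWORD  dwThreadID;  /* thread ID, or -1 for the calling thread */
    DWORD  dwFlags;     /* reserved for future use, must be zero */
} THREADNAME_INFO;
#pragma pack(pop)

static void SetThreadNameViaException(DWORD threadId, const char *name)
{
    THREADNAME_INFO info = { 0x1000, name, threadId, 0 };
    __try {
        /* Raise the exception for the debugger's benefit... */
        RaiseException(MS_VC_EXCEPTION, 0,
                       sizeof(info) / sizeof(ULONG_PTR),
                       (const ULONG_PTR *)&info);
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        /* ...and swallow it immediately; nothing happens here. */
    }
}

A debugger that understands the convention (Visual Studio, WinDbg) picks the name out of the exception arguments; gdb just sees an unknown exception raised from RaiseException in KERNELBASE.dll, which matches your backtrace.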
I had the same problem. I am also using x86 with Eclipse Mars.2 on Vista, and by default gdb 7.10 was downloaded by setup. I also tried everything you have tried, to no avail.
Lastly, I noticed the link below, upgraded gdb to 7.11, and the problem was fixed.
https://cygwin.com/ml/cygwin/2016-10/msg00243.html

Setting up GPUDirect for infiniband

I am trying to set up GPUDirect to use InfiniBand verbs RDMA calls directly on device memory, without the need to use cudaMemcpy.
I have 2 machines with NVIDIA K80 GPU cards, each with driver version 367.27. CUDA 8 is installed, as well as Mellanox OFED 3.4.
The Mellanox-NVIDIA GPUDirect plugin is also installed:
-bash-4.2$ service nv_peer_mem status
nv_peer_mem module is loaded.
According to this thread, "How to use GPUDirect RDMA with Infiniband", I have all the requirements for GPUDirect and the following code should run successfully. But it does not: ibv_reg_mr fails with the error "Bad Address", as if GPUDirect were not properly installed.
void * gpu_buffer;
struct ibv_mr *mr;
const int size = 64*1024;
cudaMalloc(&gpu_buffer,size); // TODO: Check errors
mr = ibv_reg_mr(pd,gpu_buffer,size,IBV_ACCESS_LOCAL_WRITE|IBV_ACCESS_REMOTE_WRITE|IBV_ACCESS_REMOTE_READ);
Requested Info:
mlx5 is used.
Last Kernel log:
[Nov14 09:49] mlx5_warn:mlx5_0:mlx5_ib_reg_user_mr:1418:(pid 4430): umem get failed (-14)
Am I missing something? Do I need some other packages, or do I have to activate GPUDirect in my code somehow?
A common reason for nv_peer_mem module failing is interaction with Unified Memory (UVM). Could you try disabling UVM:
export CUDA_DISABLE_UNIFIED_MEMORY=1
?
If this does not fix your problem, you should try running the validation and copybw tests from https://github.com/NVIDIA/gdrcopy to check GPUDirect RDMA. If they work, then your Mellanox stack is misconfigured.
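While debugging, it may also help to fill in the error checks from the snippet in the question (my own sketch; it assumes pd was obtained earlier from ibv_alloc_pd()), so a failure reports the CUDA error or errno instead of just a bad mr:

#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <cuda_runtime.h>
#include <infiniband/verbs.h>

static struct ibv_mr *register_gpu_buffer(struct ibv_pd *pd, size_t size)
{
    void *gpu_buffer = NULL;

    cudaError_t cerr = cudaMalloc(&gpu_buffer, size);
    if (cerr != cudaSuccess) {
        fprintf(stderr, "cudaMalloc: %s\n", cudaGetErrorString(cerr));
        return NULL;
    }

    struct ibv_mr *mr = ibv_reg_mr(pd, gpu_buffer, size,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_WRITE |
                                   IBV_ACCESS_REMOTE_READ);
    if (!mr)
        fprintf(stderr, "ibv_reg_mr failed: %s (errno %d)\n",
                strerror(errno), errno);  /* EFAULT (-14) matches "Bad address" */
    return mr;
}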