Why can't QEMU get even close to Rosetta 2's performance when translating x86 to M1?

Apparently, QEMU is the only piece of open source code that can emulate an x86 operating system on the new Apple silicon (M1, M2, etc.).
Apple built Rosetta 2, which, in theory, does the exact same thing that QEMU would be doing in these scenarios. It translates x86 (Intel) instructions into the instruction set supported by the new Apple silicon processors.
Rosetta 2 does it with remarkable performance, and some x86 applications even run with better performance than on native x86 hardware. QEMU, on the other hand, doesn't get even close when running x86 Linux on Apple silicon.
How can Rosetta have such superior performance? Are there any "secrets" that only Apple knows about their architecture that were never shared with the QEMU project? Any forbidden APIs that QEMU is not allowed to access?

Rosetta and QEMU are both emulators. However, they tackle the problem in vastly different ways.
QEMU
In order to emulate a Linux system, QEMU must also emulate storage devices, console output devices, Ethernet devices, keyboards, and the entire CPU. Within this framework, it emulates every instruction using just-in-time (JIT) translation, from the Linux kernel down to your /bin/ls command.
There are generally few limitations to QEMU's Intel emulation. You can run almost any Intel operating system and its associated applications.
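To make the contrast concrete, here is roughly what that looks like on an Apple Silicon host; the ISO name is a placeholder, and every guest instruction, from the kernel down to userspace, goes through QEMU's TCG JIT:
qemu-system-x86_64 -machine q35 -smp 4 -m 4096 -cdrom linux-x86_64.iso -nographic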
Rosetta 2
Apple's emulation, on the other hand, happens before the application launches. The entire binary is translated from x86 to Apple Silicon code and then launched. Once translated, the application is in effect a native arm64 binary making native macOS system calls.
Apple's documentation explains it thus:
If an executable contains only Intel instructions, macOS automatically
launches Rosetta and begins the translation process. When translation
finishes, the system launches the translated executable in place of
the original. However, the translation process takes time, so users
might perceive that translated apps launch or run more slowly at times.
Rosetta 2 has a number of significant limitations. For example, you can't use Intel kernel extensions, virtual machine apps that virtualize x86_64 computer platforms (Parallels, for example), or AVX/AVX2/AVX512 vector instructions.
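If you want to poke at Rosetta yourself, two stock macOS commands are handy (the binary path is a placeholder): arch -x86_64 forces the x86_64 slice of a binary to run through Rosetta, and the sysctl below should report 1 when queried from inside a translated process:
arch -x86_64 /path/to/some_x86_64_binary
sysctl -n sysctl.proc_translated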

Related

Booting a QorIQ PowerPC firmware in Qemu

I have a QorIQ (P2041) processor-based IoT device firmware. I have U-Boot, a kernel, and an initrd ramdisk. Whatever I do with qemu-system-ppc, I can't get it to work. I suspect that qemu-system-ppc doesn't support QorIQ processors. Is there any way for me to load and boot this firmware in QEMU or any other emulator?
U-Boot has a configuration file, qemu-ppce500_defconfig. You should be able to run the U-Boot built with this configuration using the command
qemu-system-ppc -nographic -bios u-boot -M ppce500
The CPU can be specified via the -cpu parameter as e500mc.
To run your kernel it will need drivers for the hardware provided by the emulated machine like the E1000 network card and the NS16550 console.
Use the fdt command of U-Boot to get an overview of the available devices in the emulated machine.
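For example, from the U-Boot prompt something along these lines should dump the device tree that QEMU generates for the ppce500 machine (the exact node paths may differ):
=> fdt addr ${fdtcontroladdr}
=> fdt print /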
Firmware binaries are generally very closely tied to the hardware they're built to run on -- they make assumptions about what hardware is available, what addresses in memory it can be found at, and so on. You need to use a firmware blob that corresponds to the hardware you're asking QEMU to emulate. Since QEMU doesn't emulate whatever your random IoT device is, you need to use a u-boot which matches the hardware QEMU actually has (as for example suggested in Xypron's answer).
Once you have booting firmware, you will likely still find you have exactly the same problem with the kernel -- it is built to run on one bit of hardware, and you're trying to run it on something different, and this simply won't work.

Can you Program/Test CUDA in a Virtual Machine?

I ask this as a programming and environment question. Can you test/program CUDA within a virtual machine accessing the physical GPU card?
I am buying a new (really nice) system to, in part, experiment with basic CUDA programming. The processor will be an Intel i7-4770, which supports VT-d (direct I/O pass-through), or an i7-4770K, which does not. Will the VT-d support allow access to the GPU card from the VMs? (I have looked at Intel, motherboard manufacturer sites, and docs on VMs but did not see an answer to this question.)
I plan to run Linux as my base operating system on the new development box, with virtual machines (probably via QEMU/KVM) to test the software in other environments such as Windows and Mac OS. In other words, I would do the major development on the Linux box and then need to test on a virtual machine running on the same box.
But, will the VM OSs be able to access the GPU card for testing/development?
[First asked July 2013]
It depends on what NVIDIA card you're using. See, for example (this is in regard to Xen):
http://wiki.xen.org/wiki/XenVGAPassthroughTestedAdapters#Nvidia_display_adapters
The short answer is that you would probably need to rely on modifying a consumer card, as described in the link above under 'Australian crazy guy'.
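Assuming your card and platform do allow passthrough, a QEMU/KVM invocation using VFIO looks roughly like this; the PCI address 01:00.0 and the disk image name are placeholders, and the GPU has to be bound to the vfio-pci driver on the host first:
qemu-system-x86_64 -enable-kvm -cpu host -m 8192 -device vfio-pci,host=01:00.0 -drive file=guest.qcow2,format=qcow2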

Anyone can introduce some primers about qemu-kvm and kvm?

I am new to KVM; qemu-kvm and KVM both seem very complicated to me right now.
Can anyone suggest some primers about qemu-kvm and KVM?
Thanks very much!
KVM stands for Kernel-based Virtual Machine. It enables you to create as many virtual machines as you like. These machines can be of two types: LVM-based or non-LVM-based.
For LVM-based machines you can take live backups; for non-LVM-based VMs you cannot, i.e. they will be paused while a backup is taken. Please refer to the KVM home page.
QEMU is a generic and open source machine emulator and virtualizer. When used as a machine emulator, QEMU can run OSes and programs made for one machine (e.g. an ARM board) on a different machine (e.g. your own PC). By using dynamic translation, it achieves very good performance. When used as a virtualizer, QEMU achieves near-native performance by executing the guest code directly on the host CPU. QEMU supports virtualization when executing under the Xen hypervisor or using the KVM kernel module in Linux. When using KVM, QEMU can virtualize x86, server and embedded PowerPC, and S390 guests.
For managing KVM VMs you need to install libvirt, the virtualization library. It provides tools for starting, suspending, resuming, cloning, restarting, and listing virtual machines. Please refer to the libvirt home page for more information.
If you are working on a backup or recovery process, I suggest you also go through this excellent Perl script; it gives a fair idea of how backups and snapshots are taken for KVM VMs.
KVM-based virtual machines are not complicated once you go through the theory and start implementing them. I believe that once you start working with them, you will find managing them fun.
Putting it in a nutshell:
QEMU: an emulator which translates the instructions of the guest operating system into instructions for the host operating system. As you can guess, that translation has a certain cost, so you will not see the guest machine running as fast as the host machine.
For more info see the QEMU wiki
KVM (Kernel-based Virtual Machine): a kernel module which supports virtual machines using the host hardware. By support I mean that if your guest architecture is the same as the host architecture, there is no need to translate the instructions, since they can be executed directly by the host. For this, modern hardware is equipped with special registers and storage locations, which KVM leverages. Also, since KVM is just a module, some driver is needed to use it, and that driver is QEMU.
For more info see the KVM section in the same wiki.
QEMU-KVM: as mentioned above, KVM is only a module; QEMU (or something else) is needed to use it. When KVM is used with QEMU, control transfers back and forth between QEMU and KVM during execution.
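A minimal illustration of the difference (the disk image name is a placeholder): the same QEMU binary can run a guest purely emulated, or hand the guest code to KVM when the host and guest architectures match:
qemu-system-x86_64 -m 2048 -drive file=guest.qcow2,format=qcow2
qemu-system-x86_64 -enable-kvm -m 2048 -drive file=guest.qcow2,format=qcow2
The first command uses pure emulation (TCG); the second uses hardware assistance via /dev/kvm.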
Talking about KVM means talking either about the virtualization technology or about the kernel modules (kvm.ko, kvm-intel.ko or kvm-amd.ko). Sometimes KVM is described as a virtual machine; this is not correct, because KVM does not itself provide the virtualized hardware.
Source

How to emulate CUDA on windows

Is there any way I can test the CUDA samples and code on a computer with no NVIDIA graphics card?
I am using Windows and the latest version of CUDA.
There are several possibilities:
Use an older version of CUDA, which has a built-in emulator (2.3 has it for sure; see the sketch after this list). The emulator is far from good, and you won't have features from the latest CUDA releases.
Use OpenCL; it can run on CPUs (though not with the NVIDIA SDK; you will have to install either the AMD or the Intel OpenCL implementation, and AMD's works fine on Intel CPUs, by the way). In my experience, OpenCL is usually slightly slower than CUDA.
There is a Windows branch of the Ocelot emulator: http://code.google.com/p/gpuocelot/. I haven't tried it, though.
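For the first option, if I remember correctly those old toolkits exposed the emulator through an nvcc switch, roughly as follows (the .cu file name is just an example); the resulting binary ran the kernels on the CPU:
nvcc -deviceemu -o vecadd vecadd.cu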
However, I would recommend buying a CUDA-capable card. An 8xxx- or 9xxx-series card is OK and really cheap. Emulation lets you pick up some basic GPGPU programming skills, but it is of little use when you write a real-world application, since it doesn't let you debug and tune performance.

What exactly is a subsystem?

I'm reading a book which says there are these subsystems:
Win32, OS/2, POSIX, etc.
But I don't have any practical knowledge of these terms; can you explain them briefly?
I get the feeling the concept of a "subsystem" is somewhat ill-defined, or at least used with different meanings in different contexts.
According to MSDN documentation:
Environment subsystems are Windows NT processes that emulate different operating system environments. The Windows NT executive provides generic services that all environment subsystems can call to perform basic operating system functions.
Windows Internals book talks about the following two subsystems:
The Windows subsystem, of which it says: "this [subsystem] is special in that Windows can't run without it. (It owns the keyboard, mouse and display, and it is required to be present even on server systems with no interactive users logged in.) In fact, the other two (which two?) subsystems are configured to start on demand, whereas the Windows subsystem must always be running."
The Subsystem for Unix-based Applications, also known as the SUA (POSIX) subsystem.
Now, the documentation for the /SUBSYSTEM option that can be passed to the Microsoft Visual C++ linker says, and I quote:
You can specify any of the following subsystems:
BOOT_APPLICATION
An application that runs in the Windows boot environment. For more information about boot applications, see About the BCD WMI Provider.
CONSOLE
A Windows character-mode application. The operating system provides a console for console applications.
Extensible Firmware Interface (EFI) Image
The EFI subsystem options describe executable images that run in the Extensible Firmware Interface environment. This environment is typically provided with the hardware and executes before the operating system is loaded. The major differences between EFI image types are the memory location that the image is loaded into and the action that's taken when the call to the image returns. An EFI_APPLICATION image is unloaded when control returns. An EFI_BOOT_SERVICE_DRIVER or EFI_RUNTIME_DRIVER is unloaded only if control returns with an error code. An EFI_ROM image is executed from ROM. For more information, see the specifications on the Unified EFI Forum website.
NATIVE
Code that runs without a subsystem environment—for example, kernel mode device drivers and native system processes. This option is usually reserved for Windows system features.
POSIX
An app that runs in the POSIX subsystem in Windows.
WINDOWS
An app that runs in the Windows graphical environment. This includes both desktop apps and Windows Store apps.
WINDOWSCE
The WINDOWSCE subsystem indicates that the app is intended to run on a device that has a version of the Windows CE kernel. Versions of the kernel include PocketPC, Windows Mobile, Windows Phone 7, Windows CE V1.0-6.0R3, and Windows Embedded Compact 7.
So there you go. Finally, people sometimes talk about the "Win32" subsystem, and I don't know whether that should be taken to mean the WINDOWS subsystem or the CONSOLE subsystem in the linker-option sense.
Back to the Windows Internals book, it further says "each executable image (.exe) is bound to one and only one subsystem" which would explain the need to specify the subsystem your app is for at link-time.
Windows, starting with NT (NT 3.1), is able to support the semantics of different operating systems (or OS families) that existed at the time (1993). Microsoft called them subsystems (today they would probably call them emulation layers).
The subsystem you link against decides what semantics your application gets. For the Win32 subsystem, for example, file names are case-insensitive (foo.txt and fOo.Txt refer to the same file) and device files (like con or nul) exist in every directory. For the POSIX subsystem, file names are case-sensitive and device files exist in only one place. By linking existing (legacy) applications against a subsystem other than Win32, these apps "feel" more like the respective OSes, and porting work is reduced.
If you want to know the subsystem of an EXE/DLL, you can open it in Dependency Walker: if it (directly or indirectly) depends on KERNEL32.DLL, it is a Win32-subsystem binary; if it (directly) depends on NTDLL.DLL, it is a native-subsystem binary (note that KERNEL32.DLL itself depends on NTDLL.DLL, providing the compatibility layer for the Win32 subsystem).
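The subsystem is also recorded directly in the PE header, so on a machine with the Visual Studio tools you should be able to check it with dumpbin (the executable name is a placeholder):
dumpbin /headers hello.exe | findstr /i subsystem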
This is mostly obsolete today. I say mostly because Microsoft included a new "Linux subsystem" in the Windows 10 Anniversary Update (which is a subsystem like Native, Win32 or POSIX) that behaves in a binary-compatible way with Linux, making it easy to compile Linux applications to be run on Windows (or, more precisely, on its Linux subsystem).
The /SUBSYSTEM linker switch started out doing exactly the same thing, but was augmented with more options later (/SUBSYSTEM:CONSOLE also compiles for the Win32 subsystem, but the application will allocate a console window if it did not inherit one from its parent process; /SUBSYSTEM:EFI_APPLICATION will compile an executable that cannot run on Windows at all, but will run in the Extensible Firmware Interface (EFI/UEFI) boot environment; etc.).
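As a small sketch of how that choice is made at build time (hello.obj is a placeholder object file), the same object can be linked either way:
link /SUBSYSTEM:CONSOLE /OUT:hello_console.exe hello.obj
link /SUBSYSTEM:WINDOWS /OUT:hello_gui.exe hello.obj
The first executable gets a console window if it doesn't inherit one; the second does not allocate a console at all.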
It might help if we knew which book you're referring to!
More generally, Win32 (which is 32-bit Windows, i.e. Windows NT 3.5 or later), OS/2 and the POSIX family are all operating systems. (POSIX is a standard family of APIs into the UNIX-like operating systems - see here for more.)
It sounds like what you describe is a program that can run on many different operating systems and which has operating-system specific components -- these would be the "subsystems".
However, creating an application in this way does sound like the kind of thing that was done fifteen or twenty years ago. That's about the time that people used to refer to those three families of operating systems, too...