UEFI ARM64 Synchronous Exception - exception

I am developing an UEFI application for ARM64 (ARMv8-A) and I have come across the issue: "Synchronous Exceptions at 0xFF1BB0B8."
This value (0x0FF1BB0B8) is exception link register (ELR). ELR holds the exception return address.
There are a number of sources of Synchronous exceptions (https://developer.arm.com/documentation/den0024/a/AArch64-Exception-Handling/Synchronous-and-asynchronous-exceptions):
Instruction aborts from the MMU. For example, by reading an
instruction from a memory location marked as Execute Never.
Data Aborts from the MMU. For example, Permission failure or alignment checking.
SP and PC alignment checking.
Synchronous external aborts. For example, an abort when reading translation table.
Unallocated instructions.
Debug exceptions.
I can't update BIOS firmware to output more debug information. Is there any way to detect more precisely what causes Synchronous Exception?
Can I use the value of ELR (0x0FF1BB0B8) to locate the issue? I compile with -fno-pic -fno-pie options.

Related

Is my undersanding of Interrupts, Exceptions vs signal definitions correct?

Interrupts are hardware based and happen asynchronous (data incoming to a socket, some I/O is ready to read from or write to, A user pressed the keyboard.
Exceptions are are also hardware based but they are synchronous and caused by the CPU when executing instructions. for example a page in the virtual memory address space that has no actual chunk of RAM mapped to it will cause a page fault. Exceptions are the general name for fault, trap and abort.
Interrupts and Exceptions are generated by hardware and handled by handlers in the kernel space. They can be seen as mean of communication between the Hardware and the kernel.
Signals Signals can be viewed as a mean of communication between the running processes and the kernel. In some cases an Interrupt/Exception will use signals as part of the handling by the kernel.
Interrupts
In computing, an interrupt is an asynchronous signal indicating the
need for attention or a synchronous event in software indicating the
need for a change in execution.
(definition obtained from stackoverflow's tag description)
So, it's not unnecessarily asynchronous. It's asynchronous only if it is emitted by the hardware. Think of a virtual device or an emulator for examples of synchronous interrupts, when you are programming a camera and instead of the real device you have an emulator, which you can program to simulate interrupts.
Exceptions
From Microsoft docs:
Most of the standard exceptions recognized by the operating system are
hardware-defined exceptions. Windows recognizes a few low-level
software exceptions, but these are usually best handled by the
operating system.
Windows maps the hardware errors of different processors to the
exception codes in this section. In some cases, a processor may
generate only a subset of these exceptions. Windows preprocesses
information about the exception and issues the appropriate exception
code.
Exceptions are not necessarily hardware-generated and not necessarily synchronous.
If they are synchronous, then a software emitted it (like a camera emulator). Asynchronous exceptions can be raised just about anywhere.
In more advanced programming languages one can use exception handlers and different kinds of exceptions have their own exception subclass. The program can emit an exception with a command, usually the throw keyword which is paired with an exception instance. See: https://www.geeksforgeeks.org/throw-throws-java/
One can implement custom exception classes according to business logic, see https://www.baeldung.com/java-new-custom-exception.
So, the realm of exceptions is much wider than you first thought.
Signals
A signal is a notification to a process that an event occurred. Signals are sometimes described as software interrupts. Signals are analogous to hardware interrupts in that they interrupt the normal flow of execution of a program; in most cases, it is not possible to predict exactly when a signal will arrive. They are defined in the C standards and extended in POSIX, but many other programming languages/systems provide access to them as well.
You are more-or-less correct about signals.

Servicing an interrupt vs servicing an exception

My understanding : An interrupt (hardware interrupt) occurs asynchronously generally been caused by an external event directly interrupting the CPU. The CPU will then vector to the particular ISR to handle the interrupt.
Obviously an ISR cannot have a return value or have parameters passed because the event happen at anytime at any point of execution in our code.
With exceptions however, my understanding is that this is a software interrupt which is caused by a special instruction pd within the software.
I've heard that exceptions are handled in a similar fashion to handling an ISR. Can an exception handler in that case behave differently to an ISR , by taking arguments from the code and return a value, because we know where we were in our code when it was executed?
Thanks in advance
A hardware exception is not a software interrupt, you do not explicitly call it - it occurs on some hardware detectable error such as:
invalid address
invalid instruction
invalid alignment
divide-by-zero
You can of course write code to deliberately cause any of these and therefore use them as software interrupts, but then you may loose the utility of them as genuine error traps. Exceptions are in some cases used for this purpose - for example in a processor without an FPU on an architecture where and FPU is an option, the invalid instruction handler can be used to implement software emulation of an FPU so that the compiler does not need to generate different code for FPU and non-FPU variants. Similarly an invalid address exception can invoke a memory manager to implement a virtual memory swap-file (on devices with an MMU).
A software interrupt is explicitly called by an SWI instruction. It's benefit over a straightforward function call is that the application does not need to know the location of the handler - that is determined by the vector table, and is often used to make operating system or BIOS calls in simple operating systems can dynamically load code, but that do not support dynamic-linking (MS-DOS for example worked in this way).
What hardware interrupts, software interrupts and exceptions all have in common is that they execute in different processor context that normal code - typically switching to an independent stack and automatically pushing registers (or using an alternate register bank). They all operate via a vector table, and you cannot pass or return parameters via formal function parameter passing and return. With SWI and forced-exceptions, it is possible to load values into specific registers or memory locations known to the handler.
The above are general principles - the precise details will vary between different architectures. Consult technical reference for the specific device used.
The term "exception" can mean completely different things.
There are "software exceptions" in the form of exception handling, as a high-level language feature in languages like C++. An "exception handler" in this context would be something like a try { } catch block.
And there are "hardware exceptions" which is a term used by some CPU cores like PowerPC. These are a form of critical interrupts corresponding to an error state. An "exception handler" in this context would be similar to an interrupt vector table, although when a hardware exception occurs, there is usually nothing the software can do about it.
Hardware exceptions take no arguments and return no data, just like interrupts. Architectures like PowerPC separate hardware exceptions from hardware interrupts, where the former are various error states, and the later are interrupts from the application-specific hardware.
It isn't all that meaningful for a hardware exception to communicate with the software, as they would be generated from critical failures like wrong OP code executed, CPU clock gone bad, runaway code etc etc. That is, the execution environment has been compromised so there is nothing meaningful that the software executing in that environment can possibly do.

Ollydbg Debugging - Pass exception to application / Step into instruction

I'm trying to identify a bug in a program (32bit) which could probably lead to code execution. So far I debugged the application with ollydbg and ran my exploit code. Then ollydbg gives me an exception.
If I press "Ctrl+F9" nothing seems to be executed of my shellcode
In contrast when the exception occures and I step through the next instructions with "F8" I finally reach my shellcode and it gets executet
If I run the application without ollydbg, my shellcode also doesn't get executed
Why does my shellcode get executed when I step to the next instructions an otherwise not? What's then the normal case when I run my application without a debugger?
Thanks a lot!
When an exception is raised in a thread the system will first check if a debugger is attached.
If a debugger is attached the exception is reported to the debugger (and not to the faulting process or thread). In ollydbg (and most debuggers) you then have the choice to do something with that exception.
The 1st one is to pass that exception to the faulting thread (CTRL+F9) in ollydbg.
The system will look at the EXCEPTION_REGISTRATION_RECORD for the current thread and walks the list of EXCEPTION_REGISTRATION structures (each of these structures has an exception handler) and check if a handler can handle the exception.
If a handler can handle the exception, the stack is unwind (to a certain point) and the thread might continue its life.
If no handler can handle the exception, the final handler is called and the program crashes (the system will then usually display a dialog box informing the user that the process crashed).
This is exactly the same behavior in the case no debugger is attached.
Thus, in your case, passing the exception to the debugger will probably unwind the stack, and the thread will continue its execution after the location of the exception (or simply crash the whole application if the exception couldn't be handled).
The second option - when a debugger is attached - is to not pass the exception to the faulting thread (using one of the step [into | over] / run button). In this case the system will not search for any handler and the thread will either simply rethrow the exception (if it can't pass over it) or continue execution like nothing happened (if the debugger knows how to handle it).
You should check which type (most probably one of: Access violation in read / write ; breakpoint exception) of exception is raised and correct the problem (see at the bottom of the ollydbg window, it will tell you which kind of exception has been raised) if you want to execute your shellcode without problem.

States of memory data after cuda exceptions

CUDA document is not clear on how memory data changes after CUDA applications throws an exception.
For example, a kernel launch(dynamic) encountered an exception (e.g. Warp Out-of-range Address), current kernel launch will be stopped. After this point, will data (e.g. __device__ variables) on device still kept or they are removed along with the exceptions?
A concrete example would be like this:
CPU launches a kernel
The kernel updates the value of __device__ variableA to be 5 and then crashes
CPU memcpy the value of variableA from device to host, what is the value the CPU gets in this case, 5 or something else?
Can someone show the rationale behind this?
The behavior is undefined in the event of a CUDA error which corrupts the CUDA context.
This type of error is evident because it is "sticky", meaning once it occurs, every single CUDA API call will return that error, until the context is destroyed.
Non-sticky errors are cleared automatically after they are returned by a cuda API call (with the exception of cudaPeekAtLastError). Any "crashed kernel" type error (invalid access, unspecified launch failure, etc.) will be a sticky error. In your example, step 3 would (always) return an API error on the result of the cudaMemcpy call to transfer variableA from device to host, so the results of the cudaMemcpy operation are undefined and unreliable -- it is as if the cudaMemcpy operation also failed in some unspecified way.
Since the behavior of a corrupted CUDA context is undefined, there is no definition for the contents of any allocations, or in general the state of the machine after such an error.
An example of a non-sticky error might be an attempt to cudaMalloc more data than is available in device memory. Such an operation will return an out-of-memory error, but that error will be cleared after being returned, and subsequent (valid) cuda API calls can complete successfully, without returning an error. A non-sticky error does not corrupt the CUDA context, and the behavior of the cuda context is exactly the same as if the invalid operation had never been requested.
This distinction between sticky and non-sticky error is called out in many of the documented error code descriptions, for example:
synchronous, non-sticky, non-cuda-context-corrupting:
cudaErrorMemoryAllocation = 2
The API call failed because it was unable to allocate enough memory to perform the requested operation.
asynchronous, sticky, cuda-context-corrupting:
cudaErrorMisalignedAddress = 74
The device encountered a load or store instruction on a memory address which is not aligned. The context cannot be used, so it must be destroyed (and a new one should be created). All existing device memory allocations from this context are invalid and must be reconstructed if the program is to continue using CUDA.
Note that cudaDeviceReset() by itself is insufficient to restore a GPU to proper functional behavior. In order to accomplish that, the "owning" process must also terminate. See here.

Is "I/O device request" external interrupt or internal exceptions?

From Computer Organization and Design, by Patterson et al
Why is "I/O device request" external interrupt?
Does "I/O device request" mean that a user program request I/O device services by system calls? If yes, isn't a system call an internal exception?
Thanks.
It's referring to peripheral devices signaling that they require attention, eg. disk controller hardware that is now ready to satisfy a read request that it received earlier, (or has finished DMAing in data for the read request).
The path in to the operating system is an array of pointers. This carry may have different names depending upon the system. I will call it the "dispatch table." The dispatch table handles everything that needs the attention of the operating system: Interrupts, faults, and traps. The last two are collectively "exceptions".
An exception is caused by executing an instruction. They synchronous.
An interrupt is as caused by by something occurring outside the executing process/thread.
A user invokes the the operating system synchronously by executing an instruction that causes a trap (On intel chips they misname such a trap a "software interrupt"). Such an even is a synchronous, predictable result of the instruction stream.
Such a trap would be used to queue an I/O request to the device. "Invoke the operating system from user program" in your table.
The device wold cause an interrupt when the request is completed. That what is meant by an "I/O Device Request" in your table.
The confusion is that interrupts, faults and traps are all handled the same way by the operating system through the dispatch table. And, as I said, in Intel land they call both traps and interrupts "Interrupts".
Because the interrupt isn't generated by the processor or the program. It is a physical wire connected to the interrupt controller whose state changes. Driven by the controller for the device, external to the processor. The interrupt handler is usually located in a driver that knows how to handle the device controller's request for service.
"Invoke the operating system" is a software interrupt, usually switches the processor into protected mode to handle the request.
"Arithmetic overflow" is typically a trap that's generated by the floating point unit on the processor.
"Using an undefined instruction" is another trap, generated by the processor itself when it can't execute code anymore because the instruction is invalid.
Processor usually have more traps like that. Like division by zero. Or executing a privileged instruction. Or a page fault when virtual memory isn't mapped to physical memory yet. Or a protection fault when the program reads an unmapped virtual memory address.