I am reading the Intel Manual 3A Chapter 6 Interrupt and Exception Handling.
Interrupt and Exception have 3 sources respectively.
For Software-Generated Interrupt, it says:
The INT n instruction permits interrupts to be generated from within
software by supplying an interrupt vector number as an operand. For
example, the INT 35 instruction forces an implicit call to the
interrupt handler for interrupt 35. Any of the interrupt vectors from
0 to 255 can be used as a parameter in this instruction. If the
processor’s predefined NMI vector is used, however, the response of
the processor will not be the same as it would be from an NMI
interrupt generated in the normal manner. If vector number 2 (the NMI
vector) is used in this instruction, the NMI interrupt handler is
called, but the processor’s NMI-handling hardware is not activated.
Interrupts generated in software with the INT n instruction cannot be
masked by the IF flag in the EFLAGS register.
For Software-Generated Exceptions, it says:
The INTO, INT 3, and BOUND instructions permit exceptions to be
generated in software. These instructions allow checks for exception
conditions to be performed at points in the instruction stream. For
example, INT 3 causes a breakpoint exception to be generated. The INT
n instruction can be used to emulate exceptions in software; but there
is a limitation. If INT n provides a vector for one of the
architecturally-defined exceptions, the processor generates an
interrupt to the correct vector (to access the exception handler) but
does not push an error code on the stack. This is true even if the
associated hardware-generated exception normally produces an error
code. The exception handler will still attempt to pop an error code
from the stack while handling the exception. Because no error code was
pushed, the handler will pop off and discard the EIP instead (in place
of the missing error code). This sends the return to the wrong
location.
So, what's the difference? Seems both leverage the int n instruction. How can I tell whether it generates an exception or an interrupt in a piece of assembly code?
In the x86 architecture an exception is handled as an interrupt, nominally with an interrupt handler.
So interrupts and exceptions are terms that overlaps, the latter are a kind of the former.
Interrupt numbers from 0 to 31 are reserved for CPU exceptions, for example interrupt number 0 is the #DE (Divide error), interrupt number 13 is the #GP (General Protection).
When the CPU detects a condition that should rise an exception (like an access to a non present page) it performs a series of tasks.
First it pushes an error code if needed, some exceptions (like #PF and #GP) do, some (like #DE) don't.
The Section 6.15 of the Intel manual 3 lists all the exceptions with their eventual error code.
Secondly it "calls" the appropriate interrupt handler which is like a far call but with EFLAGS pushed on the stack.
int n does only the second step, it calls an interrupt but doesn't push any error code as there is no error condition in the hardware in the first place (and because int n was there before the concept of error codes).
So it can be used to emulate exceptions, the software has to eventually push an appropriate error code.
When you see int n in the code, it is never an exception. It is an interrupt, that eventually is used to steer the control flow into a particular OS exception handler.
Trivia: int3 (with no space) is special because it is encoded as CC which is only one byte (normal int n is CD imm8). This is useful for debugging, since the debugger can put it anywhere in the code segment.
into only generates the #OF exception if OF = 1.
Related
Are there any CPU-state bits indicating being in an exception/interrupt handler in ARM Cortex-A processors (like e.g. IPSR reister in ARM Cortex-M CPUs)? In other words, can we tell whether the main thread or exception handler is currently executed based only on the CPU registers' state?
The CPSR mode field indicates what mode the processor is currently executing in. You cannot act on it directly, you have to move it into a gpr to examine it.
CUDA document is not clear on how memory data changes after CUDA applications throws an exception.
For example, a kernel launch(dynamic) encountered an exception (e.g. Warp Out-of-range Address), current kernel launch will be stopped. After this point, will data (e.g. __device__ variables) on device still kept or they are removed along with the exceptions?
A concrete example would be like this:
CPU launches a kernel
The kernel updates the value of __device__ variableA to be 5 and then crashes
CPU memcpy the value of variableA from device to host, what is the value the CPU gets in this case, 5 or something else?
Can someone show the rationale behind this?
The behavior is undefined in the event of a CUDA error which corrupts the CUDA context.
This type of error is evident because it is "sticky", meaning once it occurs, every single CUDA API call will return that error, until the context is destroyed.
Non-sticky errors are cleared automatically after they are returned by a cuda API call (with the exception of cudaPeekAtLastError). Any "crashed kernel" type error (invalid access, unspecified launch failure, etc.) will be a sticky error. In your example, step 3 would (always) return an API error on the result of the cudaMemcpy call to transfer variableA from device to host, so the results of the cudaMemcpy operation are undefined and unreliable -- it is as if the cudaMemcpy operation also failed in some unspecified way.
Since the behavior of a corrupted CUDA context is undefined, there is no definition for the contents of any allocations, or in general the state of the machine after such an error.
An example of a non-sticky error might be an attempt to cudaMalloc more data than is available in device memory. Such an operation will return an out-of-memory error, but that error will be cleared after being returned, and subsequent (valid) cuda API calls can complete successfully, without returning an error. A non-sticky error does not corrupt the CUDA context, and the behavior of the cuda context is exactly the same as if the invalid operation had never been requested.
This distinction between sticky and non-sticky error is called out in many of the documented error code descriptions, for example:
synchronous, non-sticky, non-cuda-context-corrupting:
cudaErrorMemoryAllocation = 2
The API call failed because it was unable to allocate enough memory to perform the requested operation.
asynchronous, sticky, cuda-context-corrupting:
cudaErrorMisalignedAddress = 74
The device encountered a load or store instruction on a memory address which is not aligned. The context cannot be used, so it must be destroyed (and a new one should be created). All existing device memory allocations from this context are invalid and must be reconstructed if the program is to continue using CUDA.
Note that cudaDeviceReset() by itself is insufficient to restore a GPU to proper functional behavior. In order to accomplish that, the "owning" process must also terminate. See here.
My understanding : An interrupt (hardware interrupt) occurs asynchronously generally been caused by an external event directly interrupting the CPU. The CPU will then vector to the particular ISR to handle the interrupt.
Obviously an ISR cannot have a return value or have parameters passed because the event happen at anytime at any point of execution in our code.
With exceptions however, my understanding is that this is a software interrupt which is caused by a special instruction pd within the software.
I've heard that exceptions are handled in a similar fashion to handling an ISR. Can an exception handler in that case behave differently to an ISR , by taking arguments from the code and return a value, because we know where we were in our code when it was executed?
Thanks in advance
A hardware exception is not a software interrupt, you do not explicitly call it - it occurs on some hardware detectable error such as:
invalid address
invalid instruction
invalid alignment
divide-by-zero
You can of course write code to deliberately cause any of these and therefore use them as software interrupts, but then you may loose the utility of them as genuine error traps. Exceptions are in some cases used for this purpose - for example in a processor without an FPU on an architecture where and FPU is an option, the invalid instruction handler can be used to implement software emulation of an FPU so that the compiler does not need to generate different code for FPU and non-FPU variants. Similarly an invalid address exception can invoke a memory manager to implement a virtual memory swap-file (on devices with an MMU).
A software interrupt is explicitly called by an SWI instruction. It's benefit over a straightforward function call is that the application does not need to know the location of the handler - that is determined by the vector table, and is often used to make operating system or BIOS calls in simple operating systems can dynamically load code, but that do not support dynamic-linking (MS-DOS for example worked in this way).
What hardware interrupts, software interrupts and exceptions all have in common is that they execute in different processor context that normal code - typically switching to an independent stack and automatically pushing registers (or using an alternate register bank). They all operate via a vector table, and you cannot pass or return parameters via formal function parameter passing and return. With SWI and forced-exceptions, it is possible to load values into specific registers or memory locations known to the handler.
The above are general principles - the precise details will vary between different architectures. Consult technical reference for the specific device used.
The term "exception" can mean completely different things.
There are "software exceptions" in the form of exception handling, as a high-level language feature in languages like C++. An "exception handler" in this context would be something like a try { } catch block.
And there are "hardware exceptions" which is a term used by some CPU cores like PowerPC. These are a form of critical interrupts corresponding to an error state. An "exception handler" in this context would be similar to an interrupt vector table, although when a hardware exception occurs, there is usually nothing the software can do about it.
Hardware exceptions take no arguments and return no data, just like interrupts. Architectures like PowerPC separate hardware exceptions from hardware interrupts, where the former are various error states, and the later are interrupts from the application-specific hardware.
It isn't all that meaningful for a hardware exception to communicate with the software, as they would be generated from critical failures like wrong OP code executed, CPU clock gone bad, runaway code etc etc. That is, the execution environment has been compromised so there is nothing meaningful that the software executing in that environment can possibly do.
I've been studying assembly lately and i can't seem to understand how the exceptions work exactly. More specific, i get the message Exception 6 occurred and ignored. Can someone please explain what exactly does this mean? I am using qtspim.
Exceptions may be caused by hardware or software. An exception is like an unscheduled function call that jumps to a new address.
The program may encounter an error condition such as
an undefined instruction. The program then jumps to code in the operating system (OS), which may choose to terminate the program. Other causes of exceptions are division by zero, attempts to read some nonexistent memory, hardware malfunctions, debugger breakpoints, and arithmetic overflow.
The processor records the cause of an exception and the value of the PC
at the time the exception occurs. It then jumps to the exception handler function. The exception handler is code (usually in the OS) that examines the
cause of the exception and responds appropriately, It then returns to the program that
was executing before the exception took place.
In MIPS, the exception handler is always located at 0x80000180. When an exception occurs, the processor always jumps to this instruction address, regardless of the cause.
The MIPS architecture uses a special-purpose register, called the Cause
register, to record the cause of the exception.
MIPS uses another special-purpose register called the Exception
Program Counter (EPC) to store the value of the PC at the time an
exception takes place. The processor returns to the address in EPC after
handling the exception. This is analogous to using $ra to store the old
value of the PC during a jal instruction.
CUDA document is not clear on how memory data changes after CUDA applications throws an exception.
For example, a kernel launch(dynamic) encountered an exception (e.g. Warp Out-of-range Address), current kernel launch will be stopped. After this point, will data (e.g. __device__ variables) on device still kept or they are removed along with the exceptions?
A concrete example would be like this:
CPU launches a kernel
The kernel updates the value of __device__ variableA to be 5 and then crashes
CPU memcpy the value of variableA from device to host, what is the value the CPU gets in this case, 5 or something else?
Can someone show the rationale behind this?
The behavior is undefined in the event of a CUDA error which corrupts the CUDA context.
This type of error is evident because it is "sticky", meaning once it occurs, every single CUDA API call will return that error, until the context is destroyed.
Non-sticky errors are cleared automatically after they are returned by a cuda API call (with the exception of cudaPeekAtLastError). Any "crashed kernel" type error (invalid access, unspecified launch failure, etc.) will be a sticky error. In your example, step 3 would (always) return an API error on the result of the cudaMemcpy call to transfer variableA from device to host, so the results of the cudaMemcpy operation are undefined and unreliable -- it is as if the cudaMemcpy operation also failed in some unspecified way.
Since the behavior of a corrupted CUDA context is undefined, there is no definition for the contents of any allocations, or in general the state of the machine after such an error.
An example of a non-sticky error might be an attempt to cudaMalloc more data than is available in device memory. Such an operation will return an out-of-memory error, but that error will be cleared after being returned, and subsequent (valid) cuda API calls can complete successfully, without returning an error. A non-sticky error does not corrupt the CUDA context, and the behavior of the cuda context is exactly the same as if the invalid operation had never been requested.
This distinction between sticky and non-sticky error is called out in many of the documented error code descriptions, for example:
synchronous, non-sticky, non-cuda-context-corrupting:
cudaErrorMemoryAllocation = 2
The API call failed because it was unable to allocate enough memory to perform the requested operation.
asynchronous, sticky, cuda-context-corrupting:
cudaErrorMisalignedAddress = 74
The device encountered a load or store instruction on a memory address which is not aligned. The context cannot be used, so it must be destroyed (and a new one should be created). All existing device memory allocations from this context are invalid and must be reconstructed if the program is to continue using CUDA.
Note that cudaDeviceReset() by itself is insufficient to restore a GPU to proper functional behavior. In order to accomplish that, the "owning" process must also terminate. See here.