Understanding hardware interrupts and exceptions at processor and hardware level

After a lot of reading about interrupt handling etc., I still can't figure out the full process of interrupt handling from the very beginning.
For example:
A division by zero.
The CPU fetches the instruction to divide a number by zero and sends it to the ALU.
Assume the ALU starts the division, or runs some checks before starting it.
1. How is the exception signaled to the CPU?
How does the CPU know which exception has occurred from only a one-bit signal? Is there a register that it reads after it gets interrupted to find out?
2. How does my application catch the exception?
Do I need to write some function to catch a specific signal, or something else? And when I write an exception handling routine like
Try {}
Catch {}
and an exception occurs, how can I know which exception was thrown and handle it well?
The most important part that bugs me: when, for example, an interrupt is signaled from the keyboard to the PIC, the PIC in turn signals to the CPU that an interrupt occurred by changing the INT wire.
But how does the CPU know which device needs to be served?
What does the CPU do when its INTR pin is asserted?
Does it have a routine that checks some register holding the number of the interrupt (set by the PIC when it raises the INT wire)?
Please don't ban the post, it's really important for me to understand this topic. I have read and researched for a couple of weeks but cannot connect the dots in my head.
Thanks.

There are typically several things associated with interrupts other than just a pin. On more recent microcontrollers there is normally an interrupt vector table in memory that holds the address of each interrupt handler, and a register whose flags signal the interrupt events.
When an event that is handled by an interrupt occurs, a specific flag is set. Depending on priorities and the current state of the CPU, the time until the context switch may vary; for example, a low-priority interrupt flagged during a higher-priority interrupt has to wait until the high-priority interrupt is finished. If nesting is possible, higher-priority interrupts may interrupt lower-priority ones.
In the particular case of exceptions like dividing by 0, which would indeed be detected by the ALU, the CPU may or may not offer a dedicated interrupt that is invoked for such events. For other types of exceptions an interrupt might not be available, and the CPU would just act accordingly, for example by rebooting.
In conclusion, interrupt events occur in the following manner:
The interrupt event is flagged and the corresponding flag in the register is set.
When the time comes, the CPU switches context to the interrupt handler function.
At the end of the handler the interrupt flag is cleared and the CPU is ready to flag the interrupt again when the next event comes.
How interrupts that arrive at the same time, or that have different priorities, are arbitrated varies between hardware.
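To make the three steps above concrete, here is a minimal Python sketch of a flag register plus a vector table. It is a toy model, not any real microcontroller: the class name `InterruptController`, the line numbers, and the rule "lower line number means higher priority" are assumptions made only for illustration.

```python
class InterruptController:
    """Toy model: a pending-flag register plus a vector table of handlers."""

    def __init__(self):
        self.pending = 0     # one flag bit per interrupt line
        self.vectors = {}    # line number -> handler function
        self.log = []

    def register(self, line, handler):
        self.vectors[line] = handler

    def raise_irq(self, line):
        # Step 1: the event sets the corresponding flag bit.
        self.pending |= 1 << line

    def service_pending(self):
        # Assumption of this sketch: lower line number = higher priority.
        while self.pending:
            lowest_set_bit = self.pending & -self.pending
            line = lowest_set_bit.bit_length() - 1
            # Step 2: "switch context" to the handler found via the vector table.
            self.vectors[line](self)
            # Step 3: clear the flag so the next event can set it again.
            self.pending &= ~(1 << line)


ic = InterruptController()
ic.register(0, lambda c: c.log.append("timer"))
ic.register(3, lambda c: c.log.append("uart"))
ic.raise_irq(3)          # the lower-priority event arrives first
ic.raise_irq(0)
ic.service_pending()
print(ic.log)            # ['timer', 'uart']
```

Both events are pending when servicing starts, so the higher-priority one runs first even though it was flagged later.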

It may be simplest to understand interrupts if one starts with the way they work on the Z80 in its simplest interrupt mode. That processor checks the state of a pin called /INT at a certain point during each instruction; if the pin is asserted and an "interrupt enabled" flag is set, then when it is time to fetch the next instruction the processor won't advance the program counter or read a byte from memory, but will instead clear the "interrupt enabled" flag and "pretend" that it read an "RST 38h" instruction. That instruction behaves like a single-byte "CALL 0038h" instruction, pushing the program counter and transferring control to that address.
Code at 0038h can then poll the various peripherals to see whether they need any service, use an "ei" instruction to turn the "interrupt enabled" flag back on, and perform a "ret". If no peripheral still has an immediate need for service at that point, code can then resume with whatever it was doing before the interrupt occurred. To prevent problems if the interrupt line is still asserted when the "ret" is executed, some special logic ensures that the interrupt line is ignored during that instruction (or any other instruction which immediately follows "ei"). If another peripheral has developed a need for service while the interrupt handler was running, the system will return to the original code, notice the state of /INT while it processes the first instruction after returning, and then restart the sequence with the RST 38h.
In the simple Z80 approach, there is only one kind of interrupt; any peripheral can assert /INT, and if any peripheral does so the Z80 will need to ask every peripheral whether it wants attention. In more advanced systems, it's possible to have many different interrupts, so that when a peripheral needs service, control can be dispatched to a routine which is designed to handle just that peripheral. The same general principles still apply, however: an interrupt effectively inserts a "call" instruction into whatever the processor was doing, does something to ensure that the processor will be able to service whatever needed attention without continuously interrupting that process [on the Z80, it simply disables interrupts, but systems with multiple interrupt sources can leave higher-priority sources enabled while servicing lower ones], and then returns to whatever the processor had been doing while re-enabling interrupts.
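The sequence described above can be traced with a small Python model. This is only a sketch under simplifying assumptions: every ordinary instruction is one byte, the whole handler at 0038h (poll, "ei", "ret") is collapsed into a single step, and the names `TinyZ80`, `int_line` and `int_enabled` are made up for the example.

```python
class TinyZ80:
    """Toy model of the simple interrupt mode: PC, stack and enable flag only."""

    def __init__(self):
        self.pc = 0x0100
        self.stack = []
        self.int_enabled = True   # the "interrupt enabled" flag
        self.int_line = False     # True while a peripheral asserts the request pin
        self.trace = []

    def step(self):
        if self.int_line and self.int_enabled:
            # Instead of fetching, pretend an "RST 38h" was read:
            # disable interrupts, push the return address, jump to 0038h.
            self.int_enabled = False
            self.stack.append(self.pc)
            self.pc = 0x0038
            self.trace.append("RST 38h")
        elif self.pc == 0x0038:
            # Handler collapsed into one step: servicing the peripheral
            # releases the line, then "ei" and "ret".
            self.int_line = False
            self.int_enabled = True
            self.pc = self.stack.pop()
            self.trace.append("handler+ret")
        else:
            self.pc += 1          # an ordinary one-byte instruction
            self.trace.append("main")


cpu = TinyZ80()
cpu.step()                # main program runs
cpu.int_line = True       # a peripheral asks for service
cpu.step()                # the CPU "inserts" the call
cpu.step()                # handler runs and returns
cpu.step()                # main program resumes where it left off
print(cpu.trace)          # ['main', 'RST 38h', 'handler+ret', 'main']
print(hex(cpu.pc))        # 0x102
```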

Related

Are there any CPU-state bits indicating being in an exception/interrupt handler in x86 and x86-64?

Are there any CPU-state bits indicating being in an exception/interrupt handler in x86 and x86-64? In other words, can we tell whether the main thread or exception handler is currently executed based only on the CPU registers' state?

No, there's no bit in the CPU itself (e.g. in a control register) that means "we're in an exception or interrupt handler".
But there is hidden state indicating that you're in an NMI (Non-Maskable Interrupt) handler. Since you can't block them by disabling interrupts, and unblockable arbitrary nesting of NMIs would be inconvenient, another NMI won't get delivered until you run an iret. That is true of any iret: if an exception (like #DE, divide by 0) happens during an NMI handler and that exception handler returns with iret, NMIs are unblocked again even if you're not done handling the NMI. See "The x86 NMI iret problem" on LWN.
For normal interrupts, you can disable interrupts (cli) if you don't want another interrupt to be delivered while this one is being handled.
However, the interrupt controller (logically outside the CPU core, but actually part of modern CPUs) may need to be told when you're done handling an external interrupt. (Not a software-interrupt or exception). https://wiki.osdev.org/IDT_problems#I_can_only_receive_one_IRQ shows the outb instructions needed to keep the legacy PIC happy. (I don't know if this applies to more modern ways of doing interrupts, like MSI-X message-signalled interrupts.
That part of the OSdev wiki page might be specific to toy OSes that let the BIOS emulate legacy IBM-PC stuff.) But either way, that's only for external interrupts like the PS/2 keyboard controller, hard drive DMA complete, or whatever (not exceptions), so it's unrelated to your "Are Linux system calls executed inside an exception handler?" question.
The lack of exception-state means there's no special instruction you have to run to "acknowledge" an exception before calling schedule() from what was an interrupt handler. All you have to do is make sure interrupts are enabled or not when they should or shouldn't be. (sti / cli, or pushf / popf to save/restore the old interrupt state.) And of course that your software data structures remain consistent and appropriate for what you're doing. But there isn't anything you have to do specifically to keep the CPU happy.
It's not like with user-space where a signal handler should tell the OS it's done instead of just jumping somewhere and running indefinitely. (In Linux, a signal handler can modify the main-thread program-counter so sigreturn(2) resumes execution somewhere other than where you were when it was delivered.) If POSIX or Linux signals were the (mental) model you were wondering about for interrupts/exceptions, no, it's not like that.
There is an interrupt-priority mechanism (CR8 in x86-64, or the LAPIC TPR (Task Priority Register)), but it does not automatically get set when the CPU delivers an interrupt. You can set it once (e.g. if you have a lot of high-priority interrupts to process on this core) and it persists across interrupts. (How is CR8 register used to prioritize interrupts in an x86-64 CPU?).
It's just a filter on what interrupt-numbers can get delivered to this core when interrupts are enabled (sti, IF=1 bit in RFLAGS). Apparently Windows makes some use of it, or did back in 2007, but Linux doesn't (or didn't).
It's not like you have to tell the CPU / LAPIC that you're done with this interrupt so it's ok for it to deliver another interrupt of this or lower priority.
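As a rough illustration of the "filter" idea, here is a Python sketch. It assumes the usual x86 convention that an interrupt's priority class is the upper four bits of its vector number and that a vector is only deliverable while its class is strictly greater than the value in CR8. The function name `deliverable` is made up, and the sketch deliberately ignores everything else the local APIC tracks.

```python
def deliverable(vector, cr8, interrupts_enabled):
    """Return True if this toy model would let `vector` through to the core."""
    priority_class = vector >> 4   # upper four bits of the 8-bit vector
    return interrupts_enabled and priority_class > cr8


print(deliverable(0x31, cr8=3, interrupts_enabled=True))    # False: class 3 is not above 3
print(deliverable(0x41, cr8=3, interrupts_enabled=True))    # True: class 4 is above 3
print(deliverable(0x41, cr8=0, interrupts_enabled=False))   # False: IF=0 blocks it anyway
```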

How are pending exceptions managed by the RISC-V specification?

I'm working with the RISC-V specification and have a problem with pending interrupts/exceptions. I'm reading version 1.10 of volume II, published on May 7, 2017.
In section 3.1.14, describing the registers mip and mie, it is said that:
Multiple simultaneous interrupts and traps at the same privilege level are handled in the following decreasing priority order: external interrupts, software interrupts, timer interrupts, then finally any synchronous traps.
Up until that point I thought that exceptions, e.g. a misaligned instruction fetch exception on a JAL/JALR instruction, would be handled immediately by a trap because
a) there is no way to continue executing your stream of instructions and
b) there is no description of how an exception could be pending, i.e. there are no concepts described by the specification that could manage state for exceptions (for example registers like mip but for exceptions).
However, the paragraph cited above indicates something different.
My questions are:
Are there pending exceptions in RISC-V?
If yes, how is it possible that the exception still can be handled after an interrupt was handled and isn't forgotten?

In my opinion there are pending exceptions in RISC-V, exactly for the reason you stated. It is a matter of semantics: if two events occur simultaneously and one is deferred, it must be pending. One must cater for the possibility of an asynchronous event (interrupt) occurring simultaneously with a trap, and (by section 3.1.14) the asynchronous event has priority. Depending on the implementation, one does not necessarily need to save any state in this case: after the interrupt is handled, the instruction that triggers the trap is re-fetched and duly leads to an exception. In my view section 3.1.14 describes the serialization of asynchronous events.
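The "re-fetch" argument can be played through in a few lines of Python. This is a sketch of the reasoning only, not of any real RISC-V implementation: the function `simulate`, the instruction names, and the idea of the interrupt arriving "at" a given pc are assumptions for illustration.

```python
def simulate(program, irq_arrives_at_pc):
    """Run a toy instruction list; one interrupt arrives at the given pc."""
    pc = 0
    irq_pending = False
    irq_seen = False
    log = []
    while pc < len(program):
        if pc == irq_arrives_at_pc and not irq_seen:
            irq_pending = irq_seen = True
        if irq_pending:
            # The interrupt wins. The saved pc still points at the current
            # instruction, so it is simply fetched again after the handler.
            log.append(f"interrupt taken, mepc={pc}")
            irq_pending = False
            continue
        if program[pc] == "faulting":
            # Nothing had to remember the exception: re-executing the
            # instruction raises it again.
            log.append(f"exception taken, mepc={pc}")
            break                  # the sketch stops at the trap
        log.append(f"executed {program[pc]}")
        pc += 1
    return log


for line in simulate(["addi", "faulting", "addi"], irq_arrives_at_pc=1):
    print(line)
# executed addi
# interrupt taken, mepc=1
# exception taken, mepc=1
```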

Examples of a Software Interrupt and Exception at application level

My understanding of both is slightly unclear. Many people on the internet say they are the same. There are a few questions similar to mine, however none of them give a good real-life example at the software level.
Would it be possible for someone to give me a clear example of both which will help me understand the differences between one another?
For example, is a division by zero a software interrupt? Or an exception?

Interrupts and exceptions have the same method of dispatch (usually through the system interrupt vector). However, interrupts and exceptions are triggered differently.
An exception occurs through the execution of the instruction stream. Thus, exceptions occur at predictable points in an application.
Interrupts occur as the result of events external to the execution stream.
Division by zero occurs as the result of the instruction stream, which makes it an exception.
Some operating systems are interrupt-based (e.g., Windows and VMS). They allow the application to be interrupted in user mode (or other modes) for various reasons.
For example, in both of those operating systems you can queue an I/O operation and then have the application be interrupted when the I/O completes (a software interrupt triggered by the operating system rather than by the hardware).
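At application level the "predictable point" property is easy to see. The Python sketch below is only an illustration: in Python the interpreter itself checks the divisor and raises `ZeroDivisionError`, so no hardware exception is involved, and the helper name `classify` is made up. It does show how a handler can tell which exception was thrown, by type.

```python
def classify(action):
    """Run `action` and report which kind of exception, if any, it raised."""
    try:
        action()
    except ZeroDivisionError:
        # Raised at exactly the statement that divides, every time it runs.
        return "division by zero"
    except (IndexError, KeyError) as exc:
        # The exception object carries its own type and message.
        return f"lookup failed: {type(exc).__name__}"
    return "ok"


print(classify(lambda: 1 // 0))     # division by zero
print(classify(lambda: [][0]))      # lookup failed: IndexError
print(classify(lambda: None))       # ok
```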

OS development: How to avoid an infinite loop after an exception routine

For some months I've been working on a "home-made" operating system.
Currently, it boots and goes into 32-bit protected mode.
I've loaded the interrupt table, but haven't set up paging (yet).
Now while writing my exception routines I've noticed that when an instruction throws an exception, the exception routine is executed, but then the CPU jumps back to the instruction which threw the exception! This does not apply to every exception (for example, a div by zero exception will jump back to the instruction AFTER the division instruction), but let's consider the following general protection exception:
MOV EAX, 0x8
MOV CS, EAX
My routine is simple: it calls a function that displays a red error message.
The result: MOV CS, EAX fails -> My error message is displayed -> CPU jumps back to MOV CS -> infinite loop spamming the error message.
I've talked about this issue with a teacher in operating systems and unix security.
He told me he knows Linux has a way around it, but he doesn't know which one.
The naive solution would be to parse the throwing instruction from within the routine, in order to get the length of that instruction.
That solution is pretty complex, and I feel a bit uncomfortable adding a call to a relatively heavy function in every affected exception routine...
Therefore, I was wondering if there is another way around the problem. Maybe there's a "magic" register that contains a bit that can change this behaviour?
--
Thank you very much in advance for any suggestion/information.
--
EDIT: It seems many people wonder why I want to skip over the problematic instruction and resume normal execution.
I have two reasons for this:
First of all, killing a process would be a possible solution, but not a clean one. That's not how it's done in Linux, for example, where (AFAIK) the kernel sends a signal (I think SIGSEGV) but does not immediately break execution. It makes sense, since the application can block or ignore the signal and resume its own execution. It's a very elegant way to tell the application it did something wrong IMO.
Another reason: what if the kernel itself performs an illegal operation? Could be due to a bug, but could also be due to a kernel extension. As I've stated in a comment: what should I do in that case? Shall I just kill the kernel and display a nice blue screen with a smiley?
That's why I would like to be able to jump over the instruction. "Guessing" the instruction size is obviously not an option, and parsing the instruction seems fairly complex (not that I mind implementing such a routine, but I need to be sure there is no better way).

Different exceptions have different causes. Some exceptions are normal, and the exception only tells the kernel what it needs to do before allowing the software to continue running. Examples of this include a page fault telling the kernel it needs to load data from swap space, an undefined instruction exception telling the kernel it needs to emulate an instruction that the CPU doesn't support, or a debug/breakpoint exception telling the kernel it needs to notify a debugger. For these it's normal for the kernel to fix things up and silently continue.
Some exceptions indicate abnormal conditions (e.g. that the software crashed). The only sane way of handling these types of exceptions is to stop running the software. You may save information (e.g. core dump) or display information (e.g. "blue screen of death") to help with debugging, but in the end the software stops (either the process is terminated, or the kernel goes into a "do nothing until user resets computer" state).
Ignoring abnormal conditions just makes it harder for people to figure out what went wrong. For example, imagine instructions to go to the toilet:
1. enter bathroom
2. remove pants
3. sit
4. start generating output
Now imagine that step 2 fails because you're wearing shorts (a "can't find pants" exception). Do you want to stop at that point (with a nice easy to understand error message or something), or ignore that step and attempt to figure out what went wrong later on, after all the useful diagnostic information has gone?

If I understand correctly, you want to skip the instruction that caused the exception (e.g. mov cs, eax) and continue executing the program at the next instruction.
Why would you want to do this? Normally, shouldn't the rest of the program depend on the effects of that instruction being successfully executed?
Generally speaking, there are three approaches to exception handling:
Treat the exception as an unrepairable condition and kill the process. For example, division by zero is usually handled this way.
Repair the environment and then execute the instruction again. For example, page faults are sometimes handled this way.
Emulate the instruction using software and skip over it in the instruction stream. For example, complicated arithmetic instructions are sometimes handled this way.
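The three approaches can be put side by side in a toy interpreter. This Python sketch is purely illustrative: the instruction names ("load", "square", "div"), the page set, and the use of Python exceptions to stand in for CPU faults are all assumptions of the example.

```python
def run(program):
    """Execute a list of (op, arg) pairs; return (accumulator, log)."""
    pc, acc = 0, 0
    mapped_pages = set()
    log = []
    while pc < len(program):
        op, arg = program[pc]
        try:
            if op == "add":
                acc += arg
            elif op == "load":
                if arg not in mapped_pages:
                    raise LookupError(arg)       # stands in for a page fault
                acc += 1
            elif op == "square":
                raise NotImplementedError(op)    # stands in for an unsupported instruction
            elif op == "div":
                acc //= arg                      # ZeroDivisionError when arg == 0
        except LookupError:
            # Approach 2: repair the environment, then run the SAME instruction again.
            mapped_pages.add(arg)
            log.append(f"pc={pc}: mapped page {arg}, retrying")
            continue
        except NotImplementedError:
            # Approach 3: emulate the effect in software, then skip the instruction.
            acc *= acc
            log.append(f"pc={pc}: emulated {op}, skipping")
        except ZeroDivisionError:
            # Approach 1: unrepairable, so stop the "process".
            log.append(f"pc={pc}: fatal, process killed")
            return None, log
        pc += 1
    return acc, log


print(run([("add", 3), ("load", 7), ("square", None), ("div", 2)]))
# (8, ['pc=1: mapped page 7, retrying', 'pc=2: emulated square, skipping'])
print(run([("add", 3), ("div", 0)]))
# (None, ['pc=1: fatal, process killed'])
```

Note that only the emulation path advances past the faulting instruction, and it does so because the handler already knows what the instruction was supposed to do.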

What you're seeing is the characteristic of the General Protection Exception. The Intel System Programming Guide clearly states that (6.15 Exception and Interrupt Reference / Interrupt 13 - General Protection Exception (#GP)) :
Saved Instruction Pointer
The saved contents of CS and EIP registers point to the instruction that generated the exception.
Therefore, you need to write an exception handler that will skip over that instruction (which would be kind of weird), or just simply kill the offending process with "General Protection Exception at $SAVED_EIP" or a similar message.

I can imagine a few situations in which one would want to respond to a GPF by parsing the failed instruction, emulating its operation, and then returning to the instruction after. The normal pattern would be to set things up so that the instruction, if retried, would succeed, but one might e.g. have some code that expects to access some hardware at addresses 0x000A0000-0x000AFFFF and wish to run it on a machine that lacks such hardware. In such a situation, one might not want to ever bank in "real" memory in that space, since every single access must be trapped and dealt with separately. I'm not sure whether there's any way to handle that without having to decode whatever instruction was trying to access that memory, although I do know that some virtual-PC programs seem to manage it pretty well.
Otherwise, I would suggest that you have, for each thread, a jump vector which is used when the system encounters a GPF. Normally that vector should point to a thread-exit routine, but code which is about to do something "suspicious" with pointers could set it to an error handler suitable for that code (the code should unset the vector when leaving the region where the error handler would have been appropriate).
I can imagine situations where one might want to emulate an instruction without executing it, and cases where one might want to transfer control to an error-handler routine, but I can't imagine any where one would want to simply skip over an instruction that would have caused a GPF.
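The per-thread jump vector suggested above might look roughly like this in Python. It is only an analogy under stated assumptions: a real kernel would redirect the saved instruction pointer rather than call a function, and the names `fault_vector`, `report_gpf` and `default_vector` are hypothetical.

```python
import threading
from contextlib import contextmanager

_state = threading.local()    # each thread gets its own vector


def default_vector(reason):
    # Normal case: the vector points at a thread-exit routine.
    raise SystemExit(f"thread exit: {reason}")


@contextmanager
def fault_vector(handler):
    """Install `handler` for the enclosed region, then restore the old vector."""
    previous = getattr(_state, "vector", default_vector)
    _state.vector = handler
    try:
        yield
    finally:
        _state.vector = previous   # unset when leaving the region


def report_gpf(reason):
    """What the fault path would do: transfer control through the vector."""
    return getattr(_state, "vector", default_vector)(reason)


with fault_vector(lambda reason: f"recovered from {reason}"):
    result = report_gpf("bad pointer")
print(result)                 # recovered from bad pointer

try:
    report_gpf("bad pointer") # outside the region the default applies
except SystemExit as exc:
    print(exc)                # thread exit: bad pointer
```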

Exceptions & Interrupts

When I was searching for a distinction between Exceptions and Interrupts,
I found the question "Interrupts and exceptions" on SO...
Some answers there were not suitable (at least for assembly level):
"Exception are software-version of an interrupt" But there exist software interrupts!!
"Interrupts are asynchronous but exceptions are synchronous" Is that right?
"Interrupts occur regularly"
"Interrupts are hardware implemented trap, exceptions are software implemented" Same as above!
I need to find out whether any of these answers are right. I would also be grateful if anyone could provide a better answer...
Thanks!

Interrupts are typically a method of signaling a change in hardware state. Peripherals are tied by an electrical signal to an interrupt controller, which prioritizes the possible signals and assigns an address vector to each. The interrupt controller forwards a detected interrupt condition to the CPU, which may or may not 'interrupt' its present execution state to process the signaled state change (depending on whether interrupts are enabled and/or whether this particular input is non-maskable). Interrupt conditions may, on some architectures, be initiated by software (on the x86, for instance, there is an int mnemonic) in addition to hardware input.
Exceptions span a greater range of implementation. In some CPU architectures such as 68K, an exception can be similar to an interrupt but is generated by some CPU state that needs to be handled. For example there are conditions such as divide by zero, illegal instruction, I/O bus timeout, etc. that generate exceptions. By handling those exceptions one can do things such as emulate instructions and virtually extend the instruction set.
Exceptions may also be a software-only concept such as in the C++ language where certain error conditions can be trapped and handled.
So in general, the statements you are trying to find the validity of may be true or false depending on the exact platform you are applying them to.

An exception, as the term is used most often, is a form of control flow in a programming language for dealing with events outside the normal logic flow of the program, so that the business logic does not drown in error handling logic. The 'handling' of the exception is context specific. It is more like a kind of GoTo for a number of use cases where that is useful.
An interrupt is a hardware-assisted 'trap' that triggers certain actions when certain events occur, such as a timer tick or a program "calling" INT 21h. A handler is registered which does something predefined.
Both may or may not be synchronous or asynchronous.
