MIPS: why is ISR surrounded with rdpgpr $sp, $sp; wrpgpr $sp, $sp instructions? - mips

I'm working with PIC32 MCUs (MIPS M4K core), and I'm trying to understand how interrupts work in MIPS. I'm armed with the book "See MIPS Run", the official MIPS reference, and Google. None of them helps me understand the following:
I have an interrupt handler declared like this:
void __ISR(_CORE_TIMER_VECTOR) my_int_handler(void)
I look at the disassembly and see that RDPGPR SP, SP is executed in the ISR prologue (it is the first instruction, actually), and a balancing WRPGPR SP, SP instruction is executed in the ISR epilogue (before writing the previously saved Status register to CP0 and executing ERET).
I see that the purpose of these instructions is to read from and write to the previous shadow register set: RDPGPR SP, SP reads $sp from the previous shadow register set and WRPGPR SP, SP writes it back. But I can't understand the reason for this. This ISR is not intended to use a shadow register set, and in the disassembly I can see that the context is saved to the stack. Still, for some reason, $sp is read from and written to the shadow $sp. Why is this?
And a related question: is there a really comprehensive resource (a book or something similar) on MIPS assembly language? "See MIPS Run" seems really good, and it's a great starting point for digging into the MIPS architecture, but it does not cover several topics well enough. A few things off the top of my head:
Very little information about EIC (external interrupt controller) mode: it has a diagram of the Cause register showing that in EIC mode we have RIPL instead of IP7-2, but nothing about how it works (say, that an interrupt is taken only if Cause->RIPL is greater than Status->IPL). There's not even an explanation of what RIPL means ("Requested Interrupt Priority Level"; well, Google helped). I understand that EIC is implementation-dependent, but the things I just mentioned are generic.
Assembly language is not covered completely enough: there is nothing about macros (the .macro and .endm directives), and I couldn't find anything about some assembler directives I've seen in existing code, such as .set mips32r2.
I can't find anything about using rdpgpr/wrpgpr in an ISR; the book covers these instructions (and shadow register sets in general) very briefly.
The official MIPS reference doesn't help much with these topics either. Is there a really good book that covers all the assembler directives, and so on?

When the MIPS core enters an ISR, it can swap the interrupted code's active register set for a new one (there can be several shadow register sets) that is specific to that interrupt priority.
Interrupt routines usually don't have a stack of their own, and the sp register in the newly switched-in shadow register set certainly has a different value than the interrupted code's sp. So the ISR copies the sp value from the just switched-out register set into its own, so that it can use the interrupted code's stack.
If you wish, you could point your ISR's sp at a previously allocated stack of its own, but that is usually not useful.
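The copy described above can be sketched with a toy C model of shadow register sets. This is only an illustration, not PIC32 code: ToyCore, the toy_* functions and the choice of two register sets are assumptions made for the example. The css/pss fields mirror the current and previous shadow set fields of the SRSCtl register.

```c
#include <stdint.h>

#define NUM_SETS 2   /* assumption: one normal set plus one shadow set */
#define REG_SP   29  /* $sp is general-purpose register 29 on MIPS */

/* Toy model of a core with shadow register sets (not real hardware). */
typedef struct {
    uint32_t gpr[NUM_SETS][32];
    int css;  /* current shadow set, like SRSCtl.CSS */
    int pss;  /* previous shadow set, like SRSCtl.PSS */
} ToyCore;

/* Interrupt entry: switch to new_set and remember the interrupted set. */
void toy_enter_interrupt(ToyCore *c, int new_set) {
    c->pss = c->css;
    c->css = new_set;
}

/* rdpgpr rd, rt: current-set rd = previous-set rt. */
void toy_rdpgpr(ToyCore *c, int rd, int rt) {
    c->gpr[c->css][rd] = c->gpr[c->pss][rt];
}

/* wrpgpr rd, rt: previous-set rd = current-set rt. */
void toy_wrpgpr(ToyCore *c, int rd, int rt) {
    c->gpr[c->pss][rd] = c->gpr[c->css][rt];
}
```

In this model, calling toy_rdpgpr(&core, REG_SP, REG_SP) right after toy_enter_interrupt gives the handler the interrupted code's stack pointer. If the interrupt does not switch sets (css equals pss), both operations copy a register onto itself, so the same prologue and epilogue are harmless for an ISR that does not use a shadow set.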

Related

Assembly Commands Are Running without Me Explicitly Calling Them [duplicate]

I hope this question isn't too stupid, because the answer may seem obvious.
As I'm doing a little research on Buffer overflows I stumble over a simple question:
After going to a new Instruction Address after a call/return/jump:
Will the CPU execute the OP Code at that address and then move one byte to the next address and execute the next OP Code and so on until the next call/return/jump is reached? Or is there something more tricky involved?
A bit boringly extended explanation (saying the same as those comments):
CPU has special purpose register instruction pointer eip, which points to the next instruction to execute.
A jmp, call, ret, etc. ends internally with something similar to:
mov eip,<next_instruction_address>.
While the CPU is processing instructions, it does increment eip by appropriate size of last executed instruction automatically (unless overridden by one of those jmp/j[condition]/call/ret/int/... instructions).
Wherever you point eip (by whatever means), the CPU will try its best to execute the content of that memory as the next instruction opcode(s), unaware of any context (where it came from or why it arrived at this new eip). Actually, this amnesia sort of happens ahead of every instruction executed. (I'm silently ignoring the modern internal x86 architecture with its various pre-execution queues, branch prediction, translation into micro-instructions, etc. :) All of those are implementation details quite hidden from the programmer, usually visible only through poor performance if you disturb that architecture much by jumping all around mindlessly.) So it's the CPU, eip, and the here and now, not much else.
Note: some context on x86 can be provided by supervising code (like the OS) defining the memory layout, i.e. marking some areas of memory as non-executable. A CPU detecting that its eip points into such an area will signal a failure and fall into a "trap" handler (usually also managed by the OS, which kills the offending process).
The call instruction saves the address of the instruction after it onto the stack. After that, it simply jumps. It doesn't explicitly tell the CPU to look for a return instruction, since that will be handled by popping (from the stack) the return address that call saved in the first place. This allows for multiple calls and returns, or to put it simply, nested calls.
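The behaviour described in these answers can be sketched as a toy interpreter in C. The opcodes, their one-byte encodings and the ToyCpu type are invented for this illustration and have nothing to do with real x86 encodings; there are also no bounds checks, so it assumes a well-formed program.

```c
#include <stddef.h>

/* Hypothetical opcodes for a toy machine (not x86 encodings). */
enum { OP_NOP = 0, OP_INC = 1, OP_CALL = 2, OP_RET = 3, OP_HALT = 4 };

typedef struct {
    size_t ip;         /* instruction pointer, plays the role of eip */
    size_t stack[16];  /* return-address stack */
    size_t sp;         /* number of entries currently on the stack */
    int acc;           /* accumulator incremented by OP_INC */
} ToyCpu;

/* Execute from address `start` until OP_HALT; return the accumulator. */
int toy_run(const unsigned char *mem, size_t start) {
    ToyCpu cpu = {0};
    cpu.ip = start;
    for (;;) {
        unsigned char op = mem[cpu.ip];  /* fetch whatever ip points at */
        switch (op) {
        case OP_NOP:
            cpu.ip += 1;                 /* advance by instruction size */
            break;
        case OP_INC:
            cpu.acc += 1;
            cpu.ip += 1;
            break;
        case OP_CALL:                    /* two bytes: opcode, target */
            cpu.stack[cpu.sp++] = cpu.ip + 2;  /* save return address */
            cpu.ip = mem[cpu.ip + 1];          /* then simply jump */
            break;
        case OP_RET:
            cpu.ip = cpu.stack[--cpu.sp];      /* pop the saved address */
            break;
        default:                         /* OP_HALT or unknown opcode */
            return cpu.acc;
        }
    }
}
```

The loop has no memory of how ip got its value: starting the same program at a different address simply executes whatever bytes are found there.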
While the CPU is processing instructions, it does increment eip by
appropriate size of last executed instruction automatically (unless
overridden by one of those jmp/j[condition]/call/ret/int/... instructions).
That's what I wanted to know.
I'm well aware that there's more stuff around it (NX bit, pipelining, etc.).
Thanks, everybody, for your replies.

Function call and context save to stack

I am very interested in real-time operating systems for microcontrollers, so I am doing in-depth research on the topic. At a high level I understand all the general mechanisms of an OS.
In order to learn it better, I decided to write a very simple kernel that does nothing but the context switch. This raised a lot of additional, practical questions for me. I was able to cope with many of them, but I am still in doubt about the main thing: saving the context (all the CPU registers and the stack pointer) of the current task and restoring the context of a new task.
In general, an OS uses some function (let's say OSContextSwitch()) that performs all the actions for the context switch. The body of OSContextSwitch() is mainly written in assembly (inline assembly in the body of a C function). But when OSContextSwitch() is called by the scheduler, as far as I know, some of the CPU registers are preserved on the stack by the compiler (actually by the code generated by the compiler) on the function call.
Finally, the question is: how do I know which CPU registers are already preserved on the stack by the compiler, so that I can preserve the rest? If I preserve all the registers regardless of the compiler's behaviour, obviously some stack space will be wasted.
Such function should be written either as pure assembly (so NOT an assembly block inside a C function) or as a "naked" C function with nothing more than assembly block. Doing anything in between is a straight road to messing things up.
As for the registers which you should save - generally you need to know the ABI for your platform, which says that some registers are saved by caller and some should be saved by callee - you generally need to save/restore only the ones which are normally saved by callee. If you save all of them nothing wrong will happen - your code will only be slightly slower and use a little more RAM, but this will be a good place to start.
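The caller-saved versus callee-saved split can be illustrated with a small C simulation. It is not a real context switch (that has to be written in assembly, as noted above); the register counts and the ToyRegs, ToyTaskContext and toy_context_switch names are assumptions made for this sketch, and a real ABI document defines the actual sets.

```c
#include <stdint.h>
#include <string.h>

#define N_CALLER_SAVED 4  /* hypothetical: scratch registers */
#define N_CALLEE_SAVED 8  /* hypothetical: registers a callee must preserve */

/* Simulated CPU register file. */
typedef struct {
    uint32_t caller_saved[N_CALLER_SAVED];
    uint32_t callee_saved[N_CALLEE_SAVED];
    uint32_t sp;
} ToyRegs;

/* What the kernel keeps per task: only callee-saved registers and sp. */
typedef struct {
    uint32_t callee_saved[N_CALLEE_SAVED];
    uint32_t sp;
} ToyTaskContext;

/* Save the running task's context into `from`, then load `to`.
 * Caller-saved registers are deliberately ignored: the code that called
 * the switch function already treats them as clobbered by the call. */
void toy_context_switch(ToyRegs *cpu, ToyTaskContext *from,
                        const ToyTaskContext *to) {
    memcpy(from->callee_saved, cpu->callee_saved, sizeof from->callee_saved);
    from->sp = cpu->sp;
    memcpy(cpu->callee_saved, to->callee_saved, sizeof cpu->callee_saved);
    cpu->sp = to->sp;
}
```

Saving the caller-saved registers as well would only make ToyTaskContext larger and the copy slower, which matches the "slightly slower and a little more RAM" trade-off mentioned in the answer.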
Here's a typical context switch implementation for ARM Cortex-M microcontrollers - https://github.com/DISTORTEC/distortos/blob/master/source/architecture/ARM/ARMv6-M-ARMv7-M/ARMv6-M-ARMv7-M-PendSV_Handler.cpp#L76

Query about MIPS R3051 pipeline behaviour (MIPS-I architecture)

I am currently implementing a MIPS R3051 in software as part of my university project.
I notice in the programmers manual from IDT it specifies that computational instructions can access the results of other computational instructions ahead of them in the pipeline at their RD stage, even though the ahead instruction has not yet committed its results to the relevant register in the WB stage. This is done via "special logic within the execution engine" to prevent a stall being necessary.
My query is does this also apply to non-computational instructions (like a jump-type instruction for example)?
An example: if an ADD instruction calculates a value at its ALU stage destined for r1, with a JR [r1] instruction behind it in the pipeline at RD, will the JR instruction get:
(a) the old contents of r1
or
(b) will this "special logic" allow the new value of r1 to be forwarded to it? or
(c) will the pipeline stall until r1 has been committed properly at WB?
Apologies if this is asked elsewhere (I have not spotted it). Many thanks.
Regards,
Phil
The key here is to keep well in mind that this "special logic" is only an optimization: it makes things faster, here by bypassing something to avoid a stall, but it must still ensure that the result is unchanged. Otherwise it would be impossible, or at least too difficult, to program this hardware.
So, to answer your question, you will see either case (b) or (c) but never case (a).
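For a software model of the pipeline, case (b) can be sketched as a bypass check made when an instruction reads its operands. This is a minimal illustration under assumed names (InFlight, read_operand); it is not taken from the IDT manual and does not claim to match the R3051's actual forwarding paths.

```c
#include <stdint.h>

/* Result of an older instruction that has not reached WB yet. */
typedef struct {
    int valid;       /* nonzero if such a result is in flight */
    int dest;        /* its destination register number */
    uint32_t value;  /* the value computed at the ALU stage */
} InFlight;

/* Operand read at the RD stage: prefer the in-flight result over the
 * stale register file entry, so the reader never sees the old value. */
uint32_t read_operand(const uint32_t regs[32], const InFlight *older,
                      int src) {
    if (src == 0) {
        return 0;  /* r0 always reads as zero on MIPS */
    }
    if (older->valid && older->dest == src) {
        return older->value;  /* forwarded: case (b) */
    }
    return regs[src];
}
```

An emulator that executes instructions one at a time gets the same architectural result for free, because each instruction commits before the next one reads its operands.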

Instruction Execution in MIPS

This is an abstract view of the implementation of the MIPS subset, showing the major functional units and the major connections between them.
Why do we need to add the result of (PC+4) to the instruction address?
I know that the PC (Program Counter) is a register in a computer processor that contains the address (location) of the instruction being executed at the current time, but I don't understand why there is a second adder in this picture.
Some of the operations that can be performed by the CPU are 'jumps'.
If your operation is a jump, from the second block you get either the address of the new instruction or the length of the jump you have to make.
It's not the instruction address, the output of the instruction memory is an instruction itself.
They've obviously hidden most of the components (there's NO control circuitry). What they probably meant is the data path for branches, though they really should have put at least the link to the ALU output in there. Even so, it would be better to explicitly show the instruction decode, the sign extension and the shift left. So it's really inaccurate, but I don't see what else they could mean.
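For reference, the branch data path that the second adder implements can be written as a small C function: the 16-bit immediate from the instruction is sign-extended, shifted left by two, and added to PC+4. The function name branch_target is just a label for this sketch.

```c
#include <stdint.h>

/* MIPS conditional branch target: (PC + 4) + (sign_extend(imm16) << 2).
 * The first adder produces PC + 4; the second adds the scaled offset. */
uint32_t branch_target(uint32_t pc, uint16_t imm16) {
    /* Reinterpret the 16-bit field as signed, then scale by 4 because
     * instructions are word-aligned (multiplying avoids shifting a
     * negative value). */
    int32_t offset = (int32_t)(int16_t)imm16 * 4;
    return pc + 4u + (uint32_t)offset;
}
```

With an immediate of 0xFFFF (that is, -1) the offset is -4, so the target is the branch instruction itself; with an immediate of 3 the target is 16 bytes past the branch.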

How is the stack and link register used in an interrupt procedure? (ARM Processor)

The ARM website says that the link register stores the return information for subroutines, function calls, and exceptions (such as interrupts), so what is the stack used for?
The answers to this similar question say that the stack is used to store the return address, and to "push" on local variables that will need to be put back on the core registers after the exception.
But this is what the link register is for, so why is the stack needed? What is the difference between the two, and how are they both used?
Okay, I think I understand your question.
So you have some code in a function that calls another function:
main ()
{
int a,b;
a = myfun0();
b=a+7;
...
So when we call myfun0(), the link register basically gets us back so we can do the b = a+7;. Of course all of this gets compiled to assembly, optimized and so on, but this will suffice to understand that the link register is used to return to just after the call.
But what if:
myfun0 ()
{
return(myfun1()+3);
}
When main calls myfun0(), the link register points at some code in main() that it needs to return to. Then myfun0() calls myfun1(), which needs to come back to myfun0() to do some more math before returning to main(), so when myfun0() calls myfun1() the link register is set to come back and add 3. The problem is that when we set the link register to return to myfun0(), we trash the address in main() we needed to return to. To prevent that, IF a function is going to call another function, then the link register must be put on the stack, along with any local variables that can't all live in the disposable registers. So now main calls myfun0(); myfun0() is going to call a function (myfun1()), so a copy of the link register (the return address into main()) is saved on the stack. myfun0() calls myfun1(); myfun1() follows the same rule (if calling something else, put lr on the stack, otherwise you don't have to); myfun1() uses lr to return to myfun0(); myfun0() restores lr from the stack so it can return to main. Follow that simple rule per function and you can't go wrong.
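The rule can be traced with a tiny C simulation of the sequence main -> myfun0 -> myfun1. Everything here is made up for illustration: the addresses are arbitrary labels, and ToyMachine and toy_nested_call are hypothetical names; the comments show the kind of ARM instruction each step stands for.

```c
/* Arbitrary labels standing in for return addresses. */
enum { ADDR_IN_MAIN = 100, ADDR_IN_MYFUN0 = 200 };

typedef struct {
    unsigned lr;        /* link register */
    unsigned stack[8];  /* toy stack holding saved lr values */
    unsigned depth;     /* number of entries on the toy stack */
} ToyMachine;

/* Trace the nested call and return the address that myfun0 finally
 * returns to. save_lr selects whether myfun0 follows the rule. */
unsigned toy_nested_call(int save_lr) {
    ToyMachine m = {0};

    m.lr = ADDR_IN_MAIN;                /* main:   bl myfun0 */
    if (save_lr) {
        m.stack[m.depth++] = m.lr;      /* myfun0: push {lr} */
    }
    m.lr = ADDR_IN_MYFUN0;              /* myfun0: bl myfun1 */
    /* myfun1 is a leaf function: it returns through lr to
     * ADDR_IN_MYFUN0 and leaves lr unchanged. */
    if (save_lr) {
        m.lr = m.stack[--m.depth];      /* myfun0: pop {lr} */
    }
    return m.lr;                        /* myfun0: bx lr */
}
```

With save_lr set, myfun0 returns to ADDR_IN_MAIN; without it, lr still holds ADDR_IN_MYFUN0, so myfun0 would "return" into itself and the address in main() is lost.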
Now interrupts, which I'm not sure are related, or maybe I misunderstood your question. ARM has banked registers, at least on non-Cortex-M cores. But in general, when an interrupt/exception occurs, if the exception handler needs to use any resources/registers that are in use by the foreground task, then the handler needs to preserve those on the stack in such a way that the interrupted foreground task has no idea this happened. Since interrupts can in general occur between any two instructions, you even need to go so far as to preserve the flags that were set by the instruction before the one you interrupted.
So, applying that to ARM, you have to look at which architecture you are using and see where it describes the interrupt process: which registers you have to preserve and which ones you don't, which stack pointer is used, etc. (something you have to set up well ahead of your first exception if you are using an ARM with separate interrupt and foreground stacks).
The Cortex-M is designed to do some or all of that work for you. It basically has one stack, and on an interrupt it pushes registers onto the stack for you, so you can simply have a compiled C function run as a handler, and the hardware cleans up after you (from a preserved-register perspective). Some other processor families do something like this for you as well: they might have a return-from-interrupt instruction separate from the ordinary return instruction. The former exists because on an interrupt the hardware saves the flags and the return address, whereas for a simple call you don't need the flags preserved.
The ARM is a lot more flexible than some other instruction sets. Some others may not have any instructions that let you branch to an address held in any register you want. You might be limited in which register you can use as a stack pointer, or the stack pointer itself might not be accessible as a general-purpose register. By convention the sp is r13 on the ARM; the assembler allows the pseudo-instructions push and pop, which translate into the proper ldmia r13!,{blah} and stmdb r13!,{blah}, but you could pick your own register (if you are not using a compiler that follows the convention, or if you can change an open-source compiler to use a different stack pointer register). The ARM doesn't prevent that. The magic of the link register r14 is nothing more than that a branch-link or branch-link-exchange automatically modifies r14, but the instruction set allows you to use basically any register to branch/return for normal function calls. The ARM has just enough general-purpose registers to encourage compilers to do register-based parameter passing rather than stack-only passing. Some processors lean toward stack-only parameter passing and have designed their return-address instructions to be strictly stack-based, avoiding a return register altogether, along with the rule of having to save it when nesting functions.
All of these approaches have pros and cons. In the case of ARM, register passing was desirable, and a register-based return address too, but for nested functions you have to preserve the return address at each nesting level so you don't get lost. Likewise, for interrupts you have to put things back the way you found them, as well as be able to get back to where you interrupted the foreground.