Implementing jump register control to single-cycle MIPS - mips

I am trying to add support for the jr (jump register) instruction to a single-cycle MIPS processor. In the following image, I've drawn a simple mux that allows selecting between the normal PC chain and the instruction (jr) address.
How can I know that the instruction is JR, so that I can set the mux selection to '1'? I've already done jump and jump_and_link (although the image doesn't show it, as I don't have my project on hand right now), and to control them, I just check in the main control whether the opcode is 000010 (j) or 000011 (jal) and then set the mux sel to '1'. But I think I can't do the same with jr, as the instruction layout is different.

A JR instruction has Instruction[31:26] == 0 (the SPECIAL opcode) and Instruction[5:0] == 0x08 (the JR function code). You need to look at both of these bit fields to decide that this is a JR instruction, so the Control block on your diagram needs an additional input of Instruction[5:0]. The rs field in Instruction[25:21] selects the source register for this instruction. When a JR instruction is executed, the PC needs to be assigned the value of register rs.
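As a rough illustration of that decode, here is a minimal Python sketch. The field positions follow the standard MIPS R-type layout; the function names (is_jr, next_pc) and the register list are hypothetical and only model the extra mux, not the full control unit.

```python
# Sketch only: models the JR detection and the extra PC mux.
# Names are hypothetical; field positions follow the MIPS R-type layout.

OPCODE_SPECIAL = 0x00  # Instruction[31:26] for R-type instructions
FUNCT_JR = 0x08        # Instruction[5:0] for jr


def is_jr(instruction):
    """Return True when the 32-bit word encodes jr (the JR mux select)."""
    opcode = (instruction >> 26) & 0x3F
    funct = instruction & 0x3F
    return opcode == OPCODE_SPECIAL and funct == FUNCT_JR


def next_pc(instruction, pc_chain, regs):
    """Model the extra mux: regs[rs] for jr, otherwise the normal PC chain."""
    rs = (instruction >> 21) & 0x1F  # Instruction[25:21]
    return regs[rs] if is_jr(instruction) else pc_chain


regs = [0] * 32
regs[31] = 0x00400020  # pretend $ra holds a return address
JR_RA = 0x03E00008     # encoding of "jr $ra"
ADD = 0x01095020       # encoding of "add $t2, $t0, $t1"
```

With these values, next_pc(JR_RA, 0x00400004, regs) selects the register value, while the add instruction leaves the normal chain untouched.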

I think you can improve the performance of the hardware by placing the JR mux before the Jump mux, since the JR mux does not depend on the next-PC value at the output of the Jump select mux.

Related

Does sw and lw in MIPS store a value below or above the stack pointer?

My professor gave a video that looks like this:
In the lower right, he wrote $ra at location 124 while the $sp is at 128 which implies that the first sw $ra, 4($sp) instruction stores the $ra value at a location 4 bytes less than the $sp. But my book does it differently:
and
The image implies that the lw instruction accesses locations at larger, more positive addresses than the $sp. So which is right? Do the lw and sw offsets refer to addresses higher or lower than the $sp?
You are right in observing that the first factorial is storing above the stack pointer, stack storage that it did not allocate, and must have been allocated by the caller.
This is somewhat non-standard usage, but technically legal, since the MIPS calling convention requires giving the top 4 stack locations of any stack frame to the callee. The function is only allocating a 2-word frame, and according to the calling convention (which allows the callee to use the top 4 words of the frame) it should be allocating minimally a 4-word frame.
Still, since the factorial function calls no function other than itself, this is roughly legal and in compliance with the calling convention, in the sense that the convention's job is to ensure that one function can call another.
(Note that in RISC-V (the open-source MIPS follow-on) this requirement of a 4-word stack area for the callee to use is not present, so similar code would not work there.)
The second example is more traditional; however, it also does not allocate a standard-sized frame, that is, one that gives the top 4 words to the callee. Still, doing so is not technically necessary here, and the second example is less reliant on the original caller (e.g. main) providing a proper stack frame (one with 4 words given to the callee).
Let's further observe that the first code sample stores $ra and $a0 on the stack, which are registers that we expect to be saved, whereas the latter example stores $s0 (which we would expect to be saved, as the $s registers are dedicated non-volatile registers) but also $t0 and $t1, which seems non-standard as these are dedicated temporaries.
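On the underlying question of direction: lw and sw compute their address as the base register plus the sign-extended 16-bit offset, so a positive offset addresses memory above $sp and a negative one addresses memory below it. The following Python sketch shows the arithmetic; the helper names are hypothetical, and the value 128 for $sp is taken from the question.

```python
# Sketch only: effective-address calculation for lw/sw.
# Helper names are hypothetical.

def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed offset."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base, offset16):
    """Address used by lw/sw: base register plus sign-extended offset."""
    return (base + sign_extend16(offset16)) & 0xFFFFFFFF


sp = 128
above = effective_address(sp, 4)            # sw $ra, 4($sp)
below = effective_address(sp, -4 & 0xFFFF)  # sw $ra, -4($sp)
```

Under these assumptions, sw $ra, 4($sp) with $sp at 128 writes to address 132. A store landing at 124 would need either an offset of -4 or an $sp that had already been decremented.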

Extending MIPS datapath to implement SLL and SRL

Here's the datapath:
So this seems like a pretty common question but I can't seem to find any answers on how to extend the datapath to implement SLL and SRL.
This is how I would think to do it but I'm not entirely sure:
It would need another mux right next to Read data 1 next to the register file. This mux would take Read data 1 (rs) and Read data 2 (rt) as inputs. It would select Read data 1 if we're not doing a shift operation, and it would select rt if we ARE doing a shift operation (since sll and srl use rt, not rs). This would then be fed into the ALU.
Next, we would need to branch Instruction[10:6] (the shift amount) off of Instruction[15:0], and Instruction[10:6] would then be fed into the other port of the ALU. Is this correct thinking?
This is sll on a single-cycle datapath, but I am not sure whether the ALU now gets 5 instead of 4 bits of control input.
If you implement sll, the first ALU input would be shamt and the second would be the register to be shifted. The ALU knows that it must perform a shift from the instruction's function field, because sll is an R-type instruction. The shifted data is then saved in the rd register.
SLL SC datapath
You need to modify the datapath for the SLL instruction, adding an input line to the ALU that carries the "shamt" field in order to determine the shift amount. The ALU will identify the SLL operation from the ALUop field.
Modified datapath
You are going in the correct direction. As stated in one of the answers, one additional port can be added to the ALU to take the shamt field (bits [10:6]). Some internal hardware in the ALU, such as a MUX, can take care of selecting either the shamt field or Read Data 2 from the output of the register file.
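To make the data flow concrete, here is a small Python model of an R-type execute step in which the ALU receives shamt as an extra input. It is a behavioural sketch, not a description of any particular datapath drawing: the function name execute_rtype is hypothetical, and only sll, srl and add are modelled.

```python
# Sketch only: behavioural model of sll/srl using the shamt field.
# Function name is hypothetical; encodings follow the MIPS R-type layout.

MASK32 = 0xFFFFFFFF
FUNCT_SLL = 0x00
FUNCT_SRL = 0x02
FUNCT_ADD = 0x20


def execute_rtype(instruction, regs):
    """Return (rd, result) for a small subset of R-type instructions."""
    rs = (instruction >> 21) & 0x1F
    rt = (instruction >> 16) & 0x1F
    rd = (instruction >> 11) & 0x1F
    shamt = (instruction >> 6) & 0x1F  # Instruction[10:6]
    funct = instruction & 0x3F

    if funct == FUNCT_SLL:
        result = (regs[rt] << shamt) & MASK32   # shifts rt, not rs
    elif funct == FUNCT_SRL:
        result = (regs[rt] & MASK32) >> shamt   # logical shift, zero fill
    elif funct == FUNCT_ADD:
        result = (regs[rs] + regs[rt]) & MASK32
    else:
        raise ValueError("funct not modelled in this sketch")
    return rd, result


regs = [0] * 32
regs[9] = 0x80000001   # $t1
SLL = 0x00094080       # sll $t0, $t1, 2
SRL = 0x00094082       # srl $t0, $t1, 2
```

Note that the shift case reads the rt register and ignores rs, which is why the question's idea of routing rt to the ALU for shifts is on the right track.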

What can be the cause of "jal" to the middle of another function in MIPS

I am looking at a very suspicious disassembled MIPS code of a C application
80019B90 jal loc_80032EB4
loc_80032EB4 is in the middle of another function's body. I've specifically checked that no other code is loaded at this address at runtime, and calling that function this way (skipping some code at the beginning) can be useful. But how is it possible to do this in C? It's not a goto, as you can't goto into another function, and a normal function call will always "jal" to the beginning. Can this be some hand optimization?
Update:
Simplified layout of both functions, callee:
sub_80032E88 (lz77_decode)
... save registers ...
80032E90 addiu $sp, -8
... allocate memory for decompressed data ...
80032EB0 move DECOMPRESSED_DATA_POINTER_A1, $v0
loc_80032EB4:
80032EB4 lw $t7, 0(PACKED_DATA_POINTER_A0)
... actual data decompression ...
80032F4C jr $ra
caller:
80019ACC addiu $sp, -0x30
... some not related code ...
80019B88 lw $a1, off_80018084 // A predefined buffer is used instead of allocating it for decompressed data
80019B90 jal loc_80032EB4
80019B94 move $a0, $s0
... some other code and function epilogue ...
Update 2:
I've checked if this can be a case of setjmp/longjmp usage, but in my tests I can always see calls to setjmp and longjmp functions in disassembled code, not a direct jump.
Update 3:
I've tried using the GCC-specific ability to get label pointers and cast this pointer to a function. The result is close to what I want, but the disassembled code is still different: instead of using jal with the exact address, it calculates the address at runtime. Maybe I am just unable to force the compiler to see this value as a constant because of scope issues.
Since it is a data decompression function from a game system, it is very likely that this function is hand optimized assembly with multiple entry points. Multiple entry points aren't commonly used, so it is difficult to find a publicly available example, but here is an old thread from the gcc mailing list that suggests a possible use for this technique.
The gist is that if you have two functions where one function F1 has code that is a subset of the other function, F2's code, then the code for F2 can fall through into the code for F1. In your case, F2 allocates memory for the decompressed data, and F1 assumes that the memory allocation has already been done. I'm pretty sure that GCC 2.9x cannot generate code like this.
It is not possible to directly translate this construct from assembler into standard C, because you cannot goto another function in C, but this is perfectly legal in assembler code. The gcc mailing list thread suggests a couple of work-arounds to express the same idea in C.
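One way the same sharing could be expressed in a high-level language is to split the shared tail into its own function and have the allocating function call it. This is only an illustration of the F1/F2 relationship, not necessarily one of the workarounds from the mailing list thread. The Python sketch below uses hypothetical names and a placeholder copy in place of real decompression.

```python
# Sketch only: F2 allocates and then continues into F1.
# Names are hypothetical; the "decompression" is a placeholder copy.

def decode_into(packed, out):
    """F1: the shared body; assumes the caller already supplied the buffer."""
    out[:len(packed)] = packed  # stand-in for the real decompression loop
    return out


def decode(packed):
    """F2: allocate a buffer, then run F1 (a fall-through in the assembly)."""
    out = bytearray(len(packed))
    return decode_into(packed, out)


predefined = bytearray(8)                  # caller-owned buffer
result = decode_into(b"abc", predefined)   # like "jal" into the middle
```

In the hand-written assembly the call from F2 to F1 costs nothing because F2 simply runs off its end into F1's first instruction; a compiler would normally emit a call or a jump instead.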
If you look at the disassembled code for the decompression, it will likely have a different style than compiler-generated code. There may even be some use of opcodes, such as find first set bit, that the compiler cannot generate from C.

MIPS - JAL confusion: $ra = PC+4 or PC+8?

I'm having trouble understanding how the instruction jal works in the MIPS processor.
My two questions are:
a) What is the value stored in R31 after "jal": PC+4 or PC+8?
b) If it's really PC+8, what happens to the instruction at PC+4? Is it executed before the jump or is it never executed?
In Patterson and Hennessy (fourth edition), pg 113:
"jump-and-link instruction: An instruction that jumps to an address and simultaneously saves the address of the following instruction in a register ($ra in MIPS)"
"program counter (PC): The register containing the address of the instruction in the program being executed"
After reading those two statements, it follows that the value saved in $ra should be (PC+4).
However, in the MIPS reference data (green card) that comes with the book, the jal instruction's algorithm is defined like this:
"Jump and Link : jal : J : R[31]=PC+8;PC=JumpAddr"
This website also states that "it's really PC+8", but strangely, after that it says that since pipelining is an advanced topic "we'll assume the return address is PC+4".
I come from 8086 assembly, so I'm aware that there's a big difference between returning to an address and to the one following it, because programs won't work if I just assume something that's not true. Thanks.
The address in $ra is really PC+8. The instruction immediately following the jal instruction is in the "branch delay slot". It is executed before the function is entered, so it shouldn't be re-executed when the function returns.
Other branching instructions on MIPS also have branch delay slots.
The delay slot is used to do something useful in the time it takes to execute the jal instruction.
I had the same question. While searching, I found this excellent answer by Richard, and also another link that I wish to add here.
The link is http://chortle.ccsu.edu/AssemblyTutorial/Chapter-26/ass26_4.html
with this wonderful explanation of adding 4 to the PC twice.
So the actual execution has two additions: 1) newPC=PC+4 by pipelining and 2) another addition $ra=newPC+4 by the jal instruction, resulting in the effective $ra = (address of the jal instruction)+8.
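The green-card definition quoted in the question (R[31]=PC+8; PC=JumpAddr) can be checked with a few lines of Python. This is a sketch under the assumption of a delay-slot MIPS; the function name jal is hypothetical, and JumpAddr is formed in the usual way from the upper four bits of PC+4 and the 26-bit target field shifted left by two. The sample values reuse the jal at 80019B90 from the disassembly question earlier on this page.

```python
# Sketch only: what jal writes to $ra and where it jumps.
# Function name is hypothetical; assumes a MIPS with branch delay slots.

MASK32 = 0xFFFFFFFF
OPCODE_JAL = 0x03


def jal(pc, instruction):
    """Return (ra, jump_addr) for a jal word located at address pc."""
    if (instruction >> 26) & 0x3F != OPCODE_JAL:
        raise ValueError("not a jal instruction")
    target = instruction & 0x03FFFFFF          # Instruction[25:0]
    ra = (pc + 8) & MASK32                     # skips the delay slot at pc + 4
    jump_addr = ((pc + 4) & 0xF0000000) | (target << 2)
    return ra, jump_addr


# jal loc_80032EB4 located at 0x80019B90
ra, jump_addr = jal(0x80019B90, 0x0C00CBAD)
```

Here the instruction at 0x80019B94 sits in the delay slot and runs before the callee, so returning to 0x80019B98 does not execute it a second time.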

Microprogramming in MIPS

I am learning about micro programming and am confused as to what a micro-instruction actually is. I am using the MIPS architecture. My questions are as follows
Say for example I have the ADD instruction, what would the micro-instructions look like for this? How many micro-instructions are there for the add instruction. Is there somewhere online I can see the list of micro-instructions for the basic instructions of MIPS?
How can I figure out the bit string for an ADD microprogrammed instruction?
Microprogramming is a method of implementing a complex instruction set architecture (such as x86) in terms of simpler "micro instructions". MIPS is a RISC instruction set architecture and is not typically implemented using micro-programming, so there are ZERO microinstructions for the ADD instruction.
To answer your specific question one would have to know what the definition of your particular micro-architecture is.
This is an example of how to load the EPC into one of the registers and add 4-bytes to it:
lw t0, 20(sp) // Load EPC
addi t0, t0, 4 // Add 4 to the return address
sw t0, 20(sp) // Save EPC
There are "a lot" of instructions that you can use; you can see the MIPS Instruction Set here. In my humble opinion, MIPS is really neat and easy to learn! A fun fact is that the first PlayStation used a MIPS CPU.
Example instructions
lw = load word
la = load address
sw = store word
addi = add immediate
Then you have a lot of conditional instructions such as:
bne = branch not equal
bnez = branch if not equal to zero
And with these you use j to jump to an address.
Here is an example from an Exception Handler that I wrote once for MIPS, this is the External Source handler:
External:
mfc0 t0, C0_CAUSE // We could as well use 24(sp) to load CAUSE
and t0, t0, 0x02000 // Mask the CAUSE
bnez t0, Puls // If the masked value is
// not equal to zero, jump to Puls
j DisMiss // Else jump to DisMiss
In the above example I define an entry point called External that I can jump to, as I do with DisMiss; to loop, you generally jump to yourself.
There are some other instructions used here as well:
mfc0 = move from co-processor 0
To handle labels, I would suggest you check this question/answer out.
Here's a couple of resources on MicroProgramming with MIPS:
Some general information
Here is a somewhat heavier PowerPoint presentation on the subject from Princeton ( PDF )
Here is a paper from another university which is one of the best of these three ( PDF ).