Why are 'opcode' field and 'funct' field apart in MIPS? - mips

MIPS ISA has an R type instruction, and the R instruction has an opcode field at its first 6 bits and a funct field at its last 6 bits. So why are the ISA designed like this? How about combine them into a 12-bits field?

My idea is that the three kinds of instructions share a prefix of 6-bit opcode. And for R and I types, the next 5 bits decide source register. If we combine opcode and funct for R instruction, the instruction format is not so consistent between R and I, which may make processor's design complex.

How about if combine them in 12-bits filed?
Since the opcode is the same for some operation in MIPS and if you change the funct than you can't differentiate which operation the instruction does, for example consider the following add(R,0,32) add has opcode 0 and funct 32
And also consider that and(R,0,36) and has also opcode 0 but different funct in this case 36 which means it's and AND operation.
check the MIPS Reference Sheet.

Related

What does "extend immediate to 32 bits" mean in MIPS?

I'm reading about the Instruction Decode (ID) phase in the MIPS datapath, and I've got the following quote: "Once operands are known, read the actual data (from registers) or extend the data to 32 bits (immediates)."
Can someone explain what the "extend the data to 32 bits (immediates)" part means? I know that registers all contain 32 bits, and I know what an immediate is. I just don't understand why you need to extend the immediate from 26 to 32 bits.
Thanks!
26-bit immediates are only in jump instructions, and aren't sign- or zero-extended to 32 bit, because they're not displacements to be added/subtracted.
I-type instructions with 16-bit immediates are different.
addi / addiu immediates are sign-extended (by duplicating the top/sign bit of the immediate to all higher bits).
https://en.wikipedia.org/wiki/Two%27s_complement#Sign_extension
This allows 2's complement numbers from -2^15 .. +2^15-1 to be encoded.
(0xFFFF8000 to 0x00007FFF)
ori/andi/xori boolean immediates are zero-extended (by setting all higher bits to zero)
This allows unsigned / 2's complement numbers from 0 .. 2^16-1 to be encoded.
(0x00000000 to 0x0000FFFF)
For other instructions see this instruction-set reference which breaks down each instruction showing 016 || [I15..0] for zero-extension or [I15]16 || [I15..0] for sign-extension.
This makes it possible to use 16-bit immediates as inputs to a 32-bit binary operation that only makes sense with 2 equal-width inputs. (In a simple classic MIPS pipeline, the decode stage fetches operands from registers and/or immediates. Register inputs are always going to be 32-bit, so the ALU is wired up for 32-bit inputs. Extending immediates to 32-bit means the rest of the CPU doesn't have to care whether the data came from an immediate or a register.)
Also sign-extended:
offsets in the reg+imm16 addressing mode used by lw/sw and other load/store instructions
relative branches (PC += imm16<<2)
maybe others, check the manual for instructions I didn't mention to see if they sign- or zero- extend.
You might be wondering "why does addiu sign-extend its immediate even though it's unsigned?"
Remember that there's no subiu, only addiu with a negative immediate. Being able to add or subtract numbers in the range -2^15 .. +2^15-1 is more useful than only being able to add 0 .. 2^16-1.
And usually you don't want to raise an exception on signed overflow, so normally compilers use addu / addiu even on signed integers. addu is badly named: it's not "for unsigned integers", it's just a wrapping-allowed / never-faulting version of add/addi. It sort of makes sense if you think of C, where signed overflow is undefined behaviour (and thus could use add and raise an exception in that case if the compiler wanted to implement it that way), but unsigned integers have well-defined overflow behaviour: base 2 wraparound.
On a 32-bit CPU, most of the operations you do (like adding, subtracting, dereferencing a pointer) are done with 32-bit numbers. When you have a number with fewer bits, you need to somehow decide what those other bits are going to be when you want to use that number in one of those operations. The act of deciding what those new high bits are is called "extending".
Assuming you are just doing a standard zero extension or sign extension, extending is very cheap. However, it does require some circuitry, so it makes sense that a description of the MIPS datapath would mention it.

How does a computer turn a string of ASCII into a signed or unsigned number?

For example if I type:
-6
Through what mechanism is that turned into:
1010
Would it be hardware based or somewhere in the kernel?
Would it be hardware based or somewhere in the kernel?
Usually no and no.
The kernel in a mainstream OS like Linux will usually just pass along bytes of text to user-space.
So a user-space program gets a string, i.e. a sequence of characters. (In simple cases, e.g. the ASCII subset of UTF-8, each character is a single byte.) A program would typically use a function like atoi() to convert a sequence of characters (representing ASCII codes for digits) to a binary integer. It's a standard library function because many programs need to deal with strings that represent integers, but it's a software function just like any other.
A simple implementation would have a loop like
int sum = 0;
for (auto d: digits) { // look at digits in MSB-first order
sum = 10*sum + d;
}
// the first digit ends up being multiplied by 10 n times
// the 2nd by 10 n-1 times, and so on. Each digit is multiplied by its place value.
This C++ source would be compiled to multiple asm instructions that implement it. Handling an optional - by negating is also a separate instruction. There's typically a neg instruction of some sort, or a way to subtract from zero, to get the 2's complement inverse. (Assuming 2's complement hardware).
You can speed this up by using fancier instructions that do more work per instruction / per clock cycle. On x86 for example you can convert a multi-digit string of digits to a binary integer with a few SIMD instructions, but that's still just using multiply and add instructions. See How to implement atoi using SIMD? for a nice use of pmaddwd to multiply by a vector of place-values and horizontally add. Also Fastest way to get IPv4 address from string is a cool examples of what you can do with packed-compare and looking up a pshufb shuffle-control vector from a table based on that compare result.
A function like scanf("%d", &num) that reads input as a number is implemented in user-space, but under the hood it uses a system call like read() to get data. (If the C stdio input buffer was empty.)
Some "toy" / teaching systems like the MARS and SPIM MIPS simulators have system calls that get get or print integers (with the input or result in an integer register). In that case, yes, the kernel does it in software.
Or depending on the implementation, there isn't actually a kernel at all, and the syscall instruction escapes to the emulator / simulator's input/output function, so from the POV of software running inside this virtual simulated machine, there really is hardware support for integer conversion. But no real hardware does the entire thing in microcode or actual hardware, at least not any mainstream architectures.

Converting binary/hexadecimal to MIPS instructions

For the following entries, what instructions do they represent respectively?
Binary: 00000001110001011000100000100001
Hexadecimal: 144FFF9D
I'm completely lost on what I'm doing here - searching online has produced a bunch of results that make very little sense to me, but what I've gathered is I'm basically supposed to match up the numbers to their appropriate instructions/registers, but how exactly do I know what those are? Where can I find a comprehensive list? How do I know whether it's an R I or J format function?
The first 6 bits (it is easier to work in binary) are the opcode, from which you can determine how to interpret the rest. This site should get you started: http://www.mrc.uidaho.edu/mrc/people/jff/digital/MIPSir.html
Update: Calling the first 6 bits the opcode is (to be too kind) misleading, but it is enough to tell you how to interpret the rest of the instruction; you may need to look elsewhere (typically at the end of the instruction) for the complete determination of the opcode.
There are 3 Type of MIPS Instructions:
R_type: Opcode must be 000000 (the first 6 bits) and with last 6 bits we can know what is the correct instruction
I_type
j_type
In this case, we have a R-type MIPS instruction and thus :
Opcode rs rt rd shamt funct
000000 01110 00101 10001 00000 100001
addu $s1 , $t6 , $a1

using mips instructions

If a thirty-two bit word can represent a MIPS instruction. How can we tell if that instruction is of type R, J, or I?
I'm having a hard time understanding these concepts, I think the opcodes might be different?
Basically MIPS instructions have an opcode stored in the most significant 6 bits which specify the format of the following bits. In particular, R-type instructions always have an opcode of 000000 (with the instruction functionality then further specified by the 6 least significant bits.
MIPS Instruction Coding

function codes for MIPS architecture

Im reviewing a problem where given a MIPS instruction, I have to write down the decimal value of the 4 fields corresponding to the opcode, rs, rt, and the function. I understand that the decimal value for rs and rt are just the decimal representations of the registers (i.e, $s0 is 16) but how could i figure out the 16 bit function code?
You can not determine that value.You need to be given that values.Each function code does different things,there are many instruction that has the same format.
Every instruction has its own opcode & function code. You can find the opcodes here, for example:
https://www.student.cs.uwaterloo.ca/~isg/res/mips/opcodes
For example, addi is 001000 in binary for the first 6 bytes (the opcode), followed by 2x5 bytes for the registers, followed by 16 bytes for the immediate value
add is 000000 (opcode), followed by 3x5 bytes for the registers, 00000 for shift amount (not used for this instruction), followed by 100000 for the function code.