Why do MIPS operations on unsigned numbers give signed results?

Why do MIPS operations on unsigned numbers give signed results? - mips

When I try working on unsigned integers in MIPS, the result of every operation I do remains signed (that is, the integers are all in 2's complement), even though every operation I perform is an unsigned one: addu, multu and so fourth...
When I print numbers in the range [2^31, 2^32 - 1] I get their 'overflowed' negative value as if they were signed (I guess they are).
Though, when I try something like this:
li $v0, 1
li $a0, 2147483648 # or any bigger number
syscall
the printed number is always 2147483647 (2^31 - 1)
I'm confused... What am I missing?
PS : I haven't included my code as it isn't very readable (such is assembly code) and putting aside this problem,
seems to be working fine. If anyone feels it is necessary I shall include it right away!

From Wikipedia:
The MIPS32 Instruction Set states that the word unsigned as part of Add and Subtract instructions, is a misnomer. The difference between signed and unsigned versions of commands is not a sign extension (or lack thereof) of the operands, but controls whether a trap is executed on overflow (e.g. Add) or an overflow is ignored (Add unsigned). An immediate operand CONST to these instructions is always sign-extended.
From the MIPS Instruction Reference:
ALL arithmetic immediate values are sign-extended [...] The only difference between signed and unsigned instructions is that signed instructions can generate an overflow exception and unsigned instructions can not.

It looks to me like the real problem is the syscall that you are using to print numbers. It appears to and to always interpret what you pass as signed, and to possibly to bound as as well.

Related

What does "extend immediate to 32 bits" mean in MIPS?

I'm reading about the Instruction Decode (ID) phase in the MIPS datapath, and I've got the following quote: "Once operands are known, read the actual data (from registers) or extend the data to 32 bits (immediates)."
Can someone explain what the "extend the data to 32 bits (immediates)" part means? I know that registers all contain 32 bits, and I know what an immediate is. I just don't understand why you need to extend the immediate from 26 to 32 bits.
Thanks!

26-bit immediates are only in jump instructions, and aren't sign- or zero-extended to 32 bit, because they're not displacements to be added/subtracted.
I-type instructions with 16-bit immediates are different.
addi / addiu immediates are sign-extended (by duplicating the top/sign bit of the immediate to all higher bits).
https://en.wikipedia.org/wiki/Two%27s_complement#Sign_extension
This allows 2's complement numbers from -2^15 .. +2^15-1 to be encoded.
(0xFFFF8000 to 0x00007FFF)
ori/andi/xori boolean immediates are zero-extended (by setting all higher bits to zero)
This allows unsigned / 2's complement numbers from 0 .. 2^16-1 to be encoded.
(0x00000000 to 0x0000FFFF)
For other instructions see this instruction-set reference which breaks down each instruction showing 016 || [I15..0] for zero-extension or [I15]16 || [I15..0] for sign-extension.
This makes it possible to use 16-bit immediates as inputs to a 32-bit binary operation that only makes sense with 2 equal-width inputs. (In a simple classic MIPS pipeline, the decode stage fetches operands from registers and/or immediates. Register inputs are always going to be 32-bit, so the ALU is wired up for 32-bit inputs. Extending immediates to 32-bit means the rest of the CPU doesn't have to care whether the data came from an immediate or a register.)
Also sign-extended:
offsets in the reg+imm16 addressing mode used by lw/sw and other load/store instructions
relative branches (PC += imm16<<2)
maybe others, check the manual for instructions I didn't mention to see if they sign- or zero- extend.
You might be wondering "why does addiu sign-extend its immediate even though it's unsigned?"
Remember that there's no subiu, only addiu with a negative immediate. Being able to add or subtract numbers in the range -2^15 .. +2^15-1 is more useful than only being able to add 0 .. 2^16-1.
And usually you don't want to raise an exception on signed overflow, so normally compilers use addu / addiu even on signed integers. addu is badly named: it's not "for unsigned integers", it's just a wrapping-allowed / never-faulting version of add/addi. It sort of makes sense if you think of C, where signed overflow is undefined behaviour (and thus could use add and raise an exception in that case if the compiler wanted to implement it that way), but unsigned integers have well-defined overflow behaviour: base 2 wraparound.

On a 32-bit CPU, most of the operations you do (like adding, subtracting, dereferencing a pointer) are done with 32-bit numbers. When you have a number with fewer bits, you need to somehow decide what those other bits are going to be when you want to use that number in one of those operations. The act of deciding what those new high bits are is called "extending".
Assuming you are just doing a standard zero extension or sign extension, extending is very cheap. However, it does require some circuitry, so it makes sense that a description of the MIPS datapath would mention it.

Range of immediates in lui instruction

I'm not sure what is the range bound for the immediate in lui instruction.
When I assemble:
lui $t0,32768
It successfully went without errors.
However,
lui $t0,-32768
notified that -32768 out of range.

In MIPS the immediate in I-type instructions is always 16-bit long. That means the range will be [0, 65535] if the assembler treats it as unsigned, and [-32768, 32767] for the signed case
However what you can use in the assembly depends on the assembler
For example some assemblers like shell-storm and WeMips accept constants in [-32768, 65535] which is a mix of both 16-bit signed and unsigned, MIPS Converter only accepts hexadecimal values but WebMIPSASM accepts even huge values like 9223372036854775807 and truncate the result to 16 bits

Verilog branch instruction MIPS

I am trying to understand how the verilog branch statement works in an immediate instruction format for the MIPS processor. I am having trouble understanding what the following Verilog code does:
IR is the instruction so IR[31:26] would give the opcode.
reg[31:0] BT = PC + 4 + { 14{IR[15]}, IR[15:0], 2'b0};
I see bits and pieces such as we are updating the program counter and that we are taking the last 16 bits of the instruction to get the immediate address. Then we need a 32 bit word so we extend 16 more zeros.
Why is it PC + 4 instead of just PC?
What is 2'b0?
I have read something about sign extension but don't quite understand what is going on here.
Thanks for all the help!

1: Branch offsets in MIPS are calculated relative to the next instruction (since the instruction after the branch is also executed, as the branch delay slot). Thus, we have to use PC +4 for the base address calculation.
2: Since MIPS uses a bytewise memory addressing system (every byte in memory has a unique address), but uses 32-bit (4-byte) words, the specification requires that each instruction be word-aligned; thus, the last two bits of the address point at the bottom byte of the instruction (0x____00).
IN full, the instruction calculates the branch target address by taking the program counter, adding 4 to account for hte branch delay slot, and then adding the sign extended (because the branch offset could be either positive or negative; this is what the 14{IR[15]} does) offset to the target.

Numbers in Verilog can be represented using number of bits tick format of the following number.
2'b11; // 2 bit binary
3'd3 ; // 3 bit decimal
4'ha ; // 4 bit hex
The format describes the following number, the bit pattern used is not changed by the format. Ie 2'b11 is identical to 2'd3;

MIPS: Can I get unsigned int value from user via syscall?

The title pretty much sums this up. I am writing a program in 32-bit MIPS Assembly Language (using the MARS emulator) for a school project and I'm having zero luck reading in int values > 2,147,483,647.
I spent a decent amount of time hunting around the internet and in my book to no avail. This is not central to the assignment (which, if you happen to know it is impossible, you probably already realized) but curiosity is killing this cat. Now that I've hit this brick wall, I must know for sure.
Notes:
I'm specifically looking for a way to grab an unsigned int as opposed to taking a float or a double.
The standard code for grabbing an int with syscall:
li $v0, 5
syscall
move $t0, $v0
The error that occurs when 2 500 000 000 is passed at prompt for integer:
Error in C:\DEV\....... line 57: Runtime exception at
0x004000034: invalid integer input (syscall 5)
Help me Obi-Wan, you're my only hope!

You'll need to use a different system call -- MARS is throwing the exception, not anything "inside" the MIPS CPU. Try, for instance, syscalls 8 or 12 (read string and read character). Note that, as a result, you'll have to implement a lot more of the parsing yourself to make these work.
Alternatively, you might try reading a double (syscall 7) and converting it to an integer...
There's a full list of MARS syscalls online at:
http://courses.missouristate.edu/KenVollmar/MARS/Help/SyscallHelp.html

negative integers in binary

5 (decimal) in binary 00000101
-5 (two's complement) in binary 11111011
but 11111011 is also 251 (decimal)!
How does computer discern one from another??
How does it know whether it's -5 or 251??
it's THE SAME 11111011
Thanks in advance!!

Signed bytes have a maximum of 127.
Unsigned bytes cannot be negative.
The compiler knows whether the variable holding that value is of signed or unsigend type, and treats it appropriately.

If your program chooses to treat the byte as signed, the run-time system decides whether the byte is to be considered positive or negative according to the high-order bit. A 1 in that high-order bit (bit 7, counting from the low-order bit 0) means the number is negative; a 0 in that bit position means the number is positive. So, in the case of 11111011, bit 7 is set to 1 and the number is treated, accordingly, as negative.
Because the sign bit takes up one bit position, the absolute magnitude of the number can range from 0 to 127, as was said before.
If your program chooses to treat the byte as unsigned, on the other hand, what would have been the sign bit is included in the magnitude, which can then range from 0 to 255.

Two's complement is designed to allow signed numbers to be added/substracted to one another in the same way unsigned numbers are. So there are only two cases where the signed-ness of numbers affect the computer at low level.
when there are overflows
when you are performing operations on mixed: one signed, one unsigned
Different processors take different tacks for this. WRT orverflows, the MIPS RISC architecture, for example, deals with overflows using traps. See http://en.wikipedia.org/wiki/MIPS_architecture#MIPS_I_instruction_formats
To the best of my knowledge, mixing signed and unsigned needs to avoided at a program level.

If you're asking "how does the program know how to interpret the value" - in general it's because you've told the compiler the "type" of the variable you assigned the value to. The program doesn't actually care if 00000101 as "5 decimal", it just has an unsigned integer with value 00000101 that it can perform operations legal for unsigned integers upon, and will behave in a given manner if you try to compare with or cast to a different "type" of variable.
At the end of the day everything in programming comes down to binary - all data (strings, numbers, images, sounds etc etc) and the compiled code just ends up as a large binary blob.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008