Converting hexadecimal to MIPS instruction - binary

Suppose the opcode of a MIPS instruction is 0 in decimal, the funct is 100001 in binary, and the rest of the machine code is cc560 in hexadecimal (from high-order bit to low-order bit). What is the instruction? When showing the registers, use names (e.g. $t0, $s2) instead of indices (e.g. $8, $17).
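One way to work this out is to rebuild the 32-bit word and split it into the R-type fields (opcode 6 bits, rs 5, rt 5, rd 5, shamt 5, funct 6). The sketch below is only illustrative: decode_rtype and the small funct table are hypothetical helpers, not part of any assembler, and the register names follow the usual MIPS numbering.

```python
# Standard MIPS register names, indexed by register number.
REG_NAMES = [
    "$zero", "$at", "$v0", "$v1", "$a0", "$a1", "$a2", "$a3",
    "$t0", "$t1", "$t2", "$t3", "$t4", "$t5", "$t6", "$t7",
    "$s0", "$s1", "$s2", "$s3", "$s4", "$s5", "$s6", "$s7",
    "$t8", "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra",
]

# Only a few R-type funct codes, enough for this exercise.
FUNCT_NAMES = {0b100000: "add", 0b100001: "addu", 0b100010: "sub", 0b100011: "subu"}

def decode_rtype(opcode, middle20, funct):
    """Assemble the word from its three parts, then split the R-type fields."""
    word = (opcode << 26) | (middle20 << 6) | funct
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    rd = (word >> 11) & 0x1F
    shamt = (word >> 6) & 0x1F
    text = f"{FUNCT_NAMES[funct]} {REG_NAMES[rd]}, {REG_NAMES[rs]}, {REG_NAMES[rt]}"
    return word, shamt, text

word, shamt, text = decode_rtype(0, 0xCC560, 0b100001)
print(f"{word:08x}", shamt, text)
```

The 20 middle bits 1100 1100 0101 0110 0000 regroup as 11001 10001 01011 00000, that is rs = 25, rt = 17, rd = 11, shamt = 0.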

Related

How to translate MIPS into C and how to reduce MIPS instructions?

Suppose that f, g, h, i, j are stored in $s0~$s4 respectively and the base addresses of arrays A and B are in $s6 and $s7.
sll $t0, $s0, 2
add $t0, $s6, $t0
sll $t1, $s1, 2
add $t1, $s7, $t1
lw $s0, 0($t0)
addi $t2, $t0, 4
lw $t0, 0($t2)
add $t0, $t0, $s0
sw $t0, 0($t1)
I'm not familiar with MIPS, so I wonder: how do I translate MIPS into C, and how can I minimize these MIPS instructions?
how to translate MIPS into C
You recognize the patterns, here for array indexing / array element access.
On a byte addressable machine (all modern hardware), a 4-byte integer occupies 4 bytes in memory, and each of those bytes has a unique memory address.  Because of the way the hardware works, we only use one of those 4 addresses to refer to the whole 4-byte integer, namely we use the lowest address among the 4.  The hardware can load a 4-byte integer from memory given that one address (the lowest).
Since each 4-byte integer in memory occupies 4 addresses, in an array of 4-byte integers, the memory address of the first element and the memory address of the second element are 4 addresses apart even though they are at sequential index positions (i.e. they are only 1 index position apart).
The formula for indexing a 4-byte integer array, then, is to convert the index into a byte offset, then add the byte offset to the base address of the array.  The first part of that, converting an index to a byte offset, is sometimes referred to as "scaling".  Scaling is conceptually done by multiplication, so in A[i], i needs to be scaled by the size of the array elements of A.  For 4-byte integers, that means scaling (multiplying) the index by 4.  A quick way of doing that is shifting left by 2 bit positions, which has the same effect as multiplying by 4.
The C language automatically scales when doing array references, whereas assembly language requires explicit scaling.  C can do this because it knows the type of the array, whereas assembly language does not.
In C we can do expressions like A[i].  The C language allows us to break that down somewhat into *(A+i), which separates the pointer arithmetic addition A+i from the dereferencing of that sum, dereferencing with the unary indirection operator, *.  As previously mentioned, C automatically scales, so A+i becomes the equivalent of A+i*4, in which we can substitute shifting for multiplication: A+(i<<2).
Next, we need to know if the dereference is for read or for write.  When A[i] is accessed for its value, we will see it on what we call the "right hand side" of an assignment operator, as in ... = A[i].  When A[i] is accessed to update/store a value, we will see it on what we call the "left hand side" of an assignment operator, as in A[i] = ....
So, the sequence for doing A[i] for read (right hand side) in C is the following in assembly:
sll $temp1, $i, 2
addu $temp2, $A, $temp1
lw $temp3, 0($temp2)
Where $tempN is some register (usually a designated temporary) chosen to hold an intermediate value.  Since multiple instructions are needed to accomplish anything, sequences of instructions are interconnected with registers that hold the intermediate states.  Also, in assembly we name registers, not variables, so in the sequence above $i and $A should be register names representing those variables rather than the variable names themselves.
The pattern for write/store array access is similar but ends with a sw instruction instead, to store some value into memory at the index position.
These instruction sequences are interconnected by the use of these registers, and the sequences can be interrupted or interspersed with other instructions. What we have to follow, then, is the above pattern, paying attention to the register usages that interconnect the instructions rather than to the specific sequences.
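The read and write patterns can be modelled with a toy byte-addressable memory. This is a minimal sketch, not MIPS code: the base address 16, the sample values, and the helper names are made up for illustration, and little-endian byte order is an assumption (MIPS hardware can be configured either way).

```python
import struct

# Toy byte-addressable memory holding a 4-element array of 4-byte integers.
memory = bytearray(64)
A_BASE = 16  # hypothetical base address of array A
for index, value in enumerate([10, 20, 30, 40]):
    struct.pack_into("<i", memory, A_BASE + 4 * index, value)

def load_word(address):
    """Model of lw: read the 4-byte integer whose lowest byte is at address."""
    return struct.unpack_from("<i", memory, address)[0]

def store_word(address, value):
    """Model of sw: write a 4-byte integer starting at address."""
    struct.pack_into("<i", memory, address, value)

def read_element(base, i):
    temp1 = i << 2            # sll  $temp1, $i, 2       scale index to a byte offset
    temp2 = base + temp1      # addu $temp2, $A, $temp1  add the base address
    return load_word(temp2)   # lw   $temp3, 0($temp2)   dereference for read

def write_element(base, i, value):
    # Same address computation, ending in sw instead of lw.
    store_word(base + (i << 2), value)

write_element(A_BASE, 1, 99)
print(read_element(A_BASE, 2), read_element(A_BASE, 1))
```

Here read_element(A_BASE, 2) plays the role of ... = A[2] and write_element(A_BASE, 1, 99) the role of A[1] = 99.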
In your sample code:
sll $t0, $s0, 2 # sourcing an index in $s0, scaling it into temp $t0
add $t0, $s6, $t0 # adding a base array in $s6, putting back into $t0
sll $t1, $s1, 2
add $t1, $s7, $t1
lw $s0, 0($t0) # accessing the value of $s6[$s0*4], aka A[f]
addi $t2, $t0, 4
lw $t0, 0($t2)
add $t0, $t0, $s0
sw $t0, 0($t1)
We can see the pattern for a read access with an index in $s0 and an array base in $s6. These, we are told, map to f and A, so the three commented instructions together make up A[f], reading a value from A at index f.
The rest are done similarly.  Your job is to use this knowledge to find the other array indexing patterns in the above sequence.  Find out how the results of the array indexing operations are used and you'll have the complete C code.
NOTE that the sample you've been given incorrectly uses add and addi when pointer arithmetic should use addu and addiu — we don't want signed integer overflow checking on pointer arithmetic, as pointers are unsigned.
One of the add instructions is not for pointer arithmetic, but should probably still have used addu if this is intended to be replicated in C, because the C language does not have a built in operator to trap on overflow.

How does MIPS assembler manage label address?

How does MIPS's assembler labels and J type instruction work?
I am currently making a MIPS simulator in C++ and ran into a big question. How exactly does the MIPS assembler manage labels and their addresses for a J-type instruction?
Let's assume that we have the following code, and that start: is at 0x00400000. The comment after each instruction shows where its machine code will be stored in memory.
start:
andi $t0, $t0, 0 # 0x0040 0000
andi $t1, $t1, 0 # 0x0040 0004
andi $t2, $t2, 0 # 0x0040 0008
addi $t3, $t3, 4 # 0x0040 000C
loop:
addi $t2, $t2, 1 # 0x0040 0010
beq $t2, $t3, exit # 0x0040 0014
j loop # 0x0040 0018
exit:
addi $t0, $t0, 1000 # 0x0040 001C
As I understand it at the moment, j loop will set the PC to 0x0040 0010.
A J-type instruction is 32 bits wide and its 6 most significant bits are the opcode, so it only has 26 bits left to represent the address of an instruction. Then how is it possible to cover a 32-bit address space using only 26 bits?
With the example above, 0x00400010 can be represented with only 24 bits. However, according to references, the text segment runs from 0x00400000 to 0x10000000, which needs 32 bits to represent.
I have tried to understand this using the MARS simulator; however, it just shows j loop as j 0x00400010, which seems like nonsense to me since 0x00400010 is 32 bits.
My current guess
One of my current guesses is the following.
The assembler saves the loop: label's address at some memory address that is reachable with 26 bits. Then, when j loop is executed, the label loop is translated to the memory address that contains 0x00400010. For example, 0x00400010 is saved at some address like 0x00300000; when j loop is executed, loop is translated into 0x00300000, and the processor reads the value at 0x00300000 to reach 0x00400010. (This is just one of my guesses.)
You have a number of questions here.
First, let's try to differentiate between the assembler's operation and the MIPS machine code that it generates and the processor executes.
The assembler manages labels and addresses in two ways.  First, it has a symbol table, which is like a dictionary, a data structure of key-value pairs where the names are keys and the addresses (that those names will refer to when the program is running) are the values in the pairs.
Second, the assembler manages the code and data sections with a location counter.  That location counter advances each time the program provides some code or data.  When a new label is defined, the current location counter is then used as the address value in a new key-value pair.
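The symbol table and location counter can be sketched as a first pass over the source. This is a simplified illustration, not how any particular assembler is implemented: build_symbol_table is a hypothetical name, each label is assumed to sit on its own line, and every non-label line is assumed to be one real 4-byte instruction (no pseudo-instructions that expand to several).

```python
def build_symbol_table(lines, start_address=0x00400000):
    """First pass: map each label to the address of the instruction that follows it."""
    symbols = {}
    location = start_address              # the location counter
    for line in lines:
        text = line.split("#")[0].strip() # drop comments and whitespace
        if not text:
            continue
        if text.endswith(":"):
            symbols[text[:-1]] = location # a label occupies no space
        else:
            location += 4                 # each instruction is 4 bytes
    return symbols

program = [
    "start:",
    "andi $t0, $t0, 0",
    "andi $t1, $t1, 0",
    "andi $t2, $t2, 0",
    "addi $t3, $t3, 4",
    "loop:",
    "addi $t2, $t2, 1",
    "beq $t2, $t3, exit",
    "j loop",
    "exit:",
    "addi $t0, $t0, 1000",
]
symbols = build_symbol_table(program)
print({name: hex(address) for name, address in symbols.items()})
```

A second pass would then look up loop and exit in this table when it encodes the j and beq instructions.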
The processor never sees the labels: they do not execute and they do not occupy any space in the code or data.  The processor sees only machine code instructions, which on MIPS are all 32 bits wide.  Each machine code instruction is divided into fields.  There are instruction types or formats, which on MIPS are straightforward: I-Type, J-Type, and R-Type.  These formats then define the instruction fields, and the assembler follows these encodings.  All the instruction formats share the 6-bit opcode field, and this opcode field tells the processor what format the instruction is, which fields it therefore has, and thus how to interpret and execute the rest of the instruction.
The assembler removes labels from the assembly: labels and their names do not exist in the program binary.  The label definitions themselves (label:) are omitted from the program binary, but usages of labels are translated into numbers, so a machine code instruction that uses a label will have some instruction field that is numeric, and the assembler will provide a proper value for that numeric field so that the effect of reaching or otherwise accessing what the label referred to is accomplished.  (The label is no longer in the program binary, but the code or data memory that the label referred to does remain.)
The assembler sets up branch instructions, j instructions, and la/lw instructions, using numbers that tell the processor how far forward or backward to move the program counter, or, what address some data of interest is at.  The lw/la instructions access data, and these use 2 x 32-bit instructions each holding 16 bits of the address of interest.  Between the two instructions, they put together a full 32-bit address for data access.  For branches to fully reach any 32-bit address, they would have to put together the 32-bit address in a similar manner (two instruction pair) and use an indirect/register branch.
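For the 26-bit question specifically: instructions are word aligned, so the low 2 bits of a target are always zero and are not stored. The j field holds bits 27..2 of the target, and the processor takes the top 4 bits from PC+4. Conditional branches store a signed 16-bit word offset relative to PC+4. The sketch below uses hypothetical helper names and assumes exit sits at 0x0040001C, the word right after j loop.

```python
def encode_j(target):
    """J-type word: opcode 2 plus bits 27..2 of the target address."""
    return (2 << 26) | ((target >> 2) & 0x03FFFFFF)

def decode_j_target(pc, word):
    """Rebuild the target: top 4 bits of PC+4, then the 26-bit field shifted left by 2."""
    return ((pc + 4) & 0xF0000000) | ((word & 0x03FFFFFF) << 2)

def branch_offset(pc, target):
    """Branch immediate: signed word offset from PC+4, kept as 16 bits."""
    return ((target - (pc + 4)) >> 2) & 0xFFFF

j_word = encode_j(0x00400010)                 # j loop, located at 0x00400018
print(f"{j_word:08x}", hex(decode_j_target(0x00400018, j_word)))
print(branch_offset(0x00400014, 0x0040001C))  # beq $t2, $t3, exit
```

One consequence is that j can only reach targets in the same 256 MB region as the instruction after it; anything farther needs a register jump such as jr.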

How to convert / encode a negative number as an immediate in MIPS machine code

I want to change this instruction to binary or machine code:
addi $s3, $s1, -1000.
I know how to encode the opcode, rs, and rt, but I have no idea how to convert -1000 to binary.
I know how to get 1's complement and 2's complement. But I don't know how to express it in this I-type instruction.
I just don't know how to express -1000 in the last 16 bits as a binary number.
Since 1000 (decimal) is 0000001111101000 in 16 bits,
1's complement is 1111110000010111
+1
= 1111110000011000 2's complement
so the answer for the whole instruction is
001000 10001 10011 1111110000011000
addi rs rt immediate
Is this right?
Yes, MIPS addi / addiu use a 16-bit signed 2's complement immediate as the low 16 bits of the instruction word. The CPU will sign-extend it to 32 (or 64) bits when decoding.
But note that the logical instructions ori / xori / andi use unsigned 16-bit immediates that are zero-extended to 32 bits (or 64 bits), so -1000 is not encodable.
To implement xori $t0, $t1, -1000, you'd need to create a 32-bit -1000 in a register with something like addiu $at, $zero, -1000, then you could xor $t0, $t1, $at. ($at is the "assembler temporary" register that pseudo-instructions like bgt use.)
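The encoding from the question can be checked mechanically. The following is a small sketch with hypothetical helper names; it packs the I-type fields (opcode 001000 for addi, rs = 17 for $s1, rt = 19 for $s3) and then shows how the same 16 bits read under sign extension versus zero extension.

```python
def encode_itype(opcode, rs, rt, immediate):
    """Pack an I-type word; the immediate is reduced to its low 16 bits."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)

def sign_extend16(value):
    """How addi/addiu interpret the 16-bit immediate field."""
    return value - 0x10000 if value & 0x8000 else value

def zero_extend16(value):
    """How andi/ori/xori interpret the 16-bit immediate field."""
    return value & 0xFFFF

word = encode_itype(0b001000, 17, 19, -1000)   # addi $s3, $s1, -1000
imm = word & 0xFFFF
print(f"{word:032b}")
print(f"{imm:016b}", sign_extend16(imm), zero_extend16(imm))
```

Masking with 0xFFFF produces the same bits as the manual invert-and-add-one procedure, and the zero-extended reading (64536) is why a logical instruction cannot take -1000 as an immediate.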

Arithmetic Overflow in mips

I have just started learning about exception handlers for MIPS.
I need to make my program raise an arithmetic overflow exception so that I can test my exception handler.
I have two arrays, A and B. Array A holds hex numbers and array B holds integers.
How can I cause an overflow by adding a hex number and an integer?
Which hex number and integer cause an overflow when added?
According to the MIPS instruction reference, the only addition operations which can produce overflow exceptions are the signed addition instructions:
ADD
ADDI
MIPS integers are 32-bit, and since you'll be using signed integers, the maximum value is 2^31-1 (aka 2147483647 or hex 7FFFFFFF). Thus any addition which results in a number larger than this should throw an exception, e.g. if you try to add 1 to 2147483647:
# Load 2147483647 into $s1
LUI $s0, 32767
ORI $s1, $s0, 65535
# Add 1 to $s1 and store in $s2. This should produce an overflow exception
ADDI $s2, $s1, 1
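To pick other operand pairs for testing, it can help to predict which additions trap. This is a rough model rather than a simulator: add_traps is a hypothetical helper that treats both 32-bit patterns as two's complement values and reports whether their true sum falls outside the signed 32-bit range.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def to_signed32(bits):
    """Interpret a 32-bit pattern as a two's complement value."""
    bits &= 0xFFFFFFFF
    return bits - 2**32 if bits & 0x80000000 else bits

def add_traps(a_bits, b_bits):
    """True when ADD/ADDI would raise an overflow exception for these operands."""
    total = to_signed32(a_bits) + to_signed32(b_bits)
    return not (INT32_MIN <= total <= INT32_MAX)

s1 = (32767 << 16) | 65535                # value built by the LUI/ORI pair
print(hex(s1), add_traps(s1, 1))
print(add_traps(0x80000000, 0xFFFFFFFF))  # most negative value plus -1
```

Whether a value was written in hex or decimal in the source makes no difference; only the resulting 32-bit patterns matter.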

double precision integer subtraction with 32-bit registers(MIPS)

I am learning computer arithmetic. The book I use (Patterson and Hennessy) lists the question below.
Write MIPS code to conduct double precision integer subtraction for 64-bit data. Assume the first operand to be in registers $t4(hi) and $t5(lo), second in $t6(hi) and $t7(lo).
My solution to the question is
sub $t3, $t5, $t7 # Subtract lo parts of operands. t3 = t5 - t7
sltu $t2, $t5, $t7 # If the lo part of the 1st operand is less than the 2nd,
# it means a borrow must be made from the hi part
add $t6, $t6, $t2 # Simulate the borrow of the msb-of-low from lsb-of-high
sub $t2, $t4, $t6 # Subtract the hi's. t2 = t4 - t6
However, the author-given solutions for this problem are as below
For signed double precision integers,
subu $t3, $t5, $t7
sltu $t2, $t5, $t7
add $t6, $t6, $t2
sub $t2, $t4, $t6
For unsigned double precision integers,
subu $t3, $t5, $t7
sltu $t2, $t5, $t7
addu $t6, $t6, $t2
subu $t2, $t4, $t6
My understanding of the difference in operation of sub/add and subu/addu is that overflow-exception is generated in sub/add and not in subu/addu. Both sub/add and subu/addu subtract/add the bits of the operands and the interpretation of the operands being signed or unsigned makes no difference to the result unlike in slt and sltu instructions.
Question 1
I am inferring from the author-given solutions that overflow detection is being handled, whereas I did not think of it in my solution. Am I right? Is there anything else I am missing?
Question 2
Assuming that my above inference is right, why is overflow detection switched off in the author-provided solution for unsigned double precision subtraction, through the use of addu and subu?
For addition and subtraction, there is no difference between signed and unsigned operands, except for the notion of overflow. An overflow is what happens when the numerical value of the result does not match the interpretation of the sequence of bits that you obtain.
For instance, consider 8-bit sequences (MIPS has 32-bit registers, but 8 bits are easier for my examples). Let us assume unsigned interpretation: an 8-bit sequence represents a numerical value between 0 and 255 (inclusive). If I add 10010011 (numerical value 147) to 01110110 (numerical value 118) then I get 00001001 (numerical value 9). 9 is not equal to 147+118. I get that result because the mathematical value is 265, which cannot fit in 8 bits. The addition result would have required 9 bits, but the upper ninth bit has been dropped.
Now, imagine the same example with the signed interpretation. 10010011 now has numerical value -109. 01110110 still has numerical value 118, and the obtained result (00001001) has value 9. The mathematical sum of -109 and 118 is 9, so there is no overflow.
This means that the notion of overflow depends on how you interpret the values. The addition mechanics are the same for both signed and unsigned interpretations (for the same input sequences of bits, you get the same output bit sequence -- this is the whole point of using two's complement for negative signed values) but overflow handling differs.
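The 8-bit example can be replayed in a few lines. This is just an illustrative sketch with hypothetical helper names; it performs one truncated addition and then checks the result against both interpretations.

```python
def add8(a, b):
    """8-bit addition: keep the low 8 bits, drop the carry out."""
    return (a + b) & 0xFF

def as_signed8(bits):
    """Interpret an 8-bit pattern as a two's complement value."""
    return bits - 256 if bits & 0x80 else bits

a, b = 0b10010011, 0b01110110
result = add8(a, b)
unsigned_overflow = (a + b) != result
signed_overflow = (as_signed8(a) + as_signed8(b)) != as_signed8(result)
print(f"{result:08b}", unsigned_overflow, signed_overflow)
```

The output bits are identical in both cases; only the overflow verdict changes with the interpretation.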
The MIPS architecture provides means for triggering exceptions on overflow. Conceptually, there are three possible addition operations on 32-bit words:
an addition which silently ignores overflows (result is truncated)
an addition which raises an exception when a signed overflow occurs (there is an overflow if the input and output sequences are interpreted as signed numbers)
an addition which raises an exception when an unsigned overflow occurs (there is an overflow if the input and output sequences are interpreted as unsigned numbers)
The MIPS implements the first two kinds of additions, with, respectively, the addu and add opcodes. In the MIPS documentation, they are called, respectively, unsigned and signed arithmetic. There is no opcode for raising exceptions on unsigned overflows. In practice, C compilers use only addu, but they could use add for signed types (this is allowed by the C standard, but would break an awful lot of existing code). Ada compilers use add because Ada makes overflow checking mandatory.
That being said...
Patterson and Hennessy want to implement signed and unsigned arithmetic on 64-bit integers. For unsigned arithmetic, they want no exception whatsoever, hence they use addu and subu. For signed arithmetic, they want an exception to occur when the mathematical result would not fit in a 64-bit sequence with signed interpretation. They do not want to raise an exception because of some spurious overflow-like condition when processing the low 32-bit halves. This is why they use a subu for the low parts.
Your solution is wrong because it may raise an exception where it should not. Suppose that you want to subtract -2000000000 (minus two billion) from 2000000000 (two billion). The mathematical result is 4000000000 (four billion). The two operands and the result certainly fit in 64 bits (the representable range is -9223372036854775808 to 9223372036854775807). Hence, for 64-bit signed arithmetic, there is no overflow: there should be no exception. However, in that situation, your first sub will report an overflow. That sub works with 32-bit values and signed 32-bit arithmetic. Its operands will be 01110111001101011001010000000000 and 10001000110010100110110000000000. Notice that these values both fit in 32 bits: the 32-bit signed interpretations of these values are, respectively, plus and minus two billion. The subtraction result, however, is four billion, and it does not fit in 32 bits (as a signed number). Thus, your sub raises an exception.
As a rule of thumb, overflow detection is about doing things which depend on signedness interpretation, which impacts handling of the most significant bit. For big integer arithmetics, all words except the most significant shall be treated as unsigned, hence addu/subu everywhere. As a first step, things are easier to understand if you first concentrate on unsigned arithmetics, with no exception (then you just use addu and subu, and never add or sub).
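The unsigned, exception-free variant can be modelled directly, which makes it easy to compare against ordinary 64-bit arithmetic. This is a sketch under stated assumptions: sub64_halves and split64 are hypothetical names, each line mirrors one instruction of the subu/sltu/addu/subu sequence, and masking with 0xFFFFFFFF stands in for 32-bit register truncation.

```python
MASK32 = 0xFFFFFFFF

def sub64_halves(a_hi, a_lo, b_hi, b_lo):
    """64-bit subtraction on 32-bit halves; returns (hi, lo) of a - b."""
    lo = (a_lo - b_lo) & MASK32        # subu $t3, $t5, $t7
    borrow = 1 if a_lo < b_lo else 0   # sltu $t2, $t5, $t7 (unsigned compare)
    b_hi = (b_hi + borrow) & MASK32    # addu $t6, $t6, $t2
    hi = (a_hi - b_hi) & MASK32        # subu $t2, $t4, $t6
    return hi, lo

def split64(value):
    """Two's complement 64-bit value as (hi, lo) unsigned 32-bit halves."""
    value &= 0xFFFFFFFFFFFFFFFF
    return value >> 32, value & MASK32

a, b = 2000000000, -2000000000
hi, lo = sub64_halves(*split64(a), *split64(b))
print(hex(hi), hex(lo), (hi << 32) | lo)
```

With the two-billion operands the low halves produce a borrow and the combined result is 4000000000, with no 32-bit signed check anywhere in the low-word step.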