Need some assistance understanding binary addition/subtraction using 2's complement

If A = 01110011, B = 10010100, how would I add these?
I did this:
i.e.: 01110011 + 10010100 = 100000111
Though, isn't it essentially 115 + (-108) = 7? Instead, I'm getting -249 (reading all 9 bits as a signed value).
Edit: I see that by removing the highest-order bit (overflow) I get 7, which is what I'm looking for, but I don't get why you wouldn't keep the extra bit.
Edit 2: OK, I figured it out. There was no overflow as I had assumed, because 7 is within [-128, 127] (8 bits). Instead, as Omar hinted, I was supposed to drop the "extra" 1 from the addition.

Your calculation is correct and the result is correct.
You stated that the second number is -108, so both of your numbers are interpreted as signed 8-bit values. You should therefore also interpret your result as an 8-bit signed value. This is why the 9th bit must be dropped, and so the result is 7 (00000111).
On real hardware, such as an 8-bit CPU, all the registers are 8 bits wide, so you can only store the lowest 8 bits of the result, which here is 7 (00000111).
In some cases, the 9th bit may also be put inside a carry/overflow flag so it's not completely "dropped".
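To make the above concrete, here is a minimal Python sketch of what an 8-bit register would do with this addition. The function name `add_twos_complement` and its `bits` parameter are made up for illustration; it is not how any particular CPU is implemented, just the same arithmetic written out.

```python
def add_twos_complement(a: int, b: int, bits: int = 8):
    """Add two bit patterns; return (signed result, carry out) for the given width."""
    mask = (1 << bits) - 1
    raw = (a & mask) + (b & mask)   # may need bits + 1 bits to write down
    carry = raw >> bits             # the "extra" 9th bit
    low = raw & mask                # what actually fits in the register
    # Reinterpret the stored pattern as signed: top bit set means negative
    signed = low - (1 << bits) if low >> (bits - 1) else low
    return signed, carry


result, carry = add_twos_complement(0b01110011, 0b10010100)
print(result, carry)  # 7 1
```

The carry comes back separately, which mirrors the idea of a carry flag: the bit is not part of the 8-bit result, but it is not necessarily lost either.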


How to perform Arithmetic on Ones Complement Numbers and correct overflow?

For some backstory, I'm making a program that can do arithmetic on ones complement numbers. To do this I'm converting a binary string into a BigInteger and then performing the math using said BigIntegers, and then converting that back into a binary string. The only problem occurs when the end result goes below -127 or above +127 because I don't know how to correct it due to the nature of ones complement numbers. I was hoping I could somehow instead convert them like unsigned numbers and do like what this answer says to do.
There are also a couple of other questions that I got from reading the linked question. I put them in block quotes. I'm just asking for information on what they mean, and explain it to me.
Firstly
I know that the r-1 complement for r-base number should do end around carry if the highest bit has carry.
Secondly
End-around carry is actually rather simple: it changes the modulus of the addition operation from r^n to r^n - 1.
And lastly
Again, let's keep the carry bit where it is. If you look at the numbers as unsigned integers, we're computing 13 + 11 = 24. However, due to the wrap-around carry, addition is done modulo 15, so we end up with 9, which represents -6 (the correct result).
If someone can explain these quotes to me and provide some web pages for me to read I would greatly appreciate it! :)
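For what it's worth, the third quote can be reproduced with a small Python sketch. This assumes 4-bit ones' complement patterns as in the quoted example; `ones_complement_add` and `ones_complement_value` are hypothetical helper names, and the sketch does not handle results that fall outside the representable range.

```python
def ones_complement_add(a: int, b: int, bits: int = 4) -> int:
    """Add two ones' complement bit patterns using end-around carry."""
    mask = (1 << bits) - 1
    total = (a & mask) + (b & mask)
    if total > mask:                 # a carry came out of the top bit
        total = (total & mask) + 1   # end-around carry: add it back at the bottom
    return total & mask


def ones_complement_value(pattern: int, bits: int = 4) -> int:
    """Interpret a bit pattern as a ones' complement signed value."""
    mask = (1 << bits) - 1
    if pattern >> (bits - 1):        # sign bit set: negate the inverted pattern
        return -(~pattern & mask)
    return pattern


s = ones_complement_add(0b1101, 0b1011)    # patterns 13 and 11, i.e. -2 and -4
print(s, ones_complement_value(s))         # 9 -6
```

Dropping the carry and adding 1 is the same as subtracting 15 from the unsigned sum (24 - 16 + 1 = 9), which is where the "modulo 15" in the quote comes from.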

Output_precision(); Octave

Hi, after calling this code (Octave) I get an answer with 7 digits of precision, but I need only 6. It is worth mentioning that on a different data set the output is normal (with 6 digits):
output_precision(6);
Prev
output:
Prev =
0.1855318
0.2181108
0.1796457
I know this is a little late but I wanted to add an answer for anyone with the same question in the future.
According to the function reference for output_precision(), the argument passed to the function (in this case, 6) specifies the minimum number of significant figures, which only guarantees that future numeric output won't have less than that number of significant figures.
From what I've seen, if you use output_precision(new_val) before displaying an array (e.g., Prev in the question), then Octave will round the element with the fewest digits before the decimal point to have new_val significant figures, and then all other elements will be rounded to have the same number of digits after the decimal point as that initial rounded result. If you use a statement to output a single value instead of an array, then the output is just rounded to new_val significant figures. However, I don't know if this behavior is guaranteed.
Here's a short example of what I mean:
% array defined with values having 5 digits after decimal
F = [401.51670 313.70753 -88.55225 188.50067 280.21988 354.51821 54.51821 350];
output_precision(4)
F
output_precision(6)
F
Output:
F =
401.52 313.71 -88.55 188.50 280.22 354.52 54.52 350.00
F =
401.5167 313.7075 -88.5523 188.5007 280.2199 354.5182 54.5182 350.0000
It can be a little quirky if you try to round the values too much. When I used output_precision(3) and then output F, the numbers were actually rounded as if my system's default precision, 5, was still active. However, when I used elements with only 2 or 3 digits after the decimal to define another array, it displayed as expected with output_precision(3).
Check out Octave Forge if you ever need docs for octave features. It's not perfect but it's something. Hope this was helpful.

Binary numbers addition

I have just started doing some binary number exercises to prepare for a class that I will start next month, and I got the hang of converting from decimal to binary and vice versa. But now, with the two letters 'a' and 'b' in this exercise, I am not sure how I can apply that knowledge to add the bits:
Given two binary numbers a = (a7a6 ... a0) and b = (b7b6 ... b0). There is a calculator that can add 4-bit binary numbers. How many bits will be used to represent the result of a 4-bit addition? Why?
We would like to use our calculator to calculate a + b. For this we can put as many as eight bits (4 bits of the first and 4 bits of the second number) of our choice in the calculator and continue to use the result bit by bit.
How many additions does our calculator have to carry out for the addition of a and b at most? How many bits long is the result at most?
How many additions does the calculator have to perform at least for the result to be correct for all possible inputs a and b?
The number of bits needed to represent a 4-bit binary addition is 5. This is because there could be a carry-over bit that pushes the result to 5 bits.
For example, 1111 + 0010 = 10001 (15 + 2 = 17).
This can be done the same way as adding decimal numbers. From right to left just add the numbers of the same significance. If the two bits are 1+1, the result is 10 so that place becomes a zero and the 1 carries over to the next pair of bits, just like decimal addition.
With regard to the min/max number of steps, this seems more like an algorithm-specific question. Look up some different binary addition algorithms, like ripple-carry for instance, and it should give you a better idea of what is meant by the question.
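As a rough illustration of the idea, here is one possible reading of the exercise sketched in Python. It assumes the calculator takes exactly two 4-bit inputs and has no carry input, so a carry from the low half costs an extra addition. `add4` and `add8_with_add4` are hypothetical names, and this is only one way to organize the steps, not necessarily the answer your class expects.

```python
def add4(x: int, y: int) -> int:
    """Hypothetical 4-bit calculator: two 4-bit inputs, result of up to 5 bits."""
    assert 0 <= x <= 0xF and 0 <= y <= 0xF
    return x + y


def add8_with_add4(a: int, b: int):
    """Add two 8-bit numbers using only add4; return (sum, number of add4 calls)."""
    low = add4(a & 0xF, b & 0xF)      # low nibbles
    high = add4(a >> 4, b >> 4)       # high nibbles
    calls = 2
    top = high >> 4                   # 9th bit produced by the high nibbles
    high &= 0xF
    if low >> 4:                      # carry out of the low nibbles
        bumped = add4(high, 1)        # feed the carry in with one more addition
        calls += 1
        top |= bumped >> 4
        high = bumped & 0xF
    return (top << 8) | (high << 4) | (low & 0xF), calls


print(add8_with_add4(0b11111111, 0b11111111))  # (510, 3)
print(add8_with_add4(0b00010010, 0b00100001))  # (51, 2)
```

Under these assumptions the sum of two 8-bit numbers needs at most 9 bits, for the same reason a 4-bit addition needs at most 5.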

Erlang cowboy reply json data , float number precision is wrong?

code is here :
RstJson = rfc4627:encode({obj, [{"age", 45.99}]}),
{ok, Req3} = cowboy_req:reply(200, [{<<"Content-Type">>, <<"application/json;charset=UTF-8">>}], RstJson, Req2)
Then I get this wrong data in the front-end client:
{
"age": 45.990000000000002
}
The float number precision has changed!
How can I solve this problem?
Let's have a look at the JSON that rfc4627 generates:
> io:format("~s~n", [rfc4627:encode({obj, [{"age", 45.99}]})]).
{"age":4.59900000000000019895e+01}
It turns out that rfc4627 encodes floating-point values by calling float_to_list/1, which uses "scientific" notation with 20 digits of precision. As Per Hedeland noted on the erlang-questions mailing list in November 2007, that's an odd choice:
A reasonable question could be why float_to_list/1 generates 20 digits
when a 64-bit float (a.k.a. C double), which is what is used internally,
only can hold 15-16 worth of them - I don't know off-hand what a 128-bit
float would have, but presumably significantly more than 20, so it's not
that either. I guess way back in the dark ages, someone thought that 20
was a nice and even number (I hope it wasn't me:-). The 6.30000 form is
of course just the ~p/~w formatting.
However, it turns out this is actually not the problem! In fact, 45.990000000000002 is equal to 45.99, so you do get the correct value in the front end:
> 45.990000000000002 =:= 45.99.
true
As noted above, a 64-bit float can hold 15 or 16 significant digits, but 45.990000000000002 contains 17 digits (count them!). It looks like your front end tries to print the number with more precision than it actually contains, thereby making the number look different even though it is in fact the same number.
The answers to the question Is floating point math broken? go into much more detail about why this actually makes sense, given how computers handle floating point values.
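The same check can be reproduced outside Erlang. Here is a small Python sketch, assuming (as is the case in CPython) that floats are the same 64-bit IEEE 754 doubles Erlang uses:

```python
# The 17-digit literal seen in the front end parses to the same double as 45.99
shown = float("45.990000000000002")
print(shown == 45.99)         # True

# Python's repr prints the shortest string that round-trips to the same double
print(repr(45.99))            # 45.99

# Asking for 20 digits after the point exposes the stored binary approximation
twenty = format(45.99, ".20e")
print(twenty)
print(float(twenty) == 45.99)  # True
```

All three spellings name the same stored value; only the number of digits requested when printing differs.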
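The function in rfc4627 that encodes float numbers is:
encode_number(Num, Acc) when is_float(Num) ->
lists:reverse(float_to_list(Num), Acc).
I changed it like this:
encode_number(Num, Acc) when is_float(Num) ->
%% ~p prints the float compactly (45.99 here); io_lib:format/2 can return
%% a nested list, so flatten it before reversing character by character
lists:reverse(lists:flatten(io_lib:format("~p", [Num])), Acc).
Problem solved.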

2's complement example, why not carry?

I'm watching some great lectures from David Malan (here) that go over binary. He talked about signed/unsigned, 1's complement, and 2's complement representations. There was an addition done of 4 + (-3) which lined up like this:
0100
1101 (flip 0011 to 1100, then add "1" to the end)
----
0001
But he waved his magical hands and threw away the last carry. I did some Wikipedia research but didn't quite get it. Can someone explain to me why that particular carry (in the 8's -> 16's columns) was dropped, but he kept the one just prior to it?
Thanks!
The last carry was dropped because it does not fit in the target space. It would be the fifth bit.
If he had carried out the same addition, but with for example 8 bit storage, it would have looked like this:
00000100
11111101
--------
00000001
In this situation we would also be stuck with an "unused" carry.
We have to treat carries this way to make addition with two's complement work properly, but that's all good, because this is the easiest way of treating carries when you have limited storage. Anyway, we get the correct result, right :)
x86-processors store such an additional carry in the carry flag (CF), which is possible to test with certain instructions.
A carry is not the same as an overflow
In the example you do have a carry out of the MSB. By definition, this carry ends up on the floor. (If there was someplace for it to go, then it would not have been out of the MSB.)
But adding two numbers with different signs cannot overflow. An overflow can only happen when two numbers with the same sign produce a result with a different sign.
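The sign rule above can be written as a short check. This is a minimal Python sketch, assuming the operands are given as unsigned bit patterns of a fixed width; the name `signed_overflow` is made up for illustration.

```python
def signed_overflow(a: int, b: int, bits: int = 4) -> bool:
    """True if a + b overflows when a and b are two's complement bit patterns."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    result = (a + b) & mask
    # Overflow: operands agree in sign, but the result's sign differs from theirs
    same_sign_operands = ~(a ^ b) & sign
    result_sign_flipped = (a ^ result) & sign
    return bool(same_sign_operands and result_sign_flipped)


print(signed_overflow(0b0100, 0b1101))  # False: 4 + (-3), different signs
print(signed_overflow(0b0100, 0b0100))  # True: 4 + 4 does not fit in 4 bits
```

Note that the first case has a carry out of the MSB and still does not overflow, which is exactly the distinction being made.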
If you extend the left-hand side by adding more digit positions, you'll see that the carry rolls over into an infinite number of bit positions towards the left, so you never really get a final carry of 1. So the answer is positive.
...000100
+...111101
----------
....000001
At some point you have to set the number of bits to represent the numbers. He chose 4 bits. Any carry into the 5th bit is lost. But that's OK because he decided to represent the number in just 4 bits.
If he decided to use 5 bits to represent the numbers he would have gotten the same result.
That's the beauty of it... Your result will be the same size as the terms you are adding. So the fifth bit is thrown out
In 2's complement, the carry bits tell you whether the last operation overflowed.
You must look at the LAST two carry bits to see if there was overflow. In your example, the last two carry bits were 11 meaning that there was no overflow.
If the last two carry bits are 11 or 00 then no overflow occurred. If the last two carry bits are 10 or 01 then there was overflow. That is why he sometimes cared about the carry bit and other times he ignored it.
The first row below is the carry row. The left-most bits in this row are used to determine if there was overflow.
1100
0100
1101
----
0001
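The "last two carry bits" test can be sketched in Python as follows. It assumes operands are given as unsigned bit patterns; `last_two_carries` and `overflow_by_carries` are hypothetical names used only for this illustration.

```python
def last_two_carries(a: int, b: int, bits: int = 4):
    """Return (carry out of the MSB, carry into the MSB) for a + b."""
    mask = (1 << bits) - 1
    low_mask = (1 << (bits - 1)) - 1
    # Carry into the MSB: add everything below the MSB and see what spills over
    carry_in = ((a & low_mask) + (b & low_mask)) >> (bits - 1)
    # Carry out of the MSB: add the full patterns and see what spills over
    carry_out = ((a & mask) + (b & mask)) >> bits
    return carry_out, carry_in


def overflow_by_carries(a: int, b: int, bits: int = 4) -> bool:
    """Overflow occurred exactly when the last two carry bits differ."""
    carry_out, carry_in = last_two_carries(a, b, bits)
    return carry_out != carry_in


print(last_two_carries(0b0100, 0b1101))     # (1, 1): no overflow
print(overflow_by_carries(0b0100, 0b0100))  # True: carries are 0 and 1
```

The pair (1, 1) for 0100 + 1101 matches the two left-most bits of the carry row shown above.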
Looks like you're only using 4 bits, so there is no 16's column.
If you were using more than 4 bits then the -3 representation would be different, and the carry of the math would still be thrown out the end. For example, with 6 bits you'd have:
000100
111101
------
1000001
and since the carry is outside the bit range of your representation it's gone, and you only have 000001
Consider 25 + 15:
5+5 = 10, we keep the 0 and let the 1 go to the tens-column. Then it's 2 + 1 (+ 1) = 4. Hence the result is 40 :)
It's the same thing with binary numbers. 0 + 1 = 1, 0 + 0 = 0, 1 + 1 = 10 => send the 1 to the 8-column, 0 + 1 (+ 1) = 10 => send the 1 to the next column. Here's the carry out of the top bit, and that's why we just throw the 1 away.
This is why 2's complement is so great. It allows you to add / subtract just like you do with base-10, because you (ab)use the fact that the sign bit is the MSB, which will cascade operations all the way to overflows, when necessary.
Hope I made myself understood. Quite hard to explain this when English is not your native tongue :)
When performing 2's complement addition, the only time that a carry indicates a problem is when there's an overflow condition - that can't happen if the 2 operands have a different sign.
If they have the same sign, then the overflow condition is that the sign bit of the result differs from the sign of the 2 operands, i.e., the carry into the most significant bit does not match the carry out of it.
If I remember my computer architecture learnin' this is often detected at the hardware level by a flag that's set when the carry into the most significant bit is different than the carry out of the most significant bit. Which is not the case in your example (there's a carry into the msb as well as out of the msb).
One simple way to think of it is as "the sign not changing". If the carry into the msb is different than the carry out, then the sign has improperly changed.
The carry was dropped because there wasn't anything that could be done with it. If it's important to the result, it means that the operation overflowed the range of values that could be stored in the result. In assembler, there's usually an instruction that can test for the carry beyond the end of the result, and you can explicitly deal with it there - for example, carrying it into the next higher part of a multiple precision value.
Because you are talking about 4-bit representations. It's unusual compared to an actual machine, but if we take for granted for a moment that a computer has 4 bits in each byte, then we have the following properties: a signed byte holds values from -8 to 7 and wraps around at those limits. Anything outside that range cannot be stored. Besides, what would you do with an extra 5th bit beyond the sign bit anyway?
Now, given that, we can see from everyday math that 4 + (-3) = 1, which is exactly what you got.