How to do bit insertion - binary

How to insert bits to a certain position?
For example, if I have an integer 115 (i.e. b1110011).
And I want to insert integer 3 (i.e. b11) to position 3 from the right. In this case, let's assume that I know the size of the binary that I want to insert. For example, I know that 3 is 2 digits binary.
Therefore the resulting binary will be 475 (i.e. b111011011)
I have a solution in mind but it seems complicated. It involves first, shifting the bits 2 digits to the left to give room for the new bits. Then set rightmost 2+position bits to zero. Therefore now I have 111000000.
Then I will OR the result to change the bold bits part with the rightmost 2+position from the original bits with the first two position is changed to the bits that I want to insert. (i.e. 11011).
Therefore in the end I will have 111000000 | 11011 == 111011011.
But I think my solution is too complicated. Thus, is there any better way to do this operation?

Related

Theory behind multiplying two numbers without operands

I have been reading a Elements of Programming Interview and am struggling to understand the passage below:
"The algorithm taught in grade-school for decimal multiplication does
not use repeated addition- it uses shift and add to achieve a much
better time complexity. We can do the same with binary numbers- to
multiply x and y we initialize the result to 0 and iterate through the
bits of x, adding (2^k)y to the result if the kth bit of x is 1.
The value (2^k)y can be computed by left-shifting y by k. Since we
cannot use add directly, we must implement it. We can apply the
grade-school algorithm for addition to the binary case, i.e, compute
the sum bit-by-bit and "rippling" the carry along.
As an example, we show how to multiply 13 = (1101) and 9 = (1001)
using the algorithm described above. In the first iteration, since
the LSB of 13 is 1, we set the result to (1001). The second bit of
(1101) is 0, so we move on the third bit. The bit is 1, so we shift
(1001) to the left by 2 to obtain (1001001), which we add to (1001) to
get (101101). The forth and final bit of (1101) is 1, so we shift
(1001) to the left by 3 to obtain (1001000), which we add to (101101)
to get (1110101) = 117.
My Questions are:
What is the overall idea behind this, how is it a "bit-by-bit" addition
where does (2^k)y come from
what does it mean by "left-shifting y by k"
In the example, why do we set result to (1001) just because the LSB of 13 is 1?
The algorithm relies on the way numbers are coded in binary.
Let A be an unsigned number. A is coded by a set of bits an-1an-2...a0 in such a way that A=∑i=0n-1ai×2i
Now, assume you have two numbers A and B coded in binary and you wand to compute A×B
B×A=B×∑i=0n-1ai×2i
=∑i=0n-1B×ai×2i
ai is equal to 0 or 1. If ai=0, the sum will not be modified. If ai=1, we need to add B×ai
So, we can simply deduce the multiplication algorithm
result=0
for i in 0 to n-1
if a[i]=1 // assumes a[i] is the ith bit
result = result + B * 2^i
end
end
What is the overall idea behind this, how is it a "bit-by-bit" addition
It is just an application of the previous method where you process successively every bit of the multiplicator
where does (2^k)y come from
As mentioned above from the way binary numbers are coded. If ith bit is set, then there is a 2i in the decomposition of the number.
what does it mean by "left-shifting y by k"
Left shift means "pushing" the bits leftwards and filling the "holes" with zeroes. Hence if number is 1101 and it is left shifted by three, it becomes 1101000.
This is the way to multiply the number by 2i (just as when "left shifting" by 2 a decimal number and putting zeroes at the right places is the way to multiply by 100=102)
In the example, why do we set result to (1001) just because the LSB of 13 is 1?
Because there is a 1 at right most position, that corresponds to 20. So we left shift by 0 and add it to the result that is initialized to 0.

How would I convert a number into binary bits, then truncate or enlarge their size, and then insert into a bit container?

As the title of the question says, I want to take a number (declared preferably as int or char or std::uint8_t), convert it into its binary representation, then truncate or pad it by a certain variable number of bits given, and then insert it into a bit container (preferably std::vector<bool> because I need variable bit container size as per the variable number of bits). For example, I have int a= 2, b = 3. And let's say I have to write this as three bits and six bits respectively into the container. So I have to put 010 and 000011 into the bit container. So, how would I go from 2 to 010 or 3 to 000011 using normal STL methods? I tried every possible thing that came to my mind, but I got nothing. Please help. Thank you.
You can use a combination of 'shifting' (>>) and 'bit-wise and' (&).
First lets look at the bitwise &: For instance if you have an int a=7 and you do the &-operation on it with 13, you will get 5. Why?
Because & gives 1 at position i iff both operands have a 1 at position i. So we get:
00...000111 // binary 7
& 00...001101 // binary 13
-------------
00...000101 // binary 5
Next, by using the shift operation >> you can shift the binary representation of your ints. For instance 5 >> 1 is 2. Why?
Because each position gets displaced by 1 to the right. The rightmost bit "falls out". Hence we have:
00...00101 //binary for 5
shift by 1 to the right gives:
00...00010 // binary for 2
Another example: 13 (01101) shifted by 2 is 3 (00011). I hope you get the idea.
Hence, by repeatedly shifting and doing & with 1 (00..0001), you can read out the binary representation of a number.
Finally, you can use this 1 to set the corresponding position in your vector<bool>. Assuming you want to have the representation you show in your post, you will have to fill in your vector from the back. So, you could for instance do something along the lines:
unsigned int f=13; //the number we want to convert
std::vector<bool> binRepr(size, false); //size is the container-size you want to use.
for(int currBit=0; currBit<size; currBit++){
binRepr[size-1-currBit] = (f >> currBit) & 1;
}
If the container is smaller than the binary representation of your int, the container will contain the truncated number. If it is larger, it will fill in with 0s.
I'm using an unsigned int since for an int you would still have to take care of negative numbers (for positive numbers it should work the same) and we would have to dive into the two's complement representation, which is not difficult, but requires a bit more bit-fiddling.

tricky binary multiplication

I attempted to multiply binary 1111 as first input and 1111 as second input. When I multiply as usual I came across having to do the addition below I encounter having to carry the 1 with the three 1's which would mean 4 in binary with 2 bits. But that's impossible to represent 4 in 2 bits for this multiplication problem.
If you want to add multiple binary values, then you just carry whatever is left over after adding a column, regardless of how many bits you need to represent the carry.
It's just like doing the decimal add 99+99+99+99+99+99+99+99+99+99+99+99, when adding the least significant column, you end up with 108, so you carry 10 eventhough it's too large to fit in a single digit.
Likewise, if you add the binary 11+11+11+11+11 you end up with 101 when adding the least significant column, so you carry 10.
However, normally you only add two binary numbers at a time, as that lets you get away with using a single bit for carry.
What you have to do is carry the numbers over another digit.
Take the scenario:
11
+11
+11
you would have 1001 as your answer because 4 in binary is 100. Simply carry over the 1s into the correct place.

Wrapping my head around hardware representations of numbers: a hypothetical two's complement question

This is a super naive question (I know), but I think that it will make for a good jumping off point into considering how the basic instruction set of a CPU actually gets carried out:
In a two's complement system, you cannot invert the sign of the most negative number that your implementation can represent. The theoretical reason for this is obvious in that the negation of the most negative number would be out of the range of the implementation (the range is always something like
-128 to 127).
However, what actually happens when you try to carry out the negation operation on the most negative number is pretty strange. For example, in an 8 bit representation, the most negative number is -128, or 1000 0000 in binary. Normally, to negate a number you would flip all the bits and then add one. However, if you try to do this with -128 you end up with:
1000 0000 ->
0111 1111 ->
1000 0000
the same number that you started out with. For this reason, wikipedia calls it "the weird number".
In that same wikipedia article, it says that the above negation
is detected as an overflow condition since there was a carry into but not out of the most-significant bit.
So my question is this:
A) What the heck does that mean? and
B) It seems like the CPU would need to perform an extra error checking step each and every time it carried out a basic arithmetic operation in order to avoid accidents relating to this negation, creating significant overhead. If that is the case, why not just truncate the range of numbers that can be represented to leave the weird number out (i.e. -127 to 127 for 8 bits)? If that isn't the case, how can you implement such error checking without creating extra overhead?
The carry-out bit from the MSB is used as a flag to indicate that we
need more bits. Without it, we would have a system of modular
arithmetic1 without any way of detecting when we wrap around.
In modular arithmetic, you don’t deal with numbers but with
equivalence classes of numbers that have the same remainder. In such
a system, after adding 1 to 127, you would get −128, and you would
conclude that +128 and −128 belong to the same equivalence class.
If you restricted yourself to numbers in the range −127 to +127, you
would have to redefine addition, since 127 + 1 = −127 is nonsense.
Two’s-complement arithmetic, when presented to you by a computer, is
essentially modular arithmetic with the ability to detect an overflow.
This is what a 4-bit adder would look like when adding 0001 to
0111. You can see that in the MSB the carry-in and carry-out are
different:
0 0 0 1
| 0 | 1 | 1 | 1
| | | | | | | |
v v v v v v v v
0 <- ADD <-1- ADD <-1- ADD <-1- ADD <- 0
^ | ^ | | |
v v v v
1 0 0 0
It is this flag that the ALU uses to signal that an overflow occurred,
without any extra steps.
1. Modular arithmetic goes from 0 to 255 instead of −127 to 128, but the basic idea is the same.
It's not that the CPU does another check, its that the transistors are arranged to notice when this happens. And they are built that way because the engineers picked two-complement before they started designing the thing.
The result is that it happens during the same clock cycle as a non-overflowing result would be returned.
How does it work?
The "add 1" stage implements a cascade logic: starting with the LSB each bit is subjected in turn to the truth table
old-bit carry-in new-bit carry-out
-------------------------------------
0 0 0 0
0 1 1 0
1 0 1 0
1 1 0 1
(that is new-bit = old-bit xor carry-in and carry-out = old-bit and carry-in). The "carry-in" for the LSB is the 1 that we're adding, and for the rest of the bits it is the "carry-out" of the previous one (which is why this has to be done in a cascade).
The last of these circuits just adds a circuit for signed-overflow = (carry-in and not carry-out).
First off the wikipedia article states it cannot be negated from a negative signed number to a signed number. And what they mean is because it takes 9 bits to represent positive 128, which you cannot do with an 8 bit register. If you are going from negative signed to positive unsigned as a conversion, then you have enough bits. And the hardware should give you 0x80 when you negate 0x80 because that is the right answer.
For add, subtract, multiply, etc addition in twos complement is no different than decimal math from elementary school. You line up your binary numbers, add the columns, the result for that column is the least significant digit and the rest is carried over to the next column. So adding a 0b001 to 0b001 for example
1
001
001
===
010
Add the two ones in the rightmost column, the result is 0b10 (2 decimal), write zero then carry the one, one plus zero plus zero is one, nothing to carry, zero plus zero is zero, the result is 0b010.
The right most column where 1 plus 1 is 0b10 and we write 0 carry the one, well that carry the one is at the same time the carry out of the right most column and is the carry in of the second column. Also, with pencil and paper math we normally only talk about carry of something when it is non-zero but if you think about it you are always carrying a number like our second columns one plus zero is one carry the zero.
You can think of a twos complement negate as invert and add one, or walking the bits and inverting up to a point then not inverting, or taking the result of zero minus the number.
You can work subtract in binary using pencil and paper for what it is worth, makes your head hurt when borrowing compared to decimal, but works. For what you are asking though think of invert and add one.
It is easier to wrap your head around this if you take it down to even fewer bits than 8, three is a manageable number, it all scales from there.
So the first column below is the input, the second column is the inverted version and the third column is the second column plus one. The fourth column is the carry in to the msbit, the fifth column is the carry out of the msbit.
000 111 000 1 1
001 110 111 0 0
010 101 110 0 0
011 100 101 0 0
100 011 100 1 0
101 010 011 0 0
110 001 010 0 0
111 000 001 0 0
Real quick look at at adding a one to two bits:
00+1 = 001
01+1 = 010
10+1 = 011
11+1 = 100
For the case of adding one to a number, the only case where you carry out from the second bit into the third bit is when your bits are all ones, a single zero in there stops the cascading carry bits. So in the three bit inversion table above the only two cases where you have a carry into the msbit is 111 and 011 because those are the only two cases where those lower bits are all set. For the 111 case the msbit has a carry in and a carry out. for the 011 case the msbit has a carry in but not a carry out.
So as stated by someone else there are transistors wired up in the chip, if msbit carry in is set and msbit carry out is not set then set some flag somewhere, otherwise clear the flag.
So note that the three bit examples above scale. if after you invert and before you add one you have 0b01111111 then you are going to get a carry in without the carry out. If you have 0b11111111 then you get a carry in and a carry out. Note that zero is also a number where you get the same number back when you invert it, the difference is that when the bits are considered as signed, zero can be represented when negated, 1 with all zeros cannot.
The bottom line though is that this is not a crisis or end of the world thing there is a whole lot of math and other operations in the processor where carry bits and significant bits are falling off one side or the other and overflows and underflows are firing off, etc. Most of the time the programmers never check for such conditions and those bits just fall on the floor, sometimes causing the program to crash or sometimes the programmer has used 16 bit numbers for 8 bit math just to make sure nothing bad happens, or uses 8 bit numbers for 5 bit math for the same reason.
Note that the hardware doesnt know signed or unsigned for addition and subtraction. also the hardware doesnt know how to subtract. Hardware adders are three bit adders (two operands and carry in) with a result and carry out. Wire 8 of these up you have an 8 bit adder or subtractor, add without carry is the two operands wired directly with a zero wired in as the lsbit carry in. Add with carry is the two operands wired directly with the carry bit wired to the lsbit carry in. Subtract is add with the second operand inverted and a one on the carry in bit. At least from a high level perspective, that logic can all get optimized and implemented in ways often two hard to understand on casual inspection.
The really fun exercise is multiply, think about doing binary multiplication with pencil and paper, then realize it is much easier than decimal, because it is just a series of shifts and adds. given enough gates you can represent each result bit as a equation with the inputs to the equation being the operands. meaning you can do a single clock multiply if you wish, in the early days that was too many gates, so multi clock shift and adds were done, today we burn the gates and get single clock multiplies. Also note that understanding this also means that if you do say a 16 bit = 8 bit times 8 bit multiply, the lower 8 bit result is the same whether it is a signed multiply or unsigned. Since most folks do things like int = int * int; you really dont need a signed or unsigned multiply if all you care about is the result bits (no checking of flags, etc). fun stuff..
In the ARM Architecture Manual (DDI100E):
OverflowFrom
Returns 1 if the addition or subtraction specified as its parameter
caused a 32-bit signed overflow. [...]
Subtraction causes an overflow if the operands have different signs,
and the first operand and the result have different signs.
NEG
[...]
V Flag = OverflowFrom(0 - Rm)
NEG is the instruction for computing the negation of a number, i.e. the twos complement.
The V flag signals signed overflow and can be used for conditional branching. It's fairly standard across different processor architectures, together with the three other flags Z (zero), C (carry) and N (negative).
For 0 - (-128) = 0 + 128 = -128 the first operand is 0 and the second operand as well as the result is -128, so the condition for overflow is satisfied, and the V flag is set.

2's complement example, why not carry?

I'm watching some great lectures from David Malan (here) that is going over binary. He talked about signed/unsigned, 1's compliment, and 2's complement representations. There was an addition done of 4 + (-3) which lined up like this:
0100
1101 (flip 0011 to 1100, then add "1" to the end)
----
0001
But he waved his magical hands and threw away the last carry. I did some wikipedia research bit didn't quite get it, can someone explain to me why that particular carry (in the 8's ->16's columns) was dropped, but he kept the one just prior to it?
Thanks!
The last carry was dropped because it does not fit in the target space. It would be the fifth bit.
If he had carried out the same addition, but with for example 8 bit storage, it would have looked like this:
00000100
11111101
--------
00000001
In this situation we would also be stuck with an "unused" carry.
We have to treat carries this way to make addition with two's compliment work properly, but that's all good, because this is the easiest way of treating carries when you have limited storage. Anyway, we get the correct result, right :)
x86-processors store such an additional carry in the carry flag (CF), which is possible to test with certain instructions.
A carry is not the same as an overflow
In the example you do have a carry out of the MSB. By definition, this carry ends up on the floor. (If there was someplace for it to go, then it would not have been out of the MSB.)
But adding two numbers with different signs cannot overflow. An overflow can only happen when two numbers with the same sign produce a result with a different sign.
If you extend the left-hand side by adding more digit positions, you'll see that the carry rolls over into an infinite number of bit positions towards the left, so you never really get a final carry of 1. So the answer is positive.
...000100
+...111101
----------
....000001
At some point you have to set the number of bits to represent the numbers. He chose 4 bits. Any carry into the 5th bit is lost. But that's OK because he decided to represent the number in just 4 bits.
If he decided to use 5 bits to represent the numbers he would have gotten the same result.
That's the beauty of it... Your result will be the same size as the terms you are adding. So the fifth bit is thrown out
In 2's complement you use the carry bit to signal if there was an overflow in the last operation.
You must look at the LAST two carry bits to see if there was overflow. In your example, the last two carry bits were 11 meaning that there was no overflow.
If the last two carry bits are 11 or 00 then no overflow occurred. If the last two carry bits are 10 or 01 then there was overflow. That is why he sometimes cared about the carry bit and other times he ignored it.
The first row below is the carry row. The left-most bits in this row are used to determine if there was overflow.
1100
0100
1101
----
0001
Looks like you're only using 4 bits, so there is no 16's column.
If you were using more than 4 bits then the -3 representation would be different, and the carry of the math would still be thrown out the end. For example, with 6 bits you'd have:
000100
111101
------
1000001
and since the carry is outside the bit range of your representation it's gone, and you only have 000001
Consider 25 + 15:
5+5 = 10, we keep the 0 and let the 1 go to the tens-column. Then it's 2 + 1 (+ 1) = 4. Hence the result is 40 :)
It's the same thing with binaries. 0 + 1 = 1, 0 + 0 = 0, 1 + 1 = 10 => send the 1 the 8-column, 0 + 1 ( + 1 ) = 10 => send the 1 to the next column - Here's the overflow and why we just throw the 1 away.
This is why 2's complement is so great. It allows you to add / substract just like you do with base-10, because you (ab)use the fact that the sign-bit is the MSB, which will cascade operations all the way to overflows, when nessecary.
Hope I made myself understood. Quite hard to explan this when english is not you native tongue :)
When performing 2's complement addition, the only time that a carry indicates a problem is when there's an overflow condition - that can't happen if the 2 operands have a different sign.
If they have the same sign, then the overflow condition is when the sign bit changes from the 2 operands, ie., there's a carry into the most significant bit.
If I remember my computer architecture learnin' this is often detected at the hardware level by a flag that's set when the carry into the most significant bit is different than the carry out of the most significant bit. Which is not the case in your example (there's a carry into the msb as well as out of the msb).
One simple way to think of it is as "the sign not changing". If the carry into the msb is different than the carry out, then the sign has improperly changed.
The carry was dropped because there wasn't anything that could be done with it. If it's important to the result, it means that the operation overflowed the range of values that could be stored in the result. In assembler, there's usually an instruction that can test for the carry beyond the end of the result, and you can explicitly deal with it there - for example, carrying it into the next higher part of a multiple precision value.
Because you are talking about 4 bit representations. It's unussual compared to an actual machine, but if we were to take for granted that a computer has 4 bits in each byte for a moment, then we have the following properties: a byte wraps at 15 to -15. Anything outside that range cannot be stored. Besides, what would you do with an extra 5th bit beyond the sign bit anyway?
Now, given that, we can see from everyday math that 4 + (-3) = 1, which is exactly what you got.