How does Bit Masking Buffer Index Result in Wrap around - binary

Can someone explain how bit masking works in terms of a circular buffer index. Specifically in the following code:
#define USART_RX_BUFFER_SIZE 128 /* 2,4,8,16,32,64,128 or 256 bytes */
#define USART_RX_BUFFER_MASK ( USART_RX_BUFFER_SIZE - 1 )
ISR(USART_RX_vect)
{
    unsigned char data;
    unsigned char tmphead;

    /* Read the received data */
    data = UDR0;

    /* Calculate buffer index */
    tmphead = ( USART_RxHead + 1 ) & USART_RX_BUFFER_MASK;
    USART_RxHead = tmphead;      /* Store new index */

    if ( tmphead == USART_RxTail )
    {
        /* ERROR! Receive buffer overflow */
    }

    USART_RxBuf[tmphead] = data; /* Store received data in buffer */
}
I know the result of bit masking the index is that the index wraps around; my question is why? Also, why does the "USART_RX_BUFFER_SIZE" have to be a power of 2?
Thank you
Joe

To understand this, you have to understand some binary, and you have to understand binary operations.
As you probably know, everything in computers is stored in binary, sequences of ones and zeros. This means that any string of data in memory can, in theory, be treated as a number. Since your code is using chars, I will focus on them.
In C, chars are either signed or unsigned; it is important that you use unsigned for this. I won't get into two's complement representation, but suffice it to say that the scheme would break if you used signed chars. A char is a single byte, which is normally 8 bits, like so:
00000000 -> 0
00001001 -> 9
Basically each bit represents a power of two (I'm writing them MSB-first here), so the second number is 2^0 + 2^3 = 1 + 8 = 9. So you can see how a number like this can be used to index into an array.
Bitwise operations operate on the individual bits of some data. In this case, you are using binary and (&), and the act of applying binary and is called bit-masking.
data - 00101100
mask - 11110110
----------
result - 00100100
As you can see, the result has bits set to 1 only where both the data and mask has 1.
Now back to our binary representation. Since each bit is a power of two, a power of two in binary can be represented using a single 1 in amongst 0's.
01000000 - 64
And just like 1000 - 1 = 999, 01000000 - 1 = 00111111, where 00111111 is 63.
Using that we can find that when working out the next index, we perform the following operation:
(a + 1) & 00111111
if a is (for example) 10, then we get
(00001010 + 1) = 00001011 (11)
00001011 & 00111111 = 00001011
So masking made no change, but in the case of 63:
(00111111 + 1) = 01000000 (64)
01000000 & 00111111 = 00000000 (0)
So rather than trying to index into 64 (which is the 65th element, and therefore an error), you go back to the beginning.
This is why the buffer size has to be a power of two: if it weren't, the mask wouldn't produce a clean wrap-around and you would have to use modulo (%) or a comparison instead of bit masking. That matters because bitwise operators are very fast; & is normally a single instruction on most processors and takes very few cycles. Modulo may also be a single instruction, but it usually implies integer division, which is traditionally quite slow on most platforms. A comparison would require several instructions, registers and at least one jump.
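To make the wrap-around concrete, here is a minimal sketch of a power-of-two ring buffer in plain C. It mirrors the AVR code above in spirit, but the names, the buffer size and the full/empty policy are my own simplifications rather than the app note's exact code:

#include <stdint.h>

#define BUF_SIZE 128              /* must be a power of two */
#define BUF_MASK (BUF_SIZE - 1)   /* 127 = 0111 1111 */

static uint8_t buf[BUF_SIZE];
static volatile uint8_t head = 0, tail = 0;

/* Store one byte; returns 0 if the buffer is full. */
int buf_put(uint8_t data)
{
    uint8_t next = (head + 1) & BUF_MASK;  /* 127 + 1 wraps back to 0 */
    if (next == tail)
        return 0;                          /* full: writing would overrun the tail */
    buf[next] = data;
    head = next;
    return 1;
}

/* Fetch one byte; returns 0 if the buffer is empty. */
int buf_get(uint8_t *out)
{
    if (head == tail)
        return 0;                          /* empty */
    tail = (tail + 1) & BUF_MASK;
    *out = buf[tail];
    return 1;
}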

John commented:
I second the comment about modulo, we found on our micro it was taking
something like 10-20x longer to do "a %= MODULUS" than to do if(a>b)
a/=MODULUS; – John U Aug 3 '12 at 17:01
But there is still a division in a /= MODULUS, so I assume the efficiency is the same as the modulo operation...

Related

LC-3 algorithm for converting ASCII strings to Binary Values

Figure 10.4 provides an algorithm for converting ASCII strings to binary values. Suppose the decimal number is arbitrarily long. Rather than store a table of 10 values for the thousands-place digit, another table of 10 values for the ten-thousands-place digit, and so on, design an algorithm to do the conversion without resorting to any tables whatsoever.
I have attached pictures of figure 10.4. I am not looking for an answer to the problem, but rather can someone please explain this problem and perhaps give some direction on how to go about creating the algorithm?
Figure 10.4
Figure 10.4 second image
I am unsure as to what it means by tables and do not know where to start really.
The tables are those global, initialized arrays: one called Lookup10 holding 10, 20, 30, 40, ..., and another called Lookup100 holding 100, 200, 300, 400...
You can ignore the tables: as per the assignment instructions you're supposed to find a different way to accomplish this anyway. Or, you can run that code in a simulator or mentally to understand how it works.
The bottom line is that while LC-3 can do anything (it is Turing complete), it can't do much in any one instruction. For arithmetic & logic, it can do add, not, and. That's pretty much it! But that's enough. Note that modern hardware does everything with only one kind of logic gate, namely NAND, which is a binary operator (so NAND is directly available; NOT by feeding NAND the same operand for both inputs; AND by doing NOT after NAND; OR by applying NOT to both inputs first and then NAND; etc.).
For example, LC-3 cannot multiply or divide or modulus or right shift directly: each of those operations takes many instructions and, in the general case, some looping construct. Multiplication can be done by repetitive addition, and division/modulus by repetitive subtraction. These are very inefficient for larger operands, and the much more efficient algorithms are also substantially more complex, so they greatly increase program complexity beyond that of the simple repetitive approach.
That subroutine goes backwards through the user's input string. It takes a string length count in R1 as a parameter supplied by the caller (not shown). It looks at the last character in the input and converts it from an ASCII character to a binary number.
(We would commonly do that conversion from ascii character to numeric value using subtraction: moving the character values from the ascii character range of 0x30..0x39 to numeric values in the range 0..9, but they do it with masking, which also works.  The subtraction approach integrates better with error detection (checking if not a valid digit character, which is not done here), whereas the masking approach is simpler for LC-3.)
The subroutine then obtains the 2nd-last digit (moving backwards through the user's input string), converting that to binary using the mask approach. That yields a number between 0 and 9, which is used as an index into the first table, Lookup10. The value obtained from the table at that index position is basically the index × 10, so this table is a × 10 table. The same approach is used for the third digit (the first digit of the number, or the last when going backwards), except it uses the 2nd table, which is a × 100 table.
The standard approach for string to binary is called atoi (search it), standing for ASCII to integer. It moves forwards through the string, and for every new digit, it multiplies the existing value, computed so far, by 10 before adding in the next digit's numeric value.
So, if the string is 456, it first obtains 4; then, because there is another digit, it computes 4 × 10 = 40, then + 5 for 45, then × 10 for 450, then + 6 for 456.
The advantage of this approach is that it can handle any number of digits (up to overflow).  The disadvantage, of course, is that it requires multiplication, which is a complication for LC-3.
Multiplication where one operand is the constant 10 is fairly easy even in LC-3's limited capabilities, and can be done with simple addition without looping.  Basically:
n × 10 = n + n + n + n + n + n + n + n + n + n
and LC-3 can do those 9 additions in just 9 instructions.  Still, we can also observe that:
n × 10 = n × 8 + n × 2
and also that:
n × 10 = (n × 4 + n) × 2     (which is n × 5 × 2)
which can be done in just 4 instructions on LC-3 (and none of these needs looping)!
So, if you want to do this approach, you'll have to figure out how to go forwards through the string instead of backwards as the given table version does, and, how to multiply by 10 (use any one of the above suggestions).
There are other approaches as well if you study atoi. You could keep the backwards approach, but then you'd have to multiply by 10, by 100, by 1000: a different factor for each successive digit. That might be done by repetitive addition, or by keeping a count of how many times to multiply by 10, e.g. n × 1000 = n × 10 × 10 × 10.
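To tie the forward approach together, here is a hedged sketch in C rather than LC-3 (the function name and the lack of error checking are my own choices); it uses the add/shift-only multiply-by-10 suggested above:

#include <stdio.h>

/* Convert a string of decimal digits to a number, moving forwards.
 * n * 10 is computed as ((n << 2) + n) << 1, i.e. n * 5 * 2,
 * mirroring the add/shift-only approach available on LC-3. */
int ascii_to_int(const char *s)
{
    int n = 0;
    while (*s >= '0' && *s <= '9') {
        n = ((n << 2) + n) << 1;   /* n *= 10 without a multiply */
        n += *s - '0';             /* add the next digit's value */
        s++;
    }
    return n;
}

int main(void)
{
    printf("%d\n", ascii_to_int("456"));   /* prints 456 */
    return 0;
}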

How would I convert a number into binary bits, then truncate or enlarge their size, and then insert into a bit container?

As the title of the question says, I want to take a number (declared preferably as int or char or std::uint8_t), convert it into its binary representation, then truncate or pad it by a certain variable number of bits given, and then insert it into a bit container (preferably std::vector<bool> because I need variable bit container size as per the variable number of bits). For example, I have int a= 2, b = 3. And let's say I have to write this as three bits and six bits respectively into the container. So I have to put 010 and 000011 into the bit container. So, how would I go from 2 to 010 or 3 to 000011 using normal STL methods? I tried every possible thing that came to my mind, but I got nothing. Please help. Thank you.
You can use a combination of 'shifting' (>>) and 'bit-wise and' (&).
First lets look at the bitwise &: For instance if you have an int a=7 and you do the &-operation on it with 13, you will get 5. Why?
Because & gives 1 at position i iff both operands have a 1 at position i. So we get:
00...000111 // binary 7
& 00...001101 // binary 13
-------------
00...000101 // binary 5
Next, by using the shift operation >> you can shift the binary representation of your ints. For instance 5 >> 1 is 2. Why?
Because each position gets displaced by 1 to the right. The rightmost bit "falls out". Hence we have:
00...00101 //binary for 5
shift by 1 to the right gives:
00...00010 // binary for 2
Another example: 13 (01101) shifted by 2 is 3 (00011). I hope you get the idea.
Hence, by repeatedly shifting and doing & with 1 (00..0001), you can read out the binary representation of a number.
Finally, you can use this 1 to set the corresponding position in your vector<bool>. Assuming you want to have the representation you show in your post, you will have to fill in your vector from the back. So, you could for instance do something along the lines:
unsigned int f = 13;                     // the number we want to convert
std::vector<bool> binRepr(size, false);  // size is the container-size you want to use

for (int currBit = 0; currBit < size; currBit++) {
    binRepr[size - 1 - currBit] = (f >> currBit) & 1;
}
If the container is smaller than the binary representation of your int, the container will contain the truncated number. If it is larger, it will fill in with 0s.
I'm using an unsigned int since for an int you would still have to take care of negative numbers (for positive numbers it should work the same) and we would have to dive into the two's complement representation, which is not difficult, but requires a bit more bit-fiddling.

What are bits and bitwise operations used for?

Can someone please explain, in very simple, simple terms why we need bitwise operators? I just started programming one month ago.
I understand that everything is stored in binary form. I understand computers count in base 2. And I understand the bitwise operators. I just don't understand what kind of programming would require using bits and bitwise operators?
I tried to look for the answer on the web, and I read something do with binary flags and disabilities, and got even more confused.
I guess I'm just wondering, what kind of real life application would require bits and bitwise operators?
You can pack data in a very concise format.
The smallest amount of memory an x86 computer can address is a byte: that's 8 bits.
If your application has 24 yes/no flags (bools), would you store each in 1 byte? That's 24 bytes of data. If you use bits, then each byte contains 8 of those bools, so you only need 3 bytes for 24 yes/no values:
> 1 Byte per flag:
> 0000 0000 = off
> 0000 0001 = on
> Easy to check: if(b == 0) { /* flag is off */ } else if(b == 1) { /* flag is on */ }
> 1 Bit per flag
> 0011 1101 = Flags 1, 4, 8, 16 and 32 are on, flags 2, 64 and 128 are off
> Packs 8 flags in 1 byte
> Harder to check:
> if( (b & 32) != 0) { /* Flag 32 is on */ }
This is important for network protocols and other systems where every byte really counts.
For general purpose business applications, there is usually no need for the additional complexity, just use 1 byte per flag.
This isn't just used for bools. For example, some applications may want to store two numbers that can go from 0-15; an example is the Commodore 64, which really needed to conserve RAM wherever possible. One byte can hold two of those numbers:
> Instead of interpreting this as 8 bits (ranging from 1 to 128)
> this is really two 4 bit numbers:
> 1001 0110
> First Number: 1001 = 1 + 8 = 9
> Second Number: 0110 = 2 + 4 = 6
>
> Getting the first number requires a bit shift to move them into position:
> (b >> 4) turns the above number into this:
> 0000 1001 - this can now be simply cast as a byte and returns 9
>
> The second number requires us to "turn off" the first 4 bits
> We use the AND operator for this: b = (b & 15)
> 15 in decimal is 0000 1111 in binary.
>
> 1001 0110 AND
> 0000 1111 =
> 0000 0110
>
> Once again, the result can be interpreted as a byte and results in the number 6
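A hedged C sketch of the two-numbers-per-byte idea above (the variable names and the use of unsigned char are my own):

#include <stdio.h>

int main(void)
{
    unsigned char packed = 0x96;          /* 1001 0110: holds 9 and 6 */

    unsigned char first  = packed >> 4;   /* shift the high nibble down -> 9 */
    unsigned char second = packed & 0x0F; /* mask off the high nibble   -> 6 */
    printf("%u %u\n", first, second);     /* prints "9 6" */

    /* Packing them back: shift the first up and OR the second in. */
    unsigned char repacked = (unsigned char)((first << 4) | second);
    printf("0x%02X\n", repacked);         /* prints "0x96" */
    return 0;
}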
One more really neat trick is to quickly check whether a number is even or odd. An odd number always has the least significant bit (the 1 bit) set, while an even number always has it clear.
So your check for IsEven looks like this:
return (b & 1) == 0; // Bit 1 not set - number is even
(Note: Depending on the language, Compilers MAY decide to optimize stuff, but in a nutshell, that's it)
Storing state using binary flags allows you to have many "active flags" in one variable, and by accessing it bitwise we can check the binary value of each position. You can also use it to access specific parts of a number if you know how it's stored; here's an example from processing.
I've used it in real life business solutions to store state that is best represented as many related flags. Like proficiency in different kinds of magic :)
Skills:
None (0)
Conjuration (1)
Evocation (2)
Illusion (4)
Necromancy (8)
Alteration (16)
Now I can store what magic wizards are capable of in a single field. If a wizard's skills sum up to 13 we know that he knows Conjuration, Illusion and Necromancy. All of this is easily accessed using bitwise operations. Exploiting what we know about bits and base-2, we can use each bit in a number as a boolean flag, usually to store some kind of related state (like options or magic proficiency); in C# the FlagsAttribute is very helpful.
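For illustration, here is roughly how those skill flags could be declared and tested in C; the answer mentions C#'s FlagsAttribute, so this C version (enum names included) is just an assumed equivalent:

#include <stdio.h>

enum Skill {
    SKILL_NONE        = 0,
    SKILL_CONJURATION = 1 << 0,   /* 1  */
    SKILL_EVOCATION   = 1 << 1,   /* 2  */
    SKILL_ILLUSION    = 1 << 2,   /* 4  */
    SKILL_NECROMANCY  = 1 << 3,   /* 8  */
    SKILL_ALTERATION  = 1 << 4    /* 16 */
};

int main(void)
{
    /* 13 = 1 + 4 + 8: Conjuration, Illusion and Necromancy */
    unsigned int wizard = SKILL_CONJURATION | SKILL_ILLUSION | SKILL_NECROMANCY;

    if (wizard & SKILL_ILLUSION)
        printf("knows Illusion\n");
    if (!(wizard & SKILL_EVOCATION))
        printf("does not know Evocation\n");
    return 0;
}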
well... there are a number of instances where you might use bitwise operators. Here's one: the Linux open() system call takes a file path name and a bitmask specifying the access mode for the file as arguments. Examples: open("somefile", O_RDWR | O_CREAT | O_TRUNC, S_IWUSR), open("somefile", O_RDONLY). The bitwise OR operation allows us to specify a lot of information in a single argument, and therefore simplifies the interface to the kernel.

Real world use cases of bitwise operators [closed]

What are some real world use cases of the following bitwise operators?
AND
XOR
NOT
OR
Left/Right shift
Bit fields (flags)
They're the most efficient way of representing something whose state is defined by several "yes or no" properties. ACLs are a good example; if you have let's say 4 discrete permissions (read, write, execute, change policy), it's better to store this in 1 byte rather than waste 4. These can be mapped to enumeration types in many languages for added convenience.
Communication over ports/sockets
Always involves checksums, parity, stop bits, flow control algorithms, and so on, which usually depend on the logic values of individual bytes as opposed to numeric values, since the medium may only be capable of transmitting one bit at a time.
Compression, Encryption
Both of these are heavily dependent on bitwise algorithms. Look at the deflate algorithm for an example - everything is in bits, not bytes.
Finite State Machines
I'm speaking primarily of the kind embedded in some piece of hardware, although they can be found in software too. These are combinatorial in nature - they might literally be getting "compiled" down to a bunch of logic gates, so they have to be expressed as AND, OR, NOT, etc.
Graphics
There's hardly enough space here to get into every area where these operators are used in graphics programming. XOR (or ^) is particularly interesting here because applying the same input a second time will undo the first. Older GUIs used to rely on this for selection highlighting and other overlays, in order to eliminate the need for costly redraws. They're still useful in slow graphics protocols (i.e. remote desktop).
Those were just the first few examples I came up with - this is hardly an exhaustive list.
Is it odd?
(value & 0x1) > 0
Is it divisible by two (even)?
(value & 0x1) == 0
I've used bitwise operations in implementing a security model for a CMS. It had pages which could be accessed by users if they were in appropriate groups. A user could be in multiple groups, so we needed to check if there was an intersection between the users groups and the pages groups. So we assigned each group a unique power-of-2 identifier, e.g.:
Group A = 1 --> 00000001
Group B = 2 --> 00000010
Group C = 4 --> 00000100
We OR these values together and store the value (as a single int) with the page. E.g. if a page could be accessed by groups A & B, we store the value 3 (which in binary is 00000011) as the page's access control. In much the same way, we store a value of ORed group identifiers with a user to represent which groups they are in.
So to check if a given user can access a given page, you just need to AND the values together and check if the value is non-zero. This is very fast as this check is implemented in a single instruction, no looping, no database round-trips.
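A minimal sketch of that check in C (the group constants and variable names are assumptions for illustration; the original CMS was not necessarily written in C):

#include <stdio.h>

#define GROUP_A (1u << 0)   /* 00000001 */
#define GROUP_B (1u << 1)   /* 00000010 */
#define GROUP_C (1u << 2)   /* 00000100 */

int main(void)
{
    unsigned int page_groups = GROUP_A | GROUP_B;  /* page visible to A and B (stored as 3) */
    unsigned int user_groups = GROUP_B | GROUP_C;  /* user belongs to B and C (stored as 6) */

    /* A non-zero AND means at least one group is shared. */
    if (page_groups & user_groups)
        printf("access granted\n");
    else
        printf("access denied\n");
    return 0;
}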
Here are some common idioms dealing with flags stored as individual bits.
enum CDRIndicators {
    Local           = 1 << 0,
    External        = 1 << 1,
    CallerIDMissing = 1 << 2,
    Chargeable      = 1 << 3
};
unsigned int flags = 0;
Set the Chargeable flag:
flags |= Chargeable;
Clear CallerIDMissing flag:
flags &= ~CallerIDMissing;
Test whether CallerIDMissing and Chargeable are set:
if((flags & (CallerIDMissing | Chargeable )) == (CallerIDMissing | Chargeable)) {
}
Low-level programming is a good example. You may, for instance, need to write a specific bit to a memory-mapped register to make some piece of hardware do what you want it to:
volatile uint32_t *reg = (volatile uint32_t *)0x87000000; /* note: "register" is a reserved word in C, so use another name */
uint32_t value;
uint32_t set_bit = 0x00010000;
uint32_t clear_bit = 0x00001000;
value = *reg;                // get current value from the register
value = value & ~clear_bit;  // clear a bit
value = value | set_bit;     // set a bit
*reg = value;                // write it back to the register
Also, htonl() and htons() are implemented using the & and | operators (on machines whose endianness(Byte order) doesn't match network order):
#define htons(a) ((((a) & 0xff00) >> 8) | \
(((a) & 0x00ff) << 8))
#define htonl(a) ((((a) & 0xff000000) >> 24) | \
(((a) & 0x00ff0000) >> 8) | \
(((a) & 0x0000ff00) << 8) | \
(((a) & 0x000000ff) << 24))
I use them to get RGB(A) values from packed color values, for instance.
When I have a bunch of boolean flags, I like to store them all in an int.
I get them out using bitwise-AND. For example:
int flags;
if (flags & 0x10) {
    // Turn this feature on.
}

if (flags & 0x08) {
    // Turn a second feature on.
}
etc.
& = AND:
Mask out specific bits.
You are defining which specific bits should be kept and which should be cleared. 0x00 & x will clear all bits in a byte, while 0xFF & x will not change x.
0x0F & x keeps only the bits in the lower nibble.
Conversion:
To cast shorter variables into longer ones while keeping the same bit pattern, it may be necessary to adjust the bits, because -1 in a 32-bit int is 0xFFFFFFFF while -1 in a 64-bit long is 0xFFFFFFFFFFFFFFFF. To preserve the bit pattern you apply a mask after the conversion.
| = OR
Set bits. The bits will be set independently of whether they are already set. Many data structures (bitfields) have flags like IS_HSET = 0, IS_VSET = 1 which can be set independently.
To set the flags, you apply IS_HSET | IS_VSET (in C and assembly this is very convenient to read).
^ = XOR
Find bits which are the same or different.
~ = NOT
Flip bits.
It can be shown that all possible local bit operations can be implemented by these operations.
So if you like you can implement an ADD instruction solely by bit operations.
Some wonderful hacks:
http://www.ugcs.caltech.edu/~wnoise/base2.html
http://www.jjj.de/bitwizardry/bitwizardrypage.html
Encryption is all bitwise operations.
You can use them as a quick and dirty way to hash data.
int a = 1230123;
int b = 1234555;
int c = 5865683;
int hash = a ^ b ^ c;
I just used bitwise-XOR (^) about three minutes ago to calculate a checksum for serial communication with a PLC...
This is an example of reading colours from a packed bitmap pixel:
unsigned int imagePixel = 0xCCDDEE; /* Pixel in RRGGBB format: R = Red, G = Green, B = Blue */
// To keep only the red channel
unsigned int redColour = imagePixel & 0xFF0000; /* Bit masking with the AND operator */
// Now shift it down so it becomes a value between 0x00 and 0xFF
redColour = (redColour >> 16) & 0xFF; /* 0xCC in this case */
I hope this tiny example helps....
In the abstracted world of today's modern language, not too many. File IO is an easy one that comes to mind, though that's exercising bitwise operations on something already implemented and is not implementing something that uses bitwise operations. Still, as an easy example, this code demonstrates removing the read-only attribute on a file (so that it can be used with a new FileStream specifying FileMode.Create) in c#:
//Hidden files possess some extra attributes that make the FileStream throw an exception
//even with FileMode.Create (if exists -> overwrite) so delete it and don't worry about it!
if(File.Exists(targetName))
{
FileAttributes attributes = File.GetAttributes(targetName);
if ((attributes & FileAttributes.ReadOnly) == FileAttributes.ReadOnly)
File.SetAttributes(targetName, attributes & (~FileAttributes.ReadOnly));
File.Delete(targetName);
}
As far as custom implementations, here's a recent example:
I created a "message center" for sending secure messages from one installation of our distributed application to another. Basically, it's analogous to email, complete with Inbox, Outbox, Sent, etc, but it also has guaranteed delivery with read receipts, so there are additional subfolders beyond "inbox" and "sent." What this amounted to was a requirement for me to define generically what's "in the inbox" or what's "in the sent folder". Of the sent folder, I need to know what's read and what's unread. Of what's unread, I need to know what's received and what's not received. I use this information to build a dynamic where clause which filters a local datasource and displays the appropriate information.
Here's how the enum is put together:
public enum MemoView : int
{
    InboundMemos             = 1,   // 0000 0000 0001
    InboundMemosForMyOrders  = 3,   // 0000 0000 0011
    SentMemosAll             = 16,  // 0000 0001 0000
    SentMemosNotReceived     = 48,  // 0000 0011 0000
    SentMemosReceivedNotRead = 80,  // 0000 0101 0000
    SentMemosRead            = 144, // 0000 1001 0000
    Outbox                   = 272, // 0001 0001 0000
    OutBoxErrors             = 784  // 0011 0001 0000
}
}
Do you see what this does? By anding (&) with the "Inbox" enum value, InboundMemos, I know that InboundMemosForMyOrders is in the inbox.
Here's a boiled down version of the method that builds and returns the filter that defines a view for the currently selected folder:
private string GetFilterForView(MemoView view, DefaultableBoolean readOnly)
{
    string filter = string.Empty;

    if((view & MemoView.InboundMemos) == MemoView.InboundMemos)
    {
        filter = "<inbox filter conditions>";

        if((view & MemoView.InboundMemosForMyOrders) == MemoView.InboundMemosForMyOrders)
        {
            filter += "<my memo filter conditions>";
        }
    }
    else if((view & MemoView.SentMemosAll) == MemoView.SentMemosAll)
    {
        //all sent items have originating system = to local
        filter = "<memos leaving current system>";

        if((view & MemoView.Outbox) == MemoView.Outbox)
        {
            ...
        }
        else
        {
            //sent sub folders
            filter += "<all sent items>";

            if((view & MemoView.SentMemosNotReceived) == MemoView.SentMemosNotReceived)
            {
                if((view & MemoView.SentMemosReceivedNotRead) == MemoView.SentMemosReceivedNotRead)
                {
                    filter += "<not received and not read conditions>";
                }
                else
                    filter += "<received and not read conditions>";
            }
        }
    }

    return filter;
}
Extremely simple, but a neat implementation at a level of abstraction that doesn't typically require bitwise operations.
Usually bitwise operations are faster than doing multiply/divide. So if you need to multiply a variable x by, say, 9, you can do (x << 3) + x, which would be a few cycles faster than x * 9. If this code is inside an ISR, you will save on response time.
Similarly, if you want to use an array as a circular queue, it's faster (and more elegant) to handle the wrap-around with bitwise operations (your array size should be a power of 2). For example, on insert/delete you can use tail = (tail + 1) & MASK instead of tail = ((tail + 1) < size) ? tail + 1 : 0.
Also, if you want an error flag to hold multiple error codes together, each bit can hold a separate value. You can AND it with each individual error code as a check. This is used in Unix error codes.
Also, an n-bit bitmap can be a really cool and compact data structure. If you want to manage a resource pool of size n, you can use n bits to represent the current status, as sketched below.
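As a hedged sketch of that bitmap idea (the pool size of 32 and the function names are mine), each bit of a word marks one resource as free or in use:

#include <stdint.h>
#include <stdio.h>

static uint32_t pool = 0;   /* 32 resources; bit i set means resource i is in use */

/* Allocate the first free resource, or return -1 if all 32 are taken. */
int alloc_resource(void)
{
    int i;
    for (i = 0; i < 32; i++) {
        if (!(pool & (1u << i))) {   /* bit i clear -> resource i is free */
            pool |= 1u << i;         /* mark it as used */
            return i;
        }
    }
    return -1;
}

/* Release a previously allocated resource. */
void free_resource(int i)
{
    pool &= ~(1u << i);              /* clear bit i */
}

int main(void)
{
    int a = alloc_resource();        /* 0 */
    int b = alloc_resource();        /* 1 */
    free_resource(a);
    int c = alloc_resource();        /* 0 again, since it was freed */
    printf("%d %d %d\n", a, b, c);
    return 0;
}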
Bitwise & is used to mask/extract a certain part of a byte.
1 Byte variable
01110010
&00001111 Bitmask of 0x0F to find out the lower nibble
--------
00000010
Specially the shift operator (<< >>) are often used for calculations.
Bitwise operators are useful for looping over arrays whose length is a power of 2. As many people have mentioned, bitwise operators are extremely useful and are used in flags, graphics, networking and encryption. Not only that, but they are extremely fast. My personal favorite use is to loop an array without conditionals. Suppose you have a zero-index based array (e.g. the first element's index is 0) and you need to loop it indefinitely. By indefinitely I mean going from the first element to the last and returning to the first. One way to implement this is:
int[] arr = new int[8];
int i = 0;
while (true) {
    print(arr[i]);
    i = i + 1;
    if (i >= arr.length)
        i = 0;
}
This is the simplest approach, if you'd like to avoid if statement, you can use modulus approach like so:
int[] arr = new int[8];
int i = 0;
while (true) {
    print(arr[i]);
    i = i + 1;
    i = i % arr.length;
}
The downside of these two methods is that the modulus operator is expensive, since it computes a remainder after integer division, and the first method runs an if statement on each iteration. With bitwise operators, however, if the length of your array is a power of 2, you can easily generate the sequence 0 .. length - 1 by using the & (bitwise AND) operator like so: i & (length - 1). So knowing this, the code from above becomes
int[] arr = new int[8];
int i = 0;
while (true) {
    print(arr[i]);
    i = i + 1;
    i = i & (arr.length - 1);
}
Here is how it works. In binary, any power of 2 minus 1 is written using only ones. For example 3 in binary is 11, 7 is 111, 15 is 1111 and so on; you get the idea. Now, what happens if you & any number against a number consisting only of ones in binary? Let's say we do this:
num & 7;
If num is smaller than or equal to 7, the result will be num, because each bit &-ed with 1 is itself. If num is bigger than 7, then during the & operation the leading zeros of 7 zero out the corresponding bits of num, and only the trailing part survives. In the case of 9 & 7, in binary it looks like
1001 & 0111
and the result is 0001, which is 1 in decimal and addresses the second element in the array.
Base64 encoding is an example. Base64 encoding is used to represent binary data as printable characters for sending over email systems (and for other purposes). Base64 encoding converts a series of 8-bit bytes into 6-bit character lookup indexes. Bit operations (shifting, AND-ing, OR-ing, NOT-ing) are very useful for implementing the bit manipulation necessary for Base64 encoding and decoding.
This of course is only 1 of countless examples.
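As a rough illustration (not a full Base64 implementation), splitting three 8-bit bytes into four 6-bit lookup indexes uses exactly these shift-and-mask operations; the byte values below are the classic "Man" example:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t b0 = 'M', b1 = 'a', b2 = 'n';

    /* Pack the three bytes into one 24-bit value, then slice it into
     * four 6-bit groups with shifts and a 0x3F (0011 1111) mask. */
    uint32_t triple = ((uint32_t)b0 << 16) | ((uint32_t)b1 << 8) | b2;

    unsigned i0 = (triple >> 18) & 0x3F;
    unsigned i1 = (triple >> 12) & 0x3F;
    unsigned i2 = (triple >> 6)  & 0x3F;
    unsigned i3 = triple & 0x3F;

    printf("%u %u %u %u\n", i0, i1, i2, i3);   /* 19 22 5 46, which map to "TWFu" */
    return 0;
}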
I'm surprised no one picked the obvious answer for the Internet age: calculating valid network addresses for a subnet.
http://www.topwebhosts.org/tools/netmask.php
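The underlying arithmetic is simple AND/OR masking; here is a hedged C sketch (the example address and /24 mask are arbitrary). The network address is the IP AND the netmask, and the broadcast address is the network OR the inverted mask:

#include <stdint.h>
#include <stdio.h>

static void print_ip(uint32_t ip)
{
    printf("%u.%u.%u.%u\n", (ip >> 24) & 0xFF, (ip >> 16) & 0xFF,
           (ip >> 8) & 0xFF, ip & 0xFF);
}

int main(void)
{
    uint32_t ip   = (192u << 24) | (168u << 16) | (1u << 8) | 130u;  /* 192.168.1.130 */
    uint32_t mask = 0xFFFFFF00u;                                     /* /24 netmask */

    print_ip(ip & mask);            /* network address:   192.168.1.0 */
    print_ip((ip & mask) | ~mask);  /* broadcast address: 192.168.1.255 */
    return 0;
}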
Nobody seems to have mentioned fixed point maths.
(Yeah, I'm old, ok?)
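For anyone who hasn't met it: fixed-point stores fractional values scaled by a power of two, so the scaling is done with shifts rather than floating point. A minimal Q16.16 sketch (the format and helper names are my own choices):

#include <stdint.h>
#include <stdio.h>

typedef int32_t fix16;   /* Q16.16: 16 integer bits, 16 fraction bits */

#define FIX_ONE (1 << 16)

static fix16 fix_from_int(int x)       { return (fix16)(x << 16); }
static fix16 fix_mul(fix16 a, fix16 b) { return (fix16)(((int64_t)a * b) >> 16); }
static double fix_to_double(fix16 x)   { return x / 65536.0; }

int main(void)
{
    fix16 half  = FIX_ONE / 2;     /* 0.5 */
    fix16 three = fix_from_int(3); /* 3.0 */

    printf("%f\n", fix_to_double(fix_mul(three, half)));   /* prints 1.500000 */
    return 0;
}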
Is a number x a power of 2? (Useful for example in algorithms where a counter is incremented, and an action is to be taken only logarithmic number of times)
(x & (x - 1)) == 0
Which is the highest bit of an integer x? (This for example can be used to find the minimum power of 2 that is larger than x)
x |= (x >> 1);
x |= (x >> 2);
x |= (x >> 4);
x |= (x >> 8);
x |= (x >> 16);
return x - (x >>> 1); // ">>>" is unsigned right shift
Which is the lowest 1 bit of an integer x? (Helps find number of times divisible by 2.)
x & -x
If you ever want to calculate your number mod (%) a certain power of 2, you can use yourNumber & (2^N - 1), which for non-negative numbers is the same as yourNumber % 2^N.
number % 16 = number & 15;
number % 128 = number & 127;
This is probably only useful as an alternative to the modulus operation when the divisor is a power of 2 (2^N)... But even then, its speed boost over the modulus operation was negligible in my test on .NET 2.0. I suspect modern compilers already perform optimizations like this. Anyone know more about this?
I use them for multi-select options; this way I only store one value instead of 10 or more.
It can also be handy in a SQL relational model. Let's say you have the following tables: BlogEntry, BlogCategory.
Traditionally you could create an n-n relationship between them using a BlogEntryCategory table,
or, when there are not that many BlogCategory records, you could use one value in BlogEntry to link to multiple BlogCategory records, just like you would do with flagged enums.
In most RDBMSs there are also very fast operators to select on that 'flagged' column...
When you only want to change some bits of a microcontroller's Outputs, but the register to write to is a byte, you do something like this (pseudocode):
char newOut = OutRegister & 0b00011111 //clear 3 msb's
newOut = newOut | 0b10100000 //write '101' to the 3 msb's
OutRegister = newOut //Update Outputs
Of course, many microcontrollers allow you to change each bit individually...
I've seen them used in role based access control systems.
There is a real world use in my question here -
Respond to only the first WM_KEYDOWN notification?
When consuming a WM_KEYDOWN message in the Windows C API, bit 30 specifies the previous key state. The value is 1 if the key was down before the message was sent, or zero if it was up.
They are mostly used for bitwise operations (surprise). Here are a few real-world examples found in PHP codebase.
Character encoding:
if (s <= 0 && (c & ~MBFL_WCSPLANE_MASK) == MBFL_WCSPLANE_KOI8R) {
Data structures:
ar_flags = other->ar_flags & ~SPL_ARRAY_INT_MASK;
Database drivers:
dbh->transaction_flags &= ~(PDO_TRANS_ACCESS_MODE^PDO_TRANS_READONLY);
Compiler implementation:
opline->extended_value = (opline->extended_value & ~ZEND_FETCH_CLASS_MASK) | ZEND_FETCH_CLASS_INTERFACE;
I've seen it in a few game development books as a more efficient way to multiply and divide.
2 << 3 == 2 * 8
32 >> 4 == 32 / 16
When I first started C programming, I understood truth tables and all that, but how to actually use bitwise operations didn't fully click until I read this article http://www.gamedev.net/reference/articles/article1563.asp (which gives real life examples).
I don't think this counts as bitwise, but ruby's Array defines set operations through the normal integer bitwise operators. So [1,2,4] & [1,2,3] # => [1,2]. Similarly for a ^ b #=> set difference and a | b #=> union.

What are bitwise shift (bit-shift) operators and how do they work?

I've been attempting to learn C in my spare time, and other languages (C#, Java, etc.) have the same concept (and often the same operators)...
At a core level, what does bit-shifting (<<, >>, >>>) do, what problems can it help solve, and what gotchas lurk around the bend? In other words, an absolute beginner's guide to bit shifting in all its goodness.
The bit shifting operators do exactly what their name implies. They shift bits. Here's a brief (or not-so-brief) introduction to the different shift operators.
The Operators
>> is the arithmetic (or signed) right shift operator.
>>> is the logical (or unsigned) right shift operator.
<< is the left shift operator, and meets the needs of both logical and arithmetic shifts.
All of these operators can be applied to integer values (int, long, possibly short and byte or char). In some languages, applying the shift operators to any datatype smaller than int automatically resizes the operand to be an int.
Note that <<< is not an operator, because it would be redundant.
Also note that C and C++ do not distinguish between the right shift operators. They provide only the >> operator, and the right-shifting behavior is implementation defined for signed types. The rest of the answer uses the C# / Java operators.
(In all mainstream C and C++ implementations including GCC and Clang/LLVM, >> on signed types is arithmetic. Some code assumes this, but it isn't something the standard guarantees. It's not undefined, though; the standard requires implementations to define it one way or another. However, left shifts of negative signed numbers are undefined behaviour (signed integer overflow). So unless you need arithmetic right shift, it's usually a good idea to do your bit-shifting with unsigned types.)
Left shift (<<)
Integers are stored, in memory, as a series of bits. For example, the number 6 stored as a 32-bit int would be:
00000000 00000000 00000000 00000110
Shifting this bit pattern to the left one position (6 << 1) would result in the number 12:
00000000 00000000 00000000 00001100
As you can see, the digits have shifted to the left by one position, and the last digit on the right is filled with a zero. You might also note that shifting left is equivalent to multiplication by powers of 2. So 6 << 1 is equivalent to 6 * 2, and 6 << 3 is equivalent to 6 * 8. A good optimizing compiler will replace multiplications with shifts when possible.
Non-circular shifting
Please note that these are not circular shifts. Shifting this value to the left by one position (3,758,096,384 << 1):
11100000 00000000 00000000 00000000
results in 3,221,225,472:
11000000 00000000 00000000 00000000
The digit that gets shifted "off the end" is lost. It does not wrap around.
Logical right shift (>>>)
A logical right shift is the converse to the left shift. Rather than moving bits to the left, they simply move to the right. For example, shifting the number 12:
00000000 00000000 00000000 00001100
to the right by one position (12 >>> 1) will get back our original 6:
00000000 00000000 00000000 00000110
So we see that shifting to the right is equivalent to division by powers of 2.
Lost bits are gone
However, a shift cannot reclaim "lost" bits. For example, if we shift this pattern:
00111000 00000000 00000000 00000110
to the left 4 positions (939,524,102 << 4), we get 2,147,483,744:
10000000 00000000 00000000 01100000
and then shifting back ((939,524,102 << 4) >>> 4) we get 134,217,734:
00001000 00000000 00000000 00000110
We cannot get back our original value once we have lost bits.
Arithmetic right shift (>>)
The arithmetic right shift is exactly like the logical right shift, except instead of padding with zero, it pads with the most significant bit. This is because the most significant bit is the sign bit, or the bit that distinguishes positive and negative numbers. By padding with the most significant bit, the arithmetic right shift is sign-preserving.
For example, if we interpret this bit pattern as a negative number:
10000000 00000000 00000000 01100000
we have the number -2,147,483,552. Shifting this to the right 4 positions with the arithmetic shift (-2,147,483,552 >> 4) would give us:
11111000 00000000 00000000 00000110
or the number -134,217,722.
So we see that we have preserved the sign of our negative numbers by using the arithmetic right shift, rather than the logical right shift. And once again, we see that we are performing division by powers of 2.
Let's say we have a single byte:
00110110
Applying a single left bitshift gets us:
01101100
The leftmost zero was shifted out of the byte, and a new zero was appended to the right end of the byte.
The bits don't roll over; they are discarded. So if you left shift a value whose leading bit is 1, that bit is simply lost, and right shifting afterwards will not give you the original value back.
Shifting left by N is equivalent to multiplying by 2^N.
Shifting right by N is (if you are using ones' complement) the equivalent of dividing by 2^N and rounding towards zero.
Bitshifting can be used for insanely fast multiplication and division, provided you are working with a power of 2. Almost all low-level graphics routines use bitshifting.
For example, way back in the olden days, we used mode 13h (320x200 256 colors) for games. In Mode 13h, the video memory was laid out sequentially per pixel. That meant to calculate the location for a pixel, you would use the following math:
memoryOffset = (row * 320) + column
Now, back in that day and age, speed was critical, so we would use bitshifts to do this operation.
However, 320 is not a power of two, so to get around this we have to find powers of two that add up to 320:
(row * 320) = (row * 256) + (row * 64)
Now we can convert that into left shifts:
(row * 320) = (row << 8) + (row << 6)
For a final result of:
memoryOffset = ((row << 8) + (row << 6)) + column
Now we get the same offset as before, except instead of an expensive multiplication operation, we use the two bitshifts...in x86 it would be something like this (note, it's been forever since I've done assembly (editor's note: corrected a couple mistakes and added a 32-bit example)):
mov ax, 320; 2 cycles
mul word [row]; 22 CPU Cycles
mov di,ax; 2 cycles
add di, [column]; 2 cycles
; di = [row]*320 + [column]
; 16-bit addressing mode limitations:
; [di] is a valid addressing mode, but [ax] isn't, otherwise we could skip the last mov
Total: 28 cycles on whatever ancient CPU had these timings.
Vrs
mov ax, [row]; 2 cycles
mov di, ax; 2
shl ax, 6; 2
shl di, 8; 2
add di, ax; 2 (320 = 256+64)
add di, [column]; 2
; di = [row]*(256+64) + [column]
12 cycles on the same ancient CPU.
Yes, we would work this hard to shave off 16 CPU cycles.
In 32 or 64-bit mode, both versions get a lot shorter and faster. Modern out-of-order execution CPUs like Intel Skylake (see http://agner.org/optimize/) have very fast hardware multiply (low latency and high throughput), so the gain is much smaller. AMD Bulldozer-family is a bit slower, especially for 64-bit multiply. On Intel CPUs, and AMD Ryzen, two shifts are slightly lower latency but more instructions than a multiply (which may lead to lower throughput):
imul edi, [row], 320 ; 3 cycle latency from [row] being ready
add edi, [column] ; 1 cycle latency (from [column] and edi being ready).
; edi = [row]*(256+64) + [column], in 4 cycles from [row] being ready.
vs.
mov edi, [row]
shl edi, 6 ; row*64. 1 cycle latency
lea edi, [edi + edi*4] ; row*(64 + 64*4). 1 cycle latency
add edi, [column] ; 1 cycle latency from edi and [column] both being ready
; edi = [row]*(256+64) + [column], in 3 cycles from [row] being ready.
Compilers will do this for you: See how GCC, Clang, and Microsoft Visual C++ all use shift+lea when optimizing return 320*row + col;.
The most interesting thing to note here is that x86 has a shift-and-add instruction (LEA) that can do small left shifts and add at the same time, with the performance as an add instruction. ARM is even more powerful: one operand of any instruction can be left or right shifted for free. So scaling by a compile-time-constant that's known to be a power-of-2 can be even more efficient than a multiply.
OK, back in the modern days... something more useful now would be to use bitshifting to store two 8-bit values in a 16-bit integer. For example, in C#:
// Byte1: 11110000
// Byte2: 00001111
UInt16 value = (UInt16)((Byte1 << 8) | Byte2);
// value = 1111000000001111
In C++, compilers should do this for you if you used a struct with two 8-bit members, but in practice they don't always.
Bitwise operations, including bit shift, are fundamental to low-level hardware or embedded programming. If you read a specification for a device or even some binary file formats, you will see bytes, words, and dwords, broken up into non-byte aligned bitfields, which contain various values of interest. Accessing these bit-fields for reading/writing is the most common usage.
A simple real example in graphics programming is that a 16-bit pixel is represented as follows:
bit | 15 14 13 12 11 | 10  9  8  7  6  5 |  4  3  2  1  0 |
    |      Blue      |       Green       |      Red       |
To get at the green value you would do this:
#define GREEN_MASK 0x7E0
#define GREEN_OFFSET 5
// Read green
uint16_t green = (pixel & GREEN_MASK) >> GREEN_OFFSET;
Explanation
In order to obtain the value of green ONLY, which starts at offset 5 and ends at 10 (i.e. 6-bits long), you need to use a (bit) mask, which when applied against the entire 16-bit pixel, will yield only the bits we are interested in.
#define GREEN_MASK 0x7E0
The appropriate mask is 0x7E0 which in binary is 0000011111100000 (which is 2016 in decimal).
uint16_t green = (pixel & GREEN_MASK) ...;
To apply a mask, you use the AND operator (&).
uint16_t green = (pixel & GREEN_MASK) >> GREEN_OFFSET;
After applying the mask, you'll end up with a 16-bit number which is really just an 11-bit number, since its MSB is in the 11th bit. Green is actually only 6 bits long, so we need to scale it down using a right shift (11 - 6 = 5), hence the use of 5 as the offset (#define GREEN_OFFSET 5).
Also common is using bit shifts for fast multiplication and division by powers of 2:
i <<= x; // i *= 2^x;
i >>= y; // i /= 2^y;
Bit Masking & Shifting
Bit shifting is often used in low-level graphics programming. For example, a given pixel color value encoded in a 32-bit word.
Pixel-Color Value in Hex: B9B9B900
Pixel-Color Value in Binary: 10111001 10111001 10111001 00000000
For better understanding, the same binary value labeled with what sections represent what color part.
Red Green Blue Alpha
Pixel-Color Value in Binary: 10111001 10111001 10111001 00000000
Let's say for example we want to get the green value of this pixel's color. We can easily get that value by masking and shifting.
Our mask:
Red Green Blue Alpha
color : 10111001 10111001 10111001 00000000
green_mask : 00000000 11111111 00000000 00000000
masked_color = color & green_mask
masked_color: 00000000 10111001 00000000 00000000
The logical & operator ensures that only the values where the mask is 1 are kept. The last thing we now have to do, is to get the correct integer value by shifting all those bits to the right by 16 places (logical right shift).
green_value = masked_color >>> 16
Et voilà, we have the integer representing the amount of green in the pixel's color:
Pixels-Green Value in Hex: 000000B9
Pixels-Green Value in Binary: 00000000 00000000 00000000 10111001
Pixels-Green Value in Decimal: 185
This is often used for encoding or decoding image formats like jpg, png, etc.
One gotcha is that the following is implementation dependent (according to the ANSI standard):
char x = -1;
x >> 1;
x can now be 127 (01111111) or still -1 (11111111).
In practice, it's usually the latter.
I am just listing some tips and tricks here. They may be useful in tests and exams.
n = n*2: n = n<<1
n = n/2: n = n>>1
Checking if n is power of 2 (1,2,4,8,...): check !(n & (n-1))
Getting the xth bit of n: (n >> x) & 1
Checking if x is even or odd: (x & 1) == 0 means even
Toggle the nth bit of x: x ^ (1<<n)
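A short C demonstration of those idioms, using the forms above (the value 20 is an arbitrary example):

#include <stdio.h>

int main(void)
{
    unsigned n = 20, x = 2;   /* 20 is 10100 in binary */

    printf("%u\n", n << 1);                     /* n * 2  -> 40 */
    printf("%u\n", n >> 1);                     /* n / 2  -> 10 */
    printf("%d\n", !(n & (n - 1)));             /* power of 2?  20 -> 0 (no) */
    printf("%u\n", (n >> x) & 1);               /* bit x of n: bit 2 of 10100 -> 1 */
    printf("%s\n", (n & 1) ? "odd" : "even");   /* 20 -> even */
    printf("%u\n", n ^ (1u << x));              /* toggle bit 2: 10100 -> 10000 = 16 */
    return 0;
}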
Note that in the Java implementation, the number of bits to shift is taken modulo the bit width of the operand type (mod 32 for int, mod 64 for long).
For example:
(long) 4 >> 65
equals 2. You might expect shifting the bits to the right 65 times would zero everything out, but it's actually the equivalent of:
(long) 4 >> (65 % 64)
This is true for <<, >>, and >>>. I have not tried it out in other languages.
Some useful bit operations/manipulations in Python.
I implemented Ravi Prakash's answer in Python.
# Basic bit operations
# Integer to binary
print(bin(10))
# Binary to integer
print(int('1010', 2))
# Multiplying x by 2 .... x*2 == x << 1
print(200 << 1)
# Dividing x with 2 .... x/2 == x >> 1
print(200 >> 1)
# Modulo x with 2 .... x % 2 == x & 1
if (20 & 1) == 0:
    print("20 is an even number")
# Check if n is power of 2: check !(n & (n-1))
print(not(33 & (33-1)))
# Getting xth bit of n: (n >> x) & 1
print((10 >> 2) & 1) # Bin of 10 == 1010 and second bit is 0
# Toggle nth bit of x : x^(1 << n)
# take bin(10) == 1010 and toggling second bit in bin(10) we get 1110 === bin(14)
print(10^(1 << 2))
Bitwise operators are used to perform operations at the bit level, or to manipulate bits in different ways. Bitwise operations are often much faster and are sometimes used to improve the efficiency of a program.
Basically, bitwise operators can be applied to the integer types: long, int, short, char and byte.
Bitwise Shift Operators
They are classified into two categories: the left shift and the right shift.
Left Shift (<<): The left shift operator shifts all of the bits in a value to the left a specified number of times. Syntax: value << num, where num specifies the number of positions to left-shift value. For each shift left, the high-order bit is shifted out (and lost), and a zero is brought in on the right. This means that for a 32-bit value, bits are lost once they are shifted past bit position 31; for a 64-bit value, bits are lost after bit position 63.
For example, 3 << 1 gives the output 6. The binary representation of 3 is 0...0011 (considering a 32-bit system), so when it is shifted once the leading zero is lost, the remaining 31 bits shift to the left, and a zero is added at the end. It becomes 0...0110, whose decimal representation is 6.
In the case of a negative number:
-1 << 1 gives the output -2. In Java a negative number is represented in two's complement, so -1 is stored as 2^32 - 1, which is 1....11 (considering a 32-bit system). When shifted once the leading bit is lost, the remaining 31 bits shift to the left, and a zero is added at the end. It becomes 11...10, whose decimal equivalent is -2.
So, I think you now have enough knowledge about the left shift and how it works.
Right Shift (>>): The right shift operator shifts all of the bits in a value to the right a specified number of times. Syntax: value >> num, where num specifies the number of positions to right-shift value.
For example, 35 >> 2 shifts the value 35 to the right by two positions:
The output is 8. The binary representation of 35 in a 32-bit system is 00...00100011, so when we right-shift it twice the 30 leading bits move to the right, the two low-order bits are lost, and two zeros come in as leading bits. It becomes 00....00001000, whose decimal equivalent is 8.
Or there is a simple mathematical trick to find the result of such a shift: in general, x >> y = floor(x / pow(2, y)). Consider the above example: x = 35 and y = 2, so 35 / 2^2 = 8.75, and taking the floor gives 8.
But remember one thing: this trick is fine for small values of y; if y is as large as (or larger than) the bit width of x, it gives you an incorrect output.
In the case of a negative number:
Because of negative numbers the right shift operator works in two modes: signed and unsigned. With the signed right shift operator (>>), a positive number has its leading bits filled with 0, while a negative number has them filled with 1 to keep the sign. This is called 'sign extension'.
For example, -10 >> 1 gives the output -5. As explained above, the compiler stores a negative value in two's complement, so -10 is represented as 2^32 - 10, which in a 32-bit system is 11....0110 in binary. When we shift right once, the 31 leading bits move to the right, the low-order bit is lost, and the sign bit is duplicated at the front. It becomes 11...0011, whose decimal value is -5 (how do I know the sign? Because the leading bit is 1).
It is interesting to note that if you shift -1 right, the result always remains -1 since sign extension keeps bringing in more ones in the high-order bits.
Unsigned Right Shift (>>>): This operator also shifts bits to the right. The difference is that >> fills the leading bits with the sign bit (1 for a negative number), while >>> fills them with zero in either case. Now the question arises: why do we need the unsigned right shift if the signed one already gives the desired output? Understand this with an example: if you are shifting something that does not represent a numeric value, you may not want sign extension to take place. This situation is common when you are working with pixel-based values and graphics. In these cases you will generally want to shift a zero into the high-order bit no matter what its initial value was.
For example, -2 >>> 1 gives the output 2147483647, because -2 is represented as 11...10 in a 32-bit system. When we shift it right by one with >>>, the 31 leading bits move to the right, the low-order bit is lost, and a zero comes in as the leading bit. It becomes 011...1111 (2^31 - 1), whose decimal equivalent is 2147483647.
Be aware that only a 32-bit version of PHP is available on the Windows platform.
If you then shift << or >> by more than 31 bits, the results are unpredictable. Usually the original number will be returned instead of zeros, and this can be a really tricky bug.
Of course, if you use a 64-bit version of PHP (Unix), you should avoid shifting by more than 63 bits. MySQL, for instance, uses the 64-bit BIGINT, so there should not be any compatibility problems.
UPDATE: From PHP 7 Windows, PHP builds are finally able to use full 64 bit integers:
The size of an integer is platform-dependent, although a maximum value of about two billion is the usual value (that's 32 bits signed). 64-bit platforms usually have a maximum value of about 9E18, except on Windows prior to PHP 7, where it was always 32 bit.