Rot13 for numbers - language-agnostic

EDIT: Now a Major Motion Blog Post at http://messymatters.com/sealedbids
The idea of rot13 is to obscure text, for example to prevent spoilers. It's not meant to be cryptographically secure but to simply make sure that only people who are sure they want to read it will read it.
I'd like to do something similar for numbers, for an application involving sealed bids. Roughly I want to send someone my number and trust them to pick their own number, uninfluenced by mine, but then they should be able to reveal mine (purely client-side) when they're ready. They should not require further input from me or any third party.
(Added: Note the assumption that the recipient is being trusted not to cheat.)
It's not as simple as rot13 because certain numbers, like 1 and 2, will recur often enough that you might remember that, say, 34.2 is really 1.
Here's what I'm looking for specifically:
A function seal() that maps a real number to a real number (or a string). It should not be deterministic -- seal(7) should not map to the same thing every time. But the corresponding function unseal() should be deterministic -- unseal(seal(x)) should equal x for all x. I don't want seal or unseal to call any webservices or even get the system time (because I don't want to assume synchronized clocks). (Added: It's fine to assume that all bids will be less than some maximum, known to everyone, say a million.)
Sanity check:
> seal(7)
482.2382 # some random-seeming number or string.
> seal(7)
71.9217 # a completely different random-seeming number or string.
> unseal(seal(7))
7 # we always recover the original number by unsealing.

You can pack your number as a 4 byte float together with another random float into a double and send that. The client then just has to pick up the first four bytes. In python:
import struct, random
def seal(f):
    return struct.unpack("d", struct.pack("ff", f, random.random()))[0]
def unseal(f):
    return struct.unpack("ff", struct.pack("d", f))[0]
>>> unseal( seal( 3))
3.0
>>> seal(3)
4.4533985422978706e-009
>>> seal(3)
9.0767582382536571e-010

Here's a solution inspired by Svante's answer.
M = 9999 # Upper bound on bid.
seal(x) = M * randInt(9,99) + x
unseal(x) = x % M
Sanity check:
> seal(7)
709936 # 9999 * 71 + 7
> seal(7)
519955 # 9999 * 52 + 7
> unseal(seal(7))
7
This needs tweaking to allow negative bids though:
M = 9999 # Numbers between -M/2 and M/2 can be sealed.
seal(x) = M * randInt(9,99) + x
unseal(x) =
m = x % M;
if m > M/2 return m - M else return m
A nice thing about this solution is how trivial it is for the recipient to decode: just mod by 9999 (and if that's 5000 or more then it was a negative bid, so subtract another 9999). It's also nice that the obscured bid will be at most 6 digits long. (This is plenty of security for what I have in mind. If the bids could possibly exceed $5k then I'd use a more secure method, though of course the max bid in this method can be set as high as you want.)
Instructions for Lay Folk
Pick a number between 9 and 99 and multiply it by 9999, then add your bid.
This will yield a 5 or 6-digit number that encodes your bid.
To unseal it, divide by 9999, subtract the part to the left of the decimal point, then multiply by 9999.
(This is known to children and mathematicians as "finding the remainder when dividing by 9999" or "mod'ing by 9999", respectively.)
This works for nonnegative bids less than 9999 (if that's not enough, use 99999 or as many digits as you want).
If you want to allow negative bids, then the magic 9999 number needs to be more than twice the biggest possible bid.
And when decoding, if the result is greater than half of 9999, i.e., 5000 or more, then subtract 9999 to get the actual (negative) bid.
Again, note that this is on the honor system: there's nothing technically preventing you from unsealing the other person's number as soon as you see it.
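For reference, here is a minimal Python sketch of the modular scheme above, including the negative-bid tweak. The names seal, unseal and M follow the answer; using random.randint for randInt and restricting bids to integers are assumptions. Like the original, this is honor-system obfuscation, not encryption.

```python
import random

M = 9999  # integer bids from -4999 to 4999 can be sealed


def seal(bid):
    """Hide an integer bid behind a random multiple of M."""
    if abs(bid) > M // 2:
        raise ValueError("bid must be between -%d and %d" % (M // 2, M // 2))
    return M * random.randint(9, 99) + bid


def unseal(sealed):
    """Recover the bid: take the remainder mod M, then undo the sign wrap."""
    m = sealed % M  # Python's % is never negative for a positive modulus
    return m - M if m > M // 2 else m


print(seal(7), seal(7), unseal(seal(7)))
```

Two calls to seal(7) usually print different numbers, and unseal always gives 7 back.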

If you're relying on honesty of the user and only dealing with integer bids, a simple XOR operation with a random number should be all you need, an example in C#:
static Random rng = new Random();
static string EncodeBid(int bid)
{
    int i = rng.Next();
    return String.Format("{0}:{1}", i, bid ^ i);
}
static int DecodeBid(string encodedBid)
{
    string[] d = encodedBid.Split(":".ToCharArray());
    return Convert.ToInt32(d[0]) ^ Convert.ToInt32(d[1]);
}
Use:
int bid = 500;
string encodedBid = EncodeBid(bid); // encodedBid is something like 54017514:4017054 and will be different each time
int decodedBid = DecodeBid(encodedBid); // decodedBid is 500
Converting the decode process to a client side construct should be simple enough.

Is there a maximum bid? If so, you could do this:
Let max-bid be the maximum bid and a-bid the bid you want to encode. Multiply max-bid by a rather large random number (if you want to use base64 encoding in the last step, max-rand should be (2^24/max-bid)-1, and min-rand perhaps half of that), then add a-bid. Encode this, e.g. through base64.
The recipient then just has to decode and find the remainder modulo max-bid.
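A rough Python sketch of that recipe, assuming integer bids with 0 <= bid < MAX_BID. MAX_BID = 10000 and the 3-byte big-endian packing are illustrative choices of mine, not part of the answer; the bounds on the random multiplier are the ones suggested above.

```python
import base64
import random

MAX_BID = 10_000  # assumed public upper bound; bids must be below it


def seal(bid):
    """Encode MAX_BID * r + bid as 3 bytes, then base64 (4 characters)."""
    if not 0 <= bid < MAX_BID:
        raise ValueError("bid out of range")
    max_rand = 2 ** 24 // MAX_BID - 1  # keeps the total below 2**24
    min_rand = max_rand // 2
    n = MAX_BID * random.randint(min_rand, max_rand) + bid
    return base64.b64encode(n.to_bytes(3, "big")).decode("ascii")


def unseal(token):
    """Decode the base64 text and take the remainder modulo MAX_BID."""
    n = int.from_bytes(base64.b64decode(token), "big")
    return n % MAX_BID
```

Because the total always fits in 24 bits, every sealed bid comes out as exactly four base64 characters.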

What you want to do (a Commitment scheme) is impossible to do client-side-only. The best you could do is encrypt with a shared key.
If the client doesn't need your cooperation to reveal the number, they can just modify the program to reveal the number. You might as well have just sent it and not displayed it.
To do it properly, you could send a secure hash of your bid + a random salt. That commits you to your bid. The other client can commit to their bid in the same way. Then you each share your bid and salt.
[edit] Since you trust the other client:
Sender:
Let M be your message
K = random 4-byte key
C1 = M xor hash(K) //hash optional: hides patterns in M xor K
//(you can repeat or truncate hash(K) as necessary to cover the message)
//(could also xor with output of a PRNG instead)
C2 = K append C1 //they need to know K to reveal the message
send C2 //(convert bytes to hex representation if needed)
Receiver:
receive C2
K = C2[:4]
C1 = C2[4:]
M = C1 xor hash(K)
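The sender/receiver steps above could look like this in Python. This is only a sketch: SHA-256 as the hash, hex as the transport encoding and the helper name _pad are my assumptions, and the scheme gives no protection against a recipient who chooses to peek.

```python
import hashlib
import os


def _pad(key, length):
    """Repeat or truncate hash(K) so that it covers the whole message."""
    digest = hashlib.sha256(key).digest()
    return (digest * (length // len(digest) + 1))[:length]


def seal(message):
    """Return hex(K + C1), where C1 = M xor hash(K) and K is 4 random bytes."""
    key = os.urandom(4)
    c1 = bytes(m ^ p for m, p in zip(message, _pad(key, len(message))))
    return (key + c1).hex()


def unseal(sealed_hex):
    """Split off K, then xor the rest with hash(K) to get M back."""
    c2 = bytes.fromhex(sealed_hex)
    key, c1 = c2[:4], c2[4:]
    return bytes(c ^ p for c, p in zip(c1, _pad(key, len(c1))))


print(unseal(seal(b"my bid is 14.23")))
```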

Are you aware that you need a larger 'sealed' set of numbers than your original, if you want that to work?
So you need to restrict your real numbers somehow, or store extra info that you don't show.

One simple way is to write a message like:
"my bid is: $14.23: aduigfurjwjnfdjfugfojdjkdskdfdhfddfuiodrnfnghfifyis"
All that junk is randomly-generated, and different every time.
Send the other person the SHA256 hash of the message. Have them send you the hash of their bid. Then, once you both have the hashes, send the full message, and confirm that their bid corresponds to the hash they gave you.
This gives rather stronger guarantees than you need - it's actually not possible for them to work out your bid before you send them your full message. However, there is no unseal() function as you describe.
This simple scheme has various weaknesses that a full zero-knowledge scheme would not have. For example, if they fake you out by sending you a random number instead of a hash, then they can work out your bid without revealing their own. But you didn't ask for bullet-proof. This prevents both accidental and (I think) undetectable cheating, and uses only a commonly-available command line utility, plus a random number generator (dice will do).
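To make the hash exchange concrete, here is a small Python sketch using hashlib instead of a command line utility. The message layout copies the example above; the function names commit and verify and the 32-byte random suffix are assumptions for illustration.

```python
import hashlib
import secrets


def commit(bid):
    """Build the full message (bid plus random junk) and its SHA-256 hash."""
    junk = secrets.token_hex(32)  # fresh randomness on every call
    message = "my bid is: %s: %s" % (bid, junk)
    digest = hashlib.sha256(message.encode("utf-8")).hexdigest()
    return message, digest


def verify(message, digest):
    """Check that a revealed message matches the hash received earlier."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest() == digest


message, digest = commit("$14.23")
print(digest)                   # send this first
print(verify(message, digest))  # the other side runs this after the reveal
```

You send only digest at first, and message once you hold the other party's hash.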
If, as you say, you want them to be able to recover your bid without any further input from you, and you are willing to trust them only to do it after posting their bid, then just encrypt using any old symmetric cipher (gpg --symmetric, perhaps) and the key, "rot13". This will prevent accidental cheating, but allow undetectable cheating.

One idea that popped into my mind was to maybe base your algorithm on the mathematics used for secure key sharing.
If you want to give two persons, Bob and Alice, half a key each so
that only when combining them they will be able to open whatever the key locks, how do you do that? The solution to this comes from mathematics. Say you have two points A (-2,2) and B (2,0) in a x/y coordinate system.
               |
       A       +
               |
               C
               |
---+---+---+---|---+---B---+---+---+---
               |
               +
               |
               +
If you draw a straight line between them it will cross the y axis at exactly one single point, C (0,1).
If you only know one of the points A or B it is impossible to tell where it will cross.
Thus you can let the points A and B be the shared keys which when combined will reveal the y-value
of the crossing point (i.e. 1 in this example) and this value is then typically used as
a real key for something.
For your bidding application you could let seal() and unseal() swap the y-value between the C and B points
(deterministic) but have the A point vary from time to time.
This way seal(y-value of point B) will give completely different results depending on point A,
but unseal(seal(y-value of point B)) should return the y-value of B which is what you ask for.
PS
It is not required to have A and B on different sides of the y-axis, but is much simpler conceptually to think of it this way (and I recommend implementing it that way as well).
With this straight line you can then share keys between several persons so that only two of
them are needed to unlock whatever. It is possible to use curve types other than straight lines to create other
key sharing properties (e.g. 3 out of 3 keys are required etc).
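Here is one possible reading of the idea as a Python sketch: the bid is taken to be the y-value of the crossing point C, a random slope is drawn, and the sealed value is the pair of points A and B at x = -2 and x = 2. The fixed x positions and the slope range are assumptions of mine; since both points are handed over together, this is again honor-system only.

```python
import random
from fractions import Fraction


def seal(bid):
    """Return two points A, B on a random line that crosses x = 0 at y = bid."""
    slope = random.randint(-1000, 1000)
    a = (-2, bid - 2 * slope)
    b = (2, bid + 2 * slope)
    return a, b


def unseal(points):
    """Return the y-value where the line through A and B crosses the y-axis."""
    (xa, ya), (xb, yb) = points
    slope = Fraction(yb - ya, xb - xa)
    return ya - slope * xa


print(unseal(((-2, 2), (2, 0))))  # the A, B, C example from the text: 1
```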

Pseudo code:
encode:
    value = 2000
    key = random(0..255)      // our key is only 1 byte
    // 'sealing it'
    value = value XOR key
    // add key
    sealed = (value << 16) | key
decode:
    key = sealed & 0xFF
    unsealed = key XOR (sealed >> 16)
Would that work?

Since it seems that you are assuming that the other person doesn't want to know your bid until after they've placed their own, and can be trusted not to cheat, you could try a variable rotation scheme:
from random import randint
def seal(input):
    r = randint(0, 50)
    obfuscate = [str(r)] + [ str(ord(c) + r) for c in '%s' % input ]
    return ':'.join(obfuscate)
def unseal(input):
    tmp = input.split(':')
    r = int(tmp.pop(0))
    deobfuscate = [ chr(int(c) - r) for c in tmp ]
    return ''.join(deobfuscate)
# I suppose you would put your bid in here, for 100 dollars
tmp = seal('$100.00') # --> '1:37:50:49:49:47:49:49' (output varies)
print(unseal(tmp)) # --> '$100.00'
At some point (I think we may have already passed it) this becomes silly, and because it is so easy, you should just use simple encryption, where the message recipient always knows the key - the person's username, perhaps.

If the bids are fairly large numbers, how about a bitwise XOR with some predetermined random-ish number? XORing again will then retrieve the original value.
You can change the number as often as you like, as long as both client and server know it.

You could set a different base (like 16, 17, 18, etc.) and keep track of which base you've "sealed" the bid with...
Of course, this presumes large numbers (> the base you're using, at least). If they were decimal, you could drop the point (for example, 27.04 becomes 2704, which you then translate to base 29...)
You'd probably want to use base 17 to 36 (only because some people might recognize hex and be able to translate it in their head...)
This way, you would have numbers like G4 or Z3 or KW (depending on the numbers you're sealing)...
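A quick Python sketch of the random-base idea. The "base:digits" output format is just one assumed way to keep track of the base, and bids are taken as whole numbers (e.g. cents, so 27.04 becomes 2704).

```python
import random
import string

DIGITS = string.digits + string.ascii_uppercase  # symbols for bases up to 36


def to_base(n, base):
    """Write a non-negative integer in the given base (2 to 36)."""
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, r = divmod(n, base)
        out.append(DIGITS[r])
    return "".join(reversed(out))


def seal(amount):
    """Pick a random base from 17 to 36 and prefix it to the digits."""
    base = random.randint(17, 36)
    return "%d:%s" % (base, to_base(amount, base))


def unseal(sealed):
    """Read the base back and let int() parse the digits."""
    base, digits = sealed.split(":")
    return int(digits, int(base))


print(to_base(2704, 29))  # 367
```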

Here's a cheap way to piggyback off rot13:
Assume we have a function gibberish() that generates something like "fdjk alqef lwwqisvz" and a function words(x) that converts a number x to words, eg, words(42) returns "forty two" (no hyphens).
Then define
seal(x) = rot13(gibberish() + words(x) + gibberish())
and
unseal(x) = rot13(x)
Of course the output of unseal is not an actual number and is only useful to a human, but that might be ok.
You could make it a little more sophisticated with a words-to-number function that would also just throw away all the gibberish words (defined as anything that's not one of the number words; there are fewer than a hundred of those, I think).
Sanity check:
> seal(7)
fhrlls hqufw huqfha frira afsb ht ahuqw ajaijzji
> seal(7)
qbua adfshua hqgya ubiwi ahp wqwia qhu frira wge
> unseal(seal(7))
sueyyf udhsj seven ahkua snsfo ug nuhdj nwnvwmwv
I know this is silly but it's a way to do it "by hand" if all you have is rot13 available.
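For what it's worth, a Python sketch of this. The helpers words() and gibberish() are hypothetical stand-ins for the functions assumed above; words() here only handles 0 to 99.

```python
import codecs
import random
import string

ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()


def words(x):
    """Spell out 0..99 without hyphens, e.g. 42 -> 'forty two'."""
    if x < 20:
        return ONES[x]
    tens, ones = divmod(x, 10)
    return TENS[tens - 2] + ("" if ones == 0 else " " + ONES[ones])


def gibberish(n_words=3):
    """A few random lowercase 'words' of 2 to 8 letters each."""
    return " ".join(
        "".join(random.choice(string.ascii_lowercase)
                for _ in range(random.randint(2, 8)))
        for _ in range(n_words))


def rot13(text):
    return codecs.encode(text, "rot_13")


def seal(x):
    return rot13(gibberish() + " " + words(x) + " " + gibberish())


def unseal(sealed):
    return rot13(sealed)


print(seal(7))
print(unseal(seal(7)))
```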

Related

Store 2 previous array to implement Leapfrog numerical Scheme

In the context of numerically solving the advection equation, I am trying to implement the following recurrence formula in a time loop:
T_(i,j+1) = T_(i,j-1) - s (T_(i+1,j) - T_(i-1,j))
As you can see, I need the second previous time value for (j-1) and previous (j) value to compute the (j+1) time value.
I don't know how to implement this recurrence formula. Here below my attempt in Python where u represents the array of values T for each iteration:
l = 1
# Time loop
for i in range(1,nt+1):
    # Leapfrog scheme
    # Store (i-1) value for scheme formula
    if (l < 2):
        atemp = copy(u)
        l = l+1
    elif (l == 2):
        btemp = copy(atemp)
        l = 1
    u[1:nx-1] = btemp[1:nx-1] - cfl*(u[2:nx] - u[0:nx-2])
    t=t+dt
Coefficient cfl is equal to s.
But the simulation does not give fully correct results. I think my approach is not correct.
How can I implement this recurrence? That is, how do I store the (j-1) value in time so that I can inject it into the formula for computing (j+1)?
Update
In the formula, the time index j has to start from j=1, since we have the term T_(i,j-1).
So for the first iteration, we have:
T_i,2 = T_i,0 - s (T_(i+1),1 - T_(i-1),1)
Then, if I only use a time loop (and no spatial loop, so that I can't compute dudx[i] = T[i+1] - T[i-1] point by point), how can I compute (T_(i+1),1 - T_(i-1),1), I mean, without precalculating dudx[i] = T_(i+1),1 - T_(i-1),1?
That was the trick I tried to implement in my original question. The main problem is that I am required to use only a time loop.
The code would be simpler if I could use a 2D array with T[i][j] elements, i for space and j for time, but I am not allowed to use a 2D array in my examination.
There are a few problems I see in your code. First is notation. From the numerical scheme you posted it looks like you are discretizing time with j and space with i using central differences in both. But in your code it looks like the time loop is written in terms of i and this is confusing. I will use j for space and n for time here.
Second, this line
u[1:nx-1] = btemp[1:nx-1] - cfl*(u[2:nx] - u[0:nx-2])
is only correct if every term refers to the right time level. For the spatial derivative du/dx you need the central difference at every interior point of u. With NumPy arrays, u[2:nx] - u[0:nx-2] is an elementwise subtraction that does give u[j+1] - u[j-1] for j = 1..nx-2, but it must be evaluated on the solution at the current time n, while btemp must hold the solution at time n-1. Your bookkeeping with l, atemp and btemp does not guarantee that, since btemp is refreshed only every other iteration.
Finally, the Leapfrog method which indeed takes into account the n-1 solution is usually implemented by keeping a copy of the previous time step in another variable such as u_prev. So if you use the Leapfrog time scheme plus central difference spatial scheme, in the end you should have something like
u_prev = u_init
u = u_prev
for n in time...:
    u_new = u_prev - cfl*(dudx)
    u_prev = u
    u = u_new
Note that u_new on the LHS is the solution at time n+1, u_prev is at time n-1 and dudx uses u at the current time n. Also, you can compute dudx with
for j in space...:
    dudx[j] = u[j+1]-u[j-1]
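Put together, the two loops above might look like the following plain-Python sketch. It uses lists instead of NumPy so that it runs anywhere; the function name leapfrog, holding the two boundary values fixed, and starting with u_prev equal to u_init are assumptions, not part of the scheme you were given.

```python
def leapfrog(u_init, cfl, n_steps):
    """Advance u with u_new[j] = u_prev[j] - cfl * (u[j+1] - u[j-1])."""
    u_prev = list(u_init)  # solution at time n-1
    u = list(u_init)       # solution at time n
    for _ in range(n_steps):
        u_new = list(u)    # boundary values are simply carried over
        for j in range(1, len(u) - 1):
            u_new[j] = u_prev[j] - cfl * (u[j + 1] - u[j - 1])
        # shift the time levels: n becomes n-1, n+1 becomes n
        u_prev, u = u, u_new
    return u


print(leapfrog([0.0, 1.0, 2.0, 3.0], 0.5, 1))  # [0.0, 0.0, 1.0, 3.0]
```

The key line is the swap at the end of each time step: no copies of u are needed beyond the new array, and u_prev always lags u by exactly one step.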

Generate unique serial from id number

I have a database that increases id incrementally. I need a function that converts that id to a unique number between 0 and 1000. (the actual max is much larger but just for simplicity's sake.)
1 => 3301,
2 => 0234,
3 => 7928,
4 => 9821
The number generated cannot have duplicates.
It can not be incremental.
Need it generated on the fly (not create a table of uniform numbers to read from)
I thought of a hash function but there is a possibility for collisions.
Random numbers could also have duplicates.
I need a minimal perfect hash function but cannot find a simple solution.
Since the criteria are sort of vague (good enough to fool the average person), I am unsure exactly which route to take. Here are some ideas:
You could use a Pearson hash. According to the Wikipedia page:
Given a small, privileged set of inputs (e.g., reserved words for a compiler), the permutation table can be adjusted so that those inputs yield distinct hash values, producing what is called a perfect hash function.
You could just use a complicated looking one-to-one mathematical function. The drawback of this is that it would be difficult to make one that was not strictly increasing or strictly decreasing due to the one-to-one requirement. If you did something like (id ^ 2) + id * 2, the interval between ids would change and it wouldn't be immediately obvious what the function was without knowing the original ids.
You could do something like this:
new_id = (old_id << 4) + arbitrary_4bit_hash(old_id);
This would give unique IDs and it wouldn't be immediately obvious that the lowest 4 bits are just garbage (especially when reading the numbers in decimal format). Like the last option, the new IDs would be in the same order as the old ones. I don't know if that would be a problem.
You could just hardcode all ID conversions by making a lookup array full of "random" numbers.
You could use some kind of hash function generator like gperf.
GNU gperf is a perfect hash function generator. For a given list of strings, it produces a hash function and hash table, in form of C or C++ code, for looking up a value depending on the input string. The hash function is perfect, which means that the hash table has no collisions, and the hash table lookup needs a single string comparison only.
You could encrypt the ids with a key using a cryptographically secure mechanism.
Hopefully one of these works for you.
Update
Here is the rotational shift the OP requested:
function map($number)
{
    // Shift the high bits down to the low end and the low bits
    // up to the high end
    // Also, mask out all but 10 bits. This allows unique mappings
    // from 0-1023 to 0-1023
    $high_bits = 0b0000001111111000 & $number;
    $new_low_bits = $high_bits >> 3;
    $low_bits = 0b0000000000000111 & $number;
    $new_high_bits = $low_bits << 7;
    // Recombine bits
    $new_number = $new_high_bits | $new_low_bits;
    return $new_number;
}
function demap($number)
{
    // Reverse of map(): shift the top 3 bits back down to the low end
    // and the low 7 bits back up to the high end
    $high_bits = 0b0000001110000000 & $number;
    $new_low_bits = $high_bits >> 7;
    $low_bits = 0b0000000001111111 & $number;
    $new_high_bits = $low_bits << 3;
    // Recombine bits
    $new_number = $new_high_bits | $new_low_bits;
    return $new_number;
}
This method has its advantages and disadvantages. The main disadvantage that I can think of (besides the security aspect) is that for lower IDs consecutive numbers will be exactly the same (multiplicative) interval apart until digits start wrapping around. That is to say
map(1) * 2 == map(2)
map(1) * 3 == map(3)
This happens, of course, because with lower numbers, all the higher bits are 0, so the map function is equivalent to just shifting. This is why I suggested using pseudo-random data for the lower bits rather than the higher bits of the number. It would make the regular interval less noticeable. To help mitigate this problem, the function I wrote shifts only the first 3 bits and rotates the rest. By doing this, the regular interval will be less noticeable for all IDs greater than 7.
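As a sanity check on the rotation, here is the same 10-bit mapping as a Python sketch (the names rotate_map and rotate_demap are mine). It makes it easy to confirm that the mapping is one-to-one on 0-1023 and to see the regular interval for small IDs.

```python
def rotate_map(n):
    """Move bits 3..9 down to the low end and bits 0..2 up to the high end."""
    high = (n & 0b1111111000) >> 3
    low = (n & 0b0000000111) << 7
    return low | high


def rotate_demap(n):
    """Inverse of rotate_map: bits 7..9 go back down, bits 0..6 go back up."""
    high = (n & 0b1110000000) >> 7
    low = (n & 0b0001111111) << 3
    return low | high


print([rotate_map(i) for i in range(1, 10)])
# [128, 256, 384, 512, 640, 768, 896, 1, 129]
```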
It seems that it doesn't have to be numerical? What about an MD5-Hash?
select md5(id+rand(10000)) from ...

NaN and +-INF in floating point number system following IEEE754

In the standard, representation of NaN and INF is like this:
For NaN: exponent = emax+1 & mantissa != 0;
For INF: exponent = emax+1 & mantissa = 0;
There are many calculations that result in these two values.
But what ACTUALLY is NaN(INF)?
And HOW does the system "decide" or "judge" to store value as these one(two)?
Here may be a case seeming to be odd to me:
a = b = 1*2^(emax);
then calculating c = a+b, the actual result is 1*2^(emax+1);
Now, c is not an available FP value according to the standard;
then how does the system store c in device?
Is it NaN?
If yes, how can this be even reasonable?
I mean, 1*2^(emax+1) IS(Should be) a Number...in a common sense...?
If this is the case, then how ACTUALLY does the standard think what a NaN is?
If not, then how do we deal with this???
I'm considering one like this:
let eM = emax+1;
then 1d.d...d * 2^(eM-1) = 1d.d...d * 2^(emax)
with 1d.d...d having legal number of digits by the system.
This is actually a way like that dealing with denormalized number.
The thing here is this:
Is the judgement posterior or prior to the completion of calculation?
If it's the former, the above may be a problem or not?
On the other hand, then the task seems undoable...
Is there anyone ever thinking about this issue?
Thx for considering it!!
Note: things for +-INF are also presented.
From Wikipedia:
The five possible exceptions are:
Invalid operation (e.g., square root of a negative number) (returns qNaN by default).
Division by zero (an operation on finite operands gives an exact infinite result, e.g., 1/0 or log(0)) (returns ±infinity by default).
Overflow (a result is too large to be represented correctly) (returns ±infinity by default (for round-to-nearest mode)).
Underflow (a result is very small (outside the normal range) and is inexact) (returns a denormalized value by default).
...
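The a + b case from the question is therefore an overflow, not an invalid operation, so the result is +INF rather than NaN. A short Python check, assuming the usual IEEE 754 double behind Python's float (emax = 1023):

```python
import math

a = b = 2.0 ** 1023  # 1 * 2^emax for a double
c = a + b            # exact result 2^1024 is too large: overflow

print(c)                # inf
print(math.isinf(c))    # True
print(math.isnan(c))    # False

nan = c - c             # inf - inf is an invalid operation
print(math.isnan(nan))  # True
print(nan == nan)       # False: a NaN compares unequal even to itself
```

The decision is made when the result is rounded: the operation is carried out as if with unlimited range, and a rounded result beyond the largest finite value is replaced by infinity.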

Give a unique 6 or 9 digit number to each row

Is it possible to assign a unique 6 or 9 digit number to each new row using only MySQL?
Example :
id1 : 928524
id2 : 124952
id3 : 485920
...
...
P.S : I can do that with php's rand() function, but I want a better way.
MySQL can assign unique continuous keys by itself. If you don't want to use rand(), maybe this is what you meant?
I suggest you manually set the ID of the first row to 100000, then tell the database to auto increment. Next row should then be 100001, then 100002 and so on. Each unique.
I don't know why you would ever want to do this, but you would have to use PHP's rand() function: generate a number, check whether it is already in the database, start again if it is, and use it as the id if it is not.
Essentially you want a cryptographic hash that's guaranteed not to have a collision for your range of inputs. Nobody seems to know the collision behavior of MD5, so here's an algorithm that's guaranteed not to have any: Choose two large numbers M and N that have no common divisors-- they can be two very large primes, or 2**64 and 3**50, or whatever. You will be generating numbers in the range 0..M-1. Use the following hashing function:
H(k) = k*N (mod M)
Basic number theory guarantees that the sequence has no collisions in the range 0..M-1. So as long as the IDs in your table are less than M, you can just hash them with this function and you'll have distinct hashes. If you use unsigned 64-bit integer arithmetic, you can let M = 2**64. N can then be any odd number (I'd choose something large enough to ensure that k*N > M), and you get the modulo operation for free as arithmetic overflow!
I wrote the following in comments but I'd better repeat it here: This is not a good way to implement access protection. But it does prevent people from slurping all your content, if M is sufficiently large.
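To illustrate H(k) = k*N (mod M) for the six-digit case, here is a Python sketch. The particular values M = 10**6 and N = 3**18 are example choices of mine (any N with no common divisor with M works), and the inverse via pow(N, -1, M) needs Python 3.8 or later.

```python
from math import gcd

M = 10 ** 6        # codes fall in 0..999999, i.e. six digits
N = 3 ** 18        # 387420489, shares no factor with 10**6
assert gcd(M, N) == 1

N_INV = pow(N, -1, M)  # modular inverse of N, so the mapping can be undone


def scramble(k):
    """Map a row id in 0..M-1 to a distinct code in 0..M-1."""
    return (k * N) % M


def unscramble(code):
    """Recover the row id from its code."""
    return (code * N_INV) % M


for row_id in (1, 2, 3):
    print(row_id, "%06d" % scramble(row_id))
# 1 420489
# 2 840978
# 3 261467
```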

Real world use cases of bitwise operators [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
What are some real world use cases of the following bitwise operators?
AND
XOR
NOT
OR
Left/Right shift
Bit fields (flags)
They're the most efficient way of representing something whose state is defined by several "yes or no" properties. ACLs are a good example; if you have let's say 4 discrete permissions (read, write, execute, change policy), it's better to store this in 1 byte rather than waste 4. These can be mapped to enumeration types in many languages for added convenience.
Communication over ports/sockets
Always involves checksums, parity, stop bits, flow control algorithms, and so on, which usually depend on the logic values of individual bytes as opposed to numeric values, since the medium may only be capable of transmitting one bit at a time.
Compression, Encryption
Both of these are heavily dependent on bitwise algorithms. Look at the deflate algorithm for an example - everything is in bits, not bytes.
Finite State Machines
I'm speaking primarily of the kind embedded in some piece of hardware, although they can be found in software too. These are combinatorial in nature - they might literally be getting "compiled" down to a bunch of logic gates, so they have to be expressed as AND, OR, NOT, etc.
Graphics
There's hardly enough space here to get into every area where these operators are used in graphics programming. XOR (or ^) is particularly interesting here because applying the same input a second time will undo the first. Older GUIs used to rely on this for selection highlighting and other overlays, in order to eliminate the need for costly redraws. They're still useful in slow graphics protocols (i.e. remote desktop).
Those were just the first few examples I came up with - this is hardly an exhaustive list.
Is it odd?
(value & 0x1) > 0
Is it divisible by two (even)?
(value & 0x1) == 0
I've used bitwise operations in implementing a security model for a CMS. It had pages which could be accessed by users if they were in appropriate groups. A user could be in multiple groups, so we needed to check if there was an intersection between the users groups and the pages groups. So we assigned each group a unique power-of-2 identifier, e.g.:
Group A = 1 --> 00000001
Group B = 2 --> 00000010
Group C = 4 --> 00000100
We OR these values together, and store the value (as a single int) with the page. E.g. if a page could be accessed by groups A & B, we store the value 3 (which in binary is 00000011) as the pages access control. In much the same way, we store a value of ORed group identifiers with a user to represent which groups they are in.
So to check if a given user can access a given page, you just need to AND the values together and check if the value is non-zero. This is very fast as this check is implemented in a single instruction, no looping, no database round-trips.
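A tiny Python sketch of that check; the constant and function names are made up for illustration.

```python
GROUP_A = 1 << 0  # 00000001
GROUP_B = 1 << 1  # 00000010
GROUP_C = 1 << 2  # 00000100


def can_access(user_groups, page_groups):
    """True if the user belongs to at least one group allowed on the page."""
    return (user_groups & page_groups) != 0


page = GROUP_A | GROUP_B                     # stored with the page as 3
print(can_access(GROUP_B | GROUP_C, page))   # True: group B is shared
print(can_access(GROUP_C, page))             # False: no group in common
```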
Here's some common idioms dealing with flags stored as individual bits.
enum CDRIndicators {
    Local = 1 << 0,
    External = 1 << 1,
    CallerIDMissing = 1 << 2,
    Chargeable = 1 << 3
};
unsigned int flags = 0;
Set the Chargeable flag:
flags |= Chargeable;
Clear CallerIDMissing flag:
flags &= ~CallerIDMissing;
Test whether CallerIDMissing and Chargeable are set:
if((flags & (CallerIDMissing | Chargeable )) == (CallerIDMissing | Chargeable)) {
}
Low-level programming is a good example. You may, for instance, need to write a specific bit to a memory-mapped register to make some piece of hardware do what you want it to:
volatile uint32_t *reg = (volatile uint32_t *)0x87000000; // "register" itself is a reserved word in C
uint32_t value;
uint32_t set_bit = 0x00010000;
uint32_t clear_bit = 0x00001000;
value = *reg; // get current value from the register
value = value & ~clear_bit; // clear a bit
value = value | set_bit; // set a bit
*reg = value; // write it back to the register
Also, htonl() and htons() are implemented using the & and | operators (on machines whose endianness (byte order) doesn't match network order):
#define htons(a) ((((a) & 0xff00) >> 8) | \
(((a) & 0x00ff) << 8))
#define htonl(a) ((((a) & 0xff000000) >> 24) | \
(((a) & 0x00ff0000) >> 8) | \
(((a) & 0x0000ff00) << 8) | \
(((a) & 0x000000ff) << 24))
I use them to get RGB(A) values from packed colorvalues, for instance.
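Something along these lines, as a Python sketch. The 0xRRGGBBAA layout is an assumption; other formats put the channels in a different order.

```python
def unpack_rgba(pixel):
    """Split a 32-bit 0xRRGGBBAA value into its four 8-bit channels."""
    r = (pixel >> 24) & 0xFF
    g = (pixel >> 16) & 0xFF
    b = (pixel >> 8) & 0xFF
    a = pixel & 0xFF
    return r, g, b, a


def pack_rgba(r, g, b, a):
    """Combine four 8-bit channels back into one 32-bit value."""
    return (r << 24) | (g << 16) | (b << 8) | a


print([hex(c) for c in unpack_rgba(0xCCDDEEFF)])  # ['0xcc', '0xdd', '0xee', '0xff']
```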
When I have a bunch of boolean flags, I like to store them all in an int.
I get them out using bitwise-AND. For example:
int flags;
if (flags & 0x10) {
// Turn this feature on.
}
if (flags & 0x08) {
// Turn a second feature on.
}
etc.
& = AND:
Mask out specific bits.
You are defining the specific bits which should be displayed
or not displayed. 0x0 & x will clear all bits in a byte while 0xFF will not change x.
0x0F will display the bits in the lower nibble.
Conversion:
To cast shorter variables into longer ones with bit identity it is necessary to adjust the bits because -1 in an int is 0xFFFFFFFF while -1 in a long is 0xFFFFFFFFFFFFFFFF. To preserve
the identity you apply a mask after conversion.
| = OR:
Set bits. Each bit can be set independently of whether it is already set. Many data structures (bitfields) have flags like IS_HSET = 0, IS_VSET = 1 which can be set independently.
To set the flags, you apply IS_HSET | IS_VSET (in C and assembly this is very convenient to read).
^ = XOR:
Find bits which are the same or different.
~ = NOT:
Flip bits.
It can be shown that all possible local bit operations can be implemented by these operations.
So if you like you can implement an ADD instruction solely by bit operations.
Some wonderful hacks:
http://www.ugcs.caltech.edu/~wnoise/base2.html
http://www.jjj.de/bitwizardry/bitwizardrypage.html
Encryption is all bitwise operations.
You can use them as a quick and dirty way to hash data.
int a = 1230123;
int b = 1234555;
int c = 5865683;
int hash = a ^ b ^ c;
I just used bitwise-XOR (^) about three minutes ago to calculate a checksum for serial communication with a PLC...
This is an example to read colours from a bitmap image in byte format
int imagePixel = 0xCCDDEE; /* Pixel in RRGGBB format: R=Red, G=Green, B=Blue */
//To only have red
int redColour = imagePixel & 0xFF0000; /* Bitmasking with the AND operator */
//Now, we only want the red colour
redColour = (redColour >> 16) & 0xFF; /* This now returns a red colour between 0x00 and 0xFF. */
I hope this tiny example helps.
In the abstracted world of today's modern languages, not too many. File IO is an easy one that comes to mind, though that's exercising bitwise operations on something already implemented and is not implementing something that uses bitwise operations. Still, as an easy example, this code demonstrates removing the read-only attribute on a file (so that it can be used with a new FileStream specifying FileMode.Create) in C#:
//Hidden files possess some extra attributes that make the FileStream throw an exception
//even with FileMode.Create (if exists -> overwrite) so delete it and don't worry about it!
if(File.Exists(targetName))
{
    FileAttributes attributes = File.GetAttributes(targetName);
    if ((attributes & FileAttributes.ReadOnly) == FileAttributes.ReadOnly)
        File.SetAttributes(targetName, attributes & (~FileAttributes.ReadOnly));
    File.Delete(targetName);
}
As far as custom implementations, here's a recent example:
I created a "message center" for sending secure messages from one installation of our distributed application to another. Basically, it's analogous to email, complete with Inbox, Outbox, Sent, etc, but it also has guaranteed delivery with read receipts, so there are additional subfolders beyond "inbox" and "sent." What this amounted to was a requirement for me to define generically what's "in the inbox" or what's "in the sent folder". Of the sent folder, I need to know what's read and what's unread. Of what's unread, I need to know what's received and what's not received. I use this information to build a dynamic where clause which filters a local datasource and displays the appropriate information.
Here's how the enum is put together:
public enum MemoView :int
{
    InboundMemos = 1,                 // 0000 0001
    InboundMemosForMyOrders = 3,      // 0000 0011
    SentMemosAll = 16,                // 0001 0000
    SentMemosNotReceived = 48,        // 0011 0000
    SentMemosReceivedNotRead = 80,    // 0101 0000
    SentMemosRead = 144,              // 1001 0000
    Outbox = 272,                     // 0001 0001 0000
    OutBoxErrors = 784                // 0011 0001 0000
}
Do you see what this does? By anding (&) with the "Inbox" enum value, InboundMemos, I know that InboundMemosForMyOrders is in the inbox.
Here's a boiled down version of the method that builds and returns the filter that defines a view for the currently selected folder:
private string GetFilterForView(MemoView view, DefaultableBoolean readOnly)
{
    string filter = string.Empty;
    if((view & MemoView.InboundMemos) == MemoView.InboundMemos)
    {
        filter = "<inbox filter conditions>";
        if((view & MemoView.InboundMemosForMyOrders) == MemoView.InboundMemosForMyOrders)
        {
            filter += "<my memo filter conditions>";
        }
    }
    else if((view & MemoView.SentMemosAll) == MemoView.SentMemosAll)
    {
        //all sent items have originating system = to local
        filter = "<memos leaving current system>";
        if((view & MemoView.Outbox) == MemoView.Outbox)
        {
            ...
        }
        else
        {
            //sent sub folders
            filter += "<all sent items>";
            if((view & MemoView.SentMemosNotReceived) == MemoView.SentMemosNotReceived)
            {
                if((view & MemoView.SentMemosReceivedNotRead) == MemoView.SentMemosReceivedNotRead)
                {
                    filter += "<not received and not read conditions>";
                }
                else
                    filter += "<received and not read conditions>";
            }
        }
    }
    return filter;
}
Extremely simple, but a neat implementation at a level of abstraction that doesn't typically require bitwise operations.
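For anyone who wants to play with the idea outside C#, here is a minimal Python sketch of the same subset test. The constant values are copied from the MemoView enum above; the helper name is_in is made up for illustration and is not part of the original code.

```python
# Values copied from the MemoView enum above; plain ints keep the sketch simple.
INBOUND_MEMOS = 1                 # 0000 0001
INBOUND_MEMOS_FOR_MY_ORDERS = 3   # 0000 0011
SENT_MEMOS_ALL = 16               # 0001 0000
SENT_MEMOS_NOT_RECEIVED = 48      # 0011 0000
OUTBOX = 272                      # 0001 0001 0000
OUTBOX_ERRORS = 784               # 0011 0001 0000


def is_in(view, folder):
    """Hypothetical helper: True when every bit of `folder` is set in `view`."""
    return (view & folder) == folder


print(is_in(INBOUND_MEMOS_FOR_MY_ORDERS, INBOUND_MEMOS))  # a sub-view of the inbox
print(is_in(OUTBOX_ERRORS, OUTBOX))                       # a sub-view of the outbox
print(is_in(INBOUND_MEMOS, SENT_MEMOS_ALL))               # unrelated folders
```

Each more specific view simply carries all the bits of its parent folder plus at least one extra bit, which is why a single AND answers "is this view inside that folder?".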
Usually bitwise operations are faster than doing multiply/divide. So if you need to multiply a variable x by, say, 9, you can write (x << 3) + x, which can be a few cycles faster than x*9. Note the parentheses: + binds tighter than <<, so x<<3 + x would shift by 3 + x instead. If this code is inside an ISR, you will save on response time.
Similarly, if you want to use an array as a circular queue, it's faster (and more elegant) to handle the wrap-around with bitwise operations (your array size must be a power of 2). E.g., you can use tail = (tail + 1) & MASK, where MASK = size - 1, instead of tail = ((tail + 1) < size) ? tail + 1 : 0 when you insert or delete.
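A quick sketch of the shift-and-add multiply, in Python only because it is easy to run. It mainly shows the operator-precedence trap, since Python (like C) evaluates + before <<. The function name times9 is just for illustration.

```python
def times9(x):
    """Multiply by 9 as x*8 + x, using a shift for the x*8 part."""
    # The parentheses are required: without them, x << 3 + x means x << (3 + x).
    return (x << 3) + x


print(times9(5))       # same as 5 * 9
print(5 << 3 + 5)      # NOT 45: this is 5 << 8
```

Whether this is actually faster depends on the CPU and compiler, so treat it as an illustration of the identity rather than a guaranteed optimization.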
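Here is a minimal sketch of a power-of-2 circular queue where the wrap-around is done by incrementing first and masking second. The class and attribute names are made up for this example, and the count field is just one simple way to tell full from empty.

```python
class RingBuffer:
    """Fixed-size FIFO whose size must be a power of 2 (illustrative sketch)."""

    def __init__(self, size):
        if size <= 0 or size & (size - 1):
            raise ValueError("size must be a power of 2")
        self.buf = [None] * size
        self.mask = size - 1   # e.g. size 8 -> mask 0b111
        self.head = 0          # next slot to read
        self.tail = 0          # next slot to write
        self.count = 0

    def push(self, item):
        if self.count == len(self.buf):
            raise IndexError("queue is full")
        self.buf[self.tail] = item
        self.tail = (self.tail + 1) & self.mask   # wraps to 0 without a branch
        self.count += 1

    def pop(self):
        if self.count == 0:
            raise IndexError("queue is empty")
        item = self.buf[self.head]
        self.head = (self.head + 1) & self.mask
        self.count -= 1
        return item
```

Usage: after pushing four items into RingBuffer(4), popping one and pushing another, the new item lands in slot 0 and tail ends up at 1.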
Also, if you want an error flag to hold multiple error codes together, each bit can hold a separate value. You can AND it with each individual error code as a check. This is used in Unix error codes.
Also, an n-bit bitmap can be a really cool and compact data structure. If you want to allocate a resource pool of size n, you can use n bits to represent the current status.
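To make the bitmap idea concrete, here is a small sketch of a resource pool whose allocation state lives in a single integer. The class name BitmapPool and its methods are invented for this example.

```python
class BitmapPool:
    """Tracks n slots in one integer: bit i set means slot i is in use."""

    def __init__(self, n):
        self.n = n
        self.used = 0

    def allocate(self):
        """Return the lowest free slot index, or None if the pool is full."""
        for i in range(self.n):
            if not (self.used >> i) & 1:   # test bit i
                self.used |= 1 << i        # set bit i
                return i
        return None

    def release(self, i):
        self.used &= ~(1 << i)             # clear bit i
```

Usage: with BitmapPool(3), three calls to allocate() hand out slots 0, 1 and 2, a fourth returns None, and after release(1) the state is 0b101.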
Bitwise & is used to mask/extract a certain part of a byte.
1 Byte variable
01110010
&00001111 Bitmask of 0x0F to find out the lower nibble
--------
00000010
The shift operators (<< and >>) in particular are often used for calculations.
Bitwise operators are useful for looping over arrays whose length is a power of 2. As many people mentioned, bitwise operators are extremely useful and are used in flags, graphics, networking and encryption. Not only that, but they are extremely fast. My personal favorite use is to loop over an array without conditionals. Suppose you have a zero-based array (i.e. the first element's index is 0) and you need to loop over it indefinitely. By indefinitely I mean going from the first element to the last and returning to the first. One way to implement this is:
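The masking diagram above, plus a shift to get at the other half of the byte, can be sketched like this (variable names are only for illustration):

```python
value = 0b01110010

low_nibble = value & 0x0F            # keep only the lower 4 bits
high_nibble = (value >> 4) & 0x0F    # shift the upper 4 bits down, then mask

print(format(low_nibble, "04b"))
print(format(high_nibble, "04b"))

# Shifting and OR-ing puts the byte back together.
rebuilt = (high_nibble << 4) | low_nibble
```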
int[] arr = new int[8];
int i = 0;
while (true) {
print(arr[i]);
i = i + 1;
if (i >= arr.length)
i = 0;
}
This is the simplest approach. If you'd like to avoid the if statement, you can use the modulus approach like so:
int[] arr = new int[8];
int i = 0;
while (true) {
print(arr[i]);
i = i + 1;
i = i % arr.length;
}
The downside of these two methods is that the modulus operator is expensive, since it computes a remainder after integer division, and the first method runs an if statement on each iteration. With a bitwise operator, however, if the length of your array is a power of 2, you can easily generate a sequence like 0 .. length - 1 by using the & (bitwise and) operator like so: i & (length - 1). Knowing this, the code from above becomes
int[] arr = new int[8];
int i = 0;
while (true){
print(arr[i]);
i = i + 1;
i = i & (arr.length - 1);
}
Here is how it works. In binary, every power of 2 minus 1 is written using only ones. For example, 3 in binary is 11, 7 is 111, 15 is 1111, and so on; you get the idea. Now, what happens if you & any number against a number consisting only of ones in binary? Let's say we do this:
num & 7;
If num is smaller than or equal to 7, the result will be num, because each bit &-ed with 1 is itself. If num is bigger than 7, the & operation pairs its higher bits with 7's leading zeros, which of course become zeros, so only the trailing three bits remain. In the case of 9 & 7, in binary it looks like
1001 & 0111
The result will be 0001, which is 1 in decimal and addresses the second element in the array.
Base64 encoding is an example. Base64 encoding is used to represent binary data as printable characters for sending over email systems (and other purposes). It converts a series of 8-bit bytes into 6-bit character lookup indexes. Bit operations (shifting, and'ing, or'ing, not'ing) are very useful for implementing Base64 encoding and decoding.
This of course is only 1 of countless examples.
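As a rough sketch of the 8-bit to 6-bit regrouping, here is how one group of three bytes becomes four Base64 characters. The function name encode_triple is invented, and padding for inputs that are not a multiple of 3 bytes is deliberately left out; the standard library's base64 module is imported only to compare results.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_triple(b0, b1, b2):
    """Encode exactly three bytes as four Base64 characters (no padding logic)."""
    word = (b0 << 16) | (b1 << 8) | b2                  # pack 24 bits into one int
    indexes = [(word >> shift) & 0x3F for shift in (18, 12, 6, 0)]  # four 6-bit groups
    return "".join(ALPHABET[i] for i in indexes)


print(encode_triple(*b"Man"))
print(base64.b64encode(b"Man").decode("ascii"))   # should match the line above
```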
I'm surprised no one picked the obvious answer for the Internet age: calculating valid network addresses for a subnet.
http://www.topwebhosts.org/tools/netmask.php
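A minimal sketch of the subnet arithmetic for IPv4, assuming a CIDR-style prefix length. The function name is made up; ipaddress from the standard library is used only to convert between dotted strings and integers, the masking itself is plain bitwise work.

```python
import ipaddress  # used only for parsing and formatting addresses


def network_and_broadcast(address, prefix_len):
    """Return (network, broadcast) for an IPv4 address and prefix length 0..32."""
    addr = int(ipaddress.IPv4Address(address))
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF   # e.g. /26 -> 255.255.255.192
    network = addr & mask                                    # clear the host bits
    broadcast = network | (~mask & 0xFFFFFFFF)               # set all host bits
    return str(ipaddress.IPv4Address(network)), str(ipaddress.IPv4Address(broadcast))


print(network_and_broadcast("192.168.1.77", 26))
```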
Nobody seems to have mentioned fixed point maths.
(Yeah, I'm old, ok?)
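For those who have not met fixed point before, here is a small sketch assuming a 16.16 layout (16 integer bits, 16 fraction bits). The layout and all the names are choices made for this example, not anything prescribed above.

```python
FRAC_BITS = 16
ONE = 1 << FRAC_BITS          # the fixed-point representation of 1.0


def to_fixed(x):
    """Convert a float to 16.16 fixed point."""
    return int(round(x * ONE))


def to_float(f):
    """Convert 16.16 fixed point back to a float."""
    return f / ONE


def fixed_mul(a, b):
    """Multiply two fixed-point values; the shift drops the extra 16 fraction bits."""
    return (a * b) >> FRAC_BITS


product = fixed_mul(to_fixed(1.5), to_fixed(2.25))
print(to_float(product))
```

Addition and subtraction work on the raw integers directly; only multiplication and division need the extra shift to keep the binary point in place.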
Is a number x a power of 2? (Useful, for example, in algorithms where a counter is incremented and an action is to be taken only a logarithmic number of times.) Note that the test below is also true for x = 0, so check x > 0 separately if zero can occur.
(x & (x - 1)) == 0
Which is the highest bit of an integer x? (This for example can be used to find the minimum power of 2 that is larger than x)
x |= (x >> 1);
x |= (x >> 2);
x |= (x >> 4);
x |= (x >> 8);
x |= (x >> 16);
return x - (x >>> 1); // ">>>" is unsigned right shift
Which is the lowest 1 bit of an integer x? (Helps find number of times divisible by 2.)
x & -x
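The three tricks above can be tried out with this Python sketch. The function names are invented; highest_bit assumes 0 <= x < 2**32 to mirror a 32-bit integer, and plain >> is fine there because the values are non-negative.

```python
def is_power_of_two(x):
    # The x > 0 guard excludes 0, which would otherwise pass the bit test.
    return x > 0 and (x & (x - 1)) == 0


def highest_bit(x):
    """Value of the highest set bit; assumes 0 <= x < 2**32."""
    x |= x >> 1
    x |= x >> 2
    x |= x >> 4
    x |= x >> 8
    x |= x >> 16          # every bit below the highest one is now set
    return x - (x >> 1)


def lowest_bit(x):
    """Value of the lowest set bit (0 for x == 0)."""
    return x & -x


print(is_power_of_two(64), highest_bit(100), lowest_bit(12))
```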
If you ever want to calculate your number mod (%) a certain power of 2, you can use yourNumber & (2^N - 1), which for non-negative numbers is the same as yourNumber % 2^N (here ^ means exponentiation, not XOR).
(number % 16) == (number & 15);
(number % 128) == (number & 127);
This is probably only useful as an alternative to the modulus operation with a very big divisor that is 2^N... But even then its speed boost over the modulus operation was negligible in my test on .NET 2.0. I suspect modern compilers already perform optimizations like this. Anyone know more about this?
I use them for multi-select options; this way I only store one value instead of 10 or more.
It can also be handy in a SQL relational model. Let's say you have the following tables: BlogEntry, BlogCategory.
Traditionally you could create an n-n relationship between them using a BlogEntryCategory table,
or, when there are not that many BlogCategory records, you could use one value in BlogEntry to link to multiple BlogCategory records, just like you would do with flagged enums.
Most RDBMSs also have very fast operators to select on that 'flagged' column...
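Here is a rough sketch of the flagged-column idea using an in-memory SQLite database. The BlogEntry table name comes from the text above; the column names, the category bit values and the helper titles_in are all made up for illustration.

```python
import sqlite3

# Hypothetical category bits: one bit per BlogCategory record.
CAT_NEWS, CAT_TECH, CAT_TRAVEL = 1, 2, 4

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BlogEntry (title TEXT, categories INTEGER)")
conn.executemany(
    "INSERT INTO BlogEntry VALUES (?, ?)",
    [
        ("Launch day", CAT_NEWS | CAT_TECH),   # two categories in one value
        ("Road trip", CAT_TRAVEL),
        ("Gadget review", CAT_TECH),
    ],
)


def titles_in(category):
    """Titles of all entries that have the given category bit set."""
    rows = conn.execute(
        "SELECT title FROM BlogEntry WHERE (categories & ?) != 0 ORDER BY title",
        (category,),
    )
    return [row[0] for row in rows]


print(titles_in(CAT_TECH))
```

One trade-off to keep in mind: a filter like this generally cannot use an ordinary index on the column, so it suits small category sets rather than large tables with selective lookups.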
When you only want to change some bits of a microcontroller's Outputs, but the register to write to is a byte, you do something like this (pseudocode):
char newOut = OutRegister & 0b00011111 //clear 3 msb's
newOut = newOut | 0b10100000 //write '101' to the 3 msb's
OutRegister = newOut //Update Outputs
Of course, many microcontrollers allow you to change each bit individually...
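The same read-modify-write pattern as a runnable sketch, with the register modeled as a plain 8-bit integer. The function name write_top3 is invented for this example; on real hardware the read and write would go through the actual register.

```python
def write_top3(reg, bits):
    """Return reg with its 3 most significant bits replaced by `bits` (0..7)."""
    cleared = reg & 0b00011111            # clear the 3 MSBs, keep the low 5 bits
    return cleared | ((bits & 0b111) << 5)  # place the new 3-bit value on top


print(format(write_top3(0b01110010, 0b101), "08b"))
```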
I've seen them used in role based access control systems.
There is a real world use in my question here -
Respond to only the first WM_KEYDOWN notification?
When consuming a WM_KEYDOWN message in the Windows C API, bit 30 of lParam specifies the previous key state. The value is 1 if the key is down before the message is sent, or 0 if the key is up.
They are mostly used for bitwise operations (surprise). Here are a few real-world examples found in the PHP codebase.
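Pulling out that one bit is a shift and a mask. This sketch works on a plain integer standing in for the message's lParam value; the function name was_key_down is invented, and no actual Windows API is called.

```python
def was_key_down(lparam):
    """True if bit 30 (previous key state) is set in the given lParam value."""
    return bool((lparam >> 30) & 1)


print(was_key_down(0x40000001))   # bit 30 set: the key was already down (auto-repeat)
print(was_key_down(0x00000001))   # bit 30 clear: first press
```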
Character encoding:
if (s <= 0 && (c & ~MBFL_WCSPLANE_MASK) == MBFL_WCSPLANE_KOI8R) {
Data structures:
ar_flags = other->ar_flags & ~SPL_ARRAY_INT_MASK;
Database drivers:
dbh->transaction_flags &= ~(PDO_TRANS_ACCESS_MODE^PDO_TRANS_READONLY);
Compiler implementation:
opline->extended_value = (opline->extended_value & ~ZEND_FETCH_CLASS_MASK) | ZEND_FETCH_CLASS_INTERFACE;
I've seen it in a few game development books as a more efficient way to multiply and divide.
2 << 3 == 2 * 8
32 >> 4 == 32 / 16
When I first started C programming, I understood truth tables and all that, but it didn't all click with how to actually use them until I read this article http://www.gamedev.net/reference/articles/article1563.asp (which gives real-life examples).
I don't think this counts as bitwise, but Ruby's Array defines set operations through the normal integer bitwise operators. So [1,2,4] & [1,2,3] # => [1,2]. Similarly a | b #=> union. Set difference is a - b; Array does not define ^, though Set defines it as the symmetric difference.