Is this defined by the language? Is there a defined maximum? Is it different in different browsers?
JavaScript has two number types: Number and BigInt.
The most frequently-used number type, Number, is a 64-bit floating point IEEE 754 number.
The largest exact integral value of this type is Number.MAX_SAFE_INTEGER, which is:
2^53 - 1, or
+/- 9,007,199,254,740,991, or
nine quadrillion seven trillion one hundred ninety-nine billion two hundred fifty-four million seven hundred forty thousand nine hundred ninety-one
To put this in perspective: one quadrillion bytes is a petabyte (or one thousand terabytes).
"Safe" in this context refers to the ability to represent integers exactly and to correctly compare them.
From the spec:
Note that all the positive and negative integers whose magnitude is no
greater than 2^53 are representable in the Number type (indeed, the
integer 0 has two representations, +0 and -0).
To safely use integers larger than this, you need to use BigInt, which has no upper bound.
Note that the bitwise operators and shift operators operate on 32-bit integers, so in that case, the max safe integer is 2^31 - 1, or 2,147,483,647.
const log = console.log
var x = 9007199254740992
var y = -x
log(x == x + 1) // true !
log(y == y - 1) // also true !
// Arithmetic operators work, but bitwise/shifts only operate on int32:
log(x / 2) // 4503599627370496
log(x >> 1) // 0
log(x | 1) // 1
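For contrast, here is a minimal BigInt sketch (my addition, not part of the original snippet); BigInt keeps exact integer precision, so the same comparisons behave as expected:
const xb = 9007199254740992n
console.log(xb === xb + 1n)  // false
console.log(xb + 1n)         // 9007199254740993n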
Technical note on the subject of the number 9,007,199,254,740,992: There is an exact IEEE-754 representation of this value, and you can assign and read this value from a variable, so for very carefully chosen applications in the domain of integers less than or equal to this value, you could treat this as a maximum value.
In the general case, you must treat this IEEE-754 value as inexact, because it is ambiguous whether it is encoding the logical value 9,007,199,254,740,992 or 9,007,199,254,740,993.
>= ES6:
Number.MIN_SAFE_INTEGER;
Number.MAX_SAFE_INTEGER;
<= ES5
From the reference:
Number.MAX_VALUE;
Number.MIN_VALUE;
console.log('MIN_VALUE', Number.MIN_VALUE);
console.log('MAX_VALUE', Number.MAX_VALUE);
console.log('MIN_SAFE_INTEGER', Number.MIN_SAFE_INTEGER); //ES6
console.log('MAX_SAFE_INTEGER', Number.MAX_SAFE_INTEGER); //ES6
It is 2^53 == 9,007,199,254,740,992. This is because Numbers are stored as double-precision floating point, with a 52-bit mantissa (plus an implicit leading bit).
The min value is -2^53.
This leads to some fun behaviour:
Math.pow(2, 53) == Math.pow(2, 53) + 1
>> true
And it can also be dangerous :)
var MAX_INT = Math.pow(2, 53); // 9 007 199 254 740 992
for (var i = MAX_INT; i < MAX_INT + 2; ++i) {
// infinite loop
}
Further reading: http://blog.vjeux.com/2010/javascript/javascript-max_int-number-limits.html
In JavaScript, there is a number called Infinity.
Examples:
(Infinity>100)
=> true
// Also worth noting
Infinity - 1 == Infinity
=> true
Math.pow(2,1024) === Infinity
=> true
This may be sufficient for some questions regarding this topic.
Jimmy's answer correctly represents the continuous JavaScript integer spectrum as -9007199254740992 to 9007199254740992 inclusive (sorry 9007199254740993, you might think you are 9007199254740993, but you are wrong!
Demonstration below or in jsfiddle).
console.log(9007199254740993);
However, there is no answer that finds/proves this programmatically (other than the one CoolAJ86 alluded to in his answer that would finish in 28.56 years ;), so here's a slightly more efficient way to do that (to be precise, it's more efficient by about 28.559999999968312 years :), along with a test fiddle:
/**
* Checks if adding/subtracting one to/from a number yields the correct result.
*
* @param number The number to test
* @return true if you can add/subtract 1, false otherwise.
*/
var canAddSubtractOneFromNumber = function(number) {
var numMinusOne = number - 1;
var numPlusOne = number + 1;
return ((number - numMinusOne) === 1) && ((number - numPlusOne) === -1);
}
//Find the highest number
var highestNumber = 3; //Start with an integer 1 or higher
//Get a number higher than the valid integer range
while (canAddSubtractOneFromNumber(highestNumber)) {
highestNumber *= 2;
}
//Find the lowest number you can't add/subtract 1 from
var numToSubtract = highestNumber / 4;
while (numToSubtract >= 1) {
while (!canAddSubtractOneFromNumber(highestNumber - numToSubtract)) {
highestNumber = highestNumber - numToSubtract;
}
numToSubtract /= 2;
}
//And there was much rejoicing. Yay.
console.log('HighestNumber = ' + highestNumber);
Many earlier answers have shown 9007199254740992 === 9007199254740992 + 1 is true to verify that 9,007,199,254,740,991 is the maximum safe integer.
But what if we keep adding:
input: 9007199254740992 + 1 output: 9007199254740992 // expected: 9007199254740993
input: 9007199254740992 + 2 output: 9007199254740994 // expected: 9007199254740994
input: 9007199254740992 + 3 output: 9007199254740996 // expected: 9007199254740995
input: 9007199254740992 + 4 output: 9007199254740996 // expected: 9007199254740996
We can see that among numbers greater than 9,007,199,254,740,992, only even numbers are representable.
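A quick way to check this at the console is Number.isSafeInteger, shown here as a small illustrative sketch:
console.log(Number.isSafeInteger(9007199254740991)); // true  (2^53 - 1)
console.log(Number.isSafeInteger(9007199254740992)); // false (2^53)
console.log(9007199254740993 % 2); // 0 - the odd literal has already been rounded to an even double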
Here is a brief explanation of how the double-precision 64-bit binary format works. Let's see how 9,007,199,254,740,992 is held (represented) in this binary format.
Using an abbreviated form to demonstrate it, starting from 4,503,599,627,370,496:
1 . 0000 ---- 0000 * 2^52 => 1 0000 ---- 0000.
|-- 52 bits --| |exponent part| |-- 52 bits --|
On the left side of the arrow, we have bit value 1, and an adjacent radix point. By consuming the exponent part on the left, the radix point is moved 52 steps to the right. The radix point ends up at the end, and we get 4503599627370496 in pure binary.
Now let's keep incrementing the fraction part by 1 until all the bits are set to 1, which equals 9,007,199,254,740,991 in decimal.
1 . 0000 ---- 0000 * 2^52 => 1 0000 ---- 0000.
(+1)
1 . 0000 ---- 0001 * 2^52 => 1 0000 ---- 0001.
(+1)
1 . 0000 ---- 0010 * 2^52 => 1 0000 ---- 0010.
(+1)
.
.
.
1 . 1111 ---- 1111 * 2^52 => 1 1111 ---- 1111.
Because the 64-bit double-precision format strictly allots 52 bits for the fraction part, no more bits are available if we add another 1, so all we can do is set all the fraction bits back to 0 and adjust the exponent part:
┏━━▶ This bit is implicit and persistent.
┃
1 . 1111 ---- 1111 * 2^52 => 1 1111 ---- 1111.
|-- 52 bits --| |-- 52 bits --|
(+1)
1 . 0000 ---- 0000 * 2^52 * 2 => 1 0000 ---- 0000. * 2
|-- 52 bits --| |-- 52 bits --|
(By consuming the 2^52, radix
point has no way to go, but
there is still one 2 left in
exponent part)
=> 1 . 0000 ---- 0000 * 2^53
|-- 52 bits --|
Now we get 9,007,199,254,740,992, and for numbers greater than it, the format can only handle increments of 2, because every increment of 1 in the fraction part ends up being multiplied by the extra 2 left in the exponent part. That's why the double-precision 64-bit binary format cannot hold odd numbers once the number is greater than 9,007,199,254,740,992:
(consume 2^52 to move radix point to the end)
1 . 0000 ---- 0001 * 2^53 => 1 0000 ---- 0001. * 2
|-- 52 bits --| |-- 52 bits --|
Following this pattern, once the number is greater than 9,007,199,254,740,992 * 2 = 18,014,398,509,481,984, only multiples of 4 can be held:
input: 18014398509481984 + 1 output: 18014398509481984 // expected: 18014398509481985
input: 18014398509481984 + 2 output: 18014398509481984 // expected: 18014398509481986
input: 18014398509481984 + 3 output: 18014398509481984 // expected: 18014398509481987
input: 18014398509481984 + 4 output: 18014398509481988 // expected: 18014398509481988
How about numbers between [ 2 251 799 813 685 248, 4 503 599 627 370 496 )?
1 . 0000 ---- 0001 * 2^51 => 1 0000 ---- 000.1
|-- 52 bits --| |-- 52 bits --|
The value 0.1 in binary is exactly 2^-1 (=1/2) (=0.5)
So when the number is less than 4,503,599,627,370,496 (2^52), there is one bit available below the radix point, so halves of an integer can be represented:
input: 4503599627370495.5 output: 4503599627370495.5
input: 4503599627370495.75 output: 4503599627370495.5
For numbers less than 2,251,799,813,685,248 (2^51), there are two bits available below the radix point, so quarters can be represented:
input: 2251799813685246.75 output: 2251799813685246.8 // expected: 2251799813685246.75
input: 2251799813685246.25 output: 2251799813685246.2 // expected: 2251799813685246.25
input: 2251799813685246.5 output: 2251799813685246.5
/**
Please note that if you try this yourself and, say, log
these numbers to the console, they will get rounded. JavaScript
rounds if the number of digits exceeds 17. The value
is internally held correctly:
*/
input: 2251799813685246.25.toString(2)
output: "111111111111111111111111111111111111111111111111110.01"
input: 2251799813685246.75.toString(2)
output: "111111111111111111111111111111111111111111111111110.11"
input: 2251799813685246.78.toString(2)
output: "111111111111111111111111111111111111111111111111110.11"
And what is the available range of the exponent part? The format allots 11 bits for it.
From Wikipedia (for more details, go there): the stored exponent e is biased by 1023, so the actual scale factor is 2^(e - 1023).
So to make the exponent part 2^52, we need to set e = 1075 exactly.
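If you want to see those exponent bits directly, here is a small sketch (my own, not from the original explanation) that reads the raw IEEE-754 encoding with a DataView:
const buf = new ArrayBuffer(8);
const view = new DataView(buf);
view.setFloat64(0, 9007199254740991);      // 2^53 - 1, i.e. 1.111...1 * 2^52
const hi = view.getUint32(0);              // sign bit + 11 exponent bits + top 20 fraction bits
console.log((hi >>> 20) & 0x7FF);          // 1075 - the biased exponent e
console.log(((hi >>> 20) & 0x7FF) - 1023); // 52 - the actual power of two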
To be safe
var MAX_INT = 4294967295;
Reasoning
I thought I'd be clever and find the value at which x + 1 === x with a more pragmatic approach.
My machine can only count 10 million per second or so... so I'll post back with the definitive answer in 28.56 years.
If you can't wait that long, I'm willing to bet that
Most of your loops don't run for 28.56 years
9007199254740992 === Math.pow(2, 53) + 1 is proof enough
You should stick to 4294967295, which is Math.pow(2, 32) - 1, to avoid issues with bit-shifting
Finding x + 1 === x:
(function () {
"use strict";
var x = 0
, start = new Date().valueOf()
;
while (x + 1 != x) {
if (!(x % 10000000)) {
console.log(x);
}
x += 1
}
console.log(x, new Date().valueOf() - start);
}());
The short answer is “it depends.”
If you’re using bitwise operators anywhere (or if you’re referring to the length of an Array), the ranges are:
Unsigned: 0…(-1>>>0)
Signed: (-(-1>>>1)-1)…(-1>>>1)
(It so happens that the bitwise operators and the maximum length of an array are restricted to 32-bit integers.)
If you’re not using bitwise operators or working with array lengths:
Signed: (-Math.pow(2,53))…(+Math.pow(2,53))
These limitations are imposed by the internal representation of the “Number” type, which generally corresponds to IEEE 754 double-precision floating-point representation. (Note that unlike typical signed integers, the magnitude of the negative limit is the same as the magnitude of the positive limit, due to characteristics of the internal representation, which actually includes a negative 0!)
ECMAScript 6:
Number.MAX_SAFE_INTEGER = Math.pow(2, 53)-1;
Number.MIN_SAFE_INTEGER = -Number.MAX_SAFE_INTEGER;
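For environments that predate ES6, a guarded sketch along the same lines (purely illustrative) would be:
if (typeof Number.MAX_SAFE_INTEGER === 'undefined') {
    Number.MAX_SAFE_INTEGER = Math.pow(2, 53) - 1;
    Number.MIN_SAFE_INTEGER = -Number.MAX_SAFE_INTEGER;
}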
Others may have already given the generic answer, but I thought it would be a good idea to give a fast way of determining it:
for (var x = 2; x + 1 !== x; x *= 2);
console.log(x);
Which gives me 9007199254740992 within less than a millisecond in Chrome 30.
It tests powers of 2 to find the first one that, when 1 is 'added' to it, equals itself.
Anything you want to use for bitwise operations must be between 0x80000000 (-2147483648 or -2^31) and 0x7fffffff (2147483647 or 2^31 - 1).
The console will tell you that 0x80000000 equals +2147483648, but 0x80000000 & 0x80000000 equals -2147483648.
JavaScript has received a new data type in ECMAScript 2020: BigInt. It introduced numerical literals having an "n" suffix and allows for arbitrary precision:
var a = 123456789012345678901012345678901n;
Precision will still be lost, of course, when such a big integer is (perhaps unintentionally) coerced to a Number data type.
And, obviously, there will always be precision limitations due to finite memory, and a cost in terms of time in order to allocate the necessary memory and to perform arithmetic on such large numbers.
For instance, generating a number with a hundred thousand decimal digits will take a noticeable delay before completion:
console.log(BigInt("1".padEnd(100000,"0")) + 1n)
...but it works.
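The coercion mentioned above can be seen directly; a small sketch:
const big = 123456789012345678901012345678901n;
console.log(big.toString()); // exact digits preserved
console.log(Number(big));    // rounded - only about 15-17 significant digits survive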
Try:
maxInt = -1 >>> 1
In Firefox 3.6 it's 2^31 - 1.
I did a simple test with a formula, X-(X+1)=-1, and the largest value of X I can get to work on Safari, Opera and Firefox (tested on OS X) is 9e15. Here is the code I used for testing:
javascript: alert(9e15-(9e15+1));
I write it like this:
var max_int = 0x20000000000000;
var min_int = -0x20000000000000;
(max_int + 1) === 0x20000000000000; //true
(max_int - 1) < 0x20000000000000; //true
Same for int32
var max_int32 = 0x80000000;
var min_int32 = -0x80000000;
Let's get to the sources
Description
The MAX_SAFE_INTEGER constant has a value of 9007199254740991 (9,007,199,254,740,991 or ~9 quadrillion). The reasoning behind that number is that JavaScript uses double-precision floating-point format numbers as specified in IEEE 754 and can only safely represent numbers between -(2^53 - 1) and 2^53 - 1.
Safe in this context refers to the ability to represent integers exactly and to correctly compare them. For example, Number.MAX_SAFE_INTEGER + 1 === Number.MAX_SAFE_INTEGER + 2 will evaluate to true, which is mathematically incorrect. See Number.isSafeInteger() for more information.
Because MAX_SAFE_INTEGER is a static property of Number, you always use it as Number.MAX_SAFE_INTEGER, rather than as a property of a Number object you created.
In JavaScript, the maximum safely representable integer is 2^53 - 1.
However, bitwise operations are calculated on 32 bits (4 bytes), meaning that if you exceed 32-bit shifts you will start losing bits.
In Google Chrome's built-in JavaScript, you can go up to approximately 2^1024 before the number is called Infinity.
Scato wrote:
anything you want to use for bitwise operations must be between
0x80000000 (-2147483648 or -2^31) and 0x7fffffff (2147483647 or 2^31 -
1).
the console will tell you that 0x80000000 equals +2147483648, but
0x80000000 & 0x80000000 equals -2147483648
Hexadecimal literals are unsigned positive values, so 0x80000000 = 2147483648 - that's mathematically correct. If you want to make it a signed value, you have to right-shift it: 0x80000000 >> 0 = -2147483648. You can write 1 << 31 instead, too.
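A small sketch of that coercion at the console:
console.log(0x80000000);      // 2147483648  - a hex literal is just a positive Number
console.log(0x80000000 >> 0); // -2147483648 - coerced to a signed 32-bit integer
console.log(1 << 31);         // -2147483648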
Firefox 3 doesn't seem to have a problem with huge numbers.
1e+200 * 1e+100 will calculate fine to 1e+300.
Safari seems to have no problem with it as well. (For the record, this is on a Mac if anyone else decides to test this.)
Unless I lost my brain at this time of day, this is way bigger than a 64-bit integer.
Node.js and Google Chrome both seem to use 64-bit floating-point values (whose largest finite value is just under 2^1024), so:
Number.MAX_VALUE = 1.7976931348623157e+308
Related
Consider the following code:
int i = 1;
System.out.println("1 binary: " + Long.toBinaryString(i));
long ri = Long.reverse(i);
System.out.println("1 reverse bit decimal: " + ri);
System.out.println("1 reverse bit binary: "+ Long.toBinaryString(ri));
BigInteger bil = new BigInteger(1, Longs.toByteArray(ri));
System.out.println("1 Sign-Magnitude BigInteger toString: " + bil.toString());
The output is:
1 binary: 1
1 reverse bit decimal: -9223372036854775808
1 reverse bit binary: 1000000000000000000000000000000000000000000000000000000000000000
1 Sign-Magnitude BigInteger toString: 9223372036854775808
Can anyone help to explain why the value of "1 Sign-Magnitude BigInteger toString:" is 9223372036854775808 (2^63)?
To get the sign-magnitude of a value, you simply take its absolute value as magnitude and remember the sign in a separate bit (or byte).
The sign-magnitude representation of, say, 722 is simply:
sign = 0
magnitude = 722
The sign magnitude of -722 is simply:
sign = 1
magnitude = 722
That is also what BigInteger uses.
Your code reverses a value, which means that, say, the 8-bit value 00000001 (1) is changed into 10000000 (128, or 2^7). That is not the same as inverting, which turns e.g. 00000001 (1) into 11111110 (254); that is what one's complement does. The generally used two's complement negates 00000001 (1) into 11111111 (255, i.e. 256 - 1).

You should read up on two's complement, which takes some understanding. Sign-magnitude is, however, very easy to understand (but not always very practical -- addition and subtraction are different for signed and unsigned, etc. -- and that is why most processors use two's complement).
So again: sign-magnitude works like this:
sign = (n < 0)
magnitude = abs(n)
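For illustration only, the same distinctions sketched in JavaScript (the question's code is Java); bit reversal, which is what Long.reverse does, is neither inversion nor negation:
const v = 0b00000001;
// reverse the 8 bits (what Long.reverse does, scaled down to a byte)
const reversed = parseInt(v.toString(2).padStart(8, '0').split('').reverse().join(''), 2);
console.log(reversed);  // 128 (10000000)
console.log(~v & 0xFF); // 254 (11111110) - one's complement
console.log(-v & 0xFF); // 255 (11111111) - two's complement, i.e. -1 in 8 bits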
I have a file that defines a set of tiles (used in an online game). The format for each tile is as follows:
x: 12 bits
y: 12 bits
tile: 8 bits
32 bits in total, so each tile can be expressed as a 32 bit integer.
More info about the file format can be found here:
http://wiki.minegoboom.com/index.php/LVL_Format
http://www.rarefied.org/subspace/lvlformat.html
The fields in the 4-byte structure are not broken along byte boundaries. As you can see, x: and y: are both defined as 12 bits, i.e. x is stored in 1.5 bytes, y is stored in 1.5 bytes, and tile is stored in 1 byte.
Even though x and y use 12 bits their max value is 1023, so they could be expressed in 10 bits. This was down to the creator of the format. I guess they were just padding things out so they could use a 32-bit integer for each tile? Either way, for x and y we can ignore the final 2 bits.
I'm using a nodejs Buffer to read the file and I'm using the following code to read the values.
var n = tileBuffer.readUInt32LE(0);
var x = n & 0x03FF;
var y = (n >> 12) & 0x03FF;
var tile = (n >> 24) & 0x00ff;
This code works fine but when I read the bits themselves, in an attempt to understand binary better, I see something that confuses me.
Take, for example a int that expresses the following:
x: 1023
y: 1023
tile: 1
Creating the tiles in a map editor and reading the resulting file into a buffer returns <Buffer ff f3 3f 01>
When I convert each byte into a string of bits I get the following:
ff = 11111111
f3 = 11110011
3f = 00111111
01 = 00000001
11111111 11110011 00111111 00000001
I assume I should just take the first 12 bits as x but chop off the last 2 bits. Use the next 12 bits as y, chopping off 2 bits again, and the remaining 8 bits would be the tile.
x: 1111111111
y: 0011001111
tile: 00000001
The x is correct (1111111111 = 1023), the y is wrong (0011001111 = 207, not 1023), and tile is correct (00000001 = 1)
I'm confused and obviously missing something.
It makes more sense to look at it in this order: (this would be the binary representation of n)
00000001 00111111 11110011 11111111
On that order, you can easily do the masking and shifting visually.
The problem with what you did is that for example in 11111111 11110011, the bits of the second byte that belong to the first field are at the right (the lowest part of that byte), which in that order is discontinuous.
Also, masking with 0x03FF makes those first two fields have 10 bits, with two bits just disappearing. You can make them 12 bits by masking with 0x0FFF. As it is now, you effectively have two padding bits.
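A sketch of the read with the full 12-bit masks, using the sample buffer from the question:
var tileBuffer = Buffer.from([0xff, 0xf3, 0x3f, 0x01]);
var n = tileBuffer.readUInt32LE(0);
var x = n & 0x0FFF;           // low 12 bits  -> 1023
var y = (n >>> 12) & 0x0FFF;  // next 12 bits -> 1023
var tile = (n >>> 24) & 0xFF; // top 8 bits   -> 1
console.log(x, y, tile);      // 1023 1023 1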
In MSACCESS VBA, I convert a HEX string to decimal by prefixing the string with "&h"
?CLng("&h1234")
4660
?CLng("&h80000000")
-2147483648
What should I do to convert it to an unsigned integer?
Using CDbl doesn't work either:
?CDbl("&h80000000")
-2147483648
Your version seems like the best answer, but can be shortened a bit:
Function Hex2Dbl(h As String) As Double
Hex2Dbl = CDbl("&h0" & h) ' Overflow Error if more than 2 ^ 64
If Hex2Dbl < 0 Then Hex2Dbl = Hex2Dbl + 4294967296# ' 16 ^ 8 = 4294967296
End Function
Double will have rounding precision error for most values above 2 ^ 53 - 1 (about 16 decimal digits), but Decimal can be used for values up to 16 ^ 12 - 1 (Decimal uses 16 bytes, but only 12 of them for the number)
Function Hex2Dec(h)
Dim L As Long: L = Len(h)
If L < 16 Then ' CDec results in Overflow error for hex numbers above 16 ^ 8
Hex2Dec = CDec("&h0" & h)
If Hex2Dec < 0 Then Hex2Dec = Hex2Dec + 4294967296# ' 2 ^ 32
ElseIf L < 25 Then
Hex2Dec = Hex2Dec(Left$(h, L - 9)) * 68719476736# + CDec("&h" & Right$(h, 9)) ' 16 ^ 9 = 68719476736
End If
End Function
If you want to go higher than 2^31 you could use Decimal or LongLong. LongLong and CLngLng only work on 64-bit platforms though. Since I only have 32-bit Office at the moment, this is for Decimal and CDec.
There seems to be an issue when converting 8-digit hex numbers, because apparently a signed 32-bit integer is used somewhere in the process, which results in a sign mistake even though Decimal could handle the number.
'only for positive numbers
Function myHex2Dec(hexString As String) As Variant
'cut off "&h" if present
If Left(hexString, 2) = "&h" Or Left(hexString, 2) = "&H" Then hexString = Mid(hexString, 3)
'cut off leading zeros
While Left(hexString, 1) = "0"
hexString = Mid(hexString, 2)
Wend
myHex2Dec = CDec("&h" & hexString)
'correct value for 8 digits only
If myHex2Dec < 0 And Len(hexString) = 8 Then
myHex2Dec = CDec("&h1" & hexString) - 4294967296#
'cause overflow for 16 digits
ElseIf myHex2Dec < 0 Then
Error (6) 'overflow
End If
End Function
Test:
Sub test()
Dim v As Variant
v = CDec("&H80000000") '-2147483648
v = myHex2Dec("&H80000000") '2147483648
v = CDec("&H7FFFFFFFFFFFFFFF") '9223372036854775807
v = myHex2Dec("&H7FFFFFFFFFFFFFFF") '9223372036854775807
v = CDec("&H8000000000000000") '-9223372036854775808
v = myHex2Dec("&H8000000000000000") 'overflow
End Sub
With the remark of @arcadeprecinct I was able to create a function for it:
Function Hex2UInt(h As String) As Double
Dim dbl As Double: dbl = CDbl("&h" & h)
If dbl < 0 Then
dbl = CDbl("&h1" & h) - 4294967296#
End If
Hex2UInt = dbl
End Function
Some example output:
?Hex2UInt("1234")
4660
?Hex2UInt("80000000")
2147483648
?Hex2UInt("FFFFFFFFFFFF")
281474976710655
Maximum value to represent as an integer is 0x38D7EA4C67FFF
?Hex2UInt("38D7EA4C67FFF")
999999999999999
?Hex2UInt("38D7EA4C68000")
1E+15
A proposal, with the result in h:
sh = "&H80000000"
h = CDbl(sh)
If h < 0 Then
fd = Hex$(CDbl(Left(sh, 3)) - 8)
sh = "&h" & fd & Mid(sh, 4)
h = CDbl(sh) + 2 ^ 31
End If
I found my way here looking for a Word VBA solution, but what I've discovered might also apply to other Office apps. I realise that this is a very old question and that there are some ingenious solutions to it, but I'm surprised that nobody has explained what it is that seems to be the root cause of the problem, and hence what might possibly be a one-line solution in many cases. When I was an assembly language programmer in the 1970s, working more in binary and octal than anything else, this was a very common issue, known as "2s complement".
I'll explain it in its simplest form, from first principles, by the way it works on a byte, so that it's understandable even by absolute beginners.
Normally, the most significant bit is bit-7 at the left which has a value of 128, the least significant bit is bit-0 at the right which has a value of 1. Therefore, the highest possible value if all bits are set is 255. However in 2s complement, bit-7 is the "sign bit". This only leaves the seven bits from 0 to 6 to hold the actual value, giving them a maximum value of 127. The sign bit has a value of -128. If all 8 bits are set, the byte value becomes (-128 + 127) which gives the negative decimal value of -1. The 2s complement range of values for 8 bits is from -128 (with only bit-7 set) to +127 (with only bits 0 to 6 set). If the sign bit is set, the value of the byte is -128 plus the positive value of whatever is stored in bits 0 to 6. E.g. binary 11111101 = hex FD = decimal (-128 + 125) = -3, 10110100 = hex B4 = decimal (-128 + 52) = -76.
2s complement applies the same effect at each increasing 8-bit boundary, thus for 16 bits, the sign bit is bit-15 (with a value of -32,768) and the positive value is in bits 0 to 14, giving a 16-bit range of values from -32768 to 32767. Similarly, the 24-bit range is from -8388608 to 8388607, and so on.
I recently encountered this conversion problem in some code that was converting hexadecimal RGB colour values which originated as a 6-character text string in a Word document. Having successfully processed tens of thousands of these I was suddenly presented with an "out of range" error pop-up. The string that had caused the problem was "008080". The command ... = Val("&H" + variable) had converted this to -32896, an invalid value to pass as a colour property. The Val() function had removed the leading zeros and treated 8080 as a signed 2s complement 16-bit value.
In my case the solution was simple. Because I know that I'll always be dealing with 24-bit, 6-character hex values, I just added an extra "1" text character to the front of the hex code (thus making it longer than 16 bits), then, in effect, subtracted the same value. So, with the original 6-character hex RGB code held in the variable HexCode, I get the right decimal result using the command
DecCode = Val("&H" + "1" + HexCode) - Val("&H" + "1000000")
Problem solved, by just adding a little extra code to an existing line. I hope that my explanation of the cause of the problem helps others to devise their own solutions where it's appropriate.
Is there an elegant way of moving a bit within a byte (or word/long)? For simplicity, let's use a simple 8-bit byte and just one bit to move within the byte.
Given a bit number, based on 0-7 Least-sig-bit to most-sig-bit, (or bits 1-8 if you'd rather), I would like to move a bit from one position to another:
7654 3210 <bit position
0101 1010 <some binary value
--x- --y- <move bit from x to y
0111 0100 <new value with x moved to y and intervening bits shifted left
So, x at bit position 5 moves to y at bit position 1; bits 0, 6, 7 stay unchanged. Bits 1, 2, 3, 4 are shifted left to 'make room' for the bit moved from 5 to 1. This is just an example.
It is important that the bit moves, not swaps with its target. There are numerous examples of bits that swap, but that is quite trivial.
The solution ideally would use simple bit-twiddling and bitwise operators. Assume it's language-agnostic, but simple AND/OR/XOR, NOT, SHIFT left/right/ROTATE or similar instructions would be fine in any combination, plus any other basic arithmetic operator, e.g. mod, addition/subtraction, etc. Even working pseudo-code would be OK. Alternatively, a bit array or bitfield type structure would probably be straightforward.
In addition to the actual bit move, I would like to find a way to :
Move any bit up or down.
Specify the bit number source/destination in any convenient format: eg: 6>2
implies shift down, 3>7 shift up or start-bit +/- offset: 6-4 or 3+4, or bit weighted: bit 6=64 to bit 3=8.
Possibly extendable from byte to unsigned int, long, etc.
(Ideally, be extendable
to more than one bit at a time, probably adjacent bits if easier)
Performance is not a major issue, but something elegant is likely to be plenty fast enough.
My own naive approach would be to identify the source and target bit positions, decide whether to shift up or down, take a shifted copy, mask off the static bits and find the source bit, merge the static and shifted bits, and somehow set/clear the target bit. However, while the theory seems good, an elegant implementation is beyond me.
I realise that a precompiled lookup table could be built for a byte, but if this is to be extended to integers/longs, this would be impractical for me.
Any help appreciated. Thanks in advance.
First, an observation about the original problem, and the subsequent extensions that you mention:
The "moving a bit" operation that you describe is really a rotation of a contiguous range of bits. In your example, you are rotating bits 1-5 inclusive, by one bit to the left:
7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+
| 0 | 1 | 0<--1<--1<--0<--1 | 0 | -> | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 |
+---+---+-|-+---+---+---+-^-+---+ +---+---+---+---+---+---+---+---+
| |
+---------------+
If you consider a more general form of this operation to be "rotate a range of bits left by some amount" with three parameters:
the least significant bit to include in the rotation
the most significant bit to include in the rotation
the number of bits to rotate by
then it becomes a single basic primitive which can perform all of the things you want to do:
you can obviously move any bit (choose appropriate least/most significant bit parameters);
you can rotate left or right, because if you are rotating a range of n bits, then a rotation right by k bits is the same thing as a rotation left by n - k bits;
it trivially generalises to any bit width;
by definition we can rotate by more than one bit at a time.
So now, all that's needed is to construct this primitive...
To start with, we're almost certainly going to need a bit mask for the bits we care about.
We can form a mask for bits 0 - n by shifting a 1 by n + 1 bits to the left, then subtracting 1. e.g. a mask for bits 0-5 would be (in binary):
00111111
...which can be formed by taking a 1:
00000001
...shifting 5+1 = 6 bits to the left:
01000000
...and subtracting 1 to give:
00111111
In C, this would be (1 << (bit + 1)) - 1. But there is a subtlety here, for C at least (and I apologise for the digression when you've tagged this as language-agnostic, but this is important, and there are probably similar issues in other languages too): a shift by the width of your type (or more) leads to undefined behaviour. So if we were trying to construct a mask for bits 0-7 for an 8-bit type, the calculation would be (1 << 8) - 1, which would be undefined. (It might work on some systems and some compilers, but wouldn't be portable.) There are also undefined behaviour issues with signed types in the case where you would end up shifting into the sign bit.
Fortunately, in C, we can avoid these problems by using an unsigned type, and writing the expression as (1 << bit) + (1 << bit) - 1. Arithmetic with unsigned n-bit values is defined by the standard to be reduced modulo 2^n, and all of the individual operations are well-defined, so we're guaranteed to get the right answer.
(End of digression.)
OK, so now we have a mask for bits 0 - msb. We want to make a mask for bits lsb - msb, which we can do by subtracting the mask for bits 0 - (lsb-1), which is (1 << lsb) - 1. e.g.
00111111 mask for bits 0-5: (1 << 5) + (1 << 5) - 1
- 00000001 mask for bits 0-0: (1 << 1) - 1
-------- -------------------------------
00111110 mask for bits 1-5: (1 << 5) + (1 << 5) - (1 << 1)
So the final expression for the mask is:
mask = (1 << msb) + (1 << msb) - (1 << lsb);
The bits to be rotated can be selected by a bitwise AND with the mask:
to_rotate = value & mask;
...and the bits that will be left untouched can be selected by a AND with the inverted mask:
untouched = value & ~mask;
The rotation itself can be performed easily in two parts: first, we can obtain the leftmost bits of the rotated portion by simply rotating to_rotate left and discarding any bits that fall outside the mask:
left = (to_rotate << shift) & mask;
To get the rightmost bits, rotate to_rotate right by (n - shift) bits, where n is the number of bits we're rotating (this n can be calculated as msb + 1 - lsb):
right = (to_rotate >> (msb + 1 - lsb - shift)) & mask;
The final result can be obtained by combining all the bits from untouched, left, and right:
result = untouched | left | right;
Your original example would work like this (msb is 5, lsb is 1, and shift is 1):
value = 01011010
mask = 00111110 from (1 << 5) + (1 << 5) - (1 << 1)
01011010 value
& 00111110 mask
----------
to_rotate = 00011010
01011010 value
& 11000001 ~mask (i.e. inverted mask)
----------
untouched = 01000000
00110100 to_rotate << 1
& 00111110 mask
----------
left = 00110100
00000001 to_rotate >> 4 (5 + 1 - 1 - 1 = 4)
& 00111110 mask
----------
right = 00000000
01000000 untouched
00110100 left
| 00000000 right
----------
result = 01110100
Here's a different example with a 16-bit input value, msb = 15, lsb = 4, and shift = 4 (which rotates the top 3 hex digits of a 4-digit hex value):
value = 0101011001111000 (0x5678)
mask = 1111111111110000 from (1 << 15) + (1 << 15) - (1 << 4)
0101011001111000 value
& 1111111111110000 mask
------------------
to_rotate = 0101011001110000
0101011001111000 value
& 0000000000001111 ~mask
------------------
untouched = 0000000000001000
0110011100000000 to_rotate << 4
& 1111111111110000 mask
------------------
left = 0110011100000000
0000000001010110 to_rotate >> 8 (15 + 1 - 4 - 4 = 8)
& 1111111111110000 mask
------------------
right = 0000000001010000
0000000000001000 untouched
0110011100000000 left
| 0000000001010000 right
------------------
result = 0110011101011000 = 0x6758
Here's a working implementation in C that is not highly optimised but which might at least serve as a starting point for any further implementations. It works with ints but you can adapt it for any word size, or just use it as is and mask out any unwanted high order bits (e.g. if you are working with individual bytes). I broke the functionality down into two lower level routines for extracting a bit and inserting a bit - these may have other uses, I imagine.
//
// bits.c
//
#include <stdio.h>
#include <stdlib.h>
//
// extract_bit
//
// extract bit at given index and move less significant bits left
//
int extract_bit(int *word, int index)
{
int result = (*word & (1 << index)) != 0;
int mask = (1 << index) + (1 << index) - 1;
*word = ((*word << 1) & mask) | (*word & ~mask);
return result;
}
//
// insert_bit
//
// insert bit at given index and move less significant bits right
//
void insert_bit(int *word, int index, int val)
{
int mask1 = (1 << index) + (1 << index) - 1;
int mask2 = (1 << index) - 1;
*word = ((*word >> 1) & mask2) | (*word & ~mask1) | (val << index);
}
//
// move_bit
//
// move bit from given src index to given dest index
//
int move_bit(int *word, int src_index, int dest_index)
{
int val = extract_bit(word, src_index);
insert_bit(word, dest_index, val);
return val;
}
int main(int argc, char * argv[])
{
if (argc > 2)
{
int test = 0x55555555;
int index1 = atoi(argv[1]);
int index2 = atoi(argv[2]);
printf("test (before) = %#x\n", test);
printf("index (src) = %d\n", index1);
printf("index (dest) = %d\n", index2);
move_bit(&test, index1, index2);
printf("test (after) = %#x\n", test);
}
return 0;
}
This likely doesn't qualify as "elegant," but you might be able to cram it into one line if that is your kind of thing? The plan is to split the number into four pieces (shouldn't be hard with bit operations, right?), do the appropriate things to them, and then put the four pieces back together.
Number: 01x1 10y1
P1 (before x): 0100 0000
P2 (just bit x): 00x0 0000
P3 (between x and y): 0001 10y0
P4 (after y): 0000 0001
Then the number you want is [P1] + [P3 shifted up by 1] + [P2 shifted down by 4] + [P4].
P1: 0100 0000
P2 shifted down by 4: 0000 00x0
P3 shifted up by 1: 0011 0y00
P4: 0000 0001
Sum: 0111 0yx1
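A quick JavaScript sketch of that plan, hard-coding the example's positions (x at bit 5, y at bit 1) purely for illustration:
// move bit 5 down to bit 1 by splitting into the four pieces described above
function moveBit5To1(v) {
    var p1 = v & 0xC0; // bits 6-7, untouched
    var p2 = v & 0x20; // bit 5 (the bit being moved)
    var p3 = v & 0x1E; // bits 1-4, to be shifted up
    var p4 = v & 0x01; // bit 0, untouched
    return p1 | (p3 << 1) | (p2 >> 4) | p4;
}
console.log(moveBit5To1(0b01011010).toString(2)); // "1110100", i.e. 0111 0100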
Are you using bits to conserve space? Is it REALLY needed?
You might be better off with a list class that allows you to remove and insert items in the list. In your case the items would be Booleans.
I'm attempting to write a function in assembly that will detect if a longer binary number contains a smaller binary pattern.
Example:
Does 100111 contain 1001?
When I read this problem I figured that I would do a bitwise-AND with the large number and its smaller pattern while shifting right (logical) each time in a loop.
So, in my head I thought it would do:
100111 AND 1001 = 0
Shift-right 1
010011 AND 1001 = 0
Shift-right 1
001001 AND 1001 = 1 // Pattern FOUND!
and repeat this until either the number was shifted until it was zero or the AND returned 1.
However, I think I must have something confused because this is returning 1 for most things I put in, on the first run of the loop. Am I confused on my usage of AND?
The problem is that "partial matches" also return a non-zero value for your AND check:
100111 AND 001001 = 000001
So this tests if any of the bits match, but you want to make sure all bits are the same. The result of the AND needs to be equal to the pattern you are searching:
x = 100111
if (x AND 1001 == 1001)
print "found"
Bitwise AND does not work the way you expect (judging from the samples, and ignoring the notation, which seems to suggest you are using bitwise AND as the logical AND of bits). AND only takes the bits that are set to 1 "into account". E.g. 1111 AND 1001 == 1001.
You need to use XOR and compare against 0 for a match (remember to mask off the bits you are not comparing from the result). In your example a match is found when (N ^ 1001) & 1111 == 0000
In order to make sure that both the 0 and 1 bits match your search pattern, you need to do something like this:
if ((InputPattern AND SearchMask) == SearchPattern)
{
// then match
}
The SearchMask should be all 1 bits, of a length equal to your SearchPattern. For example, you could have SearchMask == 1111, SearchPattern == 1001.
You should AND and then test against the search pattern:
if ((TestPattern & SearchPattern) == SearchPattern)
{
// then match
}
(where & represents bitwise AND)