How to calculate excess-256 and excess-128? - binary

I have the number -37 in decimal. Could you please tell me what its representation is in excess-256 and excess-128?
Its normal binary representation is -100101 and in 8 bits 1010 0101. How exactly do I then get the excess-N representation for it? Am I allowed to have a number larger than 8 bits when writing excess-N? Would I thus have:
01010 0000 (first digit) + 10000 0000 and 00101 + 10000 00000?

to get it into excess 128, add the true value (-37) to 128. = 91, then get the binary value for 91, 1011011. Same applies for 256 i think.

Related

Fetch request integers change after applying .json() [duplicate]

Is this defined by the language? Is there a defined maximum? Is it different in different browsers?
JavaScript has two number types: Number and BigInt.
The most frequently-used number type, Number, is a 64-bit floating point IEEE 754 number.
The largest exact integral value of this type is Number.MAX_SAFE_INTEGER, which is:
253-1, or
+/- 9,007,199,254,740,991, or
nine quadrillion seven trillion one hundred ninety-nine billion two hundred fifty-four million seven hundred forty thousand nine hundred ninety-one
To put this in perspective: one quadrillion bytes is a petabyte (or one thousand terabytes).
"Safe" in this context refers to the ability to represent integers exactly and to correctly compare them.
From the spec:
Note that all the positive and negative integers whose magnitude is no
greater than 253 are representable in the Number type (indeed, the
integer 0 has two representations, +0 and -0).
To safely use integers larger than this, you need to use BigInt, which has no upper bound.
Note that the bitwise operators and shift operators operate on 32-bit integers, so in that case, the max safe integer is 231-1, or 2,147,483,647.
const log = console.log
var x = 9007199254740992
var y = -x
log(x == x + 1) // true !
log(y == y - 1) // also true !
// Arithmetic operators work, but bitwise/shifts only operate on int32:
log(x / 2) // 4503599627370496
log(x >> 1) // 0
log(x | 1) // 1
Technical note on the subject of the number 9,007,199,254,740,992: There is an exact IEEE-754 representation of this value, and you can assign and read this value from a variable, so for very carefully chosen applications in the domain of integers less than or equal to this value, you could treat this as a maximum value.
In the general case, you must treat this IEEE-754 value as inexact, because it is ambiguous whether it is encoding the logical value 9,007,199,254,740,992 or 9,007,199,254,740,993.
>= ES6:
Number.MIN_SAFE_INTEGER;
Number.MAX_SAFE_INTEGER;
<= ES5
From the reference:
Number.MAX_VALUE;
Number.MIN_VALUE;
console.log('MIN_VALUE', Number.MIN_VALUE);
console.log('MAX_VALUE', Number.MAX_VALUE);
console.log('MIN_SAFE_INTEGER', Number.MIN_SAFE_INTEGER); //ES6
console.log('MAX_SAFE_INTEGER', Number.MAX_SAFE_INTEGER); //ES6
It is 253 == 9 007 199 254 740 992. This is because Numbers are stored as floating-point in a 52-bit mantissa.
The min value is -253.
This makes some fun things happening
Math.pow(2, 53) == Math.pow(2, 53) + 1
>> true
And can also be dangerous :)
var MAX_INT = Math.pow(2, 53); // 9 007 199 254 740 992
for (var i = MAX_INT; i < MAX_INT + 2; ++i) {
// infinite loop
}
Further reading: http://blog.vjeux.com/2010/javascript/javascript-max_int-number-limits.html
In JavaScript, there is a number called Infinity.
Examples:
(Infinity>100)
=> true
// Also worth noting
Infinity - 1 == Infinity
=> true
Math.pow(2,1024) === Infinity
=> true
This may be sufficient for some questions regarding this topic.
Jimmy's answer correctly represents the continuous JavaScript integer spectrum as -9007199254740992 to 9007199254740992 inclusive (sorry 9007199254740993, you might think you are 9007199254740993, but you are wrong!
Demonstration below or in jsfiddle).
console.log(9007199254740993);
However, there is no answer that finds/proves this programatically (other than the one CoolAJ86 alluded to in his answer that would finish in 28.56 years ;), so here's a slightly more efficient way to do that (to be precise, it's more efficient by about 28.559999999968312 years :), along with a test fiddle:
/**
* Checks if adding/subtracting one to/from a number yields the correct result.
*
* #param number The number to test
* #return true if you can add/subtract 1, false otherwise.
*/
var canAddSubtractOneFromNumber = function(number) {
var numMinusOne = number - 1;
var numPlusOne = number + 1;
return ((number - numMinusOne) === 1) && ((number - numPlusOne) === -1);
}
//Find the highest number
var highestNumber = 3; //Start with an integer 1 or higher
//Get a number higher than the valid integer range
while (canAddSubtractOneFromNumber(highestNumber)) {
highestNumber *= 2;
}
//Find the lowest number you can't add/subtract 1 from
var numToSubtract = highestNumber / 4;
while (numToSubtract >= 1) {
while (!canAddSubtractOneFromNumber(highestNumber - numToSubtract)) {
highestNumber = highestNumber - numToSubtract;
}
numToSubtract /= 2;
}
//And there was much rejoicing. Yay.
console.log('HighestNumber = ' + highestNumber);
Many earlier answers have shown 9007199254740992 === 9007199254740992 + 1 is true to verify that 9,007,199,254,740,991 is the maximum and safe integer.
But what if we keep doing accumulation:
input: 9007199254740992 + 1 output: 9007199254740992 // expected: 9007199254740993
input: 9007199254740992 + 2 output: 9007199254740994 // expected: 9007199254740994
input: 9007199254740992 + 3 output: 9007199254740996 // expected: 9007199254740995
input: 9007199254740992 + 4 output: 9007199254740996 // expected: 9007199254740996
We can see that among numbers greater than 9,007,199,254,740,992, only even numbers are representable.
It's an entry to explain how the double-precision 64-bit binary format works. Let's see how 9,007,199,254,740,992 be held (represented) by using this binary format.
Using a brief version to demonstrate it from 4,503,599,627,370,496:
1 . 0000 ---- 0000 * 2^52 => 1 0000 ---- 0000.
|-- 52 bits --| |exponent part| |-- 52 bits --|
On the left side of the arrow, we have bit value 1, and an adjacent radix point. By consuming the exponent part on the left, the radix point is moved 52 steps to the right. The radix point ends up at the end, and we get 4503599627370496 in pure binary.
Now let's keep incrementing the fraction part with 1 until all the bits are set to 1, which equals 9,007,199,254,740,991 in decimal.
1 . 0000 ---- 0000 * 2^52 => 1 0000 ---- 0000.
(+1)
1 . 0000 ---- 0001 * 2^52 => 1 0000 ---- 0001.
(+1)
1 . 0000 ---- 0010 * 2^52 => 1 0000 ---- 0010.
(+1)
.
.
.
1 . 1111 ---- 1111 * 2^52 => 1 1111 ---- 1111.
Because the 64-bit double-precision format strictly allots 52 bits for the fraction part, no more bits are available if we add another 1, so what we can do is setting all bits back to 0, and manipulate the exponent part:
┏━━▶ This bit is implicit and persistent.
┃
1 . 1111 ---- 1111 * 2^52 => 1 1111 ---- 1111.
|-- 52 bits --| |-- 52 bits --|
(+1)
1 . 0000 ---- 0000 * 2^52 * 2 => 1 0000 ---- 0000. * 2
|-- 52 bits --| |-- 52 bits --|
(By consuming the 2^52, radix
point has no way to go, but
there is still one 2 left in
exponent part)
=> 1 . 0000 ---- 0000 * 2^53
|-- 52 bits --|
Now we get the 9,007,199,254,740,992, and for the numbers greater than it, the format can only handle increments of 2 because every increment of 1 on the fraction part ends up being multiplied by the left 2 in the exponent part. That's why double-precision 64-bit binary format cannot hold odd numbers when the number is greater than 9,007,199,254,740,992:
(consume 2^52 to move radix point to the end)
1 . 0000 ---- 0001 * 2^53 => 1 0000 ---- 0001. * 2
|-- 52 bits --| |-- 52 bits --|
Following this pattern, when the number gets greater than 9,007,199,254,740,992 * 2 = 18,014,398,509,481,984 only 4 times the fraction can be held:
input: 18014398509481984 + 1 output: 18014398509481984 // expected: 18014398509481985
input: 18014398509481984 + 2 output: 18014398509481984 // expected: 18014398509481986
input: 18014398509481984 + 3 output: 18014398509481984 // expected: 18014398509481987
input: 18014398509481984 + 4 output: 18014398509481988 // expected: 18014398509481988
How about numbers between [ 2 251 799 813 685 248, 4 503 599 627 370 496 )?
1 . 0000 ---- 0001 * 2^51 => 1 0000 ---- 000.1
|-- 52 bits --| |-- 52 bits --|
The value 0.1 in binary is exactly 2^-1 (=1/2) (=0.5)
So when the number is less than 4,503,599,627,370,496 (2^52), there is one bit available to represent the 1/2 times of the integer:
input: 4503599627370495.5 output: 4503599627370495.5
input: 4503599627370495.75 output: 4503599627370495.5
Less than 2,251,799,813,685,248 (2^51)
input: 2251799813685246.75 output: 2251799813685246.8 // expected: 2251799813685246.75
input: 2251799813685246.25 output: 2251799813685246.2 // expected: 2251799813685246.25
input: 2251799813685246.5 output: 2251799813685246.5
/**
Please note that if you try this yourself and, say, log
these numbers to the console, they will get rounded. JavaScript
rounds if the number of digits exceed 17. The value
is internally held correctly:
*/
input: 2251799813685246.25.toString(2)
output: "111111111111111111111111111111111111111111111111110.01"
input: 2251799813685246.75.toString(2)
output: "111111111111111111111111111111111111111111111111110.11"
input: 2251799813685246.78.toString(2)
output: "111111111111111111111111111111111111111111111111110.11"
And what is the available range of exponent part? 11 bits allotted for it by the format.
From Wikipedia (for more details, go there)
So to make the exponent part be 2^52, we exactly need to set e = 1075.
To be safe
var MAX_INT = 4294967295;
Reasoning
I thought I'd be clever and find the value at which x + 1 === x with a more pragmatic approach.
My machine can only count 10 million per second or so... so I'll post back with the definitive answer in 28.56 years.
If you can't wait that long, I'm willing to bet that
Most of your loops don't run for 28.56 years
9007199254740992 === Math.pow(2, 53) + 1 is proof enough
You should stick to 4294967295 which is Math.pow(2,32) - 1 as to avoid expected issues with bit-shifting
Finding x + 1 === x:
(function () {
"use strict";
var x = 0
, start = new Date().valueOf()
;
while (x + 1 != x) {
if (!(x % 10000000)) {
console.log(x);
}
x += 1
}
console.log(x, new Date().valueOf() - start);
}());
The short answer is “it depends.”
If you’re using bitwise operators anywhere (or if you’re referring to the length of an Array), the ranges are:
Unsigned: 0…(-1>>>0)
Signed: (-(-1>>>1)-1)…(-1>>>1)
(It so happens that the bitwise operators and the maximum length of an array are restricted to 32-bit integers.)
If you’re not using bitwise operators or working with array lengths:
Signed: (-Math.pow(2,53))…(+Math.pow(2,53))
These limitations are imposed by the internal representation of the “Number” type, which generally corresponds to IEEE 754 double-precision floating-point representation. (Note that unlike typical signed integers, the magnitude of the negative limit is the same as the magnitude of the positive limit, due to characteristics of the internal representation, which actually includes a negative 0!)
ECMAScript 6:
Number.MAX_SAFE_INTEGER = Math.pow(2, 53)-1;
Number.MIN_SAFE_INTEGER = -Number.MAX_SAFE_INTEGER;
Other may have already given the generic answer, but I thought it would be a good idea to give a fast way of determining it :
for (var x = 2; x + 1 !== x; x *= 2);
console.log(x);
Which gives me 9007199254740992 within less than a millisecond in Chrome 30.
It will test powers of 2 to find which one, when 'added' 1, equals himself.
Anything you want to use for bitwise operations must be between 0x80000000 (-2147483648 or -2^31) and 0x7fffffff (2147483647 or 2^31 - 1).
The console will tell you that 0x80000000 equals +2147483648, but 0x80000000 & 0x80000000 equals -2147483648.
JavaScript has received a new data type in ECMAScript 2020: BigInt. It introduced numerical literals having an "n" suffix and allows for arbitrary precision:
var a = 123456789012345678901012345678901n;
Precision will still be lost, of course, when such big integer is (maybe unintentionally) coerced to a number data type.
And, obviously, there will always be precision limitations due to finite memory, and a cost in terms of time in order to allocate the necessary memory and to perform arithmetic on such large numbers.
For instance, the generation of a number with a hundred thousand decimal digits, will take a noticeable delay before completion:
console.log(BigInt("1".padEnd(100000,"0")) + 1n)
...but it works.
Try:
maxInt = -1 >>> 1
In Firefox 3.6 it's 2^31 - 1.
I did a simple test with a formula, X-(X+1)=-1, and the largest value of X I can get to work on Safari, Opera and Firefox (tested on OS X) is 9e15. Here is the code I used for testing:
javascript: alert(9e15-(9e15+1));
I write it like this:
var max_int = 0x20000000000000;
var min_int = -0x20000000000000;
(max_int + 1) === 0x20000000000000; //true
(max_int - 1) < 0x20000000000000; //true
Same for int32
var max_int32 = 0x80000000;
var min_int32 = -0x80000000;
Let's get to the sources
Description
The MAX_SAFE_INTEGER constant has a value of 9007199254740991 (9,007,199,254,740,991 or ~9 quadrillion). The reasoning behind that number is that JavaScript uses double-precision floating-point format numbers as specified in IEEE 754 and can only safely represent numbers between -(2^53 - 1) and 2^53 - 1.
Safe in this context refers to the ability to represent integers exactly and to correctly compare them. For example, Number.MAX_SAFE_INTEGER + 1 === Number.MAX_SAFE_INTEGER + 2 will evaluate to true, which is mathematically incorrect. See Number.isSafeInteger() for more information.
Because MAX_SAFE_INTEGER is a static property of Number, you always use it as Number.MAX_SAFE_INTEGER, rather than as a property of a Number object you created.
Browser compatibility
In JavaScript the representation of numbers is 2^53 - 1.
However, Bitwise operation are calculated on 32 bits ( 4 bytes ), meaning if you exceed 32bits shifts you will start loosing bits.
In the Google Chrome built-in javascript, you can go to approximately 2^1024 before the number is called infinity.
Scato wrotes:
anything you want to use for bitwise operations must be between
0x80000000 (-2147483648 or -2^31) and 0x7fffffff (2147483647 or 2^31 -
1).
the console will tell you that 0x80000000 equals +2147483648, but
0x80000000 & 0x80000000 equals -2147483648
Hex-Decimals are unsigned positive values, so 0x80000000 = 2147483648 - thats mathematically correct. If you want to make it a signed value you have to right shift: 0x80000000 >> 0 = -2147483648. You can write 1 << 31 instead, too.
Firefox 3 doesn't seem to have a problem with huge numbers.
1e+200 * 1e+100 will calculate fine to 1e+300.
Safari seem to have no problem with it as well. (For the record, this is on a Mac if anyone else decides to test this.)
Unless I lost my brain at this time of day, this is way bigger than a 64-bit integer.
Node.js and Google Chrome seem to both be using 1024 bit floating point values so:
Number.MAX_VALUE = 1.7976931348623157e+308

two's complement binary VHDL

I have to transfer a binary number from A to -A in VHDL. I have some questions about two's complement notation.
For instance if I have A(8 bit) = 5, in binary 0000 0101. From online sources I realized that to translate it into two's complement negative form, I need to invert all bits adding 1 at the end:
0000 0101 --> 1111 1010 + 0000 0001 = 1111 1011 and this represents -A=-5;
My doubts now is about this final binary form that can represent -5 and 251, how can I recognize if it is -5 or 251?
By the way, this method is not so simply to be described in VHDL. Do you know if there is any simpler method?
Intuitively my reasoning would be: you are using two's complement, which is a signed number representation, therefore negative values exist. If negative values exist, you need a sign bit. That will leave 7-bits for the magnitude: hence you can only represent a value between -128 and 127. Value 251 is not within this range: It cannot be represented using 8-bits two's complement notation. Thus only -5 is valid.
The easiest way to realize two's complement sign inversion in VHDL is using the numeric_bit package.
library ieee;
entity bit_inv is
generic(width : positive);
port(
A : in bit_vector(width-1 downto 0);
A_inv : out bit_vector(width-1 downto 0));
end entity;
architecture rtl of bit_inv is
use ieee.numeric_bit.all;
begin
A_inv <= bit_vector(-signed(A));
end architecture;
entity bit_inv_tb is end entity;
library ieee;
architecture beh of bit_inv_tb is
use ieee.numeric_bit.all;
constant width : positive := 8;
signal A, A_inv : bit_vector(width-1 downto 0);
begin
DUT : entity work.bit_inv
generic map(width => width)
port map(A=>A, A_inv =>A_inv);
test: process begin
A <= bit_vector(to_signed(5,width));
wait for 1 ns;
assert to_integer(signed(A_inv)) = -5 report "A_inv is not equal to -5" severity failure;
wait;
end process;
end architecture;

32-bit int struct bits don't seem to match up (nodejs)

I have a file that defines a set of tiles (used in an online game). The format for each tile is as follows:
x: 12 bits
y: 12 bits
tile: 8 bits
32 bits in total, so each tile can be expressed as a 32 bit integer.
More info about the file format can be found here:
http://wiki.minegoboom.com/index.php/LVL_Format
http://www.rarefied.org/subspace/lvlformat.html
The 4 byte structures are not broken along byte boundaries. As you can see x: and y: are both defined as 12 bits. ie. x is stored in 1.5 bytes, y is stored in 1.5 bytes and tile is stored in 1 byte.
Even though x and y use 12 bits their max value is 1023, so they could be expressed in 10 bits. This was down to the creator of the format. I guess they were just padding things out so they could use a 32-bit integer for each tile? Either way, for x and y we can ignore the final 2 bits.
I'm using a nodejs Buffer to read the file and I'm using the following code to read the values.
var n = tileBuffer.readUInt32LE(0);
var x = n & 0x03FF;
var y = (n >> 12) & 0x03FF;
var tile = (n >> 24) & 0x00ff;
This code works fine but when I read the bits themselves, in an attempt to understand binary better, I see something that confuses me.
Take, for example a int that expresses the following:
x: 1023
y: 1023
tile: 1
Creating the tiles in a map editor and reading the resulting file into a buffer returns <Buffer ff f3 3f 01>
When I convert each byte into a string of bits I get the following:
ff = 11111111
f3 = 11110011
3f = 00111111
01 = 00000001
11111111 11110011 00111111 00000001
I assume I should just take the first 12 bits as x but chop off the last 2 bits. Use the next 12 bits as y, chopping off 2 bits again, and the remaining 8 bits would be the tile.
x: 1111111111
y: 0011001111
tile: 00000001
The x is correct (1111111111 = 1023), the y is wrong (0011001111 = 207, not 1023), and tile is correct (00000001 = 1)
I'm confused and obviously missing something.
It makes more sense to look at it in this order: (this would be the binary representation of n)
00000001 00111111 11110011 11111111
On that order, you can easily do the masking and shifting visually.
The problem with what you did is that for example in 11111111 11110011, the bits of the second byte that belong to the first field are at the right (the lowest part of that byte), which in that order is discontinuous.
Also, masking with 0x03FF makes those first two fields have 10 bits, with two bits just disappearing. You can make them 12 bits by masking with 0x0FFF. As it is now, you effectively have two padding bits.

what is the result of (54.125) - (184)10

I am practicing for midterm and apprently there's no answer key for it.
However, I practiced and got a result but not sure if this is correct since the solution is really long.
perfrom the following aritmetic operations using the 10 bit floating point standard(given on equation sheet)
-I need to convert to normalized
-standardize properly
-convert final answer to normalized 10 bit.
-convert the final answer to decimal.
-must indicate if there is any loss of precision
my solution is really long so I'm just gonna post only answer here.
-normalized
: 54.125 = 0 1100 10110
: 184 = 0 1110 01110
-standardized:
54.125 = 0 1110 0.01101
184 = 0 1110 1.01110
-result :
0 0010 0000
-denormalized: 0.03125
-precision lost: -129.90625
please help thanks!

How do I get binary byte length in Erlang?

If I have the following binary:
<<32,16,10,9,108,111,99,97,108,104,111,115,116,16,170,31>>
How can I know what length it has?
For byte size:
1> byte_size(<<32,16,10,9,108,111,99,97,108,104,111,115,116,16,170,31>>).
16
For bit size:
2> bit_size(<<32,16,10,9,108,111,99,97,108,104,111,115,116,16,170,31>>).
128
When you have a bit string (a binary with bit length not divisible by the byte size 8) byte_size/1 will round up to the nearest whole byte. I.e. the amount of bytes the bit string would fit in:
3> bit_size(<<0:19>>).
19
4> byte_size(<<0:19>>). % 19 bits fits inside 3 bytes
3
5> bit_size(<<0:24>>).
24
6> byte_size(<<0:24>>). % 24 bits is exactly 3 bytes
3
7> byte_size(<<0:25>>). % 25 bits only fits inside 4 bytes
4
Here's an example illustrating the difference in sizes going from 8 bits (fits in 1 byte) to 17 bits (needs 3 bytes to fit):
8> [{bit_size(<<0:N>>), byte_size(<<0:N>>)} || N <- lists:seq(8,17)].
[{8,1},
{9,2},
{10,2},
{11,2},
{12,2},
{13,2},
{14,2},
{15,2},
{16,2},
{17,3}]