Data type for huge binary numbers - binary

I have to handle huge binary numbers (<=4096 digits) - what is the best way to handle such big numbers? I have to multiply them afterward and apply the %-operation on these numbers.
Do I have to use structs or how am I supposed to handle such data?

If you've got it as a string of 4096 digit, you can convert it into a list with separate smaller chunks (eg into bytes each consisting of 8 bits), then if you need to multiply/apply the %-operation on these numbers, you probably will need create a function that converts those "chunks" from binary to denary (so you can multiply them and so on.)

Related

Representation of numbers in the computer

In the representation of inputs in the computer, the numbers are taken as characters and encoded with Ascii code or are they converted directly to binary? in other way: When my input is considered as integer and not a character?
Both are possible, and it depends on the application. In other words the software programmer decides. In general, binary representation is more efficient in terms of storage requirements and processing speed. Therefore binary representation is more usual, but there are good examples when it is better to keep numbers as strings:
to avoid problems with conversions
phone numbers
when no adequate binary representation is available (e.g. 100 digits of pi)
numbers where no processing takes places
to be continued ...
The most basic building block of electronic data is a bit. It can have only 2 values, 0 and 1. Other data structures are built from collection of bits, such as an 8-bit byte, or a 32-bit float.
When a collection of bits needs to be used to represent a character, a certain encoding is used to give lexical meaning to these bits, such as ASCII, UTF8 and others.
When you want to display character information to the screen, you use a graphical layer to draw pixels representing the "character" (collection of bits with matching encoding) to the screen.

How is JSON number encoded?

How is number represented in JSON internally and how many bytes of data does it take to store a JSON number?
I can't find any info specifying this internal detail.
According to the ECMA standard (PDF), §8:
A number is represented in base 10 with no superfluous leading zero. It may have a preceding minus sign (U+002D). It may have a (U+002E) prefixed fractional part. It may have an exponent of ten, prefixed by e (U+0065) or E (U+0045) and optionally + (U+002B) or – (U+002D). The digits are the code points U+0030 through U+0039.
So, pretty much text, except that (later on the page) NaN and Infinity aren't acceptable values.
BSON, however, has int32, int64, and double types that are a bit more traditional.
JSON is a data interchange format. It is just text. There is no "internal" representation of JSON, unless you are referring to how your particular system encodes and stores text data.
The number of bytes it takes to store a JSON number would be the length of the number, in characters, multiplied by the number of bytes required to store a character in your particular system.

Correct way to store a bit array

I'm working on a project that needs to store something like
101110101010100011010101001
into the database. It's not a file or archive: it's only a bit array, and I think that storing it into a varchar column is waste of space/performance.
I've searched about the BLOB and the VARBINARY type. But both of then allows to insert a value like 54563423523515453453, that's not exactly a bit array.
For sure, if I store a bit array like 10001000 into a BLOB/varbinary/varchar column, it will consume more than a byte, and I want that the minimum space is consumed. In the case of eight bits, it needs to consume only one byte, 16 bits two bytes, and so on.
If it's not possible, then what is the best approach to waste the minimum amount of space in this case?
Important notes: The size of the array is variable, and is not divisible by eight in every situation. Sometimes I will need to store 325 bits, other times 7143 bits....
In one of my previous projects, I converted streams of 1's and 0' to decimal, but they were shorter. I dont know if that would be applicable in your project.
On the other hand, imho, you should clarify what will you need to do with that data once you get it stored. Search? Compare? It might largely depend on the purpose of the database.
Could you gzip it and then store it? Is that applicable?
Binary is a string representation of a number. The string
101110101010100011010101001
represents the number
... + 1*25 + 0*24 + 1*23 + 0*22 + 0*21 + 1*20
As such, it can be stored in a 32-bit integer if were to be converted from a binary string to the number it represents. In Perl, one would use
oct('0b'.$binary)
But you have a variable number of bits. Not a problem! Just process them 8 at a time to create a string of bytes to place in a BLOB or similar.
Ah, but there's a catch. You'll need to add padding to get a number divisible by 8, which means you'll have to use a means of removing that padding. A simple approach if there's a known maximum length is to use a length prefix. e.g. If you know the number of bits is never going to exceed 65,535, encode the number of bits in the first two bytes of the string.
pack('nB*', length($binary), $binary)
which is reverted using
my ($length, $binary) = unpacked('nB*', $packed);
substr($binary, $length) = '';

Are there any illegal characters in MySQL which may not be stored in a field?

I'm looking for a shorthand solution to storing an md5 hash inside of a MySQL table, as string data. I had the idea that base256 could reduce the length of the string by half, down to a 16 digit string instead of 32 digits of hex. So I take hex and divide it up into chunks of two digits programatically then convert each set of two digits to ASCII. For example:
4cf5f5941a02573dc007e60442f5358a
is shortened to
Lõõ”W=ÀæBõ5Š
and it's OK if these characters don't print properly - I just need to store them. Would MySQL accept that sort of ASCII data into a text field without complaining?
MySQL will accept these values, but you must be very carefull when writing them - I strongly suggest binding parameters.
You might want to look into COMPRESS() and UNCOMPRESS() as an alternative:
INSERT INTO ... SET hashcode=COMPRESS('4cf5f5941a02573dc007e60442f5358a');
and
SELECT UNCOMPRESS(hashcode) AS hashcode FROM ... WHERE
might do the trick more readable

Why is it useful to know how to convert between numeric bases?

We are learning about converting Binary to Decimal (and vice-versa) as well as other base-conversion methods, but I don't understand the necessity of this knowledge.
Are there any real-world uses for converting numbers between different bases?
When dealing with Unicode escape codes— '\u2014' in Javascript is — in HTML
When debugging— many debuggers show all numbers in hex
When writing bitmasks— it's more convenient to specify powers of two in hex (or by writing 1 << 4)
In this article I describe a concrete use case. In short, suppose you have a series of bytes you want to transfer using some transport mechanism, but you cannot simply pass the payload as bytes, because you are not able to send binary content. Let's say you can only use 64 characters for encoding the payload. A solution to this problem is to convert the bytes (8-bit characters) into 6-bit characters. Here the number conversion comes into play. Consider the series of bytes as a big number whose base is 256. Then convert it into a number with base 64 and you are done. Each digit of the new base 64 number now denotes a character of your encoded payload...
If you have a device, such as a hard drive, that can only have a set number of states, you can only count in a number system with that many states.
Because a computer's byte only have on and off, you can only represent 0 and 1. Therefore a base2 system is used.
If you have a device that had 3 states, you could represent 0, 1 and 2, and therefore count in a base 3 system.