Why is Binary (machine code) based off Boolean Algebra? What would happen if numbers went from 0-9 instead of 0 or 1? - binary

I'm curious to know if it would be possible to create a computer that uses something like binary but can go from 0000 up to 9999: keep true and false as 1 and 0, but add the digits 2-9 to get more possibilities per place. Is binary code only made of 0's and 1's for simplicity? Is it because for some reason computers can only understand true and false?
Binary code starts with 0 (0000) and increases to 1 (0001), then 2 (0010), and 10 (1010). Could it be possible for a computer to recognize 0's and 1's but then go to 2's and other numbers? For example, 0000 = 0, 0001 = 1, 0002 = 2, 0009 = 9, then 0010 = 10, and so on.
If this isn't possible somehow, please explain why and give a general explanation of how computers work, because I'm interested and want to learn more. If this isn't used because it's inefficient, please explain what makes it inefficient and what makes 0's and 1's more efficient.
Thank you.
I expect that it would be possible to create a computer like this but I searched online and couldn't find out why binary code can't have numbers other than 0's and 1's.
Answer to myself for future reference:
Binary is based on Boolean algebra because it is a base-2 system, while decimal is a base-10 system that uses the digits 0-9 instead of just 0 and 1. Computers easily understand binary because it maps onto on and off states, with 0 being off and 1 being on. Computers use logic gates, which are composed of many transistors that apply Boolean logic to process and store data, so binary makes the hardware simple and convenient. Other number systems are used for other purposes. For example, hexadecimal is used to represent large numbers more compactly than decimal can: one million is 1000000 in decimal, 11110100001001000000 in binary, and F4240 in hexadecimal. This is why the binary number system is based on Boolean algebra and why computers use binary rather than other number systems.
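For future reference, those conversions are easy to check; here is a tiny Python snippet (my own illustration, not part of the original answer):

n = 1_000_000
print(n)        # 1000000                 (decimal)
print(bin(n))   # 0b11110100001001000000  (binary)
print(hex(n))   # 0xf4240                 (hexadecimal)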

It is based on how the data is stored. Each piece of data stored in your memory can have only two values. Think of your memory as a collection of glasses, each of which can be either empty or full, which means the data is stored as a bunch of 1s and 0s. This is a result of moving from analog systems to digital: an analog value can be anything between 0 and 1, so in an analog system you can have 0.25 or 0.7, but once computers became digital, the logic became binary.
It will be really beneficial to research the history of computers and learn how they evolved over time, if you are interested in this topic.

Related

Sign-magnitude and two's complement

I am doing an assignment and am stuck; can anyone please help?
A new computer has been acquired by the management of CIPOL, a research lab sited around Teshie, to help in the analysis of samples of blood taken from suspected cases of COVID-19. Upon testing the computer, it was realized by technicians that it can only process data fed into it in the form of sign magnitude. However, all the equipment in CIPOL works in the 2's-complement environment. There is currently no interface to link the old systems and the new computer. As the technical team leader, you have been summoned to brief management on the problem at hand. You are to:
Critically explain the challenge faced by your team in connection with the new and old computers and propose a solution to it.
This is my answer to the question I am not sure if I am right:
Sign-magnitude is a way the computer stores negative numbers. There are two other ways, which are 1's complement and 2's complement. The new computer that represents data in sign-magnitude will work perfectly with the old systems.
The new computer will not work seamlessly with the old system.
1. Signed and Magnitude Binary Representation
Sign magnitude uses the first bit to denote the sign of the number. So, for example:
000 is 0.
001 is 1, the first 0 being the + sign.
101 is -1, the first 1 being the - sign.
2. Two's-Complement Binary Representation
This representation flips the bits of the number and adds one to represent the negative counterpart:
000 is still 0.
001 is also still 1.
111 is -1. Basically flip every bit of 001 and add 1 to it.
3. The Solution
I suppose you're going to want to convert from sign-magnitude to two's-complement representation, since everything else in the lab already uses two's complement.
Positive Numbers
No need to change.
Negative Numbers
I.e., if the first bit is 1:
Replace the first bit with 0.
Flip all of the bits (that's a simple NOT logical operator).
Add 1 to the new number.
An example with 1011, or -3 in the sign-magnitude representation, with 4 bits:
Replace the first bit with 0 → 0011
Flip all of the bits → 1100
Add 1 to the new number → 1101
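Here is a small Python sketch of exactly the steps above, assuming a fixed bit width (the function name and the example call are mine, not part of the assignment):

def sign_magnitude_to_twos_complement(bits):
    # bits: a sign-magnitude string such as '1011'.
    # Returns the same value as a two's-complement string of the same width.
    width = len(bits)
    if bits[0] == '0':
        return bits                                  # positive: no change needed
    magnitude = int(bits[1:], 2)                     # replace the sign bit with 0
    value = (~magnitude + 1) & ((1 << width) - 1)    # flip all bits and add 1 (mod 2^width)
    return format(value, f'0{width}b')

print(sign_magnitude_to_twos_complement('1011'))     # '1101', i.e. -3 in two's complement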
4. Further Reading
There are plenty of tutorials, videos and articles on this topic. If you wanna learn more:
Wikipedia: Signed Number Representation
Two's Complement, by Thomas Finley from Cornell University
Wikipedia: Two's Complement

Ternary computers: what would the third (unknown) part of a trit be used for?

I'm very interested in the idea of creating/designing (but most likely only imagining) a ternary computer rather than a binary computer.
If I were to do this, I would use a balanced base-3 system, so a trit (trit is to base-3 as bit is to base-2) could be -1, 0, or +1. Storing data using trits would be approximately 36% more compact than storing data using bits like we do on today's computers; however, ternary arithmetic would be much more complicated, so there's no telling whether an ALU using ternary computing would be faster or slower than binary.
But I digress, that's just a little background stuff that doesn't entirely pertain to the question, but it is related. :)
So, possible values for a trit:
-1 is off/false, same as 0 in binary.
0 is unknown. No equivalent in binary.
+1 is on/true, same as 1 in binary.
My question is...what is the point of that 0 in terms of computing? For example, I've been reading up a lot on logic gates and I understand both how they work and how they can work together to create an ALU. A binary AND gate is very simple, and combined with other binary logic gates they could all be used in combination to perform arithmetic such as creating an Adder, or a unit that performs addition.
I can't even comprehend how this would be done using ternary computing. How would the unknown (0) factor into the logic gates and be used to perform arithmetic? Hell, I can't even comprehend what a ternary AND gate would output and how those outputs would be used.
For example, I would assume for a ternary computer an AND gate would accept 3 inputs instead of 2. Let's call the inputs A, B, and C.
In a binary AND gate, A and B can each be 0 or 1, so there are four possible input combinations. If A and B are both 1, the AND gate outputs 1; for the other three combinations it outputs 0. (Outputs over all A/B combinations: 0, 0, 0, 1.)
A ternary AND gate would take in 3 inputs, right? So in a ternary AND gate, A, B, and C could be -1, 0, or 1. This means that there are 27 possible combinations for A/B/C. Rather than listing out the possible outcomes for each combination, I'll just add them up for you guys. :)
Anyway, there is only one combination in which 1 will be the output, there are 7 combinations in which 0 will be the output (assuming if A, B, and C are all 0 the AND gate will output 0), and there are 19 combinations in which -1 will be the output. In a binary Adder if the gate throws a 1 it would be sent off to another gate to be evaluated and so on until the addition is complete. In ternary...what would a gate do if it received a 0?
I know that's a lot of reading, so I'll try to sum it up and list the main questions below:
How would the 0 of a trit in a balanced ternary system be used/handled in logic gates?
If a logic gate outputs a 0, and the gate is being used in an ALU to perform arithmetic (let's say addition for example), how would the gate that receives the 0 be expected to handle it? Basically, how would one go about creating an Adder using ternary logic?
And lastly, am I correct assuming that in a ternary computer logic gates would accept 3 inputs instead of 2 like binary computers, or would logic gates still be dyadic?
An essential goal of ALU design is to perform arithmetic on integers: first addition (and subtraction), then multiplication and division.
When written in base 3, these operations are well defined. For instance
 + |  0  1  2
---+----------
 0 |  0  1  2
 1 |  1  2 10
 2 |  2 10 11
As with binary arithmetic, one needs to compute a sum trit and a carry. When the carry is propagated, the following table applies:
+c |  0  1  2
---+----------
 0 |  1  2 10
 1 |  2 10 11
 2 | 10 11 12
So you indeed need two three-input functions (two trits in and a carry in), giving the sum trit and the carry out bit. (Notice that binary ALUs add the same way: two bits in and a carry in giving a sum and a carry out bit.)
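To make the tables concrete, here is a toy Python sketch of a ripple-carry base-3 adder (my own illustration; it uses ordinary base-3 digits 0, 1, 2 as in the tables, not balanced trits):

def trit_full_add(a, b, carry_in):
    # Add two base-3 digits plus a carry; return (sum trit, carry out).
    total = a + b + carry_in
    return total % 3, total // 3

def add_base3(x, y):
    # Ripple-carry addition of two little-endian lists of base-3 digits.
    x = x + [0] * (len(y) - len(x))
    y = y + [0] * (len(x) - len(y))
    result, carry = [], 0
    for a, b in zip(x, y):
        s, carry = trit_full_add(a, b, carry)
        result.append(s)
    if carry:
        result.append(carry)
    return result

print(add_base3([2, 1], [2, 2]))   # "12" + "22" in base 3 (5 + 8 = 13) -> [1, 1, 1], i.e. "111"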
Whether this can be implemented from elementary dyadic or triadic gates would be technology dependent.
The logical predicates AND/OR have no reason to be modified and should remain binary. Boolean arithmetic remains Boolean.
Besides, if you enumerate all ternary functions of two ternary arguments (i.e. 9 input combinations), you find 19683 of them. Contrast this to 16 binary functions. This mess is unmanageable. (Don't even think of all triadic ternary functions, 7625597484987 of them.)
Okay, so I believe I may have found the answer to my questions.
So there's no reason to reinvent the wheel: standard binary logic gates would work just fine in a ternary computer.
In binary, 0 is false and 1 is true, and a binary number is read from right to left, each digit corresponding to a power of two. For example, 10010 would be 18 (2 + 16). This number system only lets digits add to the total: the more 1's you have toward the left, the higher the number is when converted to decimal, but no digit can ever subtract from it. All of this is done using transistors that only care about whether voltage is present: if there is voltage the transistor is on and the bit is a 1, and if not it is off and the bit is a 0.
In ternary, there wouldn't be a simple on/off. Unlike the standard transistor in a binary computer, which tests only whether voltage is present, a transistor for a ternary computer would test whether the voltage is negative, positive, or ground (-1 being negative voltage, 0 being ground, +1 being positive voltage).
Using this system, numbers would be written in ternary much as they are written in binary, with one exception: ternary allows for decrementation. Take a number in binary, for example: starting with the rightmost digit, the digits correspond to 2^0, then 2^1, and so on, and every digit that is a 1 contributes its corresponding power of 2; those contributions are added up to give you your number.
Now imagine ternary. From right to left the digits would follow 3^0, 3^1, 3^2, and so on. However, a +1 trit would indicate that the power of 3 that digit corresponds to is added, a 0 trit would mean that digit is ignored just as a 0 is in binary, and a -1 trit would mean that the power of 3 that digit corresponds to is subtracted. This allows for both incrementation and decrementation.
Take this ternary number for example (I'm going to use '-' instead of '-1' and '+' instead of '+1'): +-0+-
This would be, reading from right to left, (-1) + (3) + (0) + (-27) + (81) = 56.
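A quick way to sanity-check readings like that is a few lines of Python (just my own helper, using the same '+', '0', '-' notation):

def balanced_ternary_to_int(s):
    # '+' = +1, '0' = 0, '-' = -1; most significant trit first.
    value = 0
    for trit in s:
        value = value * 3 + {'+': 1, '0': 0, '-': -1}[trit]
    return value

print(balanced_ternary_to_int('+-0+-'))   # 81 - 27 + 0 + 3 - 1 = 56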
While it's true that ternary and dyadic logic gates are compatible, an ALU would have to be designed very differently. Basically that's it. :)

How to know if it is either a positive/negative number or it is referring to a number in binary?

I'm learning integer data formats in a computer science book, and as far as I understand, the binary representation of an integer, whether positive or negative, uses the leftmost bit (MSB) as the sign: 0 for positive, 1 for negative. Let's say on an 8-bit computer: how would I know whether 10000010 means 130 in base 10, or whether it refers to negative 2?
I might be wrong; if I am, please correct me.
If you were to just see the string 10000010 somewhere, I don't know... written on a wall or something, how would you know how to interpret it?
You might say, hey, that's ten million and ten (you thought it was base 10), or you might say, hey, that's -126 (you thought it was two's complement binary), or you might say that's positive 130 (you thought it was standard binary).
It is, in a theoretical sense, up to whatever is doing the interpreting how it is interpreted.
So, when a computer is holding 8 bits of data, it's up to it how it interprets it.
Now if you're programming, you can tell the computer how you want something interpreted. For example, in C++:
// char is 1 byte
unsigned char x = 130u;
Here I have told the compiler to put 130 unsigned into a byte, so the computer will store 10000010 and will interpret it as the value 130.
Now consider
// char is 1 byte
char x = -126;
Here I have told the compiler to put -126 signed into a byte, so the computer will again store 10000010 but this time it will interpret it as the value -126.
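The same point can be shown in Python (my own illustration, not part of the original answer): one byte, two readings.

b = bytes([0b10000010])                          # the single byte 10000010
print(int.from_bytes(b, 'big', signed=False))    # 130
print(int.from_bytes(b, 'big', signed=True))     # -126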
Take a look at the answer posted to this question: Negative numbers are stored as 2's complement in memory, how does the CPU know if it's negative or positive?
The CPU uses something called an opcode to determine which operation it will perform when manipulating a memory location (in this case, the value 10000010). It is that operation within the CPU that manipulates the value as either a negative or a positive number. The CPU doesn't have access to whether the number is signed or unsigned - it uses the opcode of the operation manipulating that number to determine whether it should be a signed or unsigned operation.

Compressing a binary matrix

We were asked to find a way to compress a square binary matrix as much as possible, and if possible, to add redundancy bits to check and maybe correct errors.
The redundancy part is easy to implement in my opinion. The complicated part is compressing the matrix. I thought about using run-length encoding after reshaping the matrix into a vector, because there will be more zeros than ones, but I only achieved about 40 bits of compression (we are working on small sizes), although I thought it would do better.
Also, an idea was to Huffman-code the matrix after run-length encoding, but then a dictionary must be sent in order to recover the original information.
I'd like to know what would be the best way to compress a binary matrix?
After reading some comments: yes @Adam, you're right, the 14x14 matrix should be compressed into 128 bits, so if I only store the coordinates (row & column) of each non-zero element, it would still be 160 bits (since there are twenty ones). I'm not looking for an exact solution but for a useful idea.
You can only talk about compressing something if you have a distribution and a representation. That's the issue of the dictionary you have to send along: you always need some sort of dictionary or protocol to uncompress something. It just so happens that things like .zip and .mpeg already have those dictionaries/codecs. Even something as simple as Huffman encoding is an algorithm; on the other side of the communication channel (you can think of compression as communication), the other person already has a bit of code (the dictionary) to perform the Huffman decompression scheme.
Thus you cannot even begin to talk about compressing something without first thinking "what kinds of matrices do I expect to see?", "is the data truly random, or is there order?", and if so "how can I represent the matrices to take advantage of order in the data?".
You cannot compress some matrices without increasing the size of other objects (by at least 1 bit). This is bad news if all matrices are equally probable, and you care equally about them all.
Addenda:
The answer to use sparse-matrix machinery is not necessarily the right one. The matrix could for example be represented in Python as [[(r+c)%2 for c in range(cols)] for r in range(rows)] (a checkerboard pattern), and a sparse matrix wouldn't compress it at all, but the Kolmogorov complexity of the matrix is the above program's length.
Well, I know every matrix will have the same number of ones, so this is kind of deterministic. The only thing I don't know is where the 1's will be. Also, if I transmit the matrix with a dictionary and there are burst errors, maybe the dictionary gets affected, so... wouldn't the resulting information be corrupted? That's why I was trying to use lossless data compression such as run-length encoding; the decoder just doesn't need a dictionary. --original poster
How many 1s does the matrix have as a fraction of its size, and what is its size (NxN -- what is N)?
Furthermore, this is an incorrect assertion and should not be used as a reason to desire run-length encoding (which still requires a program); when you transmit data over a channel, you can always add error-correction to this data. "Data" is just a blob of bits. You can transmit both the data and any required dictionaries over the channel. The error-correcting machinery does not care at all what the bits you transmit are for.
Addendum 2:
There are (14*14) choose 20 possible arrangements, which I assume are chosen uniformly at random. If this number were larger than 2^128, what you're trying to do would be impossible. Fortunately log_2((14*14) choose 20) ~= 90 bits < 128 bits, so it's possible.
The simple solution of writing down 20 numbers like 32, 2, 67, 175, 52, ..., 168 won't work because log_2(14*14)*20 ~= 153 bits > 128 bits. This would be equivalent to run-length encoding. We want to do something like this, but we are on a very strict budget and cannot afford to be "wasteful" with bits.
Because you care about each possibility equally, your "dictionary"/"program" will simulate a giant lookup table. Matlab's sparse matrix implementation may work but is not guaranteed to work and is thus not a correct solution.
If you can create a bijection between the number range [0,2^128) and subsets of size 20, you're good to go. This corresponds to enumerating ways to descend the pyramid in http://en.wikipedia.org/wiki/Binomial_coefficient to the 20th element of row 196. This is the same as enumerating all "k-combinations". See http://en.wikipedia.org/wiki/Combination#Enumerating_k-combinations
Fortunately I know that Mathematica and Sage and other CAS software can apparently generate the "5th" or "12th" or arbitrarily numbered k-subset. Looking through their documentation, we come upon a function called "rank", e.g. http://www.sagemath.org/doc/reference/sage/combinat/subset.html
So then we do some more searching, and come across some arcane Fortran code like http://people.sc.fsu.edu/~jburkardt/m_src/subset/ksub_rank.m and http://people.sc.fsu.edu/~jburkardt/m_src/subset/ksub_unrank.m
We could reverse-engineer it, but it's kind of dense. But now we have enough information to search for k-subset rank unrank, which leads us to http://www.site.uottawa.ca/~lucia/courses/5165-09/GenCombObj.pdf -- see the section "Generating k-subsets (of an n-set): Lexicographical Ordering" and the rank and unrank algorithms on the next few pages.
In order to achieve the exact theoretically optimal compression, in the case of a uniformly random distribution of 1s, we must thus use this technique to biject our matrices to our output number of range <2^128. It just so happens that combinations have a natural ordering, known as ranking and unranking of combinations. You assign a number to each combination (ranking), and if you know the number you automatically know the combination (unranking). Googling k-subset rank unrank will probably yield other algorithms.
Thus your solution would look like this:
serialize the matrix into a list
e.g. [[0,0,1],[0,1,1],[1,0,0]] -> [0,0,1,0,1,1,1,0,0]
take the (1-based) indices of the 1s:
e.g. [0,0,1,0,1,1,1,0,0] -> [3,5,6,7]
     positions: 1 2 3 4 5 6 7 8 9, i.e. a k=4-subset of an n=9 set
take the rank
e.g. compressed = rank([3,5,6,7], n=9)
compressed == 412 (or something, I made that up)
you're done!
e.g. 412 -binary-> 110011100 (at most n=9 bits, since the rank is less than 2^n = 2^9 = 512)
to uncompress, unrank it
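Here is a rough Python sketch of lexicographic rank/unrank for k-subsets of {1, ..., n}, in the spirit of the references above (my own sketch of the standard algorithm, not code taken from those sources):

from math import comb   # Python 3.8+

def rank(combo, n):
    # 0-based lexicographic rank of a sorted k-combination of {1, ..., n}.
    k = len(combo)
    r, prev = 0, 0
    for i, c in enumerate(combo, start=1):
        for j in range(prev + 1, c):      # count combinations starting with a smaller element here
            r += comb(n - j, k - i)
        prev = c
    return r

def unrank(r, n, k):
    # Inverse of rank(): the k-combination of {1, ..., n} with 0-based rank r.
    combo, prev = [], 0
    for i in range(1, k + 1):
        c = prev + 1
        while comb(n - c, k - i) <= r:
            r -= comb(n - c, k - i)
            c += 1
        combo.append(c)
        prev = c
    return combo

print(rank([3, 5, 6, 7], 9))   # 101 with this convention (always less than C(9,4) = 126)
print(unrank(101, 9, 4))       # [3, 5, 6, 7]

For the 14x14 problem you would use n = 196 and k = 20, and the rank then fits in the roughly 90 bits computed above.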
I'll get to 128 bits in a sec; first, here's how you fit a 14x14 boolean matrix with exactly 20 nonzeros into 136 bits. It's based on the CSC sparse matrix format.
You have an array c with 14 4-bit counters that tell you how many nonzeros are in each column.
You have another array r with 20 4-bit row indices.
56 bits (c) + 80 bits (r) = 136 bits.
Let's squeeze 8 bits out of c:
Instead of 4-bit counters, use 2-bit ones. c is now 2*14 = 28 bits, but it can't support more than 3 nonzeros per column. This leaves us with 128 - 80 - 28 = 20 bits. Use that space for an array a4c with 5 4-bit elements that "add 4 to an element of c" specified by the 4-bit element. So, if a4c = {2, 2, 10, 15, 15}, that means c[2] += 4; c[2] += 4 (again); c[10] += 4;.
The "most wasteful" distribution of nonzeros is one where the column count will require an add-4 to support 1 extra nonzero: so 5 columns with 4 nonzeros each. Luckily we have exactly 5 add-4s available.
Total space = 28 bits (c) + 20 bits (a4c) + 80 bits (r) = 128 bits.
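Here is a rough Python sketch of that packing, in case it helps to see it written out (the use of 15 as an "unused add-4 slot" marker and the column-major order of r are my assumptions, not spelled out above):

def pack_14x14_20ones(matrix):
    # matrix: a 14x14 list of lists of 0/1 with exactly 20 ones.
    # Layout: 14 x 2-bit column counts, 5 x 4-bit add-4 slots, 20 x 4-bit row indices.
    rows = cols = 14
    counts = [sum(matrix[r][c] for r in range(rows)) for c in range(cols)]
    assert sum(counts) == 20

    # Split each column count into a 2-bit residue plus zero or more "+4" entries.
    a4c, residue = [], []
    for c, n in enumerate(counts):
        while n > 3:
            a4c.append(c)
            n -= 4
        residue.append(n)
    assert len(a4c) <= 5                 # worst case: five columns of 4 nonzeros each
    a4c += [15] * (5 - len(a4c))         # 15 marks an unused slot (assumed sentinel)

    # Row indices of the ones, column by column.
    r_idx = [r for c in range(cols) for r in range(rows) if matrix[r][c]]

    # Pack everything into one integer: 14*2 + 5*4 + 20*4 = 128 bits.
    word = 0
    for v in residue:
        word = (word << 2) | v
    for v in a4c:
        word = (word << 4) | v
    for v in r_idx:
        word = (word << 4) | v
    return word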
Your input is a perfect candidate for a sparse matrix. You said you're using Matlab, so you already have a good sparse matrix built for you.
spm = sparse(dense_matrix)
Matlab's sparse matrix implementation uses Compressed Sparse Columns, which has memory usage on the order of 2*(# of nonzeros) + (# of columns), which should be pretty good in your case of 20 nonzeros and 14 columns. Storing 20 values sure is better than storing 196...
Also remember that all matrices in Matlab are going to be composed of doubles. Just because your matrix can be stored as a 1-bit boolean doesn't mean Matlab won't stick it into a 64-bit floating point value... If you do need it as a boolean you're going to have to make your own type in C and use .mex files to interface with Matlab.
After thinking about this again, if all your matrices are going to be this small and they're all binary, then just store them as a binary vector (bitmask). Going off your 14x14 example, that requires 196 bits or 25 bytes (plus n, m if your dimensions are not constant). That same vector in Matlab would use 64 bits per element, or 1568 bytes. So storing the matrix as a bitmask takes as much space as 4 elements of the original matrix in Matlab, for a compression ratio of 62x.
Unfortunately I don't know if Matlab supports bitmasks natively or if you have to resort to .mex files. If you do get into C++ you can use STL's vector<bool> which implements a bitmask for you.
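If you end up building the bitmask by hand outside Matlab, it is only a few lines; here is a Python sketch, with a row-major bit order that is my choice rather than anything dictated above:

def to_bitmask(matrix):
    # One bit per element, first element in the most significant position.
    bits = 0
    for row in matrix:
        for v in row:
            bits = (bits << 1) | (v & 1)
    return bits

def from_bitmask(bits, rows, cols):
    # Inverse of to_bitmask for known dimensions.
    flat = [(bits >> i) & 1 for i in range(rows * cols - 1, -1, -1)]
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]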

Is there is some mathematical "optimum" base that would speed up factorial calculation?

Is there is some mathematical "optimum" base that would speed up factorial calculation?
Background:
Just for fun, I'm implementing my own bignum library. (-: Is this my first mistake? :-).
I'm experimenting with various bases used in the internal representation and regression testing by printing out exact values (in decimal) for n factorial (n!).
The way my bignum library represents integers and does multiplication, the time is proportional to the total number of "1" bits in the internal representation of n!.
Using base 2, 4, 8, 16, 2^8, 2^30, etc. in my internal representation all gives me exactly the same total number of "1" bits for any particular number.
Unless I've made some mistake, any given factorial (n!) represented in base 18 has fewer "1" bits than the same value represented in base 10, base 16, or base 19.
And so (in principle), using base 18 would make my bignum library run faster than using base 10 or some binary 2^w base or base 19.
I think this has something to do with the fact that n! is either shorter or has more "trailing zeros" (or both) when printed out in base 18 than in base 10, base 16, or base 19.
Is there some other base that would work even better than base 18?
In other words,
Is there a base that represents n! with even fewer "1" bits than base 18?
This is not a dup of "What is a convenient base for a bignum library & primality testing algorithm?" because I suspect "the optimum base for working with integers that are known to be large factorials, with lots of factors of 2 and 3" is different than "the optimum base for working with integers that don't have any small factors and are possibly prime".
(-: Is speeding up factorial calculations -- perhaps at the expense of other kinds of calculations -- my second mistake? :-)
edit:
For example:
(decimal) 16! ==
(decimal ) == 20,922,789,888,000 // uses decimal 14 "1" bits
(dozenal ) == 2,41A,B88,000,000 // uses decimal 10 "1" bits
(hexadecimal) == 130,777,758,000 // uses decimal 18 "1" bits
(octadecimal) == 5F,8B5,024,000 // uses decimal 14 "1" bits
(I'm more-or-less storing the digits on the right, without the commas, plus some metadata overhead).
(While one might think that "As you increase the base you will use fewer "1" bits to represent a given number", or "As you increase the base you will use fewer nonzero digits to represent a given number", the above example shows that is not always true.)
I'm storing each digit as a small integer ("int" or "long int" or "byte"). Is there any other reasonable way to store digits?
I'm pretty sure my computer stores those integers in binary -- each "1", "2", "4", "8", and "G" digit use one "1" bit; each "3", "5", "6", "9", and "A" digit use two "1" bits; each "7" and "B" digit use three "1" bits; each "F" digit uses four "1" bits, etc.
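As a cross-check, a few lines of Python reproduce those counts, storing each digit as a small binary integer as described (this snippet is just an illustration):

from math import factorial

def one_bits_of_digits(value, base):
    # Total "1" bits across the base-`base` digits of `value`.
    total = 0
    while value:
        value, digit = divmod(value, base)
        total += bin(digit).count('1')
    return total

for base in (10, 12, 16, 18):
    print(base, one_bits_of_digits(factorial(16), base))
# prints: 10 14, 12 10, 16 18, 18 14 -- matching the counts above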
Both the decimal and the octadecimal representation of this value (16!) require 14 "1" bits.
So I made a mistake in my earlier calculation: representing n! in octadecimal doesn't always use fewer "1" bits than representing the same value in decimal.
But the question still stands: is there some other "optimum" base that requires the fewest number of 1 bits for storing large factorials?
Someone asks: "How do you store those numbers?"
Well, that's exactly my question -- what is the best way of storing numbers of the form n! ?
I could internally use digits in base 10, or some power-of-two base, or base 18, or some other base. Which one is best?
I could store these integers internally as a 1D array of digits, with a length however long is needed to store all the digits. Is there any reasonable way of printing out 100! in decimal without such an array?
If you're just trying to optimize running time for calculating factorials, and changing the base is the only parameter you're changing, then the optimum base will likely contain small factors. 60 might be a reasonable choice. If you want to experiment, I would try various bases of the form (2^a)(3^b)(5^c).
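A throwaway Python sketch of that experiment (the ranges for a, b, and c are arbitrary; this is only an illustration of the suggestion, not a claim about the true optimum):

from math import factorial

def one_bits_of_digits(value, base):
    # Total "1" bits across the base-`base` digits of `value`.
    total = 0
    while value:
        value, digit = divmod(value, base)
        total += bin(digit).count('1')
    return total

n = factorial(100)
bases = sorted({2**a * 3**b * 5**c for a in range(7) for b in range(5) for c in range(4)} - {1})
best = min(bases, key=lambda b: one_bits_of_digits(n, b))
print(best, one_bits_of_digits(n, best))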
Improving the speed of multiplication is probably the best way to improve performance. What algorithm are you using for multiplication? (schoolbook, Karatsuba, Toom-Cook, FFT, ...)
There are other factors to consider, too. If you will be converting the numbers to decimal frequently, then a base that is a power of 10 will make the conversion as fast as possible.
Many(*) years ago, I wrote a base-6 floating point library specifically to solve a problem with repeated multiplication/division by 2 and/or 3. But unless you are trying to solve a specific problem, I think you will be better served by optimizing your algorithms in general than by just trying to optimize factorial.
casevh
(*) I originally said "Several years ago" until I remembered the program ran for many days on a 12 MHz 80286.
While from a purely mathematical viewpoint the optimal base is e (and after rounding to the nearest integer, 3), from a practical standpoint for a bignum library on a computer, pick the machine word size as the base for your numeric system (2^32 or 2^64). Yes, it's huge, but the higher abstraction layer of your bignum system is the choke point while the underlying calculations on machine words are the fast part, so delegate as much computation as possible to the CPU's low-level instructions while keeping your own work to a minimum.
And no, it's not a mistake. It's a very good learning exercise.
I will not pretend I know any math, so do not take my answer as the holy "optimum" you are probably looking for. If I had to compute factorials as fast as possible, I would either try some approximation (something like Stirling's approximation) or reduce the number of multiplications, because multiplication is an expensive operation. If you represent the number in base k, you can simulate multiplication by k with the help of shifting. If you choose base 2, half of all multiplications will be shifts; the other multiplications are shifts plus one bit switch. If you aim to minimize the number of "1"s in your representation of the number, this depends on which numbers you represent: as you increase the base you will use fewer "1"s to represent a given number, but you will need more bits for every digit, which means more potential "1"s. I hope it helps at least a bit; if not, just ask and I will try to answer.
If by '"1" bits' you mean digits, then I suggest a base of 256 or 65536. In other words, make each byte / word a "digit" for the purposes of your math. The computer handles these numbers routinely and is optimized for doing so. Your factorials will be swift, and so will other operations.
Not to mention the computer handles a great deal of the conversions from a similar base to these with ease. (rhyming unintentional)