What is the difference between binary and ASCII based file comparison? - binary

If I use a file comparison tool like fc in Windows, you can choose between ASCII and binary comparison.
What is the actual difference between these two comparisons? If I compare two ASCII files, don't I want the binary data of the files to be identical?

WARNING: this is 5 year old loose remembrance of knowledge from uni
Binary representation means you compare the binary exactly, and ascii is a comparison of data type. to put it in a simple case the char 'A' is a representation of 01000001, but that is also an 8 bit integer equal to '65', so that means A = 65 in binary. so if you were doing A + A as a string and 65 43 65 (43 is '+' in binary to decimal), in binary they would be equivalent, but in ascii they would not. This is a very loose explanation and i'm sure i missed a lot, but that should sum it up loosely.
In a text file you want ASCII because you write in ascii characters. In say, a program state saved to a file you want binary to get a direct comparison.

Related

What's the exact meaning of the statement "Since ASCII used 7 bits for the character, it could only represent 128 different characters"?

I come across the below statement while studying about HTML Character Sets and Character Encoding :
Since ASCII used 7 bits for the character, it could only represent 128
different characters.
When we convert any decimal value from the ASCII character set to its binary equivalent it comes down to a 7-bits long binary number.
E.g. For Capital English Letter 'E' the decimal value of 69 exists in ASCII table. If we convert '69' to it's binary equivalent it comes down to the 7-bits long binary number 1000101
Then, why in the ASCII Table it's been mentioned as a 8-bits long binary number 01000101 instead of a 7-bits long binary number 1000101 ?
This is contradictory to the statement
Since ASCII used 7 bits for the character, it could only represent 128
different characters.
The above statement is saying that ASCII used 7 bits for the character.
Please clear my confusion about considering the binary equivalent of a decimal value. Whether should I consider a 7-bits long binary equivalent or a 8-bits long binary equivalent of any decimal value from the ASCII Table? Please explain to me in an easy to understand language.
Again, consider the below statement :
Since ASCII used 7 bits for the character, it could only represent 128
different characters.
According to the above statement how does the number of characters(128) that ASCII supports relates to the fact that ASCII uses 7 bits to represent any character?
Please clear the confusion.
Thank You.
In most processors, memory is byte-addressable and not bit-addressable. That is, a memory address gives the location of an 8-bit value. So, almost all data is manipulated in multiples of 8 bits at a time.
If we were to store a value that has by its nature only 7 bits, we would very often use one byte per value. If the data is a sequence of such values, as text might be, we would still use one byte per value to make counting, sizing, indexing and iterating easier.
When we describe the value of a byte, we often show all of its bits, either in binary or hexadecimal. If a value is some sort of integer (say of 1, 2, 4, or 8 bytes) and its decimal representation would be more understandable, we would write the decimal digits for the whole integer. But in those cases, we might lose the concept of how many bytes it is.
BTW—HTML doesn't have anything to do with ASCII. And, Extended ASCII isn't one encoding. The fundamental rule of character encodings is to read (decode) with the encoding the text was written (encoded) with. So, a communication consists of the transferring of bytes and a shared understanding of the character encoding. (That makes saying "Extended ASCII" so inadequate as to be nearly useless.)
An HTML document represents a sequence of Unicode characters. So, one of the Unicode character encodings (UTF-8) is the most common encoding for an HTML document. Regardless, after it is read, the result is Unicode. An HTML document could be encoded in ASCII but, why do that? If you did know it was ASCII, you could just as easily know that it's UTF-8.
Outside of HTML, ASCII is used billions—if not trillions—of times per second. But, unless you know exactly how it pertains to your work, forget about it, you probably aren't using ASCII.

How computers convert decimal to binary integers

This is surely a duplicate, but I was not able to find an answer to the following question.
Let's consider the decimal integer 14. We can obtain its binary representation, 1110, using e.g. the divide-by-2 method (% represents the modulus operand):
14 % 2 = 0
7 % 2 = 1
3 % 2 = 1
1 % 2 = 1
but how computers convert decimal to binary integers?
The above method would require the computer to perform arithmetic and, as far as I understand, because arithmetic is performed on binary numbers, it seems we would be back dealing with the same issue.
I suppose that any other algorithmic method would suffer the same problem. How do computers convert decimal to binary integers?
Update: Following a discussion with Code-Apprentice (see comments under his answer), here is a reformulation of the question in two cases of interest:
a) How the conversion to binary is performed when the user types integers on a keyboard?
b) Given a mathematical operation in a programming language, say 12 / 3, how does the conversion from decimal to binary is done when running the program, so that the computer can do the arithmetic?
There is only binary
The computer stores all data as binary. It does not convert from decimal to binary since binary is its native language. When the computer displays a number it will convert from the binary representation to any base, which by default is decimal.
A key concept to understand here is the difference between the computers internal storage and the representation as characters on your monitor. If you want to display a number as binary, you can write an algorithm in code to do the exact steps that you performed by hand. You then print out the characters 1 and 0 as calculated by the algorithm.
Indeed, like you mention in one of you comments, if compiler has a small look-up table to associate decimal integers to binary integers then it can be done with simple binary multiplications and additions.
Look-up table has to contain binary associations for single decimal digits and decimal ten, hundred, thousand, etc.
Decimal 14 can be transformed to binary by multipltying binary 1 by binary 10 and added binary 4.
Decimal 149 would be binary 1 multiplied by binary 100, added to binary 4 multiplied by binary 10 and added binary 9 at the end.
Decimal are misunderstood in a program
let's take an example from c language
int x = 14;
here 14 is not decimal its two characters 1 and 4 which are written together to be 14
we know that characters are just representation for some binary value
1 for 00110001
4 for 00110100
full ascii table for characters can be seen here
so 14 in charcter form actually written as binary 00110001 00110100
00110001 00110100 => this binary is made to look as 14 on computer screen (so we think it as decimal)
we know number 14 evntually should become 14 = 1110
or we can pad it with zero to be
14 = 00001110
for this to happen computer/processor only need to do binary to binary conversion i.e.
00110001 00110100 to 00001110
and we are all set

Number Systems-Hex vs Binary

Question with regards to the use of hexadecimal. Here's a statement I have a question about:
Hexadecimal is often the preferred means of representing data because it uses fewer digits than binary.
For example, the word vim can be represented by:
Hexadecimal: 76 69 6D
Decimal: 118 105 109
Binary: 01110110 01101001
01101101
Obviously, hex is shorter than binary in this example, however, wont the hex values eventually be converted to binary at the machine level so the end result for hex-binary is exactly the same?
This is a good question.
Yes, the hexadecimal values will be converted in binaries at machine level.
But you are looking the question from the machine point of view.
Hexadecimal notation was introduced because:
It's more easy to read and memorize than binaries for human. For example if you are reading memory addresses, you can observe that they are actually written in hexadecimal, that is far more simple to read then binary.
It's easy to do calculation from binaries to hexadecimal than other base (like
our today-used base 10). For example, it's easy to group binary digits into hex in your head (4 bits per hex digit).
I suggest you this article that gives some easy example calculations to better understand hexadecimal advantages.

What is the difference between 65 and the letter A in binary?

What is the difference between 65 and the letter A in binary as both represent same bit level information?
Basically, a computer only understand numbers, and not every numbers: it only understand binary represented numbers, ie. which can be represented using only two different states (for example, 1 and 2, 0V and 5V, open and close, true or false, etc.).
Unfortunately, we poor humans doesn't really like reading zeros and ones... So, we have created some codes, to use number like if they were characters: one of them is called ASCII (American Standard Code for Information Interchange), but there is also some others, such as Unicode. The principle is simple: all the program have to do is manipulating numbers, what any CPU does very well, but, when it comes to displaying these data, the display represent them as real characters, such as 'A', '4', '#', or even a space or a newline.
Now, as soon as you are using ASCII, the number 65 will represent the letter 'A'. All is a question of representation: for example, the binary number 0bOOOO1111, the hexadecimal one 0x0F, the octal one 017 and the decimal number 15 all represent the same number. It's the same for letter 'A': think of ASCII as a base, but instead of using the base 2 (binary), 8(octal), 10(decimal) or 16(hexadecimal), to display numbers, it's used in a complete different manner.
To answer your question: ASCII 'A' is hexadecimal 0x41 is decimal 65 is octal 0101 is binary 0b01000001.
Every character is represented by a number. The mapping between numbers and characters is called encoding. Many encodings use for the letter A the number 65. Since in memory there are no special cells for characters or numbers, they are represented the same way, but the interpretation in any program could be very different.
I may be misunderstanding the question and if so I apologise for getting it wrong
But if I'm right I believe your asking what's the difference between a char and int in binary representation of the value 65 which is the ascii decimal value for the letter A (in capital form)
First off we need to appreciate data types which reserve blocks of memory in the ram modules
An interget is usually 16 bits or more if a float or long (in c# this declaration is made by stating uint16, int16, or int32, uint32 so on, so forth)
A character is an 8 bit memory block
Therefore the binary would appear as follows
A byte (8 bits) - char
Decimal: 128, 64, 32, 16, 8, 4, 2, 1
Binary: 01000001
2 bytes (16 bit) - int16
Binary; 0000000001000001
Its all down to the size of the memory block reserved based on the data type in the variable declaration
I'd of done the decimal calculations for the 2 bit but I'm on the bus at the moment
First of all, the difference can be in size of the memory (8bits, 16bits or 32bits). This question: bytes of a string in java
Secondly, to store letter 'A' you can have different encodings and different interpretation of memory. The ASCII character of 'A' in C can occupy exact one byte (7bits + an unused sign bit) and it has exact same binary value as 65 in char integer. But the bitwise interpretation of numbers and characters are not always the same. Just consider that you can store signed values in 8bits. This question: what is an unsigned char

How to convert alphabet to binary?

How to convert alphabet to binary? I search on Google and it says that first convert alphabet to its ASCII numeric value and than convert the numeric value to binary. Is there any other way to convert ?
And if that's the only way than is the binary value of "A" and 65 are same?
BECAUSE ASCII vale of 'A'=65 and when converted to binary its 01000001
AND 65 =01000001
That is indeed the way which text is converted to binary.
And to answer your second question, yes it is true that the binary value of A and 65 are the same. If you are wondering how CPU distinguishes between "A" and "65" in that case, you should know that it doesn't. It is up to your operating system and program to distinguish how to treat the data at hand. For instance, say your memory looked like the following starting at 0 on the left and incrementing right:
00000001 00001111 000000001 01100110
This binary data could mean anything, and only has a meaning in the context of whatever program it is in. In a given program, you could have it be read as:
1. An integer, in which case you'll get one number.
2. Character data, in which case you'll output 4 ASCII characters.
In short, binary is read by CPUs, which do not understand the context of anything and simply execute whatever they are given. It is up to your program/OS to specify instructions in order for data to be handled properly.
Thus, converting the alphabet to binary is dependent on the program in which you are doing so, and outside the context of a program/OS converting the alphabet to binary is really the exact same thing as converting a sequence of numbers to binary, as far as a CPU is concerned.
Number 65 in decimal is 0100 0001 in binary and it refers to letter A in binary alphabet table (ASCII) https://www.bin-dec-hex.com/binary-alphabet-the-alphabet-letters-in-binary. The easiest way to convert alphabet to binary is to use some online converter or you can do it manually with binary alphabet table.