In what situations is octal base used? - language-agnostic

I've seen binary and hex used quite often but never octal. Yet octal has its own convention for being used in some languages (i.e., a leading 0 indicating octal base). When is octal used? What are some typical situations when one would use octal or octal would be easier to reason about? Or is it merely a matter of taste?

Octal is used when the number of bits in one word is a multiple of 3, or if the grouping of the bits makes sense to notate in groups of 3. Examples are
ancient systems with 18-bit word sizes (mostly historical)
systems with 9-bit bytes (mostly historical)
Unix file permissions with 9 bits (3*3 bits, "rwxr-x---" is 0750)
Unix file permissions with 12 bits (the same as the 9-bit version but adding three bits in front for setuid, setgid, and sticky, e.g. 01777, but the letters are more complicated here)
I have not encountered any uses of octal other than Unix file permission bits during my roughly 25 years in IT.
If the number of bits in your word is a multiple of 4, however, please do use hex, by all means.

Octal is used as a shorthand for representing file permissions on UNIX systems. For example, file mode rwxr-xr-x would be 0755.
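To make the grouping concrete, here is a minimal sketch of how each rwx triple maps to one octal digit. The function name mode_to_octal is made up for illustration, and it only covers the plain 9-bit case, not setuid, setgid, or sticky.

```python
def mode_to_octal(perms: str) -> str:
    """Convert a 9-character string such as 'rwxr-xr-x' to octal text like '0755'.

    Hypothetical helper for illustration; handles only the 9 basic bits.
    """
    if len(perms) != 9:
        raise ValueError("expected 9 permission characters")
    digits = []
    # Walk the string in groups of three: user, group, other.
    for i in range(0, 9, 3):
        value = 0
        for ch, flag, weight in zip(perms[i:i + 3], "rwx", (4, 2, 1)):
            if ch == flag:
                value += weight          # r=4, w=2, x=1
            elif ch != "-":
                raise ValueError(f"unexpected character {ch!r}")
        digits.append(str(value))
    return "0" + "".join(digits)


print(mode_to_octal("rwxr-xr-x"))  # 0755
print(mode_to_octal("rwxr-x---"))  # 0750
```

Each group of three permission bits is exactly one octal digit, which is why the notation has stuck around for chmod.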

Octal is used when the syntax is a relic from the ages when it perhaps made sense on some platform (system words haven't always been a multiple of 8 bits). Nowadays hex is the thing to use.

Didn't think of this, but digital displays!
Several other uses from:
http://en.wikipedia.org/wiki/Octal

One of the main reasons octal used to be more frequently used was that it is easier to convert between octal and binary in your head than hex to binary: you only have to remember the binary representation of the 8 octal digits (0-7).
Back in the days when debugging meant reading register contents from a row of LEDs, or entering data with an array of toggle switches, this was a big concern. The panels on many of these early computers grouped the LEDs and switches in groups of threes to facilitate this.
However, hex began to win out as word sizes that are multiples of 8-bit bytes became dominant, and the need to read and enter data in binary became unnecessary (with console text UIs and later GUI debuggers).
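The "only eight patterns to remember" point can be sketched roughly like this. The helper names are made up for illustration; the idea is just that every octal digit expands to a fixed group of three bits, like the grouped switches on those old panels.

```python
# The eight 3-bit patterns: '0' -> '000', '1' -> '001', ..., '7' -> '111'.
OCTAL_TO_BITS = {str(d): format(d, "03b") for d in range(8)}


def octal_to_binary(octal: str) -> str:
    """Expand each octal digit to its 3-bit group (hypothetical helper)."""
    return " ".join(OCTAL_TO_BITS[d] for d in octal)


def binary_to_octal(bits: str) -> str:
    """Read a bit string in groups of three, starting from the right."""
    bits = bits.replace(" ", "")
    padded_length = -(-len(bits) // 3) * 3   # round up to a multiple of 3
    bits = bits.zfill(padded_length)
    return "".join(str(int(bits[i:i + 3], 2)) for i in range(0, len(bits), 3))


print(octal_to_binary("755"))        # 111 101 101
print(binary_to_octal("111101101"))  # 755
```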

If birds could count, my guess would be that they use octal. While most birds have 3 digits on their feathered "hands", most are tetradactyl, meaning 4 toes on each foot.

In avionics, ARINC 429 word labels are almost always expressed in octal.

Music, as long as you stay away from (most) sharps and flats.

FYI, there are a few places where Windows and JavaScript automatically decide that a number prefixed with a zero is octal and convert the number.
In Windows, if you ping an address like 10.0.2.010, it will actually ping 10.0.2.8.
Windows also does this if you enter it as the IP/DNS address for the computer.
Though it is deprecated, JavaScript does this by default in some functions like parseInt if you do not specify a radix: http://www.w3schools.com/jsref/jsref_parseint.asp
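For what it's worth, the rule behind the ping surprise can be imitated in a few lines. This is only a sketch of the C-style prefix convention (0x means hex, a bare leading 0 means octal); it is not how Windows is actually implemented, and both function names are invented for the example.

```python
def parse_c_style(part: str) -> int:
    """Hypothetical helper: apply the C-style prefix rule to one number."""
    if part.lower().startswith("0x"):
        return int(part, 16)          # 0x prefix: hexadecimal
    if len(part) > 1 and part.startswith("0"):
        return int(part, 8)           # bare leading zero: octal
    return int(part, 10)              # otherwise: decimal


def normalize_ipv4(address: str) -> str:
    """Rewrite each dotted part in plain decimal (illustration only)."""
    return ".".join(str(parse_c_style(p)) for p in address.split("."))


print(normalize_ipv4("10.0.2.010"))  # 10.0.2.8
```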

Related

Representation of numbers in the computer

In the representation of inputs in the computer, are numbers taken as characters and encoded with ASCII, or are they converted directly to binary? In other words: when is my input considered an integer and not a character?
Both are possible, and it depends on the application. In other words the software programmer decides. In general, binary representation is more efficient in terms of storage requirements and processing speed. Therefore binary representation is more usual, but there are good examples when it is better to keep numbers as strings:
to avoid problems with conversions
phone numbers
when no adequate binary representation is available (e.g. 100 digits of pi)
numbers where no processing takes place
to be continued ...
The most basic building block of electronic data is a bit. It can have only 2 values, 0 and 1. Other data structures are built from collections of bits, such as an 8-bit byte, or a 32-bit float.
When a collection of bits needs to be used to represent a character, a certain encoding is used to give lexical meaning to these bits, such as ASCII, UTF-8 and others.
When you want to display character information to the screen, you use a graphical layer to draw pixels representing the "character" (collection of bits with matching encoding) to the screen.
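As a small illustration of the difference, here is the same input kept as text versus converted to a binary integer. The value 1234 and the 2-byte width are arbitrary choices for the example.

```python
text = "1234"

# Kept as characters: one ASCII code per digit.
as_ascii = text.encode("ascii")
print(list(as_ascii))        # [49, 50, 51, 52]

# Converted to an integer: stored in binary, here packed into 2 bytes.
number = int(text)
as_binary = number.to_bytes(2, "big")
print(list(as_binary))       # [4, 210]
print(format(number, "b"))   # 10011010010
```

The text form needs four bytes and cannot be added or compared numerically without conversion; the binary form is smaller and is what the arithmetic hardware works on.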

Why HTML decimal and HTML hex?

I have tried to Google quite a while now for an answer why HTML entities can be compiled either in HTML decimal or HTML hex. So my questions are:
What is the difference between HTML decimal and HTML hex?
Why are there two systems to do the same thing?
Originally, HTML was nominally based on SGML, which has decimal character references only. Later, the hexadecimal alternative was added in HTML 4.01 (and soon implemented in browsers), then retrofitted into SGML in the Web Adaptations Annex.
The apparent main reason for adding the hexadecimal alternative was that all modern character code and encoding standards, such as Unicode, use hexadecimal notation for the code numbers of characters. The ability to refer to a character by its Unicode number, written in the conventional hexadecimal notation, just prefixed with &#x and suffixed with ;, helps to avoid errors that may arise if people convert from hexadecimal to decimal notation.
There are four radixes used in computer technologies:
Binary, radix 2, because ultimately integers are arrays of switches, each of which may be on (1) or off (0).
Octal, radix 8, because each digit represents exactly 3 bits, so it's easy to convert to binary.
Decimal, radix 10, because humans have 10 fingers and because we grew up using this radix.
Hexadecimal, radix 16, because like octal it's easy to convert to bits, but even better because 2 hex digits represent exactly 1 byte. If, for example, you see an rgba value given in hex as 0x00ff00ff, you can see instantly that it represents opaque green.
So, to answer the question posed, for some of us hex is the natural way to express integers as it gives more insight into the storage. For others it's decimal. To each his or her own!
Finishing with an HTML example: could &#65536; be a single 16-bit UTF-16 character? In hex it's easy to see that the answer is no, because it's the same as &#x10000;, which needs more than 16 bits.
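A quick way to check that the decimal and hex forms name the same character, sketched with Python's standard html module (the variable names are just for the example):

```python
import html

decimal_ref = "&#65536;"
hex_ref = "&#x10000;"

# Both references decode to the same character, U+10000.
char = html.unescape(hex_ref)
print(html.unescape(decimal_ref) == char)   # True

code_point = ord(char)
print(hex(code_point))                      # 0x10000
print(code_point > 0xFFFF)                  # True: does not fit in 16 bits

# UTF-16 therefore needs two 16-bit units (4 bytes) for it.
print(len(char.encode("utf-16-be")))        # 4
```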

Why is it useful to know how to convert between numeric bases?

We are learning about converting Binary to Decimal (and vice-versa) as well as other base-conversion methods, but I don't understand the necessity of this knowledge.
Are there any real-world uses for converting numbers between different bases?
When dealing with Unicode escape codes: '\u2014' in JavaScript is &#8212; in HTML
When debugging: many debuggers show all numbers in hex
When writing bitmasks: it's more convenient to specify powers of two in hex (or by writing 1 << 4)
In this article I describe a concrete use case. In short, suppose you have a series of bytes you want to transfer using some transport mechanism, but you cannot simply pass the payload as bytes, because you are not able to send binary content. Let's say you can only use 64 characters for encoding the payload. A solution to this problem is to convert the bytes (8-bit characters) into 6-bit characters. Here the number conversion comes into play. Consider the series of bytes as a big number whose base is 256. Then convert it into a number with base 64 and you are done. Each digit of the new base 64 number now denotes a character of your encoded payload...
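The "big number in base 256, rewritten in base 64" idea can be sketched as below. This is not the code from the linked article; it assumes the standard Base64 alphabet and, to sidestep padding, only accepts payloads whose length is a multiple of 3. The function name is made up.

```python
import base64

# Standard Base64 alphabet (assumed here; the article may use another one).
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def to_base64_digits(payload: bytes) -> str:
    """Treat the bytes as one base-256 number and rewrite it in base 64."""
    if len(payload) % 3 != 0:
        raise ValueError("this sketch only handles lengths that are a multiple of 3")
    number = int.from_bytes(payload, "big")   # base-256 digits -> one integer
    n_digits = len(payload) * 8 // 6          # every 3 bytes give 4 base-64 digits
    digits = []
    for _ in range(n_digits):                 # fixed count keeps leading zeros
        number, remainder = divmod(number, 64)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


print(to_base64_digits(b"Man"))               # TWFu
print(base64.b64encode(b"Man").decode())      # TWFu
```

Real Base64 works on bit groups and pads the tail, so the two only coincide when no padding is needed.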
If you have a device, such as a hard drive, that can only have a set number of states, you can only count in a number system with that many states.
Because a computer's bit only has two states, on and off, you can only represent 0 and 1. Therefore a base-2 system is used.
If you had a device that had 3 states, you could represent 0, 1 and 2, and therefore count in a base-3 system.

Should implicit octal encoding be removed or changed in programming languages?

I was looking at this question. Basically having a leading zero causes the number to be interpreted as octal. I've run into this problem numerous times in multiple languages.
Why doesn't the language explicitly require you to specify octal with a function call or a type (in strongly typed languages) like:
oct variable = 2;
I can understand why hexadecimal (0x0234) has this format. Hex is pretty useful. An integer from the database will never have an x in it.
But octal numbers 0123 look like ints and are a pain to deal with. I've never used octal for anything.
Can anyone explain the rationale behind this usage? Is it just a bit of historical cruft?
It's largely historic. The best solution I've seen is in the new version of Python, where octal is indicated with a special prefix character "o", much like hexadecimal's "x" prefix:
0o10 == 0x8 == 8
99.9% of the reason it exists is to support chmod() calls, i.e. chmod(path, 0755).
It does rather seem like a format more like hex's would be superior.
It exists since working with 3-bit segments is almost as useful as working with 4-bit segments. This was more true in the past (e.g., seven-segment LEDs, chmod, etc.).
The real question is why haven't more languages adopted octal and binary notations in a more regular fashion:
10 == 0b1010 == 0o12 == 0x0A
I know that Python finally adopted the 0o notation... not sure if they have adopted the binary one as well. I guess a better question is: why does this still trip people up?
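For reference, the regular prefix scheme looks like this in Python 3, which accepts 0b, 0o and 0x and rejects a bare leading zero when asked to infer the base. A small sketch:

```python
# All four literals denote the same value.
values = [10, 0b1010, 0o12, 0x0A]
print(all(v == 10 for v in values))          # True

# The built-ins produce the same prefixes when formatting.
print(bin(10), oct(10), hex(10))             # 0b1010 0o12 0xa

# With base 0, int() infers the base from the prefix...
print(int("0b1010", 0), int("0o12", 0), int("0x0A", 0))   # 10 10 10

# ...and refuses an ambiguous bare leading zero instead of guessing octal.
try:
    int("012", 0)
    leading_zero_rejected = False
except ValueError:
    leading_zero_rejected = True
print(leading_zero_rejected)                 # True
```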
I hate this too, I don't know why it's been carried forward into so many modern languages. I once knew someone who had a zip code like "09827" when he lived in NYC. Sometimes he had to input his zip code as "9827," because the leading zero would lead to error messages (since 9's and 8's are illegal characters in octal numbers).
Yes, it's historical. C uses this way to specify literals in octal, and possibly it was used somewhere before that.
I've experienced it in JavaScript, where parsing dates stops working in August. Up to July it works, as '07' parsed as octal is still seven, but '08' is not a valid number... (The solution is to specify the number base in the parseInt call.)
In C# there are no binary or octal literals; perhaps the reasoning is that you shouldn't do so much bit fiddling that the language needs them...
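The August problem is easy to reproduce in any language once the base is forced to 8. Here is a rough Python analogue (not JavaScript's actual parseInt; parse_month is a made-up helper) showing why passing the base explicitly fixes it:

```python
def parse_month(text: str, base: int):
    """Parse a zero-padded month in the given base; None if it is not valid."""
    try:
        return int(text, base)
    except ValueError:
        return None


for month in ("07", "08", "09", "10"):
    # Left column: read as octal. Right column: read as decimal.
    print(month, parse_month(month, 8), parse_month(month, 10))
# 07 7 7
# 08 None 8
# 09 None 9
# 10 8 10
```

Note the last row: "10" read as octal silently becomes 8, which is arguably worse than an outright failure.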
Personally, I blame the programmer in this case. Why are you formatting an integer by zero padding? Zero padding is for strings, not numeric types.

Why do most languages not allow binary numbers?

Why do most computer programming languages not allow binary numbers to be used like decimal or hexadecimal?
In VB.NET you could write a hexadecimal number like &H4
In C you could write a hexadecimal number like 0x04
Why not allow binary numbers?
&B010101
0y1010
Bonus Points!... What languages do allow binary numbers?
Edit
Wow! - So the majority think it's because of brevity and poor old "waves" thinks it's due to the technical aspects of the binary representation.
Because hexadecimal (and rarely octal) literals are more compact and people using them usually can convert between hexadecimal and binary faster than deciphering a binary number.
Python 2.6+ allows binary literals, and so do Ruby and Java 7, where you can use the underscore to make byte boundaries obvious. For example, the hexadecimal value 0x1b2a can now be written as 0b00011011_00101010.
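In C++0x, user-defined literals will make binary numbers possible. I'm not sure whether a binary literal will be part of the standard, but at worst you'll be able to enable it yourself:
constexpr unsigned long long operator""_B(unsigned long long d) // reads the decimal digits as bits
{ return d == 0 ? 0 : (d % 10) + 2 * operator""_B(d / 10); }
static_assert(1010_B == 10, "1010 in binary is ten");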
In order for a bit representation to be meaningful, you need to know how to interpret it.
You would need to specify what type of binary number you're using (signed/unsigned, two's complement, ones' complement, sign-magnitude).
The only languages I've ever used that properly support binary numbers are hardware description languages (Verilog, VHDL, and the like). They all have strict (and often confusing) definitions of how numbers entered in binary are treated.
See perldoc perlnumber:
NAME
perlnumber - semantics of numbers and numeric operations in Perl
SYNOPSIS
$n = 1234; # decimal integer
$n = 0b1110011; # binary integer
$n = 01234; # octal integer
$n = 0x1234; # hexadecimal integer
$n = 12.34e-56; # exponential notation
$n = "-12.34e56"; # number specified as a string
$n = "1234"; # number specified as a string
Slightly off-topic, but newer versions of GCC added a C extension that allows binary literals. So if you only ever compile with GCC, you can use them. Documentation is here.
Common Lisp allows binary numbers, using #b... (bits going from highest-to-lowest power of 2). Most of the time, it's at least as convenient to use hexadecimal numbers, though (by using #x...), as it's fairly easy to convert between hexadecimal and binary numbers in your head.
Hex and octal are just shorter ways to write binary. Would you really want a 64-character long constant defined in your code?
Common wisdom holds that long strings of binary digits, e.g. 32 bits for an int, are too difficult for people to conveniently parse and manipulate. Hex is generally considered easier, though I've not used either enough to have developed a preference.
Ruby, as already mentioned, attempts to resolve this by allowing _ to be liberally inserted in the literal, allowing, for example:
irb(main):005:0> 1111_0111_1111_1111_0011_1100
=> 111101111111111100111100
D supports binary literals using the syntax 0[bB][01]+, e.g. 0b1001. It also allows embedded _ characters in numeric literals to allow them to be read more easily.
Java 7 now has support for binary literals. So you can simply write 0b110101. There is not much documentation on this feature. The only reference I could find is here.
While C only has native support for 8, 10 or 16 as base, it is actually not that hard to write a preprocessor macro that makes writing 8-bit binary numbers quite simple and readable:
#define BIN(d7,d6,d5,d4, d3,d2,d1,d0) \
( \
((d7)<<7) + ((d6)<<6) + ((d5)<<5) + ((d4)<<4) + \
((d3)<<3) + ((d2)<<2) + ((d1)<<1) + ((d0)<<0) \
)
int my_mask = BIN(1,1,1,0, 0,0,0,0);
This can also be used for C++.
For the record, and to answer this:
Bonus Points!... What languages do allow binary numbers?
Specman (aka e) allows binary numbers. Though to be honest, it's not quite a general purpose language.
Every language should support binary literals. I go nuts not having them!
Bonus Points!... What languages do allow binary numbers?
Icon allows literals in any base from 2 to 16, and possibly up to 36 (my memory grows dim).
It seems that from a readability and usability standpoint, the hex representation is a better way of defining binary numbers. The fact that they don't add it is probably more a matter of user need than a technology limitation.
I expect that the language designers just didn't see enough of a need to add binary numbers. The average coder can parse hex just as well as binary when handling flags or bit masks. It's great that some languages support binary as a representation, but I think on average it would be little used. Although binary -- if available in C, C++, Java, C#, would probably be used more than octal!
In Smalltalk it's like 2r1010. You can use any base up to 36 or so.
Hex is just less verbose, and can express anything a binary number can.
Ruby has nice support for binary numbers, if you really want it. 0b11011, etc.
In Pop-11 you can use a prefix made of number (2 to 32) + colon to indicate the base, e.g.
2:11111111 = 255
3:11111111 = 3280
16:11111111 = 286331153
31:11111111 = 28429701248
32:11111111 = 35468117025
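The figures above are easy to cross-check in a language that parses strings in an arbitrary base. A sketch using Python's built-in int(), which accepts bases 2 to 36 (the helper name is made up):

```python
def all_ones(base: int, digits: int = 8) -> int:
    """Value of a run of 1-digits (e.g. 11111111) read in the given base."""
    return int("1" * digits, base)


for base in (2, 3, 16, 31, 32):
    print(f"{base}:11111111 = {all_ones(base)}")
# 2:11111111 = 255
# 3:11111111 = 3280
# 16:11111111 = 286331153
# 31:11111111 = 28429701248
# 32:11111111 = 35468117025
```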
Forth has always allowed numbers of any base to be used (up to size limit of the CPU of course). Want to use binary: 2 BASE ! octal: 8 BASE ! etc. Want to work with time? 60 BASE ! These examples are all entered from base set to 10 decimal. To change base you must represent the base desired from the current number base. If in binary and you want to switch back to decimal then 1010 BASE ! will work. Most Forth implementations have 'words' to shift to common bases, e.g. DECIMAL, HEX, OCTAL, and BINARY.
Although it's not direct, most languages can also parse a string. Java can convert "10101000" into an int with a method.
Not that this is efficient or anything... Just saying it's there. If it were done in a static initialization block, it might even be done at compile time depending on the compiler.
If you're any good at binary, even with a short number it's pretty straightforward to see 0x3c as 4 ones followed by 2 zeros, whereas even that short a number in binary would be 0b111100, which might make your eyes hurt before you were certain of the number of ones.
0xff9f is exactly 4+4+1 ones, 2 zeros and 5 ones (on sight the bitmask is obvious). Trying to count out 0b1111111110011111 is much more irritating.
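The mental trick being described is just expanding each hex digit into its 4-bit nibble. A minimal sketch (hex_to_bits is an invented name):

```python
def hex_to_bits(hex_digits: str) -> str:
    """Expand each hex digit into its 4-bit group, separated by spaces."""
    return " ".join(format(int(d, 16), "04b") for d in hex_digits)


print(hex_to_bits("3c"))     # 0011 1100
print(hex_to_bits("ff9f"))   # 1111 1111 1001 1111

# The long binary literal from the text really is the same number.
print(0b1111111110011111 == 0xff9f)   # True
```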
I think the issue may be that language designers are always heavily invested in hex/octal/binary/whatever and just think this way. If you are less experienced, I can totally see how these conversions wouldn't be as obvious.
Hey, that reminds me of something I came up with while thinking about base conversions. A sequence--I didn't think anyone could figure out the "Next Number", but one guy actually did, so it is solvable. Give it a try:
10
11
12
13
14
15
16
21
23
31
111
?
Edit:
By the way, this sequence can be created by feeding sequential numbers into a single built-in function in most languages (Java for sure).