2011-10-29

Binary, octal, and hexadecimal

Why do computer programmers use binary, octal, and hexadecimal codes?  Binary is good because a computer becomes simpler and more efficient when working in binary, and octal and hexadecimal make it easier for a human to relate to binary.  Why not decimal?  I'll write another post soon to explain the relationship between binary and decimal, which is less natural than the one between binary and octal/hex.


One of the fundamental problems in computing is representation: making it possible to input data into a computing system and have the system process, store, and output that data in a consistent manner.  In other words, the representation must never distort the data content or enter a state that doesn't translate back to valid data.

Humans typically use decimal representation for numerical data.  This means that we use a combination of the digits 0--9 to represent every natural number.  This allows us to store numerical data by writing digits on paper, and to process it by using computing procedures such as short division.

It is quite possible to build a machine that works with decimal representation, but a machine can work with any kind of representation.  This is fortunate, because the more distinct digits a machine has to tell apart, the more complex it becomes.  The smallest possible number of digits in a computing system is 2 (0 and 1), as in a binary digital computer.

Many early computers (ca. 1940s) were binary and handled data as sets of off/on, high/low, or punched hole/no hole signals translating to 0 and 1.  You could input data e.g. by setting an array of switches to either on or off, and the computer could output results e.g. by setting indicator lights to either on or off.  Internally, the computer could use either high or low voltage for digit transfer.  This was a simple and efficient system for handling data in a machine, and modern computers still use it even though it's mostly invisible to the user today.

However, binary digits ("bits") quickly become hard to work with when the size of the data gets big.  00101110 is manageable, but bit strings of 12, 24 or 36 bits (the word sizes of some 1960s computers) are a pain to input manually or to read on a binary display.


The solution was to split the bit string into groups of three bits and read each group as a digit from 0--7 (see http://en.wikipedia.org/wiki/Octal).  Pressing the keys 6, 5, 6, and 4 on an octal keyboard enters the bit groups 110, 101, 110, and 100 into the processor, filling a 12-bit word with the bit string 110101110100.  The same bit string can be sent, three bits at a time, to a four-digit 7-segment LED display to show the octal digits 6564.


This is essentially a machine using an octal interface (input and output) and binary internal data handling.
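The keyboard-to-word-to-display round trip can be sketched in a few lines of Python.  This is only an illustration of the idea, not a model of any particular machine; the function names keys_to_word and word_to_octal_digits are made up for this example.

```python
def keys_to_word(octal_keys):
    """Pack octal key presses into one binary word, three bits per key."""
    word = 0
    for key in octal_keys:
        if not 0 <= key <= 7:
            raise ValueError(f"not an octal digit: {key}")
        word = (word << 3) | key  # shift in the next three-bit group
    return word


def word_to_octal_digits(word, digits=4):
    """Split a word into octal digits, most significant digit first."""
    return [(word >> (3 * i)) & 0b111 for i in reversed(range(digits))]


word = keys_to_word([6, 5, 6, 4])
print(format(word, "012b"))        # 110101110100
print(word_to_octal_digits(word))  # [6, 5, 6, 4]
```

No arithmetic beyond shifting is needed in either direction, which is why such an interface could be built with very little hardware.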

By this time, computer users already used QWERTY keyboards for input and character display screens (or text printers) for output, but for programmers and system technicians octal code was still indispensable, as it made it possible to get closer to the inner workings of the machine.

As word sizes of 16, 32, and eventually 64 bits became the norm, words could no longer be divided evenly into three-bit groups, and a different kind of coding, hexadecimal, took over.  Hex code divides the bit string into four-bit groups, and replaces each possible bit pattern with a digit 0--F (0--9, A, B, C, D, E, F).
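To see why four-bit groups fit these word sizes and three-bit groups don't, here is a small Python sketch.  The helper names to_groups and bits_to_hex and the sample 16-bit word are arbitrary choices for illustration.

```python
def to_groups(bits, size):
    """Split a bit string into groups of `size` bits, left to right."""
    return [bits[i:i + size] for i in range(0, len(bits), size)]


def bits_to_hex(bits):
    """Replace each four-bit group with its hex digit."""
    if len(bits) % 4 != 0:
        raise ValueError("bit string length must be a multiple of 4")
    return "".join("0123456789ABCDEF"[int(group, 2)]
                   for group in to_groups(bits, 4))


word16 = "1101011101001011"   # an arbitrary 16-bit example word
print(to_groups(word16, 4))   # ['1101', '0111', '0100', '1011']
print(bits_to_hex(word16))    # D74B
print(to_groups(word16, 3))   # the last group is a lone '1'
```

A 16-bit word splits into exactly four hex digits, while grouping by three leaves one bit over at the end.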


Hex codes became as dear to programmers and technicians as octal had been, and hobbyists working with single-board computers are often constrained to use hex code unless they add special hardware for decimal/character input devices and corresponding displays.

The beauty of octal and hexadecimal codes, compared to decimal, is that they can map directly and exhaustively to binary.

Octal mapping

    octal   binary
      0  ←–→  000
      1  ←–→  001
      2  ←–→  010
      3  ←–→  011
      4  ←–→  100
      5  ←–→  101
      6  ←–→  110
      7  ←–→  111

Hexadecimal mapping

    hexadecimal   binary
         0     ←–→ 0000
         1     ←–→ 0001
         2     ←–→ 0010
         3     ←–→ 0011
         4     ←–→ 0100
         5     ←–→ 0101
         6     ←–→ 0110
         7     ←–→ 0111
         8     ←–→ 1000
         9     ←–→ 1001
         A     ←–→ 1010
         B     ←–→ 1011
         C     ←–→ 1100
         D     ←–→ 1101
         E     ←–→ 1110
         F     ←–→ 1111
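Both tables can be generated mechanically, and converting to binary is then nothing more than looking up each digit and joining the results.  The following Python sketch assumes that reading of the tables; mapping and expand are names invented for the example.

```python
def mapping(bits_per_digit):
    """Build the digit -> bit-pattern table for base 2**bits_per_digit."""
    digits = "0123456789ABCDEF"
    return {digits[value]: format(value, f"0{bits_per_digit}b")
            for value in range(2 ** bits_per_digit)}


OCTAL = mapping(3)  # same content as the octal table above
HEX = mapping(4)    # same content as the hexadecimal table above


def expand(number, table):
    """Convert digit by digit: one lookup per digit, no arithmetic."""
    return "".join(table[digit] for digit in number.upper())


print(expand("6564", OCTAL))  # 110101110100
print(expand("D74", HEX))     # 110101110100

# "Exhaustively": every bit pattern of the group size is used exactly once.
print(len(set(OCTAL.values())), len(set(HEX.values())))  # 8 16
```

The octal number 6564 and the hex number D74 expand to the same 12-bit string; only the grouping differs.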
