Data representation using binary

Vocab

A bit is a single binary digit, with a value of 0 or 1.

A byte is a group of 8 bits.

Binary (base 2) represents data using sequences of bits. Base 2 refers to each digit (bit) having exactly 2 possible values (0 or 1).

Decimal (base 10) refers to our usual system for representing numbers. Base 10 refers to each digit having exactly 10 possible values (0, 1, 2, 3, 4, 5, 6, 7, 8, 9).

Hexadecimal (base 16) represents data using sequences of hexadecimal digits. Base 16 refers to each digit having exactly 16 possible values. The values 0 - 9 are represented as in base 10. The values 10 - 15 are represented by the letters A - F.

Unicode is a scheme for representing text (letters, digits, and other symbols) as numbers. ASCII is a subset of Unicode that includes the uppercase and lowercase letters of the English alphabet. For example, 'A' is represented as 65, 'B' as 66, 'a' as 97, and 'b' as 98.
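The mapping between symbols and numbers can be checked with a short Java sketch. The class and method names (`CharCodeSketch`, `codeOf`, `symbolOf`) are made up for illustration; only the casts between `char` and `int` are standard Java.

```java
class CharCodeSketch {
    // Returns the number used to represent the given symbol.
    static int codeOf(char symbol) {
        return (int) symbol;
    }

    // Returns the symbol represented by the given number.
    // Assumes the number fits in a Java char (0 to 65535).
    static char symbolOf(int code) {
        return (char) code;
    }

    public static void main(String[] args) {
        System.out.println(codeOf('A'));  // 65
        System.out.println(codeOf('a'));  // 97
        System.out.println(symbolOf(66)); // B
        System.out.println(symbolOf(98)); // b
    }
}
```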

A signed integer is a whole number that can be either negative or non-negative (0 or positive).

An unsigned integer is a whole number that is assumed to be non-negative (0 or positive).

Base conversion

From base 10 (decimal)

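Repeatedly divide the base 10 number by the desired base. Keep both the quotient and the remainder, and use the quotient as the number to divide in the next step. Stop when the quotient is 0. Read the remainders from bottom to top (the last remainder is the leftmost digit of the result).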

What is `233` (base 10) in base 2?

233 / 2 = 116 R1
116 / 2 =  58 R0
 58 / 2 =  29 R0
 29 / 2 =  14 R1
 14 / 2 =   7 R0
  7 / 2 =   3 R1
  3 / 2 =   1 R1
  1 / 2 =   0 R1

233 in base 10 is 11101001 in base 2.

What is `702` (base 10) in base 16?

When working with hexadecimal (base 16), writing out the digits above 9 is often helpful.

10: A
11: B
12: C
13: D
14: E
15: F

702 / 16 = 43 R14 (E)
 43 / 16 =  2 R11 (B)
  2 / 16 =  0 R2

702 in base 10 is 2BE in base 16.
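The repeated division procedure above can be sketched in Java as follows. This is an illustrative sketch, not a library routine: the class name `BaseConversionSketch`, the method `toBase`, and the limit to bases 2 through 16 are assumptions made for this example.

```java
class BaseConversionSketch {
    // Digits for bases up to 16; the values 10 - 15 map to A - F.
    private static final String DIGITS = "0123456789ABCDEF";

    // Converts a non-negative base 10 number to the given base (2 to 16)
    // by repeated division, collecting the remainders.
    static String toBase(int number, int base) {
        if (number < 0 || base < 2 || base > 16) {
            throw new IllegalArgumentException("unsupported number or base");
        }
        if (number == 0) {
            return "0";
        }
        StringBuilder digits = new StringBuilder();
        int quotient = number;
        while (quotient > 0) {
            int remainder = quotient % base;
            // Each new remainder goes in front of the earlier ones,
            // which matches reading the remainders from bottom to top.
            digits.insert(0, DIGITS.charAt(remainder));
            quotient = quotient / base;
        }
        return digits.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBase(233, 2));  // 11101001
        System.out.println(toBase(702, 16)); // 2BE
    }
}
```

Java's built-in `Integer.toString(int, int)` performs the same kind of conversion, though it writes the digits above 9 as lowercase letters.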

To base 10

Write each digit with its place value underneath. Multiply each digit by its place value. Add the results.

What is `11101001` (base 2) in base 10?

1      1      1      0      1      0      0      1
2^7    2^6    2^5    2^4    2^3    2^2    2^1    2^0
128 +  64 +   32 +   0 +    8 +    0 +    0 +    1  = 233

11101001 in base 2 is 233 in base 10.

What is `2BE` (base 16) in base 10?

2          B          E
16^2       16^1       16^0
2 * 256 +  11 * 16 +  14 * 1
512 +      176 +      14 = 702

2BE in base 16 is 702 in base 10.
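The place value method above might look like this in Java. The names `PlaceValueSketch` and `toDecimal` are hypothetical; `Character.digit` is the standard method that turns a digit character such as 'B' into its value (11) for a given base. The sketch assumes the result fits in an `int`.

```java
class PlaceValueSketch {
    // Converts a string of digits in the given base to a base 10 value
    // by multiplying each digit by its place value and adding the results.
    static int toDecimal(String digits, int base) {
        int total = 0;
        int placeValue = 1; // base^0 for the rightmost digit
        for (int i = digits.length() - 1; i >= 0; i--) {
            int digit = Character.digit(digits.charAt(i), base);
            if (digit < 0) {
                throw new IllegalArgumentException("invalid digit: " + digits.charAt(i));
            }
            total += digit * placeValue;
            placeValue *= base; // move one place to the left
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(toDecimal("11101001", 2)); // 233
        System.out.println(toDecimal("2BE", 16));     // 702
    }
}
```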

Binary limits

Number of values that can be stored

A 1 bit binary number can store 2 different values:

0
1

A 2 bit binary number can store 4 different values:

00
01
10
11

Each additional bit doubles the number of values that can be stored.

An n bit binary number can store 2^n different values.
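As a rough check of the 2^n rule, the sketch below counts and lists every bit pattern of a given width. The class and method names are invented for this example, and the sketch assumes a small number of bits (1 to 16) so the list stays manageable.

```java
class BitPatternsSketch {
    // Number of different values that the given number of bits can store (2^bits).
    static int countValues(int bits) {
        return 1 << bits; // shifting 1 left by n places computes 2^n
    }

    // Lists every bit pattern of the given width, padded with leading zeros.
    static String[] patterns(int bits) {
        if (bits < 1 || bits > 16) {
            throw new IllegalArgumentException("bits must be from 1 to 16");
        }
        int count = countValues(bits);
        String[] result = new String[count];
        for (int value = 0; value < count; value++) {
            StringBuilder pattern = new StringBuilder(Integer.toBinaryString(value));
            while (pattern.length() < bits) {
                pattern.insert(0, '0');
            }
            result[value] = pattern.toString();
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(countValues(8)); // 256
        for (String pattern : patterns(2)) {
            System.out.println(pattern);    // 00, 01, 10, 11 on separate lines
        }
    }
}
```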

Range of numbers that can be represented

If a sequence of bits is used to store a number, the largest number that can be stored depends on how numbers are represented.

When using 4 bits to represent an unsigned integer, what is the largest number that can be represented?

4 bits can be used to represent 16 different numbers (2^4). If 0 is one of those numbers, the largest number that can be represented is 15.

When using 8 bits to represent a signed integer, what are the largest and smallest numbers that can be represented?

Signed integers use 1 bit to represent the sign (negative or non-negative). The remaining 7 bits can be used to represent 128 different numbers (2^7).

Zero is usually represented as a non-negative number, so the 128 non-negative numbers are 0 through 127. The largest number that can be represented is 127.

There is no need to represent negative zero, so the 128 negative numbers are -1 through -128. The smallest number that can be represented is -128.
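The limits worked out above follow a pattern that can be sketched in Java. The method names are hypothetical, and the signed limits assume the representation described here (no negative zero), which matches Java's own `byte` type for 8 bits. The sketch uses `long` arithmetic and assumes 1 to 62 bits.

```java
class IntegerRangeSketch {
    // Largest unsigned integer that fits in the given number of bits: 2^bits - 1.
    static long maxUnsigned(int bits) {
        return (1L << bits) - 1;
    }

    // Largest signed integer: 2^(bits - 1) - 1, since one bit is used for the sign.
    static long maxSigned(int bits) {
        return (1L << (bits - 1)) - 1;
    }

    // Smallest signed integer: -(2^(bits - 1)), since there is no negative zero.
    static long minSigned(int bits) {
        return -(1L << (bits - 1));
    }

    public static void main(String[] args) {
        System.out.println(maxUnsigned(4));  // 15
        System.out.println(maxSigned(8));    // 127
        System.out.println(minSigned(8));    // -128
        // Java's 8 bit signed type has the same limits.
        System.out.println(Byte.MAX_VALUE);  // 127
        System.out.println(Byte.MIN_VALUE);  // -128
    }
}
```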

Integer overflow

Example in Java

Any fixed (unchanging) number of bits can only represent a fixed number of values. Most programming languages represent integers using a fixed number of bits.

Java's `int` type represents signed integers using 32 bits. The largest `int` value is 2147483647 (2^(32 - 1) - 1). The smallest `int` value is -2147483648 (-(2^(32 - 1))).

The code segment below adds 1 to the largest integer that can be stored directly.

int x = 2147483647;
System.out.println(x);
x++;
System.out.println(x);

The code segment above outputs:

2147483647
-2147483648

This is known as integer overflow. Calculations that exceed the range of values that can be represented wrap around the end (or beginning) of the range, potentially more than once.
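The sketch below shows wrapping in both directions and one possible way to detect it. `Math.addExact` is a standard Java method that throws an `ArithmeticException` instead of wrapping; the class name and the helper `additionOverflows` are assumptions made for this example.

```java
class OverflowSketch {
    // Returns true if a + b falls outside the range of int.
    static boolean additionOverflows(int a, int b) {
        try {
            Math.addExact(a, b); // throws instead of wrapping around
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int max = Integer.MAX_VALUE; // 2147483647
        int min = Integer.MIN_VALUE; // -2147483648

        // Going past the end of the range wraps to the beginning.
        System.out.println(max + 1 == min); // true

        // Going past the beginning of the range wraps to the end.
        System.out.println(min - 1 == max); // true

        System.out.println(additionOverflows(max, 1)); // true
        System.out.println(additionOverflows(1, 1));   // false
    }
}
```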

Example using 4 bits to represent an unsigned integer

When used to represent an unsigned integer, the binary (base 2) number 1111 represents 15 in base 10. This is the largest unsigned integer that can be represented with 4 bits.

Adding 1 to 1111 results in 10000 in base 2 (16 in base 10). If only 4 bits are available, the rightmost 4 bits are retained and the leftmost bit is dropped. The result is 0000 in base 2 (0 in base 10).
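Java has no 4 bit type, but the dropped bit can be imitated by keeping only the rightmost 4 bits of a sum. This is a sketch under that assumption; the names `FourBitSketch` and `addFourBit` are made up for illustration.

```java
class FourBitSketch {
    // Adds two values and keeps only the rightmost 4 bits of the result,
    // imitating an unsigned integer stored in 4 bits.
    static int addFourBit(int a, int b) {
        return (a + b) & 0b1111; // the mask 1111 drops every bit to the left
    }

    public static void main(String[] args) {
        System.out.println(addFourBit(0b1111, 1)); // 0 (10000 with the leftmost bit dropped)
        System.out.println(addFourBit(0b1111, 3)); // 2 (10010 with the leftmost bit dropped)
        System.out.println(addFourBit(5, 6));      // 11 (no overflow)
    }
}
```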

Additional resources

See Working with char values for a demonstration of how Java handles Unicode. The discussion includes converting between the numeric and symbolic representations.

See Floating point roundoff error for discussion of issues with calculations involving floating point numbers (numbers with a decimal point).
