Unsigned Binary

Introduction

At the very least, most people want to represent integers on computers. There are two issues, however. First, an integer can be arbitrarily large. That is, an integer can be infinite in size. Computers have finite storage. Thus, you must represent integers using finite number of bits, which forces an arbitrary limit to how big an integer you can represent.

Second, even though memory is now quite cheap, that doesn't mean we should use a large number of bits on integers. In particular, if you use too many bits, then it's cumbersome to store large numbers of integers. Imagine if you have a million integers (a very large database, for example). If each integer requires 1000 bytes of memory, that's one billion bytes of memory.

You want the number of bits per integer to allow you to represent a wide range of usuable numbers, but in a few number of bytes.

Historically, memory was a big consideration. Thus, integers used to be stored in 16 bits. 16 bits allows for 65,536 different bit patterns (usually called 64K), thus up to 65,536 different representable values.

64K isn't a lot of values, but memory was a consideration. It wasn't until about the 1990s when 32 bit CPUs, and therefore 32 bit int values, were much more commonly seen. This allows values up to about 4 billion bitstring patterns. That's still not very large, but is more reasonable.

The number of bytes used in an int has often been a function of memory and CPU. The more bits you need, the more pins need to appear on a CPU, and the more memory you need to store a large number of ints.

Int vs. Integers

I will often refer to integers stored in computers are int. That's to avoid confusion with mathematical integers which can be arbitarily large. It's implied, from using int, that the values have a maximum size.

Unsigned Int

It's easier to discussed unsigned int, i.e., representations of ints that are non-negative (0 and greater).

Let's assume that a typical unsigned int uses 32 bits of memory. We define the following terms, to be used throughout the discussion of signed and unsigned ints.

Chart

Number of Bits Number of Values Min Value Max Value
16 216 0 216 - 1
32 232 0 232 - 1
N 2N 0 2N - 1

You'll notice that the minimum value for unsigned int is always 0. Thus, the max value is 2N - 1 for N bits.

There's two ways to convince yourself 2N - 1 is the correct maximum value.

First Way: Indices of Array

First, you need to know that given N bits, there are 2N different bitstring patterns. Imagine an array of that size. Its minimum index is 0. Its maximum index is the size of the array minus 1, which is 2N - 1.

Second Way: Binary Odometer

The other way to view this is to think about what 2N looks like in binary. It looks like 1, followed by N zeroes. What happens when you subtract one from 1 followed by N zeroes?

It may be easier to answer this question in base 10, first. Imagine you have an odometer that reads 1000. That's 1 followed by 3 zeroes. Subtract 1 from 1000, and you get 999, which is three 9's. Now, Suppose you have 1,000,000 (1 followed by 6 zeroes). Subtract 1 from that and you get 999,999 (six 9's).

What's the general pattern for subtracting 1 from 1 followed by N zeroes? It is N 9's.

Similarly, subtract 1 from 1 followed by 2N is N 1's, Thus 2N - 1 is the formula for N 1's.

Conclusion

It's often a good idea to see things presented in two different ways. In this case, you saw two explanation to understand why the largest value in for N bits, unsigned in 2N - 1.

Why Unsigned?

It will be easier to see why you use unsigned, once we discussed signed int, but one obvious reason to use unsigned int is when you want to declare variables which will never be negative. For example, a particular quantity (e.g., how many sodas you drank), or the index of an array.

Even though these quantities are never negative, sometimes you want to have negative numbers for error conditions. For example, suppose you want to write a function that determines where a certain character appears in a character array. It returns the index of that character.

Suppose the character isn't there. What value should you return? You can't return -1 because you can't represent negative values for variables with type, unsigned int.

One possibility is to return the array size, since the array size is larger than all valid indices of an array. However, the problem with that approach is that this number varies from array to array. -1 is nice because you can use it regardless of the size of an array.

A possible solution is to use a second variable that is set to 0 if not found and 1, if found.

MAXINT

Sometimes C defines a constant called MAXINT. This constant is the largest possible value for unsigned int (or sometimes signed int).

This value can sometimes be used as an error value. However, I'm not sure how standard it is on C compilers.