How to Convert To Base 10

I want to provide an algorithm for converting an N-digit number, written in base K, to base 10. To begin, let's define: digit.

Usually, a digit refers to a value written in base 10, between 0 and 9. Non-negative numbers written in base 10 are a sequence of one or more digits.

However, I want to use a more general definition, which allows a digit to be a single value written in base K, instead of simply in base 10.

Definition A digit is a single symbol used to represent a value in base K.
Just so you don't get confused, I will write the base of a number as a subscript. For example, 7238 is a 3-digit value written in base 8. It consists of the digits: '7', '2', and '3'.

Suppose a number has N digits. To make it easier to talk about each digit, I will subscript the digits from right to left, starting with the subscript 0. Therefore, an N digit value will look like: dN-1 dN-2 ... d1 d0.

For example, for the number 723, the digits are d2=7, d1=2, and d0=3. I'll explain why the digits are subscripted right to left rather than left to right later on.

Range of Values for a Single Digit

Think about the following question: what is the range of valid values for a single digit in base K?

To answer this, let's pick a specific base to work in. What is the valid range of digits in base 10? The valid digits range from '0' through '9' inclusive.

In binary (base 2), the valid values are 0 and 1.

Do you see a pattern? Notice that the range of the values for a single digit always fall short of the base itself. Thus, for a arbitrary base, K, the valid range of values for a single digit are between 0 and K - 1, inclusive.

That raises the question: what happens when you go past base 10. For example, what about base 11? If you write out the valid range of digits, you'd get '0', '1', ... ,'9', but what comes after '9'? Should you write '10'? That doesn't seem right, because it uses TWO digits. We'd like a way to represent the value 10 with one "symbol".

It's common to use alphabetic characters, where 'A' stands for 10, B for 11, and so forth. Using alphabetic characters only goes so far. You can go up to 'Z' and represent numbers up to base 36.

(In case you're wondering, the base is equivalent to the number of symbols used for a single digit. There are 10 digits from '0' to '9', plus 26 alphabetic characters, so 36 total symbols means you can go up to base 36).

You might wonder what happens if you have go to a higher base. Do you start making up even more symbols?

Here's where we cheat a little. You don't really have to invent even more symbols to represent other digits. What you can do is to make up a symbol like vi where i is a value written in base 10. For example, suppose you need to write some numbers in base 100. One possible way to write this is: v23 v98 v4.

We still think of this as a 3 digit value. The v's help us determine where the individual digits are.

Number of possible values for N digit number in base K

Computers store numbers using a finite number of digits. Obviously, if it were infinite, you would need an infinite amount of memory to store a number. However, it's even more restrictive. Computers typically store numbers using very few bits. Numbers are often stored using 32 or 64 bits. This means there is both a maximum and a minimum value. This also means there are a finite number of representable numbers.

This is very important! There are some numbers that can't be natively represented on a computer (there are some inefficient ways around this problem, though).

To explore this further, let's consider an N digit number in base K. How many possible different N-digit patterns are there? (I occasionally call the numbers, patterns, so don't let that bother you).

The answer requires simple combinatorics. Since each digit is in base K, there are K choices for the first digit, K choices for the second digit, and so forth. The answer is to multiply the choices together to get K multiplied to itself N times, or simply, KN.

Again, let's consider a simple example. How many 4-digit base-3 patterns are there? The answer is 34 = 81. Since these values are contiguous starting at 0, we can represent numbers from 0 up to 80 in base 3 using exactly 4 base-3 digits.

Converting from base 10 to base 10

After all that introduction, we're ready to convert numbers. We're going to do this in a similar way that you were taught in elementary school (assuming you did your education in the US---if not, hopefully, this won't be too hard to follow).

In elementary school days, you may have been that that digits have places. For example, there's the 1's place, the 10's place, the 100's place and so forth.

Here's an example of "1976" written. The first row indicates the "place" and the second row are the individual digits of 1976.

1000's 100's 10's 1's
1 9 7 6


You were taught that you could write these numbers out as:
   1976 = (1 * 1000) + (9 * 100) + (7 * 10) + (6 * 1) 
        = 1976
In words, you multiplied the digit by its place, and summed the result together. For example, 9 is in the hundred's place, so you multiply 9 by 100.

This elementary school exercise must have seem utterly trivial to you, and perhaps you wondered why you were taught that 1976 happens to equal 1976.

Only this is not nearly as easy as you might think. In elementary school, they may have also taught you Roman numerals. You hardly ever see Roman numerals outside of some copyright years on films (and that's starting to be rare) or the current incarantion of the Super Bowl (which seems to the be last major display of Roman numerals).

Consider a number like DCLXXIX. If memory serves, this Roman numerals for 679 in base 10. Roman numerals have bizarre rules. For example, 7 is written VII. 8 is written VIII. 9 is written IX. When the I comes before the X, it means to subtract 1. Roman numerals have some odd combination of unary values (III meaning 3) and subtraction (such as IX). The rules are more complex than Arabic numerals, the numbers you and I are most familiar with.

Instead of calling it the thousands place, the hundreds places, and so forth, we're going to modify the scheme just a little by using powers of 10.

Thus, we can write:

103 102 101 100
1 9 7 6


Recall that any positive number raised to the power 0 is 1.

Thus, we can rewrite 1976 as:

   1976 = (1 * 103) + (9 * 102) + (7 * 101) + (6 * 100) 
        = 1976
Let's use that notation with the d and the subscripts.

d3 d2 d1 d0
103 102 101 100
1 9 7 6


What do you notice in common between the green rows and the red ones? You'll notice that the subscript of the green rows match the superscript of the red one. For example, d3 refers to 103.

Earlier on, I said that I'd explain why we number right to left, starting at index 0. This is the reason. The subscripts refer to the power of 10 you multiply the value.

This leads us to the following formula:

You multiply the digit's value by 10 raised to the power of the digit's subscript.

Converting from base K to base 10

While the above formula may have seemed trivially stupid in elementary school, it's important when it comes to other bases. For example, what if I ask you "what is 2123? in base 10". How would you answer?

The summation formula given above works in other bases, but we have to modify it somewhat.

The main difference is that 10 has been replaced by K, where K is the base.

Let's try a simple example:

d2 d1 d0
32 31 30
2 1 2


Here, K = 3, so we just use the summation formula:
   2123 = (2 * 32) + (1 * 31) + (2 * 30) 
        = (2 * 9) + (1 * 3) + (2 * 1)
        = 18 + 3 + 2
        = 2310
If K (the base) is greater than 10, then the problem is a little trickier. You must convert the alphabetic characters to their base 10 values. For example,

d2 d1 d0
112 111 110
1 1 A


We wish to convert 11A11 to base 10.
   11A11 = (1 * 112) + (1 * 111) + (A * 110) 
        = (1 * 121) + (1 * 11) + (10 * 1)
        = 121 + 11 + 10
        = 14210

Converting Fractional Values From Base 10 to Base K

You can write fractional values in base 10 by including a decimal point and values afterwards. Since decimal point suggest "base 10", we'll use the more general radix point as our terminology. In base 10, the radix point is called the decimal point.

We only need to extend the summation formula given earlier to handle a finite number of digits left of the radix point and right of the radix point. Assume that we have a N digits left of the radix point and M digits right of the radix point.

The summation formula given in previous sections can be generalized to:

Let's look at an example of this. Suppose you want to convert 101.0112 to base 10. You'd write:

d2 d1 d0 d-1 d-2 d-3
22 21 20 2-1 2-2 2-3
1 0 1 0 1 1


Using the summation formula, where N=3 and M=3, we get:
   101.0112 = (1 * 22) + (0 * 21) + (A * 20) 
                         + (0 * 2-1) + (1 * 2-2) + (1 * 2-3) 
        = (1 * 4) + (0 * 2) + (1 * 1) 
            + (0 * .5) + (1 * .25) + (1 * .125)
        = 4 + 0 + 1 + 0 + .25 + .125
        = 5.37510
Notice that the summation still works. We now add negative subscripts for digits right of the radix point, and just like before, we raise the base to the negative power, based on the subscript.

Summary

To convert a number written in base K, with N digits left of the radix point and M digits right of the radix point, use the following summation:

If there are no values right of the radix point. That is, if M=0, then the number of numeric "patterns" (i.e., the number of possible values) of an N digit base-K number is KN.

Those are probably the two most pertinent facts presented here.