Induction is a powerful method of proof, often useful in cases where we
wish to prove that some statement holds for all values of an integer
parameter,
, starting at some smallest value
. For example,
suppose we wish to prove that
, for all
. Then
will be
.
The idea behind induction is to show two things: (i) if the statement
ever ``gets going'' by being true for some
then it ``keeps
going'' by staying true for all larger
(what we might acronymize
as SIS (StartsImpliesStays), or ForeverAfter - if it
ever Starts to be true then it Stays true forever after); and (ii)
that the statement does in fact hold for the smallest value,
(the
so-called base case).
We usually do the ``start'' (or base) case first, and then show the
``never stops'' part (the so-called inductive step); but this is
merely a convention, and as long as we have done both parts then we
have shown that the statement holds for all
.
Another fanciful way to think of it is in terms of an inheritance law, eg, a fictitious law passed in 1800 saying that the children of every rich person must inherit all their parents' wealth. This is not enough to guarantee that anyone at all is rich today, because perhaps no one happened to be rich in the first place. But if we also know that person A was rich (say in 1890) then it follows all the descendants of A are rich as well, including future descendants not yet born. The key fact about A (rich) then corresponds to the base case, and the inheritance law is the inductive step. Thus richness - once it starts - never stops (at least in this imaginary world).
But it is a bit astonishing that one can ever show that a *mathematical* version of inheritance never stops - since we cannot expect Congress to pass laws guaranteeing our mathematics to come out right; and even if they did, we would not trust that kind of law to be a reliable guide to what is true.
So how can we be sure that a math statement stays true forever after,
ie, for all
just from its
being true for some particular
? Wouldn't we have to check it for
every larger value of
, to be sure? Yes and no. We do check it,
but (when the method works) we check it by showing that there is a
very nice connection between consecutive cases of
. That is, we
show that there is a mathematical inheritance law for the statement,
a law (or inductive step) showing (when it works) that the truth of the
statement for a given value of
, say
implies its truth for
. This still seems astonishing; and there are two points to be
aware of here: (i) induction does not always work, and (ii) when it
works it is because the statement we are trying to prove is one
where there is a special ``inheritance'' relation between consecutive
cases of
.
It is time to stop abstract descriptions, and see some examples.
How would we show, for instance, that
for
all
?
The forever after part requires us to show that, if the statement
(which in this case is an equation) is ever true (for some
, say
) then it stays true for the next
, ie it ``inherits'' to
). We show this by
SUPPOSING it does happen to be true for some
(this is called the
inductive hypothesis, or IH):
NOTE: It is crucial that
here not be arbitrary rather than a
particular number; it is used to stand for any number at all that
might make the statement true. This is because we are using
to
show that an inheritance law holds: whenever the statement happens to
hold, for any number, then it passes its truth on to the next
number. We use
and
simply as easy ways to refer to ``any
number'' and to ``the next number''.
We now use this supposition, IH, at
) SHOWING that the statement
stays true for the next
, namely
:
This always proceeds by
rewriting this new case (
) in a way that allows us to apply the
previous case (
), which we begin here:
We are not done; but we have taken the crucial step and the rest is
just algebra, to show that the above expression is equal to
. The crucial step was to rewrite at least some piece
of the
case so that the
case can be applied.
Note that we ASSUME the
case (the IH) - this is the ``if it ever starts''
part - we do NOT show it. What we do show is that, IF it starts (
)
then it CONTINUES (
). Of course, we do eventually need to show
it really does start somewhere, and that is the base case (
).
We'll get to that in a moment. Let's first finish that inductive
step from where we left off above:
So we have shown that
assuming
This means that the original statement,
,
will stay true if it ever can get ``going'', because once it has
become true for some value of
then it will be true for the next
value, and since it is then true for that next value, it stay true for
the one after that, and so on.
It is essential here that
NOT be taken to be any particular
constant value, such as
. This is why what we have shown is
really ``once-true, always-true''. The point is that if AT ANY TIME
(ie for ANY value of
) the statement should happen to be true, that
it will forever remain so after that. We need it to be this general
- with truth flowing from ANY
where truth might be, to the next
- and not just that it flows from
to
since the latter condition would not keep the statement's truth going
on past
!
Does it ever ``get going'' at all? This we quickly settle in our
present example (although sometimes it can be the hardest part). The
base case here is
, namely:
and this is simply
.
Note that not all (true) statements about integers are best proven via
inductions. For instance, it is true that if
is even then
is
even. Here we do NOT get any advantage from induction; and a more direct
approach works just fine:
Let
be any integer. We will show that if
is even then
is
even. Suppose
is even; this means
for some integer
. So
, so
is twice some
integer (which is what it means to be even). [Note: the converse is
also true: if
is even then so is
, but it is harder to prove.]
========================================================================
There are various versions of induction. The one presented above is
sometimes called ``ordinary'' (or weak) induction. A fancier version
called ``course-of-values'' (or strong) induction is this: we show
that if a statement manages to ``stay true'' all the way from
to
then it will also ``stay true'' at
. That is, we show
truth flows from an historical trend (from
to
) further on to
the one more step (
). (We also still need to show the process can
get started, so we still need a base case here, as before.)
The course-of-values version is sometimes easier to use, since to
prove the case
one has a ``bigger IH'' - more hypotheses - to
help out (the past course of truths all the way back to
and not
just the immediately previous step
).
For instance one would now be able to assume not only
It turns out that these extra hypotheses don't help us much in this example, so here is another example where course-of-values induction really helps:
THEOREM: Every integer
greater than 1 has a factorization into primes.
PROOF: The base case is
, and indeed 2 has a (trivial) prime
factorization, namely itself, since 2 is prime.
Now suppose the statement of the theorem holds whenever
. This is our (strong) IH: that all integers greater than 1, up
to
, do have a prime factorization. Using this, we want to show the
statement holds for
. Now if
is prime, we are done. And
if it is not prime, then (by definition) it is a product
where
. So by the IH, each of
and
does have a prime factorization. But since
is their
product, then it has the prime factorization which is the product of
the prime factorizations of
and
.
CONSTRUCTIVE INDUCTION
Constructive induction is best learned by example. Here is a typical
one:
Suppose we wish to prove that
SUM_i=1_to_n i
is a quadratic expression in n, but we don't know exactly which
expression. (We do happen to know, in fact, that it is really
n(n+1)/2, but let's pretend we don't).
So, we try to prove the SUM above is equal to an^2 + bn + c,
where a,b,c are unknown parameters. (In trying, we will find
out what values they need to have.) Suppose we try to prove
SUM = an^2 + bn + c
by induction on n, for n > 0. The base case is n=1:
SUM = a + b + c
which becomes
1 = a + b + c
since for n=1, SUM = 1. So, we will have to make sure that
in fact a + b + c = 1, so that the base case will go through.
Now we proceed to the inductive step: Assume
IH: SUM = an^2 + bn + c
for some n > 0, and consider SUM_i=1_to_n+1 i. This is
n+1 + SUM_i=1_to_n i
which by IH is
n+1 + an^2 + bn + c
and we want this to be equal to
a(n+1)^2 + b(n+1) + c
Setting these equal and doing some algebra:
n+1 + an^2 + bn + c = a(n+1)^2 + b(n+1) + c
n+1 = 2an + a + b
Notice that c has dropped out, so we can try letting c = 0,
to make things simpler; then a + b = 1, since we had above
that a + b + c = 1. Then
(*) n+1 = 2an + 1
or
2an = n
a = 1/2
and thus
b = 1 - a = 1/2
Therefore
SUM = 1/2 n^2 + 1/2 n
and this is the same as
n(n+1)
SUM = -------
2
Thus, WITHOUT KNOWING IN ADVANCE the exact formual for the sum, we
were nonetheless able to FIND IT AND PROVE IT simply by guessing,
not the exact formula but rather only its general form (quadratic).
The method then determined the exact values of the coefficients,
as part of the inductive method of proof.
Had we chosen c = 1 instead of c = 0 (perhaps in order to get a + b =
0) then we'd have found n+1 = 2an (instead of n+1 = 2an + 1 as in (*)
above). It would follow that a = (n+1)/2n = 1/2 + 1/2n which is not a
constant, and thus the choice of c = 1 cannot give us a solution of
the desired form. At this point we should then go back and chose
another value for c, such as c=0 as we did above. Does this mean that
there might still be other solutions that do not have a=constant?
Well, once we do find a solution where a-constant, then any other
solutions must be equal to that one (even if in a different form),
since all solutions must be equal to the original SUM.
--------------------------------------------------------------------------
Note that if we feel our original guess is too general (too many
parameters to worry about) we could have strengthened (and simplified)
it at the beginning by a little more computing. For example for n
equal to 2, we get the following condition on a, b, and c:
1 + 2 = 3 = a 2^2 + b 2 + c = 4a + 2b + c
which gives us, together with c = 1 - a - b,
2 = 3a + b
a relation between only a and b: b = 2 - 3a. We then could use,
instead of the original guess an^2 + bn + c, the more specific
guess an^2 + (2 - 3a)n + (1 - a - (2 - 3a)), i.e.,
SUM = an^2 + (2 - 3a)n + 2a - 1
This we can also prove by induction, first the base case and then
the inductive step, finding that for the inductive step to work,
we need a = 1/2 as before.
---------------------------------------------------------------------------
Some further examples of induction proofs:
THEOREM: Let X be a set with n elements. Then the powerset of X,
P(X), has 2^n elements.
PROOF (by induction): If n=0 (base case) then X is empty, and P(X) has
only one element. Since 1 = 2^0, this case is done.
Now suppose (IH) that the theorem is true for all sets with n=k
elements; we must show that it also holds for sets with k+1 elements.
So let X be a set with k+1 elements. How do we get a set with k
elements from X,so we can apply the IH? We remove one element, say b,
from X, leaving a smaller set Y with k elements: X = Y u {b}. Then by
the IH, P(Y) has exactly 2^k elements. Notice that all these 2^k
subsets of Y are also subsets of X, and that none of them contains the
missing element b. The *other* subsets of X, ie the ones *not* among
the 2^k subsets of Y, are precisely those that *do* contain b -- how
many of those are there? Well, each such subset is of the form {b} u
S where S consists of elements other than b, ie S is a subset of Y,
and there are 2^k such S's. So there are 2^k subsets of X that contain
b, and since there are also 2^k that do not, then the total number is
the sum: 2^k + 2^k = 2^(k+1), and we are done.
THEOREM: Let T be a binary tree with height h. Then T has at most
2^(h+1) -1 nodes.
PROOF (by strong induction): If h=0 (base case) then T has only one
node (the root) and 1 = 2^(0+1) - 1.
Now suppose (IH) that the theorem is true for all binary trees of
height h <= k; we must show that it also holds for trees of height
h=k+1. So let T be a tree of height k+1. How do we get a tree of
height k from T, so that we can apply the IH? Here is the trick:
remove the root from T, which then leaves two subtrees T_left and
T_right. Each of these has height less than the height of T, hence at
most k, so the IH applies to them: n_left <= 2^(k+1)-1 and n_right<=
2^(k+1)-1. Let the total number of nodes in T be n, so that n = n_left
+ n_right + 1. Then n <= 2 * 2^(k+1) -1 -1 +1 = 2^(k+1) - 1, and we
are done.