When tracing most recursive functions, there is winding and unwinding part. The "winding" part occurs as the recursion heads to the base case. The "unwinding" part occurs when the recursion returns back to the original call. Most people forget there is the "unwinding" phase.
The winding and unwinding is not really special to recursion. It occurs with any function.
Suppose function a() has a call to function b(), and function b() has a call to function c(), and function c() has a call to function d(). Once function d() is done, what happens next? It goes back to c(), then to b(), and finally back to a().
So you can think of going from a() to d() as the "winding" of the recursion, and returning back to a() as the unwinding.
The same thing happens with recursive functions, which goes to show you that recursive functions aren't any more special than normal functions. If function f() makes a recursive call to function f(), which makes a call to function f(), which makes a call to function f() (which is the base case), then it will eventually go back to f(), then f(), and finally back to the original f(). That may be harder to follow, but it's really the same principle.
The block is divided into several parts. The rectangle has two squares inside. Those will store the results of any recursive call (assuming the recursive call returns back values). The block above can store the results of two recursive calls (in the two small squares on the right).
To the left of the small squares are the parameters of the current function call. The parameters can either be "by value" (in which case, they store copies) or "by reference" or pointers, in which case they point to some "global" memory location.
The circle on the right will store the return value of the recursive call.
Let's consider an easy recursive call. We want to sum
the elements of an array. This is the code:
int sum( int arr[], int n )
{
if ( n == 0 )
return 0;
else
{
int smallResult = sum( arr, n - 1 ); // A
return smallResult + arr[ n - 1 ];
}
}
Assume the array contains: { 2, 4, 6 }, and that the
call to the sum is: sum( arr, 3 ) which will sum the
first three elements of the array.
The initial call to sum fills in the block. Since arr is an array and arrays are really pointers, there's a pointer to the "global" array. The arrow in the diagram represents a pointer to the array at the top. n, however, is a value parameter, so a copy of n resides in the box.
The letter "A" lies under the recursive call, and also appears in the code above. The reason for labeling recursive calls is to make it easier to know where to go back to once the recursive call is done. In this case, there's only one recursive call, so it's easy to find. However, some recursive functions have two calls, so labeling makes it easier to follow.
As the initial call to sum is made, the base case is not true (i.e., n is not 0), so you go into the "else" and make a recursive call to sum, this time passing a value of 2 (which is n - 1, where n is 3 at the time of the call.
This produces a diagram that looks like:
The top sum makes a call to sum, passing in the same arr pointer (it is a copy of the pointer, but the copy points to the same array). Notice that n has a value of 2.
Here's the important question, to see if you're keeping up. How many n's are there? One or two? Recall that n is a value parameter. Because it is a value parameter, that means it's a copy. Thus, it occupies a different memory location.
Thus, there are two copies of n. As it turns out, there are two copies of arr too. However, since it's a pointer, the copies are copies of addresses, and both addresses point to the array, arr.
In the second call, n is 2, which means we aren't at the base case yet. So, it again goes into the else case, and makes a recursive call.
Again, we're not yet to the base case, so we must make one
more recursive call.
At this point, we're at the base case. This makes no
recursive call (hence, A isn't used here). The value of 0
is returned (that's what's in the circle) and gets placed
in the "A" box of the recursive call just above it. It's
worth pointing out that the base case
Recall that "A" is the line that stores smallResult. It adds the result in "A" to arr[ n - 1 ]. The value of n is 1 (within the box), so arr[ n - 1 ] = arr[ 1 - 1 ] = arr[ 0 ] = 2. So, add 0 + 2. This is why it's important to have 0 be the base case value. We add 0 to the value at array element 0. Any other value produces an incorrect answer. So, 2 is returned back to the previous call.
That's returned at step "A", which again, is stored in smallResult.
Then 2 is added to arr[ n - 1 ] = arr[ 2 - 1 ] = arr[ 1 ] = 4.
2 + 4 is 6, and that's returned back.
Finally, 6 is stored in smallResult and that will be
added to arr[ n - 1 ] = arr[ 3 - 1 ] = arr[ 2 ] = 6,
which is 12, and 12 is the final result of the call. Fortunately,
this is the answer expected.
Second, it's helpful to label recursive calls, In this case, it was labeled with letters. You do this to keep track of what's going on. Recall that a recursive call, like any other function call, eventually returns back to the point of being called. However, since you're calling the same function, it's easy to make mistakes when tracing the code.
Third, recursion involves a "winding" phase where the calls are progressively getting closer to the base case, and you are getting to smaller and smaller problems, and an "unwinding" phase, when you begin to return back to the original call. It's usually in the "unwinding" phase where the solution is generated.
Starting at the base case, you have a value that is then used to solve the call from the function that called the base case, which is used to solve the call that called the call that called the base case, and so forth. Basically, the solution is being built up, until finally, you reach the original call, and the final solution is arrived at, having been built up from the base case.
Example 1
int sumOdd( int arr[], int n )
{
if ( n == 0 )
return 0;
else
{
int smallResult = sumOdd( arr, n - 1 ); // "A"
if ( arr[ n - 1 ] % 2 == 1 ) // test for odd
return smallResult + arr[ n - 1 ];
else
return smallResult;
}
}
This only adds odd values of the array. Try tracing
this version.
Example 2
int sum( int arr[], int n )
{
if ( n == 0 )
return 0;
else
{
int smallResult = sum( arr + 1, n - 1 ); // "A"
return smallResult + arr[ 0 ];
}
}
This example is trickier. It uses pointer arithmetic
to move the pointer "arr" forward. In fact, in each recursive
call arr moves forward by 1. Initially, arr is
& arr[ 0 ]. Eventually, it becomes arr becomes
& arr[ 2 ].
What happens? Recall that in C/C++, pointers are actually pass-by-value. While the original "arr" is fixed (i.e., the array declared in, say, main()), the "arr" as a parameter is merely a pointer, and pointers can be copied.
Tracing the code is a little complicated, because you need
to have something like:
In this example, arr is the parameter variable, while
list is the original array passed in. By giving them different
names, it's easier to follow what's going on.
When the next call is made, you get:
As you can see, arr now points to list[ 1 ]. Thus, arr[ 0 ] is the same as list[ 1 ].
This is why you can repeatedly add arr[ 0 ] and still get the correct answer. For each recursive call arr points to a different element of list.
Example 3
int fib( int n )
{
if ( n == 0 ) // base case 1
return 1;
else if ( n == 1 ) // base case 2
return 2;
else
{
int resultA = fib( n - 1 ); // "A"
int resultB = fib( n - 2 ); // "B"
return resultA + resultB;
}
}
This example is interesting to trace for several reasons.
First, it has more than one base case. In fact, it has two.
Second, there are two recursive calls (labeled "A" and "B").
Third, it's very inefficient. Once you trace, you will see
why. There's a much more efficient solution using loops.
Here's a picture of the initial call.
Notice there's a spot to store the value for the first and second recursive call.
Example 4
int sum( int arr[], int n, int result )
{
if ( n == 0 ) // base case 1
return result;
else
return sum( arr, n - 1, result + arr[ n - 1 ] );
}
This example is interesting to trace because there is no
real return value to worry about. Once you reach the base
case, result is returned, and that's returned all the
way up to the original call.
Whenever the return statement of the recursive call has no more work to do AFTER the recursive call, the function is said to be tail-recursive.
None of the previous three examples are tail-recursive since each of the previous examples do work after the initial recursive call. Tail-recursive functions can be optimized by compilers by converting the recursion to a loop, which saves on the amount of memory used. However, many C++ compilers do not make this optimization.
Tail recursive answers usually pass the solution from call to call. Thus, result is carrying the current solution.
In fact, many loops can be converted to tail recursive calls by using reference parameters. Since reference parameters refer to a specific memory location, it's basically like updating a global variable.
The function has to be written a little differently if you use
reference parameters, because reference parameters only accept
lvalues (i.e., variables or array elements) as arguments. So,
you'd have to rewrite sum as:
void sum( int arr[], int n, int & result )
{
if ( n == 0 ) // base case 1
; // nothing to do, result has answer
else
{
result += arr[ n - 1 ];
return sum( arr, n - 1, result );
}
}
Notice that the return type is now void, and that you must
compute the result before passing it to sum, since the
third argument of sum needs to be an lvalue (so while
result is an lvalue, result + arr[ n - 1 ] is not.
That's an rvalue, and you can't pass rvalues to reference
parameters.