The topic of this page is to determine whether or not a Loop can be run in parallel. This is determined by looking at the structure of the loop and figuring out what the data dependancies are, and whether they will allow the compiler to Unroll the loop.

The primary data dependancies we need to look at are between two instructions INSIDE the loop, and between two instructions accross iterations, or an inter-iteration dependance.
The first dependance will not cause a problem for unrolling, save that the instructions must be kept in the same order, when the loop is unrolled. The second dependance may or maynot cause the loop to be Non-Parallel. If the a statement requires a value it produces in a previous iteration, then thatstatement is said to have a Circular Dependance. The occurance of a Circular Dependance, means the loop cannot be made Parallel.
Just because a loop has an inter-iteration dependance does not mean that the loop cannot be unrolled, as long as the dependance is NOT Circualr, than it can. The only thing that is neccessary is that the inter-iteration dependance needs to be removed by moving peices of the code around.

Example 1

Can the following Loop be made Parallel?

for (i=1;i<=100;i=i+1){
A[i+1] = A[i] + C[i]; /*S1*/
B[i+1] = B[i] + A[i+1]; /*S2*/
}

To find the solution, we first need to look at the dependancies.
Inside one iteration:
S2 is dependant on S1 for the value of A[i+1].

Between iterations:
S1 depends on S1 for the value of A[i] computed in iteration i-1.
S2 depends on S2 for the value of B[i] computed in iteration i-1.
Note that S1 depends on S1, and S2 depends on S2 inbetween loop iterations. This is the RED flag we discussed earlier. Because the loop containscircular data dependancies, this loop can not be unrolled.

Example 2


Lets look at another loop.

for (i=1;i<=100;i=i+1)
{
A[i] = A[i] + B[i]; /*S1*/
B[i+1] = C[i] + D[i]; /*S2/
}

First start at the dependancies inside one interation of the loop. The only dependance inside the loop is A[i] as a destination and source, but this is not really an issue.
Ok, since that is not an issue, lets take a look at the dependancies between iterations:
B[i+1] is computed, so for iteration i, S1 depends on S2 in iteration i-1.
Since no line contains a circular dependance, the loop can be unrolled. However, it will need a little arranging to remove the one dependance that is still there.
Since S1 does not depend on S2, the instructions do not need to be executed in the order that they are above. This is important to note, since we want to remove the inter-iteration dependance. To remove this dependance we need to shift the loop a little.
If we execute S2 for iteration i, then S1 for iteration i+1 as the loop. Doing this will cause S1 to be 'missed' on the first iteration, so to fix that, we will need to run it once before entering the loop, with an index of 1. In effect this executes half of the first iteration of the loop and so we need to run S2 once the loop is complete to finish the other half of it. (Remember here that we are actually executing half of each iteration in our new loop) With those modifications the loop now looks like this:
A[1] = A[1] + B[1];
for (i=1;i<=99;i=i+1)
{
    B[i+1] = C[i] +D[i];
    A[i+1] = A[i+1] + B[i+1];
}
B[101] = C[100] + D[100];
This loop can be unrolled, and then run multiple iterations with fewer branches, which, of course, eliminates stalls.

A quick summary of the material presented here:
If there is a circular dependance in between loop iterations, Loop can not be unrolled. End of discussion.

If there are dependancies inbetween too iterations, but they are not circular, you may be able to rewrite the code so that the dependancies are located in a inside a single iteration, and then the loop can be unrolled. If you can not rewrite the loop to remove the inter-iteration dependancies then the loop is not parrallel.

If there are no data dependancies, then the loop can be unrolled, as it is.