CMSC 411 Project
A General Overview of Parallel Processing
Fall 1998
Synchronization
Mark Pacifico mjpac@wam.umd.edu Mike Merrill mwm@glue.umd.edu
WHAT IS SYNCHRONIZATION?
Synchronization is another function that a parallel architecture must provide.  It involves getting two or more processors to "agree" on the value of a variable or on a time at which to start or stop.  For our purposes, we will focus on three of the most common synchronization issues: mutual exclusion, barriers, and memory consistency.  With mutual exclusion, only one processor is allowed access to a particular memory location at a time.  Barriers make processors wait at a certain point in a program until the others catch up, keeping a parallel job consistent.  Finally, memory consistency refers to a problem that occurs on shared memory machines with caches.  All three of these issues are discussed below.
MUTUAL EXCLUSION
As stated above, mutual exclusion refers to permitting only one processor access to a particular memory location at a time.  Let us examine the simple problem of computing the sum s of n numbers in parallel.  Let us further assume that each processor holds one of the numbers to be summed, namely x1, x2, ..., xn, and that s resides in another processor, Proc0.  The obvious algorithm is for each processor to fetch s from Proc0, add its value to s, and store the result back in Proc0.  The problem is that, depending on the order in which the processors access Proc0, the answer could range from the true sum to any partial sum of the numbers.

To illustrate, let us take a simple example using only two numbers, x1 and x2, stored in processors Proc1 and Proc2, and assume s starts out as 0.  Here is one possible order in which the instructions might execute: (1) Proc1 loads s from Proc0, (2) Proc2 loads s from Proc0, (3) Proc1 adds x1 to its copy of s, (4) Proc2 adds x2 to its copy of s, (5) Proc1 stores its s back into Proc0, (6) Proc2 stores its s back into Proc0.  In this example, the value of s that BOTH Proc1 and Proc2 fetch is 0, so the final value at the end of the algorithm is x2, since that is the last value stored into s.  If the two stores occur in the reverse order, the final value of s would be x1.  This is called a race condition because the final value of s depends on whether Proc1 or Proc2 happens to be slightly ahead of the other.

Mutual exclusion provides a mechanism to avoid the race condition by allowing just one processor at a time access to a variable.  It is often implemented by providing an instruction that sets a particular word to a particular value while fetching the old value, all without other processors being able to interfere.
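To make this concrete, here is a minimal sketch in C, assuming a shared memory machine programmed with POSIX threads rather than separate processors; the thread-per-value setup, the variable names, and the use of C11 atomics are our own illustration, not part of any particular machine.  The atomic exchange plays the role of the "set a word while fetching the old value" instruction described above: it builds a simple lock that makes the load-add-store sequence on s indivisible.

    #include <stdatomic.h>
    #include <pthread.h>
    #include <stdio.h>

    #define N 4                         /* number of worker threads ("processors") */

    static int s = 0;                   /* shared sum, playing the role of s in Proc0 */
    static atomic_int lock = 0;         /* 0 = free, 1 = held */
    static int x[N] = {3, 1, 4, 1};     /* one value per worker, like x1..xn */

    /* Acquire the lock with an atomic exchange: set the word to 1 and get the
       old value back in one indivisible step.  If the old value was 1, another
       thread holds the lock, so spin and try again. */
    static void acquire(atomic_int *l) {
        while (atomic_exchange_explicit(l, 1, memory_order_acquire) == 1)
            ;                           /* spin */
    }

    static void release(atomic_int *l) {
        atomic_store_explicit(l, 0, memory_order_release);
    }

    /* Each worker adds its value into s.  Without the lock, the load-add-store
       sequence from two workers can interleave and lose updates, exactly as in
       the Proc1/Proc2 example above. */
    static void *worker(void *arg) {
        int i = *(int *)arg;
        acquire(&lock);
        s = s + x[i];                   /* fetch s, add, store back -- now indivisible */
        release(&lock);
        return NULL;
    }

    int main(void) {
        pthread_t t[N];
        int id[N];
        for (int i = 0; i < N; i++) {
            id[i] = i;
            pthread_create(&t[i], NULL, worker, &id[i]);
        }
        for (int i = 0; i < N; i++)
            pthread_join(t[i], NULL);
        printf("s = %d\n", s);          /* always 9 with the lock */
        return 0;
    }

Compiled with a pthreads-capable C compiler (e.g. cc -pthread), this always prints s = 9; removing the acquire/release calls reintroduces the race, and the printed sum may come up short.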
BARRIERS/MEMORY CONSISTENCY
Barriers are a simple mechanism for keeping the processors working on a parallel job consistent with one another.  All a barrier does is make each processor wait at the same point in the program until all of the others have "caught up"; then the processors are allowed to proceed.  This is necessary to make sure that the work all the processors have been cooperating on is actually finished before any of them moves on.  A short sketch of a barrier in use appears at the end of this section.

Memory consistency is the final issue we will discuss concerning synchronization.  As stated earlier, memory consistency refers to a problem on shared memory machines with caches.  The purpose of a cache is to hold commonly used data so that it can be accessed quickly, instead of going to the slower main memory.  To illustrate, suppose processors Proc1 and Proc2 both read memory location 12 into their caches.  After reading and using location 12, Proc1 and Proc2 each try to store a new, different value into it.  The question is: which value will make it back into memory at location 12?  There are several memory consistency models that address this problem, each with different drawbacks.  One is to insist that every processor has the same view of all memory locations at the same time, so that location 12 has a single value across the machine.  This is called sequential memory consistency because it is the way memory behaves on a sequential machine.  The problem with this approach is that every write must reach main memory and update all other caches in the machine, which is very expensive.  Some shared memory machines instead use weaker memory consistency models than the sequential approach.  We will not go into detail on these, but in such models the user is left to program in a way that avoids the memory consistency problems.
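Returning to barriers, here is a minimal sketch, again assuming POSIX threads on a shared memory machine; the two-phase structure and the names are our own illustration.  Each thread computes a partial result and waits at the barrier, and only then does one thread combine the results, since the barrier guarantees that every partial result exists before anyone proceeds.

    #define _POSIX_C_SOURCE 200112L     /* for pthread_barrier_t on some systems */
    #include <pthread.h>
    #include <stdio.h>

    #define N 4

    static pthread_barrier_t barrier;
    static double partial[N];

    /* Phase 1: each thread computes its piece.  The barrier makes everyone wait
       here until all pieces exist.  Phase 2: thread 0 can now safely combine
       them, because no thread passes the barrier early. */
    static void *worker(void *arg) {
        int i = *(int *)arg;
        partial[i] = i * 1.0;                 /* stand-in for real work */
        pthread_barrier_wait(&barrier);       /* wait for everyone to catch up */
        if (i == 0) {
            double total = 0.0;
            for (int j = 0; j < N; j++)
                total += partial[j];
            printf("total = %g\n", total);
        }
        return NULL;
    }

    int main(void) {
        pthread_t t[N];
        int id[N];
        pthread_barrier_init(&barrier, NULL, N);
        for (int i = 0; i < N; i++) {
            id[i] = i;
            pthread_create(&t[i], NULL, worker, &id[i]);
        }
        for (int i = 0; i < N; i++)
            pthread_join(t[i], NULL);
        pthread_barrier_destroy(&barrier);
        return 0;
    }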
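Finally, here is a small sketch of what "programming around" a weaker memory consistency model can look like, using C11 atomics as a stand-in for machine-level ordering instructions; the producer/consumer setup is our own illustration.  The writer publishes data by setting a flag with release ordering, and the reader waits on the flag with acquire ordering.  With plain, unordered loads and stores on a weakly consistent machine, the reader could get past the flag and still see a stale value.

    #include <stdatomic.h>
    #include <pthread.h>
    #include <stdio.h>

    static int payload = 0;             /* ordinary shared data */
    static atomic_int ready = 0;        /* flag: has payload been written yet? */

    /* Writer: store the data, then publish it by setting the flag with
       "release" ordering, so the payload store cannot be reordered after it. */
    static void *producer(void *arg) {
        (void)arg;
        payload = 42;
        atomic_store_explicit(&ready, 1, memory_order_release);
        return NULL;
    }

    /* Reader: wait for the flag with "acquire" ordering, which guarantees that
       the payload written before the release is visible here.  Without this
       ordering, a weakly consistent machine may let the read of payload return 0. */
    static void *consumer(void *arg) {
        (void)arg;
        while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
            ;                           /* spin until published */
        printf("payload = %d\n", payload);
        return NULL;
    }

    int main(void) {
        pthread_t p, c;
        pthread_create(&c, NULL, consumer, NULL);
        pthread_create(&p, NULL, producer, NULL);
        pthread_join(p, NULL);
        pthread_join(c, NULL);
        return 0;
    }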
Copyright 1998, Mark Pacifico & Mike Merrill