Re: JavaMemoryModel: RE: Question on memory models

From: Dan Scales (scales@pa.dec.com)
Date: Wed Jun 30 1999 - 18:42:57 EDT


I believe that Sarita's message is misleading in implying that something
as sophisticated as value prediction is needed on current systems
to produce the scenario described by Bill Pugh:

> Initially, Mem[0] = 1, Mem[1] = 3, Mem[2] = 4
>
> Processor/thread 1:
>
> Mem[2] := 5
> memory barrier
> Mem[0] := 2
>
> Processor/thread 2:
>
> R1 := Mem[0]
> R2 := Mem[R1]
>
> On a number of processor memory models, including the Dec Alpha,
> these actions could result in processor/thread 2 loading 4 into R2
> (Seeing the new value for Mem[0] and the old value for Mem[2]).

Given the Alpha memory model, all you need is a processor/system that
immediately acknowledges incoming invalidation requests, allowing a
remote write operation to complete, but does not necessarily process
the invalidation request until the next memory barrier on the local
processor. Such an optimization is a logical way to take advantage of
the Alpha memory model in a large multiprocessor system. In such a
system, it may make sense to give incoming invalidation requests lower
priority and only handle them when free cache cycles are available,
until a memory barrier operation requires that they be processed.

Given that the handling of invalidation requests may be delayed, the
scenario described by Bill happens very easily if Mem[2] is initially
cached on Processor 2, but Mem[0] is not.

When processor 1 initially modifies Mem[2], it sends an invalidation
request to processor 2. Processor 2 immediately acknowledges
receiving the invalidation request, but does not immediately process
it. The write to Mem[2] completes when processor 1 receives the
invalidation acknowledgment, and processor 1 then proceeds to write to
Mem[0].

At this point, processor 2 tries to read Mem[0], gets a cache miss,
and fetches the new value of Mem[0]. All the while it leaves the
invalidation request unprocessed, because it has not executed a memory
barrier. When processor 2 receives the data for Mem[0], it immediately
reads Mem[2] and sees the stale value of Mem[2] still in its cache.
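
The steps above can be sketched as a small deterministic simulation.
This is only an illustrative model, not a description of any real
Alpha implementation: the names Memory, Cache, receive_invalidation,
memory_barrier, load, and run are all hypothetical, and processor 1's
writes are simplified to direct updates of main memory.

```python
from collections import deque


class Memory:
    """Shared main memory; always holds the most recently written values."""

    def __init__(self, values):
        self.values = dict(values)


class Cache:
    """Hypothetical cache that acknowledges invalidations at once but
    applies them lazily, only when a memory barrier is executed."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}         # address -> cached value
        self.pending = deque()  # acknowledged but unprocessed invalidations

    def receive_invalidation(self, addr):
        # Acknowledge immediately; the stale line stays readable for now.
        self.pending.append(addr)
        return "ack"

    def memory_barrier(self):
        # A barrier forces every queued invalidation to be applied.
        while self.pending:
            self.lines.pop(self.pending.popleft(), None)

    def load(self, addr):
        # A hit returns the cached (possibly stale) value; a miss
        # fetches the current value from main memory.
        if addr not in self.lines:
            self.lines[addr] = self.memory.values[addr]
        return self.lines[addr]


def run(read_side_barrier):
    mem = Memory({0: 1, 1: 3, 2: 4})
    p2 = Cache(mem)
    p2.load(2)  # Mem[2] starts out cached on processor 2; Mem[0] does not

    # Processor 1: Mem[2] := 5, memory barrier, Mem[0] := 2.
    mem.values[2] = 5
    ack = p2.receive_invalidation(2)  # the write completes on this ack
    assert ack == "ack"
    mem.values[0] = 2

    # Processor 2: R1 := Mem[0]; R2 := Mem[R1].
    r1 = p2.load(0)  # cache miss, so the new value is fetched
    if read_side_barrier:
        p2.memory_barrier()  # drains the queued invalidation of Mem[2]
    r2 = p2.load(r1)
    return r1, r2


print(run(read_side_barrier=False))  # (2, 4): new Mem[0], stale Mem[2]
print(run(read_side_barrier=True))   # (2, 5): the barrier exposes the new Mem[2]
```

In this model, the only thing that changes the outcome is whether
processor 2 executes a barrier between its two loads; processor 1's
barrier is already satisfied once the acknowledgment arrives.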

Thus, this scenario can easily happen for straightforward designs of
large multiprocessor systems that use relaxed memory models, and does
not require new techniques such as value prediction. So, I don't
think there's any way that Java can influence hardware design (which
Sarita mentioned as a "last resort") so as to eliminate the need for
the memory barrier on the read side.

Dan Scales