JavaMemoryModel: Question about the semantics of volatile

From: Bill Pugh (pugh@cs.umd.edu)
Date: Wed Mar 17 2004 - 11:04:01 EST


OK, a question has come up regarding the semantics of volatile.
There is a point on which there are two different interpretations.
As far as the model goes, it can handle either interpretation.

So we'd like to open the question up to discussion and debate by
the broader JMM community.

For each volatile, there is a total order over all accesses to that
volatile.

Strong interpretation:
        There is a happens-before (or release/acquire) relationship from
        each write to each latter read of that volatile.

Weak interpretation:
        There is a happens-before (or release/acquire) relationship from
        each write to each latter read of that volatile _that sees that
        write_.

Here is an example that shows the difference:

Initially, x = y = v = 0.
v is a volatile variable.

Thread 1:
r1 = x
v = 0
r2 = v
y = 1

Thread 2:
r3 = y
v = 0
r4 = v
x = 1

Is the behavior r1 == r2 == 1 possible?

Under the weak interpretation, each thread might see its own volatile
write,
in which case the volatile accesses would have no impact. In fact, under
the weak semantics, the compiler could eliminate the volatile reads and
transform this to:

Thread 1:
r1 = x
v = 0
r2 = 0
y = 1

Thread 2:
r3 = y
v = 0
r4 = 0
x = 1

Under the strong semantics, this would not be possible, since we would
be
guaranteed an hb edge from the first volatile write to both volatile
reads.

The definition of Lazy Release Consistency uses the weak definition.
We are investigating what the actual behavior of a DSM like Treadmarks
would be.

Another place where this makes a difference: back in 2000, Rob Strom
wanted guidance
on whether his optimistic readers transformation could be implemented
in Java.
The strong semantics makes this possible (see email below). I'm not
sure it
is possible to do the optimistic reads transformation with the weak
semantics.

Thoughts?

        Bill

----
	From: 	  pugh@cs.umd.edu
	Subject: 	JavaMemoryModel: Roach motel volatile ordering and the 
optimistic readers trans
	Date: 	July 6, 2001 3:25:17 PM EDT
	To: 	  javamemorymodel@cs.umd.edu

At 1:09 PM +1000 6/29/01, David Holmes wrote: > It is not clear to me that volatile and monitors should have the same > semantics when it comes to reorderings. The basic "roach motel" > approach > with synchronized blocks works fine (so we've been persuaded :) ) > because of > the added mutual exclusion. With volatiles there is no mutual > exclusion and > I am concerned that these ordering rules may inhibit the use of > volatile in > implementing various wait-free/non-blocking algorithms. > > As an example, the optimistic readers transform of Strom and Auerbach > (ref > ECOOP 2001) requires insertion of a dummy volatile read after a > volatile > write to prevent subsequent non-volatile writes from being performed > prior > to the volatile write. If volatiles are to be useful as flags in > wait-free/non-blocking algorithms then it seems to me that ordering is > critical and so volatile accesses should act as code motion barriers > in both

The load/acquire and store/release semantics for synchronization memory accesses has a long history in the architecture community. For example, see:

K. Gharachorloo, D. Lenoski, J. Laudon, P. Gibbons, A. Gupta, and J.L. Hennessy. Memory consistency and event ordering in scalable shared-memory multiprocessors. In Proceedings of the 17th Annual International Symposium on Computer Architecture, pages 15--26. IEEE, May 1990.

Making volatile read/write be full two directional memory barriers will likely at least double the cost of volatile reads on IA-64 and Alpha SMPs.

Plus, you don't really need it. Here is an example of Rob's optimistic readers idiom (from his paper):

volatile int vno = 0; int a,b; synchronized void inc(int x) { vno++; a += x; b += x; vno++; }

/* unsynchronized */ int prod() { int v1 = vno; /* pre-inspect */ int t1 = a; int t2 = b; int v2 = vno; /* post-inspect */ if (v1 == v2 && v1%2 == 0) return t1*t2; /* commit */ else ... /* abort */ }

Rob's example only works if volatiles are 2-way memory barriers. Here is how you need to change it to make it work for volatiles that have load.acquire and st.release semantics:

volatile int vno = 0; volatile boolean force; int a,b; synchronized void inc(int x) { int oldVersion = vno; vno = oldVersion+1; boolean ignore = force; /* Don't allow accesses to be hoisted */ a += x; b += x; vno = oldVersion+2; }

/* unsynchronized */ int prod() { int v1 = vno; /* pre-inspect */ int t1 = a; int t2 = b; force = false; /* force all accesses to be committed */ int v2 = vno; /* post-inspect */ if (v1 == v2 && v1%2 == 0) return t1*t2; /* commit */ else ... /* abort */ }

If the critical section of call to prod() completes before the critical section of a call to inc(), then the second read of vno in prod() occurs before the first write to vno in inc() therefore the write to force in prod() occurs before the read of force in inc() therefore all the memory accesses in the critical region of prod() are ordered by the happens_before relation to be before the memory accesses in the critical region of inc()

If the critical section of call to inc() completes before the critical section of a call to prod(), then the second write to vno in inc() occurs before the first read of vno in prod() therefore all the memory accesses in the critical region of inc() are ordered by the happens_before relation to be before the memory accesses in the critical region of prod()

Note that no one ever cares about the value stored into the force variable. It is just used for establishing memory ordering.

I am fairly confident that any situation that really depends on 2-way memory barriers can be handled under acquire/release semantics be inserting some additional dummy memory accesses. And this way, you will only incur the costs of the additional memory/compiler barriers when you really need them.

For example, if you need a memory barrier:

private volatile boolean force;

final void memoryBarrier() { force = false; boolean ignore = force; }

Note that this memory barrier would only work between threads using the same force field. If force were thread local, the compiler would be free to optimize away the memory barrier.

I know that dummy volatile memory accesses are not intuitive. But this isn't something that most people should be trying to do.

Bill

------------------------------- JavaMemoryModel mailing list - http://www.cs.umd.edu/~pugh/java/memoryModel

------------------------------- JavaMemoryModel mailing list - http://www.cs.umd.edu/~pugh/java/memoryModel



This archive was generated by hypermail 2b29 : Thu Oct 13 2005 - 07:00:59 EDT