JavaMemoryModel: Roach motel volatile ordering and the optimistic readers transform

From: Bill Pugh (pugh@cs.umd.edu)
Date: Fri Jul 06 2001 - 15:25:17 EDT


At 1:09 PM +1000 6/29/01, David Holmes wrote:
>It is not clear to me that volatile and monitors should have the same
>semantics when it comes to reorderings. The basic "roach motel" approach
>with synchronized blocks works fine (so we've been persuaded :) ) because of
>the added mutual exclusion. With volatiles there is no mutual exclusion and
>I am concerned that these ordering rules may inhibit the use of volatile in
>implementing various wait-free/non-blocking algorithms.
>
>As an example, the optimistic readers transform of Strom and Auerbach (ref
>ECOOP 2001) requires insertion of a dummy volatile read after a volatile
>write to prevent subsequent non-volatile writes from being performed prior
>to the volatile write. If volatiles are to be useful as flags in
>wait-free/non-blocking algorithms then it seems to me that ordering is
>critical and so volatile accesses should act as code motion barriers in both

The load/acquire and store/release semantics for synchronization
memory accesses have a long history in the architecture community. For
example, see:

K. Gharachorloo, D. Lenoski, J. Laudon, P. Gibbons, A. Gupta, and
J.L. Hennessy. Memory consistency and event ordering in scalable
shared-memory multiprocessors. In Proceedings of the 17th Annual
International Symposium on Computer Architecture, pages 15--26. IEEE,
May 1990.

Making volatile reads and writes full two-way memory barriers will
likely at least double the cost of volatile reads on IA-64 and
Alpha SMPs.

Plus, you don't really need full two-way barriers. Here is an example
of Rob Strom's optimistic readers idiom (from his paper):

volatile int vno = 0;
int a,b;
synchronized void inc(int x) {
   vno++;   /* version becomes odd: update in progress */
   a += x;
   b += x;
   vno++;   /* version becomes even again: update complete */
   }

/* unsynchronized */
int prod() {
   int v1 = vno; /* pre-inspect */
   int t1 = a;
   int t2 = b;
   int v2 = vno; /* post-inspect */
   if (v1 == v2 && v1%2 == 0)
     return t1*t2; /* commit */
   else ... /* abort */
   }
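
To see the problem under load-acquire/store-release (roach motel)
semantics, here is a sketch (mine, not from Rob's paper) of a
transformation that those semantics would permit for inc():

/* Hedged sketch: a reordering allowed if volatile writes are only
   release barriers. The plain writes to a and b cannot move above
   the volatile read of vno, but nothing stops them from being
   hoisted above the volatile write. */
synchronized void inc(int x) {
   int v = vno;     /* acquire half of the first vno++ */
   a += x;          /* hoisted above the release store below */
   b += x;          /* hoisted above the release store below */
   vno = v + 1;     /* release half of the first vno++ */
   vno = v + 2;     /* the second vno++ */
   }

A concurrent call to prod() could then see the new a but the old b
while both of its reads of vno return the old even version, so it
would commit an inconsistent product. This is exactly the reordering
that the dummy volatile read of force prevents in the modified
version below.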

Rob's example only works if volatile accesses are two-way memory
barriers. Here is how you need to change it to make it work for
volatiles that have load-acquire and store-release semantics:

volatile int vno = 0;
volatile boolean force;
int a,b;
synchronized void inc(int x) {
   int oldVersion = vno;
   vno = oldVersion+1;   /* odd: update in progress */
   boolean ignore = force; /* Don't allow accesses to be hoisted */
   a += x;
   b += x;
   vno = oldVersion+2;   /* even again: update complete */
   }

/* unsynchronized */
int prod() {
   int v1 = vno; /* pre-inspect */
   int t1 = a;
   int t2 = b;
   force = false; /* force all accesses to be committed */
   int v2 = vno; /* post-inspect */
   if (v1 == v2 && v1%2 == 0)
     return t1*t2; /* commit */
   else ... /* abort */
   }
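
For what it is worth, here is one way a caller might use the reader
side, retrying on abort; the retry loop is my own sketch of a
plausible abort policy, not something taken from Rob's paper:

/* Hedged usage sketch: keep attempting an optimistic read until a
   consistent (even-versioned) snapshot commits. Only vno, a, b and
   the inspect/commit steps come from the example above; the retry
   loop is an assumption. */
int prodRetrying() {
   while (true) {
      int v1 = vno;            /* pre-inspect */
      int t1 = a;
      int t2 = b;
      force = false;           /* force all accesses to be committed */
      int v2 = vno;            /* post-inspect */
      if (v1 == v2 && v1%2 == 0)
         return t1*t2;         /* commit */
      /* abort: a writer was active in between; try again */
      }
   }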

If the critical section of a call to prod() completes before the
critical section of a call to inc(), then
        the second read of vno in prod() occurs before
          the first write to vno in inc()
        therefore the write to force in prod() occurs before
          the read of force in inc()
        therefore all the memory accesses in the critical region of prod()
          are ordered by the happens_before relation to be before the memory
          accesses in the critical region of inc()

If the critical section of a call to inc() completes before the
critical section of a call to prod(), then
        the second write to vno in inc() occurs before the first read of
          vno in prod()
        therefore all the memory accesses in the critical region of inc()
          are ordered by the happens_before relation to be before the memory
          accesses in the critical region of prod()

Note that no one ever cares about the value stored into the force
variable. It is just used for establishing memory ordering.

I am fairly confident that any situation that really depends on
two-way memory barriers can be handled under acquire/release
semantics by inserting some additional dummy memory accesses. And
this way, you will only incur the costs of the additional
memory/compiler barriers when you really need them.

For example, if you need a memory barrier:

private volatile boolean force;

final void memoryBarrier() {
        force = false;          /* volatile store: release */
        boolean ignore = force; /* volatile load: acquire */
        }

Note that this memory barrier would only work between threads using
the same force field. If force were thread local, the compiler would
be free to optimize away the memory barrier.
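
As a minimal sketch of how the helper might be packaged (the class
name is mine, not an established API), the field can be made static
so that every thread shares the same force variable and the dummy
accesses cannot be treated as thread-local:

/* Hedged sketch: a shared barrier helper. Because force is a static
   volatile touched by all threads, the compiler cannot prove it
   thread-local and eliminate the dummy accesses. */
final class Barrier {
   private static volatile boolean force;

   static void memoryBarrier() {
      force = false;           /* volatile store: release */
      boolean ignore = force;  /* volatile load: acquire */
      }
   }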

I know that dummy volatile memory accesses are not intuitive. But
this isn't something that most people should be trying to do.

        Bill

-------------------------------
JavaMemoryModel mailing list - http://www.cs.umd.edu/~pugh/java/memoryModel


