Re: JavaMemoryModel: Word tearing

From: David F. Bacon (dfb@watson.ibm.com)
Date: Thu Jan 03 2002 - 12:35:16 EST


I haven't been following these discussions too closely, but I'm somewhat dismayed
about the trend of this word-tearing discussion. The whole point of revising the
JMM spec is to make it more comprehensible, better defined, and easier to program.

Adding word-tearing of arrays as a "feature" of the language, with funky volatile
definitions to override it, is a step in the wrong direction for many reasons:

  1) It makes the programming model *more* complicated, not to mention uglifying
the syntax.

  2) It violates Java's principle of isolating the programmer from memory layout
issues.

  3) Word tearing is not an issue on most machines. SPARC (as of my V9 manual)
specifies that loads and stores, even double-word, are atomic. PowerPC specifies
that all loads and stores up to the native word size are atomic. And while Intel
may not specify the semantics explicitly, mainstream manufacturers are unlikely to
go against the standard implementation.

  4) Since word tearing is not an issue on most machines, most programmers will
ignore it. Then they may get burned on obscure hardware platforms, where the
problem will turn up as the nastiest sort of bug.
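
To make the failure mode in point 4 concrete, here is a minimal sketch of what
word tearing would look like. It is a single-threaded model, not real JVM
behavior: it assumes a hypothetical machine with no byte-store instruction, so
each byte store is emulated as a read-modify-write of the enclosing 32-bit
word. The class and method names (WordTearingSketch, patch, lane,
interleavedStores) are invented for illustration, and the interleaving of the
two "threads" is written out by hand.

```java
public class WordTearingSketch {
    // Emulated byte store on the hypothetical machine: take a copy of the
    // whole word, replace one 8-bit lane, and return the word to write back.
    static int patch(int oldWord, int lane, int value) {
        int shift = lane * 8;
        return (oldWord & ~(0xFF << shift)) | ((value & 0xFF) << shift);
    }

    // Extract one 8-bit lane from a word.
    static int lane(int word, int lane) {
        return (word >>> (lane * 8)) & 0xFF;
    }

    // Two logical threads store to *different* bytes of the same word.
    // Both read the word before either writes it back.
    static int interleavedStores() {
        int word = 0;
        int seenByA = word;                 // thread A loads the word
        int seenByB = word;                 // thread B loads the word
        word = patch(seenByA, 0, 0x11);     // A writes byte 0
        word = patch(seenByB, 1, 0x22);     // B writes byte 1 from a stale copy
        return word;
    }

    public static void main(String[] args) {
        int word = interleavedStores();
        System.out.println("byte 0 = 0x" + Integer.toHexString(lane(word, 0))
                + ", byte 1 = 0x" + Integer.toHexString(lane(word, 1)));
    }
}
```

In this model byte 0 ends up as 0 rather than 0x11: thread A's store is
silently overwritten even though the two threads never touched the same array
element.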

Creating a new JMM, which will not be part of the language for some time to come
yet, that goes out of its way to support already obscure hardware implementations,
is exactly what the spec should *not* be doing. Similarly, the spec does not need
to go out of its way to support funky optimizations. As Cliff points out, the most
important write-combining optimizations are already possible.
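
As a rough illustration of the case that stays legal, the sketch below
write-combines a loop that translates every element, so each byte covered by
the wide store is one the loop writes anyway. Everything here is an assumption
made for the example: translate() is a hypothetical stand-in (it just flips
one bit), ByteBuffer's absolute getInt/putInt stand in for the int-sized load
and store a compiler would emit, and the little-endian byte order is chosen
arbitrarily.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class WriteCombineSketch {
    // Hypothetical stand-in for the translate() in the quoted loop.
    static byte translate(byte b) {
        return (byte) (b ^ 0x20);
    }

    // Reference version: one byte store per element.
    static void bytewise(byte[] a) {
        for (int i = 0; i < a.length; i++) {
            a[i] = translate(a[i]);
        }
    }

    // Combined version: four elements per 32-bit load and store.
    // All four bytes in each word are translated, so no old value
    // is ever written back unchanged.
    static void combined(byte[] a) {
        ByteBuffer buf = ByteBuffer.wrap(a).order(ByteOrder.LITTLE_ENDIAN);
        int i = 0;
        for (; i + 4 <= a.length; i += 4) {
            int x = buf.getInt(i);
            int b0 = translate((byte) x) & 0xFF;
            int b1 = translate((byte) (x >>> 8)) & 0xFF;
            int b2 = translate((byte) (x >>> 16)) & 0xFF;
            int b3 = translate((byte) (x >>> 24)) & 0xFF;
            buf.putInt(i, (b3 << 24) | (b2 << 16) | (b1 << 8) | b0);
        }
        // Post-loop for the leftover elements, one byte at a time.
        for (; i < a.length; i++) {
            a[i] = translate(a[i]);
        }
    }

    public static void main(String[] args) {
        byte[] x = "hello, world!".getBytes(StandardCharsets.US_ASCII);
        byte[] y = x.clone();
        bytewise(x);
        combined(y);
        System.out.println(Arrays.equals(x, y)); // prints "true"
    }
}
```

The combined loop produces the same array as the byte-at-a-time loop, and it
never stores to an element the original loop would have left alone, which is
the property that keeps it from introducing word tearing.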

Let's put the burden on the machine and the compiler-writer, not on the
programmers.

david

Cliff Click wrote:

> Hotspot will issue a store-byte instruction on all platforms,
> even if the inner loop is unrolled. The Sparc hardware Does
> The Right Thing. Intel makes no claims about what happens
> on Intel hardware, which makes sense since lots of folks make
> motherboards and much of the correctness depends on the
> motherboard. Intel appears to Do The Right Thing in the
> hardware I have in front of me.
>
> Denying the compiler write-combining for unrolled tight
> loops with short element sizes is painful for certain uses,
> such as character conversions and the like.
>
> I can write-combine if I'm going to slather over the whole
> array anyways, right? I.e., if there's already a race
> condition and 2 threads are busy wiping over the whole
> array it doesn't matter whose write wins.
>
> E.g., this is ok to unroll & write-combine:
>   for( int i=0; i<A.length; i++ )
>     A[i] = translate(A[i]);
>
> Really, what I am denied is reading an old value, then
> writing the same old value back (as part of a
> write-combining optimization). I can't write to a value
> that isn't being written to already.
>
> E.g., this is NOT ok to write-combine:
>   for( int i=0; i<A.length; i+=2 /* skip every other element! */)
>     A[i] = translate(A[i]);
>
> The write-combined code looks something like this (modulo
> hastily written syntax errors and an alignment pre-loop):
>   byte A[];
>   for( int i=0; i<A.length; i+=4 /*unrolled!*/ ) {
>     int X = *(int*)&A[i]; /* bogus Java syntax for doing an int-load
>                              from a byte array */
>     int B0 = translate( X      & 0xFF);  /* even element: translated */
>     int B1 =           (X>> 8) & 0xFF ;  /* odd element: old value passed through */
>     int B2 = translate((X>>16) & 0xFF);
>     int B3 =           (X>>24) & 0xFF ;
>     X = (B3<<24) | (B2 << 16) | (B1 << 8) | B0;
>     *(int*)&A[i] = X; /* bogus Java syntax for doing an int-write to a
>                          byte array */
>   }
>
> This will show word-tearing if another thread is trying to
> write to the alternate bytes. I can live without this, but
> I still want to write-combine in the first example above.
>
> Cliff
>
> Doug Lea wrote:
>
> > One of the harder cases to deal with about word-tearing is when
> > different threads all write into different, adjacent elements of a
> > shared array. As in:
> >
> > class SharedArray {
> >   final static int N = 100;
> >   final static byte[] array = new byte[N];
> >
> >   public static void main(String[] args) {
> >     for (int i = 0; i < N; ++i) {
> >       final int index = i;
> >       Thread t = new Thread() {
> >         volatile int old = 0;
> >         public void run() {
> >           for (int k = 0; k < 10000000; ++k) {
> >             int current = ++array[index];
> >             if ((current & 0xFF) != ((old+1) & 0xFF)) throw new Error();
> >             old = current;
> >           }
> >         }
> >       };
> >       t.start();
> >     }
> >   }
> > }
> >
> >
> > Can/should we say that this is guaranteed to work only if "array" is
> > declared as "volatile"? The argument here is that the array itself is
> > shared, so should be marked as volatile (even though none of its
> > elements are shared). This is basically the same story we give for
> > other uses of volatile arrays. (The underlying snag is, as usual,
> > that there is no syntax to declare the elements of arrays final or
> > volatile.)
> >
> > This might be enough of a hook so that compilers could do the right
> > thing (here, maybe use 32bits for the elements) on machines otherwise
> > susceptible to word-tearing.
> >
> > (BTW, this code runs without error on multiway sparcs using hotspot 1.4beta3)
> >
> > -Doug
> >
>
> -------------------------------
> JavaMemoryModel mailing list - http://www.cs.umd.edu/~pugh/java/memoryModel



