JavaMemoryModel: Motivation and Semantics for Immutable objects

From: Bill Pugh (pugh@cs.umd.edu)
Date: Fri Aug 03 2001 - 14:42:11 EDT


Sorry for the delay in leading this discussion along; I had a couple
of other things on the stove.

OK, first, an axiom about Java security:

        Strings must be absolutely immutable

Any code that can cause a String to mutate can essentially bootstrap
its way to complete bypassing of the security manager. Therefore, we
take it as an axiom that in the presence of untrusted code with a
data race, we must preserve the immutability of Java Strings.

Why do data races allow Strings to mutate?

Consider the following code:

Thread 1:
    StringBuffer b = ...;
    Foo.global = new String(b);   // published without synchronization

Thread 2:
    String s = Foo.global;        // unsynchronized read
    // ... use s ...

Now, this code has a data race. The effect of the data race is that
Thread 2 might see the new value for Foo.global, but not see some of
the writes that initialized the String object. Later, thread 2 might
see the rest of the writes, so it sees the String object change. I'm
not going to go into how this can happen at length because it has
been covered a number of times before. Check my 2001 JavaGrande paper
for more details.
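
A minimal sketch of the race in compilable Java, under stated
assumptions: RacyText is a hypothetical stand-in for a String-like
class whose fields are not final, and the static field global plays
the role of Foo.global. None of these names come from the real String
implementation. The race cannot be made to show itself on demand, so
the sketch only marks where the unsynchronized write and read occur.

```java
class RacyPublicationSketch {
    // Hypothetical String-like class. Its fields are NOT final, so a
    // racy reader gets no guarantee about seeing them initialized.
    static final class RacyText {
        private char[] value;
        private int count;

        RacyText(StringBuffer b) {
            this.value = b.toString().toCharArray();
            this.count = this.value.length;
        }

        int length() {
            return count;
        }
    }

    // Plays the role of Foo.global: a plain field, neither volatile
    // nor guarded by a lock.
    static RacyText global;

    public static void main(String[] args) throws InterruptedException {
        // Thread 1: construct the object and publish it through a race.
        Thread writer = new Thread(() -> {
            global = new RacyText(new StringBuffer("hello"));
        });

        // Thread 2: read the reference without synchronization.
        Thread reader = new Thread(() -> {
            RacyText s = global;
            if (s != null) {
                // The memory model permits this read to miss the
                // constructor's write to count, and a later call to
                // observe it, so the object appears to change.
                System.out.println(s.length());
            }
        });

        writer.start();
        reader.start();
        writer.join();
        reader.join();
    }
}
```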

OK, it would be possible to fix this by making all of the methods of
the String class synchronized. But that would impose unacceptable
performance penalties, and is only needed when a user of the String
class introduces their own data race, outside of the String class.

So what we need is a way to allow immutable objects to be truly
immutable, even in the presence of external data races. While the
most obvious need for this is for the String class, there are likely
a number of other immutable classes that need this guarantee. So we
don't want to do something specific to the String class.

We also want something that might require memory barriers at object
construction time, but would not require memory barriers when
accessing immutable objects, unless you are on a platform with a very
weak memory model (e.g., an Alpha SMP).

The general informal consensus that has been reached includes the following:

* If you have a class that is immutable, and all instance fields of
that class are final, then you get special race-proof immutable
semantics: Any thread that obtains, in any way, a reference to an
instance of that class sees the initialization done by the
constructor for that instance.

OK, now that is very informal, and not exactly correct.
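
A sketch of a class that would qualify under that informal rule.
ImmutableText and its field names are hypothetical, chosen only for
illustration; the point is that every instance field is final and is
assigned in the constructor. Under the semantics proposed here, a
thread that obtains a reference through the racy field global would
still see the constructor's initialization.

```java
class FinalFieldSketch {
    // Hypothetical immutable class: every instance field is final.
    static final class ImmutableText {
        private final char[] value;
        private final int count;

        ImmutableText(StringBuffer b) {
            // Private copy, so later changes to b cannot leak in.
            char[] copy = b.toString().toCharArray();
            this.value = copy;
            this.count = copy.length;
        }

        int length() {
            return count;
        }

        @Override
        public String toString() {
            return new String(value, 0, count);
        }
    }

    // Still published through a data race: the field is neither
    // volatile nor guarded by a lock. The proposal is that readers
    // see value and count initialized anyway, because they are final.
    static ImmutableText global;
}
```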

In a little more detail, this time with regard to a particular final
field f of class A:

* All writes visible to a thread when it exits the A constructor for
an object x

* are visible to any other thread when it reads x.f, or any variable
derived from the value read from x.f.

* A variable reference v is derived from a local l when
        v is referenced as a field or element of the object referenced by
                a local l' loaded from a variable derived from l

        For example, if p.a is a reference to an array of arrays, then
        p.a[0][0] is derived from p.a.
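
The derivation chain in that example can be spelled out one read at a
time. The class P, the field shared, and the value 42 are assumptions
made for illustration. The sketch shows which reads count as derived
from the final field p.a; it does not demonstrate the race itself.

```java
class DerivedReferenceSketch {
    // Hypothetical class with a final field that refers to an array
    // of arrays.
    static final class P {
        final int[][] a;

        P() {
            int[][] rows = new int[2][2];
            rows[0][0] = 42;   // written before the constructor exits
            this.a = rows;
        }
    }

    // Racy publication point, like Foo.global.
    static P shared;

    static int readDerived(P p) {
        int[][] outer = p.a;   // read of the final field itself
        int[] row = outer[0];  // element of the object p.a refers to
        return row[0];         // p.a[0][0], derived from p.a
    }
}
```

Under the proposed rule, the write of 42 is visible to the thread
exiting the constructor, so any thread that reaches p.a[0][0] by way
of the final field p.a would be expected to see it as well.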

This writeup isn't very good, which is part of the reason I've been
delaying sending it out. It is much easier to explain informally in
pictures. See
slides 27-38 of

        http://www.cs.umd.edu/~pugh/java/memoryModel/JavaOneBOF/BOF.pdf

Does anyone want to raise issues, questions or objections to this
approach for handling immutable objects?

        Bill
-------------------------------
JavaMemoryModel mailing list - http://www.cs.umd.edu/~pugh/java/memoryModel


