Re: JavaMemoryModel: Re: Ownership Types for Safe Programming (was: Disallowing badly synchronized programs)

From: jmanson@cs.umd.edu
Date: Sun Feb 16 2003 - 12:33:38 EST


> In Boyapati's system, the race-freeness follows (as I understand it) from
> the fact that unique pointers and pointers to immutable objects are
> themselves recursively protected by one of the four protection mechanisms.
> The recursion stops at thread-local objects and objects protected by a lock.
> I think making the unique pointer or the pointer to the immutable object
> volatile is not enough, since it does not protect the fields of the object.

I am afraid I don't really understand this. The semantics of volatile in
the upcoming revision to the JMM are that every write that was visible to
a thread before it performed a volatile write becomes visible to any other
thread once that thread reads the volatile and sees the write. That
includes writes to the fields of the object to which the unique pointer
points (assuming that you have to write to the fields before you give up
the unique pointer!).

The way I imagined it acting was: one thread constructs an object, then
stores a reference to it into a shared volatile field (thereby losing the
unique pointer, and "publishing" all of the writes it has seen, including
the writes to the object's fields). Another thread then picks up that
unique pointer by performing a volatile read (thereby acquiring both the
unique pointer and all of the writes the first thread had seen). Is this
not how this sort of thing would work?
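
Concretely, something along these lines is what I have in mind (the class
and field names here are just illustrative, not anything from Boyapati's
system):

    class Point { int x, y; }

    class Publisher {
        static volatile Point shared;   // volatile "mailbox" for the reference

        // Thread 1: builds the object through its unique pointer, then
        // gives that pointer up by storing it into the volatile field.
        static void publish() {
            Point p = new Point();
            p.x = 1;
            p.y = 2;
            shared = p;                 // volatile write publishes p.x and p.y
        }

        // Thread 2: the volatile read acquires the reference together with
        // every write thread 1 made before its volatile write.
        static void consume() {
            Point p = shared;           // volatile read
            if (p != null) {
                int sum = p.x + p.y;    // guaranteed to see 1 and 2
            }
        }
    }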

As for immutable objects, I would equally think that immutability can be
enforced by making all of the fields of the object final (which also gives
transitive guarantees for the objects reachable through those final
fields).
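
For example, under the proposed final field semantics a class like this
sketch (the class is mine, not from the paper) behaves as an immutable
object:

    class ImmutablePoint {
        final int x;
        final int[] history;    // reachable only through a final field

        ImmutablePoint(int x) {
            this.x = x;
            this.history = new int[] { x };  // fully written in the constructor
        }
    }

Any thread that obtains a reference to an ImmutablePoint after it is
constructed is guaranteed to see x and history[0] correctly, with no
synchronization on the reader's side.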

> > This is, of course, another interesting thing to explore, but much harder:
> > is there a way to allow complete freedom to roll your own synchronization
> > and still prevent data races? It can't be done with static data race
> > detection, because that is NP Hard. What we have now is a set of
> > heuristics for detection, like "if there is a unique pointer, there is no
> > data race".
>
> I assume you mean "undecidable" instead of "NP Hard"?

My understanding of this is that it is NP Hard. See Netzer and Miller's
paper "On the Complexity of Event Ordering for Shared-Memory Parallel
Program Executions" in the Proceedings of the 1990 International Conference
on Parallel Processing.

http://citeseer.nj.nec.com/netzer90complexity.html

> A language cannot be everything to everyone. For those who want maximum
> efficiency and flexibility, there is C++. Disallowing data races aligns with
> Java's design goals, by making it easier to write secure code (remember for
> example the attack with the "/tmp/usr" string), by simplifying the semantics
> (more specifically: the memory model), and by removing a source of bugs that
> are hard to "detect, reproduce and eliminate" (Boyapati).

Well, of course, the design goals of Java are not up to me. I do think
that the current state of the art in Java-based languages that disallow
data races is arguably difficult to teach, and definitely too restrictive
for a class of programming techniques that are desirable in Java. Miles's
I/O example is one of these restrictions; I would also be surprised if
Chandra's system could express everything in util.concurrent with ease.

I should point out that I do feel the informal semantics of the memory
model are relatively easy to learn - certainly, if you are doing serious
multithreaded programming, the notions shouldn't be too crazy. The
formalism is difficult, but people don't really have to understand the
formalism. The informal notions that everyone should know are just:

1) An unlock, or a write to a volatile, acts as a publication of data seen
by the thread performing that action.

2) A lock, or a read of a volatile, acts to acquire, for the thread that
performed the action, the data published by the last unlock or write to a
volatile on the same monitor or volatile variable.

3) The upshot of this is that you are not guaranteed to see anything
written by another thread unless you perform synchronization on the same
monitor / volatile variable.
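
As a minimal sketch of those three rules with a monitor (again, the class
is just for illustration):

    class Box {
        private int value;
        private boolean ready;

        // Rule 1: the unlock at the end of this method publishes the
        // writes to value and ready.
        synchronized void put(int v) {
            value = v;
            ready = true;
        }

        // Rule 2: acquiring the same monitor picks up whatever the last
        // unlock published, so if ready is true here, value is visible too.
        synchronized int get() {
            return ready ? value : -1;
        }

        // Rule 3: reading value or ready without locking this monitor
        // (or going through a volatile) carries no visibility guarantee.
    }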

That's it, for normal and volatile fields. The formalism is complicated
because we had to provide a set of minimal guarantees for code that is
unsynchronized. The only real change for programmers is that they have to
discard their notions of sequential consistency for unsynchronized code,
but they arguably shouldn't have had those notions in the first place.

For final fields, there is one important rule:

If you initialize the final field, and fully construct every object
reachable from it, inside the object's constructor, and you don't let a
reference to that object escape the constructor, then the field is
immutable, and other threads are guaranteed to see the correctly
constructed values. If another thread sees a reference to the object that
escaped during construction, that thread does not get the final field
guarantees.
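
To put both sides of that rule in code (the class names are mine):

    class SafeConfig {
        final int[] limits;

        SafeConfig() {
            limits = new int[] { 10, 20 };   // fully constructed here
            // 'this' never escapes the constructor, so any thread that later
            // obtains a reference to this object sees limits == {10, 20}.
        }
    }

    class LeakyConfig {
        static LeakyConfig lastCreated;      // ordinary, non-volatile field
        final int[] limits;

        LeakyConfig() {
            lastCreated = this;              // reference escapes before
            limits = new int[] { 10, 20 };   // construction finishes...
            // ...so a thread that reads lastCreated gets no final field
            // guarantee for limits, and may even see it as null.
        }
    }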

That's the programmer's eye view, really. It is only complicated if you
try to thoroughly understand what happens when your code is incorrectly
synchronized. Even then, we can probably explain the upshot relatively
simply.

Anyway, can you tell that "the memory model is complicated" is not a meme
I really want to take hold? I'm just afraid that everyone will ignore it
and keep writing double-checked locking without using volatiles.
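
For the record, the volatile version I have in mind looks roughly like
this (Foo and Helper are just placeholders):

    class Helper { /* state that needs lazy initialization */ }

    class Foo {
        private volatile Helper helper;     // volatile is the crucial part

        Helper getHelper() {
            Helper h = helper;              // unsynchronized first check
            if (h == null) {
                synchronized (this) {
                    h = helper;
                    if (h == null) {        // second check, under the lock
                        h = new Helper();
                        helper = h;         // volatile write publishes the
                    }                       // fully constructed Helper
                }
            }
            return h;
        }
    }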

                                        Jeremy
-------------------------------
JavaMemoryModel mailing list - http://www.cs.umd.edu/~pugh/java/memoryModel


