Re: JavaMemoryModel: RE: Question on memory models

From: William Pugh (pugh@cs.umd.edu)
Date: Wed Jun 30 1999 - 23:19:18 EDT


At 2:23 PM -0500 6/30/99, Sarita Adve wrote:
> But I think the first step should be to converge on the precise
>minimal definition of safety that people really want. Doug Lea has
>started this with the definition of immutable objects. Can we get agreement
>on this?

OK. First, I hope everyone will agree that nothing should be allowed
to crash or corrupt the VM, even if a program contains unsynchronized
access to shared data.

   - this primarily means that the system must not act on stale values
        in the object header, in the vtbl or other class data structures.
        In addition to virtual method dispatch, you would also need to worry
        about things like instanceof, checked casts, lengths of arrays,
        and store type checks for arrays of references.

   - you also have to make sure the program can't see garbage in pointer fields,
        but this is handled by allocating objects out of pre-zeroed memory.

More controversial is supporting initialization safety (which includes
having immutable objects be immutable). This was described in my
initial "Getting the ball rolling" message:
        http://www.cs.umd.edu/~pugh/java/memoryModel/archive/0000.html

It all really comes down to the processor being able to reorder two
reads, when the address of the second read depends on the value
loaded by the first read. If the hardware doesn't allow this
reordering, then both safe virtual method dispatch and initialization
safety are easy. If the hardware allows the reordering, we'd have to
fix it in software, and similar techniques might work for both.

At 2:23 PM -0500 6/30/99, Sarita Adve wrote:
>- Before taking the leap to relying on hardware, I would suggest spending
>some energy on determining if there is a pure software solution to this
>problem for exactly the cases we want this to work on.

For the vtbl, my first thought was to fix the problem in the SIGSEGV
handler. After talking with a couple of people here, I'm less
confident that will work. The instruction that causes the exception
isn't the one that went wrong. Rather, it is an earlier instruction
that went wrong (got 0 for the address of the vtbl). To handle this
in the SIGSEGV handler, you have to figure out which register should
contain the vtbl, figure out how to load the correct vtbl, and then
resume execution without having the OS restore all the registers to
the state they were in before the signal handler was invoked. Would
probably require kernel hacks, at the least.

Next idea: Every time you load a vtbl, check to see if it is null. If so, do
a memory barrier and reload the vtbl. Since a vtbl can never be null,
this should work, and the code to do the memory barrier and reload
the vtbl would rarely, if ever, be taken.
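
A minimal sketch of this check-and-reload idea, written in Java only to
show the control flow. In a real VM the check would sit in the generated
dispatch sequence, not in Java source. The names VtblCheck, Header, Entry,
loadVtbl and slowPathCount are hypothetical, and VarHandle.loadLoadFence()
(a much later Java API) merely stands in for whatever memory barrier the
target processor needs:

```java
import java.lang.invoke.VarHandle;

// Hypothetical model of an object header and its vtbl; not a real VM layout.
final class VtblCheck {
    interface Entry { String invoke(); }

    static final class Header {
        Entry[] vtbl;   // plain field: an unsynchronized reader might see null
    }

    static int slowPathCount = 0;   // how often the barrier path was taken

    // Load the vtbl; if it looks uninitialized, do a barrier and reload once.
    static Entry[] loadVtbl(Header h) {
        Entry[] v = h.vtbl;
        if (v == null) {                 // a valid vtbl can never be null
            slowPathCount++;
            VarHandle.loadLoadFence();   // stand-in for the memory barrier
            v = h.vtbl;                  // reload after the barrier
        }
        return v;
    }

    // Virtual dispatch through the checked vtbl load.
    static String dispatch(Header h, int slot) {
        return loadVtbl(h)[slot].invoke();
    }
}
```

When the vtbl is already visible, the only added cost is the null test;
the barrier and reload sit on a path that should almost never run.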

Of course, you have to worry about every memory reference: could this
value be stale/null? For example, perhaps you don't get a stale
reference to the vtbl, but when you read a slot of the vtbl, you get
a stale/null pointer to a method. This could happen if another
processor loaded a class, created an instance of that class, and then
stored a reference to that instance in a shared variable. When another
thread followed that reference (without synchronization), it could
get stale values for either the vtbl pointer in the object, or for
any of the fields in the class/vtbl structure.
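
At the source level, the scenario above looks roughly like the following
sketch. The names RacyPublication, Widget, publish and consume are
hypothetical. The point is only that nothing orders the writes made while
constructing the instance (its fields, and in the VM its header and class
data) before the write of the reference itself:

```java
// Hypothetical illustration of unsynchronized publication of a new instance.
final class RacyPublication {
    static class Widget {
        int size;                       // non-final field set in the constructor
        Widget(int size) { this.size = size; }
        int describe() { return size; } // call dispatched through the vtbl
    }

    static Widget shared;               // plain field: no volatile, no lock

    // Writer thread: create an instance, then store the reference.
    static void publish(int size) {
        shared = new Widget(size);
    }

    // Reader thread: follow the reference without synchronization.
    // Returns -1 if no instance has been seen yet.
    static int consume() {
        Widget w = shared;
        return (w == null) ? -1 : w.describe();
    }
}
```

Run on a single thread this behaves as expected. The concern is a reader
on another processor: once it sees a non-null reference, the reads it
performs through that reference (the vtbl pointer, a vtbl slot, or the
field size) are the ones that could return stale values.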

If you wanted to extend this to providing initialization safety, you could say:
whenever you load a field, if the value you get is the value memory
is initialized with, do a memory barrier and reload. If you made null
be a different address than 0, then for references, it would only
happen when you actually got a stale value. For ints, things are more
of a problem since there are no forbidden patterns. Whenever you
load a 0, you do a memory barrier and a reload. Yuck.
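
To make the cost for ints concrete, here is a small sketch under the same
assumptions as the scheme just described. The names ZeroReload, Cell,
loadInt and barriers are hypothetical, and VarHandle.loadLoadFence() again
stands in for the hardware memory barrier:

```java
import java.lang.invoke.VarHandle;

// Hypothetical sketch: reload any int field that reads as the default value.
final class ZeroReload {
    static final class Cell { int value; }

    static int barriers = 0;            // how many times the slow path ran

    static int loadInt(Cell c) {
        int v = c.value;
        if (v == 0) {                   // 0 may be stale, or simply the real value
            barriers++;
            VarHandle.loadLoadFence();  // stand-in for the memory barrier
            v = c.value;                // reload after the barrier
        }
        return v;
    }

    static int sum(Cell[] cells) {
        int total = 0;
        for (Cell c : cells) total += loadInt(c);
        return total;
    }
}
```

Because a legitimate 0 cannot be told apart from an uninitialized one,
every field that really holds 0 pays for a barrier on each read.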

Of course, you would only need to worry about this on a
multiprocessor machine. On a single processor machine, I'm pretty
sure you could get away with just a memory barrier as part of the
context switch between threads.

Anyone have any other software-only solutions to the virtual method
dispatch problem? Or the more general initialization safety problem?

At 2:23 PM -0500 6/30/99, Sarita Adve wrote:
>- Given that Java has enough clout, it may be possible to "force" hardware
>designers to accept that they have to provide the safeguard discussed for
>future machines. But I think this should be the last resort.

It is misleading to think of this as a Java-only issue. This will be an issue
for any multiprocessor system with:
        * OO virtual method dispatch
        * the possibility of unsynchronized access to shared data
        * a guarantee that your system won't core dump just because you have
                a data race.

Now, maybe for C++ & PThreads, people are so used to random core
dumps that this wouldn't bother them. But I think it probably should.

At 2:23 PM -0500 6/30/99, Sarita Adve wrote:
>- I also think more thought needs to be given to whether this is a desirable
>constraint at all.

As I've said, I hope that everyone agrees it is desirable that a
misbehaving program not be able to cause a VM to core dump. For
systems where this can be efficiently handled in hardware,
guaranteeing initialization safety will be very cheap.

As to why we should provide guarantees: If we were designing a
programming language to be used just for writing concurrent programs
and only by programmers who had taken an OS course and read Doug
Lea's book ( :-) ), we might be able to get away with requiring high
standards. But the fact of the matter is that millions of programmers
are coding in Java, and many of them are pretty clueless.

I also worry that if Java doesn't provide initialization safety, a
cracker will use that to attack sensitive code written by someone
else.

And of course, there is the fact that much of the existing Java code
base is broken if we don't have initialization safety.

At 2:23 PM -0500 6/30/99, Sarita Adve wrote:
>- Final comment on going the hardware route: I also think that requiring
>ordering between data dependent reads but not requiring ordering between
>control dependent reads is inelegant.

Having an instruction set where we could efficiently say "this read
must occur after that one" would be just as good. There are a lot of
places where I'd be happy to allow dependent reads to be reordered.
For example, in loading the value of a[i], I wouldn't care if the
contents of a[i] were loaded before the value of i.

But data dependence works pretty well as a substitute if we don't
have that finer control.

If making data dependences special is inelegant from a hardware point
of view, it just may be an indication that the hardware people can't
think just about hardware; they need to start thinking about
programming models as well.

        Bill



