JavaMemoryModel: recap on threading models

From: Lanchon
Date: Fri Dec 05 2003 - 05:10:35 EST

Well! When I posted against spurious wakeups a few days ago, I was told getting rid of them was only possible, maybe, in a parallel universe. Now things seem to have changed; the bad news is I don't think I like where they are going. There seem to be some competing goals, and trying to satisfy them all is adding complexity to the semantics of the language and the VM implementations. I'm not familiar with pthreads or the like, since I tend to work with higher-level languages or platforms, but I'll try to recap what I understand from these discussions anyway. From what I read, there are 3 models being considered:

allow spurious wakeups (SW) (community review 2) model

The spec forces no lost notifications (LN) and allows the prefer-IE policy, and it was believed that SW were required to allow efficient implementations of that policy. I understand this is the only reason for allowing SW on most if not all platforms. (Am I right?)


-Easy to implement on all platforms. Weak spec, room for future innovations.


-STRONG: it breaks code. It does break mine, my otherwise correct code at least, and probably common code involving many possible waiters and a notifyAll. It is not true that using no-priority monitors forces you to use the "signals as hints" paradigm; some code can be simpler and faster without it.

-It adds complexity or ugliness to the semantics.

-It may cause inefficiencies where threads are interrupted while waiting on a monitor with a long queue, since notifyAll must be issued to avoid LN (or at least, so it was thought).

-As I posted before, due to SW it seems impossible to write any multithreaded Java program that uses wait/notify and is guaranteed ever to make progress. It complicates, or makes impossible, the definition of any useful form of progress guarantee. How would SW fit into real-time Java?
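To make the cost concrete: under the SW model the only safe idiom is the "signals as hints" loop, where every wait() sits inside a loop that rechecks the condition. A minimal sketch (the class and its names are mine, not from any spec):

```java
// A one-slot cell illustrating the "signals as hints" idiom: wait() is
// always called in a loop that rechecks the condition, so a spurious
// wakeup (or an unrelated notifyAll) cannot make a taker proceed while
// no value is present.
class BlockingCell {
    private Object value;

    synchronized void put(Object v) {
        value = v;
        notifyAll(); // wake all waiters; each one rechecks the condition
    }

    synchronized Object take() throws InterruptedException {
        while (value == null) { // recheck after every wakeup, spurious or not
            wait();
        }
        Object v = value;
        value = null;
        return v;
    }
}
```

The while loop (rather than a simple if) is exactly the tax the SW model imposes on all wait/notify code, including code that would otherwise be correct without it.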

force prefer-notify model

Again, I understand that under this model SW are not needed for efficient implementation on most if not all platforms. (Am I right?)


-STRONG: cures SW.

-Doug says: "plausible path for making the spec simpler", "avoiding yet more years of confusion and unintended non-portability".


-STRONG: Doug: "JVMs won't be able to help make systems more responsive to interruptions as a quality-of-implementation feature". Some say that if you need responsiveness you should check interrupt status after wait anyway, since prefer-IE would never be mandated. I say that's not applicable most of the time. I believe most programmers will avoid interrupting their own code; rather, they would signal termination by the more mundane and ordinary inter-thread communication means they are accustomed to, and reserve interrupts mostly for terminating code written by others (and also native code, I guess; I've never used JNI - can native code respond to interrupts?). So adding a check there is out of the question.
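The "mundane inter-thread communication means" for signalling termination can be as simple as a stop flag checked in the wait loop, with no interrupts involved. A sketch under that assumption (class and method names are mine):

```java
// Signalling termination through ordinary shared state rather than
// interrupts: a stop flag set and read under the monitor's lock.
class StoppableWorker {
    private boolean stop = false;
    private boolean workReady = false;

    synchronized void requestStop() {
        stop = true;
        notifyAll(); // wake the worker so it observes the flag
    }

    synchronized void submitWork() {
        workReady = true;
        notifyAll();
    }

    // Returns true if work arrived, false if a stop was requested.
    synchronized boolean awaitWork() throws InterruptedException {
        while (!workReady && !stop) {
            wait();
        }
        if (stop) {
            return false;
        }
        workReady = false;
        return true;
    }
}
```

Code written this way never needs to check interrupt status after wait, which is the situation the post describes: interrupts stay reserved for terminating code you didn't write.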

layer X or Sylvia's model

Accurately detect would-be LN and keep track of 'generations' of waiting threads, to avoid both LN and SW while allowing efficient prefer-IE implementations, by introducing a software layer (layer X) above most threading libraries on most platforms.


-STRONG: it's the only model that satisfies all competing goals (no LN, no SW, efficient prefer-IE).

-May hide unexpected/wrong threading behaviour from Java. Sylvia: "the less dependent that a JVM is on the correct functioning of the OS ... the less vulnerable the JVM are to mistakes on the OS". "the Itanium processor has a weaker memory model than current Pentium processors do, and that might also impact on the correctness of some threading implementations".


-STRONG?: VM complexity and performance. Will VMs need to allocate heap memory in places where no-layer-X VMs wouldn't, in order to keep track of generations of waiting threads? Will there be a footprint increase in all Java objects in the VM?

-STRONG: semantics complexity. The behaviour of a notified-and-interrupted thread acting as if it was never notified, by notifying another thread that could have been notified back when the interrupted thread was, plus the need to carefully choose a thread in every notify so that all previous notifies can be "undone" in case of later interruptions, strikes me as rather complex. Note that in case of interruption this involves a form of asynchronous selection of a thread to notify, an arbitrary amount of time after the original notify was done - even though the spec calls for synchronous notifies - and this fact is hidden by layer X's history keeping. But the question is: can ALL this complexity be hidden from the spec, or will some parts show through? Suppose it can, which I doubt; can a thoughtful user be expected to ever reconcile in her head prefer-IE, no LN, and synchronous notifies, all at the same time? She shouldn't; it can't be done! Notifies would be asynchronous in this case - that's what layer X's history data is for. This is what I would perhaps think if presented with this model: the spec is wrong; assume only 2 of these 3 properties hold on a given VM; VMs will differ; distrust all 3 properties. That wouldn't be a bright outcome for so much effort. I think the spec semantics will have to be complex.

-Easy way out: will VM implementors shy away from layer X and fall back to prefer-notify, to reduce complexity or just to benchmark better? In that case users would have the complex semantics of layer X and prefer-IE, and none of the benefits.

-May hide too much of the platform's threading behaviour. Trotter: "The arguments in the J2ME space are even more tricky. In that market the 'quality of service' delivered by the OS is a key differentiator and there are many realtime OS's which are similar but absolutely not the same. Threading is a key differentiator and the last thing that customers want is for Java to 'insulate' them from the OS threading semantics; they've just paid good money for them!". In particular, this model restricts which threads notify can choose. This may interfere with the realtime semantics of notify (whether mandated by an RT spec, or enforced as a quality-of-service differentiator), for example by forcing notify to choose a lower-priority thread. (I believe this can happen, but Sylvia or Doug should check.)
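For contrast with a full layer X, the effect it is meant to automate can be crudely approximated in user code: a waiter that is interrupted after possibly being notified passes the notification on before propagating the exception. A sketch (class and method names are mine, and this is a user-level workaround, not the layer X design itself):

```java
// A counting gate that forwards a possibly-consumed notification when a
// waiter is interrupted, so the notification is not lost under prefer-IE.
class ForwardingGate {
    private int permits = 0;

    synchronized void acquire() throws InterruptedException {
        try {
            while (permits == 0) {
                wait();
            }
        } catch (InterruptedException e) {
            // Our interrupt may have raced with a notify() aimed at us;
            // re-notify so some other waiter can claim it. notify() on a
            // monitor with no waiters is a harmless no-op.
            notify();
            throw e;
        }
        permits--;
    }

    synchronized void release() {
        permits++;
        notify();
    }

    synchronized int availablePermits() {
        return permits;
    }
}
```

Note that this workaround is cruder than layer X: the forwarded notify may wake a waiter from a different "generation" than the one the original notify targeted, which is precisely the imprecision layer X's history keeping is supposed to remove.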

As I understand it, the discussion is centered on these 3 models. I don't like any of them. In my previous posting I proposed other alternatives, but no one told me why they were not acceptable. So here they are again:

restricted spurious wakeups model

In the spec, restrict the cases in which SW are legal (i.e., only in the presence of interrupts and the like).


-Easy to implement on all platforms (actually, the same implementation as the SW model).

-It breaks much less code than unrestricted SW. (Maybe none at all, since VMs may already be producing SW when threads are interrupted, but I really don't know how they behave. Can someone comment?)

-It cures (sort of) the difficult progress-guarantee issue, and should be easier on RT extensions.


-The complexity or ugliness is still there.

-Some code is still broken.

allow lost notifications model

Allow LN when a thread is interrupted after being notified. I can't see why EVERYBODY has rejected this. Am I missing something? I'm not sure of some things, so I'll avoid the good/bad classification.

-Cures SW.

-Cures the difficult progress-guarantee issue; does not interfere with RT extensions.

-Complexity: can this be done without a layer X? If not, would it be substantially simpler than Sylvia's model? Runtime and footprint costs?

-Simple semantics and no ugliness. Synchronous notifies plus prefer-IE logically imply the existence of LN: if a notify synchronously selects a waiter, and that waiter then leaves wait via InterruptedException because prefer-IE says the interrupt wins, the notification it consumed is gone - an LN by definition. That is simple to understand; no magic involved.

-Problem is, does it break code? Do the pre-Tiger semantics allow LN (sorry, I'm lazy)? Do current VMs incur LN when a thread is interrupted after being notified? Does it actually break more existing code than SW?

-Easy on the developer. An algorithm relying on no LN, say some form of daisy chaining, is easier to write since there will be no SW. Only if the developer is also interrupting threads would he face a problem, in which case he would have to notify after interrupting, and most probably also recheck status after waits (the same as he would always be forced to do under the SW model).

-Would it be fair to say that code broken by this model would probably also be broken by SW anyway?
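The "notify after interrupting" discipline mentioned above could look like this. A sketch assuming the allow-LN model under discussion, not any shipped spec (class and method names are mine):

```java
// Under an allow-LN model, the thread that interrupts a waiter also issues
// a notify(), regenerating any notification the interrupt may have
// displaced. Waiters themselves need no extra defensive code.
class Gate {
    private int permits = 0;

    synchronized void acquire() throws InterruptedException {
        while (permits == 0) {
            wait();
        }
        permits--;
    }

    synchronized void release() {
        permits++;
        notify();
    }

    // Cancel one waiter without risking a lost notification: interrupt it,
    // then re-issue a notify in case the interrupt consumed one. A notify()
    // with no remaining waiters is a harmless no-op.
    synchronized void cancel(Thread waiter) {
        waiter.interrupt();
        notify();
    }

    synchronized int availablePermits() {
        return permits;
    }
}
```

The point of the model is that only code which interrupts (the cancel path) pays any extra cost; ordinary waiters and notifiers stay as simple as under a no-LN, no-SW spec.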

I do like these 2 models, especially the last one, but I think it was discarded by you for some reason. I would appreciate comments on this, as I would like to better my understanding of these matters.


JavaMemoryModel mailing list

This archive was generated by hypermail 2b29 : Thu Oct 13 2005 - 07:00:55 EDT