Chris Brumme's comments don't look correct to me. His argument is basically that
1) X86 only allows loads to be reordered.
2) Therefore double checked locking works on X86.
3) Since double checked locking is useful and frequently used, we should strengthen
the memory model to look like the X86 one.
4) Actual Itanium implementations are strong enough that it works there, too.
(2) doesn't follow from (1). If you allow arbitrary load reordering, DCL
(double checked locking) is clearly broken, since the load of the "initialized"
data may occur before the initialization check completes. For the same reason,
(4) isn't correct in general, even if st instructions are really treated as st.rel.
(I neither know nor care.) His example may (or may not) work on Itanium.
But if you use a separate initialization flag, it definitely doesn't.
Thus, in my opinion, the change he proposes to the memory model would make
things more, rather than less, confusing.
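
To make the separate-flag case concrete, here is a minimal Java sketch. The names (Config, FlagDcl, getConfig) are hypothetical and not taken from Chris's example or from the Intel document. The point is only that nothing orders the final load of the data after the load of the flag.

```java
// Hypothetical names throughout; this illustrates the broken idiom, not anyone's real code.
final class Config {
    int timeout;                      // deliberately non-final: no final-field guarantees
    Config(int timeout) { this.timeout = timeout; }
}

final class FlagDcl {
    private static Config config;         // the "initialized" data
    private static boolean initialized;   // separate flag, NOT volatile

    static Config getConfig() {
        if (!initialized) {                    // unsynchronized load of the flag
            synchronized (FlagDcl.class) {
                if (!initialized) {
                    config = new Config(30);   // store of the data
                    initialized = true;        // store of the flag
                }
            }
        }
        // Nothing orders this load after the flag check above. If loads may be
        // reordered (by hardware or by compiler speculation), another thread can
        // read initialized == true and still see a stale null here.
        return config;
    }
}
```

Run single-threaded this behaves as expected. The failure only shows up under concurrency, on a machine or compiler that actually reorders the two loads.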
Whether or not DCL works on X86 is unclear. The Intel documentation that David pointed
out in the next message (Developing Multithreaded Applications: A Platform Consistent
Approach) recommends it. But it doesn't seem to distinguish between X86 and
Itanium, and the supplied code is clearly broken on Itanium. It also definitely
breaks on X86 if the compiler generates speculative loads, which may be beneficial
to performance. Hence it at least needs a "volatile". (And adding the "volatile"
to the flag would fix the Itanium code, since volatile implies .rel/.acq there.)
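
As a sketch of the "volatile" fix in Java terms: this assumes the Java 5 and later (JSR-133) meaning of volatile, under which a volatile read has acquire semantics and a volatile write has release semantics. The names (Settings, VolatileFlagDcl, getSettings) are hypothetical.

```java
// Hypothetical names; assumes JSR-133 (Java 5+) volatile semantics.
final class Settings {
    int timeout;
    Settings(int timeout) { this.timeout = timeout; }
}

final class VolatileFlagDcl {
    private static Settings settings;             // the "initialized" data
    private static volatile boolean initialized;  // flag is now volatile

    static Settings getSettings() {
        if (!initialized) {                       // volatile load: acquire
            synchronized (VolatileFlagDcl.class) {
                if (!initialized) {
                    settings = new Settings(30);  // ordinary store of the data
                    initialized = true;           // volatile store: release
                }
            }
        }
        // This load cannot be moved ahead of the volatile load of the flag,
        // so a thread that saw initialized == true also sees the data.
        return settings;
    }
}
```

The volatile also stops the compiler from hoisting or speculating the data load above the flag check, which is the X86 concern raised above.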
Assuming the compiler preserves ordering (which it will with enough "volatile"
declarations, I presume), the Intel X86 spec states:
In single-processor systems:
1. Reads can be carried out speculatively and in any order.
In multiple-processor systems:
* Individual processors use the same ordering rules as in a single-processor system.
I read that as saying that load-load ordering is not guaranteed.
However, as Doug has also pointed out, there is a reasonable amount of evidence that:
1) Current processors do not need any barriers other than in the store-load case.
(There is some code in the Linux kernel that suggests that IDT WinChips are an
exception to this. But I don't know whether those were even MP-capable.)
2) If any X86 processor actually did visibly reorder loads, lots of software would break.
Therefore it's not likely to happen. The fact that they're still recommending DCL tends
to confirm this.
Thus, in the absence of compiler reordering, DCL probably does work on X86, at least
as far as I can tell.
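
For reference, the store-load case mentioned above is the one where each thread stores to one location and then loads another. A minimal Java sketch with hypothetical names (StoreLoadDemo, bothSawZero) follows. It assumes Java 5+ volatile semantics, under which the outcome "both threads read 0" is forbidden. With plain (non-volatile) fields that outcome would be allowed, and it is the one reordering current X86 hardware is believed to exhibit.

```java
// Hypothetical names; a sketch of the classic store-then-load test.
final class StoreLoadDemo {
    // volatile accesses are sequentially consistent with each other in Java 5+;
    // on X86 a JVM typically has to emit a store-load fence after each volatile store.
    static volatile int x, y;
    static int r1, r2;   // results, read only after join()

    // Runs one round; returns true if both threads read 0.
    static boolean bothSawZero() {
        x = 0;
        y = 0;
        Thread a = new Thread(() -> { x = 1; r1 = y; });  // store x, then load y
        Thread b = new Thread(() -> { y = 1; r2 = x; });  // store y, then load x
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return r1 == 0 && r2 == 0;
    }
}
```

DCL itself never relies on store-load ordering, only on store-store on the writer side and load-load on the reader side, which is why this case does not affect the conclusion above.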
> -----Original Message-----
> From: David Holmes [mailto:firstname.lastname@example.org]
> Sent: Monday, May 19, 2003 7:14 PM
> To: jmm
> Subject: JavaMemoryModel: Memory Model updates in the CLR
> FYI just came across this via some interesting discussions in
> David Holmes
JavaMemoryModel mailing list - http://www.cs.umd.edu/~pugh/java/memoryModel