JavaMemoryModel: Reading List for lurkers

From: Joel Jones (jjones@uiuc.edu)
Date: Fri Aug 06 1999 - 13:01:46 EDT


For those of us who are still grappling with exactly what a memory
consistency model is, and what kinds of consistency models current machines
provide, I have found the following very helpful (a short Java illustration
follows the entries):

@Article{Adve:1996:SMC,
  author = "Sarita V. Adve and Kourosh Gharachorloo",
  title = "Shared memory consistency models: {A} tutorial",
  journal = "Computer",
  volume = "29",
  number = "12",
  pages = "66--??",
  month = dec,
  year = "1996",
  coden = "CPTRB4",
  ISSN = "0018-9162",
  bibdate = "Mon Jan 6 14:08:32 MST 1997",
  acknowledgement = ack-nhfb,
}

@TechReport{STAN//CSL-TR-95-685,
  type = "Thesi",
  number = "CSL-TR-95-685",
  title = "Memory Consistency Models for Shared-Memory
                 Multiprocessors",
  month = dec,
  notes = "[Adminitrivia V1/Prg/19960625]",
  pages = "392",
  year = "1995",
  bibdate = "June 25, 1996",
  author = "Kourosh Gharachorloo",
  url = "ftp://elib.stanford.edu/pub/reports/csl/tr/95/685/CSL-TR-95-685.pdf",
  abstract = "The memory consistency model for a shared-memory
                 multiprocessor specifies the behavior of memory with
                 respect to read and write operations from multiple
                 processors. As such, the memory model influences many
                 aspects of system design, including the design of
                 programming languages, compilers, and the underlying
                 hardware. Relaxed models that impose fewer memory
                 ordering constraints offer the potential for higher
                 performance by allowing hardware and software to
                 overlap and reorder memory operations. However, fewer
                 ordering guarantees can compromise programmability and
                 portability. Many of the previously proposed models
                 either fail to provide reasonable programming semantics
                 or are biased toward programming ease at the cost of
                 sacrificing performance. Furthermore, the lack of
                 consensus on an acceptable model hinders software
                 portability across different systems. This dissertation
                 focuses on providing a balanced solution that directly
                 addresses the trade-off between programming ease and
                 performance. To address programmability, we propose an
                 alternative method for specifying memory behavior that
                 presents a higher level abstraction to the programmer.
                 We show that with only a few types of information
                 supplied by the programmer, an implementation can
                 exploit the full range of optimizations enabled by
                 previous models. Furthermore, the same information
                 enables automatic and efficient portability across a
                 wide range of implementations. To expose the
                 optimizations enabled by a model, we have developed a
                 formal framework for specifying the low-level ordering
                 constraints that must be enforced by an implementation.
                 Based on these specifications, we present a wide range
                 of architecture and compiler implementation techniques
                 for efficiently supporting a given model. Finally, we
                 evaluate the performance benefits of exploiting relaxed
                 models based on detailed simulations of realistic
                 parallel applications. Our results show that the
                 optimizations enabled by relaxed models are extremely
                 effective in hiding virtually the full latency of
                 writes in architectures with blocking reads (i.e.,
                 processor stalls on reads), with gains as high as
                 80\%. Architectures with non-blocking reads can
                 further exploit relaxed models to hide a substantial
                 fraction of the read latency as well, leading to a
                 larger overall performance benefit. Furthermore, these
                 optimizations complement gains from other latency
                 hiding techniques such as prefetching and multiple
                 contexts. We believe that the combined benefits in
                 hardware and software will make relaxed models
                 universal in future multiprocessors, as is already
                 evidenced by their adoption in several commercial
                 systems.",
  institution = "Stanford University, Computer Systems Laboratory",
}
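
To make the ordering issue concrete for Java programmers, here is a minimal
sketch of my own (not taken from either reference). One thread publishes a
value behind a boolean flag with no synchronization; a relaxed memory model
is free to let the flag write become visible before the data write, or to
keep it from ever becoming visible to the reading thread at all:

// Minimal sketch: unsynchronized publication of data behind a flag.
// The class name and values are purely illustrative.
public class Reordering {
    static int data = 0;
    static boolean ready = false;   // deliberately NOT volatile

    static void writer() {
        data = 42;      // (1)
        ready = true;   // (2) may be reordered with, or seen before, (1)
    }

    static void reader() {
        while (!ready) { }          // may also spin forever: the flag write
                                    // need never become visible here
        System.out.println(data);   // is allowed to print 0, not just 42
    }

    public static void main(String[] args) throws InterruptedException {
        Thread r = new Thread() { public void run() { reader(); } };
        Thread w = new Thread() { public void run() { writer(); } };
        r.start();
        w.start();
        w.join();
        r.join();
    }
}

Whether declaring the flag volatile, or wrapping both accesses in
synchronized blocks, is enough to rule out the surprising outcomes is
exactly the kind of question these references (and this list) are about.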

Joel Jones
jjones@uiuc.edu

-------------------------------
JavaMemoryModel mailing list - http://www.cs.umd.edu/~pugh/java/memoryModel


