SIGMETRICS 2001 / Performance 2001

Evaluating The Performance of Non-Blocking Synchronisation on Shared-Memory Multiprocessors

Authors

Philippas Tsigas
Yi Zhang

Chalmers University of Technology

Abstract

Parallel programs running on shared-memory multiprocessors coordinate via shared data objects and structures. To keep these shared data structures consistent, programs typically rely on some form of software synchronisation. Unfortunately, typical software synchronisation mechanisms usually perform poorly, because they generate large amounts of memory and interconnection network contention and, more significantly, because they produce convoy effects that degrade performance significantly in multiprogramming environments: if a process holding a lock is preempted, other processes waiting for the lock on different processors cannot proceed. Researchers have introduced non-blocking synchronisation to address these problems. Non-blocking implementations allow multiple tasks to access a shared object at the same time without enforcing mutual exclusion. However, the performance implications of non-blocking synchronisation are not well understood on modern systems or on real applications. In this paper we study the impact of non-blocking synchronisation on parallel applications running on a modern, 64-processor, cache-coherent, shared-memory multiprocessor system: the SGI Origin 2000. Cache-coherent non-uniform memory access (ccNUMA) shared-memory multiprocessor systems have attracted considerable research and commercial interest in recent years. In addition to presenting performance results on a modern system, we investigate the key synchronisation schemes used in multiprocessor applications and their efficient transformation into non-blocking ones.
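As a rough illustration of the non-blocking idea described in the abstract, the following minimal sketch updates a shared counter with a compare-and-swap retry loop instead of a lock. It is not taken from the paper: the function names `lock_free_add` and `run_workers`, the counter workload, and the choice of Rust are all assumptions made for this example.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Illustrative (hypothetical) lock-free add: read the current value, then try
/// to publish the new value with compare-and-swap. If another thread changed
/// the counter in between, retry with the value that thread left behind.
/// No thread ever holds a lock, so a preempted thread cannot stall the others.
fn lock_free_add(counter: &AtomicU64, delta: u64) -> u64 {
    let mut seen = counter.load(Ordering::Relaxed);
    loop {
        match counter.compare_exchange_weak(
            seen,
            seen + delta,
            Ordering::AcqRel,
            Ordering::Relaxed,
        ) {
            // Our update was installed; return the value we wrote.
            Ok(_) => return seen + delta,
            // Lost the race (or failed spuriously); retry from the fresh value.
            Err(actual) => seen = actual,
        }
    }
}

/// Hypothetical driver: `threads` workers each perform `iters` increments
/// on one shared counter and the final count is returned.
fn run_workers(threads: usize, iters: u64) -> u64 {
    let counter = Arc::new(AtomicU64::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let c = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..iters {
                    lock_free_add(&c, 1);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().expect("worker thread panicked");
    }
    counter.load(Ordering::Acquire)
}
```

Calling `run_workers(4, 10_000)` from a `main` function returns 40000, since every successful compare-and-swap adds exactly one. A lock-based version would give the same count, but a worker preempted while holding the lock would block all the others, which is the convoy effect the abstract refers to.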

[Last updated Fri Mar 23 2001]
