You are granted permission for the non-commercial reproduction, distribution, display, and performance of this technical report in any format. However, this permission is only for a period of 45 (forty-five) days from the most recent time that you verified that this technical report is still available from the Department of Computer Science of the University of Maryland at College Park under terms that include this permission. All other rights are reserved by the author(s).
XMT-M: A Scalable Decentralized Processor. Efraim Berkovich. Joseph Nuzman. Manoj Franklin. Bruce Jacob. Uzi Vishkin. September 1999.
A defining challenge for research in computer science and engineering has been the ongoing quest for reducing the completion time of a single computation task. Even outside the parallel processing communities, there is little doubt that the key to further progress in this quest is to do parallel processing of some kind. A recently proposed parallel processing framework that spans the entire spectrum from (parallel) algorithms to architecture to implementation is the explicit multi-threading (XMT) framework. This framework provides: (i) simple and natural parallel algorithms for essentially every general-purpose application, including notoriously difficult irregular integer applications, and (ii) a multi-threaded programming model for these algorithms which allows an ``independence-of-order'' semantics: every thread can proceed at its own speed, independent of other concurrent threads. To the extent possible, the XMT framework uses established ideas in parallel processing. This paper presents XMT-M, a microarchitecture implementation of the XMT model that is possible with current technology. XMT-M offers an engineering design point that addresses four concerns: buildability, programmability, performance, and scalability. The XMT-M hardware is geared to execute multiple threads in parallel on a single chip: relying on very few new gadgets, it can execute parallel threads without busy-waits! Existing code can be run on XMT-M as a single thread without any modifications, thereby providing backward compatibility for commercial acceptance. Simulation-based studies of XMT-M demonstrate considerable improvements in performance relative to the best serial processor even for small, and therefore practical, input sizes. (Also cross-referenced as UMIACS-TR-99-55) University of Maryland Institute for Advanced Computer Studies, Department of Computer Science, University of Maryland,
Last Generated Fri Aug 11 04:01:01 EDT 2000