Benchmarking a Network of PCs Running Parallel Applications
(Presented at IPCCC'98, Phoenix, AZ, USA, Feb. 1998).
We present a benchmarking study that compares
the performance of a network of four PCs connected by 100 Mb/s
Fast Ethernet running three different system software
configurations: TCP/IP on Windows NT, TCP/IP on Linux, and a
lightweight message-passing protocol (U-Net Active Messages) on Linux.
For each configuration, we report results for communication
micro-benchmarks and the NAS parallel benchmarks. For the NAS
benchmarks, the overall running time under Windows NT TCP/IP was 12
to 500 percent longer than under Linux TCP/IP. We
show that longer code paths and poor memory hierarchy utilization
are responsible for this difference. Likewise, the Linux U-Net-based
message-passing protocol outperformed the Linux TCP/IP version by 5
to almost 200 percent. We also show that using Linux U-Net we are
able to achieve 125-microsecond latency between two processes using
PVM. In addition, we show that poor memory system
performance in the form of increased kernel and user mode cache
misses contributes to NT's poor performance on the NAS
kernels. Finally, we report that the default math libraries
supplied under NT (for both gcc and Visual C++) are substantially
slower than the math library supplied with Linux.
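To make the micro-benchmark measurements concrete, the following is a
minimal sketch (not the paper's actual harness) of a TCP ping-pong
latency test in C; the iteration count is an arbitrary choice, and in
practice TCP_NODELAY would be set on both sockets to avoid
Nagle-induced delays.

    /* Minimal ping-pong latency sketch; not the harness used in the
     * paper.  Assumes fd is an already-connected TCP socket whose peer
     * echoes each byte back, with TCP_NODELAY set on both ends. */
    #include <sys/time.h>
    #include <unistd.h>

    #define ROUND_TRIPS 10000          /* arbitrary iteration count */

    double pingpong_latency_us(int fd)
    {
        char buf = 'x';
        struct timeval t0, t1;
        int i;

        gettimeofday(&t0, NULL);
        for (i = 0; i < ROUND_TRIPS; i++) {
            if (write(fd, &buf, 1) != 1 || read(fd, &buf, 1) != 1)
                return -1.0;           /* treat any short I/O as failure */
        }
        gettimeofday(&t1, NULL);

        return ((t1.tv_sec - t0.tv_sec) * 1e6 +
                (t1.tv_usec - t0.tv_usec)) / (2.0 * ROUND_TRIPS);
    }

Halving the round-trip time gives the one-way latency figure of the
kind reported above.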
The slides and
paper for this presentation are available.
MDL: A Language and Compiler for Dynamic Program Instrumentation
(Presented at PACT'97, San Francisco, CA, USA, November 1997).
We use a form of dynamic code generation, called dynamic
instrumentation, to collect data about the execution of an
application program. This approach allows us to instrument running
programs to collect performance and other kinds of information.
The instrumentation code is generated incrementally and can be inserted
and removed at any time.
Our system currently runs on the SPARC, PA-RISC, Power2, Alpha,
and x86 architectures.
Specifications of what data to collect are written in a
specialized, platform-independent language called the Metric
Description Language (MDL).
MDL provides a concise way to specify how to constrain performance
data to particular resources such as modules, procedures, nodes,
files, or message channels (or combinations of these resources).
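As a rough illustration of what dynamically inserted instrumentation
computes (this is not Paradyn's actual trampoline code, and all names
below are invented for the sketch), here is the kind of counting and
timing code that might be spliced in at a procedure's entry and exit:

    /* Illustration only: counter/timer updates of the sort dynamic
     * instrumentation splices in at a procedure's entry and exit.
     * Names are invented, not Paradyn/MDL identifiers; a real
     * implementation must also handle recursion and threads. */
    #include <sys/time.h>

    static long   call_count;    /* e.g., number of calls to foo() */
    static double cum_time;      /* cumulative seconds spent in foo() */
    static double entry_stamp;   /* timestamp of current activation */

    static double now_seconds(void)
    {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return tv.tv_sec + tv.tv_usec * 1e-6;
    }

    void on_entry(void)   /* conceptually spliced before foo()'s body */
    {
        call_count++;
        entry_stamp = now_seconds();
    }

    void on_exit(void)    /* conceptually spliced at foo()'s returns */
    {
        cum_time += now_seconds() - entry_stamp;
    }

Because such snippets are generated and patched in at run time, they
can be added or removed without restarting the program.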
The slides and
paper for this presentation are available.
Using Content-Derived Names for Configuration Management
Presented at the 1997 ACM Symposium on Software Reuse (SSR)
Configuration management of compiled software artifacts (programs,
libraries, icons, etc.) is a growing problem as software reuse becomes
more prevalent. For an application composed from reused libraries and
modules to function correctly, all of the required files must be
available and be the correct version. In this paper, we present a
simple scheme to address this problem: content-derived names (CDNs).
Computing an object's name automatically using digital signatures
greatly eases the problem of disambiguating multiple versions of an
object. By using content-derived names, developers can ensure that
only those software components that have been tested together are
permitted to run together.
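The core idea can be sketched in a few lines: derive an object's name
from a hash of its contents. The paper computes names with digital
signatures; the 64-bit FNV-1a hash below is a stand-in used only to
keep the example self-contained, and a real CDN scheme needs a
collision-resistant digest such as MD5 or SHA.

    /* Sketch of deriving a content-derived name for a file.  FNV-1a
     * stands in for the digital signature used in the paper. */
    #include <stdio.h>
    #include <stdint.h>

    static uint64_t fnv1a_file(FILE *f)
    {
        uint64_t h = 0xcbf29ce484222325ULL;   /* FNV-1a offset basis */
        int c;
        while ((c = fgetc(f)) != EOF) {
            h ^= (uint64_t)(unsigned char)c;
            h *= 0x100000001b3ULL;            /* FNV-1a prime */
        }
        return h;
    }

    int main(int argc, char **argv)
    {
        FILE *f;
        if (argc != 2 || (f = fopen(argv[1], "rb")) == NULL)
            return 1;
        /* the hash of the contents becomes the object's name */
        printf("%016llx\n", (unsigned long long)fnv1a_file(f));
        fclose(f);
        return 0;
    }

Two builds of an object that differ in even one byte thus get distinct
names, so version mismatches are detected by construction.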
The slides and
paper for this presentation are available.
Internet: The Technology Behind the Hype,
Presented as part of Math Awareness Week 1997 at the University of Maryland
Starting with a small project called ARPANET in 1969, the Internet has
evolved into a network of networks connecting tens of millions of hosts.
In this talk, I will review the history and evolution of the technology
that is used in the Internet.
In addition, I will explain the ideas behind many of the buzzwords and
acronyms associated with computer networking.
I will also discuss issues and technologies that will
drive Internet evolution for the next decade.
The slides for this presentation are available.
Internet: The Technology Behind the Hype,
Originally Presented to the University of Maryland ACM Chapter
In the past couple of years, the Internet has gone from an obscure
research project to a ubiquitous icon in popular culture. Behind the
hype is a sophisticated collection of technologies that have been in
continuous use for almost thirty years. Starting with a small project
called ARPANET in 1969, the Internet has evolved into a network of
networks so large that it is impossible to count the number of users or
computers connected. In this talk, I will review the history and
evolution of the technology that is used in the Internet. In addition,
I will explain the ideas behind many of the buzzwords and acronyms
associated with computer networking, such as router, bridge, fuzzball,
ARP storm, IPNG, SNMP, and HTTP. I will also discuss issues and
technologies that will drive Internet evolution for the next decade.
The slides for this presentation are available.
Online "what-if" Metrics,
Originally Presented at the University of Wisconsin
In this talk I will describe a new technique to assist programmers with
making informed decisions about how to tune their parallel programs.
The technique permits programmers, while their programs are running,
to ask "what-if questions about potential tuning alter-natives. Two
examples of "what-if" questions will be presented. First, I will
describe a non-trace based algorithm to compute the critical path
profile of the execution of a mes-sage passing parallel program. The
algorithm permits starting or stopping the critical path computation
during program execution and reporting intermediate values. Second, I
will describe a metric called Load Balancing Factor (LBF) that
assesses the impact of moving where a computation is performed rather
than reducing its execution time. LBF can be computed at either
process or procedure granularity. Finally, I will present a brief
case study that quantifies the runtime overhead of our algorithms and
demonstrates their ability to accurately predict the performance of
the tuning options.
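One way an online, non-trace-based critical path computation can work
is to piggyback each process's accumulated critical-path length on its
outgoing messages, with the receiver keeping the larger of its own and
the sender's value. The sketch below illustrates that general idea
only; it is not the paper's algorithm, and msg_send, msg_recv, and
local_work_since_last_event are assumed placeholders.

    /* Piggybacking critical-path lengths on messages so the critical
     * path can be maintained online, without traces.  A sketch of the
     * general idea; the msg_* calls and the work-accounting helper
     * are assumed placeholders, not a real message-passing API. */
    extern void   msg_send(int dest, const void *buf, int len); /* assumed */
    extern void   msg_recv(int src, void *buf, int len);        /* assumed */
    extern double local_work_since_last_event(void);            /* assumed */

    static double cp_length;   /* this process's CP length so far */

    void cp_send(int dest, const void *buf, int len)
    {
        cp_length += local_work_since_last_event();
        msg_send(dest, &cp_length, sizeof cp_length); /* piggyback */
        msg_send(dest, buf, len);
    }

    void cp_recv(int src, void *buf, int len)
    {
        double sender_cp;
        msg_recv(src, &sender_cp, sizeof sender_cp);
        msg_recv(src, buf, len);
        cp_length += local_work_since_last_event();
        if (sender_cp > cp_length)    /* the longer path dominates */
            cp_length = sender_cp;
    }

Because the running maximum is available at every process at all
times, the computation can be started, stopped, or sampled for
intermediate values during execution.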
The slides and
two papers (Critical Path and
Load Balancing Factor)
for this presentation are available.
An Online Computation of Critical Path Profiling,
Originally Presented at SPDT'96 (Philadelphia, PA)
The slides and
paper for this presentation are available.
Tuning the Performance of I/O-Intensive Parallel Applications,
Originally Presented at USC Information Sciences Institute
Getting good I/O performance from parallel programs is a critical
problem for many application domains. In this paper, we report
our experience tuning the I/O performance of four application
programs from the areas of sensor data processing and linear algebra.
After tuning, three of the four applications achieve effective
I/O rates of over 100 MB/s on 16 processors. The total volume
of I/O required by the programs ranged from about 75 MB to over
200 GB. We report the lessons learned in achieving high I/O performance
from these applications, including the need for code restructuring,
local disks on every node, and overlapping I/O with computation
(sketched below).
We also report our experience achieving high performance on
peer-to-peer configurations. Finally, we comment on the necessity
of complex I/O interfaces like collective I/O and strided requests
to achieve high performance.
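Of the lessons above, overlapping I/O with computation is the most
visible in code. The double-buffering sketch below shows one way to do
it with POSIX AIO; the block size and the compute() callback are
placeholders, error handling is minimal, and on Linux the program
links with -lrt.

    /* Double-buffered overlap of I/O and computation with POSIX AIO:
     * while one block is processed, the next is read in the
     * background.  A sketch only, not the paper's code. */
    #include <aio.h>
    #include <string.h>
    #include <unistd.h>

    #define BLOCK (1 << 20)            /* 1 MB per request (arbitrary) */
    static char buf[2][BLOCK];

    extern void compute(const char *data, ssize_t n);   /* placeholder */

    void process_file(int fd)
    {
        struct aiocb cb;
        const struct aiocb *const list[1] = { &cb };
        off_t off = 0;
        int cur = 0, prev;
        ssize_t n;

        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;

        /* prime the pipeline with the first read */
        cb.aio_buf = buf[cur]; cb.aio_nbytes = BLOCK; cb.aio_offset = off;
        aio_read(&cb);

        for (;;) {
            aio_suspend(list, 1, NULL);   /* wait for current read */
            if (aio_error(&cb) != 0)
                break;
            n = aio_return(&cb);
            if (n <= 0)                   /* end of file */
                break;
            off += n;

            prev = cur; cur ^= 1;
            /* start the next read before computing on this block */
            cb.aio_buf = buf[cur]; cb.aio_nbytes = BLOCK; cb.aio_offset = off;
            aio_read(&cb);

            compute(buf[prev], n);        /* overlaps the read above */
        }
    }

With two buffers in flight, the disk stays busy while the CPU works,
which is the essence of the restructuring the abstract describes.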
The slides and
paper for this presentation are available.