We based the performance evaluation of the network module on total execution times. If our model is accurate for both communication latencies and computation time, then the execution times of any program run on the simulator should closely fit those measured on the SP-2.
A few preliminary steps were needed before we could compare the performance of applications on Proteus (with our SP-2 network module) against their performance on the SP-2 itself.
First, since Proteus measures the execution time of a program in cycles, we had to determine the ratio of Proteus cycles to SP-2 seconds for computation. Note that this scaling factor does not take into account the differences between the simulator's host processor, a MIPS processor, and IBM's RS/6000 processor (i.e., multi-function pipelines to exploit concurrency). The possible effects of cache misses are also glossed over. So although in theory a single factor could be used to scale all programs, we found in our experiments that this is not always the case.
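The scaling step described above can be sketched as follows. This is only an illustration: the function names and the calibration numbers are hypothetical and are not measurements from this study. It assumes the factor is obtained by running the same computation on both systems and dividing the Proteus cycle count by the SP-2 time.

```python
def cycles_per_second(proteus_cycles, sp2_seconds):
    """Ratio of simulated Proteus cycles to measured SP-2 seconds."""
    if sp2_seconds <= 0:
        raise ValueError("SP-2 time must be positive")
    return proteus_cycles / sp2_seconds


def seconds_to_cycles(seconds, factor):
    """Convert an SP-2 time in seconds to Proteus cycles using the factor."""
    return seconds * factor


# Hypothetical calibration run (illustrative numbers only).
factor = cycles_per_second(proteus_cycles=50_000_000, sp2_seconds=2.0)

# An SP-2 computation phase of 0.5 s expressed in Proteus cycles.
predicted_cycles = seconds_to_cycles(0.5, factor)
```

Because a single factor ignores pipeline and cache effects, a value calibrated on one program may not transfer well to another, which matches the caveat in the text.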
Sender:
    measure start-time
    for i := 1 to k do
        send(message);
    end-for
    measure stop-time;
    elapsed-time = (stop-time - start-time)/k

Receiver:
    measure start-time
    for i := 1 to k do
        recv(message);
    end-for
    measure stop-time;
    elapsed-time = (stop-time - start-time)/k
Also, since the major thrust of this project was in developing and validating a module to simulate the IBM SP-2 interconnection network and message passing software layer, we had to determine the latency incurred by the message passing library. This was done using the ping benchmark in [4]. This benchmark is used to determine the overhead of sending a message to another processor. The pseudo-code is shown in Figure 2.
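The sender side of the ping pseudo-code can be mirrored in a short runnable sketch. Note the assumptions: `fake_send` is a hypothetical stand-in that only copies a buffer in local memory, not a call into the SP-2 message passing library, and `average_overhead` is a name introduced here for illustration. The point is only the measurement pattern: time k back-to-back operations and divide by k.

```python
import time


def average_overhead(operation, k):
    """Time k back-to-back calls of `operation`; return seconds per call."""
    if k <= 0:
        raise ValueError("k must be positive")
    start = time.perf_counter()   # measure start-time
    for _ in range(k):
        operation()               # send(message) in the pseudo-code
    stop = time.perf_counter()    # measure stop-time
    return (stop - start) / k     # elapsed-time per operation


def fake_send():
    """Hypothetical stand-in for send(): copies a 1 KB buffer locally."""
    bytes(bytearray(1024))


per_send = average_overhead(fake_send, k=1000)
```

Averaging over k iterations amortizes the cost of reading the clock, which matters when a single send takes far less time than the timer's resolution.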
Figure 3: Actual vs. estimated sending overhead
First, ping was coded, instrumented, and run on the SP-2 for message sizes ranging from 0 bytes to 250 Kbytes. Given the recorded SP-2 library overheads, we determined a best-fit curve that gives the overhead as a function of the number of bytes requested. To keep it simple, we chose a linear equation of the form Y = aX + b, where Y is the overhead in SP-2 time units, X is the message size in bytes, and a and b are the constants obtained from the fit. This equation is then used to determine the overhead of a send() operation in our library layer in units of cycles (we use the scaling factor to convert from SP-2 time units to cycles). The graph in Figure 3 compares the overhead from ping run on Proteus to the actual SP-2 overhead. As the graph shows, the fitted line tracks the measured overhead closely.
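As an illustration of the fitting step, the sketch below fits a line, overhead = a * X + b, to (message size, overhead) samples and converts the predicted overhead to cycles. Several things here are assumptions: the text does not say which fitting method was used, so ordinary least squares is substituted; the sample data are synthetic, not the recorded SP-2 overheads; and the names `fit_line`, `send_overhead_cycles`, and `cycles_per_unit` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def send_overhead_cycles(nbytes, a, b, cycles_per_unit):
    """Predicted send() overhead in cycles for a message of nbytes."""
    return (a * nbytes + b) * cycles_per_unit


# Synthetic samples generated from a known line (not SP-2 data).
sizes = [0, 1000, 10000, 100000, 250000]
overheads = [50.0 + 0.02 * s for s in sizes]

a, b = fit_line(sizes, overheads)
cycles = send_overhead_cycles(1000, a, b, cycles_per_unit=10.0)
```

A linear model keeps the per-message cost to one multiply and one add inside the simulator, at the price of ignoring any steps in the real overhead curve, for example at packet or buffer boundaries.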
Although the overhead incurred when sending a message is a key part of the overall communication latency in the network module, the simulation of the actual time the message spends in the network as a function of contention, etc. will determine how well the module performs. As will be observed in the next section, the overall module performance did not meet our expectations.