To measure the disk performance of a single node, we used the diskperf benchmark, which is loosely modeled after the iozone benchmark [4]. iozone measures the I/O bandwidth available to a user program for both reads and writes on a single disk. diskperf extends this to multiple disks attached to a single node by issuing a set of asynchronous writes, a barrier to make sure all writes have completed, and then a set of asynchronous reads. This gives a rough upper bound on the bandwidth to and from a node's drives.
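The write, barrier, read sequence described above can be sketched as follows. This is a minimal illustration, not the diskperf source: the function names (`write_file`, `read_file`, `disk_bandwidth`) and the scratch file name are hypothetical, and worker threads stand in for true asynchronous I/O calls.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor, wait


def write_file(path, file_size, block_size):
    """Write file_size bytes to path in block_size chunks, then force them to disk."""
    block = b"\0" * block_size
    written = 0
    with open(path, "wb") as f:
        while written < file_size:
            chunk = block[: min(block_size, file_size - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())
    return written


def read_file(path, block_size):
    """Read path back in block_size chunks and return the number of bytes read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(block_size)
            if not data:
                break
            total += len(data)
    return total


def disk_bandwidth(directories, file_size, block_size):
    """Return (write_mb_per_s, read_mb_per_s) summed over one file per directory.

    Assumption: each directory lives on a different disk, and one thread per
    disk approximates one outstanding asynchronous request per disk.
    """
    paths = [os.path.join(d, "diskperf_sketch.tmp") for d in directories]
    mb = 1024 * 1024
    with ThreadPoolExecutor(max_workers=len(paths)) as pool:
        # Phase 1: start all writes concurrently.
        start = time.perf_counter()
        writes = [pool.submit(write_file, p, file_size, block_size) for p in paths]
        wait(writes)  # barrier: no read starts until every write has completed
        write_time = time.perf_counter() - start
        written = sum(f.result() for f in writes)

        # Phase 2: start all reads concurrently.
        start = time.perf_counter()
        reads = [pool.submit(read_file, p, block_size) for p in paths]
        wait(reads)
        read_time = time.perf_counter() - start
        read = sum(f.result() for f in reads)
    for p in paths:
        os.remove(p)
    return written / mb / write_time, read / mb / read_time
```

A call such as `disk_bandwidth(["/loc", "/scr"], 300 * 1024 * 1024, 64 * 1024)` would mirror the two-disk configuration, assuming both directories are writable. With a file smaller than the node's memory, the read phase would mostly measure the disk cache rather than the disks.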
We ran this simple experiment with a fixed file size of 300MB and block sizes from 4K to 128K. The 300MB file size was chosen to ensure that we are not simply reading from the disk cache: each node has 256MB of memory, so a 300MB file is certain to overflow the cache. The disk configuration was two local disks, labeled /loc (for local) and /scr (for scratch). This was the only configuration available, since the alpha farm is not configured exclusively for I/O. Table 2 summarizes the results of running diskperf on a representative node.
It appears that the /loc disk slightly outperforms the /scr disk for writes, but the two have comparable rates for reads. As expected, when the two disks are used together, we see increased bandwidth. Note that the combined rate does not reach the aggregate (summed) bandwidth of the two disks; the bottleneck is probably bus arbitration between the disks by the disk controller. For simplicity, we will consider 5.44 MB/sec to be the approximate bandwidth a node can expect for writes with 64K blocks, and 5.02 MB/sec for reads. Note that block sizes larger than those shown in Table 2 do not yield an appreciable increase in I/O rate. This is in contrast to the SP-2, where 1MB blocks achieved the best performance [1].
Note that these values can vary substantially from run to run. A more thorough test would cover all nodes, vary the number of outstanding asynchronous I/O calls, and repeat each run several times. After discarding the minimum and maximum, the mean of the remaining results would be a fair estimate of disk bandwidth. Time constraints were the only reason this was not done. The author will follow up on this to get a better picture of the alpha farm disks.
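The averaging rule proposed above (drop the smallest and largest result, then take the mean of the rest) could be written as the short sketch below. The function name `trimmed_mean` is hypothetical, and any input values are placeholders rather than measurements from the alpha farm.

```python
def trimmed_mean(rates):
    """Mean of the rates after discarding one minimum and one maximum value."""
    if len(rates) < 3:
        raise ValueError("need at least three runs to discard the min and max")
    ordered = sorted(rates)
    kept = ordered[1:-1]  # drop the single smallest and single largest run
    return sum(kept) / len(kept)
```

Discarding only the two extremes keeps the estimate simple while limiting the influence of a single unusually slow or fast run.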
Table 2: Representative single node user-level I/O rates (MB/sec).