The basic approaches to performance evaluation of computer
systems include analytical modeling, simulation, and measurements.
My work focuses on analytical modeling.
An example model is a multiclass multiserver queueing system
whose attributes include the system's workload in the form
of N job arrival processes each with
its own arrival rate, the system's resources
in the form of a queue and K servers each
with its own service rate,
and a service scheduling discipline.
In general,
arrival and service rates may be a function
of the current model state, and not all servers need be
activated at all times (where activation of more servers
improves the performance but may increase cost).
Given such models, common performance metrics of interest
include expected job response time, server utilization, throughput,
and so on.
Performance evaluation of real systems is complex, suffers from
scalability problems (such as
the socalled ``state space explosion'' problem),
and usually requires advanced model solution techniques.
Often, such techniques are based on exploitation
of ``special structure'' in the models.
With large
and complex models, these special structures are usually too
expensive to expose automatically
(as that may involve searching through a combinatorial number
of permutations).
My work on analytical modeling focuses on discovery of
special structure in models which results in
order(s) of magnitude improvement in computational efficiency.
Thresholdbased Systems
We have developed Markovian models and efficient solutions of systems
that employ thresholdbased techniques with hysteresis
behavior to dynamically adjust the amount of resources available for
job service [A4,
D10,
D20].
A thresholdbased approach is useful in systems which
incur significant server setup, usage, and removal
costs. It is also a good approach to dynamic management of
a pool of resources between multiple workload classes using
the same system.
The hysteresis is needed
to guard against momentary fluctuations in workload.
For instance, such techniques are used in
Novell file servers in managing their buffer pool.
However, performance prediction for thresholdbased systems
is often difficult as such systems exhibit
anomalous response time behavior, as observed in
our work as well as that of others.
There has been much interest in developing efficient performance
evaluation techniques for such systems. However, past results
have been obtained only for fairly
restricted models which include limitations on one or more of the
following:
(a) number of servers in the system (often restricted to 2),
(b) size of the waiting room,
(c) heterogeneity of servers,
(d) number of job classes (usually 1),
and
(e) absence of hysteresis.
We have developed a divideandconquer type methodology
(using stochastic complementation)
which allows extremely
efficient computation of performance metrics for a variety of
thresholdbased models with hysteresis without such restrictions,
handling
heterogeneous servers, bulk arrivals,
noninstantaneous server activation, and multiclass
workloads, all without restrictions on the number of servers or
the size of the waiting room.
For a variety of singleclass models
[A4,
D10]
we derived either:
(1) an exact closedform expression, or
(2) an exact algorithmic solution,
or (3) provable (through sample path analysis)
and tight performance bounds,
which, for all practical purposes, is as good as having an exact solution
(e.g., experiments in
[D10]
show less than 1.5% spread between
upper and lower bounds with an order of magnitude improvement in
computational cost over the exact solution).
Moreover, allowing
noninstantaneous resource activation in a
model, while difficult to analyze accurately,
is important because activation time is often
the distinguishing characteristic of a resource management policy.
No other work addresses this problem with more than 2 servers.
The efficiency is obtained
by breaking the model into smaller
pieces, solving each piece separately, and then using these solutions
in constructing the exact solution to the original model.
For cases where it was not possible to apply the divideandconquer
approach directly, we developed modified models which were suited
for the divideandconquer approach and produced provable
and tight bounds on performance metrics of the original
model
[D10].
Due to the divideandconquer nature of our approach, the
reduction in computation grows as the number of servers in the system grows.
Based on the success with singleclass models,
we have developed an efficient method for
multiclass models
[D20].
In systems with multiple job classes it is
desirable to find good approaches to sharing resources
without statically partitioning them among the job classes.
Thresholdbased techniques constitute one category of such approaches.
There have
been no prior analytical results for the multiclass version
of this model,
which corresponds to an important characteristic of real systems.
Solving this model exactly is difficult because
it is infinite in multiple dimensions and appears
to lack sufficient structure.
We developed
[D20]
an approximate iterative solution method. This method
``breaks up'' the original Nclass model into
N single class models, which are ``coupled'' through a set of
blocking probabilities that express the interaction between classes.
Empirical comparison against simulation
indicates that this method is fast and fairly accurate.
Most of our test cases are within 5% of the simulation results
with more than two orders of magnitude improvement in computation time.
Although this is an approximate solution technique, our experiments indicate
that good threshold settings result in much higher performance
improvements (e.g., 50%) than the loss in accuracy due to the approximation.
Hence, our solution technique is a promising approach to evaluating
multiclass thresholdbased systems.
Mixedworkload Multimedia Systems
This work focuses on modeling and analysis of storage servers that serve a
variety of applications requesting video, image, audio, and
text data with vastly different performance and qualityofservice (QoS)
requirements as well as resource demand characteristics.
For instance, the QoS requirement of continuous media (CM) workload
(e.g., video) is to meet specific deadlines in data delivery with
high probability (e.g., greater than .99), while the nonCM
workload (e.g., images and text) requires fast response time with no errors.
Hence, we model multimedia servers which share resources dynamically
among the different types of workloads while satisfying their
performance requirements.
We first designed a family of disk scheduling
algorithms
[A12,
D16]
in order
to explore the possible design choices and tradeoffs of interest.
Such algorithms (ours as well as those of other researchers)
often use cyclebased scheduling of the CM workload
which results in models that are nonMarkovian
(because each chunk of CM workload must be completed by the end of a
specified time unit) and nonworkconserving
(because early service carries penalties in the form of increased
buffer space requirements).
These algorithms mostly differ in:
(1) how they ``interleave'' service of CM and nonCM workloads,
(2) how they order the service of nonCM requests, and
(3) how workconserving they are with respect to CM workload.
We developed a family of analytical
nonMarkovian models which represent this class of scheduling algorithms
and have tractable analytical solutions
[A5].
Prior approaches to modeling such systems
include M/G/1 queue with vacation models.
Our model, which can be viewed as a pollingtype system,
more faithfully captures the
behavior of scheduling algorithms, is extensible to a wider class
of such algorithms, and allows computation of a larger number of
important performance metrics, for both types of workloads.
In this work
[A5],
each scheduling algorithm being modeled is captured through
construction of appropriate service time distributions for
CM and nonCM workloads, where the CM
workload distribution is then matched with an Erlang and the
nonCM with a loaddependent exponential.
(Parameters for these distributions can be obtained by benchmarking
the disk and extracting the seek function, transfer times, and so on.)
Hence, all the events in the model are exponential, except for the
deterministic time unit during which a chunk of the CM workload
should be completed.
Our solution of the models is based on an existing methodology,
by de Souza e Silva, H.R. Gail, and R.R. Muntz^{1},
of using embedded Markov chains to analyze nonMarkovian
stochastic processes.
It involves a combination of steady state and transient analysis.
