|
Wongsuphasawat, K., Shneiderman, B. (April 2009)
Finding Comparable Temporal Categorical Records:
A Similarity Measure with an Interactive Visualization
To appear in Proceedings of IEEE VAST 2009.
HCIL-2009-08
An increasing number of temporal categorical databases are being
collected by various institutions: electronic health records with
millions of patient histories in healthcare organizations, extensive
traffic incident logs in transportation systems, and large collections
of student records in academic institutions. Finding similar records
within these large-scale databases is a challenging problem. A major
challenge is how to define a similarity measure that captures the
searcher's intent. Many methods for computing a similarity measure
between time series have been proposed, but temporal categorical
records are different and require fresh thinking. We propose a
temporal categorical similarity measure, called the M&M measure,
which is based on the concept of aligning records by sentinel events,
then matching events between two records. The M&M measure is
calculated as a combination of the time differences between pairs
of events and the number of mismatches. To accommodate customization
of parameters in the M&M measure and interpretation of results,
we implement Similan, an interactive search and visualization tool
for temporal categorical records. A usability study with 8 participants
showed that Similan was easy to learn, but that users had
more difficulty understanding the M&M measure. Users felt strongly
that Similan could help them find similar records in temporal
categorical databases. In response to feedback from the study,
we also develop a new prototype. A pilot study suggests that while
the binned timeline in the original interface is simpler and more
readable, the continuous timeline in the new interface is better for
showing fine-grained information.
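
The align-then-match idea behind the M&M measure can be illustrated with a minimal sketch. This is not the paper's formulation: the function names, the pairing of same-category events in time order, the `mismatch_penalty` and `weight` parameters, and the linear combination are all assumptions made for illustration. Records are assumed to be lists of `(category, time)` pairs, and the result is a distance (lower means more similar).

```python
from collections import defaultdict


def align(record, sentinel):
    """Shift times so the earliest sentinel event sits at time 0 (assumed convention)."""
    t0 = min(t for cat, t in record if cat == sentinel)
    return [(cat, t - t0) for cat, t in record]


def mm_distance(target, candidate, sentinel, mismatch_penalty=10.0, weight=0.5):
    """Hypothetical distance: weighted time differences plus penalized mismatches."""
    by_cat_a, by_cat_b = defaultdict(list), defaultdict(list)
    for cat, t in align(target, sentinel):
        by_cat_a[cat].append(t)
    for cat, t in align(candidate, sentinel):
        by_cat_b[cat].append(t)

    time_diff = 0.0
    mismatches = 0
    for cat in set(by_cat_a) | set(by_cat_b):
        a, b = sorted(by_cat_a[cat]), sorted(by_cat_b[cat])
        # Pair same-category events in time order (a simplifying assumption).
        time_diff += sum(abs(x - y) for x, y in zip(a, b))
        # Events left without a partner count as mismatches.
        mismatches += abs(len(a) - len(b))

    return weight * time_diff + (1 - weight) * mismatch_penalty * mismatches


target = [("Arrival", 0), ("ICU", 2), ("Exit", 5)]
candidate = [("Arrival", 10), ("ICU", 13), ("Floor", 14), ("Exit", 15)]
print(mm_distance(target, candidate, sentinel="Arrival"))
```

In this sketch, raising `weight` emphasizes timing differences, while lowering it emphasizes missing or extra events, which is the kind of parameter customization the abstract says Similan is meant to support.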
|