Byoung-Kee Yi
Fast similarity searching in large time-sequence databases has attracted a lot of research interest. All of them use the Euclidean distance (L2), or some variation of Lp metric. Lp metrics lead to efficient indexing, thanks to feature extraction (e.g., by keeping the first few DFT coefficients) and subsequent use of fast spatial access methods for the points in feature space.
In this work we examine a popular, field-tested dissimilarity function, the ``time warping'' distance function which permits local accelerations and decelerations in the rate of the signals or sequences. This function is natural and suitable for several applications, like matching of voice, audio and medical signals (e.g., electrocardiograms) However, from the indexing viewpoint it presents two major challenges:
Here we show how to overcome both problems: for the former, we propose using a modification of the so called ``FastMap'', to map sequences into points, trading off a tiny amount of ``recall'' (typically zero) for large gains in speed. For the latter, we provide a fast, linear test, to help us discard quickly many of the false alarms that FastMap will typically introduce. Using both ideas in cascade, our proposed method achieved up to 7.8-time speed-up over the straightforward sequential scanning, on both read and synthetic datasets.
Joint work with H.V. Jagadish and Christos Faloutsos.
Back to the Summer 1997 dbchat index