Interaction of Query Evaluation and Buffer Management for Information Retrieval

Björn Þór Jónsson (University of Maryland)
Michael J. Franklin (University of Maryland)
Divesh Srivastava (AT&T Labs-Research)

The proliferation of the World Wide Web has brought information retrieval (IR) techniques to the forefront of search technology. To the average computer user, ``searching'' now means using IR-based systems for finding information on the WWW or in other document collections. IR query evaluation methods and workloads differ significantly from those found in database systems. In this paper, we focus on three such differences. First, due to the inherent fuzziness of the natural language used in IR queries and documents, an additional degree of flexibility is permitted in evaluating queries. Second, IR query evaluation algorithms tend to have access patterns that cause problems for traditional buffer replacement policies. Third, IR search is often an iterative process, in which a query is repeatedly refined and resubmitted by the user. Based on these differences, we develop two complementary techniques to improve the efficiency of IR queries: 1) Buffer-aware query evaluation, which alters the query evaluation process based on the current contents of buffers; and 2) Ranking-aware buffer replacement, which incorporates knowledge of the query processing strategy into replacement decisions. In a detailed performance study we show that using either of these techniques yields significant performance benefits and that in many cases, combining them produces even further improvements.

Web Accessibility