George Andrei Mihaila, University of Toronto
In the past few years, the World Wide Web has become a vast repository of information in practically all domains of activity. While most of the current content of the Web is textual or multimedia in nature, there is increasing interest in using this medium to disseminate more structured kinds of data, such as scientific datasets in various disciplines. To facilitate data exchange, standards have been developed for the schema and semantics of the data. However, finding data relevant to a particular problem is often difficult because data sources are autonomous and distributed. Even after a set of sources has been identified as potentially containing relevant data, it is hard to decide which ones are best suited to the task at hand, because sources differ with respect to data quality parameters such as coverage of a particular domain, data recency, and access cost. Also, when extracting data from several independent, often overlapping and inconsistent data sources, it is sometimes useful to present users with meta-information about the confidence of the answer to a query, based on the number and quality of the sources that participated in constructing the answer.
Back to the Fall 1998 dbchat index