WebViews

The World Wide Web has grown tremendously since its inception, and its nature has changed along the way: from purely static HTML pages in the early 90s to a Web where most pages have some dynamic content. Online services, frequently updated content, and personalization are the main reasons behind dynamically generated web pages. On the one hand, dynamic content requires far greater resources from web servers than static pages do and does not scale well. On the other hand, an increasingly large fraction of the world population relies on online services to perform everyday tasks, from reading the newspaper and shopping to looking up movie times and real-time stock information.

Our research focuses on improving the performance of database-backed web servers, which are commonly used to generate dynamic content on the Web today. The ultimate goal is to solve the scalability issue (i.e., allowing the web server to handle a large number of users) without sacrificing the "quality" of the served information (i.e., always responding with "fresh" data).

Although web caching has solved the scalability problem for static pages, it cannot be applied directly to dynamically generated pages, since it provides no guarantees on the freshness of the cached data. Servicing user requests fast is of paramount importance, but only if the data is fresh and correct; stale or incorrect data may be more harmful than slow service or even no service at all. We have already shown through experiments on an industrial-strength prototype that web materialization, where web pages are cached and constantly refreshed in the background, is a robust solution to the scalability problem for dynamic web content. We use the term WebView to refer to the HTML fragments that are the unit of materialization.

Like traditional database views, WebViews come in two forms: virtual or materialized. Virtual WebViews are computed dynamically on demand, whereas materialized WebViews are precomputed. In the virtual case, the cost of computing the WebView adds to the time it takes the web server to service the access request (the query response time). In the materialized case, every update to the base data leads to an update to the WebView, which increases the server load. Materializing a WebView can give significantly lower query response times than the virtual approach. However, it may also degrade performance if the update workload is too high.
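The trade-off between the two forms can be illustrated with a minimal Python sketch. It is not the project's implementation: the `WebView` class, the `render_quote` function, and the `run` helper are hypothetical names introduced here, and counting how often a fragment is generated is only a stand-in for the real computation cost.

```python
from typing import Callable, Dict, Optional


class WebView:
    """An HTML fragment derived from base data (hypothetical class for illustration)."""

    def __init__(self, base: Dict[str, float],
                 render: Callable[[Dict[str, float]], str],
                 materialized: bool) -> None:
        self.base = base
        self.render = render
        self.materialized = materialized
        self.computations = 0  # how many times the fragment was generated
        # A materialized WebView is precomputed once up front.
        self.cached: Optional[str] = self._compute() if materialized else None

    def _compute(self) -> str:
        self.computations += 1
        return self.render(self.base)

    def update(self, key: str, value: float) -> None:
        """Apply an update to the base data."""
        self.base[key] = value
        if self.materialized:
            # Materialized: every base update triggers a refresh (server load).
            self.cached = self._compute()

    def access(self) -> str:
        """Serve an access request."""
        if self.materialized:
            return self.cached  # no computation on the request path
        return self._compute()  # virtual: computed on demand


def render_quote(base: Dict[str, float]) -> str:
    """Assumed example fragment: a single stock quote."""
    return "<b>ACME</b>: %.2f" % base["ACME"]


def run(materialized: bool, accesses: int, updates: int) -> int:
    """Replay a workload and return how many times the fragment was generated."""
    view = WebView({"ACME": 10.0}, render_quote, materialized)
    for i in range(updates):
        view.update("ACME", 10.0 + i)
    for _ in range(accesses):
        view.access()
    return view.computations


if __name__ == "__main__":
    print("read-heavy  virtual:", run(False, accesses=100, updates=5))
    print("read-heavy  materialized:", run(True, accesses=100, updates=5))
    print("write-heavy virtual:", run(False, accesses=5, updates=100))
    print("write-heavy materialized:", run(True, accesses=5, updates=100))
```

Under these assumptions, the materialized form does far less work when accesses dominate, and far more when updates dominate, which is the degradation described above.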

WebView materialization raises the following issues, which we are addressing in this project:

  1. We need metrics for the Quality of Service (QoS) and the Quality of Data (QoD) at data-intensive web servers. Ideally, WebView materialization should attempt to maximize both QoS and QoD, or at least provide some quality guarantees.

  2. We must decide which WebViews to materialize in order to improve QoS/QoD. This is similar to the view selection problem in data warehouses, with one major difference: updates on the Web are performed online, whereas in data warehouses updates are offline.

  3. Given a set of WebViews that must be materialized, we need to determine the order in which to refresh them, in the presence of updates and limited resources. One solution is to refresh the most popular WebViews first. However, it is not clear how this will work when we have a hierarchy of WebViews (i.e., a view derivation graph) and/or precedence constraints.

  4. With the highly dynamic and unpredictable nature of web traffic, we need to determine how to best adapt the materialization decisions based on changes in the access or update workload.

  5. Finally, we need to combine the algorithm for selecting which WebViews to materialize with the algorithm that decides the order in which to refresh materialized WebViews, under a unified framework.
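As a rough illustration of issue 3, the sketch below orders refreshes by popularity under a fixed refresh budget. It is only a sketch under stated assumptions: the function names, the `budget` parameter, and the access counts are hypothetical, `fresh_fraction` is a simple stand-in for a QoD metric and not the metric defined in the papers, and the code ignores view derivation graphs and precedence constraints entirely.

```python
import heapq
from typing import Dict, List, Set


def refresh_order(stale: Set[str], popularity: Dict[str, int], budget: int) -> List[str]:
    """Pick up to `budget` stale WebViews to refresh, most popular first."""
    # heapq is a min-heap, so popularity is negated; the name breaks ties
    # deterministically.
    heap = [(-popularity.get(name, 0), name) for name in stale]
    heapq.heapify(heap)
    order: List[str] = []
    while heap and len(order) < budget:
        _, name = heapq.heappop(heap)
        order.append(name)
    return order


def fresh_fraction(stale: Set[str], popularity: Dict[str, int]) -> float:
    """Assumed QoD proxy: share of accesses that would hit a fresh WebView."""
    total = sum(popularity.values())
    if total == 0:
        return 1.0
    fresh = sum(count for name, count in popularity.items() if name not in stale)
    return fresh / total


if __name__ == "__main__":
    # Hypothetical access counts per WebView.
    popularity = {"front_page": 60, "stocks": 30, "movies": 10}
    stale = set(popularity)  # assume an update burst made every WebView stale

    chosen = refresh_order(stale, popularity, budget=2)
    print("refresh order:", chosen)
    print("fresh fraction before:", fresh_fraction(stale, popularity))
    print("fresh fraction after:", fresh_fraction(stale - set(chosen), popularity))
```

With a budget of two refreshes, the two most accessed WebViews are refreshed first, so most accesses see fresh data even though one WebView stays stale.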

WebViews Papers
Update Propagation Strategies for Improving the Quality of Data on the Web
Alexandros Labrinidis, Nick Roussopoulos. In the Proceedings of the 27th International Conference on Very Large Data Bases (VLDB'01), Rome, Italy, September 2001.
Available in: PDF

Adaptive WebView Materialization
Alexandros Labrinidis, Nick Roussopoulos. In the Proceedings of the Fourth International Workshop on the Web and Databases (WebDB'2001), held in conjunction with ACM SIGMOD'2001, Santa Barbara, California, USA, May 2001.
Available in: PDF

WebView Materialization
Alexandros Labrinidis, Nick Roussopoulos. In the Proceedings of the ACM SIGMOD International Conference on Management of Data, Dallas, Texas, USA, May 2000.
Available in: gzipped PDF
Technical Report version of the same paper (in PDF)

On the Materialization of WebViews
Alexandros Labrinidis, Nick Roussopoulos. In the Proceedings of the ACM SIGMOD Workshop on the Web and Databases (WebDB'99), Philadelphia, Pennsylvania, USA, June 1999.
Available in: gzipped Postscript



Last update was on January 16, 2002
Send comments or questions to Nick Roussopoulos
