SIGMETRICS 2001 / Performance 2001

What TCP/IP Protocol Headers Can Tell Us About the Web

F. Donelson Smith
Felix Hernandez Campos
Kevin Jeffay
David Ott

DiRT Group, Department of Computer Science, University of North Carolina at Chapel Hill

We report the results of a large-scale empirical study of web traffic. Our study is based on over 500 GB of TCP/IP protocol-header traces collected in 1999 and 2000 (approximately one year apart) from the high-speed link connecting The University of North Carolina at Chapel Hill to its Internet service provider. For comparison, we also use a set of smaller traces from the NLANR repository taken at approximately the same times. The principal results of this study are: (1) empirical data suitable for constructing traffic-generating models of contemporary web traffic, (2) new characterizations of TCP connection usage showing the effects of improvements to the HTTP protocol, notably persistent connections (e.g., about 50% of web objects are now transferred on persistent connections), and (3) new characterizations of web usage and content structure that reflect the influences of "banner ads," server load balancing, and content distribution. A novel aspect of this study is its demonstration that a relatively lightweight methodology, based on passive tracing of only TCP/IP headers and off-line analysis tools, can provide timely, high-quality data about web traffic. We hope this will encourage more researchers to undertake ongoing data collection and to provide the research community with data about the rapidly evolving characteristics of web traffic.
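To illustrate how header-only traces might reveal persistent connections, the following is a minimal sketch and not the analysis tooling used in the study, which the abstract does not describe. It assumes each packet of one TCP connection has already been reduced to a direction flag and a payload length (computable from the IP total length and the IP and TCP header lengths). The names `HeaderRecord` and `count_exchanges` are hypothetical. The sketch treats a run of client data followed by server data as one request/response exchange, and would miscount pipelined requests and retransmissions.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class HeaderRecord:
    """One packet of a single TCP connection, reduced to header-derived fields.

    Hypothetical record layout, assumed for illustration only.
    """
    from_client: bool   # True if the packet travels client -> server
    payload_len: int    # IP total length minus IP and TCP header lengths


def count_exchanges(records: Iterable[HeaderRecord]) -> int:
    """Count inferred request/response exchanges on one connection.

    Heuristic: data from the client starts a request; the first data
    from the server after a request completes one exchange.
    """
    exchanges = 0
    awaiting_response = False
    for record in records:
        if record.payload_len == 0:
            continue  # pure ACKs, SYNs, and FINs carry no application data
        if record.from_client:
            awaiting_response = True
        elif awaiting_response:
            exchanges += 1
            awaiting_response = False
    return exchanges


def looks_persistent(records: Iterable[HeaderRecord]) -> bool:
    """A connection carrying more than one exchange is taken as persistent."""
    return count_exchanges(records) > 1
```

Under these assumptions, a connection whose trace shows two separate bursts of client data, each answered by server data, would be classified as persistent, while a connection with a single burst in each direction would not.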

[Last updated Fri Mar 23 2001]