CMSC 433, Fall 2005

Programming Language Technologies and Paradigms

Supplement


Web Browsers, Web Servers, and the Hypertext Transfer Protocol (HTTP)


HTTP Requests

On the world wide web, web pages are specified using URLs (Uniform Resource Locators) of the following form:

http://host:port/rest

Here http specifies that the web server speaks the HTTP protocol (other schemes are possible, for example https or telnet). Host specifies the name (e.g., www.cs.umd.edu) or IP address (e.g., 128.8.128.160) of the machine on which the web server is running. Port specifies the port number on which the server is listening. If the port is not specified, a protocol-specific default is used (e.g., port 80 for HTTP). Finally, rest encodes what information the client (e.g., a web browser) wants from the server.
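As a rough illustration of how a URL breaks down into these pieces, here is a minimal Python sketch built on the standard library's urlsplit. The function name split_url, the DEFAULT_PORTS table, and the port 8080 in the second call are assumptions made for this example; only the default of 80 for HTTP comes from the text above.

```python
from urllib.parse import urlsplit

# Default ports per scheme. Port 80 for HTTP is stated in the text;
# 443 for HTTPS is the well-known default, included for illustration.
DEFAULT_PORTS = {"http": 80, "https": 443}


def split_url(url):
    """Return (scheme, host, port, rest) for a URL of the form scheme://host:port/rest."""
    parts = urlsplit(url)
    # parts.port is None when the URL omits the port, so fall back to the default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    # "rest" is everything the client asks the server for: the path plus any query string.
    rest = parts.path or "/"
    if parts.query:
        rest += "?" + parts.query
    return parts.scheme, parts.hostname, port, rest


print(split_url("http://www.cs.umd.edu/class/fall2005/cmsc433/menu.html"))
# Hypothetical URL with an explicit port, to show the port being picked up.
print(split_url("http://128.8.128.160:8080/index.html"))
```

The first call falls back to port 80 because the URL does not name a port; the second uses the port given in the URL.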

For example, if a user types the URL

http://www.cs.umd.edu/class/fall2005/cmsc433/menu.html

into a web browser, the browser will contact www.cs.umd.edu on port 80, sending the following text:

GET /class/fall2005/cmsc433/menu.html HTTP/1.0

The web server listening on port 80 on www.cs.umd.edu will parse this request and interpret it in some manner. In this case, the web server will look for the file menu.html at the virtual path /class/fall2005/cmsc433/menu.html and, assuming the requested file exists, respond with an HTTP header followed by the contents of the file. Although many web servers simply serve files in this manner, you will see in your project(s) that you are free to interpret and respond to requests however you wish.
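To make "parse this request" concrete, the sketch below splits a request line into its three fields. It is only one possible approach, not the method your project must use; the function name parse_request_line and its error handling are assumptions for illustration.

```python
def parse_request_line(line):
    """Split a request line such as 'GET /menu.html HTTP/1.0' into (method, path, version)."""
    # The line normally arrives with a trailing line terminator; strip it first.
    fields = line.strip().split()
    if len(fields) != 3:
        raise ValueError("malformed request line: %r" % line)
    method, path, version = fields
    if not version.startswith("HTTP/"):
        raise ValueError("unexpected protocol version: %r" % version)
    return method, path, version


method, path, version = parse_request_line(
    "GET /class/fall2005/cmsc433/menu.html HTTP/1.0\r\n"
)
print(method)   # GET
print(path)     # /class/fall2005/cmsc433/menu.html
print(version)  # HTTP/1.0
```

What the server does with the path afterwards (serve a file, compute a page, return an error) is entirely up to the server.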

HTTP Headers

A web server must return an HTTP header describing the content it is returning before it returns that content. For example, in response to an HTTP request for an HTML file (i.e., a file with the extension .html), a web server could return the following header:

     HTTP/1.0 200
     Content-Type: text/html
     <newline>

This header starts with a status line indicating that the response follows version 1.0 of the HTTP protocol and that the status code is 200. The next line gives the Content-Type, in this case text/html. This tells the browser to interpret the response as an HTML-formatted web page. The blank line at the end is also required by the HTTP protocol: it marks the end of the header.

Another example of an HTTP header a web server could return is:

     HTTP/1.0 200
     Content-Type: text/plain
     <newline>

Here the Content-Type is text/plain, indicating to the browser that the response should be interpreted as simple text.
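The two headers above differ only in their Content-Type, so a server can pick the value from the file extension. The following sketch assumes a small hypothetical table (CONTENT_TYPES) and a hypothetical helper (build_header); the .txt mapping and the fallback to text/plain are choices made for this example, not requirements. HTTP specifies a carriage return plus line feed as the line terminator, which is what the code emits for each newline.

```python
import os

# Hypothetical extension-to-type table; only .html -> text/html is taken from the text.
CONTENT_TYPES = {
    ".html": "text/html",
    ".txt": "text/plain",
}


def build_header(filename):
    """Return a minimal HTTP/1.0 response header for the given file name."""
    extension = os.path.splitext(filename)[1].lower()
    # Assumption: unknown extensions are served as plain text.
    content_type = CONTENT_TYPES.get(extension, "text/plain")
    # Status line, one header line, then the blank line that ends the header.
    return "HTTP/1.0 200\r\nContent-Type: %s\r\n\r\n" % content_type


print(repr(build_header("menu.html")))
print(repr(build_header("notes.txt")))
```

The server would send this header first and the file contents immediately after it.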

A Hands-on Example

You don't have to use a browser to contact web servers. You can also use telnet, which is sometimes preferable for debugging. Below is sample output from using telnet to send the GET request shown above. The user types the telnet command, the GET request line, and the extra newline; everything else is output. Many web servers require that you enter an extra newline (press the ENTER key again) after the request.

[cs433001@nauseous ~]$ telnet www.cs.umd.edu 80
Trying 128.8.128.160...
Connected to www.cs.umd.edu (128.8.128.160).
Escape character is '^]'.
GET /class/fall2005/cmsc433/menu.html HTTP/1.0
                                                       <- extra newline
HTTP/1.1 200 OK
Date: Thu, 01 Sep 2005 14:44:15 GMT
Server: Apache
Last-Modified: Thu, 30 Dec 2004 21:06:53 GMT
ETag: "3efe83-384-aa559540"
Accept-Ranges: bytes
Content-Length: 900
Connection: close
Content-Type: text/html; charset=ISO-8859-1

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
  <title>CMSC 433, Fall 2005 -- Menu</title>
  <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
  <base target="content">
  <link href="style.css" rel="stylesheet" type="text/css">
</head>
<body>
<b><font face="Helvetica, Arial, sans-serif">
<table class="menu" cellpadding="2" cellspacing="2" border="0" width="100%">
  <tbody>
      <tr><td><a href="home.html">Home</a></td></tr>
      <tr><td><a href="syllabus.html">Syllabus</a></td></tr>
      <tr><td><a href="lectures.html">Lectures</a></td></tr>
      <tr><td><a href="projects.html">Projects</a></td></tr>
      <tr><td><a href="exams.html">Exams</a></td></tr>
      <tr><td><a href="news:csd.cmsc433">Newsgroup</a></td></tr>
      <tr><td><a href="resources.html">Resources</a></td></tr>
  </tbody>
</table>
</font></b>
</body>
</html>
Connection closed by foreign host.
[cs433001@nauseous ~]$ 

This example is provided only to give a conceptual overview of the HTTP protocol and to illustrate how telnet can be a valuable debugging tool. Note that a production web server, such as the one serving www.cs.umd.edu, returns a complicated HTTP header. Your project(s) will return a much simpler HTTP header. Make sure your project implementation adheres strictly to the project description. In this example, the web server is running on the default port, 80. Your web server(s) will not be running on port 80. If you decide to use telnet as a debugging tool, be certain to specify the correct port number when typing the telnet command.
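If telnet is not available, the same exchange can be scripted. The sketch below is one way to do it with Python's socket module: fetch sends the GET line followed by the extra blank line and reads until the server closes the connection, and split_response separates the header from the body at the first blank line. Both function names are assumptions, as are the host, port, and path in the commented usage; substitute the port your own server listens on.

```python
import socket


def split_response(raw):
    """Split a raw HTTP response (bytes) into (status_line, header_lines, body)."""
    # The header ends at the first blank line, i.e. two consecutive CRLF pairs.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    return lines[0], lines[1:], body


def fetch(host, port, path):
    """Send 'GET path HTTP/1.0' to host:port and return the whole response as bytes."""
    # The second CRLF is the "extra newline" typed by hand in the telnet session.
    request = "GET %s HTTP/1.0\r\n\r\n" % path
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(request.encode("ascii"))
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # an empty read means the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


# Usage (hypothetical host, port, and path; requires a server to be running there):
#   status, headers, body = split_response(fetch("localhost", 8080, "/menu.html"))
#   print(status)
```

Reading until the connection closes works here because an HTTP/1.0 server closes the connection after sending its response, as the "Connection closed by foreign host." line in the telnet session shows.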
