Overview

For this homework you will implement a utility that allows us to find images present in a collection of web sites.

Objectives

This project is designed to help you practice:

Grading

Clarifications

Any clarifications or corrections associated with this project will be available at: Clarifications

Code Distribution

The project's code distribution is available by checking out the project named ImageFinder. The code distribution provides you with the following:

Specifications

For this homework you will implement a system that allows us to find the URLs of images present in a collection of web sites. To find these images we will use the static method Utilities.findImages (see the code distribution), which takes a set of web sites and returns a set of URLs corresponding to the images found (if any) in the specified sites.

To recognize images, your system will search through the HTML code of each specified web page, looking for entries starting with "<img src=", where any number of spaces and options (e.g., border) may appear between img and src. An image is represented by the string following "src=". The following is a representative example of one possible entry you will be searching for:

<img src="http://www.cs.umd.edu/class/fall2009/cmsc132/homeworks/ImageFinder/documents/Set1/Set1a.jpg" />
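One possible way to recognize such entries is a regular expression. The following is only a minimal sketch, not the required implementation: the class name ImgSrcSketch and the method extractSrcValues are hypothetical and are not part of the code distribution. It assumes that the value following "src=" is enclosed in double quotes, as in the example above, and that at least one space follows img.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the code distribution.
class ImgSrcSketch {
    // "<img", at least one whitespace character, optionally other
    // options (e.g., border) ending in whitespace, then src="...".
    // Lowercase only: the project does not require handling IMG or SRC.
    private static final Pattern IMG_SRC = Pattern.compile(
        "<img\\s+(?:[^>]*?\\s)?src\\s*=\\s*\"([^\"]*)\"");

    // Returns the strings following src= in the given HTML text,
    // in the order in which they first appear.
    static Set<String> extractSrcValues(String html) {
        Set<String> result = new LinkedHashSet<String>();
        Matcher matcher = IMG_SRC.matcher(html);
        while (matcher.find()) {
            result.add(matcher.group(1)); // the text between the quotes
        }
        return result;
    }
}
```

Calling ImgSrcSketch.extractSrcValues on the HTML text of a page would yield the src strings exactly as they appear in that page; they may or may not already be complete URLs.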

The findImages method will return the complete URLs of the images found. A complete URL is one that starts with "http://" and provides the exact location of the image, so that we can copy and paste the URL into a browser and actually see the image. For this project you don't have to worry about sites that use uppercase letters for img or src in the HTML code.
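The definition of a complete URL can be illustrated with a short sketch. It assumes, without this section confirming it, that some pages give the image location relative to the page itself, in which case the location would have to be combined with the address of the page. The class CompleteUrlSketch and its methods are hypothetical names, and the example.com addresses mentioned afterwards are placeholders.

```java
import java.net.URI;

// Hypothetical helper, not part of the code distribution.
class CompleteUrlSketch {
    // A complete URL, as defined above, starts with "http://".
    static boolean isComplete(String url) {
        return url.startsWith("http://");
    }

    // Assumption: a src value that is not complete is relative to the
    // page it was found on. URI.create throws IllegalArgumentException
    // if either string is not a legal URI (e.g., it contains spaces).
    static String toComplete(String pageUrl, String src) {
        if (isComplete(src)) {
            return src; // already usable as is
        }
        return URI.create(pageUrl).resolve(src).toString();
    }
}
```

Under these assumptions, toComplete("http://www.example.com/dir/page.html", "images/a.jpg") would produce "http://www.example.com/dir/images/a.jpg", while a src that already starts with "http://" is returned unchanged.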

Requirements