Exposé

SHOE: Simple HTML Ontology Extensions

Exposé

Parallel Understanding Systems Group
Department of Computer Science
University of Maryland at College Park


Introduction

Exposé is a web robot that searches for pages with SHOE markup, extracts the knowledge they contain, and loads it into PARKA. This knowledge can then be queried using the PARKA interface or SHOE Search. Both tools are available as Java applets. They give users a new way to browse the web: submit a complex query, then open documents by clicking on the URLs in the results.

How it Works

Exposé is initialized by giving it a starting URL and setting limits on which servers and/or directories it is allowed to visit. Setting these limits is important because it lets the agent concentrate on portions of the Web where SHOE is likely to be found. The limits can also be used to focus the knowledge gathering on a specific topic.
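
The visiting limits can be pictured as a simple allow-list check. The sketch below is illustrative only and is not taken from Exposé itself: the function name in_scope, the (host, path prefix) representation of a limit, and the example hosts in the usage note are all assumptions.

```python
from urllib.parse import urlparse


def in_scope(url, allowed):
    """Return True if url falls under one of the allowed limits.

    `allowed` is a hypothetical list of (host, path_prefix) pairs.
    An empty path_prefix permits every directory on that server.
    """
    parts = urlparse(url)
    host = parts.hostname or ""   # urlparse lowercases the hostname
    path = parts.path or "/"
    for allowed_host, prefix in allowed:
        if host == allowed_host.lower() and path.startswith(prefix):
            return True
    return False
```

With a limit of [("www.example.edu", "/shoe/")], a URL such as http://www.example.edu/shoe/index.html would be accepted, while pages in other directories or on other servers would be skipped.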

When the robot reads a page, it extracts the SHOE information and stores it in a PARKA knowledge base. It then identifies all URLs within the document and applies an evaluation function to each one to determine the order in which it will visit them. Currently, URLs are recognized in hypertext links, SHOE category instances, and SHOE relation arguments. The robot then chooses the next page to visit and repeats the process until there are no new pages to visit.
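
The read, extract, evaluate, and visit cycle described above can be sketched as a best-first traversal. This is a minimal sketch under stated assumptions, not Exposé's actual code: the function crawl and every callable it receives (fetch, extract_shoe, extract_urls, evaluate, in_scope, store) are hypothetical stand-ins for the robot's page reader, SHOE parser, link finder, evaluation function, visiting limits, and knowledge base loader.

```python
def crawl(start_url, fetch, extract_shoe, extract_urls, evaluate, in_scope, store):
    """Visit pages best-first and return the URLs in the order visited.

    All callables are hypothetical placeholders supplied by the caller:
      fetch(url)         -> page content, or None if the page is unavailable
      extract_shoe(page) -> SHOE knowledge found on the page (empty if none)
      extract_urls(page) -> URLs found in the page
      evaluate(url)      -> numeric score; higher scores are visited first
      in_scope(url)      -> True if the robot is allowed to visit url
      store(url, facts)  -> load the extracted knowledge into the knowledge base
    """
    visited = set()
    frontier = {start_url}
    order = []
    while frontier:  # stop when there are no new pages to visit
        # Re-score the frontier on every step; sorting first makes ties deterministic.
        url = max(sorted(frontier), key=evaluate)
        frontier.remove(url)
        visited.add(url)
        order.append(url)
        page = fetch(url)
        if page is None:
            continue
        facts = extract_shoe(page)
        if facts:
            store(url, facts)
        for link in extract_urls(page):
            if link not in visited and in_scope(link):
                frontier.add(link)
    return order
```

Passing the collaborators in as parameters keeps the loop independent of any particular network library or knowledge base, which also makes it easy to exercise against a small in-memory set of pages.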

The choice of evaluation function controls the order of Exposé's search. The function we use operates under the following assumptions:

  • SHOE pages tend to appear in clusters.
  • SHOE pages are more likely to be linked to other SHOE pages.
  • SHOE pages are more likely to be in the same directory as another SHOE page.
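
One way these assumptions could be turned into a score is shown below. It is only a sketch: the source does not give the actual function, so the name score_url, its parameters, and the weights 2.0 and 1.0 are assumptions chosen for illustration. A candidate URL gains points when a known SHOE page links to it and when it shares a directory with a known SHOE page; together these favor clusters.

```python
import posixpath
from urllib.parse import urlparse


def directory_of(url):
    """Return (host, directory) for a URL, e.g. ('www.example.edu', '/shoe')."""
    parts = urlparse(url)
    return (parts.hostname or "", posixpath.dirname(parts.path))


def score_url(url, referrers, shoe_pages, link_weight=2.0, dir_weight=1.0):
    """Score a candidate URL; higher scores would be visited sooner.

    referrers  -- URLs of pages known to link to `url`
    shoe_pages -- set of URLs already found to contain SHOE markup
    The weights are illustrative assumptions, not values from Exposé.
    """
    score = 0.0
    # Assumption 2: SHOE pages are more likely to be linked to other SHOE pages.
    if any(ref in shoe_pages for ref in referrers):
        score += link_weight
    # Assumption 3: SHOE pages are more likely to share a directory.
    if directory_of(url) in {directory_of(p) for p in shoe_pages}:
        score += dir_weight
    return score
```

Under these weights, a page that is both linked from a SHOE page and located in the same directory as one scores highest, while a page with neither property scores zero and is visited last.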

Click here to see a screen shot of Exposé in action