SHOE: Simple HTML Ontology Extensions

Creating Ontologies Using SHOE

Sean Luke
PLUS Group, U Maryland at College Park

Creating a Basic Ontology

Suppose we want our knowledge-robot system to gather information about the UMCP Computer Science Department web site. Before we can annotate our web pages with this information, we first need to establish one or more ontologies (or, much better, use already-established ones) that define how we can classify our documents. These ontologies describe the categories our web pages can fall into, and the relationship rules between categories or other data, which we can use later to describe relationships between our web pages and other web pages (or other data like numbers or dates).

An example works best to explain this idea. Imagine that the Association for Computing Machinery (the ACM, an "official" Computer Science group) has asked us to create an ontology for computer science department web pages. An ontology is rarely created to stand on its own; more often than not, new ontologies "borrow from" or "extend" existing ontologies. We can borrow from as many ontologies as we like, as long as each is assigned a unique prefix, as shown below. For this simple example, we'll borrow from SHOE's root ontology, base-ontology, located at "http://www.cs.umd.edu/projects/plus/SHOE/base.html".

So let's declare our ontology, and call it cs-dept-ontology version 1.0. We'll indicate that we're borrowing from base-ontology, that base-ontology can be found at a particular URL, and that every element we reference from it will carry the prefix base.

<!-- Here we indicate that this document is conformant with SHOE 1.0 -->
<TITLE> Our CS Ontology </TITLE>
</HEAD>
<BODY>
<!-- Here we declare the ontology's name and version -->
<ONTOLOGY ID="cs-dept-ontology" VERSION="1.0">
<!-- Here we declare that we're borrowing from another ontology -->
<USE-ONTOLOGY ID="base-ontology" VERSION="1.0" PREFIX="base" URL="http://www.cs.umd.edu/projects/plus/SHOE/base.html">

The most common function of an ontology is to provide users with the ability to hierarchically categorize instances. Since our ontology deals with computer science departments, let's toss the following categorization facts into the ontology:

  • departments and research groups are organizations.
  • faculty, assistants, and administrative staff are workers.
  • workers and students are people.
  • postdocs, lecturers, and professors are faculty.
  • research assistants and teaching assistants are assistants.
  • graduate students and undergraduate students are students.
  • secretaries are administrative staff.
  • chairs are both professors and administrative staff.
  • organizations, publications, and people are "basic items".

The fact that chairs can be both professors and administrative staff indicates that SHOE provides multiple inheritance: categories can have more than one supercategory. We declare all these things by saying:

<!-- Here we lay out our category hierarchy -->
<DEF-CATEGORY NAME="Organization" ISA="base.SHOEEntity">
<DEF-CATEGORY NAME="Person" ISA="base.SHOEEntity">
<DEF-CATEGORY NAME="Publication" ISA="base.SHOEEntity">
<DEF-CATEGORY NAME="ResearchGroup" ISA="Organization">
<DEF-CATEGORY NAME="Department" ISA="Organization">
<DEF-CATEGORY NAME="Worker" ISA="Person">
<DEF-CATEGORY NAME="Faculty" ISA="Worker">
<DEF-CATEGORY NAME="Assistant" ISA="Worker">
<DEF-CATEGORY NAME="AdministrativeStaff" ISA="Worker">
<DEF-CATEGORY NAME="Student" ISA="Person">
<DEF-CATEGORY NAME="PostDoc" ISA="Faculty">
<DEF-CATEGORY NAME="Lecturer" ISA="Faculty">
<DEF-CATEGORY NAME="Professor" ISA="Faculty">
<DEF-CATEGORY NAME="ResearchAssistant" ISA="Assistant">
<DEF-CATEGORY NAME="TeachingAssistant" ISA="Assistant">
<DEF-CATEGORY NAME="GraduateStudent" ISA="Student">
<DEF-CATEGORY NAME="UndergraduateStudent" ISA="Student">
<DEF-CATEGORY NAME="Secretary" ISA="AdministrativeStaff">
<DEF-CATEGORY NAME="Chair" ISA="AdministrativeStaff Professor">

Note that Organization, Publication, and Person subcategorize from base.SHOEEntity, that is, the category SHOEEntity declared in base-ontology. SHOEEntity is the accepted "root" category for all categories you'll declare in an ontology; elements at the top of your category hierarchy should subcategorize from it.

So what's up with this "foo.bar" stuff, like "base.SHOEEntity"? This is how SHOE keeps track of which ontologies declared what. By extending one or more ontologies, you have access to the declarations they made through their prefix chains. In this case, since base-ontology declared SHOEEntity, and we are extending base-ontology with the prefix base, we have access to SHOEEntity through the reference base.SHOEEntity.

Prefix chains can pile up. Imagine that we're extending some ontology foo-ontology with the prefix foo, and foo-ontology extends the ontology bar-ontology with the prefix bar, and bar-ontology declares the category baz. We have access to baz as foo.bar.baz.
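
As a sketch of how such a chain might look in markup, the fragment below follows the USE-ONTOLOGY and DEF-CATEGORY patterns shown earlier. The foo-ontology version and URL are placeholders, and the category name Widget is hypothetical, chosen only for illustration.

```html
<!-- Hypothetical: the version and URL of foo-ontology are placeholders -->
<USE-ONTOLOGY ID="foo-ontology" VERSION="1.0" PREFIX="foo" URL="http://www.example.org/foo-ontology.html">

<!-- baz is declared in bar-ontology, which foo-ontology extends with the prefix bar -->
<!-- Widget is an illustrative category name, not part of cs-dept-ontology -->
<DEF-CATEGORY NAME="Widget" ISA="foo.bar.baz">
```

Note that this fragment has no USE-ONTOLOGY line for bar-ontology: the chain reaches baz through the prefix that foo-ontology assigned.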

Also note that Chair properly subcategorizes from both AdministrativeStaff and Professor. Instead of writing it the way we did, it'd be perfectly fine to write it separately as:

<!-- ...An optional way to say it...
<DEF-CATEGORY NAME="Chair" ISA="AdministrativeStaff">
<DEF-CATEGORY NAME="Chair" ISA="Professor">
-->

Now, let's add to our ontology some simple relationships between elements of different categories.

  • students have professors as advisors.
  • organizations have members.
  • people author publications.

There could be many more, of course, but these are enough for our example. It's also often useful to have relationships whose arguments are plain data rather than categorized elements. We can have the following kinds of relationships with specific data types:

  • publications are published on a date.
  • students' age is a number.
  • everything can have a name which is a string (that is, a phrase like "Robert Kohout").
  • whether a professor is tenured or not is a truth (a value of "YES" or "NO").

Obviously there could be more than this, but this suffices for our example.

<!-- And now we lay out our relationships between categories -->
<DEF-RELATION NAME="advisor">
    <DEF-ARG POS="1" TYPE="Student">
    <DEF-ARG POS="2" TYPE="Professor">
</DEF-RELATION>
<DEF-RELATION NAME="member">
    <DEF-ARG POS="1" TYPE="Organization">
    <DEF-ARG POS="2" TYPE="Person">
</DEF-RELATION>
<DEF-RELATION NAME="publicationAuthor">
    <DEF-ARG POS="1" TYPE="Publication">
    <DEF-ARG POS="2" TYPE="Person">
</DEF-RELATION>
<!-- Lastly, we lay out our other relationships -->
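
The declarations this comment introduces do not appear in the listing. A sketch of what they might look like, following the DEF-RELATION pattern already used for advisor, member, and publicationAuthor, is given here. The relation names publicationDate, age, and tenured are assumptions made for illustration (only name is mentioned in the text), as is the choice of base.SHOEEntity as the first argument of name to capture "everything".

```html
<!-- Assumed name: publications are published on a date -->
<DEF-RELATION NAME="publicationDate">
    <DEF-ARG POS="1" TYPE="Publication">
    <DEF-ARG POS="2" TYPE=".DATE">
</DEF-RELATION>
<!-- Assumed name: a student's age is a number -->
<DEF-RELATION NAME="age">
    <DEF-ARG POS="1" TYPE="Student">
    <DEF-ARG POS="2" TYPE=".NUMBER">
</DEF-RELATION>
<!-- Everything can have a name, which is a string -->
<DEF-RELATION NAME="name">
    <DEF-ARG POS="1" TYPE="base.SHOEEntity">
    <DEF-ARG POS="2" TYPE=".STRING">
</DEF-RELATION>
<!-- Assumed name: whether a professor is tenured is a truth value -->
<DEF-RELATION NAME="tenured">
    <DEF-ARG POS="1" TYPE="Professor">
    <DEF-ARG POS="2" TYPE=".TRUTH">
</DEF-RELATION>
```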

What's up with the period in the terms ".DATE", ".NUMBER", ".STRING", and ".TRUTH"? Well, there is one exception to the prefix-chain rule described before (remember "foo.bar.baz"?): types, categories, relations, etc. beginning with a sole period are references to objects declared in the SHOE base-ontology. This means that, yes, we could instead have referred to them as "base.STRING", for example. Similarly, "base.SHOEEntity" could (in this example) be written ".SHOEEntity", but only because we happen to extend the SHOE base ontology. If we extend any other ontologies, we have to use full prefix chains to refer to their objects.
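
To make the shorthand concrete, here is a small sketch of the two spellings side by side, under the setup of this example (base-ontology extended with the prefix base). The DEF-ARG lines are fragments that would sit inside a DEF-RELATION; they are shown on their own only to compare the two forms.

```html
<!-- Either line names the same supercategory in this example; use one or the other -->
<DEF-CATEGORY NAME="Person" ISA="base.SHOEEntity">
<DEF-CATEGORY NAME="Person" ISA=".SHOEEntity">

<!-- Likewise for a data type used as a relation argument -->
<DEF-ARG POS="2" TYPE="base.STRING">
<DEF-ARG POS="2" TYPE=".STRING">
```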

Our "name" declaration is redundant. As it turns out, the SHOE base ontology already has a relation "base.name" declared, but we'll ignore that for the sake of this example.

We finish out an ontology by closing it as such:
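
The closing markup itself is not shown here. A sketch, assuming it is simply the end tag matching the ONTOLOGY element opened at the start, followed by the end tag for the BODY element opened before it:

```html
<!-- Close the cs-dept-ontology declaration -->
</ONTOLOGY>
<!-- Close the enclosing HTML body -->
</BODY>
```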


And our basic ontology is finished. There are a number of additional ontology-building features that we didn't touch on here, but this is sufficient for now.