Evaluation Criteria
Numerous data repositories are available online; however, few evaluation criteria have been suggested for these important resources. We suggest that when considering data repositories, two areas need to be evaluated: 1) the repository and 2) the data. Below is a list with a brief description of desirable attributes for both of these areas, along with the rating criteria used on this site.
* The starred items are considered critical to creating an online data repository.
Three rating systems were used when evaluating a site, depending on the type of item being evaluated:
- Present / not present: the site was marked according to whether the feature was present or not present.
- 0-5: a star rating depending on the number of subfeatures within a category. If 5 subfeatures were possible, a star was assigned for each feature. If 9 were possible, a star was assigned for each 2 features.
- LIST: the results of evaluating the site were simply listed.
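The 0-5 rule above can be sketched as a small function. This is only an illustration under stated assumptions: the name `star_rating` is hypothetical, the generalization to counts other than 5 and 9 is ours, and we assume that only complete groups of subfeatures earn a star (the text does not say how a leftover subfeature is rounded).

```python
import math


def star_rating(found: int, possible: int, max_stars: int = 5) -> int:
    """Convert a count of subfeatures found into a 0-5 star rating."""
    if possible <= 0:
        raise ValueError("possible must be positive")
    if not 0 <= found <= possible:
        raise ValueError("found must be between 0 and possible")
    # Subfeatures needed per star: 1 when 5 are possible, 2 when 9 are possible.
    per_star = math.ceil(possible / max_stars)
    # Assumption: only complete groups of subfeatures earn a star (round down).
    return min(max_stars, found // per_star)


print(star_rating(3, 5))  # one star per subfeature
print(star_rating(7, 9))  # one star per 2 subfeatures
```

Under the round-down assumption a category with 9 subfeatures tops out at 4 stars; rounding up instead would let it reach 5. Which of the two was used on the site is not stated.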
Repository Criteria
- *Single Entry Point: Despite the variety of data available from a source, providing a single entry point to the data helps orient the user to the repository and improves the data's findability. Multiple entry points can leave users wondering if they missed something and reduce their ability to create an effective search strategy. A good single entry point should also provide users with assistance navigating the repository.
- *Overview: An overview of the repository allows the user to understand quickly what is available and what is not. A good overview should provide the user with an understanding of the (1) number, (2) size, (3) type, (4) source, and (5) temporal range of available datasets, as well as the range of topics covered.
- Browsable: Subdividing the data into meaningful categories, which users can quickly browse, facilitates both comprehension of large data sets and navigation.
- *Searchable: Because data repositories are often large and include a wide variety of data, search becomes an important function. Search should be easily accessible from the home page and should cover the data only (as opposed to both the web site and the data).
- *Data Retrieval Formats (LIST): Providing a method for the user to download data tables after locating them, as opposed to cutting and pasting, is a critical feature. Excel formats are an excellent choice for data sets with fewer than 65,000 rows, although an alternative format should be made available for non-Excel users. Larger data sets can be handled by subdividing the data, providing alternative methods of download, or allowing the user to manipulate the data online to reduce the amount of data downloaded. Any download method that requires installation of non-standard software is less desirable, but may be required for larger data sets. Providing a combination of methods in these situations is likely the best solution to meet multiple users' needs.
- Online Data Interaction: Providing a mechanism for interacting with the data online, if implemented in a usable fashion, is an excellent extra feature, particularly for larger data sets. It should not replace a method of data retrieval, however, since copying and pasting HTML tables often loses formatting and adds extra data. Generally, tabular presentations are easier to read and handle and are a better representation for relational data. XML, however, is a better representation for hierarchical data. XML is also more manageable when loaded in memory.
- Visual Interfaces: When appropriate, providing visual interfaces for locating or understanding the data will improve the user's experience.
- Additional Information (LIST): Information about and links to popular searches, usage reports, or related discussion groups can pique a user's interest and take advantage of the data's use by multiple people.
- Feedback Mechanisms: Methods of providing feedback to the repository, such as contact information and an error reporting function, are also beneficial.
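The Data Retrieval Formats item above suggests subdividing larger data sets so that each download stays within the spreadsheet row limit. A minimal sketch of that idea follows, assuming CSV as the alternative format; the function names `chunk_rows` and `to_csv_parts` and the `row_limit` parameter are hypothetical, and only the 65,000 figure comes from the text.

```python
import csv
import io
from typing import Iterable, Iterator, List, Sequence

ROW_LIMIT = 65_000  # row figure quoted in the text above


def chunk_rows(rows: Iterable[Sequence], size: int) -> Iterator[List[Sequence]]:
    """Yield consecutive lists of at most `size` rows."""
    chunk: List[Sequence] = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def to_csv_parts(header: Sequence[str], rows: Iterable[Sequence],
                 row_limit: int = ROW_LIMIT) -> List[str]:
    """Return CSV texts, each with fewer than `row_limit` lines, header included."""
    if row_limit < 3:
        raise ValueError("row_limit must leave room for a header and one data row")
    parts: List[str] = []
    # Header plus (row_limit - 2) data rows keeps every part under the limit.
    for chunk in chunk_rows(rows, row_limit - 2):
        buffer = io.StringIO(newline="")
        writer = csv.writer(buffer)
        writer.writerow(header)  # repeat the header so each part stands alone
        writer.writerows(chunk)
        parts.append(buffer.getvalue())
    return parts


# Small demonstration with an artificially low limit.
demo = to_csv_parts(["id", "value"], [(i, i * i) for i in range(1, 6)], row_limit=4)
print(len(demo), "parts")
```

Repeating the header in every part is a design choice made here so that each file can be opened on its own; a repository could equally offer a single large file alongside the parts.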
Data Criteria
In addition to the attributes of the repositories, certain information about the data allows a user to assess the data's authority and quality. The following information is recommended for this purpose:
- *Author information
- *Data quality guidelines
- *Method of indicating uncertainty/incompleteness
- *Currency
  - Last update date
  - Version information
  - Data collection date
- Additional data details
  - Detailed descriptions
  - Reports based on the data
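As an illustration only, the checklist above could be recorded per data set and checked for missing starred items. The class `DatasetMetadata`, its field names, and the function `missing_critical` are hypothetical, and we assume that currency counts as documented when any one of its sub-items is present; the source does not specify this.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetMetadata:
    # Starred (critical) items from the list above
    author: Optional[str] = None
    quality_guidelines: Optional[str] = None
    uncertainty_method: Optional[str] = None
    # Currency sub-items
    last_updated: Optional[str] = None
    version: Optional[str] = None
    data_collection: Optional[str] = None
    # Additional data details
    description: Optional[str] = None
    reports: Optional[str] = None


CRITICAL = ("author", "quality_guidelines", "uncertainty_method")
CURRENCY = ("last_updated", "version", "data_collection")


def missing_critical(meta: DatasetMetadata) -> List[str]:
    """List the starred items that a metadata record leaves empty."""
    missing = [name for name in CRITICAL if not getattr(meta, name)]
    # Assumption: any one currency sub-item is enough to document currency.
    if not any(getattr(meta, name) for name in CURRENCY):
        missing.append("currency")
    return missing


record = DatasetMetadata(author="Example Agency", last_updated="2006-05-08")
print(missing_critical(record))
```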
Evaluation Limitations
Each site was evaluated for 10-15 minutes. If an item was not found in that time, it was marked as missing. This does not guarantee that the feature did not exist on the site, but it does imply that the feature was not easily findable. As of May 8, 2006, evaluations had been conducted by only one individual, based on the criteria listed above. Further work needs to be done to ensure the methods are consistent and replicable.