This project explores technologies for visualizing complex datasets to assist information retrieval and understanding. Our particular interest is in biodiversity information, which underlies most pressing environmental and conservation debates but is needed by users without significant content expertise. The project combines information visualization techniques with rapid-feedback dynamic query interfaces, and it takes an aggressive approach to working with representative users from design through evaluation. Zoomable interfaces will allow users to navigate multiple hierarchies, so that highly interconnected data can be accommodated visually and understood.
Discoveries at and across the frontier of science and engineering
Our findings extend our understanding of zooming and integrated searching
and browsing as tools for information retrieval. We are adding to knowledge
about the behavior of non-content experts and how they can be supported in
exploring complex biological databases, even as they gain content expertise. Our
first applications, TaxonTree and DoubleTree, scale up to very large trees (up
to 400,000 nodes) through use of a database backend. We contributed sample
datasets for an Information Visualization Contest, which generated other innovative solutions to the problem of comparing
large trees. We are exploring and evaluating different ways to display node-link
diagrams and node attributes. This interdisciplinary work provides some of the first findings focused on
front-end systems in biodiversity informatics. In particular, it targets an
expanded user community of non-experts. At the same time, expert biologists will
benefit from the ability to visualize and interact with taxonomic and
phylogenetic databases.
Connections between discoveries and their use in service to society
Supporting users across content-expertise levels is of vital importance to
the global information economy. People in governments, schools, and private
industry rely on Internet resources for decision-making and learning.
Specifically, this project represents a new approach for visualizing and
reducing biodiversity data complexity so that it can be successfully used across
society.
A diverse, globally oriented workforce of scientists and engineers
Other than the PI, all project personnel, including one Ph.D. student, two
part-time research scientists, one undergraduate researcher, and seven undergraduate design partners, have been
women.
Improved achievement in mathematics and science skills needed by all Americans
Our tools are expected to support increased understanding of scientific
databases and biodiversity data. In addition to its use in a core biology course
at the University of Maryland, TaxonTree is being adapted for use by the Animal
Diversity Web, part of the BioKIDS
project (NSF REC 0089283). BioKIDS’ inquiry-based biodiversity curriculum
targets 5th and 6th graders in the Detroit Public School System.
Our project goals are to:
1) Develop a searching interface for biodiversity databases targeting
domain-novice adults.
2) Build interfaces combining "folk" and "scientific" understanding.
3) Evaluate the developed interfaces and compare them to existing interface
models in the biodiversity domain.
Since initiating the project in September 2002, we have created one application and two prototypes towards the first two goals and have conducted one qualitative user study towards the third goal. This second year was spent disseminating results and preparing for final refinement and evaluation of our first application and its design principles. In addition, we are expanding our scope and have begun developing datasets and tools for visualization of ecological interaction data.
We developed a new software application, TaxonTree, by modifying an existing application, SpaceTree. TaxonTree allows users to browse and search a very large node-link diagram of animal names that we constructed by integrating data from a number of public and private sources. Names link to external web pages. TaxonTree combines zooming interactivity with integrated searching and browsing: search results are presented in the larger biological context of their classification tree. Towards the second goal, we developed a prototype called DoubleTree, which couples navigation of the scientific biological classification in TaxonTree with navigation of a simpler, folk tree. A second prototype supported exploration of multi-dimensional natural history data.
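To make "search results in the context of their classification tree" concrete, the following is a minimal sketch and not TaxonTree's actual code. The toy taxonomy, the `PARENT` mapping, and the function names `path_from_root` and `search_in_context` are all hypothetical, introduced only for illustration. The idea shown is that each match is returned together with its chain of ancestors, so a display could open the tree along that path rather than list the hit in isolation.

```python
from typing import Dict, List, Optional

# Hypothetical toy classification: child name -> parent name (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "Animalia": None,
    "Chordata": "Animalia",
    "Mammalia": "Chordata",
    "Carnivora": "Mammalia",
    "Felidae": "Carnivora",
    "Canidae": "Carnivora",
    "Aves": "Chordata",
}


def path_from_root(name: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain of names from the root down to `name`."""
    path: List[str] = []
    current: Optional[str] = name
    while current is not None:
        path.append(current)
        current = parent[current]
    path.reverse()
    return path


def search_in_context(query: str, parent: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Case-insensitive substring search; each hit maps to its root-to-node path."""
    q = query.lower()
    return {
        name: path_from_root(name, parent)
        for name in parent
        if q in name.lower()
    }


if __name__ == "__main__":
    for hit, path in search_in_context("idae", PARENT).items():
        print(hit, "<-", " > ".join(path))
```

A real system would presumably also match common names and rank results; this sketch covers only the pairing of a match with its ancestors.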
Our qualitative user study of TaxonTree in an undergraduate course is the first to describe usage patterns in the biodiversity domain. We found that interaction with an animated, zoomable node-link diagram aided users' understanding of the data. Most users approached biodiversity data by browsing, using common names and general knowledge rather than the scientific keyword expertise necessary to search using traditional interfaces. Users with different levels of interest in the domain had different interaction preferences; results suggest that users with higher interest levels (usually female) prefer greater control over node opening. TaxonTree and DoubleTree performed well on large datasets, with basic browsing and querying tasks requiring from 62 ms to 2547 ms. This is because we show only the subset of data of immediate interest to the user, while retaining the ability for users to browse to nearby detail. Our work demonstrates the trade-offs inherent in displaying phylogenetic vs. classification trees and shows that a combined approach is not only feasible but usable. Coupling a folk tree with a large scientific tree shows promise, but a more effective way to illustrate one-to-many mappings is needed. We can now refine the tasks and metrics to allow comparative studies that accomplish goal 3 for TaxonTree and future applications.
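The strategy of showing only the data of immediate interest, backed by a database, can be sketched as follows. This is an illustrative assumption about how such a backend might be queried, not a description of the project's implementation: the `node` table, its columns, the sample rows, and the functions `children`, `ancestors`, and `visible_subset` are hypothetical. The sketch uses Python's standard `sqlite3` module and fetches only the focus node, its ancestors, and its direct children, so the cost of a browsing step depends on the local neighborhood rather than on the size of the whole tree.

```python
import sqlite3
from typing import Dict, List, Union


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory tree table (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)"
    )
    # An index on parent_id keeps child lookups cheap as the table grows.
    conn.execute("CREATE INDEX node_parent ON node(parent_id)")
    rows = [
        (1, None, "Animalia"),
        (2, 1, "Chordata"),
        (3, 1, "Arthropoda"),
        (4, 2, "Mammalia"),
        (5, 2, "Aves"),
        (6, 4, "Carnivora"),
        (7, 4, "Primates"),
    ]
    conn.executemany("INSERT INTO node VALUES (?, ?, ?)", rows)
    return conn


def children(conn: sqlite3.Connection, node_id: int) -> List[str]:
    """Fetch only the direct children of one node."""
    cur = conn.execute(
        "SELECT name FROM node WHERE parent_id = ? ORDER BY name", (node_id,)
    )
    return [row[0] for row in cur]


def ancestors(conn: sqlite3.Connection, node_id: int) -> List[str]:
    """Walk parent links upward; return names from the root down to the parent."""
    names: List[str] = []
    row = conn.execute(
        "SELECT parent_id FROM node WHERE id = ?", (node_id,)
    ).fetchone()
    while row is not None and row[0] is not None:
        parent = conn.execute(
            "SELECT parent_id, name FROM node WHERE id = ?", (row[0],)
        ).fetchone()
        names.append(parent[1])
        row = (parent[0],)
    names.reverse()
    return names


def visible_subset(
    conn: sqlite3.Connection, focus_id: int
) -> Dict[str, Union[str, List[str]]]:
    """Return just what a focused view needs: ancestors, focus, direct children."""
    focus = conn.execute(
        "SELECT name FROM node WHERE id = ?", (focus_id,)
    ).fetchone()[0]
    return {
        "ancestors": ancestors(conn, focus_id),
        "focus": focus,
        "children": children(conn, focus_id),
    }


if __name__ == "__main__":
    print(visible_subset(build_demo_db(), 4))
```

Siblings and deeper descendants are loaded only when the user moves the focus, which is one plausible way to keep interaction responsive on trees with hundreds of thousands of nodes.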
This research addresses the general problem of diverse users and complex information sources via visualization. In the same way that bioinformatics has revolutionized the fields of molecular biology and biophysics, biodiversity informatics is at the threshold of providing data and tools to allow the next generation to discover and understand global patterns and processes governing the diversity of life. Much biodiversity information is already available on the Internet (Bisby, 2000), where keyword searches remain the predominant method of access (Cockburn & McKenzie, 2000). In the biodiversity domain, the efficiency of single-word searches is constrained because inherently complex biological data are stored in a controlled language that is not necessarily understood by domain novices. Users may be professionals such as taxonomists and conservation biologists, or they may be domain novices, such as students or educated professionals in other fields, for example land-use planners or lawyers (Maier et al., 2000).
All software is available for download or use at