Dynamic queries, starfield displays, and the path to Spotfire

Updated July 26, 2007

Ben Shneiderman, February 4, 1999

The old days of command line interfaces and submitting queries to databases are passing quickly. In their place are dynamic queries and starfield displays that update a two-dimensional graphical display in 100 milliseconds. As users adjust sliders, buttons, check boxes, and other control widgets, the starfield display containing color- and size-coded points is updated rapidly. Users feel they are in control and there is no more "RUN" button.

Two early applications of dynamic queries were built at the University of Maryland's Human-Computer Interaction Laboratory (HCIL Video Reports 1992) -- a chemical table of elements (HCIL TR 91-11) and a real estate HomeFinder (HCIL TR 92-01) (free downloadable version at http://www.cs.umd.edu/hcil/pubs/products.shtml).

Christopher Ahlberg, a visiting student from Sweden during summer 1991, took up my lunch-time challenge to work on dynamic interfaces that applied direct manipulation principles. His first overnight success was making a modern slider-based version of a polynomial viewer that I had built as a graduate student in 1972 (The Mathematics Teacher 67,2 (February 1974), pages 111-113). As users move the sliders for each coefficient, the curve gracefully reshapes on the screen - dancing parabolas. Within a week he had satisfied my second challenge of a dynamic query for the chemical table of elements. He put up the periodic table with chemical symbols in red and six sliders for attributes such as atomic radius, ionization energy, and electronegativity. As users move the sliders, the chemical symbols change to red showing the clusters, jumps, and gaps that chemists find fascinating. A study with 18 chemistry students showed faster performance with use of a visual display (versus a simple textual list) and a visual input device (versus a form fill-in box).

Christopher Williamson's HomeFinder showed a map of Washington, DC and 1100 points of light indicating homes for sale. Users could mark the workplace for both members of a couple and then adjust sliders to select circular areas of varying radii. Other sliders selected number of bedrooms and cost, with buttons for air conditioning, garage, etc. Within seconds users could see how many homes matched their query, and adjust accordingly. Controlled experiments with benchmark tasks showed dramatic speed-ups in performance and high subjective satisfaction (1, 2, 3) (HCIL TR 93-01 and HCIL TR 94-16) (HCIL Video Reports 1993 and 1994). This demo continues to be one of the most compelling and comprehensible even though it is almost 8 years old.
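
The HomeFinder query described above can be pictured as a conjunction of simple filters that is re-evaluated every time a control moves. The following is only a minimal Python sketch of that idea, not the original implementation: the Home record, its field names, and the flat kilometer coordinates are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Home:
    # Hypothetical record layout; coordinates are kilometers on a flat map.
    x_km: float
    y_km: float
    bedrooms: int
    cost: int
    has_garage: bool


def within(home, center, radius_km):
    """True if the home lies inside the circle around a workplace."""
    return math.hypot(home.x_km - center[0], home.y_km - center[1]) <= radius_km


def dynamic_query(homes, work_a, radius_a, work_b, radius_b,
                  min_bedrooms, max_cost, need_garage=False):
    """Return the homes that satisfy every control setting at once."""
    return [
        h for h in homes
        if within(h, work_a, radius_a)
        and within(h, work_b, radius_b)
        and h.bedrooms >= min_bedrooms
        and h.cost <= max_cost
        and (h.has_garage or not need_garage)
    ]
```

In an interactive display, a function like dynamic_query would be called on every slider movement and the matching points redrawn; with about 1100 homes a linear scan of this kind is cheap.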

Williamson earned a trip to the ACM SIGIR'92 conference in Copenhagen to present his work. Then he went on to the University of Colorado at Boulder to do a master's thesis that expanded the idea into a well-engineered and commercially viable version. One of the amusing stories about this project was the unwillingness of corporate or university sources for regional housing information to share their data. Undaunted, Chris Williamson and his friends took a Sunday Washington Post and typed in the data for the 1100 homes. One of the amazing stories is the resistance of these same institutions to learn about or apply our approach.

The concept of a generic two-dimensional scattergram with zooming, color coding, and filtering was first applied in the FilmFinder (4) (HCIL-TR 93-14) (HCIL Video Reports 1994 and repeated in 1996). Our lab was working on interactive TV applications and we had a brainstorming session in the conference room with about eight attendees including Chris Ahlberg, who joined us for a second summer. I asked each person to describe a possible interface for finding a film from a library of 10,000 videotapes. As the variants of traditional approaches with command lines and menus were rejected, it became more difficult for each speaker to come up with something fresh. Since every alternative was text based, I concluded the session by proposing a two dimensional layout with years on the x-axis and popularity on the y-axis. The idea was quickly accepted and refined.

By the next morning Chris had a prototype showing fifteen hundred films with color coded spots (red for drama, white for action, etc.). As the weeks passed and other students built components, Chris integrated them into the FilmFinder. A range slider allowed filtering by the length of the film and buttons allowed selection by ratings. A click on one of the spots produced a pop-up box with details of each film and a picture of one of the actors/actresses. Pictures of about 50 actors were grabbed from the net, and Michelle Pfeiffer (Chris's choice) became the default for the others. Demos were carefully scripted around the pictures that we had.

The idea for the alphaslider had been germinating in our lab for two years, but we were stuck with designs that had one item per pixel (HCIL TR 93-08), limiting its use to a few hundred items. Chris proposed the concept that many items could be tied to a single pixel, but I thought he misunderstood the technology - it was me who just couldn't grasp his solution. It took me a few minutes to shake free from my assumptions and then it was immediately clear that his approach would work. Chris built four versions and ran a study with 24 subjects (5) (HCIL TR 93-15). The FilmFinder used the alphasliders to select from thousands of actors, actresses, and directors. The alphaslider idea spawned a variety of improvements and further empirical studies.
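
The idea of tying many items to one pixel can be illustrated with a little arithmetic. The Python sketch below is an assumption-laden illustration, not any of the four versions Chris built: the function names, the 200-pixel track, and the split into a coarse jump plus a one-item fine step are hypothetical.

```python
def coarse_index(pixel, track_pixels, item_count):
    """Map a thumb position to the first item owned by that pixel.

    With more items than pixels, each pixel covers a run of consecutive
    items in the sorted list rather than a single item.
    """
    if not 0 <= pixel < track_pixels:
        raise ValueError("pixel is outside the slider track")
    return pixel * item_count // track_pixels


def fine_step(index, steps, item_count):
    """Move by whole items within the list, clamped to its ends."""
    return max(0, min(item_count - 1, index + steps))
```

For example, 10,000 sorted names on a 200-pixel track give each pixel a run of 50 names: dragging the thumb jumps in units of 50, and a fine step then reaches the names inside a run.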

The FilmFinder videotape was also produced during Chris's intense 12-week summer visit and it remains my most popular tape to show, even 5-6 years later. The group became known as the Widget Carvers of College Park, commemorated by a hand carved wooden sign that hangs at our lab's doorway. Later work on the FilmFinder sought to improve the data structures (HCIL TR 93-16) and make the zooming smoother (HCIL TR 93-06). Our advanced research continues to increase the size of the database while keeping response times rapid (8, 9).

Chris Ahlberg returned to Sweden to work on his excellent PhD on Dynamic Queries at Chalmers University. He developed an enhanced UNIX implementation, called IVEE (Information Visualization and Exploration Environment), that was flexible in reassigning the axes to other attributes and had a scroll bar to permit large numbers of sliders. It allowed importation of arbitrary flat files (6) and therefore was applied in many projects, including the State of Maryland's Department of Juvenile Justice (7) (HCIL TR 96-15) (HCIL Video Reports 1995).

Ahlberg gathered his friends and found venture capital to start a company. The commercial version of the starfield display, now called Spotfire, allows increased user control and greater flexibility. Spotfire was launched in mid-1996 by IVEE Development, which was later renamed Spotfire Inc. (http://www.spotfire.com). It maintains a development center in Goteborg, Sweden with its North American headquarters in Cambridge, MA. Because of my sense of paternity for the idea, I agreed to be on the board and have enjoyed watching Chris Ahlberg's transformation into a successful businessperson. Spotfire has become a leader in visual data mining and information visualization and is available in Windows and Windows NT versions. It has become enormously successful in the challenging pharmaceutical drug discovery task, leading 16 of the 20 big pharmaceutical companies to become major adopters. Easy import/export of data, rapid change to axes, color coding, or size coding, and collaboration support have made Spotfire a leader in its class. Expansion to other application domains is proceeding.

The concept of dynamic queries became the basis for applications such as pruning of large tree structures based on sliders tied to attributes of nodes (HCIL 95-12) (HCIL Video Reports 1995) and a youth services database for the Maryland Dept. of Juvenile Justice (HCIL TR 96-07) (HCIL Video Reports 1995). Our recent efforts have applied Spotfire to web site log visualization, with the goal of understanding patterns of usage for ecommerce.

In developing a Java-based visual tool for high school teachers to find educational resources, the Baltimore Learning Community (www.learn.umd.edu), we found that a simplified starfield display that had a discrete grid, rather than continuous axes, was easier to use (HCIL TR 97-15). In this database, color coding would typically be used for media type, for example, red for videos, green for web sites, blue for images, and yellow for texts. When more than 49 documents wind up in a grid box, Dotfire shifts to a color-coded bar graph showing the frequency counts for each type of document. The typical axes for this application are topics (Arts and Humanities, Careers, Conflicts, Geography, Religion, U.S. History, World History) and outcomes (Chronology, Civics, Cultures, Economics, Environment, Politics, Science and Technology). Other categorical axes used in this application were year (1991 to 1998) and source (Discovery Communications, National Geographic, Library of Congress, etc.). Users could control and quickly change the axes and the color coding.
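
The grid behavior described above, individual dots up to 49 documents per box and a bar graph of counts beyond that, can be sketched in a few lines. This Python sketch is illustrative only and is not the Java tool itself: the dictionary-based document records, the attribute names, and the ("dots" / "bars") return convention are hypothetical.

```python
from collections import Counter, defaultdict

MAX_DOTS = 49  # threshold from the text: more than 49 documents become bars


def build_grid(docs, x_attr, y_attr, color_attr="media"):
    """Bin documents into grid boxes keyed by two categorical attributes.

    Each box is either ("dots", [one color value per document]) or, when
    it holds more than MAX_DOTS documents, ("bars", {value: count}).
    """
    cells = defaultdict(list)
    for doc in docs:
        cells[(doc[x_attr], doc[y_attr])].append(doc)

    grid = {}
    for key, members in cells.items():
        if len(members) > MAX_DOTS:
            counts = Counter(d[color_attr] for d in members)
            grid[key] = ("bars", dict(counts))
        else:
            grid[key] = ("dots", [d[color_attr] for d in members])
    return grid
```

Because the axes are passed in as attribute names, switching from topic and outcome to year and source is just another call with different arguments.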

Another extension of dynamic queries is to query previews for NASA remote sensing environmental databases. Instead of showing results within 100 milliseconds, the query preview simply shows the size of the result set. This eliminates wasted time with empty and with overly large result sets. This approach works well on the World Wide Web and it has been implemented on NASA websites and elsewhere.
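
A query preview only needs the size of the result set, so it can be answered from a small table of precomputed counts rather than from the data itself. The Python sketch below assumes such a table keyed by (region, year); the attribute choice, the function names, and the 1000-item threshold are hypothetical and are not taken from the NASA systems.

```python
def preview_count(counts, regions, years):
    """Sum precomputed counts over the selected regions and years."""
    return sum(
        n for (region, year), n in counts.items()
        if region in regions and year in years
    )


def preview_message(size, too_many=1000):
    """Turn a result-set size into feedback shown before the query runs."""
    if size == 0:
        return "no datasets match; widen the query"
    if size > too_many:
        return f"{size} datasets match; narrow the query"
    return f"{size} datasets match"
```

The user sees the count change while adjusting the selection and submits the full query only when the size looks reasonable.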

Under Christopher Ahlberg’s leadership as CEO, Spotfire grew to a 200-person company, which was bought during Summer 2007. The HCIL-inspired Spotfire product is used by most pharmaceutical companies for drug discovery and genomic data analysis, and is increasingly adopted for business intelligence analysis for oil/gas discovery, manufacturing control, marketing, supply chain management, and financial analysis. Ben Shneiderman participated in the formation of Spotfire and was on its Board of Directors 1996-2001.


REFERENCES

1) Ahlberg, Christopher, Williamson, Christopher, and Shneiderman, Ben, Dynamic queries for information exploration: An implementation and evaluation, Proc. ACM CHI'92: Human Factors in Computing Systems (1992), 619-626. (HCIL TR 92-01)

2) Williamson, Christopher, and Shneiderman, Ben, The Dynamic HomeFinder: Evaluating dynamic queries in a real-estate information exploration system, Proc. ACM SIGIR'92 Conference, Copenhagen, Denmark (June 1992), 338-346. Reprinted in Shneiderman, B. (Editor), Sparks of Innovation in Human-Computer Interaction, Ablex Publishers, Norwood, NJ (1993), 295-307.

3) Shneiderman, Ben, Dynamic queries for visual information seeking, IEEE Software 11, 6 (1994), 70-77.

4) Ahlberg, Christopher and Shneiderman, Ben, Visual Information Seeking: Tight coupling of dynamic query filters with starfield displays, Proc. of ACM CHI94 Conference (April 1994), 313-317 + color plates. Reprinted in Baecker, R. M., Grudin, J., Buxton, W. A. S., and Greenberg, S. (Editors), Readings in Human-Computer Interaction: Toward the Year 2000, Second Edition, Morgan Kaufmann Publishers, Inc., San Francisco, CA (1995), 450-456.

5) Ahlberg, Christopher and Shneiderman, Ben, AlphaSlider: A compact and rapid selector, Proc. of ACM CHI94 Conference, (April 1994), 365-371.

6) Ahlberg, Christopher and Wistrand, Erik, IVEE: An information visualization & exploration environment, Proc. IEEE Information Visualization '95, IEEE Computer Press, Los Alamitos, CA (1995), 66-73.

7) Ellis, Jason, Rose, Anne, and Plaisant, Catherine, Putting visualization to work: ProgramFinder for youth placement, ACM CHI 97: Human Factors in Computing Systems, ACM, New York (March 1997), 502-509.

8) Tanin, Egemen, Beigel, Richard, and Shneiderman, Ben, Design and evaluation of incremental data structures and algorithms for dynamic query interfaces, IEEE Information Visualization Conference (October 1997).

9) Tanin, Egemen, Beigel, Richard, and Shneiderman, Ben, Incremental data structures and algorithms for dynamic query interfaces, ACM SIGMOD Record 25, 4 (December 1996), 21-24.

Refereed Videos available from ACM SIGCHI (http://www.acm.org/sigchi)

Ahlberg, Christopher, Shneiderman, Ben, and Williamson, Christopher, Dynamic Queries, ACM SIGGRAPH Video Review 77, 10 min. (1991).

Ahlberg, Christopher and Shneiderman, Ben, Visual Information Seeking: Tight coupling of dynamic query filters with starfield displays, ACM SIGGRAPH Video Review 97, 7 min. (April 1994).