SEMVAST

Scientific Evaluation Methods for Visual Analytics Science and Technology

News

The old server that hosted the Repository died, so it was moved HERE: the new version of the Visual Analytics Benchmark Repository. The original websites for a few older challenges also ran on now-outdated technology and stopped working, so we saved PDF copies of the pages and included them in the benchmark repository for archival purposes.

The VAST Challenge 2014 was a success. The solutions and all entries will be posted soon on the Visual Analytics Benchmark Repository.

Look at the Information Visualization journal special issue on the VAST challenge (Oct 2014)

Read the summary of seven years of the VAST Challenge, i.e. our BELIV 2012 paper (PDF - as revised after the workshop)

The Blog was closed (too high ratio of spam postings to real news). Please use the mailing list or contact the project members.

Subscribe to our MAILING LIST to receive emails about this topic (only a couple of emails per month at most)

We posted a sample IRB application for running a MILC study (Multi-Dimensional In-depth Long-term Case study)

The Visual Analytics Benchmark Repository is available. It allows you to find benchmarks but also to add new ones or descriptions of the use of existing benchmarks (e.g. analyses done using them), and to upload or point to papers that report on the use of the benchmark datasets. If you are interested in adding additional benchmarks, please contact us.

Participants

Catherine Plaisant - University of Maryland, College Park
Georges Grinstein - University of Massachusetts at Lowell
Jean Scholtz - Battelle Memorial Institute, Pacific Northwest Division
Mark Whiting - Battelle Memorial Institute, Pacific Northwest Division

Contact us

Students:
Loura Costello - University of Massachusetts at Lowell (CS PhD student)
Heather Byrne - University of Massachusetts at Lowell (CS undergrad - REU student)
Adem Albayrak - University of Massachusetts at Lowell (CS undergrad - REU student)
Swetha Reddy - University of Maryland (iSchool, Master's student)
Shawn Konecni - University of Massachusetts at Lowell (CS PhD student)
Manas Desai - University of Maryland (Computer Engineering, Master's student)
(past) Samiksha Piprodia - University of Maryland (iSchool, Master's student)
and other Lowell students: Yen-fu Luo, John Fallon, Patrick Stickney.

Project Description

Visual analytics is the science of analytical reasoning facilitated by interactive visual interfaces. As new visual analytics methods and tools are developed, an evaluation infrastructure is needed. There is currently no consensus on how to evaluate visual analytics systems as a whole. It is especially difficult to assess their effectiveness because they combine multiple low-level components (analytical reasoning, visual representations, human-computer interaction, data representations and algorithms, tools for communicating the results of such analyses) integrated in complex interactive systems that require empirical user testing. Furthermore, it is difficult to assess their effectiveness without realistic data and tasks.

This project has focused on: 1) making benchmark data sets available, 2) seeding an infrastructure for evaluation, and 3) developing evaluation methods.

The VAST Challenge serves as a testbed for our activities.
SEMVAST PIs co-chaired the first VAST Challenge, organized as a submission category of the IEEE Visual Analytics Science and Technology (VAST) Symposium, and continue to participate in its organization. See the VAST 2008 Challenge and VAST 2009 Challenge. The challenges are powerful activities that allow the community to evaluate their tools with representative tasks and data sets, and allow us to test our metrics and automated evaluation tools on the materials submitted. Rather than emphasize “winners”, the VAST Challenge attempts to engage the community more widely by giving multiple awards that recognize effective designs.

1. Making benchmark data sets with ground truth available

Our collection started with the datasets released for the VAST contests and challenges. Those datasets were developed by the team of Mark Whiting of Pacific Northwest National Laboratory in the context of the Threat Stream Generator Project of the National Visualization and Analytics Center. As part of the VAST Challenge organization, we worked with Mark and his team to develop all the evaluation materials for those benchmarks.

VAST datasets:

The 2 VAST 2012 datasets (Cybersecurity)
The 3 VAST 2011 datasets (Geospatial and Microblogging for Characterization of an Epidemic Spread, Cybersecurity - Situational Awareness in Computer Networks, Text Analytics - Investigation into Criminal Activity)
The 3 VAST 2010 datasets (Text reports, Spatio-temporal data, Genetic sequences)
The 3 VAST 2009 Challenge datasets (Badge and computer network traffic, Social network with geospatial component, video analysis)
The 4 VAST 2008 Challenge datasets (wiki edits, cell phone social network, spatio-temporal data [boat migrations and building evacuation])
And also previously available:
VAST 2007 Contest dataset (mostly text)
VAST 2006 Contest dataset (mostly text)

Other datasets:

We have worked with colleagues at other institutions to help them develop new data sets and make them available to the community. Examples of data sets we acquired are in the areas of financial transactions and epidemiology; later, the Challenge committee secured funding to develop cybersecurity benchmark datasets. Other target areas are health records, sensor/tracking data, and accident records in the petroleum industry. We are focusing on data sets where ground truth is known to at least some extent.

The Visual Analytics Benchmark Repository is available for everybody to use. It allows you to find benchmarks but also to add new ones or descriptions of the use of existing benchmarks (e.g. analyses done using them), and to upload or point to papers that report on the use of the benchmark datasets. If you are interested in helping us populate the repository, please contact us.

2. Developing metrics and automated tools for evaluation

Using the VAST Challenge as our testbed, we developed new methodologies to evaluate Visual Analytics technology through competitions. We developed automated metrics for some aspects of Visual Analytics systems, as well as guidelines that help researchers assess the subjective aspects of visual analytics environments. Data from the 2009 reviews and from experiments currently underway with professional analysts are the basis of those guidelines. Lessons learned from the evaluation of the 2008 Challenge are available in the 2009 Information Visualization journal article (see Publications below).

Examples of lessons learned include:

- By redefining the contest format into a challenge with multiple mini-challenges we were able to increase participation dramatically.
- While publicly available data sets are useful (and have been available for a long time), benchmarks need to include 1) data 2) embedded realistic ground truth scenarios and 3) tasks to be useful for adequately assessing the utility of Visual Analytics tools.
- The availability of benchmarks encourages research on the topic of the benchmarks.
- Users of those benchmarks learn from each other.
- Having both professional users (i.e. analysts) AND visual analytics experts act as judges is essential.
- Computing accuracy metrics in the context of realistic analysis scenarios is a challenge. Teams participating in mini-challenge 2 of the VAST 2009 Challenge (the social network mini-challenge) were able to obtain accuracy scores for the social network they identified from the data. Teams were allowed to submit and receive an accuracy measure three times prior to submitting their final entry. Our computations were not 100% successful, but the participants were appreciative of the feedback (an illustrative sketch of this kind of scoring appears after this list).
- In addition to accuracy measures, we developed guidelines to help researchers assess the subjective aspects of visual analytics environments. Data from the 2009 reviews and from experiments currently underway with professional analysts will form the basis of those guidelines. Ground truth is rarely black and white. Participants are asked to provide not only “answers” but also evidence from the data that supports their answers. Because our data sets are often too large to be reviewed manually, participants may find additional material that provides interesting data we were unaware of. Judging needs to take those additional insights into consideration.
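
As an illustration only - not the actual VAST Challenge scoring code - here is a minimal Python sketch of the kind of accuracy feedback such a comparison can produce: a submitted social network is scored against a ground-truth network using simple edge precision, recall, and F1. The function names and example data below are hypothetical.

# Illustrative sketch only: a hypothetical edge-overlap score comparing a
# submitted social network to a ground-truth network. This is not the actual
# VAST Challenge scoring procedure.

def normalize_edges(edges):
    """Treat edges as undirected: (a, b) and (b, a) count as the same edge."""
    return {frozenset(e) for e in edges}

def score_network(submitted_edges, ground_truth_edges):
    """Return precision, recall, and F1 of the submitted edge set."""
    submitted = normalize_edges(submitted_edges)
    truth = normalize_edges(ground_truth_edges)
    correct = submitted & truth
    precision = len(correct) / len(submitted) if submitted else 0.0
    recall = len(correct) / len(truth) if truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

if __name__ == "__main__":
    # Hypothetical example data, not drawn from any Challenge dataset.
    truth = [("Alice", "Bob"), ("Bob", "Carol"), ("Carol", "Dave")]
    submission = [("Bob", "Alice"), ("Bob", "Carol"), ("Carol", "Eve")]
    p, r, f = score_network(submission, truth)
    print(f"precision={p:.2f}  recall={r:.2f}  f1={f:.2f}")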

3. Seeding an infrastructure for the coordination of long term evaluation activities

Visual analytics builds on multiple core research fields (e.g. information visualization, knowledge discovery, data mining, cognitive science, intelligent user interfaces, human-computer interaction) and will impact many application domains (e.g. intelligence and business analysis, bioengineering and genomic research, transportation, emergency response). We are working to make the methodology and tools known and available in the VAST community as well as outside this community.

We have developed several tools and services that others can use:

* The Benchmark Repository
* A submission and evaluation management website adapted to the needs of Visual Analytics competitions (e.g. multimedia submissions, automatic accuracy ratings, multiple types of reviewers, etc.). It is available at http://vastsubmission.cs.uml.edu (but of course requires a login and password).

To stimulate the research community and increase attention given to the evaluation of Visual Analytics, SEMVAST's PIs have led several activities:

* The VAST Challenge: e.g. the VAST 2011 Challenge and the VAST 2010 Challenge.
See Slides of introduction to the VAST Challenge (May 2009 - from HCIL workshop)

* As part of the ACM SIGKDD Workshop on Visual Analytics and Knowledge Discovery (VAKD’09) (June 28th, Paris, France), two SEMVAST PIs co-chaired a challenge using our VAST 2008 dataset, which resulted in several original papers reporting on that experience.

* The 2009 Visual Analytics Evaluation workshop: May 29th 2009 in conjunction with the HCIL symposium. The workshop gathered 20 designers, developers, evaluators, professional analysts and program managers from academia, industry or government.

* A special issue of Computer Graphics and Applications (IEEE CG&A), May/June 2009, was co-edited by the PIs. The 4 accepted papers are listed in the publications section.

* Beliv'08: BEyond time and errors: novel evaLuation methods for Information Visualization, a full day workshop April 5th 2008 at the ACM CHI 2008 conference in Florence (Beliv'08 proceedings - ACM Digital Library)

* VAST 2007 workshop on Metrics for the Evaluation of Visual Analytics, a full day workshop we organized on October 28th 2007 at the IEEE VIS 2007 conference. Short Report: Media:infovis_workshop.doc

4. Examples of Education activities
SEE: Whiting, M., North, C., Endert, A., Scholtz, J., Haack, J., Varley, C., Thomas, J., VAST Contest Dataset Use in Education, Proc. of VAST 2009 Symposium, 115-122 (PDF) (Published IEEE DL version)

* We worked with Dr. Chris North from Virginia Tech (VT) to understand how to best use the VAST ground truth datasets in class projects. Chris has used VAST contest data in his VT fall 2007 graduate class and in his VT spring 2008 undergraduate class. He also used the 2009 data sets in an analyst evaluation program. He contributed the outline and results for his graduate course and will continue in 2009 (e.g. see the CS3724 HCI class and CS5764 InfoVis class, and the VAST 2009 paper in the Publications section). Jean Scholtz looked at the contributed results and is monitoring the progress of the undergraduate course. We are currently making sure that the datasets are in reasonable shape to release to other educators and exploring how to assess feedback from the educators with regard to both the datasets and the metrics.

* Dr. Grinstein has used the data sets for his class this fall and will be using another in his new Visual Analytics class to be taught either spring or fall 2010. He also used the data sets in the DHS COE CCICADA 2009 Reconnect workshop on visual analytics.

* Dr. Haesun Park used the VAST 2009 datasets for the Summer Intern Program under the NSF/DHS FODAVA (Foundations of Data and Visual Analytics) program.

In summary

While challenges remain and the VAST Challenge evaluation methodology has some limitations, we believe that it remains the most practical and useful community-based evaluation method. The publicly available submitted entries and the reviews provide a window into the utility of the systems being compared. The stable number of participants confirms that it is a useful opportunity for participants to learn about representative tasks of analysts, improve their systems, and have fun along the way. After seven years of continuous activity, reviewers continue to volunteer. The use of the datasets and problems is expanding. The enduring central presence of the VAST Challenge at the conference, the sustained rate of participation and dataset downloads, and the participant survey results indicate continuing interest from researchers and practitioners and suggest that the Challenge has contributed to the overall evaluation of visual analytics systems. Evaluating accuracy remains the main challenge as we move toward more realistic scenarios and data.

Publications

Summary papers about the challenges

Scholtz, J., Whiting, M., Plaisant, C., Grinstein, G., A Reflection on Seven Years of the VAST Challenge, Proc. of BELIV 2012, BEyond time and errors: novel evaLuation methods for Information Visualization, a workshop of the IEEE VisWeek 2012 conference (2012) 1-8 (PDF)

Costello, L., Grinstein, G., Plaisant, C. and Scholtz, J., Advancing User-Centered Evaluation of Visual Analytic Environments through Contests, Information Visualization, 8 (2009) 230–238 (TR version) (Palgrave final version).

Plaisant, C., Fekete, J. D., Grinstein, G., Promoting Insight Based Evaluation of Visualizations: From Contest to Benchmark Repository, IEEE Transactions on Visualization and Computer Graphics, 14, 1 (2008) 120-134 (TR version) (Published version in IEEE DL).

Overview of each Challenge (the summaries found each year in the VAST proceedings)

2012:
Cook, K., Grinstein, G., Whiting, M., Cooper, M., Havig, P., Liggett K., Nebesh, B., Paul, C. L., VAST Challenge 2012: Visual Analytics for Big Data, Proceedings of IEEE VAST 2012. (PDF)

2011:
Grinstein, G., Cook, K., Havig, P., Liggett, K., Nebesh, B., Whiting, M., Whitley, K., Konecni, S., VAST 2011 Challenge: Cyber Security and Epidemic, Proceedings of IEEE VAST 2011, 301-304. (PDF)

2010:
Grinstein, G., Konecni, S., Plaisant, C., Scholtz, J., Whiting, M., VAST 2010 Challenge: Arms Dealings and Pandemics, Proc. of VAST 2010 Conference, IEEE (2010) 267-268 (PDF on IEEE DL)

2009:
Grinstein, G., Plaisant, C., Scholtz, J. and Whiting, M., VAST 2009 Challenge: An Insider Threat, Proc. of IEEE VAST 2009, 243-244 (PDF) (Published IEEE DL version)

2008:
Grinstein, G., Plaisant, C., Laskowski, S., O’Connell, T., Scholtz, J., Whiting, M., VAST 2008 Challenge: Introducing Mini-Challenges, Proc. of IEEE Symposium on Visual Analytics Science and Technology (2008) 195-196. (TR version) (Published IEEE DL version)

2007:
Grinstein, G., Plaisant, C., Laskowski, S., O'Connell, T., Scholtz, J., Whiting, M., VAST 2007 Contest - Blue Iguanodon, Proc. of the IEEE Symposium on Visual Analytics Science and Technology, VAST 2007, pp. 231-232 (Final version in IEEE DL)

2006:
Grinstein, G., O’Connell, T., Laskowski, S., Plaisant, C., Scholtz, J., Whiting, M., The VAST 2006 Contest: A tale of Alderwood, Proc. of IEEE Symposium on Visual Analytics Science and Technology (2006) 215-216 (PDF)

Other topics (i.e. evaluation methods in general, metrics, dataset development, use in education etc.)

Lam, H., Bertini, E., Isenberg, P., Plaisant, C., Carpendale, S., Seven Guiding Scenarios for Information Visualization Evaluation, IEEE Transactions on Visualization and Computer Graphics, 18, 9 (2012) 1520-1536.

Scholtz, J., Developing Qualitative Metrics for Visual Analytic Environments, Proceedings of BELIV '10 (Atlanta, USA), 1-7 (PDF)

Whiting, M., Generating a Synthetic Video Dataset, Proceedings of BELIV '10 (Atlanta, USA), 43-48 (PDF)

Whiting, M., North, C., Endert, A., Scholtz, J., Haack, J., Varley, C., Thomas, J., VAST Contest Dataset Use in Education, Proc. of VAST 2009 Symposium, 115-122 (PDF) (Published IEEE DL version)

Plaisant, C., Grinstein, G., Scholtz, J., Whiting, M., O'Connell, T., Laskowski, S., Chien, L., Tat, A., Wright, W., Gorg, C., Liu, Z., Parekh, N., Singhal, K., Stasko, J., Evaluating Visual Analytics: The 2007 Visual Analytics Science and Technology Symposium Contest, IEEE Computer Graphics and Applications 28, 2, March-April 2008, pp. 12-21 (2008) (final version in IEEE DL)

Whiting, M., Haack, J., Varley, C., Creating realistic, scenario-based synthetic data for test and evaluation of information analytics software, Proc. of BELIV’08, BEyond time and errors: novel evaLuation methods for Information Visualization, ACM (2008).(Published version)

Reddy, S., Desai, M., Plaisant, C., and Scholtz, J., A usage summary of the VAST Challenge Datasets (2010) (student project report)

Costello, L., Byrne, H., Ly, A. and Grinstein, G., A Survey of Contests in the Computer Science Community (2009)(draft of student project report)

Slides

Slides of a Panel discussion on "Research with impact" presented at the Visual Analytics Consortium meeting - May 2011, College Park.

Overview of the VAST Challenge, from the HCIL Visual Analytics Workshop, May 2009 (Plaisant)

Visual Analytics Evaluation from 2008 Summer Camp (Scholtz and Whiting)

Other relevant papers and materials

Sample IRB application for running a MILC study (Multi-Dimensional In-depth Long-term Case study)

Scholtz, J., Cook, K. A., Whiting, M., Lemon, D., Greenblatt, H., Visual analytics technology transition progress, to appear in Information Visualization (2009)

Plaisant, C., Laskowski, S., Evaluation Methodologies for Visual Analytics Section 6.1, in Thomas, J., Cook, K. (Eds.) Illuminating the Path, the Research and Development Agenda for Visual Analytics, IEEE Press, 150-157 (2005) (part of this book chapter)

Shneiderman, B., Plaisant, C., Strategies for Evaluating Information Visualization Tools: Multidimensional In-depth Long-term Case Studies, Proc. of BELIV’06, BEyond time and errors: novel evaLuation methods for Information Visualization, a workshop of the AVI 2006 International Working Conference, ACM (2006) 38-43 (TR version).

Plaisant, C., The Challenge of Information Visualization Evaluation, Proc. of Conf. on Advanced Visual Interfaces AVI'04 (2004), p.109-116. (TR version) ( Published version in ACM DL)

Papers from CG&A May/June 2009 (vol. 29 no. 3) Special Issue on Visual Analytics Evaluation (that we edited)

URL OF SPECIAL ISSUE IN IEEE Digital library

Intro to special issue on Visual-Analytics Evaluation
Catherine Plaisant, University of Maryland, Georges Grinstein, University of Massachusetts Lowell and Jean Scholtz, Pacific Northwest National Laboratory
pp. 16-17

Generating Synthetic Syndromic-Surveillance Data for Evaluating Visual-Analytics Techniques
Ross Maciejewski, Ryan Hafen, Stephen Rudolph, George Tebbetts, William S. Cleveland, David S. Ebert, Purdue University and Shaun J. Grannis, Indiana University
pp. 18-28

To Score or Not to Score? Tripling Insights for Participatory Design
Michael Smuc, Eva Mayr, Tim Lammarsch, Wolfgang Aigner, Silvia Miksch, Danube University Krems, and Johannes Gärtner, Ximes
pp. 29-38

Integrating Statistics and Visualization for Exploratory Power: From Long-Term Case Studies to Design Guidelines
Adam Perer, IBM and Ben Shneiderman, University of Maryland
pp. 39-51

Recovering Reasoning Processes from User Interactions
Wenwen Dou, Dong Hyun Jeong, Felesia Stukes, William Ribarsky, Heather Richter Lipford, Remco Chang, University of North Carolina, Charlotte
pp. 52-61

Acknowledgments

This material is based upon work supported by the National Science Foundation under a Collaborative Research Grant to the following three institutions:
- IIS-0713087 and 0947358 Plaisant, Catherine University of Maryland, College Park
- IIS-0712770 Scholtz, Jean Battelle Memorial Institute
- IIS-0713198 and 0947343 Grinstein, Georges University of Massachusetts, Lowell
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

We also thank the National Visualization and Analytics Center (NVAC), located at the Pacific Northwest National Laboratory in Richland, WA, for their contribution to the VAST Challenge. The Pacific Northwest National Laboratory is managed for the U.S. Department of Energy by Battelle Memorial Institute under Contract DE-AC05-76RL01830. We also wish to thank Jereme Haack, Carrie Varley and Cindy Henderson of Pacific Northwest National Laboratory; Andrew Canfield from Mercyhurst College; and the other students of the University of Massachusetts Lowell who worked to manage the VAST Challenge submission website, especially Shawn Konecni.

(last updated in December 2012)