IEEE VAST 2007 Contest
Information about the data set, task, and judging

 

Jump down to:  Scenario - Your task - Key questions to be answered - Judging - Scoring criteria

 

DATA SET

The dataset consists of:

·         about 1500 news stories from a fictitious newspaper, plus a few other items collected by the previous investigators

·         about 150 blog entries

·         a few pictures (in jpg format)

·         a few small databases (in XLS and CSV format)

·         a few pages of background information (in DOC or PDF format).

 

 

The dataset contains fictitious information and was created for testing and evaluation of visual analytic tools only. No part of this dataset should be taken as real.
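Because the raw data mixes several file formats, a practical first step is simply to inventory what is there and route each format to an appropriate tool. Below is a minimal Python sketch of such an inventory; the directory name VAST2007_dataset is a placeholder for wherever you unpack the download.

    from collections import defaultdict
    from pathlib import Path

    # Group every file in the (hypothetical) dataset directory by extension,
    # so text, images, spreadsheets, and documents can each be routed to the
    # right tool.
    by_type = defaultdict(list)
    for path in Path("VAST2007_dataset").rglob("*"):
        if path.is_file():
            by_type[path.suffix.lower()].append(path)

    for ext, files in sorted(by_type.items()):
        print(f"{ext or '(no extension)'}: {len(files)} file(s)")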

Two forms: RAW and PREPROCESSED

New this year, we provide a pre-processed version of the dataset (e.g., with entities already extracted from the text) so that more teams can participate. Your team will need to decide whether to enter the contest using the preprocessed data or the raw data.


Here is the direct link to the readme file of the 2007 preprocessed data set, to give you an idea of what preprocessing we provide. This will help you decide which level of the contest your team should enter in 2007. We used MITRE’s Alembic for the entity extraction process, with some modifications and manual cleanup. See http://www.mitre.org/tech/alembic-workbench for details on Alembic.
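The readme describes the exact markup used. As a rough guide, Alembic’s output traditionally follows the MUC convention of inline SGML entity tags such as <ENAMEX TYPE="PERSON">...</ENAMEX>. The sketch below pulls such tags out of a preprocessed story, assuming that convention holds; the tag names and the file name here are assumptions, so verify against the readme.

    import re

    # MUC-style inline entity tags (ENAMEX/TIMEX/NUMEX) -- an assumption
    # about the preprocessed format; check the readme for the real markup.
    ENTITY_TAG = re.compile(
        r'<(ENAMEX|TIMEX|NUMEX)\s+TYPE="([^"]+)">(.*?)</\1>', re.DOTALL)

    def extract_entities(text):
        # Return (tag, entity type, surface string) triples found in the text.
        return [(m.group(1), m.group(2), m.group(3))
                for m in ENTITY_TAG.finditer(text)]

    # "story_0001.txt" is a hypothetical file name; substitute any story
    # from the preprocessed download.
    with open("story_0001.txt", encoding="utf-8") as f:
        for tag, etype, surface in extract_entities(f.read()):
            print(tag, etype, surface)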

In 2007 you must choose one or the other (raw or preprocessed).  We will keep track of who downloads which version, but of course we rely on the honor system, and you should report correctly which data set you used.  We will evaluate the two categories separately but use similar criteria.

Note that the 2006 data set is still available, with solutions, from the VAST 2006 contest web site. This allows you to practice for the 2007 contest, as we now provide preprocessed data for the 2006 data set as well. Just register as usual and download the 2006 data set; you will find the preprocessed data together with the raw data.

Scenario

 

It is the fall of 2004, and one of your analyst colleagues has been called away from her current tasks to deal with an emergency.  The boss has given you the assignment of picking up her investigation and completing her task.  She had been asked to pursue a line of investigation into some unexpected activities concerning wildlife law enforcement, endangered species issues, and ecoterrorism.  This isn’t exactly your specialty area, but your boss believes you are one of the few people who could get to the bottom of whatever is going on. In fact, you would have been given this investigation if you hadn’t been busy on another assignment when your colleague started it.

 

Your colleague hasn’t gotten very far, but she has assembled all the data relevant to this case.  It is a mixed assortment of information: text, images, and numbers.  The agency you work for is very accommodating -- you may use any analytical tool you need to support your investigation.

 

You do know a few things coming into this effort.  First, you were instrumental in cracking those investigations from last year, so issues about mad cow disease or Alderwood, Washington, are not part of this.  Also, you know a little about ecoterrorism and animal rights groups, so, for example, the activities of People for the Ethical Treatment of Animals (PETA) and the Earth Liberation Front (ELF) are not of interest, unless they happen to be tied to some larger or more pervasive plot.

Your task

 

Find the plots and subplots. A scenario should emerge as you piece together this puzzle.  Your boss is interested in knowing the who, what, where, when, how, and why of the story, and how they are all connected.

Key questions to be answered


What is the situation in this scenario and what is your assessment of the situation? What are your recommendations about possible next steps in the investigation?   (Note that a situation may have multiple plots and subplots.)

 

We provide a standard form for you to tell us:

1.  Who are the players relevant to the plot(s) and subplots?

2.  What is the time frame in which this situation unfolded and what events occurring during this time frame are relevant to the plot(s)?

3.  What locations were relevant to the plot(s)?

4.  What is your assessment of the situation (in the form of a debriefing)?

5.  Explain the process you used to arrive at the assessment above (using descriptions of the tools, screen shots, video, etc.)

 

Remember, the goal is to answer the main question:

What is the situation and what is your assessment of the situation, including possible next steps?

 

Definitions

Events are things that occur in a short, discrete time frame.

Activities occur over a much longer span of time. 

For example, graduating from school would be an event.  Going to graduate school is an activity. 

 

Some Advice

  • The dataset development team tries very hard to create a scenario and dataset that are believable, interesting, and strongly tied to reality. However, creating a synthetic dataset involves an element of “let’s pretend”.  When you are analyzing the dataset, be flexible and go with the scenario.  Think of the dataset as you would a mystery novel.  We know there was no widely famous detective named Hercule Poirot working in England in the 1920s, yet if we suspend our disbelief for a while, his stories are enjoyable and sometimes educational!
  • If you have questions about the dataset or analysis, ask them!
  • Not all data are equally important.  In fact, some data may be red herrings.  Consider all information very carefully as you evaluate your hypotheses.

Judging

 

Submissions will be reviewed by external judges with content and visual analytics expertise.

 

We are using a realistic but synthetic dataset for which “ground truth” is known, which permits the use of quantitative metrics for evaluation. 

Partial answers will be considered.  For example, even if your tool can only deal with some of the questions, we encourage you to submit.  Your submission may well be the one that does the best job at that particular task and be recognized as such by the contest judges.  Of course, submissions that answer all tasks have a better chance at an overall first prize, but the judges will have the option to create special prizes for outstanding partial entries.

Scoring criteria

 

Judging will be based on:

1.       The correctness of your answers to the questions and the evidence provided.  Participants will be given points for correct answers and penalized for incorrect ones.  Quantitative and qualitative assessments of your answers and your description of the situation will make up a “correctness” score.

2.       Subjective assessment of the quality of the displays, interactions, and support for the analytical process.  This assessment will be based on your visuals and your description of the analytic process, including the video.  Note that during these assessments the judges will not be able to ask you questions, so the clarity of the explanations you provide is critical. The judges cannot correctly assess something they do not understand.

 

Correctness of the answers:

 

Participants are required to answer the questions (who, what, and where) on an answer form (we provide a template). For each question you are able to answer, you must identify the most relevant documents or other materials from the dataset used to obtain your answer.

In the debriefing section of the answer form you are asked to describe the plot(s) and subplot(s): how people, motivations, activities, and locations are part of the plot, how they relate to one another, and any uncertainties or information gaps that exist. Your debriefing will be assessed on accuracy and clarity.  Were you able to accurately identify the plot(s), subplot(s), and relationships, and are you able to clearly describe the situation?

 

Subjective Assessment:

 

Based on the written descriptions, screen captures, and video you provide, and the insights you report gathering from those displays, the judges will be asked to give a subjective assessment rating based on the following criteria:

 

Primary criteria (the basis for the main score): 

·         Utility of the interface components and visualizations - based on the specific INSIGHTS reported in your descriptions (you will be judged on how well your tool(s) supported identification of the scenario, and on how much of that identification was supported by the tool(s))

·         Quality of the static representations - based mostly on the screen shots (meaningful layout, good use of color or icons, good labeling, saliency of information, etc.)

·         Quality of the interaction - based on the descriptions and the video

 

Secondary criteria (criteria that are also very important; this year they will be used to award bonus points):

·         Scalability (i.e. are some aspects of the analysis automated?  Do results of the automation seem understandable?  Are there mechanisms to guide your use of the visualizations?)

·         Versatility (i.e. variety of data types which can be handled)

·         Handling of missing data and uncertainty

·         Support for collaboration

·         Learnability (note that the clarity of the explanations will have a strong impact here)

·         Other features, such as a history mechanism, ease of importing and exporting data, innovative features in general, etc.

 

 

 

Acknowledgments

The dataset was prepared with the assistance of:

TSG team:  Jereme Haack, Carrie Varley, Wendy Cowley, Doug Love, Stephen Tratz

UPA team:  Alex Gibson, Nick Cramer

NVAC:  Jim Thomas, Richard May

Testing:  Larry Becker Jr., Dave McColgin

Advisors:  Cindy Henderson

 

The evaluation process and the metrics were developed by the contest chairs and committee members.

 

Questions?   Email the Contest Chairs

 

Return to VAST 2007 Contest page