DATA SET
The dataset consists of:
· about 1500 news stories from a fictitious newspaper, plus a few other items collected by the previous investigators
· about 150 blog entries
· a few pictures (in JPG format)
· a few small databases (in XLS and CSV format)
· a few pages of background information (in DOC or PDF format).
The dataset contains fictitious information and was created
for testing and evaluation of visual analytic tools only. No part of this
dataset should be taken as real.
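To give you a feel for working with this mix of formats, here is a minimal Python sketch of one way a team might ingest the collection. All directory names and file layouts below are assumptions for illustration, not the actual structure of the data set you will download:

    # Hypothetical sketch: load the mixed-format data set into memory.
    # The directory layout ("news/", "blogs/") is an assumption; adjust
    # the paths to match the data set you actually download.
    import csv
    from pathlib import Path

    data_dir = Path("vast2007_data")  # assumed root of the unpacked data set

    # News stories and blog entries are plain text files.
    stories = {p.name: p.read_text(errors="replace")
               for p in data_dir.glob("news/*.txt")}
    blogs = {p.name: p.read_text(errors="replace")
             for p in data_dir.glob("blogs/*.txt")}

    # The small CSV databases can be read with the standard library.
    tables = {}
    for p in data_dir.glob("**/*.csv"):
        with p.open(newline="") as f:
            tables[p.name] = list(csv.DictReader(f))

    print(f"{len(stories)} stories, {len(blogs)} blog entries, {len(tables)} tables")

The XLS spreadsheets and JPG pictures require third-party libraries (for example, xlrd or Pillow), so they are omitted from this sketch.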
Here is the direct link to the readme file of the 2007 preprocessed data set, to give you an idea of what preprocessing we provide. This will help you decide which level of the contest your team should enter in 2007. We used MITRE's Alembic for the entity extraction process, with some modifications and hand corrections. See http://www.mitre.org/tech/alembic-workbench for details on Alembic.
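As a rough illustration only: MUC-style entity extractors such as Alembic conventionally mark up named entities with SGML tags like ENAMEX and TIMEX. The exact tags and attributes below are assumptions based on those conventions, not the documented output of our preprocessing (see the readme file mentioned above for the actual format), but a sketch like this shows how such annotations can be pulled out with a few lines of Python:

    # Hypothetical sketch: extract MUC-style entity annotations from a story.
    # The tag format below is an assumption, not the contest's actual format.
    import re

    sample = ('<ENAMEX TYPE="PERSON">Jane Smith</ENAMEX> visited '
              '<ENAMEX TYPE="LOCATION">Miami</ENAMEX> on '
              '<TIMEX TYPE="DATE">October 4, 2004</TIMEX>.')

    pattern = re.compile(r'<(ENAMEX|TIMEX|NUMEX) TYPE="([^"]+)">([^<]+)</\1>')
    entities = [(m.group(2), m.group(3)) for m in pattern.finditer(sample)]
    print(entities)
    # [('PERSON', 'Jane Smith'), ('LOCATION', 'Miami'), ('DATE', 'October 4, 2004')]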
In 2007 you have to choose one version or the other (raw or preprocessed). We will keep track of who downloads which version, but of course we rely on the honor system, and you should report accurately which data set you used. We will evaluate the two categories separately but use similar criteria.
Note that the 2006 data set is still available, with solutions, from the VAST 2006 contest web site. This allows you to practice for the 2007 contest, as we now provide preprocessed data for the 2006 data set as well. Just register as usual and download the 2006 data set; you will find the preprocessed data together with the raw data.
Scenario
It is the fall of 2004, and one of your analyst colleagues has been called away from her current tasks to an emergency. The boss has given you the assignment of picking up her investigation and completing her task. She had been asked to pursue a line of investigation into some unexpected activities concerning wildlife law enforcement, endangered species issues, and ecoterrorism. This isn't exactly your specialty area, but your boss believes you are one of the few people who could get to the bottom of whatever is going on. In fact, you would have been given this investigation if you hadn't been busy on another assignment when your colleague started.
Your colleague hasn’t gotten very far, but she has assembled
all the data relevant to this case. It
is a mixed assortment of information: text, images, numbers. The agency you work for is very accommodating
-- you may use any analytical tool you need to help your investigations.
You do know a few things coming into this effort. First, you were instrumental in cracking
those investigations from last year, so issues about mad cow or
Your Task
Find the plots and subplots. A scenario should emerge once you have put together the pieces of this puzzle. Your boss is interested in knowing the who, what, where, when, how, and why of the story, and how they are connected.
Key Questions to Be Answered
What is the situation in this scenario, and what is your assessment of it? What are your recommendations about possible next steps in the investigation? (Note that a situation may have multiple plots and subplots.)
We provide a standard form for you to tell us:
1. Who are the players relevant to the plot(s) and subplot(s)?
2. What is the time frame in which this situation unfolded, and what events occurring during this time frame are relevant to the plot(s)?
3. What locations were relevant to the plot(s)?
4. What is your assessment of the situation (in the form of a debriefing)?
5. Explain the process you used to arrive at the assessment above (using descriptions of the tools, screen shots, video, etc.).
Remember, the goal is to answer the main question:
What is the situation and what is your assessment of the
situation, including possible next steps?
Definitions
Events are things that occur in a short, discrete time
frame.
Activities occur over a much longer span of time.
For example, graduating from school would be an event; going to graduate school would be an activity.
Some Advice
Submissions will be reviewed by external judges with content
and visual analytics expertise.
We are using a realistic but synthetic dataset for which
“ground truth” is known, which permits the use of quantitative metrics for
evaluation.
Partial answers will be considered. For example, even if your tool can only deal with some of the questions, we encourage you to submit. Your submission may well be the one that does the best job at that particular task and be recognized as such by the contest judges. Of course, submissions that answer all tasks have a better chance at an overall first prize, but the judges may create special prizes for outstanding partial entries.
Judging will
be based on:
1. The correctness of answers to the questions
and the evidence provided. Participants
will be given points for correct answers and penalized for incorrect
answers. Quantitative and qualitative
assessments of the answers and your description of the situation will make up a
“correctness” score.
2. Subjective assessment of the quality of the displays, interactions, and support for the analytical process. This assessment will be based on your visuals and description of the analytic process, including the video. Note that during these assessments the judges will not be able to ask you questions, so the clarity of the explanations you provide is critical. The judges cannot correctly assess something they don't understand.
Correctness of the answers:
Participants are required to answer the questions (who, what, and where) on an answer form (we provide a template). For each question you are able to answer, you must identify the most relevant documents or other materials from the dataset that you used to obtain your answer.
In the debriefing section of the answer form you are asked to describe the plot(s) and subplot(s): how people, motivations, activities, and locations are part of the plot, how they are related, and any uncertainties or information gaps that exist. Your debriefing will be assessed on accuracy and clarity: were you able to accurately identify the plot(s), subplot(s), and relationships, and are you able to clearly describe the situation?
Subjective Assessment:
Based on the written descriptions, screen captures, and video you provide, and the insights you report gathering from those displays, the judges will be asked to give a subjective assessment rating according to the following criteria:
Primary criteria (the basis for the main score):
· Utility of the interface components and visualizations, based on the specific INSIGHTS reported in your descriptions (you will be judged on how well your tool(s) supported identification of the scenario, and how much of the identification was supported by your tool(s))
· Quality of the static representations, based mostly on the screen prints (meaningful layout, good use of color or icons, good labeling, saliency of information, etc.)
· Quality of the interaction, based on the descriptions and the video
Secondary criteria (criteria that are also very important and will be used this year to award bonus points):
· Scalability (i.e., are some aspects of the analysis automated? Do the results of the automation seem understandable? Are there mechanisms to guide your use of the visualizations?)
· Versatility (i.e., the variety of data types that can be handled)
· Handling of missing data and uncertainty
· Support for collaboration
· Learnability (note that the clarity of the explanations will have a strong impact here)
· Other features, such as a history mechanism, ease of importing and exporting data, innovative features in general, etc.
Acknowledgments
The dataset was prepared with the assistance of:
TSG team: Jereme Haack, Carrie Varley, Wendy Cowley, Doug Love, Stephen Tratz
UPA team: Alex Gibson, Nick Cramer
NVAC: Jim Thomas, Richard May
Testing: Larry Becker Jr., Dave McColgin
Advisors: Cindy Henderson
The evaluation process and the metrics were developed by the contest chairs and committee members.
Questions? Email the Contest Chairs.