IEEE VAST 2008 Challenge

Detailed Task Descriptions for All Challenges

Last updated June 11 (to clarify the role of the optional two-page Summaries)




Questions?  Send email  to challengecommittee  AT




The 2008 VAST Challenge consists of a Grand Challenge and four mini-challenges described below.  Contestants can choose to work on one, some, or all of the challenges.  To successfully respond to the Grand Challenge, contestants must tie together all data sets with an overall scenario description using data elements from each of the four mini-challenges, but are not required to answer all the mini-challenge questions.


The datasets used for these challenges are synthetic:  that is, they are a blend of computer- and human-generated data.  All datasets, whether real or synthetic, have anomalies. Some anomalies may be significant, some may not.  Any anomalies reported should be supported by the proposed hypotheses.  For example, “all first names start with an ‘M’” may be interesting, but unless it is tied to the discussion of the situation, that anomaly has no place in your submission.


We have included all information necessary to form working hypotheses for the purpose of these challenges.  No external data is needed to successfully perform the analyses. Be aware that using additional non-provided data may skew an otherwise successful solution.


The descriptions below provide the details for all the mini-challenges and the Grand Challenge, the questions posed in each, and a description of what participants need to provide to answer each question.


Before starting

1)       Read the “Paraiso Manifesto” Wikipedia page and discussion provided in the dataset.  It gives background material useful for the Grand Challenge as well as the mini-challenges.

2)       Familiarize yourself with the answer forms.  For each challenge we provide an answer form html file template that you can save on your local computer and use to provide your answers either by adding text to the form itself or by linking to the separate files we ask you to provide.  

3)       Use the following definitions for the type of text explanations we request

There are 3 formats/sizes for providing answers to the questions:

Short Answer:

Short answers are only requested in the mini challenges.
A short answer is a text description focusing on how you arrived at the answer. It is limited to 150 words and a maximum of 2 screen shots.

Detailed Answer:

Detailed Answers are requested both in the mini challenges AND the Grand Challenge.
A Detailed Answer is a text description focusing on how you arrived at the answer, in much more detail than the Short Answer.
  - For mini challenges detailed answers are limited to 1000 words, with screen shots and captions for a maximum of 5 pages (when printed). Although five pages is the limit, it doesn’t have to be that long.  
  - For the Grand Challenge there is no size limit (but less than 10 pages is recommended).
Mini challenge detailed answers can also reference or include a video of a maximum length of 2 minutes. Videos are optional for mini challenges, but they are very helpful in showing your work and especially your tools’ interactive features.   

Detailed answers should provide the answer and describe the process used to arrive at it. Clearly describe what you saw in the displays that helped you formulate or prove your hypotheses. For example, do not just say “Fig 3 shows who was involved”; explain which visual (or non-visual) features and characteristics of the interface you used in your reasoning about the question.  Be specific: tell us HOW you can see who was involved. For example, don’t just say “we used advanced technique X to easily see who is involved”; instead, say something like “we suspected that Joe was involved because his name appeared in red when color was mapped to the number of overseas trips and he stood out as being outside his family cluster”.  Describe the process you used (how you started, what worked and what didn’t work) with estimates of the effort required. Clarify what was accomplished manually, automatically, or in between.

Make sure the screen shots are usable when printed in color (you can always link to the best resolution versions in the html document). Do not forget to include legends for the visual encodings of your screen shots, and captions describing what data is being shown and what filters have been applied in the static figures presented or discussed.  In other words, help us understand what we are looking at!  We hope that your screen shots will be readable when reduced in size or printed. If they are not, make sure readers can click on the reduced image to see the full resolution image.  If you provide a video, voice narration is essential, as is using the mouse to point at relevant elements on the screen.
See an example of good process description from the VAST 2007 contest. (Note:  The 2007 contest had different rules, so this example is more than 5 pages long and doesn’t contain the answers per se, but we hope you will find it useful as an example of process description.)


Debrief:

Debriefs are requested only in the Grand Challenge.
A debrief is a narrative of at most 2000 words describing your hypothesis about the situation at hand.  Include in your narrative the relationships of the various players.  If there are uncertainties, you can suggest possible next steps to clarify those uncertainties. In your debrief there is no need to describe the tools that were used or to discuss the process; instead, focus on convincing us that you UNDERSTAND the situation.
See an example of good debrief from the VAST 2007 contest.

Two-Page Summaries:


Two-page summaries are OPTIONAL.

The two-page summary should be formatted according to the general IEEE VGTC Guidelines

These summaries allow you to give a general overview of your tools, highlight significantly novel features, provide references to papers and other relevant work, and describe what the experience of working through the Challenge problem revealed about your tools. Only the two-page summaries of the best entries (which are awarded a Certificate of Excellence) will be published in the VAST 2008 Symposium Proceedings. Nevertheless, ALL submitted two-page summaries will be published online, along with your answers, in a repository at NIST (the National Institute of Standards and Technology), whether or not they are awarded a Certificate of Excellence.



The Four Mini-Challenges



Mini Challenge 1: Wiki Editors


The Paraiso movement is controversial and is having considerable social impact in a specific area of the world.  We have extracted a segment of the Paraiso (the movement) Wikipedia edits page.  Please note this is not the Paraiso Manifesto Wiki page, which is part of the background materials, but a different, related page.  Please use visual analytics to describe the social relationships of the editors (those who have edited/modified the Wikipedia page) as they are reflected in these files.


Use the Mini Challenge-1 answer form




Wiki-1: What are the factions represented in the edit pages and who are their members? In other words, describe the groups and their members based on their editing changes.



- a table with the names of the factions and the names of the individuals associated with them.  (a sample is provided)

- a Detailed Answer explaining how you arrived at the answer


Wiki-2:  Is the Paraiso movement involved in violent activities? 



- The answer YES/NO (select one)

- a list of evidence supporting the above answer, i.e. a list of wiki edits (a sample will be provided)

- a Short Answer explaining how you arrived at your answer



Mini Challenge 2:  Migrant Boats (geo-temporal analysis)


This data comprises records dealing with the mass movement of persons departing Isla Del Sueño on boats for the United States during 2005 - 2007.  This activity was precipitated by the Isla Del Sueño government crackdown on the Paraiso social and religious movement which had been gaining popularity there.  The dataset includes not only interdiction records collected by the United States Coast Guard (i.e. arrests at sea), but also information from other sources about illegal landings on shore.


Please note:  The U.S. has the same “wet foot, dry foot” policy for Isla Del Sueño migrants as it does for Cubans:  If a person is able to land on U.S. soil, he/she may qualify for expedited “legal permanent resident” status.   If caught in the waters between the two nations (i.e. during an interdiction) while attempting to enter the U.S., he/she will be summarily sent back to the island.


The migrant boat records include the following fields:


EncounterCoords:     Where the migrant boat was intercepted or where it landed, in LONG-LAT format

RecordType:             Interdiction (the Coast Guard intercepted them) or Landing (the boat made it ashore)

Passengers:              Number of passengers in the migrant boat

USCG_Vessel:           Name of the Coast Guard cutter involved  in the interdiction

EncounterDate:         Date when the boat was interdicted or landed

RecordNotes:            Any additional information; usually a list of passenger names.

NumDeaths:              Number of migrant deaths

LaunchCoords:         Where the boat left the Island, if known, in LONG-LAT format

VesselType:              Type of boat used by the migrants, i.e. a “Go Fast”, a Raft, or a Rustic vessel.


Please use visual analytics to explain migration patterns during these years.


Use the Mini Challenge-2 answer form




Boat-1 Characterize the choice of landing sites and their evolution over the three years.

Provide a Detailed Answer


Boat-2  Characterize the geographical patterns of interdiction over the three years

Provide a Short Answer


Boat-3  What is the successful landing rate over the time period?

Provide a Short Answer
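
As an illustration of what Boat-3 asks for, the following is a minimal sketch of one way a landing rate could be computed from the RecordType and Passengers fields. The sample rows are hypothetical and are not taken from the dataset; reducing EncounterDate to a year and reporting the rate both per boat and per passenger are assumptions, since the task description does not define "rate" more precisely.

```python
from collections import defaultdict

# Hypothetical sample rows: (year of EncounterDate, RecordType, Passengers).
records = [
    (2005, "Interdiction", 12),
    (2005, "Landing", 8),
    (2006, "Landing", 20),
    (2006, "Interdiction", 10),
    (2006, "Landing", 15),
]


def landing_rates(rows):
    """Return {year: (share of boats that landed, share of passengers that landed)}."""
    boats = defaultdict(lambda: [0, 0])   # [landed boats, all boats]
    people = defaultdict(lambda: [0, 0])  # [landed passengers, all passengers]
    for year, record_type, passengers in rows:
        landed = record_type == "Landing"
        boats[year][0] += 1 if landed else 0
        boats[year][1] += 1
        people[year][0] += passengers if landed else 0
        people[year][1] += passengers
    # Assumes every year has at least one passenger, as in the sample rows.
    return {
        year: (boats[year][0] / boats[year][1], people[year][0] / people[year][1])
        for year in boats
    }


rates = landing_rates(records)
for year in sorted(rates):
    by_boat, by_passenger = rates[year]
    print(year, round(by_boat, 2), round(by_passenger, 2))
```

The two rates can differ noticeably when boat sizes vary, so a submission would need to state which definition it uses.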



Mini Challenge 3:  Cell Phone Calls  (social network)


A set of cell phone call records from Isla Del Sueño over a ten-day period in June 2006 was narrowed down to about 400 unique cell phones during this period.  The records provide critical information about the Catalano social network structure.  Please use visual analytic approaches to characterize the social network structure.


The data set includes records with the following fields:


From:               Identifier for the calling phone

To:                   Identifier for the receiving phone

Datetime:         Date and time of the call in the format: yyyymmdd hhmm

Duration:          Duration of the call in seconds

Cell Tower:      Location of the call origination cell tower.


We have included a map of Isla del Sueño with approximate coverage of the towers in a gridded form.  However, mapping of the cell phone tower locations is not precise.


As many of the cell phone records were registered with false names, we have provided numerical identifiers for callers and receivers.  


We have medium confidence that Ferdinando Catalano is identifier 200.  Close relatives and associates he would be calling would include David Vidro, Juan Vidro, Jorge Vidro, and Estaban Catalano.  We believe Ferdinando would call brother Estaban most frequently.  We also believe that David Vidro coordinates high-level Paraiso activities and communications. 
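
The call-record fields described above lend themselves to a simple frequency count. The sketch below parses the Datetime field using the stated yyyymmdd hhmm format and counts the calls placed by one identifier. The sample rows, the receiver identifiers 5 and 1, and the tuple layout are hypothetical; they do not reflect the actual data or imply who any identifier belongs to.

```python
from collections import Counter
from datetime import datetime

# Hypothetical rows in the field order From, To, Datetime, Duration, Cell Tower.
calls = [
    ("200", "5", "20060601 0810", "120", "30"),
    ("200", "5", "20060601 2130", "45", "30"),
    ("200", "1", "20060602 0905", "300", "12"),
    ("5", "200", "20060603 1000", "60", "7"),
]


def parse_call(row):
    """Convert one raw record into typed values."""
    caller, callee, stamp, duration, tower = row
    return {
        "from": int(caller),
        "to": int(callee),
        "when": datetime.strptime(stamp, "%Y%m%d %H%M"),  # yyyymmdd hhmm
        "duration": int(duration),                         # seconds
        "tower": tower,
    }


def outgoing_counts(rows, caller_id):
    """Count how many calls caller_id placed to each receiving identifier."""
    parsed = (parse_call(row) for row in rows)
    return Counter(call["to"] for call in parsed if call["from"] == caller_id)


counts = outgoing_counts(calls, 200)
most_called, n_calls = counts.most_common(1)[0]
print(most_called, n_calls)
```

A count like this only suggests candidates. Because the identification of 200 carries medium confidence, any mapping from identifiers to names should be checked against other evidence such as call times and tower locations.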


Using the Mini Challenge-3 answer form,




      Phone-1: What is the Catalano/Vidro social network, as reflected in the cell phone call data, at the end of the time period? 


Provide the network node and link files (using this format)


     Phone-2: Characterize the changes in the Catalano/Vidro social structure over the ten-day period.


Provide a Detailed Answer


Mini Challenge 4:  Evacuation Traces


In August 2007, a small improvised explosive device was set off at a Miami, Florida Department of Health (DOH) building, resulting in casualties and moderate damage.  This DOH branch has recently been involved in conflicts with local Paraiso religious groups over the attempts by DOH to provide medical care to Paraiso children in public schools and other public facilities.  The Paraiso religion is a growing movement in Florida and other southern states.  The Paraiso movement is steadfastly opposed to state-administered health care, insisting all health care must be provided at home by the family. 


It is suspected that Paraiso supporters were involved in the DOH bombing.  Paraiso leaders have publicly denied this.  Please use visual analytics to investigate the incident.


Fortunately, the DOH building involved in the bombing was being used as a test facility for RFID (radio-frequency identification).  All employees and visitors to the building wore badges enabling their location to be recorded during the time of the incident. Although the tracking data is not sophisticated, it may help the investigation.  The data you will analyze contains records with the following fields:


Time:               Ticks, representing intervals between tag readings

Person Id:        Tag identification of all employees and visitors

Xcor:                The location x-coordinate

Ycor:                The location y-coordinate


The coordinates are mapped to a 91x61 grid space.  The building itself is represented by a binary datafile against the 91x61 grid:  “1”s represent solid space (walls, closed doors, etc) and “0”s represent open space.  All movement occurs within this grid space. 
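
A minimal sketch of reading such a floor plan and testing whether a cell is open space follows. The file layout is an assumption: one text line per grid row, with Xcor indexing columns and Ycor indexing rows from zero. The actual datafile may be organized differently, and the tiny 5x3 plan here is a hypothetical stand-in for the 91x61 grid.

```python
def parse_grid(text):
    """Parse lines of '0'/'1' characters into a list of rows of ints.

    Characters other than '0' and '1' (for example separators) are ignored.
    """
    return [
        [int(ch) for ch in line if ch in "01"]
        for line in text.splitlines()
        if line.strip()
    ]


def is_open(grid, x, y):
    """Return True if cell (x, y) is inside the grid and is open space (0)."""
    return 0 <= y < len(grid) and 0 <= x < len(grid[y]) and grid[y][x] == 0


# Hypothetical floor plan: a single corridor enclosed by walls.
sample = """\
11111
10001
11111
"""

grid = parse_grid(sample)
print(is_open(grid, 1, 1))  # corridor cell
print(is_open(grid, 0, 0))  # wall cell
```

A check like this can help flag tag readings that fall on solid cells, which may indicate a mismatch in the assumed coordinate orientation.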


The file includes data before and throughout the incident.  It is expected a visual presentation will greatly clarify what happened during the event.  Your visual analytic investigation will help police follow up on this incident.   


Use the Mini Challenge-4 answer form




Traces-1  Where was the device set off?



        -   the grid cell number of where the device went off

        -   a Short Answer.


Traces-2  Identify potential suspects and/or witnesses to the event.

Note: Potential suspects and/or witnesses are people who were near the area just prior to the explosion and exhibited suspicious behavior.



              -  In the answer form provide a list of RFID tag numbers (sample provided)

              -  a Short Answer explaining how you arrived at your answer
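
The "near the area just prior to the explosion" part of Traces-2 can be sketched as a filter over the Time, Person Id, Xcor and Ycor fields. Everything specific below is a placeholder: the trace rows, the event location (6, 5), the tick window and the radius are hypothetical, and measuring distance in grid cells (Chebyshev distance) is an assumption rather than a rule from the task.

```python
# Hypothetical trace rows: (Time, Person Id, Xcor, Ycor).
traces = [
    (10, 1, 5, 5),
    (10, 2, 40, 30),
    (11, 1, 6, 5),
    (11, 3, 7, 6),
    (12, 3, 20, 20),
]


def tags_near(rows, x, y, radius, first_tick, last_tick):
    """Return sorted tag ids seen within `radius` cells of (x, y) in the tick window."""
    near = set()
    for tick, tag, tx, ty in rows:
        in_window = first_tick <= tick <= last_tick
        # Chebyshev distance: number of grid steps when diagonal moves are allowed.
        if in_window and max(abs(tx - x), abs(ty - y)) <= radius:
            near.add(tag)
    return sorted(near)


# Placeholder event location and window; neither comes from the dataset.
nearby = tags_near(traces, 6, 5, radius=2, first_tick=10, last_tick=11)
print(nearby)
```

Proximity alone does not establish suspicious behavior; it only narrows the set of tags whose movement is worth inspecting visually.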



Traces-3  Identify any suspects and/or witnesses who managed to escape the building.



              -  a list of RFID tag numbers

              -  a Short Answer


Traces-4  Identify any casualties.



        -  a list of RFID tag numbers     

        -  a Short Answer


Traces-5  Describe the evacuation


        Provide a Detailed Answer



Grand Challenge



For the Grand Challenge, you will need to combine data from the four mini-challenges and the introductory Wiki pages to provide an analysis of the activities of the Paraiso movement.   Assumptions must be supported by evidence; specify the dataset in which you found that evidence.


The last question, Grand-4, is the most general, and your detailed answer to it should explain how you arrived at the answers to all the previous smaller questions (Grand-1, 2 and 3).  Because of the complexity of the overall problem, we anticipate that the answers cannot easily be written separately.


Use the Grand Challenge answer form




Grand-1. Based on ALL the data available (i.e. using the data from all four mini-challenges), what is the social network of the Paraiso movement at the end of the time period?

Provide the network node and link files describing the social network (using this format)


Grand-2.  What name or names can be associated with individual activities?


Provide a list of activities and the names of people associated with them (a sample is provided in the answer form) 


Grand-3. What is the geographical range of the Paraiso movement, and how does it change over time?

      No specific data to provide (as explained above, the answer to that question and how you arrived at it should be combined with the answer to the next question which is more general.)


Grand-4.  How do the major beliefs of the Paraiso movement affect their activities?

- a Debrief
- a Detailed Answer   (for this Grand Challenge there are no page limits  - but we recommend you use less than 10 pages)
- a Video
For the Grand Challenge you MUST provide a video of at most 10 minutes showing how you used your system to assess the situation.  The purpose of the video is to help the judges understand the interactive features of your tools.  Videos should have well-synchronized audio commentary.  We are especially interested in seeing the different interactions that resulted in views providing more insight into the analysis.  Use the video to show different interactions with different visualizations.  Focus on those interactions that were most useful to you.  The video does not need to cover the entire process, since that is covered in the detailed answer.


