IEEE VAST 2008 Challenge

FAQs and History of Changes

 

July 10

PLEASE include “ [CHALLENGE] ” in the subject of your email when you submit your materials. THANK YOU

July 9: Added several things: questions about mini-challenge 3 (phone calls), but also about videos, and faculty advisors

July 9:  Added Questions/Answers about mini challenge-3 (cell phone calls):

Q1. Do you only want a social network of the 5 Catalano/Vidro's (e.g.   Ferdinando, Estaban, etc) or do you want the social network of every caller in the network, whether having a known or unknown identity (e.g. all 400 nodes)?

 

A1. Remember that this corresponds to a real situation... 1) you only know who a few people might be but you are trying to make a hypothesis about what the social network including those people is, and 2) the activity of that social network is mixed with data from other people who are not or may not be in the social network of interest.   We are curious to see how your tool helps you analyze this data and make hypotheses about the extent of the social network of interest.  Then you report your best guess as to which nodes are in this network…

 

Q2: You say “at the end of the time period”.  Does this mean you want the network as it looked on the last day?  Or do you mean just using all of the information from Day 1 through Day 10?

 

A2: We ask that you analyze the changes over the whole period (day 1 to 10) but we also ask for a snapshot of what you think the Catalano/Vidro's social network might consist of at the end of the period, i.e. day 10.

July 9: Advice about videos

Videos are required for the Grand Challenge, and only recommended for the mini challenges.  We suggest that you save your video as Flash, as most people will be able to view it and the format has a good compression ratio. If you are not familiar with recording video demonstrations, take a look at Camtasia or BB Flashback: it’s not too late, and trial versions are available. Those tools will work very well for most systems (unless you have fast animation, in which case you would need an external video recorder to capture the rapid changes).  Always include audio explanations, as they help enormously. You might test your video by asking others (e.g. your friends) whether they can open and play it (a common mistake is to not provide the video codec). Avoid .AVI or other formats that generate huge files. Getting the right answer is important of course, but showing how the tool helped you investigate the question is just as important.

July 9: About the need for a faculty advisor for student teams

We ask that student teams provide the name and email of a faculty advisor because in the past it has happened that the student who was the main contact graduated, disappeared, changed their email address, etc. and could not be reached again.   Having the name of a faculty advisor increases the chances that we find someone to contact when needed (e.g. in August if we don’t get a camera-ready version when needed, or later to be sure someone will come to the symposium if you are to receive an award).  The faculty advisors do not have to be in the list of authors, e.g. if they didn’t work with you at all on the analysis or submission process, but you should of course ask for their authorization to use their name.

Remember also that to receive an award you have to come to the symposium to receive it… so discuss who might be able to go and where the travel money will come from.  In the past we were able to award a few free registrations to the most deserving teams (and gave priority to the student teams for that), but we do not yet know whether this will be possible this year.

July 7:  Added Question/Answer about Grand Challenge submission process

Q: if we are entering the Grand Challenge do we need to submit 5 zips, or can we submit one zip with an index with relative links to all of the 5 challenges?

 

A: If you are entering the grand challenge, you must have been able to deal with all the data types of the 4 mini-challenges, so you can either 1) enter the grand challenge only, and prepare and submit only one answer form and one zip file, or 2) enter all 5 challenges (more questions, 5 answer forms and 5 individual zips).  Your choice!  Option 1 is obviously a lot less work for you and will get your work known just as well, but option 2 gives you more space to make your individual tools shine, and to receive an award of some sort.

July 4:  Added general Question/Answer about size limitation

Q: For Short Answer questions (which have a limit of 150 words, 2 images), how would tables of data be counted? As an image? Or as a collection of words? What about captions for images? Do they count against the 150 word limit?


A: For short answers we will count a table of data as an image.  A caption is part of that image.  BUT: in many questions you are asked to first provide your answer (i.e. a list of something, a set of grid coordinates, etc.) then to provide the Short Answer explanation of how you arrived at that result.  So, if your table is the answer itself, it does not count against the word or image limit in the text of the Short Answer.   We much prefer to see screen shots that show clearly how your tool helped you find the answer. Also note that we won’t reject your submission because it has just a few extra words in a short answer… but if you go longer, judges WILL be annoyed! And we can ask you to correct it.

July 3:  Added a Question/Answer about mini-challenge 1: Wiki

Q: Could you clarify to what extent you would like us to define factions.  Currently, I'm assuming this is an overall evaluation of the article where we determine members for, against or indifferent to Paraiso. However, in reading around the internet the term faction could also apply to members who dispute over subtopics within the page. For example, if "health" is a subtopic then we would have to determine those for, against or indifferent to the particular health issue being discussed.


A:  Good question.  I would define factions in relation to the impact they had on my analytical points.  Factions who don't impact my points, or future points, don't really matter in the final analysis.   Unless my boss tells me otherwise.  Or new evidence pops up.

 By the way, we like your term "context organizers"...it's pretty accurate! 

 

 July 2:  Added a Question/Answer about mini-challenge 4: Traces

 

Q: For Question 2, it asks "Identify potential suspects and/or witnesses to the event" and "Note: Potential suspects and/or witnesses are people who were near the area just prior to the explosion and exhibit suspicious behavior."  When you ask "near the area" I am assuming you are talking about the area of the explosion? If so, does this mean we should only include people near the area of the explosion?

 

A: If you have a hypothesis about one or more individuals' movements during the incident that is revealed by the data and that you believe is relevant to the situation, you should clearly state the hypothesis and the support for your thinking in your submission. Your analysis is not limited to an arbitrary distance from the explosion location (but is limited by the data you have).

 June 25:  Camera ready deadline for the 2 page summaries has been given to us:  August 18. 

The two-page summaries of the best entries (which are awarded a Certificate of Excellence) will be published in the VAST 2008 Symposium Proceedings. Nevertheless, ALL submitted two-page summaries will be published online, along with your answers, in a repository at NIST (the National Institute of Standards and Technology), whether or not they are awarded a Certificate of Excellence.   Because the selection of the best entries is based primarily on the answers you provide, submitting a two-page summary is optional at the July 11 deadline.   Just make sure the main contact email you provide is active during the 1st half of August so you can submit the two-page summary if needed.

June 24: Added several questions and answers

Questions about the grand challenge

Question:  "How should we account for the strength of the relationship? For example, we know that the relationship between certain people is certain based on cell phone calls, but the relationship between others is not certain because it is only based on the sound of the last name"

 

Answer:  The strength of an answer should be reflected in the analysis write up (i.e. the detailed answers) as evidence supporting hypotheses.  If indicating strength of relationship is part of the tool someone is using, that's great, but there needs to be an explanation of why they think the relationship is stronger, and how they interpreted the strength (e.g. Is it that the relationship has a lot of evidence pointing to it?  Is it that this is a more important relationship, although it doesn't have as much evidence?)

 Assumptions about data should be clearly stated and explained.  "Strength" of associations should be clearly explained. 

 

Question about the social network files, i.e. common to both mini challenge-3 (cell phone calls) and Grand Challenge

Question: in your sample data the relationship between 58 and 186 is repeated. Would you like us to repeat the relationships in the same manner in both challenge 3 and the overall challenge?

 

Answer: We analyze a directed graph so we do want a bidirectional link to be listed both ways (a->b and b->a) if you think the relationship is symmetric.

If you are illustrating the phone calls (i.e. in the mini-challenge) and there have only been one way calls, there shouldn't be a bidirectional link; but if you are explaining your analysis (in the grand challenge) and you are illustrating a social connection, e.g., Jack knows Jill then it is bidirectional, since Jill knows Jack, too. 
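
As an illustration only, here is a minimal Python sketch of how a list of relationships could be expanded into directed links, with symmetric relationships listed both ways. The tuple layout, the function name and the one-way link to node 200 are assumptions made for this example; they are not part of the required file format (see the answer form for that).

```python
def to_directed_edges(relations):
    """Expand (a, b, symmetric) tuples into a list of directed (src, dst) links.

    The tuple layout is an assumption for this sketch, not the challenge format.
    """
    edges = []
    seen = set()
    for a, b, symmetric in relations:
        # A symmetric relationship is listed both ways: a->b and b->a.
        pairs = [(a, b), (b, a)] if symmetric else [(a, b)]
        for edge in pairs:
            if edge not in seen:  # avoid emitting the same directed link twice
                seen.add(edge)
                edges.append(edge)
    return edges


# 58 and 186 come from the question above; the link to 200 is made up.
relations = [(58, 186, True), (58, 200, False)]
edges = to_directed_edges(relations)
for src, dst in edges:
    print(f"{src} -> {dst}")
```

With this input the symmetric pair appears twice (58 -> 186 and 186 -> 58), while the one-way link appears once.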

 

Question about the mini challenge-3 (cell phone calls):

There are some cell towers (1, 15, 11, 22) marked with a yellow dot on the cell tower locations map. Do these towers have any special meaning in the context of the problem?

 

Answer: When we started our investigations of the Paraiso issue, there weren't too many of us familiar with the geography of the Island either.  It turns out the yellow dots represent major population centers (cities) of that country!  If towers in those areas have any special significance in this investigation, please link it to your hypotheses and support it through evidence supplied via the data. 

 

Two Questions about the mini-challenge-1 (Wiki Edits)

Question 1) Do the question marks at the beginning of some edit texts have any particular meaning? 

Answer: We passed this question on to our analysts working the Paraiso issue.  They replied, "This is an artifact created as the information has been passed among various text processors.  We're not exactly sure when these had been introduced, but the data is otherwise very clean for the most part.  We have high confidence that there is no steganography (hidden messages) or attempts at deception associated with the sporadic question marks."

 

Question 2) Do all the editors of the wiki page have to be associated with a particular group?

Answer: Not by default.   After analyzing the edits, it may turn out that they are all either pro or con to an issue, but you should let the data guide you on that determination.

 

June 20:  Removed the “draft” mention in the answer forms; they are now final.

       The only change since June 11 was to fix a broken link in the Grand Challenge answer form.

June 13: Added a question and answer about the mini challenge-3 (cell phone calls)

Question: The data seem to indicate that one person called three different people at the same time, something not possible without a conference line.  Should we assume that users can engage in conference calls?  Or should we disregard such overlaps - e.g. only accept the first line and assume that the others are in error?

 

Answer: We contacted the fictitious government agency from Isla Del Sueno from whom we obtained the data and asked them your question.  They told us that the telephone company from whom they obtained their data now claims that their data is absolutely accurate and any "so-called" problems with it must have been introduced after they gave it to the agency! So we have no resolution there.    We do know that cell-phone conference calls are possible on the Isla, and if that is an important factor in your analysis, please make a note of it and its significance to findings in your report.  Data should usually only be ignored at your own risk!  
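
If you want to flag such overlaps rather than discard them, something along the following lines may help. This is only a sketch in Python: it assumes each call is a (caller, callee, start, duration) tuple with the duration in seconds, and the sample calls and dates are invented for the example, not taken from the data set.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_overlaps(calls):
    """Return (caller, callee_1, callee_2) for calls by one caller that overlap in time.

    Each call is (caller, callee, start, duration_seconds); this layout is an
    assumption for the sketch, not the layout of the challenge file.
    """
    by_caller = defaultdict(list)
    for caller, callee, start, duration in calls:
        end = start + timedelta(seconds=duration)
        by_caller[caller].append((start, end, callee))

    overlaps = []
    for caller, items in by_caller.items():
        items.sort(key=lambda item: item[0])  # order each caller's calls by start time
        for i, (_, end_1, callee_1) in enumerate(items):
            for start_2, _, callee_2 in items[i + 1:]:
                if start_2 >= end_1:
                    break  # later calls start even later, so none can overlap
                overlaps.append((caller, callee_1, callee_2))
    return overlaps


# Invented sample: caller 1 starts two calls at the same moment.
t0 = datetime(2006, 6, 1, 9, 0)
calls = [
    (1, 2, t0, 600),
    (1, 3, t0, 300),
    (1, 4, t0 + timedelta(minutes=20), 60),
    (5, 1, t0, 100),
]
overlaps = find_overlaps(calls)
print(overlaps)
```

Whether a flagged pair is a conference call, an error, or something else is for your analysis to decide; the sketch only surfaces the candidates.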

June 11:  How to submit your entry was added, as well as some clarifications about the Two-Page Summary. 

We decided to make the Two-Page Summary optional to make it easier for you to participate, while allowing those who hope to have a small publication to have a chance at it. The two-page summaries of the most deserving entries (which will be awarded “certificates of excellence”) will be published in the VAST 2008 Symposium Proceedings.  If you have not submitted one, we will ask you for it then, only as needed!  In addition, the two-page summaries of ALL teams submitting an entry to any Challenge will be posted online in a repository at NIST (the National Institute of Standards and Technology) after the Symposium, along with all the submitted entry materials (i.e. your answers to the questions).

June 9: Questions and answers added for the mini-challenge-3 (Cell Phone Calls)

Question a) For the mini-challenge-3 (Cell Phone calls), does the answer to question 2 have to include a hypothesis about what the changes mean in the context of the problem scenario, or does it suffice to indicate where and how the network changes structurally?

Answer: Yes, in your Detailed Answer we will look for a hypothesis as to what the changes mean in the context of the problem.

Question b) Is the geographic data included in the mini-challenge-3 (call towers on the map), relevant to the mini challenge? Question 2 asks to characterize the change of the social structure, does this also include the geographic location change?

Answer:  Yes, the geographic data is relevant.

Question c) In the mini-challenge-3 description you give hints to identify 5 members of the Paraiso Movement.  Are we expected to identify other members when answering question 1?

Answer: No. In the mini-challenge we don’t expect you to identify more than 5 family members.  But note that we ask you to identify them at the end of the period.

 

June 4th:  Answer forms and criteria for judging have been posted (plus some very minor edits of the task descriptions)

May 22:   Detailed Task Descriptions for All Challenges have been posted

April 2, 2008: 1 FAQ added

Question:  In the file CellPhoneCallRecords.csv, there are some negative values in the field/variable “Duration”
I am not sure if negative “Duration” is possible, so, is this some error?
Should I delete these measurements from the data set?
Are all the other “Duration” measurements correct?
Or, should all “Duration” measurements be scaled upwards (e.g. by adding 146)?

Answer: Indeed, there are some negative values in that field. I've asked our source for this information about what might have created these values and what the challenge contestants should do with them. They report these values are "dirty data" in the dataset and that you should ignore them at this point in your investigation. If our collection management team is able to access some corrections from the Isla del Sueno telecoms, we will post them and notify you ASAP.  Carry on!
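
One simple way to set the negative values aside without losing them is sketched below in Python. The "Duration" field name comes from the question; the "From" and "To" columns and the three sample rows are assumptions made up for the example, so adapt the names to the actual header of CellPhoneCallRecords.csv.

```python
import csv
import io


def split_by_duration(rows, field="Duration"):
    """Split rows into (clean, dirty); dirty rows have a negative or unreadable duration."""
    clean, dirty = [], []
    for row in rows:
        try:
            value = float(row[field])
        except (KeyError, TypeError, ValueError):
            dirty.append(row)  # missing or non-numeric duration
            continue
        (dirty if value < 0 else clean).append(row)
    return clean, dirty


# Made-up rows standing in for the real file; with the real data you would
# pass csv.DictReader(open("CellPhoneCallRecords.csv", newline="")) instead.
sample = io.StringIO("From,To,Duration\n1,2,120\n3,4,-146\n5,6,30\n")
clean, dirty = split_by_duration(csv.DictReader(sample))
print(len(clean), "rows kept;", len(dirty), "rows set aside")
```

Keeping the dirty rows in a separate list, rather than deleting them, makes it easy to revisit them if corrections are posted later.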

March 20:  Data sets available.
Register to download.    (Includes both grand challenge data and mini-challenge data).

End of February:  We decided to encourage Authors to submit to the  CG&A Special Issue on Visual Analytics Evaluation. (deadline Sept 12, 2008) instead of negotiating an invited journal paper about the contest itself.  In 2007 the invited paper was nice but difficult to organize so preparing a special issue seemed a more valuable use of everybody’s time.

February 15, 2008: Sample data available
