Q1. Do you only want a social network of the 5 Catalano/Vidro's (e.g. Ferdinando, Estaban, etc) or do you want the social network of every caller in the network, whether having a known or unknown identity (e.g. all 400 nodes)?
A1. Remember that this corresponds to a real situation... 1) you only know who a few people might be but you are trying to make a hypothesis about what the social network including those people is, and 2) the activity of that social network is mixed with data from other people who are not or may not be in the social network of interest. We are curious to see how your tool helps you analyze this data and make hypotheses about the extent of the social network of interest. Then you report your best guess as to which nodes are in this network…
Q2: You say “at the end of the time period”. Does this mean you want the network as it looked on the last day? Or do you mean just using all of the information from Day 1 through Day 10?
A2: We ask that you analyze the changes over the whole period (day 1 to 10) but we also ask for a snapshot of what you think the Catalano/Vidro's social network might consist of at the end of the period, i.e. day 10.
Q: If we are entering the Grand Challenge, do we need to submit 5 zips, or can we submit one zip with an index with relative links to all of the 5 challenges?
A: If you are entering the grand challenge, you must have been able to deal with all the data types of the 4 mini-challenges, so you can either 1) enter the grand challenge only, prepare and submit only one answer form and one zip file, or 2) enter all 5 challenges (more questions, 5 answer forms and 5 individual zips). Your choice! Option 1 is obviously a lot less work for you and will get your work known just as well, but option 2 gives you more space to make your individual tools shine, and to receive an award of some sort.
Q: For Short Answer questions (which have a limit of 150 words, 2 images), how would tables of data be counted? As an image? Or as a collection of words? What about captions for images? Do they count against the 150 word limit?
A: For short answers we will count a table of data as an image. A caption is part of that image. BUT: in many questions you are asked to first provide your answer (i.e. a list of something, a set of grid coordinates, etc.) and then to provide the Short Answer explanation of how you arrived at that result. So, if your table is the answer itself, it does not count against the word or image limit in the text of the Short Answer. We much prefer to see the screen shots that show clearly how your tool helped you find the answer. Also note that we won’t reject your submission because it has just a few extra words in a short answer… but if you go longer judges WILL be annoyed! And they can ask you to correct it.
Q: Could you clarify to what extent you would like us to define factions. Currently, I'm assuming this is an overall evaluation of the article where we determine members for, against or indifferent to Paraiso. However, in reading around the internet the term faction could also apply to members who dispute over subtopics within the page. For example, if "health" is a subtopic then we would have to determine those for, against or indifferent to the particular health issue being discussed.
A: Good question. I would define factions in relation to the impact they had on my analytical points. Factions who don't impact my points, or future points, don't really matter in the final analysis. Unless my boss tells me otherwise. Or new evidence pops up.
By the way, we like your term "context organizers"...it's pretty accurate!
Q: For Question 2, it asks "Identify potential suspects and/or witnesses to the event" and "Note: Potential suspects and/or witnesses are people who were near the area just prior to the explosion and exhibit suspicious behavior." When you ask "near the area" I am assuming you are talking about the area of the explosion? If so, does this mean we should only include people near the area of the explosion?
A: If you have a hypothesis about one or more individuals' movements during the incident that is revealed by the data, and that you believe is relevant to the situation, you should clearly state what the hypothesis is and your support for your thinking in your submission. Your analysis is not limited to an arbitrary distance from the explosion location (but is limited by the data you have).
Questions about the grand challenge
Question: "How should we account for the strength of the relationship? For example, we know that the relationship between certain people is certain based on cell phone calls, but the relationship between others is not certain because it is only based on the sound of the last name"
Answer: The strength of an answer should be reflected in the analysis write up (i.e. the detailed answers) as evidence supporting hypotheses. If indicating strength of relationship is part of the tool someone is using, that's great, but there needs to be an explanation of why they think the relationship is stronger, and how they interpreted the strength (e.g. Is it that the relationship has a lot of evidence pointing to it? Is it that this is a more important relationship, although it doesn't have as much evidence?)
Assumptions about data should be clearly stated and explained. "Strength" of associations should be clearly explained.
Question about the social network files, i.e. common to both mini-challenge-3 (cell phone calls) and Grand Challenge
Question: in your sample data the relationship between 58 and 186 is repeated. Would you like us to repeat the relationships in the same manner in both challenge 3 and the overall challenge?
Answer: We analyze a directed graph so we do want a bidirectional link to be listed both ways (a->b and b->a) if you think the relationship is symmetric.
If you are illustrating the phone calls (i.e. in the mini-challenge) and there have only been one way calls, there shouldn't be a bidirectional link; but if you are explaining your analysis (in the grand challenge) and you are illustrating a social connection, e.g., Jack knows Jill then it is bidirectional, since Jill knows Jack, too.
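The listing rule above can be sketched in a few lines of Python. This is only an illustration, not the contest's required file format: the function name `to_directed_edges`, the `(a, b, symmetric)` tuple layout, and the node pair 12/40 are assumptions made up for the example (58 and 186 come from the sample data mentioned in the question).

```python
def to_directed_edges(relationships):
    """Expand relationships into a directed edge list.

    relationships: iterable of (a, b, symmetric) tuples (hypothetical layout).
    A symmetric relationship is listed both ways (a->b and b->a);
    a one-way relationship is listed once.
    """
    edges = []
    seen = set()
    for a, b, symmetric in relationships:
        candidates = [(a, b), (b, a)] if symmetric else [(a, b)]
        for edge in candidates:
            if edge not in seen:  # avoid listing the same direction twice
                seen.add(edge)
                edges.append(edge)
    return edges


# 58 <-> 186 is treated as a symmetric social tie; 12 -> 40 is a made-up
# one-way phone contact used only for illustration.
example = [(58, 186, True), (12, 40, False)]
for src, dst in to_directed_edges(example):
    print(f"{src} -> {dst}")
```

With this input, the symmetric pair appears twice (once per direction) and the one-way pair appears once.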
Question about the mini-challenge-3 (cell phone calls):
There are some cell towers (1, 15, 11, 22) marked with a yellow dot on the cell tower locations map. Do these towers have any special meaning in the context of the problem?
Answer: When we started our investigations of the Paraiso issue, there weren't too many of us familiar with the geography of the
Two Questions about the mini-challenge-1 (Wiki Edits)
Question 1) Do the question marks at the beginning of some edit texts have any particular meaning?
Answer: We passed this question on to our analysts working the Paraiso issue. They replied, "This is an artifact created as the information has been passed among various text processors. We're not exactly sure when these had been introduced, but the data is otherwise very clean for the most part. We have high confidence that there is no steganography (hidden messages) or attempts at deception associated with the sporadic question marks."
Question 2) Do all the editors of the wiki page have to be associated with a particular group?
Answer: Not by default. After analyzing the edits, it may turn out that they are all either pro or con to an issue, but you should let the data guide you on that determination.
Question: The data seem to indicate that one person called three different people at the same time, something not possible without a conference line. Should we assume that users can engage in conference calls? Or should we disregard such overlaps - e.g. only accept the first line and assume that the others are in error?
Answer: We contacted the fictitious government agency from Isla Del Sueno from whom we obtained the data and asked them your question. They told us that the telephone company from whom they obtained their data now claims that their data is absolutely accurate and any "so-called" problems with it must have been introduced after they gave it to the agency! So we have no resolution there. We do know that cell-phone conference calls are possible on the Isla, and if that is an important factor in your analysis, please make a note of it and its significance to findings in your report. Data should usually only be ignored at your own risk!
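If overlapping calls matter to your analysis, one way to flag them rather than discard them is sketched below. This is a minimal sketch under stated assumptions: the record layout `(caller, callee, start_seconds, duration_seconds)` and the function name `find_overlapping_calls` are hypothetical and are not taken from the contest data description.

```python
from collections import defaultdict


def find_overlapping_calls(calls):
    """Return (caller, callee_1, callee_2) for each pair of calls that the
    same caller has in progress at the same time.

    calls: list of (caller, callee, start_seconds, duration_seconds)
           tuples (an assumed layout, for illustration only).
    """
    by_caller = defaultdict(list)
    for caller, callee, start, duration in calls:
        by_caller[caller].append((start, start + duration, callee))

    overlaps = []
    for caller, records in by_caller.items():
        records.sort()  # order each caller's calls by start time
        for i, (start_i, end_i, callee_i) in enumerate(records):
            for start_j, end_j, callee_j in records[i + 1:]:
                if start_j >= end_i:
                    break  # sorted by start: no later call can overlap call i
                overlaps.append((caller, callee_i, callee_j))
    return overlaps


# Made-up records: caller 1 reaches callees 2, 3 and 4 in overlapping windows.
sample = [(1, 2, 0, 60), (1, 3, 0, 60), (1, 4, 30, 10), (5, 6, 0, 10)]
print(find_overlapping_calls(sample))
```

Flagged pairs can then be reported as possible conference calls, with a note on their significance, instead of being silently dropped.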
Question a) For the mini-challenge-3 (Cell Phone calls), does the answer to question 2 have to include a hypothesis about what the changes mean in the context of the problem scenario, or does it suffice to indicate where and how the network changes structurally?
Answer: Yes, in your Detailed Answer we will look for a hypothesis as to what the changes mean in the context of the problem.
Question b) Is the geographic data included in the mini-challenge-3 (call towers on the map), relevant to the mini challenge? Question 2 asks to characterize the change of the social structure, does this also include the geographic location change?
Answer: Yes, the geographic data is relevant.
Question c) In the mini-challenge-3 description you give hints to identify 5 members of the Paraiso Movement. Are we expected to identify other members when answering question 1?
Answer: No. In the mini-challenge we don’t expect you to identify more than 5 family members. But note that we ask you to identify them at the end of the period.
End of February: We decided to encourage authors to submit to the CG&A Special Issue on Visual Analytics Evaluation (deadline Sept 12, 2008) instead of negotiating an invited journal paper about the contest itself. In 2007 the invited paper was nice but difficult to organize, so preparing a special issue seemed a more valuable use of everybody’s time.