Q1. Do you only want a social network of
the 5 Catalano/Vidro's (e.g.
Ferdinando, Estaban, etc) or do you want the social network of every
caller in the network, whether having a known or unknown identity (e.g. all 400
nodes)?
A1. Remember that this
corresponds to a real situation... 1) you only know who a few people might be
but you are trying to make a hypothesis about what the social network
including those people is, and 2) the
activity of that social network is mixed with data from other people who are
not or may not be in the social network of interest. We are curious to see how your tool helps
you analyze this data and make hypotheses about the extent of the social
network of interest. Then you report
your best guess as to which nodes are in this network…
Q2: You say “at the end of the time
period”. Does this mean you want the
network as it looked on the last day? Or
do you mean just using all of the information from Day 1 through Day 10?
A2: We ask that you analyze the
changes over the whole period (day 1 to 10) but we also ask for a snapshot of
what you think the Catalano/Vidro's social network might consist of at the end
of the period, i.e. day 10.
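One way to keep the two views apart (changes over days 1 to 10 versus the day-10 snapshot) is sketched below in Python. This is only an illustration, not something the contest requires: the record layout `(day, caller, callee)` and the function names are assumptions, and the snapshot you report should still be your own hypothesis, not simply the calls seen on the last day.

```python
from collections import defaultdict

def edges_by_day(records):
    """Map each day number to the set of (caller, callee) pairs seen that day."""
    daily = defaultdict(set)
    for day, caller, callee in records:
        daily[day].add((caller, callee))
    return daily

def cumulative_edges(daily, last_day):
    """Union of all pairs seen from day 1 through last_day (inclusive)."""
    edges = set()
    for day in range(1, last_day + 1):
        edges |= daily.get(day, set())
    return edges

# Hypothetical records: (day, caller id, callee id)
records = [(1, 1, 2), (1, 2, 3), (10, 1, 4), (10, 2, 3)]

daily = edges_by_day(records)
all_days = cumulative_edges(daily, 10)                   # whole period, days 1..10
new_on_day_10 = daily[10] - cumulative_edges(daily, 9)   # pairs first seen on day 10
```

Comparing the per-day sets shows where the structure changes, while the cumulative set and the last day's set give two different starting points for the end-of-period snapshot.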
Q: If we are entering the Grand Challenge,
do we need to submit 5 zips, or can we submit one zip with an index with
relative links to all of the 5 challenges?
A: If you are entering the grand
challenge, you must have been able to deal with all the data types of the 4
mini-challenges, so you can either 1) enter the grand challenge only, prepare
and submit only one answer form and one zip file, or 2) enter all 5 challenges
(more questions, 5 answer forms and 5 individual zips). Your
choice! Option 1 is obviously a lot less work for you and will get your
work known just as well, but option 2 gives you more space to make your
individual tools shine, and to receive an award of some sort.
Q: For Short Answer questions (which
have a limit of 150 words, 2 images), how would tables of data be counted? As
an image? Or as a collection of words? What about captions for images? Do they
count against the 150 word limit?
A: For short answers we will count a table of data as an image. A caption
is part of that image. BUT: in many questions you are asked to first
provide your answer (i.e. a list of something, a set of grid coordinates, etc.)
then to provide the Short Answer explanation of how you arrived at that
result. So, if your table is the answer itself, it does not count
against the word or image limit in the text of the Short Answer. We much prefer to see the screen shots that
show clearly how your tool helped you find the answer. Also note that we won’t
reject your submission because it has just a few extra words in a short
answer… but if you go longer, judges WILL be annoyed! And they can ask you to
correct it.
Q: Could you clarify to what extent you
would like us to define factions? Currently, I'm assuming this is an
overall evaluation of the article where we determine members for, against or
indifferent to Paraiso. However, in reading around the internet the term
faction could also apply to members who dispute over subtopics within the page.
For example, if "health" is a subtopic then we would have to
determine those for, against or indifferent to the particular health issue
being discussed.
A: Good question. I would define
factions in relation to the impact they had on my analytical points.
Factions who don't impact my points, or future points, don't really matter in
the final analysis. Unless my boss tells me otherwise. Or new
evidence pops up.
By the way,
we like your term "context organizers"...it's pretty accurate!
Q: For Question 2, it asks "Identify potential suspects
and/or witnesses to the event" and "Note: Potential suspects and/or
witnesses are people who were near the area just prior to the explosion and
exhibit suspicious behavior." When
you say "near the area", I am assuming you are talking about the area
of the explosion? If so, does this mean we should only include people near the
area of the explosion?
A: If you have a hypothesis about one or more
individuals' movements during the incident that is revealed by the data, that
you believe is relevant to the situation, you should clearly state what the
hypothesis is and your support for your thinking in your submission. Your
analysis is not limited to an arbitrary distance from the explosion location
(but is limited by the data you have).
Questions about the grand challenge
Question: "How should we
account for the strength of the
relationship? For example, we know that the relationship between certain
people is certain based on cell phone calls, but the relationship between
others is not certain because it is only
based on the sound of the last name"
Answer: The strength of
an answer should be reflected in the analysis write up (i.e. the detailed
answers) as evidence supporting hypotheses. If indicating strength of
relationship is part of the tool someone is using, that's great, but there
needs to be an explanation of why they think the relationship is stronger, and
how they interpreted the strength (e.g. Is it that the relationship has a lot
of evidence pointing to it? Is it that this is a more important
relationship, although it doesn't have as much evidence?)
Assumptions about data
should be clearly stated and explained. "Strength" of
associations should be clearly explained.
Question about the social network files, i.e. common to both mini-challenge-3 (cell phone calls) and Grand Challenge
Question: In your sample data the
relationship between 58 and 186 is repeated. Would you like us to repeat the
relationships in the same manner in both challenge 3 and the overall challenge?
Answer: We analyze a directed
graph so we do want a bidirectional link to be listed both ways (a->b and
b->a) if you think the relationship is symmetric.
If you are illustrating the
phone calls (i.e. in the mini-challenge) and there have only been one-way
calls, there shouldn't be a bidirectional link; but if you are explaining your
analysis (in the grand challenge) and you are illustrating a social connection,
e.g., Jack knows Jill, then it is bidirectional, since Jill knows Jack,
too.
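As a rough illustration of the rule above, the Python sketch below builds a directed edge list in which symmetric relationships appear both ways and one-way calls appear once. The function name, the input layout, and the ids 1 and 5 are assumptions made for the example (58 and 186 are the pair mentioned in the question); the sketch does not describe the official answer-file format.

```python
def to_directed_edges(one_way_calls, symmetric_ties):
    """Return a sorted list of directed (source, target) pairs.

    one_way_calls:  pairs observed in one direction only (kept as-is).
    symmetric_ties: pairs judged to be mutual (listed as a->b and b->a).
    """
    edges = set(one_way_calls)
    for a, b in symmetric_ties:
        edges.add((a, b))  # a -> b
        edges.add((b, a))  # b -> a
    return sorted(edges)

# Hypothetical input: 1 only ever called 5; 58 and 186 are judged to know each other.
edges = to_directed_edges(one_way_calls=[(1, 5)], symmetric_ties=[(58, 186)])
for source, target in edges:
    print(source, target)
```

Using a set means that a pair supplied twice, or already present in both directions, is still listed only once per direction.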
Question about the mini challenge-3 (cell phone calls):
There
are some cell towers (1, 15, 11, 22) marked with a yellow dot on the cell tower
locations map. Do these towers have any special meaning in the context of the
problem?
Answer: When we started our investigations of the Paraiso issue,
there weren't too many of us familiar with the geography of the
Two Questions about the mini-challenge-1 (Wiki Edits)
Question
1) Do the question marks at the beginning of some edit texts have any
particular meaning?
Answer: We passed this question on to our analysts working the
Paraiso issue. They replied, "This is an artifact created as
the information has been passed among various text processors. We're
not exactly sure when these had been introduced, but the data is otherwise very
clean for the most part. We have high confidence that there is no
steganography (hidden messages) or attempts at deception associated with the
sporadic question marks."
Question
2) Do all the editors of the wiki page have to be associated with a particular
group?
Answer: Not by default. After analyzing the edits, it
may turn out that they are all either pro or con to an issue, but you
should let the data guide you on that determination.
Question: The data seem to indicate that one person called
three different people at the same time, something not possible without a
conference line. Should we assume that
users can engage in conference calls? Or
should we disregard such overlaps - e.g. only accept the first line and assume
that the others are in error?
Answer: We contacted the fictitious government
agency from Isla Del Sueno from whom we obtained the data and asked them your
question. They told us that the
telephone company from whom they obtained their data now claims that their data
is absolutely accurate and any "so-called" problems with it must have
been introduced after they gave it to the agency! So we have no resolution
there. We do know that cell-phone
conference calls are possible on the Isla, and if that is an important factor
in your analysis, please make a note of it and its significance to findings in
your report. Data should usually only be
ignored at your own risk!
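If you want to flag such overlaps for review instead of discarding them, a minimal Python sketch follows. The record layout `(caller, callee, start, duration in seconds)`, the ids, and the dates are assumptions for illustration and are not taken from the contest files; the sketch only reports pairs of calls by the same caller whose time intervals overlap, leaving the interpretation (conference call or data error) to you.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def find_overlapping_calls(records):
    """Return (caller, callee_1, callee_2) for each pair of overlapping calls by one caller."""
    by_caller = defaultdict(list)
    for caller, callee, start, duration_s in records:
        end = start + timedelta(seconds=duration_s)
        by_caller[caller].append((start, end, callee))

    overlaps = []
    for caller, calls in by_caller.items():
        calls.sort()  # order by start time (then end time, then callee)
        for i, (_, end_1, callee_1) in enumerate(calls):
            for start_2, _, callee_2 in calls[i + 1:]:
                if start_2 >= end_1:
                    break  # sorted by start: no later call can overlap call i
                overlaps.append((caller, callee_1, callee_2))
    return overlaps

# Hypothetical records: caller 10 has three calls starting at the same moment.
t0 = datetime(2006, 6, 1, 9, 0, 0)
records = [
    (10, 20, t0, 600),
    (10, 30, t0, 300),
    (10, 40, t0, 120),
    (11, 20, t0, 60),
    (11, 30, t0 + timedelta(seconds=120), 60),  # starts after the previous call ended
]
overlaps = find_overlapping_calls(records)
```

Here caller 10 yields three overlapping pairs and caller 11 yields none, so the overlapping records can be noted in a report without dropping any data.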
Question a) For the mini-challenge-3 (Cell Phone calls), does the
answer to question 2 have to include a hypothesis about what the changes mean
in the context of the problem scenario, or does it suffice to indicate where
and how the network changes structurally?
Answer: Yes, in your Detailed
Answer we will look for a hypothesis as to what the changes mean in the
context of the problem.
Question
b) Is the geographic data included in the mini-challenge-3 (call towers on the
map), relevant to the mini challenge? Question 2 asks to characterize the
change of the social structure, does this also include the geographic location
change?
Answer: Yes, the geographic data is relevant.
Question
c) In the mini-challenge-3 description you give hints to identify 5 members of
the Paraiso Movement. Are we
expected to identify other members when answering question 1?
Answer: No. In the
mini-challenge we don’t expect you to identify more than 5 family
members. But note that we ask you to identify them at the end of the period.
End of February:
We decided to
encourage authors to submit to the CG&A
Special Issue on Visual Analytics Evaluation (deadline Sept 12, 2008)
instead of negotiating an invited journal paper about the contest itself. In 2007 the invited paper was nice but
difficult to organize, so preparing a special issue seemed a more valuable use
of everybody’s time.