
Oct 19th, 2004
Known improvements to the dataset (available after the contest)
THE CONTEST IS OVER
June 30th
- The submission website will stay open all night on July 1st, until Maryland wakes up around 8 am Friday on the US East Coast, so don't worry about the exact time on Thursday. Also, it's OK if you didn't register before June 21st; the important deadline is July 1st. If possible, check your email next week in case we have problems reading what you submitted.
- Clarification: student teams usually have only student authors (and maybe a mentor who didn't contribute much to the ideas); typically it's a class project. If in doubt, add something to the submission (e.g., at the beginning of the standard form) saying how you worked together and who did what.
June 14
The submission page is now up (see the contest home page for all submission information). We extended the deadline for the submission of final materials to July 1st (but please register your submission and submit what you have ready by the original deadline of June 21st so we can prepare the short review process). You will be able to submit as many revisions as you want until the July 1st deadline.
May 21
The template for the standard form you have to use to submit your answers has been posted (see “submission information” on the main contest page). We also added more details about the video submission.
May 20
A beta version of the standard form was posted. We also answered some questions:
- You should use the whole dataset for all questions, including question 1 (you can visualize papers and references differently if you want!)
- You should not use more data than is provided in the contest dataset. For example, we never received the 2003 papers from the Digital Library (they are still not posted as of today!), so you should NOT add 2003 yourself. But note that after the contest results are announced you will be given a chance to resubmit a revised version that could add more materials for the repository, and could include the 2003 data as an additional task (e.g., how do things change when you add the 2003 data?)
May 5
The final dataset is posted. The main dataset now includes:
- lots of extra abstracts (was 227, now 429 of the 614 entries)
- lots of extra keywords (added or completed)
- lots of extra references (added or completed)
- some entries have been unified (were id..., are now acm...), but no entries were added.
Improvements to the main dataset are thanks to Kevin Stamper, Tzu-Wei Hsu, Dave McColgin, Chris Plaue, Jason Day, Bob Amar, Justin Godfrey, and Lee Inman Farabaugh (all at Georgia Tech), and to Howie Goodell (UML) and Niklas Elmqvist (Chalmers, Sweden).
In addition:
- a tabular version of the data is provided by the team at UBC (Jung-Rung Han, Chia-Ning Chiang, and Tamara Munzner)
- a small file provided by Maylis Delest (Universite de Bordeaux) lists 4 auto-references (really references to variants of the same content).
- Note that the list of duplicate names is still external to the dataset (3 duplicates were added).
The dataset is not perfect and probably never will be, but we need to freeze it so that you all use the SAME data.
Someone asked:
Under the “Tasks” section, #2, “Characterize the research areas (areas/topics to be defined by you)...”, is this “you” referring to the users who will be using the tool (i.e., allowing users to define and add topics), or is it referring to us (the designers), who will define research areas/topics when designing the tool?
[our answer] It is referring to you, the designers.
Someone asked:
Under task 4, “Additional related items to build into the visualizations include uncertainty, reliability, range, flexibility, broader applicability”, can you elaborate further on these terms? (i.e., uncertainty about what? reliability of what? etc.)
[our answer] There is room for interpretation here. For example, you could handle missing values, ambiguous authors, and partial names, and define an uncertainty measure that is then represented visually, giving the viewer information about the data behind the visualization. You could also compute p-values for confidence and somehow represent them within your visualizations.
It is important to realize that the metadata is NOT COMPLETE, which will lead to uncertainties. For example, you can take the author list as provided or parse the provided reference string automatically, but note that some names can be ambiguous (e.g., North could refer to several individuals).
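To make the idea of an uncertainty measure concrete, here is a minimal sketch (not part of the contest materials) that scores each entry by how much of its metadata is missing. The field names ("abstract", "keywords", "references", "authors") and the weights are assumptions for illustration only; the resulting score could drive opacity, color, or a glyph in your visualization.

# Hypothetical sketch: field names and weights are assumptions,
# not the contest file format.
FIELD_WEIGHTS = {
    "abstract": 0.3,     # missing abstract text
    "keywords": 0.2,     # missing or empty keyword list
    "references": 0.3,   # missing or empty reference list
    "authors": 0.2,      # missing author list
}

def uncertainty(entry):
    """Return a score in [0, 1]: 0 = complete metadata, 1 = everything missing."""
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        if not entry.get(field):   # None, empty string, or empty list counts as missing
            score += weight
    return score

# Example entry with an abstract and authors, but no keywords or references:
entry = {"abstract": "Focus+context ...", "authors": ["J. Doe"], "keywords": [], "references": []}
print(uncertainty(entry))          # 0.5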