InfoVis 2003 Contest

Datasets - Descriptions - Tasks


News: N/A

See the history of updates for all the contest materials pages.

Questions?

Contact: Jean-Daniel.Fekete@inria.fr; plaisant@cs.umd.edu

General Information about the materials

For each type of dataset we give a brief description of the type of tree it represents, background about the application domain, and tasks one might encounter when analyzing the data. We hope that this contest will help refine the list of tasks, as you will probably find interesting facts in the data that do not fit any of the tasks we suggested.

We arbitrarily chose 3 types of trees. We could have picked many more (e.g. shallow fixed-depth trees, small trees with hundreds of attributes, or trees whose attributes vary as a function of depth), but we felt that although this would have been more complete, it would also have diluted the contest, made judging harder, and made results more difficult to explain. In other words: there is enough to run the InfoVis contest for 10 years just on trees, but we had to start somewhere!

Tasks

We separated general tasks from tasks specific to particular datasets. The specific tasks are often broad goal-setting tasks, but could also be instantiations of the general tasks that highlight the special needs of the datasets.

1- See the list of general tasks, i.e. tasks commonly encountered while analyzing tree data: topology tasks, attribute-based tasks, and comparison tasks.

2- See the specific tasks in each domain background below.

IMPORTANT: For most questions we do not want a detailed result list but an explanation (or illustration or demonstration) of how the visualization helped you find the answer (or not). For example, when we ask "Which nodes have been deleted?" we do not need to see the list of nodes, but we want enough information to judge how the tool helped you see what was deleted.

Data Format

The simple XML format is specified in treeml.dtd.
We provide a small sample tree, and of course you can look at the datasets themselves as examples.
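
As a rough illustration of working with the format, the sketch below parses a tiny tree and reports its leaf count, depth, and largest fanout. The element names (`tree`, `branch`, `leaf`, `attribute` with `name`/`value`) and the inline sample are assumptions made for this example, not taken from this page; treeml.dtd remains the authoritative definition.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. Element and attribute names are assumptions;
# check them against treeml.dtd before using this on the real files.
SAMPLE = """<tree>
  <branch>
    <attribute name="name" value="root"/>
    <branch>
      <attribute name="name" value="a"/>
      <leaf><attribute name="name" value="a1"/></leaf>
      <leaf><attribute name="name" value="a2"/></leaf>
    </branch>
    <leaf><attribute name="name" value="b"/></leaf>
  </branch>
</tree>"""


def node_name(node):
    """Return the value of the node's 'name' attribute element, if any."""
    for attr in node.findall("attribute"):
        if attr.get("name") == "name":
            return attr.get("value")
    return None


def summarize(node, depth=0):
    """Return (leaf_count, max_depth, max_fanout) for the subtree at node.

    A node without branch/leaf children is counted as a leaf.
    """
    children = [c for c in node if c.tag in ("branch", "leaf")]
    if not children:
        return 1, depth, 0
    leaves, deepest, fanout = 0, depth, len(children)
    for child in children:
        sub_leaves, sub_depth, sub_fanout = summarize(child, depth + 1)
        leaves += sub_leaves
        deepest = max(deepest, sub_depth)
        fanout = max(fanout, sub_fanout)
    return leaves, deepest, fanout


root = ET.fromstring(SAMPLE).find("branch")
stats = summarize(root)
print(node_name(root), stats)
```

For the real files, `ET.parse(path).getroot()` would replace `ET.fromstring(SAMPLE)`; for the largest datasets an incremental parser such as `ET.iterparse` may be a better fit.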

Datasets

Download all data files at once: iv03contest_data(+date created as year-month-day).zip (about 7 MB).

PHYLOGENIES

Specific tree characteristics

The trees are small binary trees (60 leaf nodes). Link length is often considered important by researchers using this data. There are no attributes. On the other hand, the analysis can be very complex, and there are no good interactive tools available today for scientists to make hypotheses about the matching of those trees.

Application Domain Background and Specific Tasks

Datasets

phylo_A_ABC(+date).xml (about 15 KB)
phylo_B_IM(+date).xml

CLASSIFICATIONS

Specific tree characteristics

The trees are very large (about 200,000 leaf nodes) with large fanouts. There are only three attributes, all nominal. Labelling, search, and showing results in context are important.

Application Domain Background and Specific Tasks

Datasets

classif_A(+date).xml (about 40 MB)
classif_B(+date).xml

NOTE: If you really have to use a subset of the tree because you cannot handle so many nodes, work on the "mammal" subtree.

FILE SYSTEM AND USAGE LOGS

Specific tree characteristics

The trees are large (about 70,000 leaf nodes). Here we have more attributes available, both numerical and nominal. Changes between the two trees can be topological changes and attribute value changes. Each file corresponds to a given period during which the usage logs were collected. We provide more than two trees, but the focus of the contest remains on pair comparisons, so pick the pairs you want.
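
To make the idea of a topological pair comparison concrete, here is a minimal sketch that lists node paths present in one tree but not the other. It assumes the same element names as treeml.dtd might define (`branch`, `leaf`, `attribute` with `name`/`value`), identifies a node purely by its path of names, and uses two made-up inline trees; attribute value changes are not covered.

```python
import xml.etree.ElementTree as ET

# Two hypothetical periods; the structure and names are illustrative only.
PERIOD_1 = """<tree><branch><attribute name="name" value="root"/>
  <leaf><attribute name="name" value="x"/></leaf>
  <leaf><attribute name="name" value="y"/></leaf>
</branch></tree>"""

PERIOD_2 = """<tree><branch><attribute name="name" value="root"/>
  <leaf><attribute name="name" value="y"/></leaf>
  <leaf><attribute name="name" value="z"/></leaf>
</branch></tree>"""


def node_name(node):
    """Return the value of the node's 'name' attribute element, if any."""
    for attr in node.findall("attribute"):
        if attr.get("name") == "name":
            return attr.get("value")
    return None


def paths(node, prefix=""):
    """Yield a slash-joined path for the node and every node below it."""
    path = prefix + "/" + (node_name(node) or "?")
    yield path
    for child in node:
        if child.tag in ("branch", "leaf"):
            yield from paths(child, path)


def compare(xml_a, xml_b):
    """Return (deleted, added) paths going from tree A to tree B."""
    a = set(paths(ET.fromstring(xml_a).find("branch")))
    b = set(paths(ET.fromstring(xml_b).find("branch")))
    return sorted(a - b), sorted(b - a)


deleted, added = compare(PERIOD_1, PERIOD_2)
print("deleted:", deleted)
print("added:", added)
```

Because identity is path-based in this sketch, a node that moved shows up as one deletion plus one addition; telling moves apart from true deletions is exactly the kind of question a visualization may help answer.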

Application Domain Background and Specific Tasks

Datasets

logs_A(+dateposted).xml Period A ending 1-19 (about 20 MB)
logs_B(+dateposted).xml Period B ending 1-25
logs_C(+dateposted).xml Period C ending 2-1
logs_D(+dateposted).xml Period D ending 2-8

NOTE: If you really have to use a subset of the tree because you cannot handle so many nodes, work on the "HCIL" subtree (i.e. everything under /projects/hcil).
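
One possible way to cut out such a subtree is sketched below. It again assumes `branch`/`leaf` elements carrying a `name` attribute element, and it assumes that the path components under the root node are `projects` then `hcil`; the inline tree is invented for the example, so check the real file for how the root itself is named.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature file system tree, for illustration only.
SAMPLE = """<tree><branch><attribute name="name" value="/"/>
  <branch><attribute name="name" value="projects"/>
    <branch><attribute name="name" value="hcil"/>
      <leaf><attribute name="name" value="index.html"/></leaf>
      <leaf><attribute name="name" value="logo.gif"/></leaf>
    </branch>
    <branch><attribute name="name" value="other"/>
      <leaf><attribute name="name" value="readme.txt"/></leaf>
    </branch>
  </branch>
</branch></tree>"""


def node_name(node):
    """Return the value of the node's 'name' attribute element, if any."""
    for attr in node.findall("attribute"):
        if attr.get("name") == "name":
            return attr.get("value")
    return None


def find_subtree(node, names):
    """Follow child names in order from node; return None if one is missing."""
    for wanted in names:
        for child in node:
            if child.tag in ("branch", "leaf") and node_name(child) == wanted:
                node = child
                break
        else:
            return None
    return node


root = ET.fromstring(SAMPLE).find("branch")
hcil = find_subtree(root, ["projects", "hcil"])
leaf_count = len(list(hcil.iter("leaf")))
subset_xml = ET.tostring(hcil, encoding="unicode")
print(node_name(hcil), leaf_count)
```

The serialized `subset_xml` holds only the extracted branch; wrapping it in a `tree` element (plus any declarations the DTD requires) would be needed to obtain a standalone file.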
