In this project, we explore the area of set visualization. We focus on
improving the scalability of some of the approaches presented in [Liu05] and on
facilitating the analysis of sets by representing clusters graphically, so as
to depict both their internal and their external links. The main contribution of our
work is to apply self-organizing map (SOM) and K-means clustering to produce
better visualizations. Although it may not be apparent at first glance, a closer
look at the problem reveals that neither algorithm, in the form documented in
the literature, is applicable to set visualization, because both assume a 2D or
nD vector representation for each data point (i.e., each law case). More
specifically, the attributes must form a vector space. This assumption does not
hold for our dataset, which has no clear geometric attributes. Nevertheless, our
algorithms produce high-quality 2D visual representations of large datasets. We
tested the algorithms on about 2800 points, whereas most previous approaches
fail to represent datasets larger than a few hundred nodes. We describe in
detail how similarity was computed for the clustering algorithms in the absence
of geometric distances. Additionally, the system provides various interactive
tools to enable users to explore sets and navigate between them.
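
To make the vector-space issue concrete, the following is a minimal sketch of
how clustering can proceed when only a set-based similarity is available. It is
an illustration under stated assumptions, not the method used in this work: the
similarity measure (Jaccard distance), the clustering variant (a simple
k-medoids loop, which needs only pairwise distances rather than coordinate
means), the function names, and the keyword sets standing in for law cases are
all hypothetical. The SOM side is not covered here.

```python
from typing import FrozenSet, List, Sequence


def jaccard_distance(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Return 1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def k_medoids(items: Sequence[FrozenSet[str]], k: int, max_iter: int = 100) -> List[int]:
    """Cluster items using only pairwise distances; return one label per item.

    Assumption for illustration: medoids start as the first k items, which
    keeps the sketch deterministic but is not a good general initialisation.
    """
    n = len(items)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(items)")
    # Precompute the full pairwise distance matrix (O(n^2) memory).
    dist = [[jaccard_distance(items[i], items[j]) for j in range(n)] for i in range(n)]
    medoids = list(range(k))
    labels = [0] * n
    for _ in range(max_iter):
        # Assignment step: each item joins the cluster of its nearest medoid.
        labels = [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]
        # Update step: the new medoid is the member with the smallest total
        # distance to the other members; no averaging of coordinates is needed.
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                new_medoids.append(medoids[c])  # keep the old medoid if the cluster is empty
                continue
            new_medoids.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels


# Hypothetical data: each "case" is described by a set of keywords.
cases = [
    frozenset({"contract", "damages"}),
    frozenset({"contract", "damages", "breach"}),
    frozenset({"patent", "infringement"}),
    frozenset({"patent", "infringement", "licence"}),
]
labels = k_medoids(cases, k=2)
```

In this toy input, the two contract cases end up sharing one label and the two
patent cases the other. The point of the design is that the cluster
representative is always an actual data point, so no vector-space structure is
required; the quadratic distance matrix would need attention at the scale of a
few thousand points.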