DBChat is a once per week discussion group for students, faculty and staff who are interested in
examining current and emerging topics of interest in the database area and
related research in information systems and networking. Topics of discussion
include work in progress, interesting papers/articles/...
This chat is intended to be completely informal (i.e. no slides)
so that it requires little preparation time and attracts more people.
If we have a visitor that week, DBChat may be cancelled.
Title: Toward Scalable Stochastic Flow Algorithms
Speaker: Prof. Srinivasan Parthasarathy, Ohio State University
When/Where: Oct 12, 2011; AVW 1146, 2pm
Abstract: Many real world problems (biological, social, web) can be effectively
modeled as networks or graphs where nodes represent entities of interest and
edges mimic the interactions or relationships among them. The study of such
complex relationship networks, recently referred to as "network science", can
provide insight into their structure, properties and emergent behavior. Of particular
interest here are rigorous methods for uncovering and understanding important
network structures and motifs (communities) at multiple topological and temporal
scales. Achieving this objective is challenging due to the presence of noise
(false or missing interactions), topological (scale-free)) properties and scalability.
Given the importance of the graph clustering problem, a number of solutions ranging
from hierarchical methods to spectral methods have been designed and developed.
To this mix of graph clustering algorithms one can also include a recent entrant
Markov Clustering (MCL) a graph clustering algorithm based on stochastic flow
simulation, and the focus of this work.
MCL has several advantages over competing techniques including an inherent
simplicity, empirically confirmed robustness to noise effects and relatively non-parametric
nature. However, in spite of its popularity in the bioinformatics community MCL
has drawn limited attention from the broader social network, web and data
mining communities primarily due to its limited scalability. In this talk we present two
orthogonal strategies to address this problem - the first relies on developing a new
multi-level regularized variant of MCL called MLR-MCL. The second approach focuses
on data reduction via graph sparsification, in a manner analogous to the use of
sampling, for improving the scalability of such algorithms. Empirical results demonstrate
both qualitative as well as quantitive improvements over existing approaches on
a wide range of domains.
This is joint work with Venu Satuluri and Yiye Ruan.
Bio: Srinivasan Parthasarathy is a Professor of Computer Science at Ohio State. His
research interests are in Data Mining, Systems (broadly defined) and Graph and Network
algorithms as they relate to social, biological and web applications. He is a recipient of
both the NSF and DOE career awards, multiple research awards from Microsoft, Google
and IBM, and multiple best paper awards and nominations from conferences such as
IEEE ICDM, SIAM DM, VLDB, SIGKDD, ACM-Bioinformatics and ISMB.