There has been a recent surge of interest in the analysis of data describing all forms of networks, including communications networks, biological networks, social networks, financial transaction networks and more. Despite the diversity of domains, common difficulties and challenges include noisy and incomplete data, dynamic and streaming data, issues of scalability and statistical issues such as identifiability, stationarity, and so on. There are a number of different research communities working on network analysis including statisticians, physicists and computer scientists; each comes with their own view on the problem of network analysis, their own set of tools and their own style of analysis.
In this seminar, we will survey some of the recent work in network analysis. We will cover topics such as: random graph models, information diffusion and social contagion, probabilistic inference and collective classification in network data, link prediction, community formation and detection, clustering in graph data, entity resolution, visual analytic tools for network data, and other topics from graph mining and statistical network analysis as time permits.
The seminar will be very interactive and collaborative. The topics covered and the depth of coverage will depend on the participants' input and interests.
The goal of the course is to give you an overview of current topics in link mining and dynamic network analysis and practical machine learning and statistical modeling experience to analyze network data that arises in your research. The methods are applicable in many areas such as bioinformatics, computer vision, computational linguistics, databases, program analysis, networks and systems. Along the way, you will pick up some practical experience in reading and presenting research papers, writing a literature survey, and doing a course project that ideally will lead to a publishable paper.
In tandem with the course, throughout the semester there will be several invited speakers presenting current work in network analysis. Some of these will be during the scheduled course time, while others, due to schedule constraints, will be outside the regular course time. Students are highly encouraged to attend the invited talks and meet with the speakers.
Prerequisites: Background in machine learning and graphical models suggested. Mathematical maturity and a basic course in probability required.
This is a seminar course. Each class will consist of presentations and discussion. Students will be required to do a survey paper (30%) and a class project for the course (45%) . A significant portion of the grade will be based on class participation, which includes paper presentations, contributions to the wiki, and demonstrations (25%).
Because of the interactive nature of the course, auditing is discouraged. However the sessions which have invited speakers will be open to all. For information and announcements about the invited speakers, please join the dynamic network analysis (dna) mailing list:: email@example.com. To sign up, go here http://mailman.cs.umd.edu/mailman/listinfo/dna-seminar.
The web page for the seminar series is here: http://www.cs.umd.edu/projects/linqs/dna-seminar/
This course does not count as a PhD Core or MS Comps course. This course can be used toward PhD coursework as part of the non-core classes required or towards MS coursework (but it is not an MS qualifying coure).
Time: Tue. & Thurs. at 11:00-12:15pm in room: CSI 2107
Professor: Lise Getoor - getoor AT cs.umd.edu
Office hours: 3:00 - 4:00 Tue. in AVW 3217
Web site: http://www.cs.umd.edu/class/spring2008/cmsc828g/
Tthe class wiki, http://linqs.cs.umd.edu/cmsc828g/, is for students enrolled in the course to share material and discuss content.
Tthe class mailing list is for announcements relevant to the class. If you are enrolled in the class please sign up here http://mailman.cs.umd.edu/mailman/listinfo/cmsc828g
|Date & Location||Topic / Papers||Notes|
|Tue 1/29, CSIC 2107||
|Students should add their introductions and defintions to the class wiki|
|Thu 1/31||Defintions and Link Mining Survey||
Students present their definitons
Link Mining: A Survey, Lise Getoor, Christopher Diehl, SigKDD Explorations Special Issue on Link Mining, Volume 7, Number 2 - December, 2005
|Mon 2/4, 11AM - 12PM, AVW 2460||Aron Culotta Talk, Learning to Extract Knowledge from Text||
|Tue 2/5||Michell Girvan, Contagion Processes in Complex Networks (slides)||The structure and function of complex networks , M. E. J. Newman, SIAM Review 45 , 167-256 (2003).|
|Tue 2/12||Discussion||Link Mining Survey and Newman article|
|Thu 2/14||Adam Perer, Integrating Statistics and Visualization: Gaining Clarity during Exploratory Data Analysis of Social Networks||
Intro to the SigKDD Explorations Special Issue on Visual Analytics ,
by Daniel Keim, and Joern Schneidewind, 2007,
Balancing Systematic and Flexible Exploration of Social Networks, Adam Perer and Ben Shneiderman, IEEE Transactions on Visualization and Computer Graphics, 12, 5, (October 2006), 693 - 700.
|Tue 2/19, AVW 2460||
Mixed membership analysis of high-throughput interaction studies, http://arxiv.org/abs/0706.0294/ (required)
Coordination of Growth Rate, Cell Cycle, Stress Response, and Metabolicctivity in Yeast, http://www.molbiolcell.org/cgi/reprint/19/1/352 (optional)
Genomic Expression Programs in the Response of Yeast Cells to Environmental Changes http://www.molbiolcell.org/cgi/reprint/11/12/4241 (optional)
|Thu 2/21, AVW 2460|| Empirical+models for viral marketing
Empirical+models for blogs
Algorithms for detecting cascades:
|Tue 2/26, 11AM-12PM, AVW 2460||Doug Downey, UW||
Discussion and Project One Minutes
Slides on How to Write a Paper , courtesy of Mike Hicks
Instructions on Project Proposal
Cautious Inference in Collective Classification,
Luke McDowell, Kalyan Gupta and David Aha, AAAI 2007.
Why Collective Inference Improves Relational Classification, D. Jensen, J. Neville and B. Gallagher, KDD 2004.
Survey 1: Collective Classification
Collective Classification in Network Data
Prithviraj Sen, Galileo Namata, Mustafa Bilgic, Lise Getoor, Brian Gallagher, Tina Eliassi-Rad.
UMD Tech Report 4905, 2008.
Discriminative probabilistic models for relational data B. Taskar, P. Abbeel, and D. Koller. UAI, 2002.
Survey 2: Information Diffusion
Maximizing the Spread of Influence through a Social Network
D. Kempe, J. Kleinberg, E. Tardos.
Proc. 9th ACM SIGKDD Intl. Conf. on Knowledge
Discovery and Data Mining, 2003.
Patterns of Influence in a Recommendation Network Jure Leskovec, Ajit Singh, Jon Kleinberg. Pacific-Asia Conference on Knowledge Discovery and Data Mining, 2006.
|Fri 3/14, AVW 2460 (updated!)|
Survey 3: Node Ranking
David Austin How Google Finds Your Needle in the Web's Haystack
J. Kleinberg. Authoritative sources in a hyperlinked environment.
Journal of the ACM, 1999. (HITS)
Amy N. Langville and Carl D. Meyer Deeper Inside PageRank. Internet
1(3):335-380, 2005. pages 1-20.
S. Brin and L. Page. The Anatomy of a Large-Scale Hypertextual Web Search
Engine. Proc. 7th International World Wide Web Conference, 1998.
Survey 4: Entity Resolution
Collective entity resolution in relational data,
I. Bhattacharya and L. Getoor.
ACM Transactions on Knowledge Discovery from Data (TKDD), vol. 1, 2007.
Toward conditional models of identity uncertainty with application to proper noun coreference, A. McCallum and B. Wellner. IJCAI Workshop on Information Integration on the Web, 2003. pdf
Survey 5: Link Prediction
The Link Prediction Problem for Social Networks,
D. Liben-Nowell, J. Kleinberg.
Proc. 12th International Conference on Information and
Knowledge Management (CIKM), 2003.
Link Prediction in Relational Data, B. Taskar, M. F. Wong, P. Abbeel and D. Koller. Neural Information Processing Systems Conference (NIPS03), Vancouver, Canada, December 2003. pdf
Survey 6: Group Discovery - Stochastic Block Models
Learning systems of concepts with an infinite relational model.
C. Kemp, J. B. Tenenbaum, T. L. Griffiths, T. Yamada and N. Ueda.
National Conference on Artificial Intelligence (AAAI), 2006.
K. Faust and S. Wasserman. Blockmodels: Interpretation and evaluation. Social Networks (Invited Paper), 14, 561, 1992. pdf
|Wed 4/17, AVW 2120||
Vitor Carvalho , CMU|
Modeling Intention in Email: Speech Acts, Information Leaks and User Ranking Methods (abstract)
|Thu 4/17||Cosma Shalizi, CMU |
Discovering Functional Communities in Dynamical Networks (abstract)
Survey 7: Group Discovery 2
Community structure in social and biological networks. M. Girvan and M. E. J.
Newman, Proceedings of the National Academy of Sciences, 2002.
Clustering Relational Data Using Attribute and Link Information. J. Neville, M. Adler, and D. Jensen. In Proceedings of the Text Mining and Link Analysis Workshop, 18th International Joint Conference on Artificial Intelligence, 2003. pdf
Modularity and community structure in networks. M. E. J. Newman.- Proceedings of the National Academy of Sciences, 2006. pdf
Survey 8: Topic Models
Probabilistic Topic Models,
M. Steyvers and T. Griffiths,
in T. Landauer, D McNamara, S. Dennis, and W. Kintsch (eds),
Handbook of Latent Semantic Analysis, 2007.
The Missing Link: A Probabilistic Model of Document Content and Hypertext Connectivity. T. Hoffman and D. Cohn, Conference on Neural Information Processing Systems (NIPS), 2001, pdf
|Fri 4/25, 2pm AVW 2120|
Survey 9: Organization and Taxonomy Formation
Rob Patro and Mike Pekala
Structural Inference of Hierarchies in Networks,
A. Clauset, C. Moore, M.J. Newman,
ICML Workshop on Social Network Analysis, 2006.
Learning Annotated Hierarchies from Relational Data, D. Roy, C. Kemp, V. Mansinghka and J. Tenenbaum, Conference on Neural Information Processing Systems (NIPS), 2006, pdf
Survey 10: Graph Mining
Hassan Sayyadi and Shanchan Wu