Statistical machine learning is in the midst of a "relational revolution". After many decades of focusing on learning statistical models from examples described by a simple vector of attribute values, where each example is assumed to be independent and identically-distributed, many researchers are now studying problems in which the examples are linked together into complex networks. These networks can be a simple as sequences and 2-D meshes (such as those arising in part-of-speech tagging and remote sensing) or as complex as social networks, the world wide web, complex visual scenes, dynamic software communication structures and biological data.
This course will cover recent approaches in machine learning which combine probabilistic and logical representations that deal with graph, network and relational data. In this course we will review many of the recently proposed approaches. We will identify the fundamental algorithmic challenges. We will survey work from machine learning, graphical models, and theoretical computer science. In the latter part of the course we will explore several specific applications areas, to be selected based on class interest.
The goal of the course is to give you the practical machine learning and statistical modeling expertise to analyze the types of network data that arise in your research. This should be applicable in areas such as bioinformatics, computer vision, computational linguistics, databases, networks and systems.
Prerequisites: Background in machine learning and graphical models suggested. Mathematical maturity and a basic course in probability required.
This is a seminar course. Each class will consist of instructor presentations, student presentations and discussion. Students will be required to do a significant project for the course. A significant portion of the grade will be based on class participation. Additional coursework will include knowledge engineering and programming exercises.
The readings will be based on materials that will be part of an upcoming SRL book and on selected additional readings. Course materials will be available from the class wiki, https://waimea.cs.umd.edu/cmsc828g
DATE & TOPIC
1/28 Introduction to SRL
2/4 Introduction to Graphical Models
Presenter: Indrajit Bhattacharya
Chapters 1-3 from 'An Introduction to Probabilistic Graphical Models' by Michael Jordan
Hardcopies of ch2 and ch3 are available in the CS Library. These are draft chapters of an upcoming book, please do not distribute
|2/11 Frame-based SRL Approaches: Probabilistic Relational Models and||Due: Focus Problem writeup||
Probabilistic Relational Models, L. Getoor, N. Friedman, D. Koller,
2/18 Frame-based SRL Approaches continued
from last time
Learning Probabilistic Models of Link Structure, L. Getoor, N. Friedman, D. Koller, B. Taskar. Journal of Machine Learning Research, 2002.
PRMs with Class Hierarchies, chapter 5 of Learning Statistical Models of Relational Data, Lise Getoor, PhD Thesis, Stanford University, 2001.
|2/25 Probabilistic Entity Relationship Models||Due: Focus Problem description uploaded to wiki, with PRM and PER descriptions||
Entity-Relationship Model, ch. 2 Korth and Silberschatz Database Systems Concepts. Hardcopy available in the CS Library.
Probabilistic Models for Relational Data, David Heckerman, Christopher Meek and Daphne Koller, Microsoft Technical Report TR-2004-30.
|3/4 Introduction to Logic-based Approaches||
Bayesian Logic Programs, Kristian Kersting and Luc DeRaedt, Institure for Computer Science, University of Frieburg, technical report 151,
Learning Bayesian Logic Programs, Kristian Kersting and Luc DeRaedt, Institure for Computer Science, University of Frieburg, technical report 174.
|3/11 Logic-based Approaches cont.||
Due: Focus Problem logic-based description
Due: Project Proposal
Logical Bayesian Networks,
Daan Fierens, Hendrik Blockeel, Jan Ramon and Maurice Bruynooghe. In Proc.
3rd Int'l Workshop on Multi-Relational Data Mining (MRDM-2004),
Logic-based Approach to PRMs (draft), Lise Getoor and John Grant. Draft available from class wiki
queries from context-sensitive probabilistic
The Independent Choice Logic for modelling multiple agents under uncertainty'', David Poole. Artificial Intelligence, 94(1-2), special issue on economic principles of multi-agent systems, pages 7-56, 1997.
|3/25 Spring Break|
|4/1 BLOG and IBAL||
BLOG: Probabilistic Models with Unknown Objects, Brian Milch Bhaskara Marthi, Stuart Russell David Sontag, Daniel L. Ong, Andrey Kolobov. Draft available from wiki.
The Design and Implementation of IBAL: A General-Purpose Probabilistic Language. Avi Pfeffer. Draft available from wiki. Please do not distribute.
|4/8 SLPs and PRISM||Cussens Slides||Due: Project Progress Report||
Statistical aspects of stochastic logic programs, James Cussens. In Artificial Intelligence and Statistics 2001: Proceedings of the Eighth International Workshop, January 2001.
Parameter learning of logic programs for symbolic-statistical modeling. Sato, T. and Kameya, Y. Journal of Artificial Intelligence Research (JAIR), Vol.15, pp.391–454, 2001.
Integrating by Separating: Combining Probability and Logic with ICL, PRISM and SLPs. James Cussens. Dagstuhl seminar on Probability, Logic and Relational Learning, February 2005.
4/15 Undirected Models: RMNs and Markov Logic
Richardson and Domingos, Markov Logic: A Unifying Framework for Statistical Relational Learning. Draft available from class wiki.
Discriminative Probabilistic Models for Relational Data, B. Taskar, P. Abbeel and D. Koller. Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI02?), Edmonton, Canada, August 2002.
Link Prediction in Relational Data, B. Taskar, M. F. Wong, P. Abbeel and D. Koller. Neural Information Processing Systems Conference (NIPS03?), Vancouver, Canada, December 2003.
An analysis of first-order logics of probability, Artificial Intelligence 46, 1990, pp. 311-350
Maximum Entropy Probabilistic Logic. Mark A. Paskin. Technical Report UCB/CSD-01-1161, University of California, Berkeley, 2002.
|4/29||Final focus problem writeups due||Discussion|
Review (as necessary throughout)
Models for sequences and spatial data
Inductive Logic Programming
Statistical Relational Learning (SRL)
Introduction (1-2 classes)
SRL Representations (3-4 classes)
Probabilistic Relational Models
Relational Markov Networks
Probabilistic Entity Relation Models
Bayesian Logic Programs
Stochastic Logic Programs
Probabilistic Context-free Grammars
Descriptive Modeling in Graphs and
Social Networks (2-3 classes)
Random Graphs and the P* Model
SRL Inference (2-3 classes)
Inference reuse in OOBNs
SRL Learning (2-3 classes)
Parameter estimation in Markov Random Fields
Learning in Probabilistic Relational Models
Learning in Relational Markov Networks
Plus we will cover some of the following topics, depending on student interest, paying particular attention to how to approach these problems from an SRL perspective:
Special Topics (5 classes)
Collaborative Filtering and Recommender Systems
Information Extraction and Integration
Language and Speech Modeling
Malware (virus, worm, etc.) detection
Semantic Web Mining
Social Networks and Link Analysis
Spam and Email filtering
This course does not count as a PhD
Core or MS Comps course. This
course can be used toward PhD coursework as part of the 2 additional
non-core classes required or as MS qualifying coursework.