SRL2004: Statistical Relational Learning and
its Connections to Other Fields

Structured machine learning problems in natural language processing

Michael Collins
MIT CSAIL/EECS

Many problems in natural language processing involve mapping strings to structured objects such as parse trees, underlying state sequences, or segmentations. This leads to an interesting class of learning problems: how to induce classification functions where the output "labels" have meaningful internal structure, and where the number of possible labels may grow exponentially with the size of the input strings. Probabilistic grammars -- for example hidden Markov models or probabilistic context-free grammars -- are one common approach to this type of problem. In this talk I will describe recent work on alternatives to HMMs and PCFGs, based on generalizations of binary classification algorithms such as boosting, the perceptron algorithm, or large-margin (SVM) methods.
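As an illustration of the kind of generalization the abstract refers to, below is a minimal sketch of a structured perceptron training loop in the spirit of Collins's work. The names phi (a feature map over input/output pairs) and argmax_decode (a dynamic-programming decoder such as Viterbi or CKY) are illustrative assumptions, not components described in the talk itself.

```python
# Minimal sketch of a structured perceptron training loop.
# phi and argmax_decode are assumed, user-supplied components:
#   phi(x, y)          -> sparse feature dict for the pair (x, y)
#   argmax_decode(x, w)-> highest-scoring structured output y under weights w
#                         (e.g. Viterbi for tag sequences, CKY for parse trees)

from collections import defaultdict


def train_structured_perceptron(examples, phi, argmax_decode, epochs=5):
    """Train weights on (x, y) pairs, where y is a structured output
    such as a tag sequence or parse tree."""
    w = defaultdict(float)
    for _ in range(epochs):
        for x, y_gold in examples:
            y_pred = argmax_decode(x, w)
            if y_pred != y_gold:
                # Additive update: reward features of the gold structure,
                # penalize features of the predicted structure.
                for f, v in phi(x, y_gold).items():
                    w[f] += v
                for f, v in phi(x, y_pred).items():
                    w[f] -= v
    return w
```

The key point, relative to binary classification, is that the inner loop calls a decoder over an exponentially large output space rather than comparing two labels; the same pattern underlies the boosting and large-margin variants mentioned above.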