PhD Defense: Sequential Decisions and Predictions in Natural Language Processing

Talk
He He
Time: 05.27.2016 13:00 to 15:00
Location: AVW 3450

In recent years, natural language processing has achieved great success in a wide range of applications, producing both commercial language services and open-source language tools. However, most methods take a static or batch approach, assuming that the model has all the information it needs before making a one-time prediction. In this dissertation, we study dynamic problems where the input arrives in a sequence instead of all at once, and the output must be produced while the input is still arriving, based only on partial information. This dynamic setting appears in many real-time, interactive applications. These problems often involve a trade-off between the amount of input received (cost) and the quality of the output prediction (accuracy). Evaluation therefore considers both objectives (e.g., by using a Pareto curve).
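As a minimal illustration of the cost-accuracy evaluation mentioned above, the sketch below computes the Pareto frontier of a set of (cost, accuracy) operating points. The points and the function name are invented for illustration and are not drawn from the dissertation.

```python
# Hypothetical illustration: finding the Pareto-optimal (cost, accuracy)
# operating points of a dynamic model. Lower cost and higher accuracy
# are both better; the data points below are invented.

def pareto_frontier(points):
    """Return the points not dominated by any other point.

    A point (cost, accuracy) is dominated if some other point has
    cost <= its cost and accuracy >= its accuracy, and is strictly
    better in at least one of the two objectives.
    """
    frontier = []
    for c, a in points:
        dominated = any(
            (c2 <= c and a2 >= a) and (c2 < c or a2 > a)
            for c2, a2 in points
        )
        if not dominated:
            frontier.append((c, a))
    return sorted(frontier)

points = [(1.0, 0.60), (2.0, 0.75), (2.5, 0.72), (4.0, 0.90), (5.0, 0.88)]
print(pareto_frontier(points))  # → [(1.0, 0.6), (2.0, 0.75), (4.0, 0.9)]
```

The frontier keeps only the operating points where spending more cost actually buys more accuracy; a learned dynamic policy is judged by how far out this curve sits.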
Our goal is to develop a formal understanding of sequential prediction and decision-making problems in natural language processing and to propose efficient solutions. Towards this end, we present meta-algorithms that take an existing batch model and produce a dynamic model that handles sequential inputs and outputs. We build our framework upon the theory of Markov Decision Processes (MDPs), which allows learning to trade off competing objectives in a principled manner. The main machine learning techniques we use are from imitation learning and reinforcement learning, and we advance current techniques to tackle problems arising in our setting. We evaluate our algorithms on a variety of applications, including dependency parsing, machine translation, and question answering, and show that our learning-based approach achieves a better cost-accuracy trade-off than heuristic-based approaches.
We first propose a general framework for cost-sensitive prediction, where different parts of the input come at different costs. We formulate a decision-making process that selects pieces of the input sequentially, adapting to each instance. Our approach is evaluated on both standard classification tasks and a structured prediction task (dependency parsing), and we show that it achieves prediction quality similar to methods that use all the input, at a much lower cost. Next, we extend the framework to problems where the input is revealed incrementally in a fixed order. Two applications are studied: simultaneous machine translation and quiz bowl (incremental text classification). We discuss challenges in this setting and show that adding domain knowledge eases the decision-making problem. A central theme throughout the chapters is an MDP formulation of a challenging problem with sequential input/output and trade-off decisions, accompanied by a learning algorithm that solves the MDP.
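To make the MDP view concrete, here is a toy sketch (not the dissertation's actual algorithm): the state is the partial input seen so far, the actions are WAIT (acquire the next piece, paying a cost) or COMMIT (output a prediction), and the objective trades off total cost against accuracy. The confidence model and threshold policy below are hypothetical stand-ins for a learned policy.

```python
# Toy MDP-style episode for incremental prediction.
# State: the input pieces observed so far.
# Actions: WAIT (read one more piece, paying cost_per_piece) or COMMIT.

def run_episode(pieces, confidence, threshold, cost_per_piece):
    """Consume pieces until the (hypothetical) confidence estimate
    reaches the threshold, then commit to a prediction.

    Returns (num_pieces_used, total_cost)."""
    total_cost = 0.0
    for t in range(1, len(pieces) + 1):
        total_cost += cost_per_piece          # pay for acquiring this piece
        if confidence(pieces[:t]) >= threshold:
            return t, total_cost              # COMMIT early
    return len(pieces), total_cost            # forced to commit at the end

# Hypothetical confidence model: grows with the fraction of input seen.
conf = lambda seen: len(seen) / 5.0
print(run_episode(["w1", "w2", "w3", "w4", "w5"], conf, 0.6, 1.0))  # → (3, 3.0)
```

A fixed-threshold policy like this is the kind of heuristic baseline the abstract contrasts with; the learning-based approach instead trains the WAIT/COMMIT policy (e.g., by imitation or reinforcement learning) so the stopping decision adapts per instance.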
Examining committee:
Chair: Dr. Hal Daumé III
Dean’s rep: Dr. Philip Resnik
Members: Dr. Jordan Boyd-Graber
Dr. Ramani Duraiswami
Dr. Joelle Pineau