PhD Proposal: Practical Techniques for Leveraging Experts for Sequential Decisions and Predictions

Talk
Kianté Brantley
Time: 04.26.2021 11:00 to 13:00
Location: Remote

Sequential decisions and predictions are common problems in many areas of natural language processing, robotics, and video games. Essentially, an agent interacts with an environment to learn how to solve a particular problem. Research in sequential decisions and predictions has grown, due in part to the success of reinforcement learning. However, this success has come at the cost of algorithms that are very data-inefficient, which makes learning in the real world difficult.

Our primary goal is to make these algorithms more data-efficient using techniques related to imitation learning. Imitation learning is a technique for using expert feedback in sequential decision and prediction problems. Naive imitation learning suffers from a covariate shift problem (i.e., the training distribution differs from the test distribution). We propose methods and ideas to address this issue, as well as other issues that arise in other styles of imitation learning. In particular, we study two types of imitation-learning feedback, ‘interactive’ feedback and ‘demonstration’ feedback, and we propose two ideas to help move towards learning in real-world settings.

For my first proposed work, I plan to study imitation learning problems in the context of multi-player games. Modern multi-player games are an appealing setting for studying imitation learning because of the amount of potential demonstration data available. Reinforcement learning techniques are hard to run in these settings because the reward is often sparse and the state space is usually large. Imitation learning techniques can help, but simply running them out of the box could exaggerate issues already present.
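One such issue is the covariate shift described above. A toy sketch can make it concrete (the corridor environment, expert, and slip disturbance here are invented for illustration, not taken from the proposal): a behavior-cloned policy fails on states the expert never demonstrated.

```python
# A toy illustration of covariate shift in naive imitation learning
# (behavior cloning). The corridor environment, expert, and slip
# disturbance are invented for this sketch, not from the proposal.

def expert_action(state):
    """The expert walks right toward the goal at state 5."""
    return 1 if state < 5 else 0

def rollout(policy, start=0, steps=8, slip_at=0):
    """Run a policy; a one-time 'slip' after step slip_at perturbs the state."""
    state, visited = start, []
    for t in range(steps):
        visited.append(state)
        state += policy(state)
        if t == slip_at:              # a disturbance the demos never contain
            state -= 2
    return visited

# 1) Collect expert demonstrations with no disturbance (slip_at=-1).
demo_states = rollout(expert_action, slip_at=-1)
dataset = {s: expert_action(s) for s in demo_states}   # state -> expert action

# 2) Behavior cloning: supervised learning on the demo pairs. Here it simply
#    memorizes the expert's action; off-distribution states fall back to 0.
def cloned_policy(state):
    return dataset.get(state, 0)

# 3) At test time a slip pushes the agent into state -1, which the expert
#    never visited; the cloned policy freezes and never recovers, while
#    the expert itself would walk back on course.
print(rollout(cloned_policy))   # [0, -1, -1, -1, -1, -1, -1, -1]
print(rollout(expert_action))   # [0, -1, 0, 1, 2, 3, 4, 5]
```

This recovery gap is exactly what ‘interactive’ feedback methods exploit: by querying the expert on the states the learner actually visits, the training distribution is made to track the test distribution.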
In this work, we exploit the fact that we are playing a game and that the expert provides good state coverage in the ‘demonstration’ feedback setting, and we propose a new technique to learn from ‘demonstration’ feedback.

Extending beyond multi-player games, we propose to benchmark the current state of imitation learning algorithms. To understand how much progress we have made over the years, we plan to standardize how imitation learning algorithms are compared with each other. In its simplest form, imitation learning is supervised learning, and much of the progress in supervised learning has come from benchmarks. We go a step further by introducing more standard ‘baselines’ besides behavior cloning and by standardizing how behavior cloning is trained. This benchmark will give us an understanding of how recent ideas hold up in various situations, indicating how practical and feasible it would be to run these algorithms in real-world settings.

Examining Committee:

Chair: Dr. Hal Daumé III
Dept rep: Dr. Tom Goldstein
Members: Dr. John Baras, Dr. Philip Resnik, Dr. Geoff Gordon, Dr. Kyunghyun Cho