PhD Proposal: Reinforcement Learning under Unseen/Adversarial Inputs: Learning to Adapt and Learning to Robustify

Yanchao Sun
12.15.2021 10:00 to 12:00

IRB 5105

Reinforcement learning (RL), especially deep RL, has recently achieved remarkable success in many applications, including autonomous driving, gaming, and healthcare. However, RL algorithms are known to be sample-hungry, so adapting to unseen states or novel tasks is usually hard and inefficient. Moreover, deep neural networks are known to be vulnerable to adversarial attacks, making it risky to apply deep-network-based RL techniques to high-stakes problems. Motivated by these two challenges, in this proposal we introduce several algorithms to achieve (1) effective and efficient knowledge transfer for fast adaptation, and (2) vulnerability-aware robust training.

Regarding the adaptation challenge in RL, we first focus on tabular RL and propose two provably efficient transfer learning algorithms. Both algorithms discover and extract similarities among state-action pairs, either within a task or across multiple tasks, and provably improve the sample efficiency and computational efficiency of existing model-based RL methods. With the insights built on tabular RL problems, we then move on to the regime of deep RL and investigate a novel adaptation problem: transferring knowledge across drastically different observation spaces. We propose a theory-inspired transfer algorithm that, for the first time, achieves transfer learning from a vector-input environment to a pixel-input environment.

To address the robustness challenge in RL, we start by studying the vulnerability of deep RL algorithms against both training-time attacks and test-time attacks. We propose the first poisoning (training-time) attack algorithm for deep RL, and introduce a novel metric to measure the vulnerability of deep RL methods during training. We also propose a novel evasion (test-time) attack algorithm that efficiently achieves state-of-the-art attack performance in a wide range of environments.
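The abstract does not spell out the algorithms themselves, so purely as an illustration of the two ingredients it names (a test-time evasion attack on observations, and adversarial training against such an attack), here is a minimal numpy sketch for a hypothetical linear Q-network. The functions `attack` and `adversarial_td_step`, and every parameter choice below, are assumptions made for exposition, not the proposal's actual methods.

```python
import numpy as np

def attack(W, s, eps):
    """FGSM-style L-inf evasion attack on a linear Q-network Q(s) = W @ s:
    perturb the observation s within an eps-ball to push the greedy action
    toward the runner-up action.  For a linear Q the margin gradient is exact."""
    order = np.argsort(W @ s)
    a_best, a_runner = order[-1], order[-2]
    grad = W[a_runner] - W[a_best]          # d(Q[runner] - Q[best]) / ds
    return s + eps * np.sign(grad)          # worst-case step inside the eps-ball

def adversarial_td_step(W, s, a, r, s2, eps, alpha=0.05, gamma=0.9):
    """One TD(0) update evaluated at the adversarially perturbed observation,
    so the learned Q stays consistent under worst-case input noise."""
    s_adv = attack(W, s, eps)
    target = r + gamma * (W @ s2).max()     # bootstrap from the clean next state
    td_err = target - (W @ s_adv)[a]
    W[a] += alpha * td_err * s_adv          # descent on 0.5 * td_err**2 w.r.t. W[a]
    return W
```

Because the toy Q-network is linear, the sign-gradient step is the exact worst case in the L-infinity ball and no automatic differentiation is needed; deep RL attacks replace the closed-form gradient with backpropagation through the network.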
As the old saying goes, "if you know yourself and your enemy, you'll never lose a battle." Equipped with a deeper understanding of the vulnerability of RL algorithms, we then proceed to robustify existing deep RL methods via adversarial training. We introduce a novel vulnerability-aware adversarial training algorithm that continually evaluates and improves its worst-case performance together with its normal performance. Our algorithm achieves better robustness with less computation and fewer samples than state-of-the-art robust RL methods.

Examining Committee:

Chair: Dr. Furong Huang
Department Representative: Dr. Dinesh Manocha
Members: Dr. Hal Daumé, Dr. Soheil Feizi