Amr Sharaf

4108 Brendan Iribe Center, University of Maryland, MD, 20742 · amr@cs.umd.edu

I'm a PhD candidate in the Computational Linguistics and Information Processing (CLIP) Lab at the University of Maryland, advised by Hal Daumé III. My research focuses on developing interactive learning algorithms in the context of structured prediction for AI and NLP. I'm interested in applying imitation learning algorithms for structured prediction problems in weakly supervised settings.

Publications

Meta-Learning for Contextual Bandit Exploration

Amr Sharaf, Hal Daumé III

We describe MELEE, a meta-learning algorithm for learning a good exploration policy in the interactive contextual bandit setting. Here, an algorithm must take actions based on contexts, and learn based only on a reward signal from the action taken, thereby generating an exploration/exploitation trade-off. MELEE addresses this trade-off by learning a good exploration strategy for offline tasks based on synthetic data, on which it can simulate the contextual bandit setting. Based on these simulations, MELEE uses an imitation learning strategy to learn a good exploration policy that can then be applied to true contextual bandit tasks at test time. We compare MELEE to seven strong baseline contextual bandit algorithms on a set of three hundred real-world datasets, on which it outperforms alternatives in most settings, especially when differences in rewards are large. Finally, we demonstrate the importance of having a rich feature representation for learning how to explore.

January 2019

Cross-Lingual Approaches to Reference Resolution in Dialogue Systems

Amr Sharaf, Arpit Gupta, Hancheng Ge, Chetan Naik, Lambert Mathias

In the slot-filling paradigm, where a user can refer back to slots in the context during the conversation, the goal of the contextual understanding system is to resolve the referring expressions to the appropriate slots in the context. In this paper, we build on the context carryover system, which provides a scalable multi-domain framework for resolving references. However, scaling this approach across languages is not a trivial task, due to the large demand on acquisition of annotated data in the target language. Our main focus is on cross-lingual methods for reference resolution as a way to alleviate the need for annotated data in the target language. In the cross-lingual setup, we assume there is access to annotated resources as well as a well trained model in the source language and little to no annotated data in the target language. In this paper, we explore three different approaches for cross-lingual transfer delexicalization as data augmentation, multilingual embeddings and machine translation. We compare these approaches both on a low resource setting as well as a large resource setting. Our experiments show that multilingual embeddings and delexicalization via data augmentation have a significant impact in the low resource setting, but the gains diminish as the amount of available data in the target language increases. Furthermore, when combined with machine translation we can get performance very close to actual live data in the target language, with only 25% of the data projected into the target language.

December 2018

Residual Loss Prediction: Reinforcement Learning with no Incremental Feedback

Hal Daumé III, John Langford and Amr Sharaf

We consider reinforcement learning and bandit structured prediction problems with very sparse loss feedback: only at the end of an episode. We introduce a novel algorithm, RESIDUAL LOSS PREDICTION (RESLOPE), that solves such problems by automatically learning an internal representation of a denser reward function. RESLOPE operates as a reduction to contextual bandits, using its learned loss representation to solve the credit assignment problem, and a contextual bandit oracle to trade-off exploration and exploitation. RESLOPE enjoys a no-regret reductionstyle theoretical guarantee and outperforms state of the art reinforcement learning algorithms in both MDP environments and bandit structured prediction settings.

May 2018

Structured Prediction via Learning to Search under Bandit Feedback

Amr Sharaf and Hal Daumé III

We present an algorithm for structured prediction under online bandit feedback. The learner repeatedly predicts a sequence of actions, generating a structured output. It then observes feedback for that output and no others. We consider two cases: a pure bandit setting in which it only observes a loss, and more fine-grained feedback in which it observes a loss for every action. We find that the fine-grained feedback is necessary for strong empirical performance, because it allows for a robust variance-reduction strategy. We empirically compare a number of different algorithms and exploration methods and show the efficacy of BLS on sequence labeling and dependency parsing tasks.

September 2017

The UMD Neural Machine Translation Systems at WMT17 Bandit Learning Task

Amr Sharaf, Shi Feng, Khanh Nguyen, Kiante Brantley, Hal Daumé III

We describe the University of Maryland machine translation systems submitted to the WMT17 German-English Bandit Learning Task. The task is to adapt a translation system to a new domain, using only bandit feedback: the system receives a German sentence to translate, produces an English sentence, and only gets a scalar score as feedback. Targeting these two challenges (adaptation and bandit learning), we built a standard neural machine translation system and extended it in two ways: (1) robust reinforcement learning techniques to learn effectively from the bandit feedback, and (2) domain adaptation using data selection from a large corpus of parallel data.

September 2017

Visual Comparison of Images Using Multiple Kernel Learning for Ranking

Amr Sharaf, Mohamed E. Hussein, and Mohamed A. Ismail.

Ranking is the central problem for many applications such as web search, recommendation systems, and visual comparison of images. In this paper, the multiple kernel learning framework is generalized for the learning to rank problem. This approach extends the existing learning to rank algorithms by considering multiple kernel learning and consequently improves their effectiveness. The proposed approach provides the convenience of fusing different features for describing the underlying data. As an application to our approach, the problem of visual image comparison is studied. Several visual features are used for describing the images and multiple kernel learning is adopted to find an optimal feature fusion. Experimental results on three challenging datasets show that our approach outperforms the state-of-the art and is significantly more efficient in runtime.

August 2015

Real-time Multi-scale Action Detection From 3D Skeleton Data

Amr Sharaf, Marwan Torki, Mohamed E. Hussein, and Motaz El-Saban

In this paper we introduce a real-time system for action detection. The system uses a small set of robust features extracted from 3D skeleton data. Features are effectively described based on the probability distribution of skeleton data. The descriptor computes a pyramid of sample covariance matrices and mean vectors to encode the relationship between the features. For handling the intra-class variations of actions, such as action temporal scale variations, the descriptor is computed using different window scales for each action. Discriminative elements of the descriptor are mined using feature selection. The system achieves ac- curate detection results on difficult unsegmented sequences. Experiments on MSRC-12 and G3D datasets show that the proposed system outperforms the state-of-the-art in detection accuracy with very low latency.

January 2015

Education

University of Maryland, College Park

PhD Student, Computer Science
Master of Science (MSc), Computer Science
August 2015 - Present

Alexandria University

Master of Science (MSc), Computer Science
Bachelor of Science (BSc), Computer Science
September 2012 - August 2015