I want to create learning agents that combine visual and linguistic reasoning to act intelligently in realistic environments.

Robust Baselines for EmbodiedQA

(Ongoing work)

Advised by Prof. Abhinav Shrivastava. Developing robust visual baselines for Embodied Question Answering datasets (e.g., EQA, IQUAD).

Embodied QA tasks, and the agents designed to complete them, provide a compelling mixture of vision, language, and action. But when an agent performs well at EQA, what has it really learned? This project proposes a new agent design that can better quantify the difficulty of these tasks.

Learning to Grasp with Synthetic Examples

(Undergraduate thesis)

Machine Learning for Robotic Grasping. Advised by Prof. Kostas Daniilidis (UPenn). Proposed a then-novel application of deep-learning-based object pose detection to robotic grasp synthesis. The thesis also includes an extensive review of the existing literature on supervised learning for grasp synthesis.

Code from that project, which creates synthetic training images using the Gazebo simulator, is available here.