Eadom Dessalene

I am a PhD Candidate in Computer Science at the University of Maryland, College Park, where I work with the Perception and Robotics Group and am advised by Yiannis Aloimonos.

I study how robots can learn from human videos and how visual representations learned from passive observation can be improved through embodied physical interaction. Broadly, I am interested in embodied action understanding, robot learning, egocentric vision, and multimodal perception.

News & Talks

Jan 2026 Invited talk: Embodied Action Understanding, NASA JPL Vision Seminar, NASA Jet Propulsion Laboratory
Dec 2025 Invited talk: Generative Models of Action, NVIDIA Research Radar, NVIDIA
Nov 2024 Invited talk: Understanding Actions from Video, NYC Computer Vision Day
Sep 2024 Invited talk: Learning the Organization of Action, University of Maryland, Baltimore County
Jun 2024 Invited talk: Learning the Organization of Action, Telluride Neuromorphic Workshop
May 2023 Invited talk: Understanding Actions from Video, CoRL Cognitive Science Workshop

Selected Publications

FEEL (Force-Enhanced Egocentric Learning): A Dataset for Physical Action Understanding
Eadom Dessalene, Botao He, Michael Maynord, Yonatan Tussa, Pavan Mantripragada, Yianni Karabatis, Nirupam Roy, Yiannis Aloimonos
arXiv 2026
FEEL is the first large-scale egocentric dataset pairing video with synchronized force measurements from custom piezoresistive gloves. The dataset contains approximately 3 million force-synchronized frames of natural, unscripted kitchen manipulation and introduces force as a physically grounded supervisory signal for contact understanding and action representation learning.
EmbodiSwap for Zero-Shot Robot Imitation Learning
Eadom Dessalene, Pavan Mantripragada, Michael Maynord, Yiannis Aloimonos
arXiv 2025
We introduce EmbodiSwap, a method for producing photorealistic synthetic robot overlays on human video. The approach helps bridge the embodiment gap between in-the-wild egocentric human videos and a target robot embodiment, enabling zero-shot imitation learning for robot manipulation.
Context in Human Action through Motion Complementarity
Eadom Dessalene, Michael Maynord, Cornelia Fermüller, Yiannis Aloimonos
WACV 2024
We propose a learning framework in which context is modeled as the complement of motion. Physical movement is represented through Therbligs, while context is captured using a contrastive mutual-information-based objective.
LEAP: LLM-Generation of Egocentric Action Programs
Eadom Dessalene, Michael Maynord, Cornelia Fermüller, Yiannis Aloimonos
arXiv 2023
LEAP generates video-grounded action programs composed of sub-actions, conditions, and control flows. It uses large language models to combine program knowledge with multimodal evidence from egocentric videos.
Therbligs in Action: Video Understanding through Motion Primitives
Eadom Dessalene, Michael Maynord, Cornelia Fermüller, Yiannis Aloimonos
CVPR 2023
We introduce a compositional and hierarchical framework for action understanding based on Therbligs as motion primitives, along with differentiable rule-based reasoning for logical consistency.
Mid-Vision Feedback
Michael Maynord, Eadom T. Dessalene, Cornelia Fermüller, Yiannis Aloimonos
ICLR 2023
We introduce Mid-Vision Feedback, a mechanism that biases mid-level network representations using high-level categorical expectations, improving contextual consistency and recognition performance.
Forecasting Action through Contact Representations from First-Person Video
Eadom Dessalene, Chinmaya Devaraj, Michael Maynord, Cornelia Fermüller, Yiannis Aloimonos
PAMI 2021
We develop contact-centered representations and models for first-person video, motivated by the role of hand-object contact in the structure and anticipation of human action.
Using Geometric Features to Represent Near-Contact Behavior in Robotic Grasping
Eadom Dessalene, Yi Herng Ong, John Morrow, Ravi Balasubramanian, Cindy Grimm
ICRA 2019
We define hand-object geometric feature representations for robotic grasping at the near-contact stage, designed to be robust to noise and morphology differences and suitable for direct use in machine learning.

Service

Reviewer: ICRA, ICLR, CVPR, PAMI, WACV

Competitions

EPIC-Kitchens Action Recognition Challenge 2024 — 4th Place
Alexa Prize SimBot Challenge — Team Lead, University of Maryland (Qualified for Semi-Finals)
EPIC-Kitchens Action Anticipation Challenge 2020 — 1st Place
Amazon Robotics Challenge — Qualified for Finals