PhD Defense: Augmenting Communication in Extended Reality through Multimodal Systems for Humans and Agents
IRB-3137 https://umd.zoom.us/my/dmanocha
Extended Reality (XR) introduces a new design space for communication, enabling rich, multimodal interactions among humans and between humans and intelligent agents. A critical challenge, however, lies in integrating modalities such as gaze, gesture, and speech so that they align with a user's attentional state and communicative intent. Without this alignment, systems often create what this thesis defines as “attentional friction”, a misalignment between an interface’s demands and a user’s cognitive state. This friction appears as poorly timed information delivery, ambiguous social cues, and disruptive agent interventions, which collectively increase cognitive load and hinder effective communication.
This dissertation addresses this challenge through a three-phase investigation that designs, builds, and evaluates multimodal systems supporting human and agent communication of increasing complexity. The first phase explores attention in knowledge work and demonstrates that systems responsive to gaze and task context improve reading efficiency, decrease perceived workload, and enhance social presence and collaborator awareness in both single-user and dyadic procedural tasks. The second phase extends these findings to the complex dynamics of human–human communication in immersive meetings. This phase contributes systems that achieve faster user response times to new speakers, increase conversation satisfaction, reduce attentional re-engagement time after disruptions, and improve both social presence and information recall. The final phase focuses on agentic communication systems, presenting a tool that reduces presenter effort while enhancing audience connection in online presentations, as well as a framework for proactive AR agents that minimize perceived interaction effort by aligning their behavior with the user's attentional and social context.
By uniting the theoretical concept of attentional friction with the empirical evaluation of multimodal systems across a progression of contexts, from individual to collaborative to agent-based, this dissertation provides a validated pathway toward more effective, expressive, and human-centered communication in XR.