PhD Proposal: Speech-Driven Immersive Analytics: GenAI-grounded Sensemaking Perspective
[Remote] https://umd.zoom.us/my/varshney
As generative AI facilitates human–analytics system interaction, there is growing interest in natural interaction within immersive analytics. Natural interaction entails nuanced communication of intent between humans and the system during data analysis, across various input and output modalities. Speech can serve as the central modality in this communication loop, using its unique properties to enable more natural analytic interaction and to complement other embodied modalities. However, fundamental studies of speech as the primary interaction modality are lacking. Existing work approaches speech from a relatively narrow perspective, either treating it as merely auxiliary in multimodal interaction or relying primarily on text-based modalities instead. This dissertation addresses the gap by defining Speech-driven Immersive Analytics and examining it through three perspectives.
Speech-to-Intent aims to understand the most fundamental factors influencing the communication loop between humans and immersive analytics systems through two studies: EmbodiedNLI (IEEE VIS 2025), which uncovers users' speech patterns and the degree of embodiment reliance expressed in speech; and SIA (ACM Intelligent User Interfaces 2026), which introduces a Speech-driven Immersive Analytics framework. SIA focuses on the local interaction context in the immediate future, guiding users, especially novices, toward their next actions during the foraging phase.
Attention-to-Intent dives deeper into the data reasoning phase to tackle challenges related to working memory load and the loss of reasoning context, targeting more experienced users in particular. This study addresses these challenges by focusing on the local interaction context in the immediate past, capturing both implicit and explicit signals from users' recent actions that reveal previously unnoticed insights during the data reasoning phase.
Context-to-Intent widens the lens to include long-term and situational context alongside these local interaction contexts, moving toward human–AI co-analysis. We also discuss future directions for this research.