Ruohan Gao on Multi-Sensory AI

He discusses his path into the field, current projects and the challenges of integrating vision, sound and touch.

October 08, 2025

Ruohan Gao, an assistant professor of computer science at the University of Maryland, has centered his career on understanding the biological role of vision and its relationship to other sensory inputs. Through the Multi-Sensory Machine Intelligence Lab, he integrates sight, sound and touch to design models that replicate how humans and animals perceive the world.

In this interview, he discusses his path into the field, recent research projects, challenges in building multi-sensory systems and the broader societal impacts of his work.

Was there a defining moment that shaped your career path into computer science?

I have always been interested in computing, going back to high school and my undergraduate studies. What really motivated me was vision. From a biological perspective, vision is central to how humans and animals understand the world. It has even been tied to evolutionary developments such as the Cambrian explosion. That curiosity made me want to explore whether machines could perceive and interpret the world in similar ways.

What do you enjoy most about your research area?

My work began with computer vision, but I now focus on connecting vision with other modalities such as audio and touch. Humans rely on multiple senses to understand their environment, and I enjoy exploring how to model that interaction computationally. This work allows me to collaborate across communities and learn different approaches, which I find particularly rewarding.

Can you describe a project you are working on now that reflects this approach?

One project I would highlight is our work on audio-visual room acoustic rendering. The goal is to reconstruct not only what a space looks like but also how it sounds. Different materials and room shapes affect reverberation, which changes how sound is perceived.

Our work links visual information with acoustic properties. For example, a carpet has a distinctive appearance but also alters sound reflections. By using visual cues, we can predict acoustic behavior and create realistic renderings. A paper on this project was accepted for oral presentation at the International Conference on Computer Vision, and we are continuing to extend this work.

What are some of the challenges in multi-sensory research?

A major challenge is hardware. For vision and audio, we have cameras and microphones, but for modalities like touch, there is no unified sensor. Researchers often build custom devices, which makes scaling difficult.

Another challenge is data. Computer vision advanced rapidly because of large datasets such as ImageNet. Multi-sensory research lacks similar resources, so it is harder to train models effectively. Building better sensors and large datasets will be essential for progress.

How does your work contribute to the broader computer science community or to society?

Multi-sensory approaches have many applications. One area we study is echolocation for visually impaired individuals. By using sound reflections, similar to how bats navigate, people can perceive spatial structures without vision. This could be useful for accessibility.

Another project focuses on multi-modal fire investigation training. Training firefighters in real environments is risky, so we are developing embodied platforms that simulate conditions using multiple sensory inputs. These examples show how combining modalities can provide practical benefits.

What inspired you to join the University of Maryland, and what have you enjoyed most so far?

The decision was driven by the strong research environment and the opportunities for collaboration. The department has a well-established computer vision group and faculty working on audio, robotics and artificial intelligence. My research connects with these areas, which makes it a good fit. I have also enjoyed working with the students, who bring strong skills and motivation.

What advice would you give to students interested in your area of research?

The field of artificial intelligence changes quickly, and it is easy to chase trends without a clear direction. My advice is to identify problems that truly interest you. Passion matters because it keeps you motivated through challenges. Once you find an area that excites you, focus on it. That combination of passion and focus is important for making progress in research.

—Story by Samuel Malede Zewdu, CS Communications