Neuro-Symbolic Embodied AI For Visual Intelligence

Talk
Chuang Gan
Time: 04.11.2022 14:00 to 15:00

Intelligence is characterized by the ability to understand and reason about the world around us. While deep learning has excelled at pattern recognition tasks such as image classification and object recognition, it falls short of deriving the true understanding necessary for complex reasoning and physical interaction. In this talk, I will introduce a framework, neuro-symbolic embodied AI, for bringing intelligence to immersive media. This framework aims to reduce the gap between machine and human intelligence in terms of data efficiency, flexibility, and generalization. My approach combines the ability of neural networks to extract patterns from data, symbolic programs to represent and reason from prior knowledge, and physics engines for inference and planning. Together, they form the basis for enabling machines that can effectively reason about underlying objects and their associated dynamics, as well as master new skills efficiently and flexibly. I will conclude my talk by introducing the ThreeDWorld platform, a multi-modal interactive world simulator, and highlighting its potential for promoting research and education in immersive media design.