PhD Defense: Dense 3D Reconstructions from Sparse Visual Data

Talk
Tao Hu
Time:
11.14.2022, 09:30 to 12:30
Location:
IRB 4107

3D reconstruction, the problem of estimating the complete geometry or appearance of objects from partial observations (e.g., several RGB images, partial shapes, videos), serves as a building block in many vision, graphics, and robotics applications such as 3D scanning, autonomous driving, 3D modeling, augmented reality (AR), and virtual reality (VR). However, it is very challenging for machines to recover 3D geometry from such sparse data due to occlusions and the irregularity and complexity of 3D objects. To address these challenges, this dissertation explores learning-based 3D reconstruction methods for different 3D object representations and tasks: 3D reconstruction of static objects and of dynamic human bodies from limited data.

For the 3D reconstruction of static objects, we propose a multi-view representation of 3D shapes, which uses a set of multi-view RGB images or depth maps to represent a 3D shape. We first explore this representation for shape completion and develop deep learning methods that generate dense, high-resolution point clouds from partial observations. One problem with the multi-view representation, however, is inconsistency among different views; to address it, we propose a multi-view consistency optimization strategy that encourages consistent shape completions at inference time. Finally, we present an extension of the multi-view representation to dense 3D geometry and texture reconstruction from single RGB images.

Capturing and rendering realistic human appearances under varying poses and viewpoints is an important goal in computer vision and graphics. In the second part, we introduce techniques for creating 3D virtual human avatars from limited data (e.g., videos). We propose implicit representations of motion, texture, and geometry for human modeling, and use neural rendering techniques for free-view synthesis of dynamic articulated human bodies. Our learned human avatars are photorealistic and fully controllable (pose, shape, viewpoint, etc.) and can be used for free-viewpoint video generation, animation, shape editing, telepresence, and AR/VR.

Our methods learn 3D reconstruction end to end from 2D image or video signals, without explicit 3D supervision. We hope these learning-based methods will help future AI systems perceive and reconstruct the 3D world.
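
As a concrete illustration of the multi-view depth representation described above, the following minimal sketch back-projects a set of predicted depth maps with known camera intrinsics and poses into a single fused point cloud. It is not the dissertation's implementation; all function names and parameters are illustrative.

    import numpy as np

    def backproject_depth(depth, K, cam_to_world):
        """Lift one H x W depth map to world-space 3D points.

        depth        : (H, W) array, 0 where no surface was predicted
        K            : (3, 3) camera intrinsics
        cam_to_world : (4, 4) camera-to-world extrinsics
        """
        H, W = depth.shape
        u, v = np.meshgrid(np.arange(W), np.arange(H))
        valid = depth > 0
        z = depth[valid]
        # Pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy
        x = (u[valid] - K[0, 2]) * z / K[0, 0]
        y = (v[valid] - K[1, 2]) * z / K[1, 1]
        pts_cam = np.stack([x, y, z, np.ones_like(z)])  # (4, N) homogeneous
        return (cam_to_world @ pts_cam)[:3].T           # (N, 3) world points

    def fuse_views(depth_maps, Ks, poses):
        """The union of all back-projected views forms the dense point cloud."""
        return np.concatenate([backproject_depth(d, K, T)
                               for d, K, T in zip(depth_maps, Ks, poses)])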
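
The multi-view consistency optimization can be pictured as penalizing disagreement when points recovered from one predicted view are reprojected into another. The sketch below, reusing backproject_depth from above and again purely illustrative, scores one such view pair; at inference time, a scalar like this could be minimized to encourage consistent completions across views.

    def consistency_error(depth_a, depth_b, K, pose_a, pose_b):
        """Mean depth disagreement when view A's points are seen from view B."""
        pts = backproject_depth(depth_a, K, pose_a)       # world points from A
        world_to_b = np.linalg.inv(pose_b)
        pts_b = world_to_b[:3, :3] @ pts.T + world_to_b[:3, 3:4]
        z = pts_b[2]
        front = z > 0                                     # keep points in front of B
        uv = (K @ (pts_b[:, front] / z[front]))[:2]       # perspective projection
        u, v = np.round(uv).astype(int)
        H, W = depth_b.shape
        inside = (u >= 0) & (u < W) & (v >= 0) & (v < H)  # points landing on B's image
        db = depth_b[v[inside], u[inside]]                # depth B predicted there
        za = z[front][inside]                             # depth implied by A's points
        seen = db > 0
        return float(np.abs(za[seen] - db[seen]).mean()) if seen.any() else 0.0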
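
For the human-avatar part, an implicit representation can be read as a neural field mapping a 3D point plus a pose code to density and color, which neural rendering then integrates along camera rays. The toy PyTorch module below sketches that idea under loose assumptions (pose_dim=69 roughly mimics an SMPL-style pose vector); it is not the architecture presented in the defense.

    import torch
    import torch.nn as nn

    class PoseConditionedField(nn.Module):
        """Illustrative implicit field: (3D point, pose code) -> (density, RGB)."""

        def __init__(self, pose_dim=69, hidden=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(3 + pose_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 4),                # 1 density + 3 color channels
            )

        def forward(self, points, pose):
            # points: (N, 3) query locations; pose: (pose_dim,) conditioning code
            pose = pose.expand(points.shape[0], -1)  # broadcast pose to every point
            h = self.net(torch.cat([points, pose], dim=-1))
            sigma = torch.relu(h[:, :1])             # non-negative volume density
            rgb = torch.sigmoid(h[:, 1:])            # colors in [0, 1]
            return sigma, rgb

Volume rendering would then accumulate sigma and rgb along each camera ray into pixel colors, so supervision can come from 2D images alone, matching the abstract's point about learning without explicit 3D supervision.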

Examining Committee

Chair:

Dr. Matthias Zwicker

Dean's Representative:

Dr. Joseph F. JaJa

Members:

Dr. Marine Carpuat

Dr. Abhinav Shrivastava

Dr. John Aloimonos