PhD Proposal: DENSE 3D RECONSTRUCTIONS FROM SPARSE VISUAL DATA

Talk
Tao Hu
Time: 08.12.2021, 09:00 to 11:00
Location: Remote

3D reconstruction, the problem of estimating the complete geometry or appearance of objects from partial observations, is a building block of many vision and robotics applications such as 3D scanning, autonomous driving, and 3D modeling. However, 3D reconstruction from sparse visual data (e.g., sparse RGB images or depth maps, or partial shapes) is challenging due to occlusions and the irregularity and complexity of 3D shapes. In addition, 3D shapes admit various representations, including explicit representations (e.g., volumetric grids, point clouds, meshes, and multi-view images) and implicit surface representations, yet no single representation works well for all applications. My research goal is to develop effective 3D representations that enable 3D reconstruction of the world from visual data, improving both the efficiency and the efficacy of 3D understanding and modeling. “Dense” 3D reconstruction here refers to dense shape generation or free-viewpoint rendering.

In this proposal, I will present my research projects on multi-view shape representation and its applications in shape completion and reconstruction, specifically: 1) multi-view representation for 3D shape completion, 2) consistency in multi-view representation, and 3) dense geometry and texture reconstruction from single RGB images. 3D shape completion is the task of perceiving the complete geometry of objects from partial observations, e.g., sparse depth maps or partial point clouds, and it is widely used in 3D perception and autonomous driving. I will first analyze the advantages of the multi-view representation for shape completion and propose a pipeline that generates dense, high-resolution point clouds. One problem with the multi-view representation, however, is inconsistency among different views. To address it, I will introduce a shape memory mechanism and a multi-view consistency optimization, which encourage consistency at inference time without taking any ground truth as supervision. Third, I will discuss extending the multi-view representation to dense 3D geometry and texture reconstruction from single RGB images.

Besides explicit 3D representations, I will also introduce implicit texture representations and neural rendering techniques for view synthesis of dynamic humans. Neural rendering is a class of deep image and video generation approaches that combine generative machine learning with physical knowledge from computer graphics to obtain controllable outputs. As an example, I will present EgoRenderer, a proposed system built on a wearable, egocentric fisheye camera, which combines 3D egocentric pose estimation, implicit texture reconstruction, and neural rendering to render free-viewpoint neural avatars of a person.

Finally, I will summarize the completed work and present my ongoing projects and future work.
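To make the multi-view completion idea concrete, the following is a minimal sketch under assumed conventions, not the thesis implementation: a partial point cloud is rendered to orthographic depth maps from several viewpoints, each view is completed, and the completed views are fused back into one dense point cloud. Here `complete_view` is a hypothetical placeholder for the learned per-view completion network.

```python
# Minimal view-render-complete-fuse sketch (assumed conventions).
import numpy as np

def render_depth(points, rot, res=64, span=2.0):
    """Orthographic depth map of world-frame `points` under rotation `rot`."""
    cam = points @ rot.T                             # world -> camera frame
    u = ((cam[:, 0] / span + 0.5) * res).astype(int)
    v = ((cam[:, 1] / span + 0.5) * res).astype(int)
    depth = np.full((res, res), np.inf)
    ok = (u >= 0) & (u < res) & (v >= 0) & (v < res)
    for x, y, z in zip(u[ok], v[ok], cam[ok, 2]):
        depth[y, x] = min(depth[y, x], z)            # keep the nearest surface
    return depth

def backproject(depth, rot, span=2.0):
    """Lift the finite pixels of a depth map back to world-frame points."""
    res = depth.shape[0]
    ys, xs = np.nonzero(np.isfinite(depth))
    cam = np.stack([(xs / res - 0.5) * span,
                    (ys / res - 0.5) * span,
                    depth[ys, xs]], axis=1)
    return cam @ rot                                 # camera -> world frame

def complete_view(depth):
    """Hypothetical per-view completion network (identity placeholder)."""
    return depth

views = [np.eye(3),                                              # front
         np.array([[0., 0., 1.], [0., 1., 0.], [-1., 0., 0.]]),  # side
         np.array([[1., 0., 0.], [0., 0., 1.], [0., -1., 0.]])]  # top
partial = np.random.rand(500, 3) - 0.5               # stand-in partial shape
dense = np.concatenate([backproject(complete_view(render_depth(partial, R)), R)
                        for R in views])
print(dense.shape)                                   # fused dense point cloud
```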
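The inference-time consistency idea can be sketched in the same spirit. The snippet below is an assumed illustration of the general mechanism, not the proposed shape memory mechanism itself: points lifted from one predicted orthographic depth map are reprojected into a second view and penalized when they disagree with that view's own prediction. No ground truth appears in the loss, so the predictions themselves can be refined by gradient descent.

```python
# Sketch of an inference-time multi-view consistency objective (assumed).
import torch

def consistency_loss(depth_a, depth_b, rot_ab, span=2.0):
    """Disagreement of view B's depth with points lifted from view A.

    depth_a, depth_b: (res, res) predicted depth maps.
    rot_ab: (3, 3) rotation taking camera-A coordinates to camera-B.
    """
    res = depth_a.shape[0]
    ys, xs = torch.meshgrid(torch.arange(res), torch.arange(res),
                            indexing="ij")
    pts_a = torch.stack([(xs / res - 0.5) * span,
                         (ys / res - 0.5) * span,
                         depth_a], dim=-1).reshape(-1, 3)
    pts_b = pts_a @ rot_ab.T                         # express points in view B
    # Nearest-pixel lookup; out-of-frame points are clamped to the border.
    u = ((pts_b[:, 0] / span + 0.5) * res).long().clamp(0, res - 1)
    v = ((pts_b[:, 1] / span + 0.5) * res).long().clamp(0, res - 1)
    return ((pts_b[:, 2] - depth_b[v, u]) ** 2).mean()

# Inference-time refinement: nudge both predictions toward agreement.
depth_a = torch.rand(32, 32, requires_grad=True)     # stand-in predictions
depth_b = torch.rand(32, 32, requires_grad=True)
rot_ab = torch.tensor([[0., 0., 1.], [0., 1., 0.], [-1., 0., 0.]])
opt = torch.optim.Adam([depth_a, depth_b], lr=1e-2)
for _ in range(100):
    opt.zero_grad()
    loss = (consistency_loss(depth_a, depth_b, rot_ab)
            + consistency_loss(depth_b, depth_a, rot_ab.T))
    loss.backward()
    opt.step()
print(float(loss))
```

In practice such a term would be combined with the network's predictions and regularizers so the views do not collapse to trivial agreement; the sketch only shows the mechanics of a supervision-free consistency objective.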
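Finally, the general idea behind an implicit texture representation can be shown in a few lines: a small MLP maps a continuous surface (UV) coordinate to an RGB value, so the texture can be queried at arbitrary resolution. This is a generic illustration only; EgoRenderer's actual texture model and its conditioning are not reproduced here.

```python
# Generic implicit texture sketch: MLP from UV coordinates to RGB.
import torch
import torch.nn as nn

class ImplicitTexture(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid())      # RGB in [0, 1]

    def forward(self, uv):                           # uv: (N, 2) in [0, 1]^2
        return self.net(uv)

tex = ImplicitTexture()
uv = torch.rand(1024, 2)                             # arbitrary query points
print(tex(uv).shape)                                 # -> torch.Size([1024, 3])
```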

Examining Committee:
Chair: Dr. Matthias Zwicker
Department Representative: Dr. Marine Carpuat
Members: Dr. Abhinav Shrivastava