PhD Defense: Towards Inverse Rendering with Global Illumination

Talk
Saeed Hadadan
Time: 07.15.2025, 10:30 to 11:30
Location: 

Obtaining 3D representations from real-world observations is a long-standing problem in computer vision and graphics, with important applications in virtual and augmented reality. Inverse rendering is a technique for inferring 3D information, such as geometry, materials, and lighting, from a set of 2D images based on image formation principles. To recover scene parameters, it uses iterative gradient descent to minimize a loss between the input images and images produced by a differentiable renderer. Physically-based inverse rendering, in particular, emphasizes accurate modeling of light transport physics, which can be prohibitively expensive. For example, inter-reflections between objects in a scene, known as global illumination, require simulating multi-bounce light transport. Differentiating this process across millions of pixels and parameters over many light bounces can exhaust memory, especially with automatic differentiation, which stores a large transcript during the forward pass and differentiates through it in the backward pass. Consequently, many works in the literature simplify the problem by limiting light transport to one or two bounces, often resulting in inaccurate reconstructions.

This doctoral thesis focuses on improving the efficiency of global illumination algorithms using neural networks, with the ultimate goal of making real-world inverse rendering more accessible. In particular, we present Neural Radiosity, a method that finds the solution of the rendering equation with a neural network by minimizing the residual of the rendering equation. We integrate Neural Radiosity into an inverse rendering pipeline and introduce a radiometric prior as a regularization term alongside the photometric loss. Because inverse rendering requires differentiating the rendering algorithm, we further apply the idea of Neural Radiosity to solving the differential rendering equation. Finally, by coupling inverse rendering with generative AI, we present a method for synthesizing 3D assets: we use an image diffusion model to generate realistic material details on renderings of a scene and backpropagate the new details into the scene description using inverse rendering. To achieve multi-view consistency with an image model, we propose biasing its attention mechanism without retraining the model. Together, our contributions advance the state of the art in global illumination for inverse rendering, showing that this previously prohibitive goal is more attainable with neural methods. The thesis also demonstrates the potential of combining inverse rendering with generative AI for 3D content creation.
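
As a rough illustration of the Neural Radiosity objective described in the abstract, the rendering equation can be written with a transport operator, and a network representing outgoing radiance is trained to drive the equation's residual to zero. The notation below (the operator T, the residual r_theta, and the expectation over sampled surface points and directions) is illustrative shorthand, not necessarily the thesis's exact formulation.

```latex
% Rendering equation: outgoing radiance = emitted radiance + reflected incident radiance.
% L(x, \omega_o): outgoing radiance at surface point x in direction \omega_o,
% E: emitted radiance, f: BRDF, x'(x, \omega_i): surface point visible from x along \omega_i.
\[
L(x, \omega_o) = E(x, \omega_o)
  + \int_{\mathcal{H}^2} f(x, \omega_i, \omega_o)\,
    L\big(x'(x, \omega_i), -\omega_i\big)\, |\cos\theta_i|\, \mathrm{d}\omega_i
  = E(x, \omega_o) + (T L)(x, \omega_o).
\]

% Neural Radiosity (as sketched here): represent L by a network L_\theta and minimize
% the norm of the rendering-equation residual over sampled points and directions.
\[
r_\theta(x, \omega_o) = L_\theta(x, \omega_o) - E(x, \omega_o) - (T L_\theta)(x, \omega_o),
\qquad
\min_\theta \; \mathbb{E}_{(x,\, \omega_o)} \big[\, r_\theta(x, \omega_o)^2 \,\big].
\]
```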
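
Similarly, the inverse rendering loop with a radiometric prior alongside the photometric loss can be pictured as an ordinary gradient-descent loop over scene parameters. The following is a minimal, self-contained sketch under strong simplifying assumptions: the toy renderer and toy residual are placeholders introduced for illustration, not the thesis's actual pipeline or any real renderer's API.

```python
import torch

# Hypothetical sketch: gradient-descent inverse rendering with a photometric loss
# plus a radiometric regularizer. `toy_render` and `toy_radiometric_residual` are
# trivial stand-ins for a differentiable renderer and a rendering-equation residual.

def toy_render(albedo: torch.Tensor, light: torch.Tensor) -> torch.Tensor:
    # Stand-in for a differentiable renderer: image = albedo * incident light.
    return albedo * light

def toy_radiometric_residual(albedo: torch.Tensor) -> torch.Tensor:
    # Stand-in for a rendering-equation residual; here it only penalizes
    # physically implausible albedo values outside [0, 1].
    return torch.mean((albedo - albedo.clamp(0.0, 1.0)) ** 2)

def inverse_rendering(target: torch.Tensor, light: torch.Tensor,
                      lambda_rad: float = 0.1, steps: int = 500) -> torch.Tensor:
    albedo = torch.full_like(target, 0.5).requires_grad_()   # initial scene guess
    optimizer = torch.optim.Adam([albedo], lr=1e-2)
    for _ in range(steps):
        optimizer.zero_grad()
        rendered = toy_render(albedo, light)
        loss_photo = torch.mean((rendered - target) ** 2)    # photometric loss
        loss_rad = toy_radiometric_residual(albedo)          # radiometric prior
        loss = loss_photo + lambda_rad * loss_rad
        loss.backward()
        optimizer.step()
    return albedo.detach()

# Usage: recover a per-pixel albedo from a single "photograph" under known lighting.
if __name__ == "__main__":
    true_albedo = torch.rand(64, 64)
    light = torch.full((64, 64), 0.8)
    target = toy_render(true_albedo, light)
    recovered = inverse_rendering(target, light)
    print("mean absolute error:", torch.mean(torch.abs(recovered - true_albedo)).item())
```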