PhD Proposal: Computational Photography in Challenging Conditions via Physical Cues and Generative Priors

Talk
Mingyang Xie
Time: 08.26.2025, 14:00 to 15:30

Computational photography uses algorithms to recover clean images or videos from degraded inputs. This task becomes particularly challenging in real-world environments, such as imaging through scattering media (e.g., fog) or reflective surfaces (e.g., glass), where complex light transport corrupts camera measurements. Most existing methods attempt to learn a direct mapping from degraded inputs to clean outputs, but such problems are often severely ill-posed, which limits performance.
This proposal focuses on two complementary strategies: (1) introducing physical cues, such as active illumination or optical modulation, to improve fidelity and reduce ambiguity; and (2) incorporating generative priors to plausibly complete missing details when physical cues alone are insufficient.
To demonstrate these strategies, this proposal presents three of my works. WaveMo learns to modulate light wavefronts for imaging through scattering media. Flash-Splat performs 3D reflection removal by combining flash cues with 3D Gaussian Splatting. Flash-Split performs 2D reflection removal with a latent diffusion model.