PhD Proposal: Fine-Grained Control for Video Editing

Talk
Yao-Chih Lee
Time: 04.10.2026, 12:40–14:00

Video editing is essential for transforming raw footage into structured visual narratives. Unlike controlled studio productions, however, real-world videos exhibit a complex interplay between dynamic scene motion and unconstrained camera trajectories. This complexity hinders advanced tasks such as precise viewpoint synthesis and object-level manipulation. Moreover, while modern generative models offer powerful text-to-video synthesis, high-level textual prompts lack the precision required for fine-grained control, preventing users from achieving specific, predictable results.
In this thesis, we address challenging video editing tasks by introducing methodologies for fine-grained controllability. First, we present a layer-decomposition framework that isolates objects together with their associated effects, such as shadows and reflections, enabling versatile scene manipulation. Second, we develop an efficient novel-view-synthesis representation that accelerates both training and rendering for monocular videos. Finally, we establish 3D point tracks as a unified representation for generative motion editing, enabling simultaneous control over camera trajectories and scene-object movements. Collectively, these works provide a technical foundation for 3D-aware, controllable editing in unconstrained, real-world environments.