PhD Defense: An efficient neural representation for videos

Hao Chen
04.03.2023 17:00 to 19:00

IRB 4105

As the popularity of videos increases, it becomes crucial to find efficient and compact ways of representing them to facilitate storage, transmission, and downstream video tasks. During his Ph.D., Hao introduced NeRV, a neural representation for videos in which each video is stored implicitly as a neural network. Building on NeRV, he proposed a hybrid neural representation for videos (HNeRV) with improved internal generalization and representation capacity. HNeRV allows for highly efficient video representation and compression, with model sizes up to 1000 times smaller than the original raw video. Besides efficiency, HNeRV's simple decoding process, a single feedforward operation, enables fast video loading and easy deployment. Based on this, he developed an efficient neural video dataloader (NVLoader) that is 3 to 6 times faster than conventional video dataloaders. To address encoding speed, he introduced the HyperNeRV framework, which uses a hypernetwork to map input videos directly to NeRV model weights, speeding up encoding by a factor of 10^4. Beyond developing compact, implicit neural video representations, he explores several applications built on them, such as frame interpolation, video restoration, and video editing. Moreover, the compactness of these representations makes them well suited either as an output video format, where they significantly reduce the search space, or as an efficient input for video understanding models.

Examining Committee


Dr. Abhinav Shrivastava

Dean's Representative:

Dr. Behtash Babadi


Dr. Furong Huang

Dr. Ramani Duraiswami

Dr. Saining Xie (New York University)