PhD Proposal: Visual Content Synthesis at Scale
IRB-3137
The visual content humans can synthesize has always been in a symbiotic, continually evolving relationship with technological development: Impressionism developed alongside synthetic pigments, filmmaking began with the zoopraxiscope, and video games grew with computer-generated imagery. Along the way, an enormous amount of visual data has accumulated, from paintings to web images and videos; the availability of such data is unprecedented in human history. Generative models are among the most effective ways to harness these data. By training scalable models on large volumes of visual data, we can synthesize high-quality visual content. Combined with various controllability tools, such models empower individuals to create the artistic content they desire by instructing with natural language or operating intuitive user interfaces, without the skill training previously required. In this thesis proposal, we explore scalable generative model architectures and their training, analyze evaluation metrics and training data, and study their applications across various domains and tasks.
Examining Committee
Chair:
Dr. David Jacobs
Department Representative:
Dr. Tom Goldstein
Members:
Dr. Jia-Bin Huang