PhD Defense: Generating Visual Content: From Pixel Orders to Videos and Beyond

Talk
Hanyu Wang
Time: December 17, 2025, 13:00 to 14:30

Visual content generation is a fundamental task in computer vision that enables diverse applications across domains. The high-dimensional nature of visual data makes it particularly challenging to achieve both high quality and precise control in generation tasks. This thesis investigates visual generation across varying levels of abstraction, ranging from fundamental pixel-level ordering to video synthesis, and extending beyond to the unification of perception and creation within large-scale multimodal systems.

We begin by addressing the foundational challenge of sequentially representing visual data through Neural Space-filling Curves, a data-driven approach that learns context-aware pixel orderings optimized for downstream tasks such as LZW compression. We then explore controlled image generation through two complementary approaches: Chop & Learn, a compositional generation framework that enables synthesis of novel object-state combinations, and a multimodal style transfer method that effectively combines guidance from both images and text.

For video generation, we introduce LARP, a novel tokenization approach with a learned autoregressive prior that achieves state-of-the-art performance while maintaining computational efficiency. Finally, we present Bridge, a unified framework that equips pre-trained multimodal large language models (MLLMs) with visual generative capabilities. By using a Mixture-of-Transformers architecture to handle conflicting modalities and a novel semantic-to-pixel discrete representation, Bridge enables high-precision visual understanding and high-fidelity generation within a single model, effectively closing the loop between perception and creation.