PhD Defense: Pruning for Efficient Deep Learning: From CNNs to Generative Models

Talk
Alireza Ganjdanesh
Time: 01.17.2025 13:00 to 15:00
Location: IRB IRB-4109

Deep learning models have shown remarkable success in visual recognition and generative modeling tasks in computer vision over the last decade. A general trend is that their performance improves as the size of their training data, their model capacity, and the number of training iterations on modern hardware increase. However, larger models naturally have higher computational complexity and a larger memory footprint, and therefore require high-end hardware for deployment. We develop model pruning methods to improve the inference efficiency of deep learning models for visual recognition and generative modeling applications. We tailor each method to the unique characteristics of the model and its task.

In the first part, I present model pruning methods for Convolutional Neural Network (CNN) classifiers. We start by proposing a pruning method that leverages interpretations of a pre-trained model's decisions to prune its redundant structures. Then, we develop a framework for the simultaneous pretraining and pruning of CNNs, which combines the first two stages of the pretrain-prune-finetune pipeline commonly used in model pruning and thereby reduces the pipeline's complexity.

In the second part, I discuss model pruning methods for visual generative models. First, we present a pruning method for conditional Generative Adversarial Networks (GANs) in which we prune the generator and discriminator models in a collaborative manner. We then address the inference efficiency of diffusion models by proposing a method that prunes a pre-trained diffusion model into a mixture of efficient experts, each handling a separate part of the denoising process. Finally, we develop an adaptive prompt-tailored pruning method for modern text-to-image diffusion models. It prunes a pre-trained model such as Stable Diffusion into a mixture of efficient experts, each of which specializes in certain types of input prompts.