PhD Proposal: Robust and Interpretable Deep Learning: Understanding the Role of Data and Architecture

Vasu Singla
05.15.2024 10:30 to 12:30

IRB IRB-4109

Current AI systems have achieved remarkable performance on several computer vision benchmark tasks. However, these models are still far from reliable, and their performance degrades even in the presence of small, imperceptible perturbations. These AI systems act as black boxes, and it is unclear how the underlying training data and model architecture affect their performance. In this thesis, we shed more light on this topic by examining how the training data and model biases affect model performance under adversarial and natural test distributions. First, we focus on how the choice of architecture affects robustness to small, imperceptible perturbations. In our first work, we analyze how the choice of activation function impacts the adversarial robustness of classifiers. Second, we show that shift-invariance, a critical property of convolutional neural networks, can lead to greater sensitivity to adversarial attacks. Next, we shift our focus to understanding the impact of data on model performance. We propose a new method to create imperceptible noise which, when added to the training data, causes models' test accuracy to drop to near-random chance. This work investigates how shortcuts learned from the training data can prevent models from generalizing. Finally, we propose a simple mechanism to understand how sensitive a model's predictions are to the training data. We show that, in contrast to prior work that relies on substantial computational power and an ensemble of models, a single self-supervised model can serve as a baseline for how the training data influences model predictions.