PhD Defense: Towards Trustworthy AI: Methods for Enhancing Robustness and Attribution
IRB-5137 or https://umd.zoom.us/my/vsingla
While deep learning achieves remarkable success in computer vision, its vulnerability to adversarial and poisoning attacks and the difficulty of interpreting its predictions pose significant barriers to trustworthy AI. This talk focuses on enhancing two crucial aspects: robustness and attribution. In the first part, we examine adversarial robustness, revealing how the geometry of activation functions shapes adversarial training outcomes and demonstrating, both theoretically and empirically, that the shift-invariance of CNNs can degrade robustness. We also introduce a potent new data-poisoning technique ("autoregressive perturbations") designed to resist standard defenses. In the second part, we address attribution, presenting a computationally efficient, self-supervised method for understanding how training data influences model predictions. We then tackle memorization in generative AI, quantifying data replication in text-to-image diffusion models and introducing effective mitigation strategies that preserve output quality. These contributions provide key insights and practical tools for building more robust, interpretable, and trustworthy AI.