PhD Proposal: Towards Robust and Accurate Neural Networks

Talk
Chen Zhu
Time: 04.27.2021, 10:00 to 12:00
Location: Remote

Recent advances in deep learning have demonstrated the seemingly unlimited effectiveness of scaling up training data and model size. New challenges emerge as datasets and models become larger.

First, large but less-inspected datasets are prone to poisoning attacks. To bring attention to such threats, we develop a transferable clean-label targeted poisoning attack on an unknown image classification network, which adds imperceptible perturbations to images, without changing their labels, to cause misclassification of a target image by models trained on the poisoned dataset. We also provide a deep k-NN defense against this type of poisoning attack that removes the poisons without compromising model performance.

Second, deep neural networks are vulnerable to adversarial examples. Adversarial training, which adds adversarial perturbations to the training examples throughout the training process, improves robustness against such adversarial examples, at the cost of accuracy on natural data. Surprisingly, we show that when the adversarial perturbation is added in the embedding space instead, adversarial training can improve natural accuracy for Transformers on language-understanding and vision-and-language tasks, as well as for graph neural networks on both node and graph classification tasks. We develop algorithms that further improve the efficacy of standard PGD-based adversarial training in this setting.

Third, training can be slow or even unstable when the network becomes larger, when we experiment with a new architecture, or when we generalize an existing neural network to a different domain. Fine-tuning the training hyperparameters alone is usually insufficient to resolve such issues. Rather, careful variance analysis is required to derive proper magnitudes for the weights at initialization. Such derivations usually simplify the model and may not be optimal for networks with complicated branching architectures or strong nonlinearities.
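The deep k-NN defense above can be illustrated with a small sketch. This is a hedged simplification, not the actual implementation: the real defense computes neighborhoods in a network's deep feature space, whereas here plain 2-D points stand in for those features, and `knn_filter` and all values are illustrative. The idea: keep a training point only if the plurality label among its k nearest neighbors agrees with its own label, so a mislabeled (poisoned) point sitting inside the wrong cluster gets flagged.

```python
import math
from collections import Counter

def knn_filter(features, labels, k=3):
    """Return indices of training points whose label agrees with the
    plurality label of their k nearest neighbors (points to keep)."""
    keep = []
    for i, (fi, li) in enumerate(zip(features, labels)):
        # distances to every other point, nearest first
        dists = sorted(
            (math.dist(fi, fj), j) for j, fj in enumerate(features) if j != i
        )
        votes = Counter(labels[j] for _, j in dists[:k])
        if votes.most_common(1)[0][0] == li:
            keep.append(i)
    return keep

# Toy data: two clusters; the last point lies in cluster 0 but carries label 1,
# mimicking a clean-label-style poison planted in the wrong region.
feats = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
         (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
         (0.05, 0.05)]
labs = [0, 0, 0, 1, 1, 1, 1]
kept = knn_filter(feats, labs, k=3)
# the suspicious point (index 6) is filtered out; the rest survive
```

The filtered dataset is then used for ordinary retraining; the point is that the defense never needs to know which attack was used.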
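A minimal sketch of PGD in the embedding space, under strong simplifying assumptions: the proposal applies the perturbation to Transformer or GNN embeddings during training, whereas here a fixed embedding and a binary logistic classifier stand in so the loss gradient is analytic; `pgd_embedding` and all parameter values are illustrative, not the proposal's algorithm.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pgd_embedding(x, y, w, eps=0.1, alpha=0.03, steps=10):
    """Signed-gradient ascent on the logistic loss over a perturbation
    delta of the embedding x, projected onto the l-inf ball of radius eps."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        margin = y * w.dot(x + delta)
        grad = -y * w * sigmoid(-margin)      # d(loss)/d(delta)
        delta = np.clip(delta + alpha * np.sign(grad), -eps, eps)
    return delta

rng = np.random.default_rng(0)
x = rng.normal(size=8)   # stand-in for an input embedding
w = rng.normal(size=8)   # stand-in for the classifier weights
y = 1.0
delta = pgd_embedding(x, y, w)

loss_before = np.log1p(np.exp(-y * w.dot(x)))
loss_after = np.log1p(np.exp(-y * w.dot(x + delta)))
# the perturbed embedding incurs a larger loss while staying within the ball
```

During adversarial training, the model would then be updated on the loss at `x + delta`; since the perturbation lives in embedding space, it never has to be a valid image or token sequence, which is what makes this form of training applicable to discrete inputs.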
We develop an efficient learning-based initialization scheme that eliminates such simplifications and remains applicable to any architecture. It accelerates convergence and improves test accuracy across a range of architectures.

Examining Committee:

Chair: Dr. Tom Goldstein
Dept rep: Dr. David Jacobs
Members: Dr. Furong Huang