PhD Proposal: Interpretability of Deep Models Across Different Architectures and Modalities
The interpretability of deep models has long been an active area of research. In particular, model inversion seeks to recover a model's perception of a target class. Model inversion is essential for visualizing and interpreting behaviors inside neural architectures, understanding what models have learned, and explaining model behaviors. However, existing model-inversion techniques typically rely on hard-to-tune regularizers, such as total variation or feature regularization, which must be calibrated individually for each network to produce adequate images. We introduce Plug-In Inversion, which relies on a simple set of augmentations and does not require excessive hyperparameter tuning. We illustrate the practicality of our approach by inverting Vision Transformers (ViTs) and Multi-Layer Perceptrons (MLPs).

While feature visualizations and image reconstructions have provided a looking glass into the workings of CNNs, these methods have seen less success in understanding ViT representations, which are difficult to visualize. However, we show that, when properly applied to the correct representations, feature visualization can indeed succeed on ViTs. This insight allows us to visually explore ViTs and the information they glean from images.

For image-based tasks, networks have been studied using feature visualization, which produces interpretable images that maximally stimulate the response of each feature map individually. Visualization methods help us understand and interpret what networks "see." In particular, they elucidate the layer-dependent semantic meaning of features, with shallow features representing edges and deep features representing objects. While this approach has been quite effective for vision models, our understanding of networks that process auditory inputs, such as automatic speech recognition (ASR) models, is more limited because their inputs are not visual. To this end, we consider methods to sonify, rather than visualize, their feature maps.
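The inversion and feature-visualization methods above share a common core: gradient ascent on the input to maximize a chosen model output (a class logit or a feature-map activation). The following is a minimal sketch of that idea on a toy linear "model" in NumPy; the model, dimensions, and hyperparameters are illustrative assumptions, not the actual setup, and the real methods (including Plug-In Inversion's augmentations, which are omitted here) operate on deep networks.

```python
import numpy as np

# Toy "network": one linear layer mapping a flat 64-dim "image" to 10 class
# logits. This stands in for a deep model purely for illustration.
rng = np.random.default_rng(0)
W = rng.standard_normal((10, 64))

def logits(x):
    return W @ x

# Model inversion by gradient ascent: synthesize an input that maximizes the
# target class's logit. For this linear toy, d logits(x)[target] / dx = W[target].
# (Plug-In Inversion would additionally apply augmentations such as jitter and
# flips at each step; those are omitted for brevity.)
target = 3
x = np.zeros(64)
lr = 0.1
for _ in range(100):
    x += lr * W[target]          # ascend the target logit
    x = np.clip(x, -1.0, 1.0)    # keep the synthetic "image" in a valid range

# The optimized input is now classified as the target class.
print(int(np.argmax(logits(x))))
```

The same loop becomes feature visualization if the objective is a hidden feature map's mean activation rather than a class logit; in deep networks the gradient is obtained by backpropagation instead of in closed form.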
Dr. Tom Goldstein
Dr. Abhinav Shrivastava
Dr. Soheil Feizi