PhD Proposal: Interpretable Deep Learning for Time Series Prediction and Forecasting

Aya Abdelsalam Ismail
04.01.2020 15:00 to 17:00

Time series data emerges in applications across many domains including neuroscience, medicine, finance, economics, and meteorology. Deep learning has revolutionized many areas of machine learning, including natural language processing and computer vision; however, its application to time series data has been limited. In this work, we investigate both the interpretability and the accuracy of deep neural networks when applied to time series data.

We start by analyzing saliency-based interpretability for Recurrent Neural Networks (RNNs). We show that RNN saliency vanishes over time, biasing the detection of salient features toward later time steps, so RNN saliency maps cannot reliably detect important features at arbitrary time intervals. To address this, we propose a novel RNN cell structure (input-cell attention), which can extend any RNN cell architecture. At each time step, instead of only looking at the current input vector, input-cell attention uses a fixed-size matrix embedding, with each row of the matrix attending to different inputs from current or previous time steps. We show that the saliency map produced by the input-cell attention RNN is able to faithfully detect important features regardless of their occurrence in time.

Next, we create an evaluation framework based on time series data for interpretability methods and neural architectures. We propose and report multiple metrics that empirically measure how well a given saliency method detects feature importance over time. We apply our framework to different saliency-based methods, including Gradient, Input X Gradient, Integrated Gradients, SHAP values, DeepLIFT, DeepLIFT with SHAP, and SmoothGrad, across diverse models, including LSTMs, LSTMs with Input-Cell Attention, Temporal Convolutional Networks, and Transformers. We find that the choice of architecture has a stronger effect on saliency quality than the choice of saliency method.

In addition to interpretability, we explore the challenges of long-horizon forecasting using RNNs. We show that the performance of these methods decays as the forecasting horizon extends beyond a few time steps. We then propose expectation-biasing, an approach motivated by Dynamic Belief Networks, as a solution to improve long-horizon forecasting using RNNs.

In light of our findings, we propose creating a benchmark that compares the performance of neural architectures as the forecasting horizon changes, and investigating the effect the forecasting horizon has on model interpretability. Finally, we set out to build inherently interpretable neural architectures, specifically transformers, that allow saliency methods to faithfully capture feature importance across time.

Examining Committee:

Chair: Dr. Héctor Corrada Bravo
Co-Chair: Dr. Soheil Feizi
Dept rep: Dr. Marine Carpuat
Members: Dr. Thomas Goldstein