Responsible Machine Learning through the Lens of Causal Inference

Amanda Coston
Talk Series: 
03.08.2023 11:00 to 12:00

Machine learning algorithms are widely used for decision-making in societally high-stakes settings, from child welfare and criminal justice to healthcare and consumer lending. Recent history has illuminated numerous examples where these algorithms proved unreliable or inequitable. In this talk I show how causal inference enables us to more reliably evaluate such algorithms' performance and equity implications.

In the first part of the talk, I demonstrate that standard evaluation procedures fail to account for missing data and, as a result, often produce invalid assessments of algorithmic performance. I propose a new evaluation framework that addresses missing data by using counterfactual techniques to estimate unknown outcomes. Using this framework, I propose counterfactual analogues of common predictive performance and algorithmic fairness metrics that are tailored to decision-making settings. I provide double machine learning-style estimators for these metrics that achieve fast rates and asymptotic normality under flexible nonparametric conditions. I present empirical results in the child welfare setting using data from Allegheny County's Department of Human Services.

In the second half of the talk, I propose novel causal inference methods to audit for bias at key decision points in contexts where machine learning algorithms are used. A common challenge is that data about decisions are often observed under outcome-dependent sampling. I develop a counterfactual audit for biased decision-making in settings with outcome-dependent data. Using data from the Stanford Open Policing Project, I demonstrate how this method can identify racial bias in the most common entry point to the criminal justice system: police traffic stops.

To conclude, I situate my work in the broader question of governance in responsible machine learning.
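To make the counterfactual-evaluation idea concrete, the sketch below shows a generic doubly robust (AIPW-style) estimator of a counterfactual mean outcome, the kind of building block that double machine learning-style metric estimators are built from. This is an illustrative toy on synthetic data, not the talk's actual method or data: the variable names (`D` for the historical decision, `pi_hat` for the estimated propensity, `mu0_hat` for the outcome model) and the specific nuisance models are assumptions for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic data: X covariates, D historical decision (1 = intervene),
# Y binary outcome. Outcomes under the counterfactual decision are missing,
# which is the missing-data problem naive evaluations ignore.
n = 5000
X = rng.normal(size=(n, 3))
p = 1 / (1 + np.exp(-X[:, 0]))                  # decisions depend on covariates
D = rng.binomial(1, p)
Y = (X[:, 1] + 0.5 * D + rng.normal(scale=0.5, size=n) > 0).astype(float)

# Two nuisance models (in practice these would be cross-fitted):
# propensity pi(x) = P(D = 1 | x) and outcome regression mu0(x) = E[Y | D = 0, x].
prop = LogisticRegression().fit(X, D)
mu0 = GradientBoostingRegressor().fit(X[D == 0], Y[D == 0])

pi_hat = prop.predict_proba(X)[:, 1]
mu0_hat = mu0.predict(X)

# Doubly robust estimate of E[Y(0)], the mean outcome had no one been
# intervened on: outcome-model prediction plus an inverse-propensity-weighted
# residual correction on the units that actually received D = 0.
psi = mu0_hat + (1 - D) / (1 - pi_hat) * (Y - mu0_hat)
print(psi.mean())
```

The same pseudo-outcome `psi` can be plugged into downstream performance or fairness metrics in place of the (partially unobserved) true outcome; the doubly robust form is what yields fast rates and asymptotic normality even when the nuisance models converge slowly.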