PhD Proposal: Fair, Law-informed Distribution of Human-AI Deserts
This proposal's broad theme is to incorporate insights from U.S. law and moral philosophy to audit for, and enable, a fairer distribution of deserts in collaborative human-AI systems. The completed works address the first category of desert, burden: (1) decision makers' burden of potentially suffering more when performing their tasks with AI assistance and explanations than when deciding alone; and (2) decision subjects' burden of bearing unfavorable and potentially erroneous decisions, as normatively regulated by federal and state laws, technical algorithmic fairness criteria, and philosophical debates on how to measure decision subjects' efforts to improve. The proposed work will address the second category of desert, blame and credit: when a human-AI system generates something blameworthy or creditworthy, how do we fairly determine whom to blame or credit (e.g., the AI model developers vs. users) and how to assign that blame or credit (e.g., through civil damages or copyright)?

In the first completed work, we address AI-assisted decision makers' task burden. We characterize what constitutes an explanation that is itself "fair": one that does not adversely impact specific populations. We formulate a novel evaluation method for "fair explanations" that uses not only accuracy and labeling time but also the psychological impact of explanations on different user groups, measured across several metrics (mental discomfort, stereotype activation, and perceived workload). We apply this method to content moderation of potential hate speech and its differential impact on Asian vs. non-Asian proxy moderators, across two explanation approaches (saliency maps and counterfactual explanations). We find that saliency maps generally perform better and show less evidence of disparate impact (group unfairness) and individual unfairness than counterfactual explanations.

In the second completed work, we address the burden on decision subjects caused by unfair AI predictions, as defined by technical and legal fairness concepts. Despite its constitutional relevance, the technical "individual fairness" criterion has not been operationalized in U.S. state or federal statutes or regulations. We conduct a human subjects experiment to address this gap, evaluating which demographic features are relevant to individual fairness evaluation of recidivism risk assessment (RRA) tools. Our analyses conclude that the individual similarity function should consider age and sex but ignore race.

In the third completed work, we extend fairness metrics to account for each decision subject's individual-level effort. We propose a philosophy-informed approach to conceptualize and evaluate Effort-aware Fairness (EaF), grounded in the concept of Force, which represents the temporal trajectory of predictive features coupled with inertia. Beyond the theoretical formulation, our empirical contributions include: (1) a pre-registered human subjects experiment showing that, in both stages of the (individual) fairness evaluation process, people weigh the temporal trajectory of a predictive feature more than its aggregate value; and (2) pipelines to compute Effort-aware Individual/Group Fairness in the criminal justice and personal finance contexts.
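To give a flavor of such a pipeline, below is a minimal, illustrative sketch of an effort-aware group fairness check. It assumes effort is operationalized as the least-squares slope of one predictive feature's trajectory over time and compares outcomes across groups after matching on effort quartiles; the column semantics, the slope-as-effort choice, and the quartile matching scheme are hypothetical placeholders, not the proposal's actual Force formulation.

```python
import numpy as np

def effort_score(trajectory):
    """Per-person effort proxy: least-squares slope of one predictive
    feature observed at T time steps (e.g., yearly debt-to-income ratio).
    An illustrative stand-in for the trajectory component of Force."""
    t = np.arange(len(trajectory))
    slope, _ = np.polyfit(t, trajectory, 1)
    return slope

def effort_aware_group_gap(trajectories, outcomes, groups):
    """Compare model outcomes between two groups after matching on effort.

    trajectories : (N, T) array, one feature trajectory per person
    outcomes     : (N,) array of model scores or decisions
    groups       : (N,) array of group labels (0 or 1)
    Returns the mean outcome gap between groups among people whose effort
    scores fall in the same quartile (a deliberately crude matching scheme).
    """
    efforts = np.apply_along_axis(effort_score, 1, trajectories)
    bins = np.digitize(efforts, np.quantile(efforts, [0.25, 0.5, 0.75]))
    gaps = []
    for q in np.unique(bins):
        mask = bins == q
        g0 = outcomes[mask & (groups == 0)]
        g1 = outcomes[mask & (groups == 1)]
        if len(g0) and len(g1):
            gaps.append(g1.mean() - g0.mean())
    return float(np.mean(gaps)) if gaps else float("nan")
```

Under this sketch, a large gap among individuals in the same effort quartile would flag outcomes that diverge despite comparable improvement efforts, which is the kind of pattern the EaF pipelines are designed to surface.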
Our work may enable AI model auditors to uncover, and potentially correct, unfair decisions against individuals who have invested significant effort to improve but remain stuck with systemic disadvantages outside their control.

In the proposed work, regarding the "blame" aspect of desert, we observe that the recent surge in GenAI adoption by the general population has been accompanied by hundreds of documented incidents of AI-generated harm. To justly compensate victims and deter harmful behavior, we will develop a framework and metrics, informed by law and moral psychology, to determine how much liability each party (user vs. deployer) should bear, guided by three research questions:
RQ1: How can legal liability principles create cost-effective disincentives to GenAI harm?
RQ2: Which blame factors, and which dependencies among them, apply to the human-GenAI context?
RQ3: How can we develop technical metrics for blame factors (e.g., the deployer's or user's causal contribution to, or control over, the harmful AI-generated material)? A sketch of one candidate metric follows this list.
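As one concrete possibility for RQ3, the sketch below quantifies a causality-based blame factor using a simplified variant of the Chockler-Halpern degree of responsibility (1/(1+k), where k is the size of the smallest contingency that makes a party's action a but-for cause). The `harm_occurs` predicate and the encoding of parties' actions are hypothetical; this is not the framework the proposal will develop, only an illustration of what a technical blame-factor metric could look like.

```python
from itertools import combinations

def degree_of_responsibility(actions, party, harm_occurs):
    """Simplified Chockler-Halpern-style responsibility proxy for one party.

    actions     : dict mapping party name -> True if that party took its
                  harm-enabling action (e.g., deployer shipped without a
                  safety filter, user crafted an adversarial prompt)
    party       : the party whose responsibility we score
    harm_occurs : callable(dict) -> bool, a hypothetical model of whether
                  the harmful output is produced under a given action profile
    Returns 1/(1+k), where k is the minimal number of OTHER parties' actions
    that must be flipped before flipping `party`'s action changes the outcome;
    returns 0.0 if no such contingency exists or no harm occurred.
    """
    if not harm_occurs(actions):
        return 0.0
    others = [p for p in actions if p != party]
    for k in range(len(others) + 1):
        for subset in combinations(others, k):
            contingency = dict(actions)
            for p in subset:
                contingency[p] = not contingency[p]
            flipped = dict(contingency)
            flipped[party] = not flipped[party]
            if harm_occurs(contingency) and not harm_occurs(flipped):
                return 1.0 / (1 + k)
    return 0.0

# Illustrative scenario: harm requires the deployer's missing safety filter
# AND either the user's adversarial prompt or a model defect.
def harm_occurs(a):
    return a["deployer_no_filter"] and (a["user_adversarial_prompt"] or a["model_defect"])

profile = {"deployer_no_filter": True, "user_adversarial_prompt": True, "model_defect": True}
print(degree_of_responsibility(profile, "deployer_no_filter", harm_occurs))        # 1.0 (but-for cause)
print(degree_of_responsibility(profile, "user_adversarial_prompt", harm_occurs))   # 0.5 (needs one flip)
```

In this toy scenario the deployer scores higher than the user because the harm could not have occurred without the missing filter, mirroring the intuition that liability should track causal control.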
Regarding the "credit" aspect of desert, we observe that the U.S. Copyright Office, backed by a federal court decision, has recently taken a hard stance against copyright for artworks resulting from human-generative AI collaboration, demoralizing technology-adaptive artists. However, the Office does allow copyright for "substantive" edits (e.g., those made with Photoshop) that humans perform on top of AI-generated work. Because many edits are not separable from the final artwork, we propose to develop and evaluate (via a human study with AI-literate artists) two metrics, based on the "authorship" and "creativity" criteria, to quantify how copyrightable a creative prompt engineering process is.
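To make the proposal concrete, below is a minimal, purely illustrative sketch of how process-level proxies for the two criteria might be computed from a prompt-revision history. The function, the 0.3 novelty threshold, and the mapping from string novelty to "authorship" and "creativity" are hypothetical placeholders, not the metrics the proposal will develop and validate with artists.

```python
from difflib import SequenceMatcher

def prompt_process_scores(prompts):
    """Two toy proxies over an ordered prompt-revision history.

    prompts : list of prompt strings, in the order the artist issued them.
    Returns (authorship_proxy, creativity_proxy), both in [0, 1]:
      * authorship_proxy : mean share of each prompt that is newly written
        relative to the previous prompt (1 - string similarity), a stand-in
        for how much of the process reflects the human's own expression.
      * creativity_proxy : fraction of revisions that introduce substantial
        change (similarity below a threshold), a stand-in for iterative
        creative choices rather than minor rewording.
    """
    if len(prompts) < 2:
        return 0.0, 0.0
    novelty = [
        1 - SequenceMatcher(None, a, b).ratio()
        for a, b in zip(prompts, prompts[1:])
    ]
    authorship_proxy = sum(novelty) / len(novelty)
    creativity_proxy = sum(n > 0.3 for n in novelty) / len(novelty)
    return authorship_proxy, creativity_proxy

history = [
    "a lighthouse at dusk",
    "a lighthouse at dusk, oil-painting style, storm approaching",
    "same scene, but viewed from a rowboat, with a lone figure on the cliff",
]
print(prompt_process_scores(history))
```

The planned human study would test whether proxies of this kind track AI-literate artists' own judgments of authorship and creativity, and how they align with the Copyright Office's "substantive edit" standard.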