AI Alignment through Participation and Evaluation: Promises and Pitfalls

Talk

Michael Feffer

Talk Series:

Visitors

Time:

04.21.2026 11:00 to 12:00

Location:

IRB 4105 or https://umd.zoom.us/j/93666933047?pwd=gWgqOgGbBP6laZclyURdDG2mNdArBt.1

URL:

https://talks.cs.umd.edu/talks/4591

Researchers have been studying whether AI systems reflect stakeholders’ values and developing strategies for alignment when systems do not. In this talk, I will illustrate how two existing approaches to AI alignment, participation and evaluation (of and by relevant stakeholders), fall short of achieving their purported goals. First, I will discuss how 1) participation is poorly understood and operationalized by AI researchers and practitioners, and 2) existing participatory mechanisms are insufficient to guarantee alignment. Next, I will show how red-teaming, one general evaluation approach proposed to analyze AI alignment, is an ill-defined process with highly variable inputs and outputs. I will conclude by previewing my ongoing and future research agenda, including empirical study of the impact of red-teaming design choices (e.g., instructions for human versus automated evaluation approaches) on evaluation outcomes, with the aim of developing more robust AI evaluation methods that empower stakeholders.