I am an AI safety researcher focused on reasoning, scalable oversight, alignment, self-improvement, and AI control. My work addresses critical safety challenges, including identifying and mitigating reward hacking in large reasoning models (LRMs), safe process supervision, weak-to-strong generalization with multiple LLMs, and using interpretability to reduce hallucinations. I also have a background in agentic world modeling, multi-agent reinforcement learning (MARL), and robustness, with deep expertise across LLMs, diffusion LLMs, and vision-language models (VLMs). I am highly motivated to conduct research that addresses agentic misalignment, improves capabilities, and reduces catastrophic risks from frontier models.