PhD Proposal: Towards Principled AI-Agents with Decentralized and Asymmetric Information

Talk
Xiangyu Liu
Time: 06.02.2025, 12:00 to 14:00

Abstract:
AI models have been increasingly deployed to build "Autonomous Agents" for decision-making, with prominent applications including playing Go and video games, robotics, autonomous driving, healthcare, and human assistance. Most such success stories naturally involve multiple AI agents interacting dynamically with each other and with humans. More importantly, these agents often operate with asymmetric information in practice, both across different agents and across the training and testing phases. In this thesis, we aim to lay the theoretical foundations for principled AI agents operating under asymmetric and decentralized information.

First, we focus on Reinforcement Learning (RL) agents in multi-agent environments with partially observable and decentralized information. To circumvent known hardness results and the use of computationally intractable oracles, we advocate leveraging the potential information sharing among agents. We first establish several computational complexity results to justify the necessity of information sharing, as well as the observability assumption. Since planning in the ground-truth model remains inefficient, we then propose to further approximate the shared common information to construct an approximate model of the partially observable stochastic game (POSG), in which planning an approximate equilibrium is quasi-efficient under the aforementioned assumptions (see the first sketch below). Building on this, we develop a partially observable multi-agent RL algorithm that is both statistically and computationally quasi-efficient.

Second, we focus on RL agents in partially observable Markov decision processes (POMDPs) when privileged information is available during training, a common practice in robot learning and deep RL. We first revisit two major empirical paradigms, expert distillation (a.k.a. teacher-student learning) and asymmetric actor-critic (see the second sketch below), and demonstrate their pitfalls in finding near-optimal policies. We then develop a new principled algorithm with polynomial sample complexity and (quasi-)polynomial computational complexity, revealing the provable benefits of such privileged information.

Finally, we examine Large Language Model (LLM) agents, which use an LLM as the main controller for decision-making, and aim to understand and enhance their decision-making capability in canonical decentralized and multi-agent scenarios. In particular, we use the metric of regret, commonly studied in online learning and RL (see the third sketch below), to probe the limits of LLM agents' in-context decision-making through controlled experiments. Motivated by the observed pitfalls of existing LLM agents, we also propose a new fine-tuning loss that promotes no-regret behavior, both provably and experimentally.
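To make the common-information idea in the first part concrete, here is a minimal, hypothetical sketch, not the thesis's algorithm: agents act through a fictitious coordinator that conditions on the shared information, and the shared history is compressed to its most recent k entries to keep the coordinator's effective state space small. The class names and the one-step-delayed sharing pattern are illustrative assumptions.

```python
# A minimal, hypothetical sketch of acting via (approximate) common
# information -- illustrative names only, not the thesis's algorithm.
from collections import deque
import random

class TruncatedCommonInfo:
    """Compress the shared history to the most recent k jointly
    observed signals (a finite-memory approximation)."""
    def __init__(self, k):
        self.buffer = deque(maxlen=k)

    def update(self, joint_obs):
        self.buffer.append(joint_obs)

    def state(self):
        return tuple(self.buffer)

class Coordinator:
    """A fictitious coordinator that maps the approximate common
    information to a prescription: a rule each agent applies to its
    own private observation to select an action."""
    def __init__(self, actions):
        self.actions = actions
        self.table = {}  # common-info state -> prescription

    def prescription(self, common_state):
        # Unseen common-info states default to a uniform-random rule;
        # a planner would instead optimize these entries.
        if common_state not in self.table:
            self.table[common_state] = lambda private_obs: random.choice(self.actions)
        return self.table[common_state]

# Usage with one-step-delayed sharing: both agents apply the shared
# prescription to their private observations, then broadcast them.
common, coord = TruncatedCommonInfo(k=3), Coordinator(actions=[0, 1])
private_obs = (0, 1)
rule = coord.prescription(common.state())
actions = [rule(o) for o in private_obs]
common.update(private_obs)  # observations become common knowledge next step
```

Truncating the common information this way trades off model accuracy against the size of the coordinator's state space, which is what makes planning in the approximate model quasi-efficient under the abstract's observability assumptions.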
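The asymmetric actor-critic paradigm revisited in the second part can be summarized in a few lines: during training, the critic consumes the privileged simulator state, while the actor sees only the partial observation it will have at deployment. The sketch below is a generic, hypothetical instance in PyTorch; the network sizes and the one-step TD update are arbitrary choices, not the thesis's construction.

```python
# A generic, hypothetical asymmetric actor-critic step in PyTorch:
# the critic is trained on the privileged simulator state, while the
# actor conditions only on the partial observation available at test
# time. Sizes and the one-step TD target are arbitrary choices.
import torch
import torch.nn as nn

obs_dim, state_dim, act_dim = 8, 16, 4
actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))
critic = nn.Sequential(nn.Linear(state_dim, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam([*actor.parameters(), *critic.parameters()], lr=3e-4)

def update(obs, state, action, reward, next_state, done, gamma=0.99):
    """One update; note the asymmetry: critic(state) vs. actor(obs)."""
    with torch.no_grad():
        target = reward + gamma * (1.0 - done) * critic(next_state).squeeze(-1)
    value = critic(state).squeeze(-1)
    critic_loss = (target - value).pow(2).mean()
    logp = torch.distributions.Categorical(logits=actor(obs)).log_prob(action)
    actor_loss = -(logp * (target - value).detach()).mean()
    opt.zero_grad()
    (critic_loss + actor_loss).backward()
    opt.step()
```

Expert distillation is even simpler to sketch: a privileged teacher policy is trained on the state, and a student that sees only observations is trained to imitate it. The talk demonstrates when both paradigms fail to find near-optimal policies under partial observability.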
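Finally, the regret metric used to evaluate LLM agents in the third part is easy to state: the cumulative loss of the actions actually chosen, minus the loss of the best fixed action in hindsight. Below is a toy computation with made-up losses and a random stand-in for the agent's choices; in the controlled experiments, the choices would instead come from prompting an LLM.

```python
# Toy illustration of (external) regret: cumulative loss of the
# chosen actions minus that of the best fixed action in hindsight.
# The loss table and the random "agent" are placeholders.
import numpy as np

rng = np.random.default_rng(0)
T, n_actions = 100, 3
losses = rng.uniform(size=(T, n_actions))  # loss of each action per round
chosen = rng.integers(n_actions, size=T)   # stand-in for an LLM agent's picks

agent_loss = losses[np.arange(T), chosen].sum()
best_fixed = losses.sum(axis=0).min()      # best single action in hindsight
regret = agent_loss - best_fixed
print(f"regret after {T} rounds: {regret:.2f}")
# "No-regret" behavior means regret grows sublinearly in T (regret/T -> 0).
```

An agent exhibits no-regret behavior when this quantity grows sublinearly in the horizon T; the fine-tuning loss proposed in the thesis is designed to push models toward that regime.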