Towards Principled Sequential Decision-Making

Talk
Qinghua Liu
Time: 03.28.2024, 11:00 to 12:00

Sequential decision-making studies how intelligent agents ought to make decisions in a dynamic environment to achieve their objectives. Its diverse applications range from robotics and nuclear plasma control to discovering faster matrix multiplication algorithms and fine-tuning large language models (LLMs). In this talk, I will delve into my research on the theoretical foundations of sequential decision-making.

First, I will talk about reinforcement learning with generic nonlinear function approximation, a widely used approach for solving real-world decision-making problems with enormous state spaces. I will demonstrate that the classical Fitted Q-iteration algorithm (the prototype of DQN), combined with the idea of global optimism, is provably sample-efficient across a diverse range of problems.

In the second part, I will focus on partially observable decision-making in the framework of partially observable Markov decision processes (POMDPs), a problem long considered intractable within the theory community due to numerous hardness results. Contrary to this belief, I will reveal a rich class of POMDPs that are of practical interest and can be solved with polynomially many samples using a variant of the classical maximum likelihood estimation algorithm.

Finally, I will turn to multi-agent decision-making in the framework of Markov games, where agents must learn to strategically cooperate or compete. I will introduce a fully decentralized algorithm capable of learning equilibrium strategies with nearly minimax-optimal sample efficiency.
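For readers who want a concrete picture of the starting point of the first part, below is a minimal sketch of classical Fitted Q-iteration. It is not the speaker's algorithm: it omits the global optimism component discussed in the talk, and the toy tabular MDP, the uniform behavior policy, and the one-hot features are illustrative assumptions only.

```python
# A minimal sketch of classical Fitted Q-iteration (the batch prototype of DQN).
# Assumptions for illustration: a hypothetical toy MDP, a uniform behavior policy,
# and one-hot (s, a) features so least-squares regression recovers a tabular Q-table.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy MDP: S states, A actions, random transitions and rewards.
S, A, gamma = 10, 3, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))   # P[s, a] is a distribution over next states
R = rng.uniform(size=(S, A))                 # deterministic rewards for simplicity

# Offline dataset of transitions (s, a, r, s') from a uniform behavior policy.
N = 5000
s = rng.integers(S, size=N)
a = rng.integers(A, size=N)
r = R[s, a]
s_next = np.array([rng.choice(S, p=P[si, ai]) for si, ai in zip(s, a)])

# One-hot features over (s, a) pairs.
def phi(states, actions):
    X = np.zeros((len(states), S * A))
    X[np.arange(len(states)), states * A + actions] = 1.0
    return X

X = phi(s, a)
w = np.zeros(S * A)   # parameters of the current Q estimate

# Fitted Q-iteration: repeatedly regress onto the Bellman backup of the previous iterate.
for _ in range(100):
    Q = w.reshape(S, A)
    targets = r + gamma * Q[s_next].max(axis=1)       # r + gamma * max_a' Q_k(s', a')
    w, *_ = np.linalg.lstsq(X, targets, rcond=None)   # Q_{k+1} = argmin_f ||f(s, a) - target||^2

print("Greedy policy:", w.reshape(S, A).argmax(axis=1))
```

The regression step is exactly what DQN approximates with a neural network and a target network; the talk's contribution concerns what guarantees this style of algorithm can achieve once the function class is generic and nonlinear and optimism is added.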