Building More Reliable and Scalable AI Systems with Language Model Programming

Omar Khattab
Talk Series: 
03.04.2024 11:00 to 12:00

It is now easy to build impressive demos with language models (LMs), but turning these demos into reliable systems currently requires brittle combinations of prompting, chaining, and finetuning. In this talk, I present LM programming, a systematic way to address this by defining and improving four layers of the LM stack. I start with how to adapt LMs to search for information most effectively (ColBERT, ColBERTv2, UDAPDR) and how to scale that search to billions of tokens (PLAID). I then discuss the right architectures and supervision strategies (ColBERT-QA, Baleen, Hindsight) for allowing LMs to search for and cite verifiable sources in their responses. This leads to DSPy, a programming model that replaces ad-hoc prompting techniques with composable modules and optimizers that can supervise complex LM programs. Even simple AI systems expressed in DSPy routinely outperform standard hand-crafted prompt pipelines, in some cases while using small LMs. I highlight how ColBERT and DSPy have sparked applications at dozens of leading tech companies, open-source communities, and research labs, and conclude by discussing how DSPy enables a new degree of research modularity, one that stands to let open research again lead the development of AI systems.