Making and measuring progress in long-form language processing

Talk
Mohit Iyyer
Time: 03.26.2024, 13:00 to 14:00

Recent advances in large language models (LLMs) have enabled them to process texts that are millions of words long, fueling demand for long-form language processing tasks such as the summarization or translation of books. However, LLMs struggle to take full advantage of the information within such long contexts, which contributes to factually incorrect and incoherent text generation. In this talk, I first demonstrate an issue that plagues even modern LLMs: their tendency to assign high probability to implausible long-form continuations of their input. I then describe a contrastive sequence-level ranking model that mitigates this problem at decoding time and can also be adapted to the RLHF alignment paradigm. Next, I consider the growing problem of long-form evaluation: as the inputs and outputs of long-form tasks grow ever longer, how do we even measure progress? I propose a high-level framework, applicable to both human and automatic evaluation, that first decomposes a long-form text into simpler atomic units and then evaluates each unit on a specific aspect. I demonstrate the framework's effectiveness at evaluating factuality and coherence on tasks such as biography generation and book summarization. Finally, I discuss my future research vision, which aims to build collaborative, multilingual, and secure long-form language processing systems.
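To make the decompose-then-evaluate idea concrete, the sketch below shows one minimal way such a framework could be wired up: split a long-form output into atomic units, judge each unit on a single aspect, and aggregate. All names here (`decompose`, `judge`, `evaluate`) are illustrative assumptions, not the speaker's actual implementation; a real system would use a much finer-grained decomposition and a human annotator or LLM judge rather than a toy rule.

```python
def decompose(text: str) -> list[str]:
    """Split long-form output into atomic units.

    Naive sentence split as a stand-in; the framework described in the
    talk would decompose into finer-grained atomic claims.
    """
    return [s.strip() for s in text.split(".") if s.strip()]


def judge(unit: str) -> bool:
    """Placeholder verifier scoring one unit on one aspect (e.g. factuality).

    Toy keyword rule purely for illustration; in practice this would be
    a human annotator or an automatic (LLM-based) judge.
    """
    return "Paris" in unit


def evaluate(text: str) -> float:
    """Aggregate per-unit judgments into a single score:
    the fraction of atomic units judged acceptable."""
    units = decompose(text)
    return sum(judge(u) for u in units) / len(units) if units else 0.0


score = evaluate("Paris is the capital of France. The moon is made of cheese.")
```

Here the first unit passes the toy judge and the second fails, so `score` is 0.5; the key design point is that evaluation quality is controlled by the decomposition granularity and the per-unit judge, not by any whole-text comparison.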