Understanding Implicit Content in Language and Language Models
IRB 4105
Human communication hinges on drawing implicit meaning from text, going beyond what is explicitly stated. This thesis investigates the role of implicit content in language: the propositions, beliefs, attitudes, and assumptions that remain unstated but are important for understanding communicative intent. Across three interconnected chapters, we pursue two linked objectives. The first is to demonstrate that recognizing and surfacing implicit content, using LLMs as a validated tool, improves construct measurement. The second turns the focus to LLMs themselves, asking whether they adequately handle implicit content when acting as conversational partners.
First, we examine implicit content in individual utterances. We extend pragmatic inference to include plausible inferences obtained from domain knowledge, going beyond classical presupposition and implicature to cover abductive and deductive inferences. We show how this enhanced understanding of communicative intent enables effective measurement of constructs from text in three distinct settings: community-level political opinion, psychological stress, and vaccine hesitancy. Next, we shift to the conversational level, where implicit content involves a participant's beliefs about their partner's beliefs and intentions. Through analysis of collaborative problem-solving dialogues, we show that misalignments in the conversational common ground create measurable friction that can lead to task failure. The same analysis reveals the limitations of current models in tracking implicit belief states when they act as observers of a dialogue.
Turning to models as participants in a cooperative conversation, we find that LLMs cannot reliably weigh partner assertions against their private evidence, agreeing with statements that contradict it. Using probing techniques from mechanistic interpretability, we find that model representations themselves are swayed by contradictory partner assertions. We then study these failures in LLMs designed for real-world deployment. In a maternal health chatbot, identifying implicit false assumptions in questions enables safer, more complete answers that address both surface queries and underlying misconceptions. In a chatbot designed to combat vaccine hesitancy, we use a novel red-teaming simulation approach to show how false or misleading beliefs held by a user might be perpetuated. Together, our contributions emphasize the need for conversational NLP systems that pay appropriate attention to implicit content in their role as conversational partners.