Understanding and Generating Human Language

Talk
Wei Xu
Time: 02.24.2020 11:00 to 12:00
Location: IRB 4105

Human language is notoriously complex because people can express the same meaning in a multitude of ways. In this talk, I will present two lines of work on machine learning methods that understand the varied expressions in human language and generate paraphrases for applications such as reading and writing assistive technology. In the first part, I will show how to design learning and ranking models for natural language generation, including a new metric that has been widely adopted as both a learning objective and an evaluation method. In the second part, I will present new datasets and a class of pairwise models for learning textual expressions that convey the same meaning. In contrast to previous work, we focus on extracting paraphrases at a much larger scale and across a much broader range by developing more robust models and by leveraging social media data and crowdsourcing. I will also briefly discuss how my work connects to computational social science, language and code, and human language instructions.