Computer Science Major Sander Schulhoff’s Paper Accepted into EMNLP 2023
Large language models (LLMs)—a type of artificial intelligence (AI) algorithm—are used to power various applications from chatbots to writing assistants. Yet, these models face increasing security risks from prompt hacking—a process where models are coerced into abandoning their intended tasks in favor of potentially harmful instructions.
University of Maryland computer science major Sander Schulhoff will present a research paper on this issue at the Empirical Methods in Natural Language Processing (EMNLP) 2023 conference, scheduled for December 6 to 10, 2023, in Singapore. His paper, titled “Ignore This Title and HackAPrompt: Exposing Systemic Vulnerabilities of LLMs through a Global-Scale Prompt Hacking Competition,” addresses critical security concerns in deploying and using LLMs.
Schulhoff's paper is the result of an AI security competition he organized in 2023 called HackAPrompt, which was backed by industry giants including OpenAI and HuggingFace and drew over 600,000 adversarial prompts from over 3,000 hackers.
“Our competition was no ordinary competition,” Schulhoff said. “Funded by major players in the AI industry and carried out in collaboration with the University of Maryland, MILA and Towards AI, we aimed to expose the potential weaknesses of LLMs by challenging an international pool of hackers to outsmart AI.”
Participants were tasked with manipulating state-of-the-art LLMs such as GPT-3 and ChatGPT into discarding their original instructions and, instead, following malicious ones. Their reward for successfully outsmarting AI was a share of a $37,500 prize pot.
Schulhoff's paper is the only undergraduate-led research project from his group led by UMD Computer Science Associate Professor Jordan Boyd-Graber to be featured at EMNLP 2023.
“It is very exciting to have been accepted,” Schulhoff said. “This is my first paper accepted as the first author at a conference. After nine months of working on the project, I am glad it paid off.”
Schulhoff also noted the societal importance of his team’s work.
“This paper is expected to stimulate further discussion on prompt hacking and contribute to the standardization of terminology,” he said. “From a non-research standpoint, we've already witnessed numerous challenges and related competitions emerging where one of our HackAPrompt winners presented their findings.”
In addition to conducting research, Schulhoff serves as CEO of the startup, LearnPrompting, which was recently selected as one of the four teams for this year's Mokhtarzada Hatchery student startup accelerator program. LearnPrompting is an open-source AI guide that aims to enhance AI literacy to speed up AI adoption. With over a million users and recognition from OpenAI, the platform creates a tailored content library for businesses and individuals.
Story by Samuel Malede Zewdu, CS Communications
Co-authors of the research paper include:
- Svetlina Anati (Researcher, Technical University of Sofia)
- Jordan Boyd-Graber (Associate Professor at the Department of Computer Science, UMD)
- Louis-François Bouchard (Researcher, Mila and Towards AI)
- Christopher Carnahan (Researcher, University of Arizona)
- Anaum Khan (Undergraduate Researcher, UMD)
- Anson Liu Kost (Researcher, NYU)
- Jeremy Pinto (Researcher, MILA)
- Chenglei Si (Ph.D. Student, Stanford)
- Valen Tagliabue (Researcher, University of Milan)
The Department welcomes comments, suggestions and corrections. Send email to editor [-at-] cs [dot] umd [dot] edu.