Learning Spoken Language Through Vision

Talk
David Harwath
Time: 04.09.2020, 11:00 to 12:00

Humans learn spoken language and visual perception at an early age by being immersed in the world around them. Why can't computers do the same? In this talk, I will describe our work to develop methodologies for grounding continuous speech signals at the raw waveform level to natural image scenes. I will first present self-supervised models capable of jointly discovering spoken words and the visual objects to which they refer, all without conventional annotations in either modality. I will show how the representations learned by these models implicitly capture meaningful linguistic structure directly from the speech signal. Finally, I will demonstrate that these models can be applied across multiple languages, and that the visual domain can function as an "interlingua," enabling the discovery of word-level semantic translations at the waveform level.
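To make the idea concrete, the sketch below shows one common way to set up this kind of self-supervised audio-visual grounding: a dual-encoder model that embeds spoken captions and images into a shared space and is trained with a contrastive objective over matched and mismatched pairs. This is an illustrative assumption about the general approach, not the speaker's exact architecture; all module names, layer sizes, and hyperparameters here are hypothetical.

```python
# Illustrative dual-encoder audio-image grounding model with a contrastive
# (margin-based) objective. Architecture details are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AudioEncoder(nn.Module):
    """Maps a log-mel spectrogram (B, 1, 40, T), computed from the raw
    waveform, to a fixed-dimensional embedding."""
    def __init__(self, embed_dim=512):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=(40, 5), stride=(1, 2)),   # collapse frequency axis
            nn.ReLU(),
            nn.Conv2d(64, embed_dim, kernel_size=(1, 5), stride=(1, 2)),
            nn.ReLU(),
        )

    def forward(self, spec):
        h = self.conv(spec)          # (B, D, 1, T')
        return h.mean(dim=(2, 3))    # pool over time -> (B, D)


class ImageEncoder(nn.Module):
    """Maps an RGB image (B, 3, H, W) to an embedding of the same dimension."""
    def __init__(self, embed_dim=512):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=4), nn.ReLU(),
            nn.Conv2d(64, embed_dim, kernel_size=3, stride=2), nn.ReLU(),
        )

    def forward(self, img):
        h = self.conv(img)           # (B, D, H', W')
        return h.mean(dim=(2, 3))    # pool over space -> (B, D)


def contrastive_loss(audio_emb, image_emb, margin=1.0):
    """Pull matched audio/image pairs together, push mismatched pairs apart."""
    audio_emb = F.normalize(audio_emb, dim=1)
    image_emb = F.normalize(image_emb, dim=1)
    sim = audio_emb @ image_emb.t()                   # (B, B) similarity matrix
    pos = sim.diag().unsqueeze(1)                     # matched pairs on the diagonal
    mask = torch.eye(sim.size(0), device=sim.device, dtype=torch.bool)
    # hinge loss against impostor images and impostor spoken captions
    loss_impostor_images = F.relu(margin - pos + sim).masked_fill(mask, 0.0).mean()
    loss_impostor_audio = F.relu(margin - pos.t() + sim).masked_fill(mask, 0.0).mean()
    return loss_impostor_images + loss_impostor_audio
```

In a setup like this, the only supervision is the co-occurrence of an image with its spoken description: no transcriptions, word boundaries, or object labels are required, which is what allows words and the visual objects they refer to be discovered jointly from the paired data.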