Speaker-Based Variability in Robotic Spoken Language Grounding

dc.contributor.advisor: Matuszek, Cynthia
dc.contributor.author: Kebe, Gaoussou Youssouf
dc.contributor.department: Computer Science and Electrical Engineering
dc.contributor.program: Computer Science
dc.date.accessioned: 2023-07-07T16:02:16Z
dc.date.available: 2023-07-07T16:02:16Z
dc.date.issued: 2022-01-01
dc.description.abstract: Robots in human spaces need to be able to understand human-provided natural language instructions in the context of their physical environment. Learning to understand grounded language, which connects natural language to percepts, is a critical research area. However, the majority of existing efforts relies on highly curated text and ignores the noise and variance present in end-user speech. Existing speech-based grounded language learning works require an extensive amount of speech data. Additionally, variation in speech characteristics can cause challenges for grounding models, and prior works do not investigate the difference in performance between demographic groups. In this thesis, I train and evaluate language grounding models on collected spoken and textual descriptions of common household objects. I leverage recent work in self-supervised speech representation models to learn groundings without the interference of transcriptions as an intermediate representation. The goal is to eliminate the effects of off-the-shelf speech-to-text models as a potential source of bias. The experimental results suggest that this approach can make language grounding systems more inclusive towards accented speakers and increase general performance.
dc.format: application:pdf
dc.genre: thesis
dc.identifier: doi:10.13016/m272dp-jkbx
dc.identifier.other: 12594
dc.identifier.uri: http://hdl.handle.net/11603/28469
dc.language: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Collection
dc.relation.ispartof: UMBC Theses and Dissertations Collection
dc.relation.ispartof: UMBC Graduate School Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
dc.source: Original File Name: Kebe_umbc_0434M_12594.pdf
dc.subject: Grounded Language Learning
dc.subject: Multimodal Learning
dc.subject: Natural Language Processing
dc.subject: Spoken Language Grounding
dc.title: Speaker-Based Variability in Robotic Spoken Language Grounding
dc.type: Text
dcterms.accessRights: Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.

Files

Original bundle

Name: Kebe_umbc_0434M_12594.pdf
Size: 1.95 MB
Format: Adobe Portable Document Format