Speaker-Based Variability in Robotic Spoken Language Grounding

Kebe, Gaoussou Youssouf

dc.contributor.advisor	Matuszek, Cynthia
dc.contributor.author	Kebe, Gaoussou Youssouf
dc.contributor.department	Computer Science and Electrical Engineering
dc.contributor.program	Computer Science
dc.date.accessioned	2023-07-07T16:02:16Z
dc.date.available	2023-07-07T16:02:16Z
dc.date.issued	2022-01-01
dc.description.abstract	Robots in human spaces need to understand human-provided natural language instructions in the context of their physical environment. Learning to understand grounded language, which connects natural language to percepts, is a critical research area. However, most existing efforts rely on highly curated text and ignore the noise and variance present in end-user speech. Existing speech-based approaches to grounded language learning require an extensive amount of speech data. Additionally, variation in speech characteristics can cause challenges for grounding models, and prior work does not investigate differences in performance between demographic groups. In this thesis, I train and evaluate language grounding models on collected spoken and textual descriptions of common household objects. I leverage recent work in self-supervised speech representation models to learn groundings without using transcriptions as an intermediate representation. The goal is to eliminate off-the-shelf speech-to-text models as a potential source of bias. The experimental results suggest that this approach can make language grounding systems more inclusive towards accented speakers and improve overall performance.
dc.format	application:pdf
dc.genre	thesis
dc.identifier	doi:10.13016/m272dp-jkbx
dc.identifier.other	12594
dc.identifier.uri	http://hdl.handle.net/11603/28469
dc.language	en
dc.relation.isAvailableAt	The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof	UMBC Computer Science and Electrical Engineering Collection
dc.relation.ispartof	UMBC Theses and Dissertations Collection
dc.relation.ispartof	UMBC Graduate School Collection
dc.relation.ispartof	UMBC Student Collection
dc.rights	This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
dc.source	Original File Name: Kebe_umbc_0434M_12594.pdf
dc.subject	Grounded Language Learning
dc.subject	Multimodal Learning
dc.subject	Natural Language Processing
dc.subject	Spoken Language Grounding
dc.title	Speaker-Based Variability in Robotic Spoken Language Grounding
dc.type	Text
dcterms.accessRights	Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.

Files

Original bundle

Name: Kebe_umbc_0434M_12594.pdf
Size: 1.95 MB
Format: Adobe Portable Document Format

Collections

UMBC Theses and Dissertations
UMBC Computer Science and Electrical Engineering Department
UMBC Graduate School
UMBC Student Collection