Presentation and Analysis of a Multimodal Dataset for Grounded Language Learning

Jenkins, Patrick; Sachdeva, Rishabh; Kebe, Gaoussou Youssouf; Higgins, Padraig; Darvish, Kasra; Raff, Edward; Engel, Don; Winder, John; Ferraro, Francisco; Matuszek, Cynthia

dc.contributor.author	Jenkins, Patrick
dc.contributor.author	Sachdeva, Rishabh
dc.contributor.author	Kebe, Gaoussou Youssouf
dc.contributor.author	Higgins, Padraig
dc.contributor.author	Darvish, Kasra
dc.contributor.author	Raff, Edward
dc.contributor.author	Engel, Don
dc.contributor.author	Winder, John
dc.contributor.author	Ferraro, Francisco
dc.contributor.author	Matuszek, Cynthia
dc.date.accessioned	2020-09-11T17:40:39Z
dc.date.available	2020-09-11T17:40:39Z
dc.date.issued	2020-07-09
dc.description.abstract	Grounded language acquisition -- learning how language-based interactions refer to the world around them -- is a major area of research in robotics, NLP, and HCI. In practice the data used for learning consists almost entirely of textual descriptions, which tend to be cleaner, clearer, and more grammatical than actual human interactions. In this work, we present the Grounded Language Dataset (GoLD), a multimodal dataset of common household objects described by people using either spoken or written language. We analyze the differences and present an experiment showing how the different modalities affect language learning from human input. This will enable researchers studying the intersection of robotics, NLP, and HCI to better investigate how the multiple modalities of image, text, and speech interact, as well as how differences in the vernacular of these modalities impact results.	en
dc.description.uri	https://arxiv.org/abs/2007.14987	en
dc.format.extent	11 pages	en
dc.genre	journal articles preprints	en
dc.identifier	doi:10.13016/m24vfu-goza
dc.identifier.citation	Patrick Jenkins, Rishabh Sachdeva, Gaoussou Youssouf Kebe, Padraig Higgins, Kasra Darvish, Edward Raff, Don Engel, John Winder, Francisco Ferraro and Cynthia Matuszek, Presentation and Analysis of a Multimodal Dataset for Grounded Language Learning, https://arxiv.org/abs/2007.14987	en
dc.identifier.uri	http://hdl.handle.net/11603/19644
dc.language.iso	en	en
dc.relation.isAvailableAt	The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof	UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof	UMBC Faculty Collection
dc.relation.ispartof	UMBC Student Collection
dc.relation.ispartof	UMBC Office for the Vice President of Research
dc.rights	This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.title	Presentation and Analysis of a Multimodal Dataset for Grounded Language Learning	en
dc.type	Text	en

Files

Original bundle

Name: 2007.14987.pdf
Size: 11.45 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed to upon submission

Collections

UMBC Computer Science and Electrical Engineering Department
UMBC Faculty Collection
UMBC Office for the Vice President of Research & Creative Achievement (ORCA)
UMBC Student Collection