A Spoken Language Dataset of Descriptions for Speech-Based Grounded Language Learning
dc.contributor.author | Kebe, Gaoussou Youssouf | |
dc.contributor.author | Higgins, Padraig | |
dc.contributor.author | Jenkins, Patrick | |
dc.contributor.author | Darvish, Kasra | |
dc.contributor.author | Sachdeva, Rishabh | |
dc.contributor.author | Barron, Ryan | |
dc.contributor.author | Winder, John | |
dc.contributor.author | Engel, Don | |
dc.contributor.author | Raff, Edward | |
dc.contributor.author | Ferraro, Francis | |
dc.contributor.author | Matuszek, Cynthia | |
dc.date.accessioned | 2021-06-25T22:16:40Z | |
dc.date.available | 2021-06-25T22:16:40Z | |
dc.date.issued | 2021-06-08 | |
dc.description | 35th Conference on Neural Information Processing Systems (NeurIPS 2021) Track on Datasets and Benchmarks | en_US |
dc.description.abstract | Grounded language acquisition is a major area of research combining aspects of natural language processing, computer vision, and signal processing, compounded by domain issues requiring sample efficiency and other deployment constraints. In this work, we present a multimodal dataset of RGB+depth objects with spoken as well as textual descriptions. We analyze the differences between the two types of descriptive language and our experiments demonstrate that the different modalities affect learning. This will enable researchers studying the intersection of robotics, NLP, and HCI to better investigate how the multiple modalities of image, depth, text, speech, and transcription interact, as well as how differences in the vernacular of these modalities impact results. | en_US |
dc.description.sponsorship | This material is based in part upon work supported by the National Science Foundation under Grant Nos. 1940931 and 1637937. Some experiments were conducted on the UMBC High-performance computing facility, funded by the National Science Foundation under Grant Nos. 1940931 and 2024878. This material is also based on research that is in part supported by the Air Force Research Laboratory (AFRL), DARPA, for the KAIROS program under agreement number FA8750-19-2-1003. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either express or implied, of the Air Force Research Laboratory (AFRL), DARPA, or the U.S. Government. | en_US |
dc.description.uri | https://neurips.cc/virtual/2021/poster/22767 | |
dc.format.extent | 11 slides | en_US |
dc.genre | video recordings | |
dc.identifier | doi:10.13016/m2bmhe-tmzc | |
dc.identifier.citation | Kebe, Gaoussou Youssouf et al.; A Spoken Language Dataset of Descriptions for Speech-Based Grounded Language Learning; NeurIPS 2021 Track Datasets and Benchmarks Round1 Submission, 8 June, 2021; https://openreview.net/forum?id=Yx9jT3fkBaD | en_US |
dc.identifier.uri | http://hdl.handle.net/11603/21838 | |
dc.language.iso | en_US | en_US |
dc.relation.isAvailableAt | The University of Maryland, Baltimore County (UMBC) | |
dc.relation.ispartof | UMBC Computer Science and Electrical Engineering Department Collection | |
dc.relation.ispartof | UMBC Student Collection | |
dc.relation.ispartof | UMBC Faculty Collection | |
dc.relation.ispartof | UMBC Office for the Vice President of Research | |
dc.rights | This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author. | |
dc.subject | grounded language acquisition | en_US |
dc.subject | speech processing | en_US |
dc.subject | computer vision | en_US |
dc.subject | natural language processing | en_US |
dc.title | A Spoken Language Dataset of Descriptions for Speech-Based Grounded Language Learning | en_US |
dc.type | Text | en_US |
dc.type | Moving Image | |
Files
License bundle
- Name: license.txt
- Size: 2.56 KB
- Format: Item-specific license agreed upon to submission