Practical Cross-modal Manifold Alignment for Robotic Grounded Language Learning

Nguyen, Andre T.; Richards, Luke E.; Raff, Edward; Kebe, Gaoussou Youssouf; Darvish, Kasra; Ferraro, Frank; Matuszek, Cynthia

dc.contributor.author	Nguyen, Andre T.
dc.contributor.author	Richards, Luke E.
dc.contributor.author	Raff, Edward
dc.contributor.author	Kebe, Gaoussou Youssouf
dc.contributor.author	Darvish, Kasra
dc.contributor.author	Ferraro, Frank
dc.contributor.author	Matuszek, Cynthia
dc.date.accessioned	2021-06-29T19:41:19Z
dc.date.available	2021-06-29T19:41:19Z
dc.date.issued	2021
dc.description	CVPR 2021 workshop	en
dc.description.abstract	We propose a cross-modality manifold alignment procedure that leverages triplet loss to jointly learn consistent, multi-modal embeddings of language-based concepts of real-world items. Our approach learns these embeddings by sampling triples of anchor, positive, and negative data points from RGB-depth images and their natural language descriptions. We show that our approach can benefit from, but does not require, post-processing steps such as Procrustes analysis, in contrast to some of our baselines which require it for reasonable performance. We demonstrate the effectiveness of our approach on two datasets commonly used to develop robotic-based grounded language learning systems, where our approach outperforms four baselines, including a state-of-the-art approach, across five evaluation metrics.	en
dc.description.uri	https://ieeexplore.ieee.org/document/9522916	en
dc.format.extent	10 pages	en
dc.genre	conference papers and proceedings postprints	en
dc.identifier	doi:10.13016/m2zimw-ltag
dc.identifier	https://doi.org/10.1109/CVPRW53098.2021.00177
dc.identifier.citation	A. T. Nguyen et al., "Practical Cross-modal Manifold Alignment for Robotic Grounded Language Learning," 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2021, pp. 1613-1622, doi: 10.1109/CVPRW53098.2021.00177.	en
dc.identifier.uri	http://hdl.handle.net/11603/21840
dc.language.iso	en	en
dc.publisher	IEEE	en
dc.relation.isAvailableAt	The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof	UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof	UMBC Student Collection
dc.relation.ispartof	UMBC Faculty Collection
dc.rights	This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.rights	© 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
dc.title	Practical Cross-modal Manifold Alignment for Robotic Grounded Language Learning	en
dc.type	Text	en
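
The abstract describes learning from triples of anchor, positive, and negative data points with a triplet loss. As a rough illustration of that kind of objective, the sketch below computes a standard margin-based triplet loss on toy vectors. It is not the authors' implementation: the Euclidean distance, the margin of 1.0, the function names, and the 2-D example embeddings are all assumptions chosen for readability.

```python
import math
from typing import Sequence


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,  # hypothetical value, not taken from the paper
) -> float:
    """Standard margin-based triplet loss.

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy cross-modal triple: the anchor stands in for a language embedding,
# the positive and negative for embeddings of RGB-depth images.
language_anchor = [0.0, 0.0]
matching_image = [0.0, 1.0]      # distance 1.0 from the anchor
far_negative = [3.0, 4.0]        # distance 5.0: already well separated
near_negative = [0.0, 1.5]       # distance 1.5: inside the margin

loss_far = triplet_loss(language_anchor, matching_image, far_negative)
loss_near = triplet_loss(language_anchor, matching_image, near_negative)
```

In this toy setup the far negative contributes no loss, while the near negative is penalized because it lies within the margin of the matching image.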

Files

Original bundle

Name: Nguyen_Practical_Cross-Modal_Manifold_Alignment_for_Robotic_Grounded_Language_Learning_CVPRW_2021_paper.pdf
Size: 1.66 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed to upon submission

Collections

UMBC Computer Science and Electrical Engineering Department
UMBC Faculty Collection
UMBC Student Collection