Practical Cross-modal Manifold Alignment for Robotic Grounded Language Learning

dc.contributor.author: Nguyen, Andre T.
dc.contributor.author: Richards, Luke E.
dc.contributor.author: Raff, Edward
dc.contributor.author: Kebe, Gaoussou Youssouf
dc.contributor.author: Darvish, Kasra
dc.contributor.author: Ferraro, Frank
dc.contributor.author: Matuszek, Cynthia
dc.date.accessioned: 2021-06-29T19:41:19Z
dc.date.available: 2021-06-29T19:41:19Z
dc.date.issued: 2021
dc.description: CVPR 2021 workshop
dc.description.abstract: We propose a cross-modality manifold alignment procedure that leverages triplet loss to jointly learn consistent, multi-modal embeddings of language-based concepts of real-world items. Our approach learns these embeddings by sampling triples of anchor, positive, and negative data points from RGB-depth images and their natural language descriptions. We show that our approach can benefit from, but does not require, post-processing steps such as Procrustes analysis, in contrast to some of our baselines which require it for reasonable performance. We demonstrate the effectiveness of our approach on two datasets commonly used to develop robotic-based grounded language learning systems, where our approach outperforms four baselines, including a state-of-the-art approach, across five evaluation metrics.
dc.description.uri: https://ieeexplore.ieee.org/document/9522916
dc.format.extent: 10 pages
dc.genre: conference papers and proceedings postprints
dc.identifier: doi:10.13016/m2zimw-ltag
dc.identifier: https://doi.org/10.1109/CVPRW53098.2021.00177
dc.identifier.citation: A. T. Nguyen et al., "Practical Cross-modal Manifold Alignment for Robotic Grounded Language Learning," 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2021, pp. 1613-1622, doi: 10.1109/CVPRW53098.2021.00177.
dc.identifier.uri: http://hdl.handle.net/11603/21840
dc.language.iso: en_US
dc.publisher: IEEE
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Student Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.rights: © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
dc.title: Practical Cross-modal Manifold Alignment for Robotic Grounded Language Learning
dc.type: Text

Files

Original bundle

Name: Nguyen_Practical_Cross-Modal_Manifold_Alignment_for_Robotic_Grounded_Language_Learning_CVPRW_2021_paper.pdf
Size: 1.66 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed upon to submission