Neural Variational Learning for Grounded Language Acquisition

Date

2021-07-20

Citation of Original Publication

N. Pillai, C. Matuszek and F. Ferraro, "Neural Variational Learning for Grounded Language Acquisition," 2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN), 2021, pp. 633-640, doi: 10.1109/RO-MAN50785.2021.9515374.

Rights

© 2021 IEEE.  Personal use of this material is permitted.  Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Abstract

We propose a learning system in which language is grounded in visual percepts without specific predefined categories of terms. We present a unified generative method to acquire a shared semantic/visual embedding that enables the learning of language about a wide range of real-world objects. We evaluate the efficacy of this learning by predicting the semantics of objects and comparing the performance with neural and non-neural inputs. We show that this generative approach exhibits promising results in language grounding without pre-specifying visual categories in low-resource settings. Our experiments demonstrate that this approach generalizes to multilingual, highly varied datasets.