ESTA ES UNA NARANJA ATRACTIVA: ADVENTURES IN ADAPTING AN ENGLISH LANGUAGE GROUNDING SYSTEM TO NON-ENGLISH DATA

Date

2019-01-01

Department

Computer Science and Electrical Engineering

Program

Computer Science

Rights

Distribution Rights granted to UMBC by the author.
Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.
This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Abstract

In this thesis, I describe a multilingual grounded language learning system adapted from an English-only system. This system learns the meaning of words used in crowd-sourced descriptions by grounding them in the physical representations of the objects they describe. My work compares the system's performance across languages from different perspectives to identify the modifications necessary to attain equal performance, with the goal of enhancing the ability of robots to learn language from a more diverse range of people. I first analyze Spanish using translated English data, and then extend this analysis to a new corpus of crowd-sourced Spanish-language data. I then take the insights gained from this analysis and repeat the experiment using Hindi. I find that, with small modifications, the system is able to learn color, object, and shape words with comparable performance across languages.