Learning from Observations Using a Single Video Demonstration and Human Feedback

Gandhi, Sunil; Oates, Tim; Mohsenin, Tinoosh; Waytowich, Nicholas

dc.contributor.author	Gandhi, Sunil
dc.contributor.author	Oates, Tim
dc.contributor.author	Mohsenin, Tinoosh
dc.contributor.author	Waytowich, Nicholas
dc.date.accessioned	2019-11-21T17:41:27Z
dc.date.available	2019-11-21T17:41:27Z
dc.date.issued	2019-09-29
dc.description.abstract	In this paper, we present a method for learning from video demonstrations by using human feedback to construct a mapping between the standard representation of the agent and the visual representation of the demonstration. In this way, we leverage the advantages of both representations: we learn the policy using standard state representations but specify the expected behavior using a video demonstration. We train an autonomous agent using a single video demonstration and use human feedback, given as numerical similarity ratings, to map the standard representation to the visual representation with a neural network. We show the effectiveness of our method by teaching a hopper agent in MuJoCo to perform a backflip using a single video demonstration generated in MuJoCo as well as a real-world YouTube video of a person performing a backflip. Additionally, we show that our method can transfer to new tasks, such as hopping, with very little human feedback.	en
dc.description.uri	https://arxiv.org/abs/1909.13392	en
dc.format.extent	8 pages	en
dc.genre	journal articles preprints	en
dc.identifier	doi:10.13016/m2hgsf-uvpt
dc.identifier.citation	Gandhi, Sunil; Oates, Tim; Mohsenin, Tinoosh; Waytowich, Nicholas; Learning from Observations Using a Single Video Demonstration and Human Feedback; Machine Learning (2019); https://arxiv.org/abs/1909.13392	en
dc.identifier.uri	http://hdl.handle.net/11603/16492
dc.language.iso	en	en
dc.relation.isAvailableAt	The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof	UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof	UMBC Student Collection
dc.relation.ispartof	UMBC Faculty Collection
dc.rights	Public Domain Mark 1.0	*
dc.rights	This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.rights	This work was written as part of one of the author's official duties as an Employee of the United States Government and is therefore a work of the United States Government. In accordance with 17 U.S.C. 105, no copyright protection is available for such works under U.S. Law.
dc.rights.uri	http://creativecommons.org/publicdomain/mark/1.0/	*
dc.subject	video demonstrations	en
dc.subject	human feedback	en
dc.subject	visual representation	en
dc.subject	standard representation	en
dc.subject	numerical similarity rating	en
dc.subject	neural network	en
dc.title	Learning from Observations Using a Single Video Demonstration and Human Feedback	en
dc.type	Text	en

Files

Original bundle

Name: 1909.13392.pdf
Size: 603.89 KB
Format: Adobe Portable Document Format
