Retrospective on the 2021 MineRL BASALT Competition on Learning from Human Feedback

dc.contributor.author: Shah, Rohin
dc.contributor.author: Wang, Steven H.
dc.contributor.author: Wild, Cody
dc.contributor.author: Milani, Stephanie
dc.contributor.author: Kanervisto, Anssi
dc.contributor.author: Goecks, Vinicius G.
dc.contributor.author: Waytowich, Nicholas
dc.contributor.author: Watkins-Valls, David
dc.contributor.author: Prakash, Bharat
dc.contributor.author: Mills, Edmund
dc.contributor.author: Garg, Divyansh
dc.contributor.author: Fries, Alexander
dc.contributor.author: Souly, Alexandra
dc.contributor.author: Chan, Jun Shern
dc.contributor.author: Castillo, Daniel del
dc.contributor.author: Lieberum, Tom
dc.date.accessioned: 2022-08-02T21:12:29Z
dc.date.available: 2022-08-02T21:12:29Z
dc.date.issued: 2022-07
dc.description.abstract: We held the first-ever MineRL Benchmark for Agents that Solve Almost-Lifelike Tasks (MineRL BASALT) Competition at the Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021). The goal of the competition was to promote research towards agents that use learning from human feedback (LfHF) techniques to solve open-world tasks. Rather than mandating the use of LfHF techniques, we described four tasks in natural language to be accomplished in the video game Minecraft, and allowed participants to use any approach they wanted to build agents that could accomplish the tasks. Teams developed a diverse range of LfHF algorithms across a variety of possible human feedback types. The three winning teams implemented significantly different approaches while achieving similar performance. Interestingly, their approaches performed well on different tasks, validating our choice of tasks to include in the competition. While the outcomes validated the design of our competition, we did not get as many participants and submissions as our sister competition, MineRL Diamond. We speculate about the causes of this problem and suggest improvements for future iterations of the competition. [en_US]
dc.description.uri: https://proceedings.mlr.press/v176/shah22a.html [en_US]
dc.format.extent: 14 pages [en_US]
dc.genre: conference papers and proceedings [en_US]
dc.identifier: doi:10.13016/m2gbci-wyhb
dc.identifier.citation: Shah, R., Wang, S.H., Wild, C., Milani, S., Kanervisto, A., Goecks, V.G., Waytowich, N., Watkins-Valls, D., Prakash, B., Mills, E., Garg, D., Fries, A., Souly, A., Chan, J.S., del Castillo, D. & Lieberum, T. (2022). Retrospective on the 2021 MineRL BASALT Competition on Learning from Human Feedback. Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, in Proceedings of Machine Learning Research 176:259-272. Available from https://proceedings.mlr.press/v176/shah22a.html. [en_US]
dc.identifier.uri: http://hdl.handle.net/11603/25281
dc.language.iso: en_US [en_US]
dc.publisher: Proceedings of Machine Learning Research [en_US]
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This work was written as part of one of the author's official duties as an Employee of the United States Government and is therefore a work of the United States Government. In accordance with 17 U.S.C. 105, no copyright protection is available for such works under U.S. Law. [en_US]
dc.rights: Public Domain Mark 1.0 [*]
dc.rights.uri: http://creativecommons.org/publicdomain/mark/1.0/ [*]
dc.title: Retrospective on the 2021 MineRL BASALT Competition on Learning from Human Feedback [en_US]
dc.type: Text [en_US]

Files

Original bundle
Name: shah22a.pdf
Size: 1.37 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed upon to submission