Predicting Motivations of Actions by Leveraging Text
dc.contributor.author | Vondrick, Carl | |
dc.contributor.author | Oktay, Deniz | |
dc.contributor.author | Pirsiavash, Hamed | |
dc.contributor.author | Torralba, Antonio | |
dc.date.accessioned | 2019-07-01T18:17:29Z | |
dc.date.available | 2019-07-01T18:17:29Z | |
dc.date.issued | 2016-12-12 | |
dc.description.abstract | Understanding human actions is a key problem in computer vision. However, recognizing actions is only the first step of understanding what a person is doing. In this paper, we introduce the problem of predicting why a person has performed an action in images. This problem has many applications in human activity understanding, such as anticipating or explaining an action. To study this problem, we introduce a new dataset of people performing actions annotated with likely motivations. However, the information in an image alone may not be sufficient to automatically solve this task. Since humans can rely on their lifetime of experiences to infer motivation, we propose to give computer vision systems access to some of these experiences by using recently developed natural language models to mine knowledge stored in massive amounts of text. While we are still far away from fully understanding motivation, our results suggest that transferring knowledge from language into vision can help machines understand why people in images might be performing an action. | en_US |
dc.description.sponsorship | This work was supported by NSF grant IIS-1524817, a Google faculty research award to AT, and a Google PhD fellowship to CV. | en_US |
dc.description.uri | https://ieeexplore.ieee.org/document/7780696 | en_US |
dc.format.extent | 9 pages | en_US |
dc.genre | conference papers and proceedings preprints | en_US |
dc.identifier | doi:10.13016/m2ikgr-iyuv | |
dc.identifier.citation | Carl Vondrick et al., Predicting Motivations of Actions by Leveraging Text, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), DOI: 10.1109/CVPR.2016.327 | en_US |
dc.identifier.uri | https://doi.org/10.1109/CVPR.2016.327 | |
dc.identifier.uri | http://hdl.handle.net/11603/14329 | |
dc.language.iso | en_US | en_US |
dc.publisher | IEEE | en_US |
dc.relation.isAvailableAt | The University of Maryland, Baltimore County (UMBC) | |
dc.relation.ispartof | UMBC Computer Science and Electrical Engineering Department Collection | |
dc.relation.ispartof | UMBC Faculty Collection | |
dc.rights | This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author. | |
dc.rights | © 2016 IEEE | |
dc.subject | computer vision | en_US |
dc.subject | data mining | en_US |
dc.subject | image annotation | en_US |
dc.subject | image recognition | en_US |
dc.subject | natural language processing | en_US |
dc.subject | action motivation prediction | en_US |
dc.subject | knowledge mining | en_US |
dc.subject | human activity understanding | en_US |
dc.title | Predicting Motivations of Actions by Leveraging Text | en_US |
dc.type | Text | en_US |