Predicting Motivations of Actions by Leveraging Text

dc.contributor.author: Vondrick, Carl
dc.contributor.author: Oktay, Deniz
dc.contributor.author: Pirsiavash, Hamed
dc.contributor.author: Torralba, Antonio
dc.date.accessioned: 2019-07-01T18:17:29Z
dc.date.available: 2019-07-01T18:17:29Z
dc.date.issued: 2016-12-12
dc.description.abstract: Understanding human actions is a key problem in computer vision. However, recognizing actions is only the first step of understanding what a person is doing. In this paper, we introduce the problem of predicting why a person has performed an action in images. This problem has many applications in human activity understanding, such as anticipating or explaining an action. To study this problem, we introduce a new dataset of people performing actions annotated with likely motivations. However, the information in an image alone may not be sufficient to automatically solve this task. Since humans can rely on their lifetime of experiences to infer motivation, we propose to give computer vision systems access to some of these experiences by using recently developed natural language models to mine knowledge stored in massive amounts of text. While we are still far away from fully understanding motivation, our results suggest that transferring knowledge from language into vision can help machines understand why people in images might be performing an action. [en_US]
dc.description.sponsorship: This work was supported by NSF grant IIS-1524817, and by a Google faculty research award to AT, and a Google PhD fellowship to CV. [en_US]
dc.description.uri: https://ieeexplore.ieee.org/document/7780696 [en_US]
dc.format.extent: 9 pages [en_US]
dc.genre: conference papers and proceedings preprints [en_US]
dc.identifier: doi:10.13016/m2ikgr-iyuv
dc.identifier.citation: Carl Vondrick et al., Predicting Motivations of Actions by Leveraging Text, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), DOI: 10.1109/CVPR.2016.327 [en_US]
dc.identifier.uri: https://doi.org/10.1109/CVPR.2016.327
dc.identifier.uri: http://hdl.handle.net/11603/14329
dc.language.iso: en_US [en_US]
dc.publisher: IEEE [en_US]
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.rights: © 2016 IEEE
dc.subject: computer vision [en_US]
dc.subject: data mining [en_US]
dc.subject: image annotation [en_US]
dc.subject: image recognition [en_US]
dc.subject: natural language processing [en_US]
dc.subject: action motivation prediction [en_US]
dc.subject: knowledge mining [en_US]
dc.subject: human activity understanding [en_US]
dc.title: Predicting Motivations of Actions by Leveraging Text [en_US]
dc.type: Text [en_US]

Files

Original bundle
Name: 1406.5472.pdf
Size: 986.78 KB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed upon to submission