Parsing videos of actions with segmental grammars
dc.contributor.author | Pirsiavash, Hamed | |
dc.contributor.author | Ramanan, Deva | |
dc.date.accessioned | 2019-06-28T16:41:53Z | |
dc.date.available | 2019-06-28T16:41:53Z | |
dc.date.issued | 2014-06-28 | |
dc.description.abstract | Real-world videos of human activities exhibit temporal structure at various scales; long videos are typically composed out of multiple action instances, where each instance is itself composed of sub-actions with variable durations and orderings. Temporal grammars can presumably model such hierarchical structure, but are computationally difficult to apply for long video streams. We describe simple grammars that capture hierarchical temporal structure while admitting inference with a finite-state machine. This makes parsing linear time, constant storage, and naturally online. We train grammar parameters using a latent structural SVM, where latent subactions are learned automatically. We illustrate the effectiveness of our approach over common baselines on a new half-million frame dataset of continuous YouTube videos. | en_US |
dc.description.sponsorship | Funding for this research was provided by NSF Grant 0954083, ONR-MURI Grant N00014-10-1-0933, and the Intel Science and Technology Center - Visual Computing. | en_US |
dc.description.uri | https://ieeexplore.ieee.org/document/6909479 | en_US |
dc.format.extent | 8 pages | en_US |
dc.genre | conference papers and proceedings preprints | en_US |
dc.identifier | doi:10.13016/m2b0nz-gddn | |
dc.identifier.citation | Hamed Pirsiavash, Deva Ramanan, Parsing videos of actions with segmental grammars, 2014 IEEE Conference on Computer Vision and Pattern Recognition, DOI: 10.1109/CVPR.2014.85 | en_US |
dc.identifier.uri | https://doi.org/10.1109/CVPR.2014.85 | |
dc.identifier.uri | http://hdl.handle.net/11603/14318 | |
dc.language.iso | en_US | en_US |
dc.publisher | IEEE | en_US |
dc.relation.isAvailableAt | The University of Maryland, Baltimore County (UMBC) | |
dc.relation.ispartof | UMBC Computer Science and Electrical Engineering Department Collection | |
dc.relation.ispartof | UMBC Faculty Collection | |
dc.rights | This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author. | |
dc.rights | © 2014 IEEE | |
dc.subject | Grammar | en_US |
dc.subject | Videos | en_US |
dc.subject | Hidden Markov models | en_US |
dc.subject | Data models | en_US |
dc.subject | Presses | en_US |
dc.subject | Markov processes | en_US |
dc.subject | finite state machines | en_US |
dc.subject | support vector machines | en_US |
dc.subject | image segmentation | en_US |
dc.subject | latent subactions | en_US |
dc.subject | latent structural SVM | en_US |
dc.title | Parsing videos of actions with segmental grammars | en_US |
dc.type | Text | en_US |