Video Summarization using Unsupervised Methods

Author/Creator

Author/Creator ORCID

Date

2018-01-01

Department

Computer Science and Electrical Engineering

Program

Computer Science

Citation of Original Publication

Rights

Distribution Rights granted to UMBC by the author.
Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending the author/copyright holder's permission.
This item is likely protected under Title 17 of the U.S. Copyright Law. Unless it is under a Creative Commons license, for uses protected by copyright law, contact the copyright holder or the author.

Abstract

Due to the increasing volume of video data uploaded daily to the web through sources such as social media, YouTube, and video-sharing websites, video summarization has emerged as an important and challenging problem in industry. Video summarization has applications in various domains, such as the consumer and marketing industries, generating trailers for movies, and producing highlights for sports events. As a result, an efficient mechanism for extracting important video content is needed to deal with large video repositories. We present a novel unsupervised approach that generates video summaries using simpler networks such as VGG and ResNet instead of complex networks such as LSTMs and RNNs. Although video summarization and image captioning are two completely different and independent tasks, we propose an approach that generates summaries using the feature space produced by captioning the frames of a video. Our main idea is to generate short and informative summaries in a completely unsupervised manner using a basic, traditional clustering technique modeled jointly with the video captioning framework NeuralTalk2. We conducted experiments in different settings on the SumMe and TVSum datasets. Our approach achieved state-of-the-art results on the SumMe dataset with an F-score of 35.6.
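
To make the pipeline the abstract outlines concrete, below is a minimal sketch of unsupervised keyframe selection: sampled frames are embedded with a pretrained CNN and the embeddings are clustered, keeping the frame nearest each cluster centroid as a summary frame. This is an illustration only, not the thesis implementation: it uses raw ResNet-50 features rather than the NeuralTalk2 captioning feature space, and the sampling rate and cluster count are arbitrary assumptions.

# Illustrative sketch only (not the thesis code): cluster CNN frame
# features and keep one representative frame per cluster.
import cv2                      # pip install opencv-python
import numpy as np
import torch
from sklearn.cluster import KMeans
from torchvision import models, transforms

def frame_features(video_path, every_n=30):
    """Sample every n-th frame and embed it with a pretrained ResNet-50."""
    model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    model.fc = torch.nn.Identity()          # keep the 2048-d pooled features
    model.eval()
    prep = transforms.Compose([
        transforms.ToPILImage(),
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])
    cap, feats, idxs, i = cv2.VideoCapture(video_path), [], [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i % every_n == 0:
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            with torch.no_grad():
                feats.append(model(prep(rgb).unsqueeze(0)).squeeze(0).numpy())
            idxs.append(i)
        i += 1
    cap.release()
    return np.stack(feats), np.array(idxs)

def keyframes(video_path, k=10):
    """Cluster frame features; return the frame index nearest each centroid."""
    feats, idxs = frame_features(video_path)
    km = KMeans(n_clusters=k, n_init=10).fit(feats)
    reps = []
    for c in range(k):
        members = np.where(km.labels_ == c)[0]
        d = np.linalg.norm(feats[members] - km.cluster_centers_[c], axis=1)
        reps.append(idxs[members[d.argmin()]])
    return sorted(reps)

if __name__ == "__main__":
    # "input.mp4" and k=5 are placeholder choices for demonstration.
    print(keyframes("input.mp4", k=5))

In the thesis's setting, the feature matrix handed to the clustering step would instead come from the captioning model's representation of each frame; the clustering-and-nearest-centroid selection shown here stays the same.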