TY - JOUR
T1 - A novel video summarization based on mining the story-structure and semantic relations among concept entities
AU - Chen, Bo-Wei
AU - Wang, Jia-Ching
AU - Wang, Jhing-Fa
N1 - Funding Information:
Manuscript received April 01, 2008; revised October 07, 2008. Current version published January 16, 2009. This work was supported in part by the National Science Council of the Republic of China under Grant NSC97-2218-E-006-012. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Jiebo Luo.
PY - 2009/2
Y1 - 2009/2
N2 - Video summarization techniques have been proposed for years to offer people a comprehensive understanding of the whole story in a video. Roughly speaking, existing approaches can be classified into two types: one is the static storyboard, and the other is dynamic skimming. However, although these traditional methods give users brief summaries, they still do not provide a concept-organized and systematic view. In this paper, we present a structural video content browsing system and a novel summarization method that utilize four kinds of entities: who, what, where, and when, to establish the framework of the video contents. With the assistance of this indexed information, the structure of the story can be built up according to the characters, the things, the places, and the time. Therefore, users can not only browse the video efficiently but also focus on what interests them via the browsing interface. To construct the fundamental system, we employ the maximum entropy criterion to integrate visual and text features extracted from video frames and speech transcripts, generating high-level concept entities. A novel concept expansion method is introduced to explore the associations among these entities. After constructing the relational graph, we exploit a graph entropy model to detect meaningful shots and relations, which serve as indices for users. The results demonstrate that our system achieves better performance and information coverage.
AB - Video summarization techniques have been proposed for years to offer people a comprehensive understanding of the whole story in a video. Roughly speaking, existing approaches can be classified into two types: one is the static storyboard, and the other is dynamic skimming. However, although these traditional methods give users brief summaries, they still do not provide a concept-organized and systematic view. In this paper, we present a structural video content browsing system and a novel summarization method that utilize four kinds of entities: who, what, where, and when, to establish the framework of the video contents. With the assistance of this indexed information, the structure of the story can be built up according to the characters, the things, the places, and the time. Therefore, users can not only browse the video efficiently but also focus on what interests them via the browsing interface. To construct the fundamental system, we employ the maximum entropy criterion to integrate visual and text features extracted from video frames and speech transcripts, generating high-level concept entities. A novel concept expansion method is introduced to explore the associations among these entities. After constructing the relational graph, we exploit a graph entropy model to detect meaningful shots and relations, which serve as indices for users. The results demonstrate that our system achieves better performance and information coverage.
KW - Concept expansion tree
KW - Graph entropy
KW - Graph mining
KW - Structural video contents
KW - Video browsing
KW - Video indexing
KW - Video summarization
UR - http://www.scopus.com/inward/record.url?scp=59049098927&partnerID=8YFLogxK
U2 - 10.1109/TMM.2008.2009703
DO - 10.1109/TMM.2008.2009703
M3 - Journal article
AN - SCOPUS:59049098927
SN - 1520-9210
VL - 11
SP - 295
EP - 312
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
IS - 2
M1 - 4757424
ER -