Video summarization based on face recognition and speaker verification

Yuan Shan Lee, Chia Yung Hsu, Po Chuan Lin, Chia Yen Chen, Jia Ching Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Scopus citations

Abstract

In this paper, we propose a video summarization system based on face recognition and speaker verification. The system first performs face recognition. The AdaBoost approach is adopted to locate the image regions that contain human faces. Non-negative matrix factorization (NMF) then decomposes the face regions into basis vectors and corresponding coefficients, and the coefficients serve as features for classification by a support vector machine (SVM). In parallel, the audio track is used for speaker verification via a GMM-SVM approach. Finally, the video is summarized according to the face recognition and speaker verification results. Because it considers both the audio and the image content, the proposed system is expected to perform better than traditional video summarization systems.
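
The NMF feature-extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which NMF algorithm the authors use, so this example assumes the standard multiplicative update rules for the Frobenius-norm objective. The function names (`nmf`, `frobenius_error`) and the toy 4-pixel "faces" are hypothetical and chosen only for illustration.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (pixels x faces) as W @ H.

    W (pixels x rank) holds the basis, H (rank x faces) the coefficients.
    Assumption: multiplicative updates; the paper may use another solver.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

def frobenius_error(v, w, h):
    """Frobenius norm of the reconstruction residual V - W @ H."""
    approx = matmul(w, h)
    return sum((a - b) ** 2
               for ra, rb in zip(v, approx) for a, b in zip(ra, rb)) ** 0.5

# Toy data (hypothetical): each column is one flattened 4-pixel face patch.
V = [[1.0, 0.0, 1.0, 0.5],
     [0.0, 1.0, 1.0, 0.5],
     [1.0, 0.0, 1.0, 0.5],
     [0.0, 1.0, 1.0, 0.5]]

W, H = nmf(V, rank=2)
# One coefficient vector per face; these would be the classifier inputs.
features = transpose(H)
```

Each row of `features` is the coefficient vector of one face region. In the pipeline the abstract describes, such vectors would be passed to an SVM classifier, which is omitted here.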

Original language: English
Title of host publication: Proceedings of the 2015 10th IEEE Conference on Industrial Electronics and Applications, ICIEA 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1821-1824
Number of pages: 4
ISBN (Electronic): 9781467373173
State: Published - 20 Nov 2015
Event: 10th IEEE Conference on Industrial Electronics and Applications, ICIEA 2015 - Auckland, New Zealand
Duration: 15 Jun 2015 to 17 Jun 2015

Publication series

Name: Proceedings of the 2015 10th IEEE Conference on Industrial Electronics and Applications, ICIEA 2015

Conference

Conference: 10th IEEE Conference on Industrial Electronics and Applications, ICIEA 2015
Country/Territory: New Zealand
City: Auckland
Period: 15/06/15 to 17/06/15

Keywords

  • face detection
  • face recognition
  • NMF
  • speaker verification
  • SVM
  • video summarization
