Probabilistic latent prosody analysis for robust speaker verification

Zi He Chen, Zhi Ren Zeng, Yuan Fu Liao, Yau Tarng Juang

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

4 Citations (Scopus)

Abstract

This investigation proposes two probabilistic latent semantic analysis (PLSA)-based approaches for speaker verification systems. Both aim to reduce the number of parameters required by prosodic speaker models, so that (1) speakers' bi-gram models can be estimated reliably and (2) less training and test data are required. The basic concept is to (1) adopt PLSA to smooth the underlying n-gram-based prosodic speaker models and (2) use PLSA to find a compact latent prosody space that efficiently represents the constellation of speakers. The proposed approaches are evaluated on the standard single-speaker detection task of the 2001 NIST Speaker Recognition Evaluation Corpus, where only a single 2-minute enrollment speech sample for training and a 30 s test speech sample are available on average. Experimental results demonstrate that the proposed approaches reduce the required number of bi-gram parameters from 112 to 88 and 63 per speaker and improve the equal error rates (EERs) of MAP-GMM and GMM+T-norm from 12.4% and 9.5% to 10.4% and 8.4%, respectively, and finally to 8.1% after fusing all systems.
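The smoothing idea in the abstract can be illustrated with a minimal PLSA sketch. It is not the authors' implementation: it assumes the standard PLSA analogy in which each speaker plays the role of a "document" and each prosodic bi-gram type the role of a "word", and the function names (`plsa`, `smoothed_model`), the toy count matrix, and the number of latent factors are all hypothetical choices made for illustration.

```python
import random

def normalize(row):
    """Scale a list of positive numbers so that it sums to 1."""
    total = sum(row)
    return [x / total for x in row]

def plsa(counts, n_latent, n_iter=50, seed=0):
    """Fit PLSA by EM on a speakers x bi-gram-types count matrix.

    Returns P(z|speaker) as an S x Z list and P(bi-gram|z) as a Z x W list.
    """
    rng = random.Random(seed)
    n_spk, n_types = len(counts), len(counts[0])
    # Random positive initialisation of both factor matrices.
    p_z_s = [normalize([rng.random() + 0.1 for _ in range(n_latent)])
             for _ in range(n_spk)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_types)])
             for _ in range(n_latent)]
    for _ in range(n_iter):
        # Accumulators start at a tiny floor so no probability becomes zero.
        new_zs = [[1e-12] * n_latent for _ in range(n_spk)]
        new_wz = [[1e-12] * n_types for _ in range(n_latent)]
        for s in range(n_spk):
            for w in range(n_types):
                if counts[s][w] == 0:
                    continue
                # E-step: posterior over latent factors for this (s, w) pair.
                joint = [p_z_s[s][z] * p_w_z[z][w] for z in range(n_latent)]
                total = sum(joint)
                for z in range(n_latent):
                    # M-step accumulation: expected count n(s, w) * P(z|s, w).
                    resp = counts[s][w] * joint[z] / total
                    new_zs[s][z] += resp
                    new_wz[z][w] += resp
        p_z_s = [normalize(r) for r in new_zs]
        p_w_z = [normalize(r) for r in new_wz]
    return p_z_s, p_w_z

def smoothed_model(p_z_s, p_w_z, s):
    """Smoothed distribution P(w|s) = sum_z P(z|s) * P(w|z) for speaker s."""
    n_latent, n_types = len(p_w_z), len(p_w_z[0])
    return [sum(p_z_s[s][z] * p_w_z[z][w] for z in range(n_latent))
            for w in range(n_types)]

# Toy data (assumed): 4 speakers, 6 bi-gram types, sparse raw counts.
counts = [
    [5, 0, 2, 0, 1, 0],
    [4, 1, 3, 0, 0, 0],
    [0, 0, 1, 6, 0, 3],
    [0, 2, 0, 5, 1, 4],
]
p_z_s, p_w_z = plsa(counts, n_latent=2)
model_0 = smoothed_model(p_z_s, p_w_z, 0)
```

In this sketch each speaker is described by `n_latent` mixture weights instead of a full table of bi-gram probabilities, which is one way to read the abstract's "compact latent prosody space"; the smoothed model also assigns nonzero probability to bi-gram types that a speaker's short enrollment data never contained.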

Original language: English
Title of host publication: 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings
Pages: I105-I108
Publication status: Published - 2006
Event: 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006 - Toulouse, France
Duration: 14 May 2006 to 19 May 2006

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 1
ISSN (Print): 1520-6149

Conference

Conference: 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006
Country/Territory: France
City: Toulouse
Period: 14/05/06 to 19/05/06
