Neural Network Training Combining Environment Clustering and Mixture of Experts for Robust Speech Recognition

Chia Yung Hsu, Jia Ching Wang, Yu Tsao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-reviewed

Abstract

Recently, automatic speech recognition (ASR) using neural network (NN)-based acoustic models (AMs) has achieved significant improvements. However, mismatch between training and testing conditions (including speaker and speaking environment) still limits the applicability of ASR. This paper proposes a novel approach that combines the environment clustering (EC) and mixture of experts (MOE) algorithms, termed EC-MOE, to enhance the robustness of ASR against such mismatches. In the offline phase, we split the entire training set into several subsets, each characterizing a specific speaker and speaking environment, and use each subset to train an NN-based AM. In the online phase, we use a Gaussian mixture model (GMM) gate to determine the optimal output from the multiple NN-based AMs and produce the final recognition results. We evaluated the proposed EC-MOE approach on the Aurora 2 continuous digit speech recognition task. Compared with the baseline system, where only a single NN-based AM is used for recognition, the proposed approach achieves a relative word error rate (WER) reduction of 5.9% (from 5.25% to 4.94%).
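The online phase described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the GMM gate scores or combines the experts, so the version below assumes one diagonal-covariance GMM per environment cluster and a hard selection of the single best-matching expert; the names `DiagGMM`, `select_expert`, and `recognize` are hypothetical, and the experts are placeholder callables standing in for trained NN-based AMs.

```python
import math
from typing import Callable, List, Sequence

Frame = Sequence[float]


class DiagGMM:
    """Diagonal-covariance GMM for one environment cluster (hypothetical helper)."""

    def __init__(self, weights: List[float], means: List[List[float]],
                 variances: List[List[float]]) -> None:
        self.weights = weights
        self.means = means
        self.variances = variances

    def log_likelihood(self, frame: Frame) -> float:
        # log sum_k w_k * N(frame; mean_k, diag(var_k)), computed with log-sum-exp.
        comps = []
        for w, mu, var in zip(self.weights, self.means, self.variances):
            lp = math.log(w)
            for x, m, v in zip(frame, mu, var):
                lp += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
            comps.append(lp)
        top = max(comps)
        return top + math.log(sum(math.exp(c - top) for c in comps))


def select_expert(frames: Sequence[Frame], gates: Sequence[DiagGMM]) -> int:
    """Return the index of the cluster whose GMM best explains the utterance.

    Assumption: hard selection by average per-frame log-likelihood.
    """
    scores = [sum(g.log_likelihood(f) for f in frames) / len(frames) for g in gates]
    return max(range(len(gates)), key=scores.__getitem__)


def recognize(frames: Sequence[Frame], gates: Sequence[DiagGMM],
              experts: Sequence[Callable[[Sequence[Frame]], str]]) -> str:
    """Route the utterance to the expert chosen by the gate."""
    return experts[select_expert(frames, gates)](frames)


# Toy setup: two clusters with made-up parameters and placeholder experts.
gates = [
    DiagGMM([1.0], [[0.0, 0.0]], [[1.0, 1.0]]),
    DiagGMM([0.5, 0.5], [[5.0, 5.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]]),
]
experts = [lambda frames: "expert-0", lambda frames: "expert-1"]
utterance = [[5.1, 4.9], [5.5, 5.2]]
chosen = select_expert(utterance, gates)
result = recognize(utterance, gates, experts)
```

In this toy setup the utterance frames lie close to the second cluster's means, so the gate routes the utterance to the second expert. A soft gate that weights all expert outputs by their cluster posteriors would be an equally plausible reading of the abstract.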

Translated title of the contribution: Neural network training combining environment clustering and mixture of experts for robust speech recognition
Original language: Traditional Chinese
Title of host publication: Proceedings of the 27th Conference on Computational Linguistics and Speech Processing, ROCLING 2015
Editors: Sin-Horng Chen, Hsin-Min Wang, Jen-Tzung Chien, Hung-Yu Kao, Wen-Whei Chang, Yih-Ru Wang, Shih-Hung Wu
Publisher: The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Pages: 136-147
Number of pages: 12
ISBN (Electronic): 9789573079286
Publication status: Published - 1 Oct 2015
Event: 27th Conference on Computational Linguistics and Speech Processing, ROCLING 2015 - Hsinchu, Taiwan
Duration: 1 Oct 2015 to 2 Oct 2015

Publication series

Name: Proceedings of the 27th Conference on Computational Linguistics and Speech Processing, ROCLING 2015
2015-January

Conference

Conference: 27th Conference on Computational Linguistics and Speech Processing, ROCLING 2015
Country/Territory: Taiwan
City: Hsinchu
Period: 1/10/15 to 2/10/15

Keywords

  • Environment clustering
  • Mixture of experts
  • Neural network
  • Robust speech recognition
