Incorporating local environment information with ensemble neural networks to robust automatic speech recognition

Chia Yung Hsu, Ryandhimas E. Zezario, Jia Ching Wang, Chin Wen Ho, Xugang Lu, Yu Tsao

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

This paper proposes an ensemble neural network (ENN) framework for robust automatic speech recognition (ASR). The proposed ENN framework can be divided into offline and online phases. In the offline phase, the ENN framework first applies an environment clustering technique to partition the training data into several subsets, where each subset characterizes specific local information of the entire acoustic space. Next, each subset of training data is used to train an NN acoustic model. Finally, the entire set of training data is used to estimate a gating function, which determines the most suitable NN acoustic model for a given input utterance. In the online phase, given a test utterance, the gating function selects the NN acoustic model used to perform speech recognition. Because local environment information is incorporated, ENN can effectively determine the NN acoustic model that best matches the testing condition. The proposed framework was evaluated on the Aurora-2 task. Experimental results show that the proposed ENN framework provides a relative word error rate reduction of 5.35% (from 5.05% to 4.78%) compared with the baseline.
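
The two-phase flow described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the abstract does not say which environment clustering technique or gating function is used, so plain k-means and a nearest-centroid rule stand in for them, and a tiny nearest-mean classifier (`NearestMeanModel`, a hypothetical name) stands in for each NN acoustic model. The function names `train_enn` and `recognize` are likewise hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def nearest(centroids, point):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda i: dist(centroids[i], point))


def kmeans(points, k, iters=20):
    """Plain k-means (assumed stand-in for the environment clustering step).

    Initial centroids are evenly spaced picks from the data, a deterministic
    simplification chosen for this sketch.
    """
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


class NearestMeanModel:
    """Hypothetical stand-in for one NN acoustic model."""

    def fit(self, feats, labels):
        by_label = {}
        for f, y in zip(feats, labels):
            by_label.setdefault(y, []).append(f)
        self.means = {y: mean(fs) for y, fs in by_label.items()}
        return self

    def predict(self, feat):
        return min(self.means, key=lambda y: dist(self.means[y], feat))


def train_enn(env_vecs, feats, labels, k):
    """Offline phase: cluster environments, then train one model per subset."""
    centroids = kmeans(env_vecs, k)
    models = []
    for c in range(k):
        idx = [i for i, e in enumerate(env_vecs) if nearest(centroids, e) == c]
        if not idx:  # empty subset: fall back to all training data
            idx = list(range(len(env_vecs)))
        models.append(NearestMeanModel().fit([feats[i] for i in idx],
                                             [labels[i] for i in idx]))
    return centroids, models


def recognize(centroids, models, env_vec, feat):
    """Online phase: the gate picks one model, which then decodes the input."""
    return models[nearest(centroids, env_vec)].predict(feat)


# Toy data: two environments whose feature-to-label mapping differs.
env_vecs = [[0.0], [0.1], [10.0], [10.1]]
feats = [[0.0], [1.0], [5.0], [6.0]]
labels = ["a", "b", "a", "b"]
centroids, models = train_enn(env_vecs, feats, labels, k=2)
print(recognize(centroids, models, [0.05], [0.9]))  # prints "b"
print(recognize(centroids, models, [9.9], [5.2]))   # prints "a"
```

In the toy data, a single model trained on all four examples would place the class means at 2.5 and 3.5 and so mislabel the first query as "a"; routing by environment avoids that, which is the intuition behind selecting a locally matched model.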

Original language: English
Title of host publication: Proceedings of 2016 10th International Symposium on Chinese Spoken Language Processing, ISCSLP 2016
Editors: Hsin-Min Wang, Qingzhi Hou, Yuan Wei, Tan Lee, Jianguo Wei, Lei Xie, Hui Feng, Jianwu Dang
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781509042937
Publication status: Published - 2 May 2017
Event: 10th International Symposium on Chinese Spoken Language Processing, ISCSLP 2016 - Tianjin, China
Duration: 17 Oct 2016 to 20 Oct 2016

Publication series

Name: Proceedings of 2016 10th International Symposium on Chinese Spoken Language Processing, ISCSLP 2016

Conference

Conference: 10th International Symposium on Chinese Spoken Language Processing, ISCSLP 2016
Country/Territory: China
City: Tianjin
Period: 17/10/16 to 20/10/16
