Enhancing Automatic Speech Recognition Performance Through Multi-Speaker Text-to-Speech

Po Kai Chen, Hsin Min Wang, Bing Jhih Huang, Chi Tao Chen, Jia Ching Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-reviewed

Abstract

In this study, we present a novel approach to enhancing the performance of our Hakka Automatic Speech Recognition (ASR) model through the strategic use of Text-to-Speech (TTS) amplification techniques. Our investigation explores the integration of diverse speakers to expand our training dataset, leading to a notable reduction in Character Error Rate (CER) of approximately 0.2 on the validation set and approximately 3.96 on the test set. These results affirm the effectiveness of multi-speaker TTS strategies in generating ASR data, ultimately bolstering the resilience and precision of our ASR system.

Original language: English
Title of host publication: ROCLING 2023 - Proceedings of the 35th Conference on Computational Linguistics and Speech Processing
Editors: Jheng-Long Wu, Ming-Hsiang Su, Hen-Hsen Huang, Yu Tsao, Hou-Chiang Tseng, Chia-Hui Chang, Lung-Hao Lee, Yuan-Fu Liao, Wei-Yun Ma
Publisher: The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Pages: 370-375
Number of pages: 6
ISBN (Electronic): 9789869576963
Publication status: Published - 2023
Event: 35th Conference on Computational Linguistics and Speech Processing, ROCLING 2023 - Taipei City, Taiwan
Duration: 20 Oct 2023 to 21 Oct 2023

Publication series

Name: ROCLING 2023 - Proceedings of the 35th Conference on Computational Linguistics and Speech Processing

Conference

Conference: 35th Conference on Computational Linguistics and Speech Processing, ROCLING 2023
Country/Territory: Taiwan
City: Taipei City
Period: 20/10/23 to 21/10/23
