Enhancing Automatic Speech Recognition Performance Through Multi-Speaker Text-to-Speech

Po Kai Chen, Hsin Min Wang, Bing Jhih Huang, Chi Tao Chen, Jia Ching Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this study, we present a novel approach to improving our Hakka Automatic Speech Recognition (ASR) model through Text-to-Speech (TTS) data augmentation. We use synthesized speech from diverse speakers to expand our training dataset, which reduces the Character Error Rate (CER) by approximately 0.2 on the validation set and approximately 3.96 on the test set. These results support the effectiveness of multi-speaker TTS for generating ASR training data, improving the robustness and accuracy of our ASR system.
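For readers unfamiliar with the metric, the sketch below shows the conventional definition of CER: the character-level edit distance between a reference transcript and the recognizer's hypothesis, divided by the reference length. This record does not say how the authors computed their scores or whether the reported reductions are percentage points, so the function names (`edit_distance`, `cer`) and the normalization are illustrative assumptions, not the paper's evaluation code.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate as a fraction of the reference length (assumed convention)."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

Under this definition, a hypothesis that gets one character wrong in a four-character reference scores 0.25; a corpus-level score is usually obtained by summing edit distances and reference lengths over all utterances before dividing.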

Original language: English
Title of host publication: ROCLING 2023 - Proceedings of the 35th Conference on Computational Linguistics and Speech Processing
Editors: Jheng-Long Wu, Ming-Hsiang Su, Hen-Hsen Huang, Yu Tsao, Hou-Chiang Tseng, Chia-Hui Chang, Lung-Hao Lee, Yuan-Fu Liao, Wei-Yun Ma
Publisher: The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Pages: 370-375
Number of pages: 6
ISBN (Electronic): 9789869576963
State: Published - 2023
Event: 35th Conference on Computational Linguistics and Speech Processing, ROCLING 2023 - Taipei City, Taiwan
Duration: 20 Oct 2023 to 21 Oct 2023

Publication series

Name: ROCLING 2023 - Proceedings of the 35th Conference on Computational Linguistics and Speech Processing

Conference

Conference: 35th Conference on Computational Linguistics and Speech Processing, ROCLING 2023
Country/Territory: Taiwan
City: Taipei City
Period: 20/10/23 to 21/10/23

Keywords

  • Automatic Speech Recognition
  • Data extension
  • Multi-Speaker Text-to-Speech
