
Self-supervised Learning and Masked Language Model for Code-switching Automatic Speech Recognition

  • Po Kai Chen
  • Li Yeh Fu
  • Cheng Kai Chen
  • Yi Xing Lin
  • Chih Ping Chen
  • Chien Lin Huang
  • Jia Ching Wang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Code-switching (CS) is a common linguistic phenomenon that poses significant challenges for automatic speech recognition systems because labeled CS corpora are scarce. In this paper, we propose a novel approach to address this challenge by leveraging self-supervised learning (SSL) and a masked language model (MLM) in speech recognition. Specifically, we use the pre-trained wav2vec 2.0 model to reduce the dependency on labeled CS data, and the MLM to rerank the candidate sentences generated by beam search decoding. The proposed method is evaluated on the SEAME dataset, and experimental results show that it outperforms state-of-the-art CS speech recognition approaches by 15.6% and 19.9% in terms of token error rate (TER). Moreover, the proposed method is generalizable and can be extended to other CS language pairs. These results demonstrate the effectiveness of our approach and its potential for future research in CS speech recognition.
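
The reranking step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each beam search hypothesis arrives with an ASR log-score, that an MLM can be queried for the log-probability of one masked token given the rest of the sentence, and that the two scores are combined linearly. The names `masked_log_prob`, `toy_masked_log_prob`, `rerank`, and `lm_weight` are hypothetical, and the toy scorer merely stands in for a real masked language model.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: log P(tokens[i] | all other tokens) under some MLM.
MaskedLogProb = Callable[[Sequence[str], int], float]


def pseudo_log_likelihood(tokens: Sequence[str], masked_log_prob: MaskedLogProb) -> float:
    """Mask each position in turn and sum the log-probabilities of the true tokens."""
    return sum(masked_log_prob(tokens, i) for i in range(len(tokens)))


def rerank(
    hypotheses: List[Tuple[List[str], float]],
    masked_log_prob: MaskedLogProb,
    lm_weight: float = 0.5,  # assumed interpolation weight, not taken from the paper
) -> List[List[str]]:
    """Sort n-best hypotheses by ASR log-score plus weighted MLM pseudo-log-likelihood."""
    scored = [
        (asr_score + lm_weight * pseudo_log_likelihood(tokens, masked_log_prob), tokens)
        for tokens, asr_score in hypotheses
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tokens for _, tokens in scored]


def toy_masked_log_prob(tokens: Sequence[str], i: int) -> float:
    """Toy stand-in for an MLM: tokens in a small vocabulary are judged likely."""
    known = {"我", "想", "buy", "coffee"}
    return math.log(0.9) if tokens[i] in known else math.log(0.1)


# Two made-up Mandarin-English hypotheses with made-up ASR log-scores.
n_best = [
    (["我", "想", "by", "coffee"], -1.0),   # preferred by the ASR score alone
    (["我", "想", "buy", "coffee"], -1.2),  # preferred once the MLM score is added
]
best = rerank(n_best, toy_masked_log_prob)[0]
print(" ".join(best))
```

In this toy case the acoustically preferred hypothesis contains an implausible token, so the MLM term moves the second hypothesis to the top of the list.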

Original language: English
Title of host publication: ICCE 2024 - 2024 IEEE 10th International Conference on Communications and Electronics
Editors: Seong Ho Jeong, Ho Dac Loc, Serge Fdida, Tho Le-Ngoc
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 387-391
Number of pages: 5
ISBN (Electronic): 9798350379785
DOIs
State: Published - 2024
Event: 10th IEEE International Conference on Communications and Electronics, ICCE 2024 - Da Nang City, Viet Nam
Duration: 31 Jul 2024 to 2 Aug 2024

Publication series

Name: ICCE 2024 - 2024 IEEE 10th International Conference on Communications and Electronics

Conference

Conference: 10th IEEE International Conference on Communications and Electronics, ICCE 2024
Country/Territory: Viet Nam
City: Da Nang City
Period: 31/07/24 to 2/08/24

Keywords

  • code-switching
  • masked language modeling
  • self-supervised learning
  • speech recognition
