Fast Gated Recurrent Network for Speech Synthesis

Bima Prihasto, Tzu Chiang Tai, Pao Chi Chang, Jia Ching Wang

Research output: Contribution to journal › Article › peer-review

Abstract

The recurrent neural network (RNN) has been used in audio and speech processing tasks such as language translation and speech recognition. Although RNN-based architectures can be applied to speech synthesis, long computing time remains the primary concern. This research proposes a fast gated recurrent neural network, a fast RNN-based architecture for speech synthesis based on the minimal gated unit (MGU). Our architecture removes the unit state history from some equations in the MGU. Our MGU-based architecture is about twice as fast as the other MGU-based architectures, with equally good sound quality.
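For orientation, the MGU as it is commonly formulated in the literature can be written as below. The symbols ($x_t$ for the input, $h_{t-1}$ for the previous unit state, $f_t$ for the forget gate, $W$, $U$, $b$ for weights and biases) are the conventional ones and are an assumption here, not notation taken from this abstract. The abstract does not say which equations lose the unit state history, so the block only marks where $h_{t-1}$ enters the standard unit.

```latex
\begin{align}
  f_t         &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right)
                 && \text{forget gate (uses } h_{t-1}\text{)} \\
  \tilde{h}_t &= \tanh\left(W_h x_t + U_h \left(f_t \odot h_{t-1}\right) + b_h\right)
                 && \text{candidate state (uses } h_{t-1}\text{)} \\
  h_t         &= \left(1 - f_t\right) \odot h_{t-1} + f_t \odot \tilde{h}_t
                 && \text{state update (uses } h_{t-1}\text{)}
\end{align}
```

Each recurrent term such as $U_f h_{t-1}$ costs one matrix-vector product per time step, so dropping the state history from an equation would plausibly reduce per-step computation; the exact modification is given in the article itself.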

Original language: English
Pages (from-to): 1634-1638
Number of pages: 5
Journal: IEICE Transactions on Information and Systems
Volume: E105D
Issue number: 9
State: Published - Sep 2022

Keywords

  • acoustic modelling
  • gated recurrent neural network
  • long short-term memory
  • speech synthesis
