Complex-Valued Gaussian Process Latent Variable Model for Phase-Incorporating Speech Enhancement

Sih Huei Chen, Yuan Shan Lee, Jia Ching Wang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Traditional speech enhancement techniques modify the magnitude of speech in the time-frequency domain and use the phase of the noisy speech to resynthesize the time-domain signal. This work proposes a complex-valued Gaussian process latent variable model (CGPLVM) that directly enhances the complex-valued noisy spectrum, modifying not only the magnitude but also the phase. The main idea underlying the method is to model the short-time Fourier transform (STFT) coefficients across the time frames of a speech signal as a proper complex Gaussian process (GP) with added noise. The method is based on projecting the spectrum into a low-dimensional subspace. Experiments were carried out on the CHTTL database, which contains the digits zero to nine in Mandarin. Several standard measures show that the proposed method outperforms the baselines across various noise types and SNR levels.
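As a minimal illustration of the conventional pipeline that the abstract contrasts against (not the CGPLVM method itself), the sketch below scales the magnitude of a few noisy STFT bins and reuses the noisy phase. The toy bin values, the oracle gain, and the helper name `magnitude_only_enhance` are assumptions made for this example only. It shows that even a perfect magnitude estimate leaves a phase error, which is the gap a complex-valued model aims to close.

```python
import cmath


def magnitude_only_enhance(noisy_bins, gains):
    """Scale each noisy bin's magnitude by a real gain and keep the noisy phase."""
    return [cmath.rect(g * abs(y), cmath.phase(y)) for y, g in zip(noisy_bins, gains)]


# Hypothetical toy data: three clean STFT bins plus additive complex noise.
clean = [1 + 1j, 0.5 - 2j, -1.5 + 0.2j]
noise = [0.3 - 0.4j, -0.2 + 0.1j, 0.1 + 0.6j]
noisy = [s + n for s, n in zip(clean, noise)]

# Oracle gain (illustration only): the ratio of clean to noisy magnitude,
# so the enhanced magnitude matches the clean magnitude exactly.
gains = [abs(s) / abs(y) for s, y in zip(clean, noisy)]
enhanced = magnitude_only_enhance(noisy, gains)

# Magnitude error vanishes, but the phase is still that of the noisy bin.
magnitude_error = [abs(abs(e) - abs(s)) for e, s in zip(enhanced, clean)]
phase_error = [abs(cmath.phase(e * s.conjugate())) for e, s in zip(enhanced, clean)]

for k, (m_err, p_err) in enumerate(zip(magnitude_error, phase_error)):
    print(f"bin {k}: magnitude error {m_err:.2e}, phase error {p_err:.3f} rad")
```

In this toy setting, the remaining error after magnitude-only enhancement is purely a phase error, which is why an estimator that operates on the complex spectrum can in principle do better.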

Original language: English
Title of host publication: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5439-5443
Number of pages: 5
ISBN (Print): 9781538646588
State: Published - 10 Sep 2018
Event: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Calgary, Canada
Duration: 15 Apr 2018 to 20 Apr 2018

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2018-April
ISSN (Print): 1520-6149

Conference

Conference: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018
Country/Territory: Canada
City: Calgary
Period: 15/04/18 to 20/04/18

Keywords

  • Binary mask
  • Complex-valued Gaussian process latent variable model
  • Phase
