CNEG-VC: Contrastive Learning Using Hard Negative Example In Non-Parallel Voice Conversion

Bima Prihasto, Yi Xing Lin, Phuong Thi Le, Chien Lin Huang, Jia Ching Wang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

3 Scopus citations

Abstract

Contrastive learning has advantages for non-parallel voice conversion, but previous methods leave room for improvement in conversion quality and preservation. In previous techniques, negative samples were selected at random from different locations in the feature vector. As a result, a positive example could not be effectively pushed toward the query examples. To solve this problem, we present contrastive learning with hard negative examples for non-parallel voice conversion, which we name CNEG-VC. Specifically, we train a generator to produce negative examples. The proposed generator has two specific features. First, it generates instance-wise negative examples based on the voice input. Second, when trained with an adversarial loss, it can produce hard negative examples. The generator significantly improves non-parallel voice conversion performance, and CNEG-VC achieves state-of-the-art results, outperforming previous techniques.
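The effect of hard negatives on a contrastive objective can be illustrated with a minimal sketch. It assumes a generic InfoNCE-style loss over one query, one positive, and a set of negatives; it is not the CNEG-VC generator or the authors' implementation. The function name `info_nce`, the temperature of 0.07, and the hand-picked 2-D vectors are all hypothetical choices for illustration only.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce(query, positive, negatives, temperature=0.07):
    """InfoNCE-style loss for one query (assumed form, not the paper's exact loss)."""
    q = normalize(query)
    # Cosine similarities scaled by the temperature; the positive is at index 0.
    logits = [dot(q, normalize(positive)) / temperature]
    logits += [dot(q, normalize(n)) / temperature for n in negatives]
    m = max(logits)  # subtract the max before exponentiating for stability
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]  # equals -log softmax(logits)[0]


# Hypothetical toy features, chosen by hand for illustration.
query = [1.0, 0.0]
positive = [0.9, 0.1]
easy_negatives = [[-1.0, 0.0], [0.0, -1.0]]  # far from the query
hard_negatives = [[0.8, 0.6], [0.8, -0.6]]   # close to the query

easy_loss = info_nce(query, positive, easy_negatives)
hard_loss = info_nce(query, positive, hard_negatives)
print(f"easy negatives: {easy_loss:.6f}")
print(f"hard negatives: {hard_loss:.6f}")
```

In this toy setting the negatives that lie close to the query give a larger loss than the distant ones, so they contribute a stronger training signal. That is the general motivation for preferring hard negatives over randomly selected ones.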

Original language: English
Title of host publication: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728163277
DOIs
State: Published - 2023
Event: 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 - Rhodes Island, Greece
Duration: 4 Jun 2023 to 10 Jun 2023

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2023-June
ISSN (Print): 1520-6149

Conference

Conference: 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Country/Territory: Greece
City: Rhodes Island
Period: 4/06/23 to 10/06/23

Keywords

  • contrastive learning
  • generative adversarial networks
  • hard negative example
  • non-parallel data
  • voice conversion
