Discriminative training of complex-valued deep recurrent neural network for singing voice separation

Yuan Shan Lee, Kuo Yu, Sih Huei Chen, Jia Ching Wang

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

4 citations (Scopus)

Abstract

Deep neural networks (DNNs) have performed impressively in the processing of multimedia signals. Most DNN-based approaches were developed to handle real-valued data; very few have been designed for complex-valued data, even though such data are essential for processing many types of multimedia signals. Accordingly, this work presents a complex-valued deep recurrent neural network (C-DRNN) for singing voice separation. The C-DRNN operates in the complex-valued short-time discrete Fourier transform (STFT) domain. A key aspect of the C-DRNN is that both its activations and its weights are complex-valued. The goal is to reconstruct the singing voice and the background music from a mixed signal. For error back-propagation, ℂℝ-calculus is used to calculate the complex-valued gradients of the objective function. To reinforce model regularity, two constraints are incorporated into the objective function of the C-DRNN. The first is an additional masking layer that ensures that the sum of the separated sources equals the input mixture. The second is a discriminative term that preserves the mutual difference between the two separated sources. Finally, the proposed method is evaluated on a singing voice separation task using the MIR-1K dataset. Experimental results demonstrate that the proposed method outperforms state-of-the-art DNN-based methods.
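The abstract names the two constraints but gives no formulas for them. As a rough illustration only, the sketch below assumes a magnitude-ratio soft mask applied to the complex mixture and a squared-error discriminative term, a form commonly seen in DNN-based source separation; it is not confirmed to be the formulation used in this paper. The function names `soft_mask_separate` and `discriminative_loss`, the parameters `gamma` and `eps`, and all numbers are hypothetical.

```python
# Illustrative sketch: both formulas are assumptions, not taken from the paper.

def soft_mask_separate(est_voice, est_music, mixture, eps=1e-12):
    """Split each complex mixture bin between two sources by estimate magnitude."""
    voice, music = [], []
    for v, m, z in zip(est_voice, est_music, mixture):
        total = abs(v) + abs(m) + eps   # eps avoids division by zero
        w = abs(v) / total              # share of this bin assigned to the voice
        voice.append(w * z)
        music.append(z - w * z)         # remainder, so voice + music equals z
    return voice, music


def discriminative_loss(voice, music, ref_voice, ref_music, gamma=0.05):
    """Error to the matching reference minus gamma times error to the other one."""
    def sq_err(a, b):
        return sum(abs(x - y) ** 2 for x, y in zip(a, b))

    match = sq_err(voice, ref_voice) + sq_err(music, ref_music)
    cross = sq_err(voice, ref_music) + sq_err(music, ref_voice)
    return match - gamma * cross


# Toy complex STFT bins (made-up values standing in for one frame).
ref_voice = [1 + 1j, 0.5 - 0.2j, 0j]
ref_music = [0.2 - 0.5j, -1 + 0.3j, 0.4 + 0.4j]
mixture = [v + m for v, m in zip(ref_voice, ref_music)]

# Hypothetical raw network outputs before the masking step.
est_voice = [0.9 + 1.1j, 0.4 - 0.1j, 0.05j]
est_music = [0.3 - 0.4j, -0.9 + 0.2j, 0.5 + 0.3j]

voice, music = soft_mask_separate(est_voice, est_music, mixture)
loss = discriminative_loss(voice, music, ref_voice, ref_music)
```

In this sketch the masked outputs add up to the mixture in every bin by construction, and the `gamma` term rewards each output for staying far from the other source's reference. Setting `gamma` to 0 reduces the objective to a plain squared reconstruction error.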

Original language: English
Title of host publication: MM 2017 - Proceedings of the 2017 ACM Multimedia Conference
Publisher: Association for Computing Machinery, Inc
Pages: 1327-1335
Number of pages: 9
ISBN (electronic): 9781450349062
Publication status: Published - 23 Oct 2017
Event: 25th ACM International Conference on Multimedia, MM 2017 - Mountain View, United States
Duration: 23 Oct 2017 to 27 Oct 2017

Publication series

Name: MM 2017 - Proceedings of the 2017 ACM Multimedia Conference

Conference

Conference: 25th ACM International Conference on Multimedia, MM 2017
Country/Territory: United States
City: Mountain View
Period: 23/10/17 to 27/10/17
