Lightweight End-To-End Deep Learning Model For Music Source Separation

Yao Ting Wang, Yi Xing Lin, Kai Wen Liang, Tzu Chiang Tai, Jia Ching Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this work, we propose a lightweight end-to-end deep learning model for music source separation. Time-domain deep learning models have previously been proposed for end-to-end audio source separation. However, these models are complex and difficult to deploy on devices with limited computing resources. They also require long input contexts to achieve adequate separation, which introduces long delays and makes them unsuitable for low-latency applications. The proposed model uses Atrous Spatial Pyramid Pooling to reduce the number of parameters, and a receptive-field-preserving decoder to improve separation quality when the input context length is limited. Experimental results show that the proposed method obtains better results than previous methods while using 10% or fewer of their parameters.
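
The abstract does not give the layer configuration, so the following is only an illustrative sketch of why atrous (dilated) convolutions can save parameters, not the authors' architecture. It relies on two standard facts: a 1-D convolution with kernel size k and dilation d spans (k - 1) * d + 1 input samples, and its parameter count does not depend on d. The channel counts, kernel size, and dilation rates below are hypothetical, and the final projection layer that ASPP blocks usually include is omitted.

```python
# Illustrative sketch only: all sizes are hypothetical, not taken from the paper.

def dilated_conv1d_stats(in_ch, out_ch, kernel_size, dilation):
    """Return (parameter_count, receptive_field) for one 1-D dilated convolution."""
    params = in_ch * out_ch * kernel_size + out_ch  # weights plus biases
    receptive_field = (kernel_size - 1) * dilation + 1
    return params, receptive_field


def aspp_stats(in_ch, out_ch, kernel_size, dilations):
    """Parallel dilated branches, in the style of an ASPP block (no projection layer)."""
    branches = [dilated_conv1d_stats(in_ch, out_ch, kernel_size, d) for d in dilations]
    total_params = sum(p for p, _ in branches)
    widest_field = max(rf for _, rf in branches)
    return total_params, widest_field


def dense_conv_params(in_ch, out_ch, receptive_field):
    """Parameters of a single undilated convolution covering the same span."""
    return in_ch * out_ch * receptive_field + out_ch


# Hypothetical configuration: 64 channels, kernel size 3, four dilation rates.
aspp_params, aspp_field = aspp_stats(64, 64, 3, [1, 2, 4, 8])
dense_params = dense_conv_params(64, 64, aspp_field)
print(f"ASPP-style block: {aspp_params} parameters, receptive field {aspp_field}")
print(f"Single dense convolution with the same span: {dense_params} parameters")
```

Under these assumed sizes, the four parallel branches together cover the same span as one wide undilated kernel while using fewer weights, and they also provide several intermediate context sizes at once.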

Original language: English
Title of host publication: 2022 13th International Symposium on Chinese Spoken Language Processing, ISCSLP 2022
Editors: Kong Aik Lee, Hung-yi Lee, Yanfeng Lu, Minghui Dong
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 315-318
Number of pages: 4
ISBN (Electronic): 9798350397963
DOIs
State: Published - 2022
Event: 13th International Symposium on Chinese Spoken Language Processing, ISCSLP 2022 - Singapore, Singapore
Duration: 11 Dec 2022 – 14 Dec 2022

Publication series

Name: 2022 13th International Symposium on Chinese Spoken Language Processing, ISCSLP 2022

Conference

Conference: 13th International Symposium on Chinese Spoken Language Processing, ISCSLP 2022
Country/Territory: Singapore
City: Singapore
Period: 11/12/22 – 14/12/22

Keywords

  • Deep learning
  • lightweight
  • music source separation
