Abstract
In this paper, we propose a singer identification approach that automatically identifies the singer of an unknown MP3 audio recording. Unlike previous research on singer identification in the MP3 compressed domain, we use Mel-frequency cepstral coefficients (MFCC) as the feature instead of modified discrete cosine transform (MDCT) coefficients. Although MFCC is widely used in music classification and speaker recognition, it cannot be obtained directly from compressed music data such as MP3. We therefore introduce a modified method for calculating the MFCC vector in the MP3 compressed domain. A Gaussian mixture model (GMM) is used to describe the distribution of the MFCC vectors. To find the nearest singer, we use maximum likelihood classification (MLC) to assign each input MFCC vector to its nearest group. The experimental results verify the feasibility of the proposed approach.
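The GMM and maximum likelihood classification steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes diagonal-covariance GMMs whose parameters are already trained, it scores a recording by summing per-frame log-likelihoods (a common convention that the abstract does not confirm), and the names `log_gaussian_diag`, `gmm_log_likelihood`, `identify_singer`, and the toy two-dimensional "MFCC" values are all hypothetical. Extracting MFCC in the MP3 compressed domain is not covered here.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Sum over frames of log(sum_k w_k * N(x | mean_k, var_k))."""
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(comps)  # log-sum-exp trick for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total

def identify_singer(frames, models):
    """Return the singer whose GMM gives the highest total log-likelihood."""
    return max(models, key=lambda name: gmm_log_likelihood(frames, models[name]))

# Toy models (assumed values): each GMM is a list of (weight, mean, variance).
models = {
    "singer_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "singer_b": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}

# Toy feature frames standing in for MFCC vectors of an unknown recording.
frames = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]
print(identify_singer(frames, models))
```

Because the toy frames lie close to the component means of `singer_a`, that model receives the higher score. In practice the mixture parameters would be estimated from each singer's training data, for example with the EM algorithm.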
Original language | English |
---|---|
Pages (from - to) | 1489-1509 |
Number of pages | 21 |
Journal | Multimedia Tools and Applications |
Volume | 74 |
Issue number | 4 |
Publication status | Published - Feb 2014 |