Content-based singer classification on compressed domain audio data

Tsung Han Tsai, Yu Siang Huang, Pei Yun Liu, De Ming Chen

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)


In this paper, we propose a singer identification approach that automatically identifies the singer of an unknown MP3 audio clip. Unlike previous research on singer identification in the MP3 compressed domain, we use Mel-Frequency Cepstral Coefficients (MFCC) as the feature instead of modified discrete cosine transform (MDCT) coefficients. Although MFCC is often used in music classification and speaker recognition, it cannot be obtained directly from compressed music data such as MP3. We introduce a modified method for calculating the MFCC vector in the MP3 compressed domain. A Gaussian mixture model (GMM) is applied to describe the distribution of the MFCC vectors. To find the nearest singer, we use maximum likelihood classification (MLC) to assign each input MFCC vector to its nearest group. The experimental results verify the feasibility of the proposed approach.
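The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each singer is already modeled by a diagonal-covariance GMM, and it scores a clip by summing per-frame log-likelihoods, which is one common way to apply maximum likelihood classification. The names `log_gaussian_diag`, `gmm_log_likelihood`, `classify_singer`, and `SINGER_MODELS`, as well as the toy 2-D parameters standing in for MFCC vectors, are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log p(x | gmm); log-sum-exp over components for numerical stability."""
    terms = [
        math.log(weight) + log_gaussian_diag(x, mean, var)
        for weight, mean, var in gmm
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def classify_singer(frames, singer_models):
    """Return (best singer, scores) by total log-likelihood over all frames."""
    scores = {
        name: sum(gmm_log_likelihood(x, gmm) for x in frames)
        for name, gmm in singer_models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy models: each component is (weight, mean, variance).
# Real models would be trained on MFCC vectors, typically with EM.
SINGER_MODELS = {
    "singer_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [2.0, 2.0], [1.0, 1.0])],
    "singer_b": [(0.7, [6.0, 6.0], [1.0, 1.0]), (0.3, [8.0, 5.0], [1.0, 1.0])],
}

# Stand-in feature vectors for an unknown clip (assumed, not real MFCCs).
frames = [[0.1, -0.2], [1.9, 2.1], [0.5, 0.4]]
best, scores = classify_singer(frames, SINGER_MODELS)
print(best)  # prints "singer_a"
```

Summing log-likelihoods treats the frames as independent given the singer. Assigning each frame to its most likely model and taking a majority vote would be an alternative reading of the per-vector assignment mentioned in the abstract.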

Pages (from-to): 1489-1509
Journal: Multimedia Tools and Applications
Publication status: Published - Feb 2014

