MediaEval 2015: Recurrent neural network approach to emotion in music task

Yu Hao Chin, Jia Ching Wang

Research output: Contribution to journal › Conference article › peer-review

Abstract

This paper describes our work for the "Emotion in Music" task of MediaEval 2015. The goal of the task is to predict the affective content of a song, expressed in terms of valence and arousal in a time-continuous fashion. We adopt a deep recurrent neural network (DRNN) to predict the valence and arousal for each moment of a song, and the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm is used to update the weights during back-propagation. The DRNN considers the targets of previous time segments when predicting the target of the current segment; such time-aware prediction is expected to outperform conventional machine learning models. After comparing it with our own feature set, we finally use the baseline feature set, which was adopted by last year's winning entry. A 10-fold cross-validation is used for the internal experiments, in which the system achieves r values of -0.5904 for valence and 0.4195 for arousal; the corresponding root-mean-squared errors (RMSE) are 0.4054 and 0.3804. On the evaluation dataset, the system achieves r values of -0.0103±0.3420 for valence and 0.3417±0.2501 for arousal, with RMSEs of 0.3359±0.1614 and 0.2555±0.1255, respectively.
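For illustration, the following is a minimal sketch (not the authors' released code) of a recurrent regression model that maps per-segment audio features to valence and arousal and is trained with L-BFGS, as the abstract describes. The feature dimension, network sizes, and training loop are assumptions made for the example, and a standard stacked RNN is used here, whereas the paper's DRNN also feeds previous targets back into the prediction.

# Minimal sketch, assuming PyTorch; sizes and data are illustrative only.
import torch
import torch.nn as nn

class EmotionDRNN(nn.Module):
    def __init__(self, n_features=260, hidden=64, layers=2):
        super().__init__()
        # Stacked (deep) recurrent layers over the sequence of song segments.
        self.rnn = nn.RNN(n_features, hidden, num_layers=layers, batch_first=True)
        # One valence value and one arousal value per time segment.
        self.head = nn.Linear(hidden, 2)

    def forward(self, x):                # x: (batch, segments, n_features)
        out, _ = self.rnn(x)             # out: (batch, segments, hidden)
        return self.head(out)            # (batch, segments, 2)

model = EmotionDRNN()
criterion = nn.MSELoss()
# L-BFGS re-evaluates the loss inside a closure at each inner iteration.
optimizer = torch.optim.LBFGS(model.parameters(), lr=0.1, max_iter=20)

# Dummy batch: 8 songs, 60 segments each, random features and annotations.
x = torch.randn(8, 60, 260)
y = torch.rand(8, 60, 2) * 2 - 1         # valence/arousal scaled to [-1, 1]

def closure():
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    return loss

for _ in range(10):                       # a few L-BFGS steps
    optimizer.step(closure)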

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 1436
State: Published - 2015
Event: Multimedia Benchmark Workshop, MediaEval 2015 - Wurzen, Germany
Duration: 14 Sep 2015 - 15 Sep 2015
