MediaEval 2015: Recurrent neural network approach to emotion in music task

Yu Hao Chin, Jia Ching Wang

Research output: Contribution to journal › Conference article › peer-review

Abstract

This paper describes our work on the "Emotion in Music" task of MediaEval 2015. The goal of the task is to predict the affective content of a song, expressed as time-continuous valence and arousal values. We adopt a deep recurrent neural network (DRNN) to predict valence and arousal at each moment of a song, and the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm is used to update the weights during back-propagation. The DRNN takes the targets of previous time segments into account when predicting the target of the current segment; such time-aware prediction is expected to outperform common machine learning models that treat segments independently. After comparing it with our own feature set, we finally use the baseline feature set, which was also adopted by last year's winning team. Ten-fold cross-validation is used for the internal experiments, in which the system achieves r values of -0.5904 for valence and 0.4195 for arousal, with root-mean-squared errors (RMSE) of 0.4054 and 0.3804, respectively. On the evaluation dataset, the system achieves r values of -0.0103 ± 0.3420 for valence and 0.3417 ± 0.2501 for arousal, with RMSEs of 0.3359 ± 0.1614 and 0.2555 ± 0.1255, respectively.
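For a concrete picture of the setup described above, the following is a minimal sketch (not the authors' implementation) of a recurrent regressor for time-continuous valence/arousal, trained with PyTorch's built-in L-BFGS optimizer, torch.optim.LBFGS. The EmotionDRNN class, the 260-dimensional feature size, the segment counts, and all hyperparameters are illustrative assumptions; the recurrence here carries past context through the hidden state rather than feeding previous targets back in explicitly, which may differ from the authors' design.

```python
# Minimal sketch (not the authors' code): a recurrent network regressing
# per-segment valence/arousal, optimized with L-BFGS as in the abstract.
# Feature size, segment length, and hyperparameters are assumptions.
import torch
import torch.nn as nn

class EmotionDRNN(nn.Module):
    def __init__(self, n_features=260, hidden=64, layers=2):
        super().__init__()
        # Stacked (deep) recurrent layers over the segment sequence;
        # the hidden state carries context from earlier segments.
        self.rnn = nn.RNN(n_features, hidden, num_layers=layers,
                          batch_first=True)
        # Linear head producing [valence, arousal] for each segment.
        self.head = nn.Linear(hidden, 2)

    def forward(self, x):            # x: (batch, segments, features)
        out, _ = self.rnn(x)
        return self.head(out)        # (batch, segments, 2)

model = EmotionDRNN()
optimizer = torch.optim.LBFGS(model.parameters(), lr=0.1, max_iter=20)
loss_fn = nn.MSELoss()

# Toy batch: 8 songs, 60 time segments, 260-dim features (illustrative).
x = torch.randn(8, 60, 260)
y = torch.randn(8, 60, 2)            # ground-truth valence/arousal

def closure():
    # L-BFGS re-evaluates the loss several times per step,
    # so the loss and gradients are computed inside a closure.
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()                  # back-propagation through time
    return loss

for _ in range(10):
    optimizer.step(closure)
```

After training, per-segment predictions from model(x) can be compared against annotations with Pearson's r and RMSE, the two metrics reported in the abstract.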

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 1436
Publication status: Published - 2015
Event: Multimedia Benchmark Workshop, MediaEval 2015 - Wurzen, Germany
Duration: 14 Sep 2015 - 15 Sep 2015
