TY - GEN
T1 - Recognition and retrieval of sound events using sparse coding convolutional neural network
AU - Wang, Chien Yao
AU - Santoso, Andri
AU - Mathulaprangsan, Seksan
AU - Chiang, Chin Chin
AU - Wu, Chung Hsien
AU - Wang, Jia Ching
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/8/28
Y1 - 2017/8/28
N2 - This paper proposes a novel deep convolutional neural network (CNN), called the sparse coding convolutional neural network (SC-CNN), to address the tasks of sound event recognition and retrieval. Unlike the general CNN framework, in which feature learning is performed hierarchically, the proposed framework models the whole memorization procedure of the human brain, including encoding, storage, and recollection. Sound data from the RWCP sound scene dataset, with added noise from the NOISEX-92 noise dataset, are used to compare the performance of the proposed system with state-of-the-art baselines. The experimental results indicate that the proposed SC-CNN outperforms the state-of-the-art systems in sound event recognition and retrieval. In the sound event recognition task, the proposed system achieved accuracies of 94.6%, 100%, and 100% under 0 dB, 10 dB, and clean conditions, respectively. In the retrieval task, the proposed system improves the mean average precision (mAP) of the general CNN by approximately 6%.
KW - Sound event recognition
KW - Sound event retrieval
KW - Sparse coding convolutional neural network
UR - http://www.scopus.com/inward/record.url?scp=85030237057&partnerID=8YFLogxK
U2 - 10.1109/ICME.2017.8019552
DO - 10.1109/ICME.2017.8019552
M3 - Conference contribution
AN - SCOPUS:85030237057
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
SP - 589
EP - 594
BT - 2017 IEEE International Conference on Multimedia and Expo, ICME 2017
PB - IEEE Computer Society
T2 - 2017 IEEE International Conference on Multimedia and Expo, ICME 2017
Y2 - 10 July 2017 through 14 July 2017
ER -