TY - JOUR
T1 - Sound Events Recognition and Retrieval Using Multi-Convolutional-Channel Sparse Coding Convolutional Neural Networks
AU - Wang, Chien Yao
AU - Tai, Tzu Chiang
AU - Wang, Jia Ching
AU - Santoso, Andri
AU - Mathulaprangsan, Seksan
AU - Chiang, Chin Chin
AU - Wu, Chung Hsien
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2020
Y1 - 2020
N2 - This article proposes two novel deep convolutional neural networks (CNNs), called the sparse coding convolutional neural network (SC-CNN) and the multi-convolutional-channel SC-CNN (MSC-CNN), to address the sound event recognition and retrieval problem. Unlike the general framework of a CNN, in which the feature learning process is performed hierarchically, the proposed framework models the whole memorization process in the human brain, including encoding, storage, and recollection. In particular, the MSC-CNN is designed to recognize multiple sound events that occur simultaneously. The experimental results indicate that the proposed SC-CNN and MSC-CNN outperform the state-of-the-art systems in sound event recognition and retrieval.
AB - This article proposes two novel deep convolutional neural networks (CNNs), called the sparse coding convolutional neural network (SC-CNN) and the multi-convolutional-channel SC-CNN (MSC-CNN), to address the sound event recognition and retrieval problem. Unlike the general framework of a CNN, in which the feature learning process is performed hierarchically, the proposed framework models the whole memorization process in the human brain, including encoding, storage, and recollection. In particular, the MSC-CNN is designed to recognize multiple sound events that occur simultaneously. The experimental results indicate that the proposed SC-CNN and MSC-CNN outperform the state-of-the-art systems in sound event recognition and retrieval.
KW - Sound event recognition
KW - deep learning
KW - sound event retrieval
KW - sparse coding convolutional neural network
UR - http://www.scopus.com/inward/record.url?scp=85087459713&partnerID=8YFLogxK
U2 - 10.1109/TASLP.2020.2964959
DO - 10.1109/TASLP.2020.2964959
M3 - Journal article
AN - SCOPUS:85087459713
SN - 2329-9290
VL - 28
SP - 1875
EP - 1887
JO - IEEE/ACM Transactions on Audio, Speech, and Language Processing
JF - IEEE/ACM Transactions on Audio, Speech, and Language Processing
M1 - 8952659
ER -