Spectral-Temporal Receptive Field-Based Descriptors and Hierarchical Cascade Deep Belief Network for Guitar Playing Technique Classification

Chien Yao Wang, Pao Chi Chang, Jian Jiun Ding, Tzu Chiang Tai, Andri Santoso, Yu Ting Liu, Jia Ching Wang

Research output: Contribution to journal, Article, peer-review

9 Scopus citations

Abstract

Music information retrieval is of great interest in audio signal processing. However, relatively little attention has been paid to the playing techniques of musical instruments. This work proposes an automatic system for classifying guitar playing techniques (GPTs). Automatic GPT classification is challenging because some playing techniques differ only slightly from others. The proposed framework uses a new feature extraction method based on spectral-temporal receptive fields (STRFs) to extract features from guitar sounds, and applies a supervised deep learning approach to classify GPTs. Specifically, a new deep learning model, called the hierarchical cascade deep belief network (HCDBN), is proposed to perform automatic GPT classification. Several simulations were performed on three datasets to compare performance: 1) data on onsets of signals; 2) complete audio signals; and 3) audio signals in a real-world environment. The proposed system improves upon the F-score by approximately 11.47% in setup 1) and yields an F-score of 96.82% in setup 2). The results in setup 3) demonstrate that the proposed system also works well in a real-world environment. These results show that the proposed system is robust and has very high accuracy in automatic GPT classification.
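The abstract reports its results as F-scores. As a reminder of what that metric measures, the following is a minimal sketch of a per-class and macro-averaged F1 computation in plain Python. It is not the authors' evaluation code: the abstract does not say which averaging scheme was used, so macro averaging is an assumption here, and the technique labels ("bend", "slide") and function names are hypothetical placeholders.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical labels, for illustration only.
y_true = ["bend", "bend", "slide", "slide"]
y_pred = ["bend", "slide", "slide", "slide"]
scores = per_class_f1(y_true, y_pred)
overall = macro_f1(y_true, y_pred)
```

Macro averaging weights every technique equally, which matters when some techniques are rarer than others; a micro or weighted average would give different numbers on imbalanced data.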

Original language: English
Pages (from-to): 3684-3695
Number of pages: 12
Journal: IEEE Transactions on Cybernetics
Volume: 52
Issue number: 5
State: Published - 1 May 2022

Keywords

  • Deep belief network (DBN)
  • guitar playing technique (GPT) classification
  • neural network
  • spectral-temporal receptive fields (STRFs)
