Acoustic scene classification using reduced MobileNet architecture

Jun Xiang Xu, Tzu Ching Lin, Tsai Ching Yu, Tzu Chiang Tai, Pao Chi Chang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Sounds are ubiquitous in our daily lives, for instance the sounds of vehicles or of conversations between people. It is therefore easy to collect such recordings and categorize them into groups, and these recordings can then be used to recognize the scene in which they were captured. Acoustic scene classification does this by training a model that can later be deployed on devices such as smartphones, which offers users added convenience. Our goal is to maximize validation accuracy while also keeping hardware usage low. We train on the dataset from the IEEE Detection and Classification of Acoustic Scenes and Events (DCASE) challenge. The DCASE 2017 data contains 15 acoustic scene classes, including beach, bus, and restaurant. In this work, we use two signal processing techniques: Log Mel and HPSS (Harmonic-Percussive Sound Separation). We then modify and reduce the MobileNet structure and train it on this dataset. We also apply fine-tuning and late fusion to improve accuracy. With this structure, we reach a validation accuracy of 75.99%, which corresponds approximately to the seventh highest performing group in the DCASE 2017 Challenge, with lower computational complexity than the systems that achieve higher accuracy. We deem it a worthy trade-off.
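The abstract mentions late fusion of models trained on Log Mel and HPSS features but does not say how the fusion is done. The following is a minimal sketch of one common form, a weighted average of per-model class probabilities, and is not necessarily the authors' method. The function names, the equal default weights, and the three-class probability vectors are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence


def late_fusion(
    model_probs: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Weighted average of per-model class probabilities for one clip.

    Assumption: each inner sequence is a probability vector over the same
    classes, produced by a separately trained model.
    """
    if not model_probs:
        raise ValueError("need at least one model output")
    n_classes = len(model_probs[0])
    if any(len(p) != n_classes for p in model_probs):
        raise ValueError("all models must score the same classes")
    if weights is None:
        # Assumption: equal weighting when no weights are given.
        weights = [1.0] * len(model_probs)
    if len(weights) != len(model_probs):
        raise ValueError("need exactly one weight per model")
    total = float(sum(weights))
    return [
        sum(w * p[c] for w, p in zip(weights, model_probs)) / total
        for c in range(n_classes)
    ]


def predict(fused: Sequence[float]) -> int:
    """Return the index of the class with the highest fused probability."""
    return max(range(len(fused)), key=lambda c: fused[c])


# Hypothetical outputs for one clip over three classes (not real results).
logmel_probs = [0.5, 0.3, 0.2]  # model trained on Log Mel features
hpss_probs = [0.2, 0.6, 0.2]    # model trained on HPSS features

fused = late_fusion([logmel_probs, hpss_probs])
label = predict(fused)
```

In this made-up example the two models disagree, and averaging lets the more confident HPSS-based prediction decide the fused label. Unequal weights could be passed if one model were known to be more reliable on validation data.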

Original language: English
Title of host publication: Proceedings - 2018 IEEE International Symposium on Multimedia, ISM 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 267-270
Number of pages: 4
ISBN (Electronic): 9781538668573
State: Published - 4 Jan 2019
Event: 20th IEEE International Symposium on Multimedia, ISM 2018 - Taichung, Taiwan
Duration: 10 Dec 2018 to 12 Dec 2018

Publication series

Name: Proceedings - 2018 IEEE International Symposium on Multimedia, ISM 2018

Conference

Conference: 20th IEEE International Symposium on Multimedia, ISM 2018
Country/Territory: Taiwan
City: Taichung
Period: 10/12/18 to 12/12/18

Keywords

  • DCASE 2017
  • MobileNet
  • Seventh highest
