GMM-Based Speaker Verification System with Hardware MFCC in SoC Design

Tsung Han Tsai, Chiao Li Wang

Research output: Contribution to journal › Article › peer-review

Abstract

In recent years, speaker verification has been extensively explored, and its effectiveness has improved significantly. It analyzes the voiceprint characteristics of speakers and uses the differences between them for verification. In this paper, we propose a text-dependent speaker verification system and a hardware implementation of its feature extraction. The proposed system has two phases: enrollment and verification. In the enrollment phase, the speaker provides appropriate speech, such as continuous number strings, sentences, or phrases, from which the system builds the speaker's model. In the verification phase, the speech to be verified is scored against the enrolled speaker models, and the similarity between the speech and the models is used to discriminate. We further design the whole system as a system-on-a-chip (SoC). We implement the Mel-frequency cepstral coefficient (MFCC) pre-processing module on an FPGA and implement the lightweight post-processing models, such as the Gaussian Mixture Model (GMM) and the Hidden Markov Model (HMM), in software. A piece of speech data can be processed in 53.6 ms, which meets the real-time requirement. The proposed speaker verification system achieves an accuracy of 93.3%. The overall architecture consumes only 4.26 W on the Xilinx ZCU104. Moreover, the proposed MFCC chip was implemented in a TSMC 90 nm process; operating at 1 V and 200 MHz, it has a gate count of 276k and a power consumption of 41.15 mW.
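The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes diagonal-covariance GMMs trained with a few EM iterations, uses synthetic 2-D vectors in place of real MFCC frames, omits the HMM stage entirely, and the names `enroll`, `score`, `verify`, and `THRESHOLD` are hypothetical.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))

def component_log_likelihoods(x, gmm):
    """Weighted log-likelihood of frame x under each (weight, mean, var)."""
    return [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]

def enroll(frames, n_components=2, n_iter=20, var_floor=1e-3, seed=0):
    """Enrollment (assumed form): fit a diagonal GMM to the frames with EM."""
    rng = random.Random(seed)
    dim = len(frames[0])
    # Initialise means from randomly chosen frames, with unit variances.
    gmm = [(1.0 / n_components, list(m), [1.0] * dim)
           for m in rng.sample(frames, n_components)]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each frame.
        resp = []
        for x in frames:
            ll = component_log_likelihoods(x, gmm)
            norm = log_sum_exp(ll)
            resp.append([math.exp(l - norm) for l in ll])
        # M-step: re-estimate weights, means, and floored variances.
        new_gmm = []
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-10
            mean = [sum(r[k] * x[d] for r, x in zip(resp, frames)) / nk
                    for d in range(dim)]
            var = [max(sum(r[k] * (x[d] - mean[d]) ** 2
                           for r, x in zip(resp, frames)) / nk, var_floor)
                   for d in range(dim)]
            new_gmm.append((max(nk / len(frames), 1e-10), mean, var))
        gmm = new_gmm
    return gmm

def score(frames, gmm):
    """Average per-frame log-likelihood of an utterance under the model."""
    return sum(log_sum_exp(component_log_likelihoods(x, gmm))
               for x in frames) / len(frames)

def verify(frames, gmm, threshold):
    """Verification (assumed form): accept if the score reaches the threshold."""
    return score(frames, gmm) >= threshold

def synthetic_frames(rng, centers, n):
    """Stand-in for MFCC frames: unit-variance points around given centers."""
    return [[rng.gauss(c, 1.0) for c in rng.choice(centers)] for _ in range(n)]

rng = random.Random(42)
speaker_centers = [(0.0, 0.0), (3.0, 3.0)]
model = enroll(synthetic_frames(rng, speaker_centers, 200))
genuine = synthetic_frames(rng, speaker_centers, 50)
impostor = synthetic_frames(rng, [(10.0, 10.0)], 50)
THRESHOLD = -10.0  # hypothetical; in practice tuned on held-out data
print(verify(genuine, model, THRESHOLD), verify(impostor, model, THRESHOLD))
```

With these synthetic clusters, the genuine utterance is expected to score well above the impostor one. A real system would replace `synthetic_frames` with MFCC vectors from the feature-extraction module and choose the threshold from a trade-off between false acceptance and false rejection.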

Original language: English
Journal: Multimedia Tools and Applications
State: Accepted/In press - 2023

Keywords

  • FPGA
  • Gaussian Mixture Model
  • Hidden Markov Model
  • Mel-frequency cepstral coefficients
  • Speaker verification
