Abstract
This paper investigates a projection-based group delay scheme (PGDS) likelihood measure that significantly reduces the impact of noise contamination on speech recognition. Because the norm of the cepstral/GDS vector shrinks when speech signals are corrupted by additive noise, the HMM parameters, namely the mean vector and the covariance matrix, need to be further modified. In this paper, mean vector compensation, a covariance matrix adaptation function, and state duration modeling, all based on the projection-based group delay scheme, are incorporated into a semi-continuous HMM to improve the recognition rate in noisy environments. The proposed approach compensates the mean vector using a projection-based scale factor and a mean compensation bias, and fits the covariance matrix using a variance adaptive function. The bias and the variance adaptive function, estimated from the training and/or testing data, are used to reduce the mismatch between different environments. Finally, a state duration method is used to address the erroneous path segmentation that additive noise causes in Viterbi decoding. Experiments show that the PGDS presented herein markedly improves recognition performance in noisy environments.
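The mean and variance compensation described above can be illustrated with a minimal sketch. It is not the paper's formulation: the abstract does not give the equations, so the code assumes a diagonal covariance, a least-squares projection scale factor computed after subtracting the bias, and a single scalar `var_scale` standing in for the variance adaptive function. All function and parameter names are hypothetical.

```python
import math


def projection_scale(x, mu, var, bias):
    """Scale factor that best fits (x - bias) to mu in the variance-weighted
    least-squares sense. Assumed form; undefined when mu is all zeros."""
    num = sum((xi - bi) * mi / vi for xi, mi, vi, bi in zip(x, mu, var, bias))
    den = sum(mi * mi / vi for mi, vi in zip(mu, var))
    return num / den


def log_likelihood(x, mean, var):
    """Log-density of a Gaussian with diagonal covariance."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def compensated_log_likelihood(x, mu, var, bias, var_scale=1.0):
    """Score x against a model whose mean is rescaled and shifted and whose
    variance is adapted. var_scale is a hypothetical stand-in for the
    variance adaptive function."""
    lam = projection_scale(x, mu, var, bias)
    mean = [lam * m + b for m, b in zip(mu, bias)]
    adapted_var = [var_scale * v for v in var]
    return log_likelihood(x, mean, adapted_var)


# Toy example: additive noise is modelled here as a pure shrinkage of the
# feature vector, so the observation is half the clean model mean.
clean_mean = [2.0, 4.0]
variance = [1.0, 1.0]
no_bias = [0.0, 0.0]
noisy_obs = [1.0, 2.0]

plain = log_likelihood(noisy_obs, clean_mean, variance)
compensated = compensated_log_likelihood(noisy_obs, clean_mean, variance, no_bias)
print(plain, compensated)
```

In this toy case the observation is an exact rescaling of the model mean, so the projection absorbs the whole mismatch and the compensated score exceeds the uncompensated one. With real features the bias and variance terms would carry the residual mismatch.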
Original language | English |
---|---|
Pages (from-to) | 611-626 |
Number of pages | 16 |
Journal | Signal Processing |
Volume | 83 |
Issue number | 3 |
State | Published - Mar 2003 |
Keywords
- Hidden Markov model
- Likelihood measure
- Model compensation
- Noise-resistant feature
- Projection-based group delay scheme