Regularized constrained maximum likelihood linear regression for speech recognition
- Resource Type
- Conference
- Authors
- Ghalehjegh, Sina Hamidi; Rose, Richard C.
- Source
- 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6319-6323, May 2014
- Subject
- Signal Processing and Analysis
Hidden Markov models
Manifolds
Speech
Vectors
Adaptation models
Training
Speech recognition
Graph embedding
Regularization
Speaker adaptation
Constrained MLLR
- Language
- ISSN
- 1520-6149
2379-190X
The use of a graph embedding framework is investigated as a regularization technique in the expectation-maximization (EM) algorithm applied to automatic speech recognition (ASR). The technique is motivated by the fact that graph embeddings of feature vectors have been shown to provide useful characterizations of the underlying manifolds on which these features lie. Incorporating intrinsic graphs that describe these manifolds into the optimization criteria for the EM algorithm constrains the solution space in a way that preserves the local structure of the data. Graph-embedding-based regularization is applied here to parameter estimation in constrained maximum likelihood linear regression (CMLLR) speaker adaptation for continuous density hidden Markov model (CDHMM) based ASR. CMLLR adaptation has been widely used as a maximum likelihood procedure for reducing the mismatch between a given HMM and utterances from an unknown speaker through a linear feature space transformation. However, there is no guarantee that CMLLR transformations will preserve the relationships among the feature vectors along the underlying manifold. It is argued here that graph-embedding-based regularization will preserve this structure. The impact of this approach on ASR performance is evaluated for unsupervised speaker adaptation on two large vocabulary speech corpora.
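
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: the abstract does not give the exact criterion, so everything below is an assumption made for illustration. The sketch assumes a k-nearest-neighbour Gaussian affinity matrix as the intrinsic graph, a single diagonal-covariance Gaussian in place of a full CDHMM with EM posteriors, and a penalty of the form tr(Y^T L Y), where L is the graph Laplacian and Y holds the transformed features y_i = A x_i + b. The names `knn_affinity`, `graph_penalty`, `cmllr_loglik`, `regularized_objective`, and the weight `lam` are hypothetical.

```python
import numpy as np

def knn_affinity(X, k=3, sigma=1.0):
    """Symmetric k-NN Gaussian affinity matrix over the rows of X (assumed graph)."""
    n = X.shape[0]
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)  # exclude each point from its own neighbour list
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2.0 * sigma ** 2))
    return np.maximum(W, W.T)  # symmetrize

def graph_penalty(X, A, b, W):
    """tr(Y^T L Y) = 0.5 * sum_ij W_ij ||y_i - y_j||^2 for y_i = A x_i + b."""
    Y = X @ A.T + b
    L = np.diag(W.sum(axis=1)) - W  # unnormalized graph Laplacian
    return np.trace(Y.T @ L @ Y)

def cmllr_loglik(X, A, b, mean, var):
    """Log-likelihood of transformed features under one diagonal Gaussian,
    including the n * log|det A| Jacobian term of a feature-space transform.
    A real CMLLR estimate would sum over HMM states weighted by posteriors."""
    Y = X @ A.T + b
    n = X.shape[0]
    _, logdet = np.linalg.slogdet(A)
    ll = -0.5 * np.sum((Y - mean) ** 2 / var + np.log(2.0 * np.pi * var))
    return ll + n * logdet

def regularized_objective(X, A, b, W, mean, var, lam=0.1):
    """Objective to maximize: likelihood minus a weighted graph penalty (assumed form)."""
    return cmllr_loglik(X, A, b, mean, var) - lam * graph_penalty(X, A, b, W)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))   # toy "feature vectors", not real speech data
W = knn_affinity(X, k=3)
A, b = np.eye(4), np.zeros(4)
print(regularized_objective(X, A, b, W, mean=np.zeros(4), var=np.ones(4)))
```

In this sketch the penalty grows when a transform pulls apart feature vectors that are neighbours in the original space, so a larger `lam` favours transforms that keep local neighbourhoods intact. The penalty does not depend on the bias `b`, because each row of the Laplacian sums to zero.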