Model Adaptation for HMM-Based Speech Synthesis under Minimum Generation Error Criterion
- Resource Type
- Conference
- Authors
- Qin, Long; Wu, Yi-Jian; Ling, Zhen-Hua; Wang, Ren-Hua
- Source
- 2008 Tenth IEEE International Symposium on Multimedia (ISM 2008), pp. 539-544, Dec. 2008
- Subject
- Computing and Processing
Communication, Networking and Broadcast Technologies
Adaptation model
Hidden Markov models
Speech synthesis
Maximum likelihood linear regression
Linear regression
Training data
Covariance matrix
Maximum likelihood estimation
Clustering algorithms
USA Councils
HMM-based speech synthesis
model adaptation
minimum generation error
- Language
To address the issues related to maximum likelihood (ML) based HMM training for HMM-based speech synthesis, a minimum generation error (MGE) criterion has been proposed. This paper extends the MGE criterion to model adaptation for HMM-based speech synthesis. We introduce an MGE linear regression (MGELR) based model adaptation algorithm, in which the transforms from source HMMs to target HMMs are optimized to minimize the generation errors on the adaptation data of the target speaker. The regression matrices for both the mean vectors and the covariance matrices of the Gaussian distributions are re-estimated. The proposed MGELR approach is compared with maximum likelihood linear regression (MLLR) based model adaptation. Experimental results indicate that the generation errors were reduced after MGELR-based model adaptation. In a subjective listening test, both the speaker similarity and the quality of the speech synthesized using MGELR were better than those obtained using MLLR.
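The core idea of the abstract, tuning a linear transform of the source model so that generated parameters move closer to the target speaker's data, can be illustrated with a toy sketch. The code below is not the authors' MGELR algorithm. It assumes a single global transform, a fixed state alignment, no dynamic features (so the "generated" frame is simply the adapted state mean), and it adapts means only, leaving covariances untouched. All function names, the learning rate, and the synthetic data are hypothetical choices made for illustration.

```python
import numpy as np


def extend(means):
    """Prepend a constant 1 to each mean vector: xi = [1, mu]."""
    return np.hstack([np.ones((means.shape[0], 1)), means])


def generation_error(W, source_means, alignment, target_frames):
    """Sum of squared distances between generated and target frames.

    Simplifying assumption: the generated frame at time t is the adapted
    mean W @ [1, mu] of the state aligned to t (no dynamic features).
    """
    generated = (extend(source_means) @ W.T)[alignment]
    return float(np.sum((generated - target_frames) ** 2))


def adapt_mean_transform(source_means, alignment, target_frames,
                         lr=0.05, n_iter=500):
    """Gradient descent on the transform W, starting from the identity."""
    dim = source_means.shape[1]
    W = np.hstack([np.zeros((dim, 1)), np.eye(dim)])  # W = [bias | A]
    xi_frames = extend(source_means)[alignment]       # one row per frame
    for _ in range(n_iter):
        residual = xi_frames @ W.T - target_frames
        # Gradient of the per-frame average squared error with respect to W.
        grad = 2.0 * residual.T @ xi_frames / len(target_frames)
        W -= lr * grad
    return W


# Synthetic stand-in data (hypothetical): 4 source states, 2-dim features.
source_means = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [-1.0, 0.5]])
alignment = np.tile(np.arange(4), 50)                 # 200 frames
true_W = np.array([[0.5, 1.2, 0.0], [-0.3, 0.1, 0.9]])
target_frames = (extend(source_means) @ true_W.T)[alignment]

W_identity = np.hstack([np.zeros((2, 1)), np.eye(2)])
err_before = generation_error(W_identity, source_means, alignment, target_frames)
W = adapt_mean_transform(source_means, alignment, target_frames)
err_after = generation_error(W, source_means, alignment, target_frames)
print(f"generation error before: {err_before:.4f}, after: {err_after:.6f}")
```

In this reduced setting the objective is an ordinary least-squares problem, so gradient descent is used only to mirror the iterative, error-driven flavor of the approach. A faithful implementation would generate trajectories with dynamic-feature constraints and would also update the covariance transforms, as the abstract describes.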