Full covariance state duration modeling for HMM-based speech synthesis
- Resource Type
- Conference
- Authors
- Heng Lu; Wu, Yi-Jian; Tokuda, Keiichi; Dai, Li-Rong; Wang, Ren-Hua
- Source
- 2009 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2009), pp. 4033-4036, Apr. 2009
- Subject
- Signal Processing and Analysis
Components, Circuits, Devices and Systems
Hidden Markov models
Speech synthesis
Covariance matrix
Context modeling
Gaussian distribution
Predictive models
Training data
Computer science
High temperature superconductors
Flowcharts
full covariance
duration
HMM
speech synthesis
- Language
- ISSN
- 1520-6149
2379-190X
This paper proposes a state duration modeling method that uses a full covariance matrix for HMM-based speech synthesis. In this method, the multi-dimensional Gaussian distribution that models the state durations of each context-dependent phoneme uses a full covariance matrix instead of the conventional diagonal covariance matrix. At the synthesis stage, the state durations are predicted from the clustered context-dependent distributions with full covariance matrices. Experimental results show that, when the speaking rate of the synthesized speech is changed, speech synthesized with the full-covariance state duration models sounds more natural than speech synthesized with the conventional method.
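As a rough illustration of why the covariance structure matters when the speaking rate is changed, the sketch below maximizes a Gaussian log-likelihood over the state durations d subject to a fixed total length T. Under that assumption the solution is d = m + rho * (Sigma 1), with rho = (T - sum(m)) / (1' Sigma 1). This is a generic constrained maximum-likelihood solution, not necessarily the exact formulation used in the paper. The function name `predict_durations` and all numeric values are hypothetical.

```python
def predict_durations(mean, cov, total):
    """Most likely durations under N(mean, cov) subject to sum(d) == total.

    Lagrangian solution: d = mean + rho * (cov @ 1),
    where rho = (total - sum(mean)) / (1^T cov 1).
    """
    row_sums = [sum(row) for row in cov]   # cov @ 1
    denom = sum(row_sums)                  # 1^T cov 1
    rho = (total - sum(mean)) / denom
    return [m + rho * s for m, s in zip(mean, row_sums)]


# Hypothetical 3-state duration model, in frames (values are made up).
mean = [4.0, 6.0, 5.0]
diag_cov = [[1.0, 0.0, 0.0],
            [0.0, 4.0, 0.0],
            [0.0, 0.0, 1.0]]
full_cov = [[1.0, 0.5, 0.0],
            [0.5, 4.0, 1.0],
            [0.0, 1.0, 1.0]]

# Slow the phoneme down from 15 frames (the sum of the means) to 21 frames.
slow_diag = predict_durations(mean, diag_cov, 21.0)
slow_full = predict_durations(mean, full_cov, 21.0)
print(slow_diag)
print(slow_full)
```

With a diagonal covariance, the extra frames are shared out in proportion to each state's own variance. With a full covariance, the off-diagonal terms also shift frames between correlated states. A practical system would additionally round the results to whole frames and guard against non-positive durations at fast speaking rates, which this sketch does not do.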