Maximum intelligibility-based close-loop speech synthesis framework for noisy environments
- Resource Type
- Conference
- Authors
- Liao, Yuan-Fu; Wu, Ming-Long; Lin, Jia-Chi
- Source
- 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 7997-8001, May 2013
- Subject
- Signal Processing and Analysis
High-temperature superconductors
Noise measurement
Speech
Speech synthesis
Hidden Markov models
Noise
Speech recognition
speech intelligibility
automatic speech recognition
minimum classification error
- Language
- ISSN
- 1520-6149
2379-190X
This paper proposes a maximum intelligibility (MI)-based closed-loop speech synthesis framework that actively compensates for the distortion caused by background noise. In this framework, an additional environmental noise-sensing microphone and an automatic speech recognition (ASR) module are used to approximate a subjective intelligibility measure. The hidden Markov model-based speech synthesis system (HTS) is then adjusted online using the MI-based model adaptation algorithm. Results of two subjective listening tests in noisy environments show that the proposed approach obtains 64% of the votes in an A/B preference test and helps participants reduce word dictation errors by a relative 26% compared to an HTS baseline.
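The feedback loop described in the abstract (sense the noise, score intelligibility with a machine proxy, adjust the synthesizer) can be illustrated with a toy sketch. This is not the paper's MI-based adaptation algorithm, which the abstract does not detail. The sketch assumes a single scalar synthesis control, a logistic function standing in for the ASR-derived score, and plain finite-difference gradient ascent. All names (`proxy_score`, `objective`, `adapt`) and all constants are hypothetical.

```python
import math


def proxy_score(param, noise_db):
    """Toy stand-in for an ASR-derived intelligibility score in [0, 1].

    `param` is a single hypothetical synthesis control (say, an emphasis
    in dB); `noise_db` is the level reported by the noise-sensing
    microphone. A real system would run an ASR module here instead.
    """
    snr_db = param - noise_db
    return 1.0 / (1.0 + math.exp(-0.3 * snr_db))


def objective(param, noise_db, penalty=0.002):
    # Reward the proxy score, but penalise drifting far from the
    # unadapted voice (param = 0) so the loop does not simply saturate.
    return proxy_score(param, noise_db) - penalty * param ** 2


def adapt(noise_db, param=0.0, lr=20.0, steps=200, h=1e-3):
    """Closed loop: score, estimate the gradient, update, repeat."""
    for _ in range(steps):
        # Central finite difference; assumed here because a real ASR
        # score would not come with an analytic gradient.
        grad = (objective(param + h, noise_db)
                - objective(param - h, noise_db)) / (2 * h)
        param += lr * grad
    return param


if __name__ == "__main__":
    for noise_db in (0.0, 5.0, 10.0):
        p = adapt(noise_db)
        print(f"noise={noise_db:4.1f} dB  adapted param={p:5.2f}  "
              f"proxy score={proxy_score(p, noise_db):.3f}")
```

In this toy setup, a higher sensed noise level drives the loop to a larger adapted control value, while the quadratic penalty keeps the result bounded. The actual framework adapts HMM synthesis model parameters rather than one scalar.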