Speaker normalization for template based speech recognition

Authors: Dirk Van Compernolle; Sébastien Demange
Source: INTERSPEECH
Subject: Normalization (statistics)
Computer science
business.industry
Speech recognition
Word error rate
020206 networking & telecommunications
Pattern recognition
02 engineering and technology
Speaker recognition
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
030507 speech-language pathology & audiology
03 medical and health sciences
0202 electrical engineering, electronic engineering, information engineering
Template based
Artificial intelligence
0305 other medical science
Hidden Markov model
business
Vocal tract
Language: English

Online Access

Open Access (OpenAIRE)

Abstract

Vocal Tract Length Normalization (VTLN) has been shown to be an efficient speaker normalization tool for HMM-based systems. In this paper we show that it is equally efficient for a template-based recognition system. Template-based systems, while promising, have the potential drawback that templates retain all non-phonetic detail in addition to the essential phonemic properties; that is, they retain information about the speaker and the acoustic recording conditions. This may lead to very inefficient use of the database. We show that after VTLN significantly more speakers, including speakers of the opposite gender, contribute templates to the matching sequence than in the non-normalized case. In experiments on the Wall Street Journal database this leads to a relative word error rate reduction of 10%.
