Multi‐microphone cross‐correlation based processing for robust speech recognition

Authors: Thomas M. Sullivan; Richard M. Stern
Source: The Journal of the Acoustical Society of America. 93:2319-2319
Subject: Adaptive filter; Signal processing; Acoustics and Ultrasonics; Arts and Humanities (miscellaneous); Noise (signal processing); Computer science; Microphone; Vowel; Speech recognition; Cepstrum; Speech processing; Binaural recording; Energy (signal processing); Language
ISSN: 0001-4966

Abstract

A new signal‐processing algorithm for robust speech recognition using multiple microphones is described. The algorithm, loosely based on human binaural perception, imposes time‐aligning delays on the speech signals from each microphone and passes the delayed speech through a bank of bandpass filters and nonlinear rectifiers. The outputs of the nonlinear rectifiers within each frequency band are cross‐correlated, providing an estimate of the spectral profile of short‐term energy in the speech signal that is resilient to the presence of off‐axis noise sources. A cepstral representation of these energy estimates is used as the feature set for automatic speech recognition using the CMU SPHINX system. The multichannel cross‐correlation‐based algorithm was found to preserve the shape of vowel spectra in additive noise, and it provides better recognition accuracy than is obtained using equivalent single‐channel processing with non‐close‐talking microphones. The performance of this system was compared to that obtained using delay‐and‐sum beamforming and conventional adaptive filtering approaches. Finally, some implications of these results for human binaural hearing are discussed. [Work supported by Motorola and DARPA.]
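
The processing chain outlined in the abstract (time alignment, bandpass filtering, rectification, per‐band cross‐correlation, cepstral features) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the filter design, the rectifier, the correlation lag, or the cepstral transform, so the two‐pole resonator, half‐wave rectifier, zero‐lag correlation averaged over microphone pairs, and DCT used here are all assumptions chosen for brevity. The function names (`delay`, `resonator`, `band_energies`, `cepstrum`) are hypothetical.

```python
import math

def delay(signal, n):
    """Delay a signal by n >= 0 samples, zero-padding the start and keeping the length."""
    return [0.0] * n + list(signal[:len(signal) - n])

def resonator(signal, center_hz, sample_rate, r=0.95):
    """Two-pole resonator, an assumed stand-in for one bandpass filter of the bank."""
    theta = 2 * math.pi * center_hz / sample_rate
    a1, a2 = 2 * r * math.cos(theta), -r * r
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = (1 - r) * x + a1 * y1 + a2 * y2
        out.append(y)
        y1, y2 = y, y1
    return out

def half_wave(signal):
    """Half-wave rectifier (assumed form of the nonlinear rectifier)."""
    return [max(0.0, s) for s in signal]

def band_energies(mics, delays, centers, sample_rate):
    """Per-band energy estimate from cross-correlating rectified microphone outputs."""
    aligned = [delay(m, d) for m, d in zip(mics, delays)]
    energies = []
    for fc in centers:
        rect = [half_wave(resonator(a, fc, sample_rate)) for a in aligned]
        total, pairs = 0.0, 0
        # Zero-lag cross-correlation, averaged over all microphone pairs (assumption).
        for i in range(len(rect)):
            for j in range(i + 1, len(rect)):
                total += sum(x * y for x, y in zip(rect[i], rect[j])) / len(rect[i])
                pairs += 1
        energies.append(total / pairs)
    return energies

def cepstrum(energies, n_coeffs):
    """Cepstral coefficients as a DCT-II of the log band energies (assumed transform)."""
    logs = [math.log(e + 1e-12) for e in energies]
    n = len(logs)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(logs))
            for k in range(n_coeffs)]

# Toy input: a 1 kHz tone that reaches the second microphone 8 samples later.
sample_rate = 16000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(800)]
mic1, mic2 = tone, delay(tone, 8)
centers = [1000, 3000]  # illustrative band centers in Hz

aligned = band_energies([mic1, mic2], [8, 0], centers, sample_rate)    # delays compensate the lag
unaligned = band_energies([mic1, mic2], [0, 0], centers, sample_rate)  # no compensation
features = cepstrum(aligned, 2)
```

In this toy setup the compensating delays make the two rectified channels coincide, so the cross‐correlation in the 1 kHz band is large; without them the channels are half a period apart and the product of their rectified outputs nearly vanishes. A signal arriving off‐axis would be misaligned by the steering delays in a similar way, which is one way to read the abstract's claim of resilience to off‐axis noise.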
