Robust speech recognition based on binaural speech enhancement system as a preprocessing step
- Resource Type
- Authors
- Cuong Nguyen Quoc; Binh Nguyen Huu; Khoa Nguyen Dang; Dung Tran Tien
- Source
- SoICT
- Subject
- Speech enhancement
Masking (art)
Voice activity detection
Computer Science::Sound
Computer science
Speech recognition
Frame (networking)
ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION
k-means clustering
Preprocessor
Speech processing
Binaural recording
- Language
- Abstract
In this paper, we present a robust speech recognition system that uses binaural speech enhancement as a preprocessing step. The enhancement stage applies an existing dereverberation technique followed by a spatial masking-based noise removal algorithm, which uses a threshold angle to retain only signals arriving from the desired directions. While state-of-the-art approaches fix the threshold angle heuristically over all time frames, we propose an adaptive computation in which the threshold angle is first learned from several noise-only frames and then updated frame by frame. Speech recognition results in a real environment show the effectiveness of the proposed speech enhancement approach.
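The adaptive threshold idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract gives neither the learning rule nor the update rule, so everything below is an assumption made for illustration. In particular, the sketch assumes that the target speaker is at 0 degrees, that a per-bin arrival angle and a noise-only flag (for example from a voice activity detector) are already available for every frame, that the first `n_init` frames are noise-only, and that the threshold is a fixed fraction (`margin`) of the mean absolute noise angle, smoothed over time with a factor `alpha`. The names `frame_threshold`, `enhance`, `n_init`, `alpha`, and `margin` are hypothetical, and the dereverberation stage is omitted.

```python
from statistics import mean


def frame_threshold(abs_angles, margin=0.5):
    """Hypothetical rule: put the threshold at a fraction of the mean
    absolute arrival angle (degrees) seen in one noise-only frame."""
    return margin * mean(abs_angles)


def enhance(frames, is_noise, n_init=2, alpha=0.9, margin=0.5):
    """Apply a binary spatial mask with an adaptive threshold angle.

    frames   : list of frames; each frame is a list of
               (magnitude, angle_deg) pairs, one per frequency bin.
    is_noise : list of bools, True where a frame is noise-only
               (assumed to come from an external detector).
    Returns (masked_frames, thresholds), one entry per input frame.
    """
    # Step 1: learn the starting threshold from the first n_init frames,
    # which are assumed to contain noise only.
    theta = mean(
        frame_threshold([abs(a) for _, a in frame], margin)
        for frame in frames[:n_init]
    )

    masked_frames, thresholds = [], []
    for i, (frame, noise_only) in enumerate(zip(frames, is_noise)):
        # Step 2: update the threshold frame by frame. Only later
        # noise-only frames move it (assumed exponential smoothing).
        if noise_only and i >= n_init:
            target = frame_threshold([abs(a) for _, a in frame], margin)
            theta = alpha * theta + (1.0 - alpha) * target
        thresholds.append(theta)

        # Step 3: keep bins whose arrival angle lies within the threshold
        # of the target direction (0 degrees); zero out the rest.
        masked_frames.append(
            [mag if abs(angle) <= theta else 0.0 for mag, angle in frame]
        )
    return masked_frames, thresholds
```

Calling `enhance(frames, is_noise)` returns the masked magnitudes together with the threshold used for each frame, which makes it easy to inspect how the threshold drifts as new noise-only frames arrive. A fixed-threshold baseline, as described for earlier approaches, corresponds to setting `alpha=1.0`, so that the learned starting value is never updated.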