Nonlinear multi-scale decomposition by EMD for Co-Channel speaker identification
- Resource Type
- Authors
- Amel Ben Slimane; Wajdi Ghezaiel; Ezzedine Ben Braiek
- Source
- Multimedia Tools and Applications. 76:20973-20988
- Subject
- Stationary process
Computer Networks and Communications
Computer science
Speech recognition
02 engineering and technology
Signal
Hilbert–Huang transform
030507 speech-language pathology & audiology
03 medical and health sciences
Wavelet
0202 electrical engineering, electronic engineering, information engineering
Media Technology
Voice activity detection
020208 electrical & electronic engineering
Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing)
Pattern recognition
Nonlinear system
Fourier transform
Computer Science::Sound
Hardware and Architecture
Artificial intelligence
0305 other medical science
Software
- Language
- ISSN
- 1573-7721
1380-7501
Empirical Mode Decomposition (EMD) is a multi-scale analysis method proposed for nonlinear and nonstationary data. Introduced by Huang et al. as an alternative to traditional Fourier and wavelet techniques, it decomposes a signal into several components called intrinsic mode functions (IMFs). This paper uses EMD to detect usable speech in co-channel speech. We apply EMD to decompose the co-channel speech signal into intrinsic oscillatory modes. The detected usable speech segments are organized into speaker streams, which are fed to a speaker identification system. The system is evaluated on co-channel speech across various Target-to-Interferer Ratios (TIR). The evaluation shows that EMD performs better than linear multi-scale decomposition by discrete wavelet transform for usable speech detection.
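To illustrate the decomposition step the abstract describes, the following is a minimal sifting sketch in Python. It is not the authors' implementation. The function names (`local_extrema`, `envelope`, `sift`, `emd`), the fixed count of 10 sifting iterations, the cap of 5 modes, and the piecewise-linear envelopes are all assumptions made to keep the example dependency-free; EMD as usually described fits cubic-spline envelopes and uses a convergence-based stopping rule.

```python
import math


def local_extrema(x):
    """Return index lists of interior local maxima and minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through x at idx (assumption: the usual
    choice is a cubic spline). Values are held constant past the ends."""
    env = []
    k = 0
    for i in range(len(x)):
        if i <= idx[0]:
            env.append(x[idx[0]])
        elif i >= idx[-1]:
            env.append(x[idx[-1]])
        else:
            while idx[k + 1] < i:
                k += 1
            a, b = idx[k], idx[k + 1]
            t = (i - a) / (b - a)
            env.append((1 - t) * x[a] + t * x[b])
    return env


def sift(x, n_iter=10):
    """Extract one candidate mode by repeatedly subtracting the mean of the
    upper and lower envelopes (n_iter is a hypothetical fixed stop rule)."""
    h = list(x)
    for _ in range(n_iter):
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
    return h


def emd(x, max_imfs=5):
    """Decompose x into a list of modes plus a residue."""
    residue = list(x)
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) < 2 or len(minima) < 2:
            break  # residue has too few oscillations left to sift
        imf = sift(residue)
        imfs.append(imf)
        residue = [r - c for r, c in zip(residue, imf)]
    return imfs, residue


# Usage: a synthetic two-tone signal standing in for a speech frame.
fs = 1000  # hypothetical sampling rate in Hz
signal = [math.sin(2 * math.pi * 50 * n / fs)
          + 0.5 * math.sin(2 * math.pi * 5 * n / fs) for n in range(fs)]
modes, rest = emd(signal)
print("number of extracted modes:", len(modes))
```

By construction, the extracted modes and the residue sum back to the input signal, which is a convenient sanity check for any EMD implementation. How the resulting modes are scored to flag usable speech segments is specific to the paper and is not reproduced here.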