Source counting in speech mixtures using a variational EM approach for complex Watson mixture models
- Resource Type
- Conference
- Authors
- Drude, Lukas; Chinaev, Aleksej; Vu, Dang Hai Tran; Haeb-Umbach, Reinhold
- Source
- 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6834-6838, May 2014
- Subject
- Signal Processing and Analysis
Vectors
Speech
Equations
Mathematical model
Microphones
Noise
Computational modeling
Blind source separation
Bayes methods
Directional statistics
Number of speakers
- ISSN
- 1520-6149
2379-190X
In this contribution, we derive a variational EM (VEM) algorithm for model selection in complex Watson mixture models, which have recently been proposed as a model for the distribution of normalized microphone array signals in the short-time Fourier transform domain. The VEM algorithm is applied to count the number of active sources in a speech mixture by iteratively estimating the mode vectors of the Watson distributions and suppressing the signals from the corresponding directions. A key theoretical contribution is the derivation of the MMSE estimate of a quadratic form involving the mode vector of the Watson distribution. The experimental results demonstrate the effectiveness of the source counting approach at moderately low signal-to-noise ratios (SNR). It is further shown that the VEM algorithm is more robust with respect to the choice of threshold values.
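To make the "estimate a mode vector, then suppress that direction" idea concrete, the following is a minimal sketch and not the VEM algorithm of the paper. It assumes a single dominant source, relies on the fact that the complex Watson density is proportional to exp(kappa * |w^H z|^2) for a unit-norm observation z and mode vector w, and substitutes a plain principal-eigenvector estimate of the mode for the variational update. The function names (`normalize`, `estimate_mode`, `suppress`) and the synthetic data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def normalize(Y):
    """Scale each D-dimensional observation (one per row) to unit norm."""
    norms = np.linalg.norm(Y, axis=1, keepdims=True)
    return Y / np.maximum(norms, 1e-12)

def estimate_mode(Z):
    """Principal eigenvector of the scatter matrix R = (1/N) sum_n z_n z_n^H.

    Assumption: a simple eigenvector-based estimate for a concentrated
    (kappa > 0) Watson component, used here in place of the VEM update.
    """
    R = Z.T @ Z.conj() / Z.shape[0]
    _, eigvecs = np.linalg.eigh(R)  # eigenvalues are returned in ascending order
    return eigvecs[:, -1]

def suppress(Z, w):
    """Project every observation onto the orthogonal complement of w."""
    coeff = Z @ w.conj()            # w^H z_n for each row n
    return normalize(Z - np.outer(coeff, w))

# Synthetic example (hypothetical data): one source observed by D = 4 sensors.
rng = np.random.default_rng(0)
D, N = 4, 500
w_true = rng.standard_normal(D) + 1j * rng.standard_normal(D)
w_true /= np.linalg.norm(w_true)
s = rng.standard_normal(N) + 1j * rng.standard_normal(N)
noise = 0.05 * (rng.standard_normal((N, D)) + 1j * rng.standard_normal((N, D)))

Z = normalize(s[:, None] * w_true[None, :] + noise)
w_hat = estimate_mode(Z)
Z_residual = suppress(Z, w_hat)

# The mode vector is only defined up to a phase, so compare via |w_hat^H w_true|.
print("alignment with true direction:", abs(np.vdot(w_hat, w_true)))
print("largest residual component along w_hat:",
      np.max(np.abs(Z_residual @ w_hat.conj())))
```

In a source counting loop, the estimate-and-suppress step would be repeated on the residual, with a model selection criterion deciding when no further source is present; in the paper that decision is made by the VEM algorithm, which this sketch does not reproduce.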