BBN Technologies' OpenSAD system
- Resource Type
- Conference
- Authors
- Novotney, Scott; Karakos, Damianos; Silovsky, Jan; Schwartz, Rich
- Source
- 2016 IEEE Spoken Language Technology Workshop (SLT). :8-12 Dec, 2016
- Subject
- Signal Processing and Analysis
Speech
NIST
Rats
Computer architecture
Training
Data models
Frequency modulation
speech activity detection
unsupervised adaptation
OpenSAD
- Language
- Abstract
We describe our submission to the NIST OpenSAD evaluation of speech activity detection (SAD) on noisy audio generated by the DARPA RATS program. Because this audio contains frequent transmission degradation, channel interference, and other added noise, simple energy thresholds perform poorly at SAD. The evaluation measured performance on both in-training and novel channels. Our approach combined feed-forward neural networks with bidirectional LSTM recurrent neural networks. System combination and unsupervised adaptation provided further gains on novel channels, which lack training data. Together, these techniques yielded a 26% relative improvement on novel channels over simple decoding. Our system achieved the lowest error rate on the in-training channels and the second-lowest on the out-of-training channels.
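To make concrete the kind of energy-threshold baseline the abstract describes as inadequate for this audio, here is a minimal sketch. It is not the authors' system or any NIST reference code. The frame length, hop size, threshold value, and the `tone` helper are all illustrative assumptions, and a pure tone stands in for both "speech" and channel interference.

```python
import math

def tone(amplitude, n, freq=440.0, rate=8000):
    """Hypothetical helper: n samples of a sine wave, used as stand-in audio."""
    return [amplitude * math.sin(2 * math.pi * freq * i / rate) for i in range(n)]

def frame_energies_db(samples, frame_len=160, hop=80):
    """Log-energy in dB of each frame (assumed 20 ms frames, 10 ms hop at 8 kHz)."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # Small floor avoids log10(0) on silent frames.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies

def energy_sad(samples, threshold_db=-30.0, frame_len=160, hop=80):
    """Mark a frame as speech when its energy exceeds a fixed threshold (assumed value)."""
    return [e > threshold_db for e in frame_energies_db(samples, frame_len, hop)]

# Clean case: near-silence followed by a loud segment is separated correctly.
clean_labels = energy_sad(tone(0.001, 800) + tone(0.5, 800))

# Degraded case: loud interference with no speech at all is still labeled speech,
# because the detector only looks at energy.
interference_labels = energy_sad(tone(0.5, 1600, freq=1000.0))
```

In the degraded case every frame is flagged as speech even though none is present, which is the failure mode that motivates learned detectors on channels with heavy interference.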