In recent multi-speaker speech separation research, the overall deep-learning-based architecture consists of three parts: an encoder, a separator, and a decoder. However, improvement strategies generally focus only on the separator in the middle, with little attention paid to its input. The most common encoder structure at present is a single 1D convolutional layer followed by a nonlinear activation function, ReLU. In this paper, we first propose a new encoder, named Attention DE, that aims to improve the effectiveness of the separator's input. The new encoder adds extra 1D convolutional layers and a multi-head attention mechanism to enhance the feature aggregation ability of the input speech. Second, our separator uses SepFormer blocks instead of RNNs, which improves training efficiency and better captures the patterns of the speech sequence. Experiments show that Attention DE is generally applicable for improving the performance of time-domain single-channel speech separation models. Combining Attention DE with SepFormer blocks achieves a competitive SI-SNRi of 20.3 dB on WSJ0-2MIX. Code is publicly available at https://github.com/TAN-OpenLab/AttentionDE.
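The encoder structure described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the layer widths, kernel sizes, random weights, and identity query/key/value projections are all assumptions made for brevity, and a real encoder would use learned parameters (e.g. in PyTorch).

```python
import numpy as np

def conv1d(x, w, stride=1):
    # x: (in_ch, T), w: (out_ch, in_ch, k) -> (out_ch, T_out); valid convolution
    out_ch, in_ch, k = w.shape
    T_out = (x.shape[1] - k) // stride + 1
    y = np.zeros((out_ch, T_out))
    for t in range(T_out):
        seg = x[:, t * stride : t * stride + k]            # (in_ch, k)
        y[:, t] = np.tensordot(w, seg, axes=([1, 2], [0, 1]))
    return y

def multi_head_attention(x, n_heads):
    # x: (T, d); identity Q/K/V projections for brevity (a real model learns them)
    T, d = x.shape
    dh = d // n_heads
    heads = []
    for h in range(n_heads):
        q = k = v = x[:, h * dh : (h + 1) * dh]
        scores = q @ k.T / np.sqrt(dh)                     # scaled dot-product
        w = np.exp(scores - scores.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)                  # softmax over time
        heads.append(w @ v)
    return np.concatenate(heads, axis=1)                   # (T, d)

def attention_de_encoder(wav, n_filters=64, kernel=16, stride=8, n_heads=4):
    # Hypothetical sizes; the paper's actual hyperparameters may differ.
    rng = np.random.default_rng(0)
    x = wav[None, :]                                       # (1, T) mono waveform
    w1 = rng.standard_normal((n_filters, 1, kernel)) * 0.1
    feats = np.maximum(conv1d(x, w1, stride), 0.0)         # baseline: conv + ReLU
    w2 = rng.standard_normal((n_filters, n_filters, 3)) * 0.1
    feats = np.maximum(conv1d(feats, w2), 0.0)             # extra conv layer
    att = multi_head_attention(feats.T, n_heads)           # attention over frames
    return att.T                                           # (n_filters, frames)

enc = attention_de_encoder(np.random.default_rng(1).standard_normal(1024))
print(enc.shape)  # (64, 125): 64 learned filters over downsampled time frames
```

The baseline encoder corresponds to the first conv-plus-ReLU step alone; Attention DE's additional convolution and multi-head attention let each output frame aggregate information from the whole utterance rather than a single local window.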