DCHT: Deep Complex Hybrid Transformer for Speech Enhancement

Resource Type: Conference
Authors: Li, Jialu; Li, Junhui; Wang, Pu; Zhang, Youshan
Source: 2023 Third International Conference on Digital Data Processing (DDP), pp. 117-122, Nov. 2023
Subject: Computing and Processing
Measurement
Neural networks
Speech enhancement
Transformers
Data processing
Data models
Spectrogram
complex deep neural network
speech enhancement
hybrid transformer
Abstract

Most current deep learning-based approaches for speech enhancement operate only in either the spectrogram or the waveform domain. Although a cross-domain transformer combining waveform- and spectrogram-domain inputs has been proposed, its performance can be further improved. In this paper, we present a novel deep complex hybrid transformer that integrates both spectrogram- and waveform-domain approaches to improve the performance of speech enhancement. The proposed model consists of two parts: a complex Swin-Unet in the spectrogram domain and a dual-path transformer network (DPTnet) in the waveform domain. We first construct a complex Swin-Unet network in the spectrogram domain and perform speech enhancement in the complex audio spectrum. We then introduce an improved DPT by adding memory-compressed attention. Our model is capable of learning multi-domain features to reduce existing noise in different domains in a complementary way. The experimental results on the BirdSoundsDenoising dataset and the VCTK+DEMAND dataset indicate that our method can achieve better performance compared to state-of-the-art methods.
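The abstract names memory-compressed attention but does not describe it. As a rough illustration of the general idea only, the sketch below shortens the key and value sequences before ordinary scaled dot-product attention, so each query attends to fewer positions. It is not the authors' implementation: the function names, the compression factor of 3, and the use of mean pooling (a stand-in for the learned strided convolution usually used for this purpose) are all assumptions made for this example, and it uses plain Python lists to stay dependency-free.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def compress(seq, factor):
    """Mean-pool consecutive groups of `factor` vectors.

    Assumption: mean pooling replaces a learned strided convolution
    purely to keep this sketch small.
    """
    pooled = []
    for start in range(0, len(seq), factor):
        block = seq[start:start + factor]
        dim = len(block[0])
        pooled.append([sum(v[d] for v in block) / len(block) for d in range(dim)])
    return pooled


def memory_compressed_attention(queries, keys, values, factor=3):
    """Scaled dot-product attention over compressed keys and values."""
    k_c = compress(keys, factor)    # shorter key sequence
    v_c = compress(values, factor)  # shorter value sequence
    scale = 1.0 / math.sqrt(len(queries[0]))
    output = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in k_c]
        weights = softmax(scores)
        dim_v = len(v_c[0])
        output.append([sum(w * v[d] for w, v in zip(weights, v_c)) for d in range(dim_v)])
    return output


# Toy input: 6 frames with 2 features each (hypothetical data).
frames = [[float(t), 1.0] for t in range(6)]
out = memory_compressed_attention(frames, frames, frames, factor=3)
```

With six input frames and a factor of 3, each query attends over two compressed positions instead of six, which is where the memory saving comes from; the output still has one vector per query.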
