Data Augmentation with ECAPA-TDNN Architecture for Automatic Speaker Recognition

Resource Type: Conference
Authors: Li, Pinyan; Hoi, Lap Man; Wang, Yapeng; Im, Sio Kei
Source: 2023 12th International Conference on Renewable Energy Research and Applications (ICRERA), pp. 414-420, Aug. 2023
Subject: Components, Circuits, Devices and Systems
Computing and Processing
Engineering Profession
Power, Energy and Industry Applications
Renewable energy sources
Fuses
Neural networks
Training data
Data augmentation
Data models
Robustness
data augmentation
ECAPA-TDNN
automatic speaker recognition
Language
ISSN: 2572-6013

Online Access

Full Text (IEEE)

Abstract

This paper evaluates seven data augmentation methods with the Emphasized Channel Attention Propagation and Aggregation-Time Delay Neural Network (ECAPA-TDNN) model, using them to increase the diversity of the training data and thereby improve model accuracy and true positive rate (TPR, also called recall). We propose a method that improves classification performance by replacing and reducing the datasets. We also examine how depth affects classification performance by varying the number of SE-Res2Block layers in the ECAPA-TDNN model. The proposed method is validated on the ZhVoice and VoxCeleb datasets. The best model accuracy and classification performance are obtained with ZhVoice, all seven data augmentations, and a 3-layer SE-Res2Block: accuracy reaches 0.9477, TPR reaches 0.8945, and the equal error rate (EER) is 0.1278. We also use the diagonal cosine algorithm to measure the similarity between two speakers, which validates the classification performance of the model.
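The abstract reports TPR and EER and scores speaker pairs by cosine similarity, but it does not spell out how these are computed. The sketch below is one plausible reading, not the authors' implementation: it assumes plain cosine similarity between two fixed-length speaker embeddings (the paper's "diagonal cosine algorithm" is not described here and may differ), and it approximates EER by sweeping the observed scores as thresholds. The function names `cosine_similarity`, `tpr_and_far`, and `equal_error_rate` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero embeddings."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def tpr_and_far(scores: Sequence[float], labels: Sequence[bool],
                threshold: float) -> Tuple[float, float]:
    """Return (true positive rate, false acceptance rate) at a threshold.

    labels[i] is True when trial i compares two utterances of one speaker.
    A trial is accepted when its score is >= threshold.
    """
    pairs: List[Tuple[float, bool]] = list(zip(scores, labels))
    positives = sum(1 for _, same in pairs if same)
    negatives = len(pairs) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative trial")
    tp = sum(1 for s, same in pairs if same and s >= threshold)
    fp = sum(1 for s, same in pairs if not same and s >= threshold)
    return tp / positives, fp / negatives


def equal_error_rate(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Approximate EER: the error rate where false acceptance and false
    rejection are closest, searched over the observed scores only."""
    best_gap, best_eer = float("inf"), 1.0
    for threshold in sorted(set(scores)):
        tpr, far = tpr_and_far(scores, labels, threshold)
        frr = 1.0 - tpr  # false rejection rate
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2.0
    return best_eer
```

In use, each trial's score would be `cosine_similarity(emb_a, emb_b)` for two embeddings produced by the trained model. Production evaluations usually interpolate the ROC curve rather than searching observed scores, so this coarse sweep can differ slightly from a reported EER.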
