Deep Speaker Embeddings with Convolutional Neural Network on Supervector for Text-Independent Speaker Recognition
- Resource Type
- Conference
- Authors
- Cai, Danwei; Cai, Zexin; Li, Ming
- Source
- 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Nov. 2018, pp. 1478-1482
- Subject
- Bioengineering
- Communication, Networking and Broadcast Technologies
- Components, Circuits, Devices and Systems
- Signal Processing and Analysis
- Phonetics
- Mel frequency cepstral coefficient
- Decision trees
- Correlation
- Task analysis
- Indexes
- Speaker verification
- text-independent
- CNN
- supervector
- deep speaker embedding
- Language
- ISSN
- 2640-0103
Lexical content variability across utterances is the key challenge in text-independent speaker verification. In this paper, we investigate using a supervector, which can reduce the impact of lexical content mismatch among utterances, for supervised speaker embedding learning. A DNN acoustic model aligns each feature sequence to a set of senones and generates a supervector of centered and normalized first-order statistics. Statistics vectors from similar senones are placed together and reshaped into an image to maintain local continuity and correlation. The supervector image is then fed into a residual convolutional neural network. The deep speaker embeddings are the outputs of the last hidden layer of the network, and we employ a PLDA back-end for the subsequent modeling. Experimental results show that the proposed method outperforms the conventional GMM-UBM i-vector system and is complementary to the DNN-UBM i-vector system. The score-level fusion system achieves 1.26% EER and 0.260 DCF10 cost on the NIST SRE 10 extended core condition 5 task.
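The statistics computation described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes frame-level acoustic features, per-frame senone posteriors from a DNN acoustic model, per-senone means, and a precomputed permutation that groups similar senones; the function and argument names are hypothetical.

```python
import numpy as np

def supervector_image(feats, posteriors, senone_means, senone_order):
    """Centered, normalized first-order statistics, grouped by senone similarity.

    feats:        (T, D) frame-level acoustic features
    posteriors:   (T, C) per-frame senone posteriors from a DNN acoustic model
    senone_means: (C, D) per-senone feature means (assumed precomputed)
    senone_order: (C,)   permutation placing similar senones next to each other
    """
    # Zeroth-order statistics: soft frame counts per senone.
    n = posteriors.sum(axis=0)                                # (C,)
    # First-order statistics: posterior-weighted sums of frames.
    f = posteriors.T @ feats                                  # (C, D)
    # Center by the senone means and normalize by the soft counts.
    centered = f - n[:, None] * senone_means                  # (C, D)
    normalized = centered / np.maximum(n[:, None], 1e-8)      # avoid /0
    # Reorder rows so similar senones are adjacent; the (C, D) matrix
    # is then treated as a one-channel "image" for the residual CNN.
    return normalized[senone_order]
```

With `senone_means` set to zero the result reduces to the posterior-weighted average feature per senone, which makes the normalization step easy to sanity-check.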