Generative Speech Coding with Predictive Variance Regularization

Resource Type: Conference
Authors: Kleijn, W. Bastiaan; Storus, Andrew; Chinen, Michael; Denton, Tom; Lim, Felicia S. C.; Luebs, Alejandro; Skoglund, Jan; Yeh, Hengchin
Source: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6478-6482, Jun 2021
Subject: Bioengineering
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Speech codecs
Performance evaluation
Sensitivity
Acoustic distortion
Speech coding
Noise reduction
Mobile handsets
Speech
coding
WaveNet
regularization
Language
ISSN: 2379-190X

Abstract

The recent emergence of machine-learning based generative models for speech suggests a significant reduction in bit rate for speech codecs is possible. However, the performance of generative models deteriorates significantly with the distortions present in real-world input signals. We argue that this deterioration is due to the sensitivity of the maximum likelihood criterion to outliers and the ineffectiveness of modeling a sum of independent signals with a single autoregressive model. We introduce predictive-variance regularization to reduce the sensitivity to outliers, resulting in a significant increase in performance. We show that noise reduction to remove unwanted signals can significantly increase performance. We provide extensive subjective performance evaluations that show that our system based on generative modeling provides state-of-the-art coding performance at 3 kb/s for real-world speech signals at reasonable computational complexity.
