Any-to-Any Voice Conversion with F0 and Timbre Disentanglement and Novel Timbre Conditioning

Resource Type: Conference
Authors: Kovela, Sudheer; Valle, Rafael; Dantrey, Ambrish; Catanzaro, Bryan
Source: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1-5, Jun 2023
Subject: Bioengineering
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Signal processing
Real-time systems
Acoustics
Decoding
Timbre
Speech processing
Language
ISSN: 2379-190X

Online Access

Full Text (IEEE)

Abstract

Despite recent advances in voice conversion (VC), it is still challenging to do real-time one-shot voice conversion with good control over timbre and F0. In this work, we present a PPG-based VC model that directly decodes waveforms. We designed a speaker conditioned decoder based on HiFi-GAN [1], along with a new discriminator that produces high quality audio. Using an F0 prenet and F0 augmented speaker encoder, we are able to control F0 and timbre independently with high fidelity. Our objective and subjective evaluations show that our method is preferred over others in terms of audio quality, timbre similarity and prosody retention.
