Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos

Resource Type: Conference
Authors: Waibel, Alexander; Behr, Moritz; Yaman, Dogucan; Eyiokur, Fevziye Irem; Nguyen, Tuan-Nam; Mullov, Carlos; Demirtas, Mehmet Arif; Kantarci, Alperen; Constantin, Stefan; Ekenel, Hazim Kemal
Source: 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), pp. 1-5, Jun. 2023
Subject: Bioengineering; Communication, Networking and Broadcast Technologies; Computing and Processing; Signal Processing and Analysis; Adaptation models; Lips; Conferences; Signal processing; Speech; Acoustics; Synchronization; end-to-end video translation; speech translation; text-to-speech; voice conversion; lip generation

Abstract

In this paper, we propose a neural end-to-end system for voice-preserving, lip-synchronous video translation. The system combines multiple component models to produce a video of the original speaker speaking in the target language. The generated video is lip-synchronous with the target speech, yet retains the original speaker's emphases in speech, voice characteristics, and face video. The result is a video of a speaker speaking a language that the speaker does not actually know. For the evaluation, we present a user study of the complete system and separate evaluations of the individual components. Since no dataset is available for evaluating the whole system, we collect our own test set. The results indicate that our system is able to generate convincing videos of the original speaker speaking the target language while preserving the original speaker’s characteristics.
