Automatic evaluation of the quality of synthetic speech has the potential to serve as a cheaper and less time-consuming alternative to standard listening tests. In this paper, we present our contribution to this ongoing research: a system for automatically predicting the mean opinion score (MOS) given by human listeners. The system was developed specifically for the recent VoiceMOS Challenge. Following the success of fusion systems in similar challenges, our contribution is an ensemble that interpolates the outputs of seven different models: four different wav2vec models, a CNN-RNN model, QuartzNet, and the LDNet baseline. In the VoiceMOS Challenge, our system achieved the second-best utterance-level MSE of 0.171 and placed between 2nd and 8th among all 22 participating teams on the remaining evaluation metrics.
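
The interpolation idea can be illustrated with a minimal sketch: each component model emits a per-utterance MOS estimate, and combination weights are fitted to minimize utterance-level MSE against listener scores on held-out data. This is an assumption about the fusion scheme for illustration only; the synthetic data, the least-squares fitting, and all variable names here are hypothetical and not taken from the paper.

```python
import numpy as np

# Hypothetical setup: 7 component models, each giving a noisy MOS
# estimate per utterance (synthetic data for illustration).
rng = np.random.default_rng(0)
n_utts, n_models = 200, 7
true_mos = rng.uniform(1.0, 5.0, size=n_utts)           # listener MOS
preds = true_mos[:, None] + rng.normal(0.0, 0.3, size=(n_utts, n_models))

# Fit interpolation weights (plus a bias term) by least squares
# to minimize utterance-level MSE on this development set.
X = np.hstack([preds, np.ones((n_utts, 1))])
w, *_ = np.linalg.lstsq(X, true_mos, rcond=None)

fused = X @ w
mse_fused = np.mean((fused - true_mos) ** 2)
mse_single = np.mean((preds[:, 0] - true_mos) ** 2)
print(f"single-model MSE: {mse_single:.3f}, fused MSE: {mse_fused:.3f}")
```

Averaging several models with roughly independent errors reduces the error variance, which is why the fused MSE is lower than any single model's in this toy example.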