Advanced deep-learning techniques can generate natural-sounding synthetic voices that closely resemble a particular person's voice. However, the misuse of such technologies raises serious concerns. Researchers have therefore focused on detecting these malicious synthetic voices, known as "deepfake speech." Although many feature-extraction and classification methods have been proposed, deepfake detection accuracy remains unreliable. In addition, most current features are computed in the frequency domain. To this end, we conducted experiments to investigate the contribution of two acoustic features to the detection of deepfake speech signals. These acoustic features are timbre and shimmer, which represent auditory perception in the time domain. We show that eight timbre components and four shimmer components contribute significantly to discriminating deepfake speech from genuine speech. We also propose a method for detecting deepfake speech based on these timbre and shimmer features. The method was evaluated on a dataset from the Audio Deep Synthesis Detection Challenge (ADD 2022). The results suggest that combining these eight timbre components and four shimmer components with a simple multilayer perceptron classifier can detect deepfake speech effectively.
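
The pipeline described above can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the feature values are random placeholders standing in for the 8 timbre and 4 shimmer components (which in practice would be extracted from audio with an acoustic-analysis tool), and the MLP hyperparameters are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data: 200 utterances, each described by a
# 12-dimensional vector (8 timbre + 4 shimmer components).
# Real features would come from acoustic analysis of the waveforms.
n_genuine, n_fake = 100, 100
X_genuine = rng.normal(loc=0.0, scale=1.0, size=(n_genuine, 12))
X_fake = rng.normal(loc=1.0, scale=1.0, size=(n_fake, 12))
X = np.vstack([X_genuine, X_fake])
y = np.array([0] * n_genuine + [1] * n_fake)  # 0 = genuine, 1 = deepfake

# A simple multilayer perceptron classifier, as the abstract describes.
# Hidden-layer size and iteration count are assumed, not from the paper.
clf = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
)
clf.fit(X, y)
print(f"training accuracy: {clf.score(X, y):.2f}")
```

On real data, evaluation would of course use a held-out test split (e.g., the ADD 2022 evaluation set) rather than training accuracy.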