Currently, there are growing concerns about the impact of AI-generated fake speech and artificially manipulated audio files on people’s lives and property. However, public databases for detecting such tampering are very limited, and the performance of detection algorithms is language dependent: results obtained when verifying the authenticity of audio in other languages cannot fully determine whether they can be replicated on Chinese speech. Therefore, this article presents an acoustic Chinese database, named the YuanZe Mandarin Dataset (YZMD), created by recording different tampered audio samples with a microphone in a silent room to minimize external interference with the audio files. The recorded text was adapted from the hearing test for Taiwanese Mandarin in noise. The purpose of this database is to verify the effectiveness of audio-tampering detection and to make it available for future public use. We evaluated several algorithms, combining two features, CQCC (constant-Q cepstral coefficients) and MFCC (Mel-frequency cepstral coefficients), with two learning models, ResNet (deep residual network) and GMM (Gaussian mixture model). This article also compared the results against the database of the Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof). The experiments showed that YZMD exhibited trends similar to those of the public database: the CQCC feature was significantly superior to MFCC, owing to its higher dimensionality, and ResNet outperformed GMM. On both databases, combining CQCC with ResNet detected forged audio files better than every other feature-model combination.
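To illustrate the kind of cepstral front-end the compared systems rely on, the following is a minimal NumPy sketch of MFCC-style feature extraction (framing, Hamming windowing, a triangular mel filterbank, then a DCT-II to obtain cepstral coefficients). The function name, parameter values, and the synthetic sine-wave input are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

def mfcc_sketch(signal, sr=16000, n_fft=512, hop=256, n_mels=26, n_ceps=13):
    # Frame the signal and apply a Hamming window to each frame
    frames = np.array([signal[s:s + n_fft] * np.hamming(n_fft)
                       for s in range(0, len(signal) - n_fft + 1, hop)])
    # Power spectrum of each frame
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft

    # Triangular mel filterbank spanning 0 .. sr/2
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):           # rising slope
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):          # falling slope
            fbank[i - 1, k] = (right - k) / max(right - center, 1)

    # Log mel energies, then DCT-II to decorrelate -> cepstral coefficients
    logmel = np.log(power @ fbank.T + 1e-10)
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_mels))
    return logmel @ dct.T  # shape: (num_frames, n_ceps)

# Illustrative input: one second of a 440 Hz tone at 16 kHz
sr = 16000
t = np.arange(sr) / sr
feats = mfcc_sketch(np.sin(2 * np.pi * 440.0 * t), sr)
print(feats.shape)
```

A CQCC front-end differs mainly in replacing the FFT/mel stage with a constant-Q transform, whose geometrically spaced frequency bins yield the higher-dimensional representation the experiments found advantageous.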