Kullback–Leibler Divergence Frequency Warping Scale for Acoustic Scene Classification Using Convolutional Neural Network

Resource Type: Conference
Authors: Yang, Yuhong; Zhang, Huiyu; Tu, Weiping; Ai, Haojun; Cai, Linjun; Hu, Ruimin; Xiang, Fei
Source: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 840-844, May 2019
Subject: Bioengineering
Communication, Networking and Broadcast Technologies
Signal Processing and Analysis
Spectrogram
Filter banks
Acoustics
Two dimensional displays
High frequency
Support vector machines
Frequency measurement
KL divergence
Frequency Warping
Acoustic Scene Classification
CNN
Conditional-GAN
ISSN: 2379-190X

Abstract

Most of the current best-performing Acoustic Scene Classification (ASC) systems use Mel-scale spectrograms with Convolutional Neural Networks (CNNs). The Mel scale is a common way to model the frequency warping of the human ear, with frequency resolution that strictly decreases from low to high frequencies. However, we find that for some acoustic scenes, such as travelling by bus, tram, or train, the significant frequency bins are located in the mid-to-high frequency range. In this paper, we show that a better frequency warping scale for ASC can be learned automatically from raw spectrograms using a Kullback-Leibler (KL) divergence scale. Our method, KL-scale spectrograms with a CNN, is evaluated on two public ASC datasets. The results show that it outperforms the Mel-scale method on both datasets. In addition, we employ a Conditional Generative Adversarial Nets (Conditional-GAN) model for data augmentation to mitigate overfitting and allow further improvements in ASC.
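The abstract contrasts the fixed Mel warping with a warping driven by KL divergence but does not say how the KL scale is constructed. The sketch below is therefore only an illustration under stated assumptions, not the authors' method. The Mel conversion uses the widely used formula 2595 * log10(1 + f / 700). The function `warped_band_edges` and the toy per-bin scores are hypothetical: they assume each FFT bin has already been given a discriminability score (for example, a KL divergence between two scenes' energy histograms in that bin) and that band edges are placed so every band holds an equal share of the total score.

```python
import math


def hz_to_mel(f_hz):
    """Widely used Mel formula: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_bands, f_min, f_max):
    """Band edges in Hz that are equally spaced on the Mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / n_bands
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 1)]


def kl_divergence(p, q, eps=1e-12):
    """Discrete KL(p || q) for two histograms that each sum to 1."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def warped_band_edges(bin_scores, n_bands):
    """Hypothetical data-driven warping (an assumption, not the paper's
    algorithm): split FFT bins into bands of roughly equal cumulative
    score, so high-score regions receive narrower bands."""
    target = sum(bin_scores) / n_bands
    edges = [0]
    acc = 0.0
    for i, score in enumerate(bin_scores):
        acc += score
        # Close the current band once its share of the total is reached.
        if len(edges) < n_bands and acc >= target * len(edges) - 1e-9:
            edges.append(i + 1)
    edges.append(len(bin_scores))
    return edges


# Mel bands widen monotonically toward high frequencies.
mel_edges = mel_band_edges(4, 0.0, 8000.0)
mel_widths = [b - a for a, b in zip(mel_edges, mel_edges[1:])]

# Toy score for one bin: KL divergence between two made-up energy histograms.
toy_score = kl_divergence([0.7, 0.2, 0.1], [0.2, 0.3, 0.5])

# Toy per-bin scores in which the upper half of the spectrum is more
# discriminative (made-up values, for illustration only).
scores = [1.0] * 8 + [3.0] * 8
kl_edges = warped_band_edges(scores, 4)
print(kl_edges)  # [0, 8, 11, 14, 16]
```

With these toy scores the first band spans eight bins while the remaining three bands share the upper eight, which is the opposite of the Mel layout, where bands only grow wider with frequency.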
