Attention Mixup: An Accurate Mixup Scheme Based On Interpretable Attention Mechanism for Multi-Label Audio Classification
- Resource Type
- Conference
- Authors
- Liu, Wuyang; Ren, Yanzhen; Wang, Jingru
- Source
- ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1-5, Jun. 2023
- Subject
- Bioengineering
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Performance gain
Transformers
Acoustics
Task analysis
Speech processing
Spectrogram
Mixup
Sound Event Detection
Audio Classification
Data Augmentation
Transformer
- Language
- ISSN
- 2379-190X
Mixup has proved to be an efficient data augmentation method for audio classification tasks. The original mixup scheme directly mixes the waveforms of two random samples, which not only ignores the temporal distribution of the sound events but may also interfere with the original sound events in the other sample. This paper proposes Attention MixUp (AMU), which selects only those segments that contain sound events for mixup, rather than simply mixing the entire sample. AMU uses the attention maps of a pretrained audio classification Vision Transformer (ViT) to pick out the spectrogram patches that are useful for classification, and then selects the regions for mixup according to three different strategies. Experimental results show a remarkable improvement (+1.9 mAP) over state-of-the-art AudioSet classification methods with either a CNN or a ViT backbone. Further experiments show that AMU achieves this performance gain by improving accuracy on short events (0.1 s to 2 s) by an average of 6.8% while maintaining accuracy on longer events.
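To make the contrast in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. The first function is standard mixup (a convex combination of two samples and their label vectors). The second is a hypothetical attention-guided variant: the abstract does not describe AMU's three selection strategies, so the simple thresholding rule, the per-frame attention scores `attn2`, the `threshold` parameter, the label-mixing rule, and both function names are assumptions made for illustration. Signals are represented as plain lists of frame values to keep the example self-contained.

```python
def mixup(x1, y1, x2, y2, lam):
    """Standard mixup: mix every frame of two samples and their labels.

    x1, x2: equal-length lists of frame values; y1, y2: multi-hot label lists.
    lam: mixing weight in [0, 1] applied to the first sample.
    """
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


def attention_guided_mixup(x1, y1, x2, y2, attn2, lam, threshold=0.5):
    """Hypothetical variant: mix x2 into x1 only where x2 is 'attended'.

    attn2 holds one assumed attention score per frame of x2 (for example,
    derived from a pretrained model). Frames scoring below `threshold`
    leave x1 untouched. This thresholding rule is an illustrative
    assumption, not one of the strategies from the paper.
    """
    mask = [score >= threshold for score in attn2]
    x = [lam * a + (1.0 - lam) * b if keep else a
         for a, b, keep in zip(x1, x2, mask)]
    if any(mask):
        # Assumed label rule: same convex combination as standard mixup.
        y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    else:
        # Nothing from x2 was mixed in, so keep the first sample's labels.
        y = list(y1)
    return x, y, mask


if __name__ == "__main__":
    x1, y1 = [2.0, 2.0, 2.0, 2.0], [1.0, 0.0]
    x2, y2 = [0.0, 4.0, 8.0, 0.0], [0.0, 1.0]
    attn2 = [0.1, 0.9, 0.8, 0.2]  # made-up scores for illustration
    print(mixup(x1, y1, x2, y2, lam=0.5))
    print(attention_guided_mixup(x1, y1, x2, y2, attn2, lam=0.5))
```

In this toy input, standard mixup alters all four frames of the first sample, whereas the guided variant alters only the two frames whose assumed attention score passes the threshold, leaving the remaining frames of the first sample intact.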