NFFKD: A Knowledge Distillation Method Based on Normalized Feature Fusion Model
- Resource Type
- Conference
- Authors
- Wang, Zihan; Xie, Junwei; Yao, Zhiping; Kuang, Xu; Gao, Qinquan; Tong, Tong
- Source
- 2022 IEEE 5th International Conference on Big Data and Artificial Intelligence (BDAI), pp. 111-116, Jul. 2022
- Subject
- Computing and Processing; Knowledge engineering; Big Data; Benchmark testing; Robustness; knowledge distillation; deep learning; convolutional neural network; knowledge transfer
- Language
- English
- Abstract
The aim of Knowledge Distillation (KD) is to train lightweight student models under extra supervision from large teacher models. Most previous KD methods transfer feature information from the teacher to the student by connecting feature maps at corresponding layers. This paper proposes a novel multi-level knowledge distillation method, referred to as Normalized Feature Fusion Knowledge Distillation (NFFKD). The proposed model learns different levels of knowledge to improve network performance. A hierarchical mixed loss (HML) module minimizes the gap between the intermediate feature layers of the teacher and the student, and the teacher-student gap at the output is further reduced by normalizing the logits. Experimental results demonstrate that NFFKD outperforms several state-of-the-art KD methods on public datasets under different settings.
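The abstract only outlines the method, but its two named ingredients (a feature-level term over intermediate layers and a normalized-logit distillation term) can be sketched in PyTorch. The sketch below is an illustration under stated assumptions, not the paper's exact HML formulation: the function names (`nffkd_loss`, `normalized_logit_kd`, `feature_mse`), the per-sample logit standardization, the L2 feature normalization, and the weights `alpha`, `beta`, and temperature `T` are all hypothetical choices for this example.

```python
# Minimal sketch of a multi-level KD loss in the spirit of NFFKD.
# Assumptions (not from the paper): per-sample logit standardization,
# L2-normalized feature maps, and CE + alpha*KD + beta*feature weighting.
import torch
import torch.nn.functional as F

def normalized_logit_kd(student_logits, teacher_logits, T=4.0):
    """KL divergence on standardized logits (zero mean, unit variance per sample)."""
    def standardize(z):
        return (z - z.mean(dim=1, keepdim=True)) / (z.std(dim=1, keepdim=True) + 1e-6)
    s = F.log_softmax(standardize(student_logits) / T, dim=1)
    t = F.softmax(standardize(teacher_logits) / T, dim=1)
    return F.kl_div(s, t, reduction="batchmean") * T * T

def feature_mse(student_feats, teacher_feats):
    """MSE between L2-normalized intermediate feature maps, summed over levels.

    Assumes each student feature map has already been projected (e.g. by a
    1x1 conv adapter) to the same shape as the matching teacher feature map.
    """
    loss = 0.0
    for fs, ft in zip(student_feats, teacher_feats):
        fs = F.normalize(fs.flatten(1), dim=1)
        ft = F.normalize(ft.flatten(1), dim=1)
        loss = loss + F.mse_loss(fs, ft)
    return loss

def nffkd_loss(student_logits, teacher_logits, student_feats, teacher_feats,
               labels, alpha=1.0, beta=1.0):
    """Hierarchical mixed loss: task CE + normalized-logit KD + feature term."""
    ce = F.cross_entropy(student_logits, labels)
    kd = normalized_logit_kd(student_logits, teacher_logits)
    fm = feature_mse(student_feats, teacher_feats)
    return ce + alpha * kd + beta * fm
```

In a training loop, the teacher runs in `torch.no_grad()` mode and only the student's parameters receive gradients; the feature lists would be collected with forward hooks at the chosen intermediate layers.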