Fault-Tolerant Ensemble CNNs Increasing Diversity Based on Knowledge Distillation
- Resource Type
- Conference
- Authors
- Koeda, Shunsuke; Tomioka, Yoichi; Saito, Hiroshi
- Source
- 2023 IEEE 16th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC), pp. 399-405, Dec. 2023
- Subject
- Components, Circuits, Devices and Systems
- Computing and Processing
- Robotics and Control Systems
- Training
- Fault tolerance
- Power demand
- Computational modeling
- Fault tolerant systems
- Mission critical systems
- Hardware
- fault-tolerant
- ensemble learning
- convolutional neural network
- knowledge distillation
- Language
- English
- ISSN
- 2771-3075
False inferences by convolutional neural networks (CNNs) can lead to serious accidents in mission-critical artificial intelligence (AI) applications such as self-driving and medical systems. It is therefore important not only to achieve sufficient accuracy but also to detect hardware faults and continue making reliable inferences. Triple Modular Redundancy (TMR) is a conventional fault-tolerant method that takes a majority vote over the outputs of three identical modules. However, it triples the computational cost, increasing both power consumption and circuit area. In this paper, we propose a computationally low-cost fault-tolerant ensemble model consisting of multiple CNN models, together with a training approach that improves its accuracy. Each CNN model contains a shared part whose parameters are common to all CNN models. A fault in the hardware module running each CNN model can be detected by comparing the outputs of the shared parts, and the ensemble is then dynamically reconfigured by averaging the outputs of only the non-faulted CNN models. In the experiments, we demonstrate that the proposed method significantly reduces the computational cost to 43% while achieving comparable accuracy and robustness.
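The fault-detection and reconfiguration scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each hardware module exposes the output tensor of its shared part and of its model head, and that fault-free modules produce (numerically) identical shared-part outputs because the shared parameters are common to all models. The function name, the tolerance parameter, and the majority rule are assumptions for illustration.

```python
import numpy as np

def fault_tolerant_ensemble(shared_outputs, head_outputs, tol=1e-5):
    """Hypothetical sketch of the scheme in the abstract.

    Each module runs a CNN whose shared part has identical parameters,
    so fault-free modules should produce matching shared-part outputs.
    A module whose shared-part output disagrees with the majority is
    flagged as faulted, and the final prediction averages only the
    head outputs of the remaining (non-faulted) modules.
    """
    n = len(shared_outputs)
    # For each module, count how many modules (including itself) agree
    # with its shared-part output within the tolerance.
    agree = [
        sum(np.allclose(shared_outputs[i], shared_outputs[j], atol=tol)
            for j in range(n))
        for i in range(n)
    ]
    # Majority rule: a module agreeing with more than half of all
    # modules is considered healthy.
    healthy = [i for i in range(n) if agree[i] > n / 2]
    # Dynamic reconfiguration: average only the healthy heads.
    prediction = np.mean([head_outputs[i] for i in healthy], axis=0)
    return prediction, healthy
```

For example, with three modules where module 1 returns a corrupted shared-part output, the function would flag modules 0 and 2 as healthy and average only their head outputs.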