Energy-Aware Workload Allocation for Distributed Deep Neural Networks in Edge-Cloud Continuum
- Resource Type: Conference
- Authors: Jin, Yi; Xu, Jiawei; Huan, Yuxiang; Yan, Yulong; Zheng, Lirong; Zou, Zhuo
- Source: 2019 32nd IEEE International System-on-Chip Conference (SOCC), pp. 213-217, Sep. 2019
- Subject: Components, Circuits, Devices and Systems; Computing and Processing
- ISSN: 2164-1706
This paper presents an energy-aware workload-allocation framework for distributed deep neural networks (DNNs) in the Edge-Cloud continuum. In contrast to conventional approaches, in which inference runs entirely on a standalone device, the proposed computing-communication model distributes the computation of different DNN layers across levels of the Edge-Cloud network so as to minimize the energy cost per inference. The framework determines the optimal exit layer (EL), i.e., the layer at which the intermediate activations of the network are transmitted to the next level of the Edge-Cloud continuum. Case studies on AlexNet and VGG-16 are presented for a set of DNN processors and wireless interfaces. Using a GTX1080 GPU at 22.8 GOPS/W and WiFi at 10 nJ/bit transmission efficiency, the optimized energy consumption for AlexNet is estimated at 0.016 J when inference exits the edge at the EL2 (Conv1) layer; for VGG-16, the optimal exit is EL1, with a minimum inference cost of 0.0482 J.
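The exit-layer trade-off described in the abstract can be sketched numerically: computing more layers at the edge costs compute energy (ops divided by the processor's GOPS/W efficiency) but shrinks the activation that must be transmitted (bits times the radio's J/bit cost). The sketch below is a minimal model, not the paper's method; only the 22.8 GOPS/W and 10 nJ/bit figures come from the abstract, and the per-layer ops and activation sizes are illustrative placeholders, not the paper's AlexNet profile.

```python
# Minimal sketch of an exit-layer (EL) energy model.
# Only GOPS_PER_WATT and E_TX_J_PER_BIT are taken from the paper;
# the layer profile below is a made-up placeholder.

GOPS_PER_WATT = 22.8      # GTX1080 compute efficiency (from the paper)
E_TX_J_PER_BIT = 10e-9    # WiFi transmission energy (from the paper)

# (name, giga-ops to compute this layer, output activation size in bits)
layers = [
    ("Conv1", 0.21, 2.3e6),   # placeholder values
    ("Pool1", 0.01, 0.6e6),
    ("Conv2", 0.45, 1.2e6),
    ("FC",    0.06, 8.0e3),
]

def energy_at_exit(k):
    """Edge-side energy when exiting after layer k (0-based):
    compute layers 0..k locally, then transmit layer k's activations."""
    compute_j = sum(gops for _, gops, _ in layers[: k + 1]) / GOPS_PER_WATT
    tx_j = layers[k][2] * E_TX_J_PER_BIT
    return compute_j + tx_j

# The optimal exit layer minimizes total edge energy per inference.
best = min(range(len(layers)), key=energy_at_exit)
```

With this toy profile the minimum lands where a cheap layer sharply reduces the activation volume, mirroring the paper's finding that AlexNet exits early (EL2, Conv1) while VGG-16, whose early activations are large, exits at EL1.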