Accelerating Neural Network Training with Processing-in-Memory GPU
- Resource Type
- Conference
- Authors
- Fei, Xiang; Han, Jianhui; Huang, Jianqiang; Zheng, Weimin; Zhang, Youhui
- Source
- 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), pp. 414-421, May 2022
- Subject
- Communication, Networking and Broadcast Technologies; Computing and Processing; Training; Energy consumption; Three-dimensional displays; Computational modeling; System performance; Neural networks; Graphics processing units; Hybrid Memory Cube; GPU; processing-in-memory; deep neural network
- Abstract
Processing-in-memory (PIM) architecture is promising for accelerating deep neural network (DNN) training because it enables low-latency, energy-efficient data movement between computation units and memory. This paper explores a novel GPU-PIM architecture for DNN training, in which the streaming multiprocessors of a GPU are integrated into the logic layer of a 3D memory stack, and multiple such stacks are connected to form a PIM network. Two corresponding optimization strategies are proposed. The first increases the computational parallelism of the data-parallel training mode by exploiting the large memory capacity, high bandwidth, and fast network transmission of GPU-PIM. The second further reduces communication overhead through optimized model-parallel training: a mapping scheme is proposed that decides the proper parallelization for each DNN layer on the proposed architecture. Experiments show that the proposed architecture outperforms the baseline GPU by 35.5% and 59.9%, and reduces energy consumption by 28.2% and 27.8%, on the two benchmarks evaluated.
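To give a feel for the kind of per-layer decision such a mapping scheme makes, the following is a minimal, illustrative sketch only: a toy cost model that picks data or model parallelism per layer by comparing estimated inter-stack traffic. The `Layer` fields, the cost formulas, and the example sizes are assumptions for illustration, not the authors' actual algorithm or numbers.

```python
# Toy per-layer parallelism chooser (illustrative assumption, not the paper's scheme).
# Data parallelism must synchronize gradients (traffic ~ weight size),
# while model parallelism must exchange activations between stacks.
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    weight_bytes: int      # parameter size, synced under data parallelism
    activation_bytes: int  # activation size, exchanged under model parallelism

def choose_parallelism(layer: Layer, num_stacks: int) -> str:
    """Pick the mode with lower estimated inter-stack traffic per step."""
    # Assumed cost model: gradient all-reduce moves ~2x the weights;
    # model parallelism moves a (num_stacks-1)/num_stacks share of activations.
    data_parallel_cost = 2 * layer.weight_bytes
    model_parallel_cost = layer.activation_bytes * (num_stacks - 1) / num_stacks
    return "data" if data_parallel_cost <= model_parallel_cost else "model"

# Hypothetical layer sizes: a small-weight/large-activation conv layer
# versus a large-weight/small-activation fully connected layer.
layers = [
    Layer("conv1", weight_bytes=37_632, activation_bytes=3_211_264),
    Layer("fc6", weight_bytes=151_000_000, activation_bytes=16_384),
]
plan = {l.name: choose_parallelism(l, num_stacks=4) for l in layers}
print(plan)  # conv layers favor data parallelism; large FC layers favor model parallelism
```

Under this toy model the plan comes out as `{"conv1": "data", "fc6": "model"}`, matching the common intuition that convolutional layers suit data parallelism while large fully connected layers suit model parallelism.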