A 1ynm 1.25V 8Gb, 16Gb/s/pin GDDR6-based Accelerator-in-Memory supporting 1TFLOPS MAC Operation and Various Activation Functions for Deep-Learning Applications

Resource Type: Conference
Authors: Lee, Seongju; Kim, Kyuyoung; Oh, Sanghoon; Park, Joonhong; Hong, Gimoon; Ka, Dongyoon; Hwang, Kyudong; Park, Jeongje; Kang, Kyeongpil; Kim, Jungyeon; Jeon, Junyeol; Kim, Nahsung; Kwon, Yongkee; Vladimir, Kornijcuk; Shin, Woojae; Won, Jongsoon; Lee, Minkyu; Joo, Hyunha; Choi, Haerang; Lee, Jaewook; Ko, Donguc; Jun, Younggun; Cho, Keewon; Kim, Ilwoong; Song, Choungki; Jeong, Chunseok; Kwon, Daehan; Jang, Jieun; Park, Il; Chun, Junhyun; Cho, Joohwan
Source: 2022 IEEE International Solid-State Circuits Conference (ISSCC), 65:1-3, Feb. 2022
Subject: Bioengineering; Components, Circuits, Devices and Systems; Computing and Processing; Engineering Profession; Costs; System performance; Conferences; Random access memory; Bandwidth; Throughput; Proposals
ISSN: 2376-8606

Abstract

With advances in deep-neural-network applications, increasingly large data movement through memory channels is becoming inevitable: RNN and MLP applications in particular are memory bound, with memory as the performance bottleneck [1]. DRAM featuring processing in memory (PIM) significantly reduces data movement [1]–[4], and system performance is enhanced by the large internal parallel bank bandwidth. Among DRAM-based PIM proposals, [3] is near commercialization, but the HBM technology it requires may prevent it from being applied to other applications because of its high cost [5]. In this situation, an accelerator-in-memory (AiM) based on GDDR6 may be applicable: it is relatively low cost, is compatible with the GDDR6 interface, and is designed to accelerate deep-learning (DL) applications. AiM offers a peak throughput of 1 TFLOPS using processing units (PUs) running at 1 GHz, exploiting the 16 Gb/s/pin speed of GDDR6. Its various activation functions also let it support many applications. This paper first looks at the AiM architecture and the supported command set for DL operations. Next, the DL operations in the PU and the supported activation functions are described. Finally, we present evaluation results of the DL behavior of AiM at the package and system levels.
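The relation between the 1 GHz PU clock and the 1 TFLOPS peak figure can be checked with a short back-of-the-envelope calculation. The sketch below is illustrative only. It assumes that one multiply-accumulate (MAC) counts as two floating-point operations and that every MAC lane completes one MAC per cycle; neither convention is stated in the abstract. The function names and the 32 x 16 split in the final lines are hypothetical placeholders, not figures taken from the abstract.

```python
def peak_flops(mac_lanes: int, clock_hz: float, flops_per_mac: int = 2) -> float:
    """Peak throughput if every MAC lane finishes one MAC per clock cycle.

    flops_per_mac=2 is an assumed convention (one multiply plus one add).
    """
    return mac_lanes * clock_hz * flops_per_mac


def lanes_needed(target_flops: float, clock_hz: float, flops_per_mac: int = 2) -> float:
    """Number of parallel MAC lanes required to reach target_flops."""
    return target_flops / (clock_hz * flops_per_mac)


CLOCK_HZ = 1e9       # PU speed stated in the abstract (1 GHz)
TARGET_FLOPS = 1e12  # peak throughput stated in the abstract (1 TFLOPS)

# Under the assumptions above, about 500 MACs must complete per cycle chip-wide.
required_lanes = lanes_needed(TARGET_FLOPS, CLOCK_HZ)
print(f"MAC lanes needed at 1 GHz: {required_lanes:.0f}")

# Hypothetical split, for illustration only: 32 PUs with 16 MAC lanes each.
HYPOTHETICAL_PUS = 32
HYPOTHETICAL_LANES_PER_PU = 16
example_peak = peak_flops(HYPOTHETICAL_PUS * HYPOTHETICAL_LANES_PER_PU, CLOCK_HZ)
print(f"Example peak: {example_peak / 1e12:.3f} TFLOPS")
```

Whatever the actual organization, the point of the arithmetic is that a 1 GHz clock reaches the TFLOPS range only with several hundred MAC lanes operating in parallel, which is what the internal parallel bank bandwidth mentioned in the abstract makes possible.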
