After the success of transformer networks in natural language processing (NLP), the application of transformers to computer vision (CV) has followed suit, delivering unprecedented performance gains on vision tasks, including image recognition and object detection. Multihead self-attention (MHSA) is the key component of transformers, allowing the models to learn how much attention to pay to each input position. Despite its strong modeling capability, MHSA involves complex operations that make transformers prohibitively costly for hardware deployment. Existing acceleration efforts on conventional hardware platforms are challenged by the memory wall. Compute-in-memory (CIM) is a promising solution to the memory wall problem because it stores all model parameters on-chip in compute-capable memory arrays. The footprint of 2-D CIM designs must, however, expand to accommodate ever-larger model sizes. In this work, we present a heterogeneous 3-D integrated (H3D) accelerator targeting the MHSA workloads in vision transformers. H3D allows the proposed H3DAtten architecture to combine the merits of resistive random access memory (RRAM)-based analog CIM (ACIM) in 40 nm and static random access memory (SRAM)-based digital CIM (DCIM) in 16 nm. We perform comprehensive signaling and thermal analyses to examine the effects of 3-D stacking on the accelerator. Compared to iso-capacity 2-D baseline designs, the proposed 5-tier H3DAtten accelerator achieves $8.4\times $ higher compute density without accuracy loss on the ImageNet-1k dataset.
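To make the MHSA workload concrete, the sketch below implements scaled dot-product multihead self-attention in plain NumPy. This is a minimal illustrative reference of the standard attention computation, not the H3DAtten hardware mapping; the weight matrices, dimensions, and function name are assumptions for the example.

```python
import numpy as np

def multihead_attention(x, Wq, Wk, Wv, Wo, num_heads):
    """Illustrative scaled dot-product multihead self-attention.

    x: (seq_len, d_model) input tokens; Wq/Wk/Wv/Wo: (d_model, d_model)
    projection weights (hypothetical example parameters).
    """
    seq_len, d_model = x.shape
    d_head = d_model // num_heads
    # Project inputs to queries, keys, values, then split into heads:
    # (seq_len, d_model) -> (num_heads, seq_len, d_head).
    q = (x @ Wq).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    k = (x @ Wk).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    v = (x @ Wv).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    # Attention scores quantify how much each position attends to every other.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Weighted sum of values, merge heads, then output projection.
    out = (weights @ v).transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ Wo

rng = np.random.default_rng(0)
d_model, seq_len, heads = 64, 16, 4
Wq, Wk, Wv, Wo = (rng.standard_normal((d_model, d_model)) * 0.1
                  for _ in range(4))
x = rng.standard_normal((seq_len, d_model))
y = multihead_attention(x, Wq, Wk, Wv, Wo, heads)
print(y.shape)  # (16, 64)
```

The matrix-vector products dominating this kernel (the Q/K/V projections and the score/value multiplications) are exactly the operations a CIM array can execute in place, which is why MHSA is the target workload for the proposed accelerator.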