In-memory computing (IMC) has been proposed to address compute-intensive, data-driven AI workloads, using either SRAM or emerging memory technologies such as PCM, RRAM, and MRAM, each offering different trade-offs when integrated as a computing device at the system level. A notable distinction is between digital and analog IMC. The latter uses resistive or capacitive charge-sharing techniques to maximize row parallelism, but at the expense of computational inaccuracy and accumulation-resolution loss caused by device variations across PVT corners and by the limited SNR and dynamic range of the ADC/readout circuits. Most analog SRAM IMC solutions rely on large logic bit cells and aggressive ADC/readout bitwidth reduction, leading to low memory density and computing inaccuracy. These drawbacks severely limit deployment where functional safety, low-cost testing, and system scalability to general-purpose workloads are required. In contrast, the deterministic behavior of digital IMC and its compatibility with pushed technology-scaling rules offer a fast path toward the next generation of neural processing systems. However, integrating IMC into a Neural Processing Unit (NPU) must preserve a mix of computing capabilities while delivering a substantial improvement in power and cost efficiency.

In this work, we present the architecture of a scalable, design-time-parametric NPU for edge AI based on digital SRAM IMC (DIMC). It uses 8T standard bitcells integrated into IMC tiles supporting 1, 2, and 4b operation (a version with 8b support is under development), instantiated in multiple clusters alongside digital logic, and driven by a custom graph compiler that optimizes tensor slicing. The system achieves an end-to-end, system-level energy efficiency ranging from 40 to 31 TOPS/W in 18nm FD-SOI.
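The accumulation-resolution loss attributed to analog IMC above can be illustrated numerically. The following is a minimal sketch, not a circuit model: the row count, activation/weight widths, and ADC bitwidth are illustrative assumptions, and device variation is ignored so that only the readout quantization effect remains.

```python
# Hypothetical sketch: an analog bitline accumulates many 1b products at
# once, but the readout ADC collapses the ideal sum onto far fewer levels,
# whereas digital IMC computes the same sum bit-exactly.
import numpy as np

rng = np.random.default_rng(0)

rows = 256                       # rows accumulated in parallel (assumed)
x = rng.integers(0, 2, rows)     # 1b activations
w = rng.integers(0, 2, rows)     # 1b weights

# Digital IMC: bit-exact popcount; representing 0..256 needs 9 bits.
ideal = int(np.dot(x, w))

# Analog IMC with aggressive readout bitwidth reduction (assumed 5b ADC):
adc_bits = 5
levels = 2 ** adc_bits
code = round(ideal / rows * (levels - 1))   # quantized readout code
readout = code / (levels - 1) * rows        # value recovered from the code

print("ideal:", ideal, "readout:", round(readout, 2))
```

The quantization error is bounded by half an ADC step, here rows / (2 * (levels - 1)), i.e. about 4 counts out of 256; real analog arrays add PVT-dependent variation on top of this floor.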
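The role of the tensor-slicing graph compiler can be sketched in software terms. This is a toy functional model under assumed tile dimensions, not the actual compiler or hardware geometry: a weight matrix is partitioned to fit the design-time-parametric IMC tile shape, and per-tile partial sums are reduced digitally, which stays bit-exact as digital IMC guarantees.

```python
# Hypothetical sketch of compiler-side tensor slicing for a DIMC cluster:
# partition a matrix-vector product into tile-sized chunks and accumulate
# the partial sums digitally.
import numpy as np

TILE_ROWS, TILE_COLS = 64, 32    # assumed design-time tile parameters

def sliced_matvec(W, x):
    """Sum tile-level partial products, as clusters of IMC tiles would."""
    out = np.zeros(W.shape[0], dtype=np.int64)
    for r in range(0, W.shape[0], TILE_ROWS):
        for c in range(0, W.shape[1], TILE_COLS):
            tile = W[r:r + TILE_ROWS, c:c + TILE_COLS]
            out[r:r + TILE_ROWS] += tile @ x[c:c + TILE_COLS]
    return out

rng = np.random.default_rng(1)
W = rng.integers(-8, 8, (128, 96))   # 4b signed weights
x = rng.integers(0, 16, 96)          # 4b activations

# Deterministic digital accumulation: sliced result matches the full product.
assert np.array_equal(sliced_matvec(W, x), W @ x)
```

Because every partial sum is exact, slicing order does not affect the result; this determinism is what the abstract contrasts with analog accumulation.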