Recent research has extended computing-in-memory (CIM) to floating-point (FP) operations, enabling high-precision computation for complex edge tasks such as object detection and segmentation [1]–[3]. However, the ever-growing demands of edge intelligence escalate the need for higher throughput, better energy efficiency, and on-device updates, posing significant challenges for prior pre-aligning-based FP CIMs (Fig. 1). 1) A fundamental limitation lies in the INT mantissa multiply-accumulate (MAC): bit-parallel computation is fast but consumes significant area/energy due to its wide-bit-width multipliers and adder trees, so most designs adopt a bit-serial compute scheme. Bit-serial computation, however, requires multiple compute cycles, e.g., 8 cycles for a BF16 mantissa MAC, severely limiting throughput. 2) The exponent sorting and mantissa normalization required for FP/INT conversion in previous FP CIMs introduce a complex comparison tree and shifter, greatly increasing area/energy overhead. 3) Previous FP CIMs do not support on-device fine-tuning to adapt to environmental changes, resulting in accuracy loss in real-world applications.
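To make the first two overheads concrete, the following sketch (our illustration, not the circuitry of the cited designs; the function names `bf16_fields`, `prealigned_mac`, and `bitserial_mult` are hypothetical) models a pre-aligning BF16 dot product. The max-exponent search and alignment shifts correspond to the comparison tree and shifter of point 2), and the bit-serial mantissa multiply shows why point 1) costs 8 cycles for BF16's 8-bit significand (1 hidden bit + 7 stored bits):

```python
import struct

def bf16_fields(x):
    """Decompose a Python float into BF16-style (sign, biased exponent,
    8-bit significand). The significand is the 7 stored fraction bits
    with the hidden leading 1 restored (normal numbers assumed)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0] >> 16  # truncate FP32 -> BF16
    sign = (bits >> 15) & 0x1
    exp = (bits >> 7) & 0xFF
    man = (bits & 0x7F) | 0x80  # restore hidden bit
    return sign, exp, man

def bitserial_mult(m_in, m_w):
    """Bit-serial mantissa multiply: one weight bit per compute cycle,
    so BF16's 8-bit significand costs 8 cycles."""
    acc = 0
    for cycle in range(8):            # 8 cycles for 8 significand bits
        if (m_w >> cycle) & 1:
            acc += m_in << cycle      # shift-and-add partial product
    return acc

def prealigned_mac(xs, ws):
    """Pre-aligning FP MAC: find the max product exponent (comparison
    tree in hardware), right-shift the other mantissa products to align
    (shifter), then accumulate with an INT adder."""
    prods = []
    for x, w in zip(xs, ws):
        sx, ex, mx = bf16_fields(x)
        sw, ew, mw = bf16_fields(w)
        prods.append((sx ^ sw, ex + ew, bitserial_mult(mx, mw)))
    e_max = max(e for _, e, _ in prods)                         # exponent sorting
    acc = sum((-m if s else m) >> (e_max - e) for s, e, m in prods)
    return acc, e_max  # aligned INT mantissa sum + shared exponent
```

The returned pair decodes to `acc * 2**(e_max - 2*127 - 14)` (two exponent biases, two 7-bit fractions); e.g., `prealigned_mac([1.0, 2.0], [1.0, 1.0])` yields the dot product 3.0 after decoding.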