Academic Paper

TAIT: One-Shot Full-Integer Lightweight DNN Quantization via Tunable Activation Imbalance Transfer

Resource Type: Conference
Authors: Jiang, Weixiong; Yu, Heng; Liu, Xinzhe; Sun, Hao; Li, Rui; Ha, Yajun
Source: 2021 58th ACM/IEEE Design Automation Conference (DAC), pp. 1027-1032, Dec. 2021
Subject: Components, Circuits, Devices and Systems; Power, Energy and Industry Applications; Deep learning; Quantization (signal); Tensors; Design automation; Convolution; Neural networks; Prediction algorithms

Online Access

Full Text (IEEE)

Abstract

Both parameter quantization and depthwise convolution are essential measures for providing high-accuracy, lightweight, and resource-friendly solutions when deploying deep neural networks (DNNs) onto edge-AI devices. However, combining the two methodologies may lead to adverse effects: the combined approach suffers from either significant accuracy loss or long finetuning time. In addition, contemporary quantization methods are applied only selectively, to weight and activation values but not to bias and scaling factor values, which makes them less practical for ASIC/FPGA accelerators. To solve these issues, we propose a novel quantization framework that is optimized for depthwise convolution networks. We discover that the uniformity of the value range within a tensor can serve as a predictor of the tensor's quantization error. Guided by this predictor, we develop a mechanism called Tunable Activation Imbalance Transfer (TAIT), which tunes the value range uniformity between an activated feature map and the weights that follow it. Moreover, TAIT supports full-integer quantization. We demonstrate TAIT on SkyNet and deploy it on an FPGA. Compared to the state of the art, our quantization framework and system design achieve improvements of 2.2%+ in IoU, $2.4 \times$ in speed, and $1.8 \times$ in energy efficiency, without requiring any finetuning.
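
The abstract does not state TAIT's formulas, so the following is only a minimal Python sketch of the general idea it describes: a tensor whose channels have very different value ranges quantizes poorly with a single scale, and that imbalance can be shifted from an activation tensor onto the weights that consume it without changing the layer output. Everything named here is an assumption for illustration and not the authors' definition: the function names, the min/max ratio used as a uniformity score, the exponent `alpha` as the tuning knob, the 1x1 (pointwise) layer, and the made-up data. The sketch also assumes every activation channel has a nonzero range.

```python
def channel_ranges(channels):
    """Largest absolute value in each channel (channels: list of lists)."""
    return [max(abs(v) for v in ch) for ch in channels]


def range_uniformity(channels):
    """Illustrative score in (0, 1]: smallest channel range / largest."""
    ranges = channel_ranges(channels)
    return min(ranges) / max(ranges)


def quantize_per_tensor(channels, bits=8):
    """Symmetric uniform quantization using one scale for the whole tensor."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(channel_ranges(channels)) / qmax
    return [[round(v / scale) * scale for v in ch] for ch in channels]


def transfer_imbalance(acts, weights, alpha):
    """Move range imbalance from activations to the following weights.

    Channel c of the activations is divided by s_c = r_c ** alpha, where r_c
    is that channel's range, and the weights reading channel c are multiplied
    by s_c, so each product act * weight is unchanged (up to rounding).
    alpha = 0 changes nothing; alpha = 1 fully equalizes activation ranges.
    """
    scales = [r ** alpha for r in channel_ranges(acts)]
    new_acts = [[v / s for v in ch] for ch, s in zip(acts, scales)]
    new_weights = [[w * s for w in row] for row, s in zip(weights, scales)]
    return new_acts, new_weights


def pointwise(acts, weights):
    """1x1 convolution: out[i][j] = sum over c of acts[c][i] * weights[c][j]."""
    n_pos, n_out = len(acts[0]), len(weights[0])
    return [[sum(a[i] * w[j] for a, w in zip(acts, weights))
             for j in range(n_out)] for i in range(n_pos)]


# Made-up data: 3 channels x 3 positions, with very different channel ranges.
acts = [[0.1, 0.05, 0.08], [4.0, 2.5, 3.0], [40.0, 10.0, 25.0]]
weights = [[0.5, -0.2], [0.3, 0.1], [-0.4, 0.6]]  # 3 input x 2 output channels

for alpha in (0.0, 0.5, 1.0):
    a, w = transfer_imbalance(acts, weights, alpha)
    print(alpha, range_uniformity(a), range_uniformity(w))
```

With this toy data, 8-bit per-tensor quantization of the original activations rounds the entire smallest channel to zero, because the shared scale is set by the largest channel. After a full transfer (`alpha = 1.0`) the activation channels share the same range and survive quantization, while the weights become less uniform. An intermediate `alpha` trades one against the other, which is the kind of tuning the abstract alludes to.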
