Lightweight neural network models are commonly designed for real-time scenarios to meet the requirement of fast processing. During deployment of the inference flow, lightweight models are frequently built on existing frameworks such as TensorFlow and OpenVINO. However, these frameworks are heavyweight and must construct a deep call stack from the program entry point to model execution, which incurs considerable overhead. As a result, inference speed cannot be effectively improved, especially under strict latency requirements. To address this problem, we propose a novel lightweight model deployment pipeline that enables efficient inference on hardware. Our method optimizes the primitives of executable operations to take full advantage of the hardware, and then assembles them into an executable graph, significantly reducing the time spent on stack calls. Experimental results demonstrate that our method outperforms TensorFlow and OpenVINO in inference speed and can be applied to the construction of lightweight models.
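To make the core idea concrete, the following is a minimal sketch (not the paper's actual implementation) of how a pre-built executable graph can replace per-inference framework dispatch: each node is bound ahead of time to an already-optimized primitive, so inference reduces to a flat loop over nodes rather than a call stack descending through framework layers. The names `ExecutableGraph` and `ExecNode`, and the placeholder kernels, are hypothetical.

```cpp
#include <cstdio>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical node of a pre-built executable graph: a closure bound at
// deployment time to a hardware-tuned primitive.
struct ExecNode {
    std::function<void()> kernel;
};

class ExecutableGraph {
public:
    void add(std::function<void()> kernel) {
        nodes_.push_back(ExecNode{std::move(kernel)});
    }

    // Run all primitives in their pre-determined order; no graph traversal
    // or dynamic dispatch through framework layers happens per inference.
    void run() const {
        for (const auto& node : nodes_) node.kernel();
    }

private:
    std::vector<ExecNode> nodes_;
};

int main() {
    ExecutableGraph graph;
    // Placeholder kernels standing in for optimized conv/activation primitives.
    graph.add([] { std::puts("conv2d (optimized primitive)"); });
    graph.add([] { std::puts("relu   (optimized primitive)"); });
    graph.run();  // a single flat pass replaces the deep framework call stack
}
```

Under this assumption, the cost of locating and invoking each operation is paid once when the graph is built, rather than on every inference call.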