In this paper, we focus on temperature-aware Monolithic 3D (Mono3D) deep neural network (DNN) inference accelerators for biomedical applications. We develop an optimizer that tunes the aspect ratios and footprint of the accelerator under user-defined performance and thermal constraints and generates near-optimal configurations. Using the proposed Mono3D optimizer, we demonstrate up to a 61% improvement in energy efficiency for biomedical applications relative to a performance-optimized accelerator.
Comment: This paper was accepted for presentation at the Design, Automation and Test in Europe Conference (DATE) 2022 workshop on "3D Integration: Heterogeneous 3D Architectures and Sensors".