Diagnosing the Interference on CPU-GPU Synchronization Caused by CPU Sharing in Multi-Tenant GPU Clouds

Resource Type: Conference
Authors: Elmougy, Youssef; Jia, Weiwei; Ding, Xiaoning; Shan, Jianchen
Source: 2021 IEEE International Performance, Computing, and Communications Conference (IPCCC), pp. 1-10, Oct. 2021
Subject: Communication, Networking and Broadcast Technologies; Computing and Processing; Degradation; Cloud computing; Runtime; Graphics processing units; Interference; Delays; Task analysis
Language
ISSN: 2374-9628

Online Access

Full Text (IEEE)

Abstract

The GPU-accelerated cloud, enabled by maturing GPU virtualization techniques, has become the most attractive platform for high-performance computing and machine learning workloads. However, it is notoriously challenging to build a multi-tenant GPU cloud in which resources such as CPUs and GPUs can be shared. One well-known and heavily studied reason is that workloads suffer from poor performance isolation and low GPU utilization when GPUs are shared. Little attention has been paid to another fundamental yet understudied problem: how does sharing CPUs among GPU instances affect workload performance? Targeting this problem, the paper conducts experiments to measure the performance slowdown and the decrease in vGPU utilization under interference from CPU sharing. The results show that GPU workloads suffer from poor and unpredictable performance and heavy vGPU under-utilization because of CPU sharing. We find that such interference results from the complex interplay between the characteristics of CPU-GPU interactions and a special behavior of shared vCPUs: vCPU discontinuity. To diagnose how vCPU discontinuity causes the interference, the paper leverages NVIDIA Nsight Systems for fine-grained profiling and reports the following findings: 1) vCPU discontinuity causes inefficient CPU-GPU synchronizations; 2) vCPU discontinuity delays task offloading to the vGPU; 3) polling-based CPU-GPU synchronization suffers more from interference than blocking-based CPU-GPU synchronization; 4) GPU workloads with frequent task offloads and synchronizations are more vulnerable. Based on these findings, the paper proposes a novel polling-then-blocking CPU-GPU synchronization primitive. Evaluation shows that it can improve performance by 4.2x.
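
The polling-then-blocking idea named in the abstract can be illustrated with a generic spin-then-block wait. The sketch below is not the paper's implementation, whose details the abstract does not give. It assumes a `threading.Event` as a stand-in for a GPU completion signal, and the names `poll_then_block`, `run_demo`, and `spin_budget_s` are hypothetical, chosen only for illustration.

```python
import threading
import time


def poll_then_block(done: threading.Event, spin_budget_s: float = 0.0005) -> str:
    """Wait for `done`: spin for a bounded time, then fall back to blocking.

    Returns "polled" if completion was observed while spinning,
    otherwise "blocked". `done` is an assumed stand-in for a GPU event.
    """
    deadline = time.perf_counter() + spin_budget_s
    # Phase 1: busy-poll, which keeps latency low when the task ends soon.
    while time.perf_counter() < deadline:
        if done.is_set():
            return "polled"
    # Phase 2: block, which gives up the CPU instead of spinning through
    # a time slice that a shared vCPU may lose at any moment.
    done.wait()
    return "blocked"


def run_demo(task_s: float, spin_budget_s: float) -> str:
    """Simulate a task of length `task_s` and report how the wait ended."""
    done = threading.Event()
    # The timer plays the role of the device signalling completion.
    timer = threading.Timer(task_s, done.set)
    timer.start()
    try:
        return poll_then_block(done, spin_budget_s)
    finally:
        timer.join()
```

With a short simulated task the wait ends in the polling phase, and with a task longer than the spin budget it ends in the blocking phase. The spin budget is the tuning knob: a larger value favors latency, a smaller one favors yielding the CPU, and the value used here is arbitrary.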
