Analysis and Correlation of Application I/O Performance and System-Wide I/O Activity
- Resource Type
- Conference
- Authors
- Madireddy, Sandeep; Balaprakash, Prasanna; Carns, Philip; Latham, Robert; Ross, Robert; Snyder, Shane; Wild, Stefan M.
- Source
- 2017 International Conference on Networking, Architecture, and Storage (NAS), pp. 1-10, Aug. 2017
- Subject
- Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Measurement
Correlation
Interference
Monitoring
Servers
Optimization
Tools
- Language
Storage resources in high-performance computing are shared across all user applications. Consequently, storage performance can vary markedly, depending not only on an application's workload but also on what other activity is concurrently running across the system. This variability in storage performance is directly reflected in overall execution time variability, thus confounding efforts to predict job performance for scheduling or capacity planning. I/O variability also complicates the seemingly straightforward process of performance measurement when evaluating application optimizations. In this work we present a methodology to measure I/O contention with more rigor than in prior work. We apply statistical techniques to gain insight from application-level statistics and storage-side logging. We examine different correlation metrics for relating system workload to job I/O performance and identify an effective and generally applicable metric for measuring job I/O performance. We further demonstrate that the system-wide monitoring granularity can directly affect the strength of correlation observed. Insufficient granularity and measurements can hide the correlations between application I/O performance and system-wide I/O activity.
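As a rough illustration of the kind of analysis the abstract describes, the sketch below relates per-job I/O performance to system-wide activity and shows how coarser monitoring intervals blur the load seen during a job's run. The abstract does not name the correlation metrics the authors examined; Pearson and Spearman coefficients are used here purely as assumed examples, and the helper names (`coarsen`, `load_during`) and the log layout (one system-load sample per fixed interval) are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def coarsen(samples, window):
    """Average consecutive samples to emulate a coarser monitoring interval."""
    out = []
    for i in range(0, len(samples), window):
        chunk = samples[i:i + window]
        out.append(sum(chunk) / len(chunk))
    return out


def load_during(samples, interval, start, end):
    """Mean system load over the samples overlapping the job window [start, end).

    Assumes samples[i] covers the time span [i * interval, (i + 1) * interval)
    and that the job window lies within the logged period.
    """
    first = int(start // interval)
    last = int(math.ceil(end / interval))
    chunk = samples[first:last]
    return sum(chunk) / len(chunk)
```

With one load sample per second, a job running from t=1 to t=2 during a burst in the hypothetical log `[0, 10, 0, 0]` sees `load_during(log, 1, 1, 2)` equal to 10.0. After `coarsen(log, 2)` the same window reports 5.0, because the burst is averaged with an idle second. Repeating this over many jobs and passing the per-job loads and measured job throughputs to `pearson` or `spearman` gives one way to compare correlation strength across monitoring granularities.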