Performance analysis is crucial for optimizing the viability, efficiency, and scalability of high-performance computing (HPC) applications. It involves exploring the target application to identify the artifacts that limit scalability and efficiency, known as bottlenecks. Performance measurements, which observe the target's runtime behavior, are commonly used for this analysis. However, current tools and methods are often too simplistic, making performance analysis expensive and less effective. Analysts need simpler control over these tools to enable precise measurements and easier insights. Three main inspection techniques (sampling, pre-link instrumentation, and binary instrumentation) facilitate performance measurements. Each technique introduces measurement perturbation, or overhead, which can significantly exceed the original application's runtime. Users must balance this overhead against data coverage, which requires flexibility in what is measured and how often. While state-of-the-art sampling and binary instrumentation tools provide this flexibility, current pre-link instrumentation tools fall short. To address this gap, we introduce InstRO, a component-based toolbox for performance instrumentation that enhances flexibility and control and supports more effective performance analysis.
Christian M. Iwainsky Books
