Automated Timer Generation for Empirical Tuning
The performance of real-world applications often depends critically on a few computationally intensive routines that are invoked numerous times by the application, contain a significant number of loop iterations, or both. It is desirable to study the performance of these critical routines in isolation, particularly in the context of automatic performance tuning, where a routine of interest is repeatedly transformed using varying optimization strategies and the performance of the resulting implementations is gathered to guide further optimization. This paper presents a framework for automatically generating timing drivers that measure the performance of critical routines independently of their original applications. We show that the timing drivers accurately replicate the performance that the routines exhibit when invoked directly within whole applications, thereby allowing critical routines to be correctly optimized in the context of empirical performance tuning.