Performance bound energy efficient cache organization for multi-core processors: A comparison of private and shared cache
Continuous scaling down of transistors and diminishing processor energy efficiency have led to the search for various power-saving methods. While performance and power efficiency are the major advantages of the multicore over the single-processor approach, performance challenges arise as the number of cores increases. One potential issue is the memory bandwidth bottleneck; in multicore processors it can be mitigated by distributing caches along with the processors within the cores (private caches). Another problem is that, as the number of cores increases, the average cache size per core decreases, resulting in higher miss rates. A shared cache, allocated to cores based on need, could overcome this problem. With these two problems in mind, this research explores the energy savings achievable by tuning private and shared caches.
The full potential of multicore processors can be harnessed only when the applications running on them exhibit parallelism. Today's applications and workloads have ample parallelism, and the emphasis on parallel programming is increasing. Hence, the performance and energy analysis is conducted with parallel workloads on the processors.
A slow cache with a high hit rate can yield the same or better speed-up than a fast cache with a low hit rate. This fact can be used to build cost-efficient multi-level cache hierarchies. The target applications must be known in order to maximize the performance gains from increased cache hit rates. Moreover, different applications require highly diverse cache configurations for optimal energy consumption in the memory hierarchy. Hence, various cache organizations are simulated and their performance and energy tradeoffs are studied for emerging workloads. Finally, the trend in performance and energy consumption for the optimized private and shared cache configurations with an increasing number of cores is analyzed.
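The slow-cache-versus-fast-cache claim can be made concrete with the standard average memory access time (AMAT) formula. The cycle counts, hit rates, and miss penalty in this sketch are hypothetical values chosen only to illustrate the tradeoff, not measurements from this study:

```python
# AMAT = hit time + miss rate * miss penalty (standard textbook formula).
# All parameter values below are illustrative assumptions.
def amat(hit_time_cycles, hit_rate, miss_penalty_cycles):
    """Average memory access time in cycles."""
    return hit_time_cycles + (1.0 - hit_rate) * miss_penalty_cycles

# A fast cache (1-cycle hit) with a low 80% hit rate...
fast_low_hit = amat(hit_time_cycles=1.0, hit_rate=0.80, miss_penalty_cycles=100.0)

# ...versus a slower cache (3-cycle hit) with a high 98% hit rate.
slow_high_hit = amat(hit_time_cycles=3.0, hit_rate=0.98, miss_penalty_cycles=100.0)

print(f"fast cache, 80% hits: {fast_low_hit:.1f} cycles")   # 21.0 cycles
print(f"slow cache, 98% hits: {slow_high_hit:.1f} cycles")  # 5.0 cycles
```

Despite tripling the hit time, the higher hit rate cuts the average access time by roughly a factor of four in this example, which is why tuning hit rates (e.g., via cache size or sharing policy) can dominate raw cache latency.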