Intelligent cache management techniques for reducing memory system waste
The performance gap between modern processors and memory is a primary concern in computer architecture. Caches mitigate the long memory latency that limits processor performance, but chip multiprocessor performance remains sensitive to last-level cache capacity and miss latency. Unfortunately, caches can be quite inefficient. On average, 86.2% of the blocks in a 2MB last-level cache are useless: they are dead, meaning they will not be referenced again before eviction. Dead blocks waste valuable cache space that could instead hold useful blocks that contribute to the hit rate and improve performance.
This dissertation explores inefficiencies in the memory system and proposes simple cache management techniques that reduce memory system waste and improve performance. We propose dead block cache management techniques that reduce dead time. We introduce a new dead block predictor that accurately identifies dead blocks by sampling only a small fraction of memory references. Because the predictor learns from only a few cache sets, it reduces predictor power and storage overhead. It also decouples prediction from the replacement policy, so it can improve performance even with an inexpensive random replacement policy. We also propose a cache management scheme that puts dead blocks to use: victim blocks are placed in blocks of the cache that are predicted dead, so that when a victim block is referenced again, it is found there. This "virtual victim cache" improves performance by avoiding misses. Finally, we propose a dynamic cache segmentation technique that reduces the dead time of dead-on-arrival blocks by attempting to keep the best number of non-referenced and referenced blocks in each cache set. Even with a default random replacement policy, dynamic cache segmentation can outperform LRU while using half the space overhead.
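The virtual victim cache idea described above can be illustrated with a small toy model. The sketch below is not the dissertation's design. It assumes a hypothetical partner-set mapping (set index XOR 1), uses LRU in the home set, and replaces the sampling dead block predictor with a simple timeout rule (a block counts as predicted dead once it has gone `dead_after` set accesses without a reference). All class, method, and parameter names are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Block:
    tag: int                 # full block address, kept whole for simplicity
    last_use: int            # set-local timestamp of the last reference
    is_victim: bool = False  # True if parked here after eviction elsewhere


@dataclass
class CacheSet:
    ways: int
    clock: int = 0           # counts accesses that map to this set
    blocks: List[Block] = field(default_factory=list)


class ToyVirtualVictimCache:
    """Toy set-associative cache that parks victims in predicted-dead blocks."""

    def __init__(self, num_sets: int = 4, ways: int = 2, dead_after: int = 4):
        assert num_sets % 2 == 0, "partner mapping (index XOR 1) needs an even set count"
        self.num_sets = num_sets
        self.dead_after = dead_after  # ASSUMPTION: timeout stand-in for a real predictor
        self.sets = [CacheSet(ways) for _ in range(num_sets)]
        self.hits = self.victim_hits = self.misses = 0

    def _partner(self, idx: int) -> int:
        return idx ^ 1  # ASSUMPTION: each set has one fixed partner set

    def _predicted_dead(self, s: CacheSet, b: Block) -> bool:
        return s.clock - b.last_use >= self.dead_after

    @staticmethod
    def _find(s: CacheSet, addr: int) -> Optional[Block]:
        return next((b for b in s.blocks if b.tag == addr), None)

    def _park(self, idx: int, tag: int) -> bool:
        """Place a victim in a free way or a predicted-dead block of set idx."""
        s = self.sets[idx]
        parked = Block(tag, s.clock, is_victim=True)
        if len(s.blocks) < s.ways:
            s.blocks.append(parked)
            return True
        for i, b in enumerate(s.blocks):
            if self._predicted_dead(s, b):
                s.blocks[i] = parked  # the dead block is dropped
                return True
        return False  # every block looks live, so the victim is discarded

    def _fill(self, idx: int, addr: int) -> None:
        home = self.sets[idx]
        if len(home.blocks) >= home.ways:
            victim = min(home.blocks, key=lambda b: b.last_use)  # LRU choice
            home.blocks.remove(victim)
            if not victim.is_victim:  # parked blocks are not parked a second time
                self._park(self._partner(idx), victim.tag)
        home.blocks.append(Block(addr, home.clock))

    def access(self, addr: int) -> str:
        idx = addr % self.num_sets
        home = self.sets[idx]
        home.clock += 1
        block = self._find(home, addr)
        if block is not None:
            block.last_use = home.clock
            self.hits += 1
            return "hit"
        partner = self.sets[self._partner(idx)]
        parked = self._find(partner, addr)
        if parked is not None:
            # Found in the partner set: move it home and avoid a full miss.
            partner.blocks.remove(parked)
            self.victim_hits += 1
            self._fill(idx, addr)
            return "victim_hit"
        self.misses += 1
        self._fill(idx, addr)
        return "miss"
```

With `num_sets=2` and `ways=2`, accessing addresses 0, 2, 4 and then 0 again evicts block 0 on the third access and parks it in the partner set, so the fourth access is served as a victim hit rather than a miss. A real design would also have to account for the cost of the second probe, which this toy ignores.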