Memory-based IQ-capping technique for multithreaded cores
Making optimal use of the resources shared among threads in widely deployed simultaneous multithreading (SMT) systems is a daunting challenge. Long-latency loads and stores can cause a thread to hold critical shared resources without making further progress. More critically, an imbalance in long-latency loads across threads can keep the other threads from reaching their best performance. To address this challenge, it is both necessary and feasible to balance memory instructions at the dispatch stage. In this study, we demonstrate that an SMT design can achieve a significant performance gain by implementing a memory-based fixed IQ-capping technique. For an IQ size of 32 entries, the average IPC improvement across eight 4-threaded workloads is 10.76% over the default dispatching algorithm. For the same IQ size with 8-threaded workloads, the average gain is 14.58%. The bottom line is that the highest average performance gain, regardless of the selected IQ size and workload type, comes with the lowest EAC cap value, because memory instructions are balanced at the dispatch stage. Moreover, the proposed technique incurs insignificant hardware overhead and can easily be coupled with other advanced techniques employed at other stages of the SMT pipeline for further potential benefit.
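To make the dispatch-stage idea concrete, the following is a minimal Python sketch of one plausible reading of a fixed memory-instruction cap on a shared IQ. It is an illustration under stated assumptions, not the implementation evaluated in this study. The names `Instr`, `CappedDispatcher`, `mem_cap`, and `dispatch_cycle` are hypothetical, as are the round-robin thread selection and in-order dispatch within each thread. The `mem_cap` parameter is only a stand-in for a fixed per-thread limit and is not claimed to match the EAC cap value mentioned above.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Instr:
    thread: int    # owning hardware thread
    is_mem: bool   # True for a load or store


@dataclass
class CappedDispatcher:
    iq_size: int   # total shared IQ entries
    mem_cap: int   # hypothetical fixed cap on memory instructions per thread
    iq: list = field(default_factory=list)

    def mem_in_iq(self, thread: int) -> int:
        """Count memory instructions of one thread currently in the IQ."""
        return sum(1 for i in self.iq if i.thread == thread and i.is_mem)

    def can_dispatch(self, instr: Instr) -> bool:
        """Reject when the IQ is full or the thread's memory cap is reached."""
        if len(self.iq) >= self.iq_size:
            return False
        if instr.is_mem and self.mem_in_iq(instr.thread) >= self.mem_cap:
            return False
        return True

    def dispatch_cycle(self, queues: list, width: int) -> int:
        """Dispatch up to `width` instructions, round-robin over threads.

        Assumption: dispatch is in order per thread, so a thread whose head
        instruction is blocked by the cap stalls until an entry frees up.
        """
        dispatched = 0
        progress = True
        while dispatched < width and progress:
            progress = False
            for q in queues:
                if dispatched >= width:
                    break
                if q and self.can_dispatch(q[0]):
                    self.iq.append(q.popleft())
                    dispatched += 1
                    progress = True
        return dispatched


# Thread 0 is memory-bound, thread 1 is compute-bound.
queues = [
    deque(Instr(0, True) for _ in range(6)),
    deque(Instr(1, False) for _ in range(6)),
]
capped = CappedDispatcher(iq_size=8, mem_cap=2)
n_dispatched = capped.dispatch_cycle(queues, width=8)
```

In this toy run the memory-bound thread is held to two IQ entries, and the remaining six entries go to the compute-bound thread instead of filling with instructions that may wait on long-latency memory accesses.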