NVM-Based Computation and Storage Frameworks for Emerging Applications
Conventional DRAM-based memories are difficult to scale to large capacities. Computer architects have therefore turned to emerging nonvolatile memory (NVM) technologies, such as spin-transfer torque magnetoresistive random-access memory (STT-MRAM), as replacements for DRAM. NVM offers a wide variety of advantages for both data storage and computation, including zero idle power, no data refreshes, and a higher level of parallelism for computation. However, using NVM in a computer system introduces new design challenges. For storage, STT-MRAM suffers from read disturbance, the accidental corruption of data when it is read, which requires data to be restored to memory after each read operation. We propose both device-level and architecture-level innovations to mitigate and tolerate read disturbance. The device-level schemes are effective in reducing the read disturbance probability but come at a cost to other design metrics. Consequently, we further propose a restore-aware memory controller design at the architecture level to tolerate read disturbance. Because the extra restores incurred by read disturbance greatly change the timing scenarios for which conventional memory controllers were optimized, directly adopting restore-agnostic DRAM memory management techniques leads to suboptimal designs for STT-MRAM. We therefore propose restore-aware policy selection (RAPS), a dynamic, hybrid row buffer management scheme that factors in the inevitable data restores in STT-MRAM-based main memory. RAPS monitors the row buffer hit rate at run time and dynamically switches between two static page-closure policies. By factoring in restores, RAPS accurately captures the optimal design points, achieving optimal policy selection at run time. Our experimental results show that RAPS significantly improves system performance and energy efficiency compared to conventional page-closure policies.
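The run-time switching idea behind RAPS can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from this work: the two policies are assumed to be open-page and close-page, the latencies in Timing are hypothetical cycle counts, the epoch length is arbitrary, and the break-even threshold comes from a toy cost model in which an open-page row miss pays restore and precharge on the critical path while close-page hides them after each access.

```python
from dataclasses import dataclass


@dataclass
class Timing:
    # Hypothetical latencies in controller cycles; not values from the source.
    t_act: int = 10      # row activation
    t_pre: int = 10      # precharge
    t_restore: int = 20  # restore after a disturbing read (0 models DRAM-like timing)


def break_even_hit_rate(t: Timing) -> float:
    """Hit rate above which keeping rows open wins under the toy cost model.

    Open-page miss penalty: t_restore + t_pre + t_act; close-page penalty on
    every access: t_act. Equating expected costs gives the ratio below.
    """
    close_cost = t.t_restore + t.t_pre
    return close_cost / (close_cost + t.t_act)


class PolicySelector:
    """Illustrative hit-rate monitor that switches policy once per epoch."""

    def __init__(self, timing: Timing, epoch_len: int = 1000):
        self.threshold = break_even_hit_rate(timing)
        self.epoch_len = epoch_len
        self.policy = "open-page"
        self.last_row = {}  # bank -> most recently accessed row
        self.hits = 0
        self.accesses = 0

    def record_access(self, bank: int, row: int) -> str:
        # Count potential row hits even while rows are being closed early,
        # so the monitor stays meaningful under either policy.
        self.hits += int(self.last_row.get(bank) == row)
        self.last_row[bank] = row
        self.accesses += 1
        if self.accesses == self.epoch_len:
            hit_rate = self.hits / self.accesses
            self.policy = "open-page" if hit_rate >= self.threshold else "close-page"
            self.hits = self.accesses = 0
        return self.policy
```

Under this toy model, a nonzero restore latency raises the break-even hit rate, so a threshold tuned for DRAM-like timing would keep rows open in cases where closing them is cheaper. That is one way to read the claim that restore-agnostic policies are suboptimal for STT-MRAM.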
For computation, a number of recent research efforts, including those on nonvolatile memory (NVM) and processing-in-memory (PIM), have attempted to design memory-centric, domain-specific architectures that trade generality for better memory performance. Meanwhile, with the slowdown of Moore's Law, 3D stacking has been proposed to vertically stack multiple memory and/or logic dies, offering a new dimension of scalability in IC design. We seek to integrate these three disruptive technologies (NVM, PIM, and 3D stacking) for the first time into a novel architecture, 3D PI-NVM, to accelerate deep learning applications. With the proposed 3D-aware model mapping and data flow management mechanisms, as well as an inter-layer allocation scheme, 3D PI-NVM achieves not only the benefits of each individual technology but also the unique benefits of their integration. Experimental results demonstrate that, compared to state-of-the-art NVM-based accelerator designs, the proposed 3D PI-NVM framework provides orders of magnitude speedup with similar computation resource overhead.