
Mitigating Data-Dependent DRAM Failures for Reliable System Operation
Explore cutting-edge methods for detecting and mitigating data-dependent DRAM failures to ensure reliable system performance. Learn about leveraging memory content, online profiling, technology scaling, and efficient detection techniques. Discover how to tackle intermittent failures and detect data dependencies in DRAM cells effectively.
Presentation Transcript
MEMCON: Detecting and Mitigating Data-Dependent DRAM Failures by Exploiting Current Memory Content. Samira Khan, Chris Wilkerson, Zhe Wang, Alaa Alameldeen, Donghyuk Lee, Onur Mutlu. Session 1A at 11.20 am.
VISION: SYSTEM-LEVEL DETECTION AND MITIGATION. Detect and mitigate unreliable DRAM cells to obtain a reliable system. ONLINE PROFILING: detect and mitigate errors after the system has become operational.
BENEFITS OF ONLINE PROFILING. 1. Improves yield, reduces cost, enables scaling. Vendors can make cells smaller without a strong reliability guarantee. [Figure: technology scaling, showing unreliable and reliable DRAM cells.]
BENEFITS OF ONLINE PROFILING. 2. Improves performance and energy efficiency. Reduce the refresh count by using a lower refresh rate (LO-REF) by default, and use a higher refresh rate (HI-REF) only for rows that contain faulty cells. [Figure: rows marked LO-REF or HI-REF, with unreliable DRAM cells in the HI-REF rows.]
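The refresh saving from mixing LO-REF and HI-REF rows is simple arithmetic, sketched below. The function name, the 16 ms and 64 ms refresh intervals, and the share of faulty rows are all hypothetical values chosen for illustration; none of them comes from the slides.

```python
def refresh_reduction(total_rows, faulty_rows, hi_ref_interval_ms, lo_ref_interval_ms):
    """Fraction of row refreshes saved when only faulty rows stay at HI-REF.

    Baseline: every row is refreshed at the HI-REF interval.
    Mixed:    faulty rows use HI-REF, all other rows use LO-REF.
    """
    if not 0 <= faulty_rows <= total_rows:
        raise ValueError("faulty_rows must be between 0 and total_rows")
    hi_rate = 1000.0 / hi_ref_interval_ms  # refreshes per row per second
    lo_rate = 1000.0 / lo_ref_interval_ms
    baseline = total_rows * hi_rate
    mixed = faulty_rows * hi_rate + (total_rows - faulty_rows) * lo_rate
    return 1.0 - mixed / baseline


# Hypothetical inputs: 1000 rows, 200 of them faulty,
# HI-REF every 16 ms, LO-REF every 64 ms.
saving = refresh_reduction(1000, 200, 16.0, 64.0)
print(f"refresh count reduced by {saving:.0%}")
```

The saving shrinks as the share of faulty rows grows, which is why accurate detection of the rows that actually need HI-REF matters.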
DETECTION IS HARD DUE TO INTERMITTENT FAILURES. DATA-DEPENDENT FAILURE: some cells can fail depending on the data stored in neighboring cells. [Figure: cells marked FAILURE or NO FAILURE for different combinations of 0s and 1s in the cell and its neighbors.]
HOW TO DETECT DATA-DEPENDENT FAILURES? Test with a specific data pattern in neighboring cells: L D R = 0 1 0.
LINEAR MAPPING: the cells next to address X are X-1 and X+1, so writing 0 1 0 to X-1, X, X+1 places the pattern in physically adjacent cells.
SCRAMBLED MAPPING: the cells next to address X may instead be, for example, X-4 and X+2, so writing 0 1 0 to X-1, X, X+1 does not place the pattern in physically adjacent cells. The scrambled mapping is NOT EXPOSED TO THE SYSTEM.
MEMCON. Under scrambled mapping, the neighbors of address X sit at unknown offsets (X-?, X+?). MEMCON detects data-dependent failures without knowledge of the DRAM internal address mapping. Results: 65%-74% reduction in refresh count; 40%-50% performance improvement using 32Gb DRAM.
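One way to read "exploiting current memory content" is that the test runs against whatever data a row currently holds, so the physical neighbors already contain their real values and the tester never needs the mapping. The sketch below illustrates that reading only; it is not the authors' implementation. The `SimulatedRow` class, the failure model (a weak cell holding 1 fails when both physical neighbors hold 0), the hidden permutation, and the policy of re-testing after a write are all assumptions.

```python
class SimulatedRow:
    """Toy DRAM row whose internal address mapping is hidden from the tester."""

    def __init__(self, bits, hidden_mapping, weak_positions):
        self._mapping = list(hidden_mapping)  # system address -> physical position
        self._weak = set(weak_positions)      # physical positions of weak cells
        self.bits = list(bits)

    def write(self, bits):
        self.bits = list(bits)

    def passes_with_current_content(self):
        """Stand-in for a retention test run on the data currently stored."""
        cells = [None] * len(self.bits)
        for sys_addr, bit in enumerate(self.bits):
            cells[self._mapping[sys_addr]] = bit
        for pos in self._weak:
            left = cells[pos - 1] if pos > 0 else None
            right = cells[pos + 1] if pos + 1 < len(cells) else None
            if cells[pos] == 1 and left == 0 and right == 0:
                return False
        return True


def pick_refresh_rate(row):
    """The tester only observes pass/fail; it never sees the mapping."""
    return "LO-REF" if row.passes_with_current_content() else "HI-REF"


row = SimulatedRow(
    bits=[1, 1, 1, 0, 1, 0, 1, 1],
    hidden_mapping=[1, 0, 4, 5, 2, 6, 3, 7],  # hypothetical scrambling
    weak_positions={2},
)
before_write = pick_refresh_rate(row)  # current content does not trigger the weak cell

row.write([0, 1, 1, 1, 1, 1, 0, 1])    # new content puts 0s next to the weak cell
after_write = pick_refresh_rate(row)

print(before_write, after_write)
```

In this sketch the result is only valid for the content that was tested, so a write to the row calls for a new test before the row can stay at the lower refresh rate.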