
Understanding Non-Volatile Main Memory in Computer Architecture
Discover the impact and benefits of Non-Volatile Main Memory (NVMM) on system performance and data persistence. Explore the motivation, architecture, and solutions for efficient memory management in NVMM technology.
Presentation Transcript
Log-Structured Non-Volatile Main Memory
Qingda Hu*, Jinglei Ren, Anirudh Badam, and Thomas Moscibroda
Microsoft Research; *Tsinghua University
Non-volatile memory is coming
Data storage
3D XPoint/Optane (2015 - )
Read: ~50ns, Write: ~10GB/s
PCM
Read: ~100ns, Write: ~1GB/s
Read: ~10 µs, Write: ~100MB/s
Background: Impact of NVM
Architecture: Non-Volatile Main Memory (NVMM)
Diagram labels: NVM, DRAM, DRAM, SSD
Data persistence as a bottleneck
10+x application performance improvement
Executive Summary
Motivation: Inefficient use of memory space. Inefficient support for crash consistency.
Diagram labels: Application, Library, DRAM, NVMM, SSD
Solution: Log-structured memory management for NVMM.
Evaluation: 7x less memory waste; 90% higher write throughput.
Outline
Motivation
Log-Structured NVMM
Tree-Based Address Mapping
Evaluation
Motivation I
Inefficient use of memory space
Reason: Traditional DRAM allocators incur high memory fragmentation.
Explanation:
Internal fragmentation (diagram): 8B and 16B objects stored in 32B blocks; an 8B object wastes 24B of its block.
External fragmentation (diagram): 32B | Waste (32B) | 32B | 32B | Waste (32B). The free space is split into separate 32B holes, so a 64B request cannot be served.
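The two kinds of waste on this slide can be worked through with a few lines of arithmetic. The sketch below is only an illustration: the function names and the hole offsets are hypothetical, and the 32B block size and 64B request are taken from the slide's diagram.

```python
def internal_waste(request_size, block_size):
    """Bytes lost when a request is rounded up to a fixed-size block."""
    if request_size > block_size:
        raise ValueError("request does not fit in the block")
    return block_size - request_size


def can_allocate(free_holes, request_size):
    """free_holes is a list of (offset, size) pairs.

    A request succeeds only if one single hole is large enough;
    the total amount of free memory is irrelevant.
    """
    return any(size >= request_size for _, size in free_holes)


# Internal fragmentation: an 8B object placed in a 32B block.
waste = internal_waste(8, 32)

# External fragmentation: layout 32B used | 32B free | 32B used | 32B used | 32B free.
# The offsets below are hypothetical and follow that layout.
holes = [(32, 32), (128, 32)]
total_free = sum(size for _, size in holes)
fits = can_allocate(holes, 64)

print(waste, total_free, fits)
```

Although 64B are free in total, no single hole can hold the 64B request, which is the situation the external-fragmentation diagram depicts.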
Motivation I
Inefficient use of memory space (cont.)
Fragmentation is a more severe issue for NVM!
Diagram labels: six processes, NVMM, DRAM
Motivation II
Inefficient support for crash consistency
Reason: Write-twice in log and home.
Explanation: Redo logging for example.
transaction { a += 1; b -= 1; }
Diagram: NVMM holds a and b in the Home area and again in the Log area.
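To make the write-twice cost concrete, here is a minimal sketch of redo logging that counts stores to each area. It assumes a toy dictionary-backed memory; `CountingMemory` and `redo_log_transaction` are hypothetical names, and real redo logging also needs ordering and flush instructions that are left out here.

```python
class CountingMemory:
    """Toy stand-in for a persistent region that counts stores."""

    def __init__(self):
        self.cells = {}
        self.writes = 0

    def store(self, key, value):
        self.cells[key] = value
        self.writes += 1


def redo_log_transaction(home, log, updates):
    """Apply updates with redo logging: log first, then home."""
    # Step 1: persist every new value in the log area.
    for name, value in updates.items():
        log.store(("redo", name), value)
    log.store("commit", True)  # commit marker
    # Step 2: write the same values again, this time to their home locations.
    for name, value in updates.items():
        home.store(name, value)


home = CountingMemory()
log = CountingMemory()
home.cells.update({"a": 10, "b": 10})  # initial state, not counted as writes

# transaction { a += 1; b -= 1; }
updates = {"a": home.cells["a"] + 1, "b": home.cells["b"] - 1}
redo_log_transaction(home, log, updates)

print(home.cells, home.writes, log.writes)
```

Each of the two updated values reaches NVMM twice, once in the log and once at home, plus one extra store for the commit marker.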
Log-Structured NVMM
Library and architecture
Process (user space): Application, Transaction
Address mapping (DRAM): Home addr. to Log addr., e.g. translate(&a) maps &a to the log copy of a
Memory management: An append-only log (Allocated | Available)
mmap() NVM device
Log-Structured NVMM
Low fragmentation
For internal fragmentation: Compact append. No internal fragmentation.
For external fragmentation: Log cleaning.
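Log cleaning can be sketched as copying the still-live entries into a fresh log and repointing the mapping. This is a simplified illustration, not the cleaner from the talk: the log is a Python list, the mapping is a dictionary from home address to log index, and `append` and `clean` are hypothetical helper names.

```python
def append(log, mapping, home_addr, payload):
    """Append a new version and make the mapping point at it."""
    mapping[home_addr] = len(log)
    log.append((home_addr, payload))


def clean(log, mapping):
    """Return a compacted (log, mapping) holding only live entries."""
    new_log = []
    new_mapping = {}
    for index, (home_addr, payload) in enumerate(log):
        # An entry is live only if the mapping still points at it.
        if mapping.get(home_addr) == index:
            new_mapping[home_addr] = len(new_log)
            new_log.append((home_addr, payload))
    return new_log, new_mapping


log, mapping = [], {}
append(log, mapping, 0x10, b"a1")
append(log, mapping, 0x20, b"b1")
append(log, mapping, 0x10, b"a2")  # supersedes the first entry

new_log, new_mapping = clean(log, mapping)
print(new_log, new_mapping)
```

The stale first version of the object at 0x10 is dropped, so the space it occupied becomes one contiguous free region at the end of the log rather than a hole in the middle.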
Log-Structured NVMM
Efficient crash-consistent update
No separate areas. Write only once.
transaction { a += 1; b -= 1; }
Address mapping: Home addr. (&a, &b) to Log addr. (a, b)
Log: Allocated (a, b) | Available
Header: size, checksum, etc.
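The write-once update can be sketched as appending each new value with a small header and then repointing the address mapping. The header layout below (home address, size, CRC32) is an assumption made for illustration, as are the function names and the home addresses `A` and `B`; the slide only says the header holds "size, checksum, etc.", and commit atomicity and flushing are not modelled.

```python
import struct
import zlib

# Assumed header layout: home address (8B), payload size (4B), CRC32 (4B).
HEADER = struct.Struct("<QII")


def append_entry(log, home_addr, payload):
    """Append header + payload to the log and return the entry offset."""
    offset = len(log)
    log += HEADER.pack(home_addr, len(payload), zlib.crc32(payload))
    log += payload
    return offset


def read_entry(log, offset):
    """Read an entry back, rejecting torn or corrupted payloads."""
    home_addr, size, checksum = HEADER.unpack_from(log, offset)
    start = offset + HEADER.size
    payload = bytes(log[start:start + size])
    if len(payload) != size or zlib.crc32(payload) != checksum:
        raise ValueError("torn or corrupt log entry")
    return home_addr, payload


def commit(log, mapping, updates):
    """Write each updated value once, then repoint the mapping."""
    offsets = {addr: append_entry(log, addr, struct.pack("<q", value))
               for addr, value in updates.items()}
    mapping.update(offsets)


def load(log, mapping, addr):
    _, payload = read_entry(log, mapping[addr])
    return struct.unpack("<q", payload)[0]


A, B = 0x1000, 0x1008  # hypothetical home addresses of a and b
log, mapping = bytearray(), {}
commit(log, mapping, {A: 10, B: 10})

# transaction { a += 1; b -= 1; }
commit(log, mapping, {A: load(log, mapping, A) + 1,
                      B: load(log, mapping, B) - 1})
print(load(log, mapping, A), load(log, mapping, B), len(log))
```

There is no separate home area to update afterwards: the appended entry is the object's new location, and the checksum lets a reader detect an entry that was only partially written.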
Tree-Based Address Mapping
Unique challenges to NVMM
Pervasive and highly frequent memory accesses.
Allocation granularity differs from access granularity: No O(1) lookup.
Filesystems: hash(block number) as the index.
Databases: hash(key or tuple ID) as the index.
Main memory: hash(address)? That maps every address!
Tree-based mapping made performant.
Example: which entry covers 0xABC8? Entries: 0xABB4, size=16; 0xABC0, size=24; ...
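The lookup problem on this slide is a predecessor search: find the allocation with the largest start address that is not above the accessed address, then check that the access falls inside it. The sketch below uses a sorted list with `bisect` as a stand-in for the tree; `RangeMap` and the log addresses are hypothetical, and the first allocation is placed at 0xABB0 (not the slide's 0xABB4) so that the two example ranges do not overlap.

```python
import bisect


class RangeMap:
    """Maps home address ranges to log addresses (sorted list as a stand-in for a tree)."""

    def __init__(self):
        self.starts = []   # sorted allocation start addresses
        self.entries = {}  # start -> (size, log_addr)

    def insert(self, start, size, log_addr):
        if start not in self.entries:
            bisect.insort(self.starts, start)
        self.entries[start] = (size, log_addr)

    def translate(self, addr):
        """Return the log address for addr, or None if it is unmapped."""
        # Predecessor search: last allocation starting at or before addr.
        i = bisect.bisect_right(self.starts, addr) - 1
        if i < 0:
            return None
        start = self.starts[i]
        size, log_addr = self.entries[start]
        if addr >= start + size:
            return None  # addr lies in a gap after the allocation
        return log_addr + (addr - start)


m = RangeMap()
m.insert(0xABB0, 16, 0x100)  # hypothetical log addresses
m.insert(0xABC0, 24, 0x200)

print(hex(m.translate(0xABC8)))
```

An exact-match hash on 0xABC8 would miss here, because only the start address 0xABC0 was ever inserted; the ordered search finds the enclosing allocation and adds the 8-byte offset.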
Tree-Based Address Mapping
Two-layer mapping
Partition index: O(1)
Tree for a small partition (4KB): O(log n)
Improves transaction throughput by 39.6% on average.
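A two-layer lookup of this kind might look as follows: the high bits of an address select a 4KB partition in constant time, and a small ordered structure inside the partition resolves the rest. This is a sketch under stated assumptions: `TwoLayerMap` is a hypothetical name, a dictionary plays the partition index, a sorted list plays the per-partition tree, and allocations are assumed never to cross a 4KB boundary.

```python
import bisect

PARTITION_BITS = 12  # 4KB partitions, as on the slide


class TwoLayerMap:
    def __init__(self):
        # partition number -> (sorted starts, {start: (size, log_addr)})
        self.partitions = {}

    def insert(self, start, size, log_addr):
        starts, entries = self.partitions.setdefault(
            start >> PARTITION_BITS, ([], {}))
        if start not in entries:
            bisect.insort(starts, start)
        entries[start] = (size, log_addr)

    def translate(self, addr):
        # Layer 1: constant-time partition selection.
        part = self.partitions.get(addr >> PARTITION_BITS)
        if part is None:
            return None
        starts, entries = part
        # Layer 2: ordered search inside the small partition.
        i = bisect.bisect_right(starts, addr) - 1
        if i < 0:
            return None
        start = starts[i]
        size, log_addr = entries[start]
        if addr >= start + size:
            return None
        return log_addr + (addr - start)


m2 = TwoLayerMap()
m2.insert(0x1FF0, 16, 0x100)  # partition 1
m2.insert(0x2000, 24, 0x200)  # partition 2

print(hex(m2.translate(0x2008)), hex(m2.translate(0x1FF8)))
```

Keeping each per-partition structure small bounds the depth of the ordered search regardless of how much memory is mapped overall.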
Tree-Based Address Mapping
Skip list
A probabilistically balanced tree.
No complex balancing operations.
No locking for read-only operations.
Improves transaction throughput by 48.9% with four threads.
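For readers unfamiliar with skip lists, the following single-threaded sketch shows the structure and a `floor` query (largest key not above the target), which is the operation an address mapping needs. It is a textbook skip list, not the implementation from the talk: the class names, the level cap of 16, and the promotion probability of 0.5 are assumptions, and the lock-free read path mentioned on the slide is not modelled.

```python
import random


class Node:
    __slots__ = ("key", "value", "forward")

    def __init__(self, key, value, level):
        self.key = key
        self.value = value
        self.forward = [None] * level  # one next-pointer per level


class SkipList:
    MAX_LEVEL = 16

    def __init__(self, seed=0):
        self.head = Node(None, None, self.MAX_LEVEL)
        self.level = 1
        self.rng = random.Random(seed)

    def _random_level(self):
        # Each extra level is kept with probability 1/2.
        level = 1
        while level < self.MAX_LEVEL and self.rng.random() < 0.5:
            level += 1
        return level

    def insert(self, key, value):
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in reversed(range(self.level)):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        nxt = node.forward[0]
        if nxt is not None and nxt.key == key:
            nxt.value = value  # overwrite an existing key
            return
        level = self._random_level()
        self.level = max(self.level, level)
        new = Node(key, value, level)
        for i in range(level):
            # Splice the new node in; no rebalancing is ever needed.
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def floor(self, key):
        """Return (k, v) for the largest k <= key, or None."""
        node = self.head
        for i in reversed(range(self.level)):
            while node.forward[i] is not None and node.forward[i].key <= key:
                node = node.forward[i]
        return None if node is self.head else (node.key, node.value)


sl = SkipList(seed=1)
for start, size in [(0xABC0, 24), (0xABB0, 16), (0xAC00, 8)]:
    sl.insert(start, size)

print(sl.floor(0xABC8))
```

Insertion only splices pointers at randomly chosen levels, which is why no rotations or other balancing steps appear anywhere in the code.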
Tree-Based Address Mapping
Group update
Within each transaction, all writes are first buffered in DRAM.
Writes with contiguous addresses are combined on transaction commit.
Improves transaction throughput by 42.3% on average.
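Combining contiguous writes at commit time can be sketched as a sort followed by a single merging pass. The function name `coalesce` and the example addresses are hypothetical, and the buffered writes are assumed not to overlap one another.

```python
def coalesce(writes):
    """Merge buffered writes whose address ranges are contiguous.

    writes maps a start address to the bytes written there.
    Returns a list of (start, data) pairs in address order.
    """
    merged = []
    for addr in sorted(writes):
        data = writes[addr]
        if merged and merged[-1][0] + len(merged[-1][1]) == addr:
            # This write begins exactly where the previous run ends.
            prev_addr, prev_data = merged[-1]
            merged[-1] = (prev_addr, prev_data + data)
        else:
            merged.append((addr, data))
    return merged


# Writes buffered in DRAM during one transaction (hypothetical addresses).
buffered = {0x104: b"BBBB", 0x100: b"AAAA", 0x200: b"CC"}
result = coalesce(buffered)
print(result)
```

The three buffered writes become two log appends, so the mapping needs two updates instead of three.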
Tree-Based Address Mapping
Hot tree node cache
A thread-local cache that references recently accessed nodes of the trees.
A special hash table design: Deliberately high collision.
Motivation: With an ordinary hash, addresses within a cached node do not hit, because their hash values are randomly distributed.
Solution: Use high-order bits of an address as its hash value.
Example: buckets 0xABB*, 0xABC*, 0xABD*. Cached entries: 0xABCD0 (size=16), 0xABC00 (size=24). A lookup of 0xABC08 lands in bucket 0xABC*: collision and found!
Improves transaction throughput by 30.1% on average.
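The high-order-bits idea can be sketched with a direct-mapped table whose index drops the low bits of the address. `HotNodeCache`, the 256-slot table, and the 8-bit shift are assumptions chosen to match the slide's 0xABC* bucket example; the sketch caches one allocation per slot, whereas the talk's cache refers to tree nodes, and a real cache would be thread-local.

```python
class HotNodeCache:
    """Direct-mapped cache indexed by the high-order bits of an address."""

    def __init__(self, slots=256, shift=8):
        self.slots = [None] * slots
        self.shift = shift  # low bits ignored, so nearby addresses collide

    def _index(self, addr):
        return (addr >> self.shift) % len(self.slots)

    def put(self, start, size, node):
        # A later put to the same bucket simply replaces the older entry.
        self.slots[self._index(start)] = (start, size, node)

    def get(self, addr):
        entry = self.slots[self._index(addr)]
        if entry is None:
            return None
        start, size, node = entry
        # The collision is useful only if addr lies inside the cached range.
        return node if start <= addr < start + size else None


cache = HotNodeCache()
cache.put(0xABC00, 24, "node-ABC00")

hit = cache.get(0xABC08)   # same bucket 0xABC*, inside the range
miss = cache.get(0xABD00)  # different bucket
print(hit, miss)
```

With a conventional hash, 0xABC00 and 0xABC08 would almost always land in different buckets and the second access would miss; making neighbouring addresses collide on purpose turns that into a hit. One limitation of this simplified version is that an address in the next bucket is not found even if the cached range extends into it.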
Evaluation
Environment:
8-core Intel Xeon CPU E5-2637 v3 (3.5 GHz), 64 GB DRAM
64-bit Linux kernel version 4.2.3
NVM emulation: write latency = max{500ns, write_size / (1GB/s)}
Part I: How effective are individual optimizations? Already shown.
Part II: How does LSNVMM perform against traditional systems?
Part III: What are the inherent costs of the log-structured approach?
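The emulated write latency can be worked through numerically. The sketch below assumes the garbled term in the transcript is the write size divided by 1 GB/s and that 1 GB means 10^9 bytes; the function name and parameter names are hypothetical.

```python
def emulated_write_latency_ns(write_size_bytes,
                              floor_ns=500,
                              bandwidth_bytes_per_s=10**9):
    """Latency model: max{500 ns, write_size / bandwidth}."""
    transfer_ns = write_size_bytes * 1_000_000_000 / bandwidth_bytes_per_s
    return max(floor_ns, transfer_ns)


small = emulated_write_latency_ns(64)    # one cache line: floor dominates
large = emulated_write_latency_ns(4096)  # one 4KB page: bandwidth dominates
print(small, large)
```

Under these assumptions, writes of up to 500 bytes cost the fixed 500 ns, and larger writes are charged in proportion to their size.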
Evaluation
Fragmentation: Compared to Hoard and jemalloc
Workloads 1 ~ 3 collected from [S. Rumble, FAST 14].
Hoard/jemalloc produces 25.3%/35.0% fragmentation on average.
Log-structured NVMM (LSNVMM) produces 4.5% fragmentation on average.
Evaluation
Transaction throughput compared to Mnemosyne
With 4 threads, log-structured NVMM performs 44.7% and 80.8% better than Mnemosyne and Mnemosyne-Undo, respectively, on average.
Evaluation
Cost of log cleaning
The performance degradation due to log cleaning is 8% at 90% memory utilization.
Conclusion
Takeaway I: Applying the log-structured approach to NVMM can largely reduce memory fragmentation and improve system performance.
Takeaway II: A tree-based address mapping mechanism can be made efficient to serve log-structured NVMM.
Thank you! Q & A
Backup
Recovery time (10GB logs)
Backup
DRAM footprint (1GB data)