
Columnar Data Replication for Scale-Up Hybrid Memory Systems
This presentation covers polymorphic compressed replication for columnar data on scale-up hybrid memory systems. It outlines the challenges such systems pose for database systems, the state of the art in analytical processing and replication, the storage overhead of replication, and the benefits of replicating across hybrid memory (DRAM and NVRAM).
Presentation Transcript
Polymorphic Compressed Replication for Columnar Data on Scale-Up Hybrid Memory Systems
Mikhail Zarubin, Patrick Damme, Dirk Habich, Wolfgang Lehner
SYSTOR 2020, Haifa, Israel, October 13-14, 2020

Scale-Up Hybrid Memory Systems
Impact on Database Systems
- True in-memory database
- Byte-addressable data processing
- Huge performance increase
- Instant access and recovery
Challenge
- Unreliable NVRAM hardware -> need for primary data protection
[Diagram: Socket 1 and Socket 2, each with cores, DRAM, and NVRAM, connected via UPI; a pool holding Data A and Data B]

Columnar Data on Scale-Up Hybrid Memory Systems
- State of the art in analytical processing (e.g. in Big Data)
- Column: sequence of values mapped to integers (64-bit)
- Query: sequence of operators executed on columns
[Diagram: Socket 1 and Socket 2, each with cores, DRAM, and NVRAM, connected via UPI; a pool holding Column A and Column B]

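The column and query definitions on this slide can be illustrated with a minimal Python sketch. The operator names (`select_less_than`, `project`, `agg_sum`) and the sample data are assumptions made for illustration; they are not taken from the presentation or from MorphStore.

```python
from array import array


def make_column(values):
    """Store a column as a contiguous sequence of signed 64-bit integers."""
    return array("q", values)


def select_less_than(column, bound):
    """Selection operator: positions of all values smaller than bound."""
    return array("q", (i for i, v in enumerate(column) if v < bound))


def project(column, positions):
    """Projection operator: values of column at the given positions."""
    return array("q", (column[p] for p in positions))


def agg_sum(column):
    """Aggregation operator: sum of all values in the column."""
    return sum(column)


# A query is a sequence of operators executed on columns:
# roughly SELECT SUM(qty) WHERE price < 20 (hypothetical data).
price = make_column([10, 25, 7, 40])
qty = make_column([1, 2, 3, 4])

positions = select_less_than(price, 20)
result = agg_sum(project(qty, positions))
```

Each operator consumes and produces whole columns, which is why the physical format and placement of a column matter for every step of a query.
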
Replication
State of the art
- Intel PMDK pool replication
- Synchronous master-slave
Features
- Offers protection against all possible NVRAM failures
- Software-coordinated for flexibility
- Physical level for the best performance
- Synchronous for strong consistency
Diagram annotations
1. NUMA Awareness
2. Only NVRAM Replication
3. Storage Overhead
[Diagram: two-socket system (cores, DRAM, NVRAM per socket, UPI link) with a pool and a worker; labels: Master, Replica, Write, Read, Data access]

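The idea of synchronous master-slave replication can be sketched in a few lines. This is only an illustration under simplifying assumptions: plain `bytearray` objects stand in for NVRAM regions, the class name `SyncReplicatedBuffer` is hypothetical, and none of this is the PMDK API.

```python
class SyncReplicatedBuffer:
    """Toy model: every write reaches master and replica before returning."""

    def __init__(self, size):
        self.master = bytearray(size)   # stand-in for the master region
        self.replica = bytearray(size)  # stand-in for the replica region

    def write(self, offset, data):
        end = offset + len(data)
        if offset < 0 or end > len(self.master):
            raise ValueError("write outside of buffer")
        self.master[offset:end] = data
        # Synchronous: the replica is updated before the call returns,
        # so both copies are identical whenever a write has completed.
        self.replica[offset:end] = data

    def read(self, offset, length):
        return bytes(self.master[offset:offset + length])

    def recover_master(self):
        """Rebuild the master from the replica after a (simulated) failure."""
        self.master[:] = self.replica


buf = SyncReplicatedBuffer(16)
buf.write(4, b"abcd")
buf.master[4:8] = b"\x00\x00\x00\x00"  # simulate losing the master copy
buf.recover_master()
restored = buf.read(4, 4)
```

Because the replica is written on every update, the capacity cost is a full second copy, which corresponds to the storage overhead annotated on the slide.
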
Replication
1. Per data structure NUMA awareness + analytical query execution on replicas
- Speedup
- Improved hardware utilization
[Diagram: two-socket system (cores, DRAM, NVRAM per socket, UPI link) with a pool and a worker; labels: Master, Replica, Write, Read, Data access]

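One way to picture NUMA-aware reads is a lookup that prefers the copy on the worker's own socket. The sketch below is an assumption-laden illustration: the `Copy` type, the `pick_copy_for_read` function, and the fallback rule are hypothetical and do not describe the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    socket: int  # socket whose memory holds this copy
    role: str    # "master" or "replica"


def pick_copy_for_read(copies, worker_socket):
    """Prefer a copy local to the worker's socket; otherwise use the first."""
    if not copies:
        raise ValueError("column has no copies")
    for copy in copies:
        if copy.socket == worker_socket:
            return copy
    return copies[0]  # remote access, e.g. across the socket interconnect


column_a = [Copy(socket=1, role="master"), Copy(socket=2, role="replica")]

read_on_socket_1 = pick_copy_for_read(column_a, worker_socket=1)
read_on_socket_2 = pick_copy_for_read(column_a, worker_socket=2)
```

With this rule a worker on socket 2 reads the replica instead of crossing to socket 1, so the replica serves queries in addition to protecting the data.
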
Replication
2. Hybrid memory replica placement
- Unified scheme for NVRAM and DRAM replication
- Improved performance and hardware utilization
- Persistent capacity increase
[Diagram: two-socket system (cores, DRAM, NVRAM per socket, UPI link) with a pool and a worker; labels: Master, Replica, Write, Read, Data access]

Compressed Replication
3. Compressed format of replicas (and possibly masters)
- Reduced storage overhead (wear-out minimization)
- Possibly reduced runtime overhead (increase in effective bandwidth)
- Query execution speedup due to direct operations on compressed data
[Diagram: two-socket system (cores, DRAM, NVRAM per socket, UPI link) with a pool and a worker; labels: Master, Replica, Write, Read, Compression, Data access]

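To illustrate what operating directly on compressed data can look like, the sketch below sums a column without decompressing it. Run-length encoding is chosen here purely as an example; the slides do not name the compression formats used, and the function names are hypothetical.

```python
def rle_compress(values):
    """Run-length encode a sequence into (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]


def rle_sum(runs):
    """Sum the column directly on the compressed form: one step per run."""
    return sum(v * n for v, n in runs)


def rle_count_equal(runs, needle):
    """Count occurrences of needle without expanding any run."""
    return sum(n for v, n in runs if v == needle)


column = [5, 5, 5, 5, 2, 2, 9, 9, 9]
runs = rle_compress(column)

total = rle_sum(runs)
nines = rle_count_equal(runs, 9)
```

Here the aggregation touches three runs instead of nine values, which is the kind of effect behind both the reduced storage overhead and the possible query speedup listed on the slide.
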
Polymorphic Compressed Replication
4. Various compression formats for different replicas of the same column
- Averaged storage overhead reduction
- Query execution speedup as different formats may facilitate different access patterns
[Diagram: two-socket system (cores, DRAM, NVRAM per socket, UPI link) with a pool and a worker; labels: Master, Replica, Write, Read, Compression A, Compression B, Data access]

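The point that different formats suit different access patterns can be sketched with two replicas of the same column. Everything here is an illustrative assumption: the two formats (uncompressed and run-length encoded), the class names, and the mapping from access pattern to format are not taken from the presentation.

```python
from array import array


class PlainReplica:
    """Uncompressed replica: constant-time positional access."""

    def __init__(self, values):
        self.data = array("q", values)

    def get(self, pos):
        return self.data[pos]

    def total(self):
        return sum(self.data)


class RleReplica:
    """Run-length encoded replica: compact, cheap to aggregate."""

    def __init__(self, values):
        self.runs = []
        for v in values:
            if self.runs and self.runs[-1][0] == v:
                self.runs[-1][1] += 1
            else:
                self.runs.append([v, 1])

    def get(self, pos):
        # Positional access has to walk the runs, so it is slower here.
        if pos < 0:
            raise IndexError(pos)
        for v, n in self.runs:
            if pos < n:
                return v
            pos -= n
        raise IndexError("position out of range")

    def total(self):
        return sum(v * n for v, n in self.runs)


def choose_replica(replicas, access_pattern):
    """Hypothetical policy: random access -> plain, full scan -> RLE."""
    preferred = {"random": "plain", "scan": "rle"}[access_pattern]
    return replicas[preferred]


values = [3, 3, 3, 8, 8, 1]
replicas = {"plain": PlainReplica(values), "rle": RleReplica(values)}

lookup = choose_replica(replicas, "random").get(4)
scan_total = choose_replica(replicas, "scan").total()
```

Both replicas answer every request with the same result; only the cost differs, so a query can be routed to whichever copy fits its access pattern.
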
Polymorphic Compressed Replication (PCR)
1. NUMA Awareness
2. Unified DRAM/NVRAM replication
3. Storage Overhead
Proof of concept: abstract user-space library
[Diagram: two-socket system (cores, DRAM, NVRAM per socket, UPI link) with a pool and a worker; labels: Master, Replica, Write, Read, Compression A, Compression B, Data access]

Abstract User-space Library
- Hides PCR complexity behind a simplified API
- Integrated with our in-house query processing engine MorphStore
- User applications mostly need to change only their memory management calls

Polymorphic Compressed Replication
- Evaluated on a dual-socket system: Cascade Lake, 2.6 GHz, 384 GiB DRAM, 1.5 TiB NVRAM
- Every idea is useful
- Techniques can be combined (e.g. delivering up to 5x speedup)
- Improvement depends on access pattern, number of threads, data characteristics, and system workload
Thank you for your attention!