
Dynamic Storage Reconfiguration and Distributed Shared Storage Solutions
Explore the concept of dynamic storage reconfiguration and distributed shared storage solutions through the lens of key topics such as motivation, definition, and implementation strategies. Discover how these technologies enable reliable and fault-tolerant storage systems in distributed environments.
Download Presentation

Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.
E N D
Presentation Transcript
Dynamic Storage Reconfiguration Alexander Spiegelman, Idit Keidar
Agenda Reconfigurable atomic storage Motivation Definition Solutions Expired configuration oracle Traversing possible configurations Using a generic SpSn abstraction Existing algorithms as SpSn implementations
Distributed Shared Storage server client Reliable replicated storage Using unreliable components Asynchrony - tolerate unpredictable network delays 3
Distributed Shared Storage 101 Servers fault-prone Clients ephemeral, infinitely many Servers emulate read/write registers to clients ABD [Attiya, Bar-Noy and Dolev] read/write 4 2 1 3 5
ABD Register Emulation Write to majority 4 2 1 3 5
ABD Register Emulation 4 2 1 3 5
ABD Register Emulation Read from majority 4 2 1 3 5
ABD Register Emulation 4 2 1 3 5 Every two majorities intersect
Motivation for Reconfiguration ABD tolerates minority failures 4 2 1 3 5 No more failures are allowed! need to be able to remove faulty and add correct servers
But, Be Careful! Remove(1,2) Add(6,7) Remove(1,2) Add(8,9) 4 3 1 2 5
But, Be Careful! 4 7 3 9 6 8 5
But, Be Careful! Split Brain! 4 7 3 9 6 8 5 Client have to coordinate
Agenda Reconfigurable atomic storage Motivation Definition Solutions Expired configuration oracle Traversing possible configurations Using a generic SpSn abstraction Existing algorithms as SpSn implementations Liveness with endless reconfigurations
Model Infinite set of clients any client can fail by crashing Infinite set of potential servers failures are restricted generalize majority to dynamic model (soon) Communication asynchronous
Configuration Finite set of servers & history how we got to it Changes = {+,-} X E.g., change can be +s12 or s7 Configuration = finite subset of Changes e.g., {+s1, +s2,-s2 +s3, -s4, +s12} Initial configuration C0 is known to all clients e.g., {+s1, +s2, +s3, +s4, +s5}
Configurations Membership Servers that were added and not removed i.e., {s | +s conf -s conf } e.g., Initial membership is {s1, s2, s3, s4, s5} by slight abuse of notation, we sometimes refer to C.membership as C configuration C C.membership +s1 s1 +s2 -s2 +S3 S3 +s4 s4 +s5 -s5
Install event Configurations can be installed i.e., install(C0) Intuitively, indicates that from now on we expect to be able to perform operations in the new configuration We will use it to define the failure model
Changing the Configuration Add reconfig(C,P) to the API Changes configuration C by applying changes in P Returns new installed configuration C May install more configurations C becomes superseded C is either C0 or a configuration returned by an earlier reconfig (assumption on usage) Proposal P is a set of changes E.g., {+s12, -s7, -s2} By convention, reconfig({}, C0) returnsC0 at time 0
Changing the Configuration (sequential example) reconfig(C0, {+s4, -s1+}) C0 C1 installed +s1 +s1 -s1 +s4 When reconfig returns C1, C0 is superseded +s2 +s2 +S3 +S3
Reconfiguration Properties reconfig(C, P) returns C s.t. C was installed C reflects P: C C P Validity: if C was installed, then for every c \in C there was reconfig(*, P ) invocation s.t. c P
Reconfiguration Properties C0 +s1 For example consider reconfig(C0, {+s4}) Concurrent with reconfig(C0, {+s5, - s1}), +s2 +S3 C1 C2 C3 C4 +s1 +s1 +s1 +s1 +s4 +s4 +s4 +s5 +s4 +s5 -s1 +s2 +s2 +s2 +s2 -s1 +S3 +S3 +S3 +S3 Both operations can install all reconfig(C0, {+s4}) can return all
Failure Model For ABD, need live majority but in which configuration? Availability: A configuration C is available if a majority of C.members is alive Rationale: can emulate register using ABD We require reconfigurability: Every installed configuration C must be available until it superseded (a reconfig returns C C) Rationale: Administrator may take down machines once a reconfiguration removing them returns
Shared Memory Abstraction We can see each configuration as a collection of (low-level) MWMR registers emulated by the servers in the configuration s membership (ABD) c c X.read() Y.write(v) accessible by clients if the configuration is available s s s s all the registers can be read in one access s
Problem: Dynamic Storage Emulate MWMR register Using a dynamic set of servers May fail subject to reconfigurability Clients invoke operations read(), write(v), and reconfig(C,P) sequential read/write specification: a read returns the last written value reconfig(C, P) returns C that is installed, reflects P, and satisfies validity
Agenda Reconfigurable atomic storage Motivation Definition Solutions Expired configuration oracle Traversing possible configurations Using a generic SpSn abstraction Existing algorithms as SpSn implementations
Tracking Available Configurations C0 +s1 +s2 +S3
Tracking Available Configurations C0 C +s1 +s4 +s5 +s1 reconfig +s4, +s5 +s2 +s2 +S3 +S3
Tracking Available Configurations C0 C +s1 +s4 +s5 +s1 reconfig +s4, +s5 +s2 +s2 +S3 +S3 May be unavailable (reconfigurability) C client We need to help clients
Expired Configuration Oracle Necessary for liveness Distributed Notified of superseded configurations Using expire(config, next) event Informs clients of next configurations
Tracking Available Configurations C0 C +s1 +s4 +s5 +s1 reconfig +s4, +s5 +s2 +s2 +S3 +S3 Expire(C0,C1) May be unavailable (reconfigurability) C client
Expire: A Closer Look Client invokes expire(config, next) config is expired, next is newer Different clients may expire the same configuration withdifferent nexts A client that accesses an unavailable configuration May either succeed OR Receive an expired message with some next configuration OR Stuck forever if the configuration was not previously expired
Expire Event c1 C0
Expire Event Reconfig(C0,P1) c1 C0
Expire Event Reconfig(C0,P1) Install(C1) c1 C1=C0 P1 C0
Expire Event Reconfig(C0,P1) Install(C1) expire(C0, C1) c1 C1=C0 P1 C0
Expire Event Reconfig(C0,P2) Reconfig(C0,P1) Install(C1) expire(C0, C1) c2 c1 C1=C0 P1 C0
Expire Event Reconfig(C0,P2) Install(C2) Reconfig(C0,P1) Install(C1) expire(C0, C1) c2 c1 C1=C0 P1 C0 C2=C0 P2
Expire Event Reconfig(C0,P2) Install(C2) expire(C0, C2) Reconfig(C0,P1) Install(C1) expire(C0, C1) c2 c1 C1=C0 P1 C0 C2=C0 P2
Expire Event Reconfig(C0,P2) Install(C2) expire(C0, C2) Reconfig(C0,P1) Install(C1) expire(C0, C1) C1 C2 c3 c4 c2 c1 C1=C0 P1 C0 Expire is dangerous! C2=C0 P2 Clients must coordinate. Unavailable
Agenda Reconfigurable atomic storage Motivation Definition Solutions Install/expire oracle Traversing possible configurations Using a generic SpSn abstraction Existing algorithms as SpSn implementations
Dynamic Atomic Storage Read/Write Na ve Solution When there is no reconfig in progress, emulate read/write using ABD on latest configuration returned by reconfig What happens during reconfiguration? Tomorrow Technion servers will be down for maintenance from 5:30am to 6:45am Virtually Yours, Moshe Barak
We Dont Want to Stop the World Where do we read/write during reconfiguration? Track installed non-superseded configurations Read in all Write in the latest How we do it? What about reconfig?
Consensus Can Help With Reconfig Rambo uses consensus to agree on the next conf C0
Consensus Can Help With Reconfig Rambo uses consensus to agree on the next conf Reconfig: C0
Consensus Can Help With Reconfig Rambo uses consensus to agree on the next conf Reconfig: Propose(P1) Propose(P2) c1 c2 C0
Consensus Can Help With Reconfig Rambo uses consensus to agree on the next conf Reconfig: c1 c2 Decide(P ) C0
Consensus Can Help With Reconfig Rambo uses consensus to agree on the next conf Reconfig: Install(C1) expire(C0, C1) c1 c2 Decide(P ) C1=C0 P C0
Consensus Can Help With Reconfig Rambo uses consensus to agree on the next conf Reconfig: Install(C1) expire(C0, C1) c1 c2 Decide(P ) C1=C0 P C2 C0
Traversing Configuration DAG Asynchronous solutions traverse DAG DynaStore [Aguilera, Keidar, Malkhi and Shraer] SmartMerge[Jehl, Vitenberg and Meling] Parsimonious SpSn [Gafni and Malkhi]