
C++ Memory Model and Synchronization Concepts
Explore the C++11 memory model and synchronization implementation for multithreaded programs, ensuring portability and code correctness. Learn about memory interactions, synchronization variables, and communication mechanisms in C++.
Uploaded on | 0 Views
Download Presentation

Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.
E N D
Presentation Transcript
Lecture 15 The C++ Memory model Implementing synchronization)
The C++11 Memorymodel C++ provides a layer of abstraction over thehardware, so we need another model, i.e. a contract between the hardware and the C++11 programmer Ensurethatmultithreadedprogramsareportable:theywill run correctlyondifferenthardware Clarifywhichoptimizationswillor willnot break our code We needthese rules, forexample.to understandwhenwe canhave a data race,sowe canknowwhenour programis correct, and that it will run correctly by all compliant C++11compilers Thread 1 r1 = X; if (r1 ==1) Y=1; Thread 2 r2 = Y; if (r2 ==1) X=1; For example, wemight ask: If X=Y=1, is itpossible for the outcomeofthis programto be r1 = r2 = 1? 2 Scott B. Baden / CSE 160 /Wi '16
Preliminaries The C++11memorymodel describesanabstractrelation betweenthreadsand memory Providesguaranteesaboutthe interactionbetweeninstruction sequences and variablesin memory Everyvariableoccupies1 memorylocation Bitfieldsandarraysaredifferent;don tloadall of c[ ] as a32 bit word Awrite to one locationcan taffectwritesto adjacentones struct s { charc[4]; int i:3,j:4; struct in { doubled; } id; }; 3 Scott B. Baden / CSE 160 /Wi '16
Whydont we want to load all of the c[ ] array as oneword? A. Because each element is consideredan variable B. Because another thread could be writing a single element C. Because another thread could be reading a single element D. A andB E. B and C struct in{ doubled; } id; }; struct s { charc[4]; int i:3,j:4; 18 Scott B. Baden / CSE 160 /Wi '16
Whydont we want to load all of the c[ ] array as oneword? A. Because each element is consideredan variable B. Because another thread could be writing a single element C. Because another thread could be reading a single element D. A andB E. B and C struct in{ doubled; } id; }; struct s { charc[4]; int i:3,j:4; 18 Scott B. Baden / CSE 160 /Wi '16
Communication Memorywrites made byonethreadcanbecomevisible,but . specialmechanismsareneededtoguaranteethatcommunication happens betweenthreads Without explicit communication,you can t guarantee which writes get seen by otherthreads,oreventheorderinwhichtheywillbeseen The C++ atomic variable (and the Java volatile modifier) constitutesa specialmechanismtoguaranteethatcommunicationhappensbetween threads When one thread writes to a synchronization variable (e.g. an atomic) and anotherthread sees thatwrite,thefirstthreadistellingthesecond aboutall of the contents ofmemoryupuntilit performed thewriteto thatvariable Ready is a synchronization variable In C++ we use load and store member functions All the memory contents seen by T1, before it wrote to ready, must be visible to T2, after it reads the value true for ready. http://jeremymanson.blogspot.com/2008/11/what-volatile-means-in-java.html 6 Scott B. Baden / CSE 160 /Wi '16
The effects ofsynchronization Synchronizationcanbe characterizedintermsof 3properties Atomicity, Visibility, Ordering All changes made in one synchronized variable or code block are atomicandvisiblewithrespectto othersynchronizedvariablesand blocksemployingthesamelock,and processingof synchronized methodsor blockswithinany giventhreadis in program-specified order Out of orderprocessingcannotmatterto otherthreadsemploying synchronization Whensynchronizationis not usedor is used inconsistently,answers are morecomplex Imposesadditionalobligationsonprogrammersattemptingtoensure objectconsistencyrelationsthatlieatthe heartof exclusion Objectsmustmaintaininvariantsas seenby allthreadsthatrelyon them,not just by the threadperforminganygivenstatemodification 7 Scott B. Baden / CSE 160 /Wi '16
The 3Properties Of most concernwhenvaluesmust be transferredbetween main memoryand per-threadmemory Atomicity.Whichinstructionsmust have indivisible effects? Visibility. Underwhatconditions arethe effectsof one thread visible to another? The effects of interest are: writes to variables,asseen via readsof those variables Ordering.Under whatconditionscanthe effectsof operations appear out of order to any given thread? In particular, reads and writes associated with sequences of assignment statements. 8 Scott B. Baden / CSE 160 /Wi '16
What kindsof variablesrequireatomic updates? A. Instance variables and staticvariables B. Array elements. Depends on the access pattern C. Local variables insidemethods D. A &B E. B & C 9 Scott B. Baden / CSE 160 /Wi '16
What kindsof variablesrequireatomic updates? A. Instance variables and staticvariables B. Array elements. Depends on the access pattern C. Local variables insidemethods D. A &B E. B & C 10 Scott B. Baden / CSE 160 /Wi '16
What kindsof variablesrequireatomic updates? A. Instance variables and staticvariables B. Array elements. Depends on the access pattern C. Local variables insidemethods D. A &B E. B & C 11 Scott B. Baden / CSE 160 /Wi '16
Dataraces Wesaythata programallowsadataraceona particularsetofinputs if thereis a sequentiallyconsistentexecution,i.e.aninterleavingof operations of the individual threads, in which two conflicting operationscanbeexecuted simultaneously (Boehm) Wesaythatoperationscanbeexecuted simultaneously iftheyoccur next to each other in the interleaving, and correspond to different threads Wecanguaranteesequentialconsistencyonlywhentheprogram avoids dataraces Considerthisprogram,withx= = y = = 0 initially Thread 1 x = 1; r1 = y; Thread 2 y = 1; r2 = x; 24 Scott B. Baden / CSE 160 /Wi '16
Does this programhavea data race? A. Yes B. No x = = y = = 0initially Atomic<int> x; int y; Thread1 Thread2 x = 1; r1 = y; y = 1; r2 = x; 25 Scott B. Baden / CSE 160 /Wi '16
Does this programhavea data race? A. Yes B. No x = = y = = 0initially Atomic<int> x; int y; Thread1 Thread2 x = 1; r1 = y; y = 1; r2 = x; 25 Scott B. Baden / CSE 160 /Wi '16
Does this programhavea data race? A. Yes B. No x = = y = = 0initially Atomic<int> x; int y; Thread1 Thread2 x = 1; r1 = y; y = 1; r2 = x; 25 Scott B. Baden / CSE 160 /Wi '16
Dataraces Wesaythata programallowsadataraceona particularsetofinputs if thereis a sequentiallyconsistentexecution,i.e.aninterleavingof operations of the individual threads, in which two conflicting operationscanbeexecuted simultaneously (Boehm) Wesaythatoperationscanbeexecuted simultaneously iftheyoccur next to each other in the interleaving, and correspond to different threads Wecanguaranteesequentialconsistencyonlywhentheprogram avoids dataraces Thisprogramhas a datarace(x = = y = = 0 initially) Thread 1 x = 1; r1 = y; Thread 2 y = 1; r2 = x; Execution x = 1; r1 = y; y = 1; r2 = x; // r1 = 1 r2 ==1 26 Scott B. Baden / CSE 160 /Wi '16
Happens-before Fundamentalconceptin understandingthe memory model Consider these 2 threads,with counter= 0 A:counter++; B: prints outcounter Evenif B executesafterA, we cannotguaranteethatB will see 1 unless we establisha happens-beforerelationship betweenthese two statementsrunning in differentthreads Whatguaranteeis made by a happens-beforerelationship? A guarantee that memory writes by one specific statement are visible to another specificstatement Differentwaysof accomplishingthis: synchronization, atomics, variables,threadcreationand completion,e.g. thread tA =thread(A); thread tB =thread(B); tA.join(); tB.join(); 27 Scott B. Baden / CSE 160 /Wi '16
Establishing a happens-beforerelationship C++ and Java provide synchronization variables to communicate between threads, and are intended to be accessed concurrently: the atomic types, mutexes Such concurrent accesses are not considered dataraces Thus, sequential consistency is guaranteed so long as the only conflicting concurrent accesses are to synchronization variables Any write to a synchronization variable establishes a happens-before relationship with subsequent reads ofthat same variable: x_ready=true happens-before the read of x_ready in Thread 2. A statement sequenced before another happens-before it x=42 happens-before x_ready-true Happens-before is transitive: everything sequenced before a write to synchronization variable also happens-before the read of that synchronization variable byanother thread. Thus, assignment x=42 (T1) is visible after the read of x_ready by Thread2, e.g the assignment to r1 global: int x; Thread 1 x =42; x_ready =true; atomic<bool>x_ready; Thread 2 while (!x_ready){} r1 =x; 28 Scott B. Baden / CSE 160 /Wi '16
Does this programhavea race condition? A. Yes B. No global: int x; Thread 1 x =42; x_ready =true; atomic<bool>x_ready; Thread 2 while (!x_ready){} r1 =x; 29 Scott B. Baden / CSE 160 /Wi '16
Does this programhavea race condition? A. Yes B. No global: int x; Thread 1 x =42; x_ready =true; atomic<bool>x_ready; Thread 2 while (!x_ready){} r1 =x; 29 Scott B. Baden / CSE 160 /Wi '16
Communication & synchronization variables The C++atomicvariableprovidesaspecialmechanismtoguaranteethat communicationhappensbetweenthreads Whichwrites get seen by otherthreads The order in which they will beseen The happens-beforerelationshipprovidestheguaranteethatmemorywrites by onespecificstatementarevisibletoanotherspecificstatement Differentways ofaccomplishingthis:atomics,variables,threadcreation andcompletion When one thread writes to a synchronization variable (e.g. an atomic or mutex) and another thread sees that write, the first thread is telling the secondaboutallofthe contentsofmemory upuntilit performedthewrite to thatvariable Ready is a synchronization variable In C++ we use load and store member functions All the memory contents seen by T1, before it wrote to ready, must be visible to T2, after it reads the value true for ready. http://jeremymanson.blogspot.com/2008/11/what-volatile-means-in-java.html 21 Scott B. Baden / CSE 160 /Wi '16
Establishing a happens-beforerelationship Sequential consistency is guaranteedso long as the only conflicting concurrent accesses are to synchronization variables Any write to a synchronization variable establishes ahappens-before relationship with subsequentreadsof thatsamevariable: x_ready=true happens-beforethe readof x_ready inThread2. A statement happens-before another statement sequenced immediately after it x=42 happens-before x_ready-true Happens-before is transitive:everything sequenced before awrite to synchronization variable also happens-before the read ofthat synchronizationvariable by another thread: x=42 (T1)is visible after the read of x_ready byT2, e.g. the assignment to r1 The programisfree from data races Thread 2 is guaranteed notto progress to the second statementuntilthe firstthread has completed and setx_ready There cannot be an interleaving of the steps in whichthe actions x =42 and r1 =x are adjacent Declaring avariableasasynchronization variable Ensures that the variable is accessedindivisibly Prevents both the compiler andthe hardware from reordering memoryaccesses in ways that are visible to the program and couldbreakit global: int x; atomic<bool> x_ready; Thread 1 Thread 2 while (!x_ready) {} r1 = x; x = 42; x_ready = true; 22 Scott B. Baden / CSE 160 /Wi '16
Using synchronization variables to ensure sequentially consistent execution Declaringavariableas asynchronizationvariable Ensuresthat the variable is accessed indivisibly Preventsboth the compiler and the hardware fromreordering memoryaccessesin waysthat are visible to the programand could breakit In practice this requires the compiler to obey extra constraints and to generate special code to prevent potential hardwareoptimizations that could re-order the time to access the variablesin memory (e.g.cache) The programisfreefrom dataraces Thread 2 is guaranteed not to progressto the second statement until the first thread has completed and set x_ready There cannot be an interleaving of the steps in which the actionsx = 42 and r1 = x are adjacent. Thisensures asequentiallyconsistentexecution,guarantees that r1 = 42 atprogram s end Thread1 x =42; x_ready =true; Thread2 while (!x_ready){} r1 =x; 23 Scott B. Baden / CSE 160 /Wi '16
Visibility Changesto variablesmadeby onethread are guaranteedto be visibleto otherthreadsundercertain conditionsonly Awritingthreadreleasesa synchronizationlockanda reading threadsubsequentlyacquiresthatsamelock If a variableisdeclaredasatomic atomic<bool> ready = false; int answer =0 All the memory contents seen by T1, before it wrote to ready, must be visible to T2, after it reads the value true for ready. http://jeremymanson.blogspot.com/2008/11/what-volatile-means-in-java.html 24 Scott B. Baden / CSE 160 /Wi '16
Sequentialconsistency in action Thread 2 can only print 42 The assignment to ready doesn t return a reference, but rather, the return type (bool) atomic<bool>ready; int answer; // notatomic void thread2(){ if(ready) print answer; void thread1() { answer=42; ready= true; } } 25 Scott B. Baden / CSE 160 /Wi '16
How visibilityworks Awritingthreadreleasesa synchronizationlockanda reading threadsubsequentlyacquiresthatsamelock Releasingalockflushes allwrites fromthethread sworking memory, acquiring a lock forces a (re)load of the values of accessiblevariables While lock actions provide exclusion only for the operations performed withinasynchronizedblock,thesememoryeffectsare definedtocoverall variablesusedbythethreadperformingthe action If a variableisdeclaredasatomic Anyvaluewrittento it isflushedandmade visiblebythewriter thread before the writer thread performs any further memory operation. Readers mustreloadthevalues ofvolatilefieldsuponeachaccess As a threadterminates,allwrittenvariablesareflushedtomain memory. If a threadusesjointo synchronizeon theterminationofanother thread,thenit sguaranteedtosee the effectsmadebythat thread 26 Scott B. Baden / CSE 160 /Wi '16
Sequentially consistencyin practice Too expensive to guarantee sequentially consistency all the time Code transformations made by thecompiler Instruction reordering in modern processors Write buffers inprocessors In short, different threads perceive that memory references are reordered 27 Scott B. Baden / CSE 160 /Wi '16
Caveats The memory modelguaranteesthataparticularupdatetoaparticular variablemade byonethreadwilleventuallybevisibletoanother Buteventuallycanbeanarbitrarilylongtime Long stretches of code in threads that use no synchronization can be hopelessly out of synch with other threads with respect to values of fields Shall not write loops waiting for values written by other threads unless the fields are atomic or accessed via synchronization But:guaranteesmadebythememorymodelareweakerthanmost programmers intuitivelyexpect,andarealsoweakerthanthose typicallyprovidedbyanygivenC++implementation Rules donotrequirevisibilityfailures across threads,theymerely allowthesefailures tooccur Notusingsynchronizationinmultithreadedcode doesn'tguarantee safetyviolations,itjustallowsthem Detectablevisibilityfailures mightnotarise inpractice Testingfor freedomfromvisibility-basederrorsimpractical,sincesuch errorsmightoccurextremelyrarely, oronlyonplatformsyoudonot haveaccessto,oronlyon thosethathavenotevenbeenbuiltyet! 28 Scott B. Baden / CSE 160 /Wi '16
Summary:why do we need a memory model? When one thread changes memory then there needs to be a definite order to those changes, as seen by other threads Ensure that multithreaded programs are portable: they will run correctly on different hardware Clarify which optimizations will or will not break our code Compileroptimizationscanmovecode Hardwareschedulerexecutesinstructionsout of order 29 Scott B. Baden / CSE 160 /Wi '16
Acquire and release Why can the program tolerate non-atomic reads andwrites? (Listing 5.2, Williams, p. 120) How are the happens-before relationships established? 1.std::vector<int> data; 2. std::atomic<bool> data_ready(false); 3. void reader_thread() { 4. while(!data_ready.load()) 5. std::this_thread::sleep(std::milliseconds(1)); 6.std::cout << The answer= << data[0] << std::endl; 7. } 8. void writer_thread() { 9. data.push_back(42); 10. data_ready=true; 11. } 30 Scott B. Baden / CSE 160 /Wi '16
Which happens-before relationships established? 1.std::vector<int> data; 2. std::atomic<bool> data_ready(false); 3. void reader_thread() { 4. while(!data_ready.load()) 5. std::this_thread::sleep(std::milliseconds(1)); 6.std::cout << The answer= << data[0] << std::endl; 7. } 8. 9. 10. data_ready=true; 11. } C. Wr @ (9) h-b Rd@(6) D. A & Bonly E. A, B & C void writer_thread() { data.push_back(42); A. Wr@ (9) h-b Wr @ (10) B. Rd@ (4) h-b Rd @(6) 31 Scott B. Baden / CSE 160 /Wi '16
How do we implementa synchronization variable? We said that when one thread writes to a synchronization variable such as an atomic.. .. and another thread sees that write, . the first thread is telling the second about all of the contents of memory up until it performed the write to that variable How does that thread tell those other threads to synchronize with those variables? Thread 1 All the memory contents seen by T1, before it wrote to ready, must be visible to T2, after it see that the value of read is true x = 42; atomic <bool> ready=true; Thread 2 while (!x_ready){} cout << data << endl; 32 Scott B. Baden / CSE 160 /Wi '16
Memory fences The threadneedsto flush allvariablestomemory,i.e.it synchronizesthem We implementflushingoperationswitha specialfence instruction, e.g.MFENCE A serializing operation guaranteeing that every load and store instruction that precedes, in program order, the MFENCE instructionisgloballyvisiblebeforeanyloador store instruction thatfollowstheMFENCE instructionisgloballyvisible. Intel64 & IA32architecturessoftwaredevelopermanual,vol3a http://goo.gl/SrdKS2 Also see www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html atomic<bool> ready; int answer = 32 ready=true 33 Scott B. Baden / CSE 160 /Wi '16
Implementing synchronization primitives Butthereis moreto thestory:we haveonlyensuredvisibility Whataboutatomicity,usedtobuildmutexes&othersynchvariables? We use a special machine instruction, e.g. CMPXCHG, that implements Compare and Swap (CAS) jfdube.wordpress.com/2011/11/30/understanding-atomic-operations http://heather.cs.ucdavis.edu/~matloff/50/PLN/lock.pdf Do atomically:comparecontentsofmemory locationloctoexpected; if theyare thesame, modifythelocationwithnewval Do atomically: CAS(*loc, expected, newval ): if (*loc== expected ){ *loc= newval; return0; } else return1 A CAS implementation of amutex 1 = UNLOCKED, 0 = LOCKED Lock( *mutex ) {while (CAS ( *mutex , 1, 0)) ; } Unlock( *mutex ) { *mutex = 1; } 34 Scott B. Baden / CSE 160 /Wi '16
Building an atomic counter Let s implement an atomic integer counterthat exports two operations, plus a constructor getValue() incr() CAS (*loc, expected , newval ): if (*loc== expected ){ *loc= newval; return 0; } else return 1 class AtomicCounter{ private: std::atomic<int> _ctr; public: int getCtr() { return _ctr; } AtomicCounter(){ _ctr = 0; } int incr() { int oldCtr = getCtr(); while (CAS(&_ctr, &oldCtr, oldCtr+1) ) oldCtr = getCtr(); return oldCtr + 1; } if _ctr ==oldCtr _ctr oldCtr+1 return 0; else return1; 35 Scott B. Baden / CSE 160 /Wi '16