Impact of Compilers on Memory Models in Programming
Explore the influence of compilers on memory models and programming languages, delving into the limitations, transformations, and legal aspects of compiler actions within the context of memory models. Understand how compilers affect code behavior and the enforcement of memory structures.
Compilers, Languages, and Memory Models DAVID DEVECSERY
Administration! Readings: most are up! (Some are still subject to change.) Sign up for discussions on Canvas! Project proposal info coming soon!
Today: Compilers! Compilers are like hardware: they do things we don't expect! What limitations are there on what a compiler can do? What memory model should a compiler enforce? C++ memory model studies.
Why do compilers impact MMs? How can the compiler transform this code? (Initial: ready = false, a = null)
  Thread 1: a = new int[10]; ready = true;
  Thread 2: if (ready) a[5] = 5;
Candidate transformations: operation splitting, operation reordering, operation combining.
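The three transformation kinds can be sketched on straight-line code. This is a minimal illustration (the struct and function names are invented for the example, not from the slides): each variant computes the same single-threaded result, which is why the compiler considers all of them legal.

```cpp
#include <cstdint>

// Hypothetical straight-line code a compiler might optimize.
struct Pair { int32_t lo; int32_t hi; };

// Original: two separate stores to neighboring fields.
int64_t original(Pair& p) {
    p.lo = 1;        // store 1
    p.hi = 2;        // store 2
    return (int64_t)p.lo + p.hi;
}

// Operation reordering: the stores are independent, so swapping them
// preserves single-threaded semantics.
int64_t reordered(Pair& p) {
    p.hi = 2;
    p.lo = 1;
    return (int64_t)p.lo + p.hi;
}

// Operation combining: both stores merged into one wider store of the
// whole struct.
int64_t combined(Pair& p) {
    p = Pair{1, 2};  // single merged store
    return (int64_t)p.lo + p.hi;
}
```

All three return the same value in a single thread; only another thread observing the intermediate states could tell them apart.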
Why do compilers impact MMs? What can't the compiler do? (Initial: ready = false, a = null)
  Thread 1: a = new int[10]; ready = true;
  Thread 2: if (ready) a[5] = 5;
The compiler may not: break single-threaded semantics, reorder across synchronizations, or break control dependencies.
Why do compilers impact MMs? Is this a legal transformation? (Initial: ready = false, a = null)
Before:
  Thread 1: a = new int[10]; ready = true;
  Thread 2: if (ready) a[5] = 5;
After (stores reordered):
  Thread 1: ready = true; a = new int[10];
  Thread 2: if (ready) a[5] = 5;
Why do compilers impact MMs? Is this a legal transformation? (Initial: ready = false, a = null)
Before:
  Thread 1: a = new int[10]; ready = true;
  Thread 2: if (ready) a[5] = 5;
After (the store to a split into two partial stores):
  Thread 1: tmp = new int[10]; a = tmp & 0xFF; ready = true; a |= tmp & ~0xFF;
  Thread 2: if (ready) a[5] = 5;
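The splitting above can be simulated deterministically. This sketch (the variable and function names are invented for illustration) shows why the split is legal single-threaded, and what goes wrong concurrently: between the two partial stores the destination briefly holds a mixed, meaningless value.

```cpp
#include <cstdint>

// Captures the value mid-transformation; a racing reader could see this.
uint64_t torn_intermediate = 0;

// A single 64-bit store rewritten as two partial stores, as on the slide.
// Single-threaded the final value is identical, but the intermediate state
// is a torn value that is neither the old nor the new contents.
uint64_t split_store(uint64_t& dst, uint64_t val) {
    dst = val & 0xFF;          // first partial store: low byte only
    torn_intermediate = dst;   // snapshot of the torn state
    dst |= val & ~0xFFull;     // second partial store: remaining bits
    return dst;
}
```

After `split_store(x, 0x1234)` the final value is `0x1234`, but the intermediate state was `0x34` — a pointer-sized value no thread ever wrote as a whole.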
Why do compilers impact MMs? Is this a legal compiler transformation? Why or why not? (Initial: ready = false, a = null)
Before:
  Thread 1: a = new int[10];
  Thread 2: b = a + 1; z = b + 10; c = a + 1; w = c[1];
After (common subexpression elimination: c = a + 1 reuses b):
  Thread 1: a = new int[10];
  Thread 2: b = a + 1; z = b + 10; w = b[1];
Why do compilers impact MMs? Is this a legal transformation? (Initial: ready = false, a = null)
Before:
  Thread 1: a = new int[10]; ready = true;
  Thread 2: if (ready) a[5] = 5;
After (the access to a hoisted above the branch):
  Thread 1: a = new int[10]; ready = true;
  Thread 2: int *tmp = a[5]; if (ready) *tmp = 5;
No: this transformation breaks single-threaded semantics; loading a[5] may segfault if ready is not true.
Why can hardware reorder across control dependencies? Hardware has full dynamic information and can efficiently mis-speculate; compiler reorderings are done without dynamic information.
                        Compilers   Hardware
  Instruction window:   More        Less
  Dynamic information:  Less        More
What restrictions do we need for DRF-0? What are the limitations on the compiler to enforce DRF0? (Hint: think of the hardware restrictions.) Any optimization must preserve the happens-before relationships created by the original program.
Why do compilers impact MMs? Is this valid in a DRF0 compiler? (Initial: ready = false, a = null)
Before:
  Thread 1: a = new int[10]; ready = true;
  Thread 2: if (ready) a[5] = 5;
After (stores reordered):
  Thread 1: ready = true; a = new int[10];
  Thread 2: if (ready) a[5] = 5;
Yes: there are no DRF0-defined happens-before orderings between the threads!
Why do compilers impact MMs? Is this valid in a DRF0 compiler? (Initial: ready = false, a = null)
Before:
  Thread 1: lock(m1); a = new int[10]; ready = true; unlock(m1);
  Thread 2: lock(m1); if (ready) a[5] = 5; unlock(m1);
After (stores reordered inside the critical section):
  Thread 1: lock(m1); ready = true; a = new int[10]; unlock(m1);
  Thread 2: lock(m1); if (ready) a[5] = 5; unlock(m1);
Yes: reordering within the lock is valid; these instructions don't have HB orderings with other threads.
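The locked version of the slide's example maps directly onto real C++. A minimal sketch (names like `shared_a` and `writer`/`reader` are invented for the example): every access to the shared state happens under the same lock, so all cross-thread orderings are happens-before edges that the compiler must preserve.

```cpp
#include <mutex>
#include <thread>
#include <vector>

std::mutex m1;
std::vector<int> shared_a;   // stands in for `a` (empty plays the role of null)
bool ready = false;

// Thread 1 on the slide: initialize, then publish, all under the lock.
void writer() {
    std::lock_guard<std::mutex> g(m1);
    shared_a.assign(10, 0);  // a = new int[10]
    ready = true;
}

// Thread 2 on the slide: check the flag and write, under the same lock.
void reader() {
    std::lock_guard<std::mutex> g(m1);
    if (ready) shared_a[5] = 5;
}
```

Inside the critical section the compiler is still free to reorder the two stores, exactly as the slide argues, because no other thread can observe the intermediate state while the lock is held.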
Is DRF-0 enough? What happens when DRF0 breaks? What are some negative consequences? E.g., crash bugs (below):
Before:
  Thread 1: a = new int[10]; ready = true;
  Thread 2: if (ready) a[5] = 5;
After (reordered):
  Thread 1: ready = true; a = new int[10];
  Thread 2: if (ready) a[5] = 5;
Safety vulnerabilities? Can DRF0 lead to safety concerns? (Initial: foo = 2)
  Thread 1: foo = 1000001; ... foo -= 1000000;
  Thread 2 (source):
    x = foo;
    switch (x) { case 1: ... case 2: ... case n: ... }
  Thread 2 (as compiled, re-reading foo):
    tmp = table[foo]; loc = tmp; goto loc;
If the compiled code re-reads foo for the jump-table lookup, the racing write can make it jump through an out-of-bounds table entry.
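The double-fetch hazard can be simulated deterministically. This is a hypothetical sketch (the function `double_fetch_safe` and its callback are invented, not from the slides): the "compiled" code reads the shared index twice, and a callback stands in for the racing writer landing between the two reads.

```cpp
#include <cstddef>

const std::size_t TABLE_SIZE = 3;

// Simulates compiled code that fetches the shared index once for the
// bounds check and AGAIN for the actual table lookup. Returns whether
// the index used for the lookup was still in bounds.
bool double_fetch_safe(int& foo, void (*race)(int&)) {
    int check = foo;                          // first fetch: bounds check
    if (check < 0 || check >= (int)TABLE_SIZE) return false;
    race(foo);                                // racing write lands here
    int index = foo;                          // second fetch: actually used
    return index >= 0 && index < (int)TABLE_SIZE;
}
```

With no racing write the check protects the lookup; with a write between the fetches, a value that passed the check is replaced by one that did not, which is exactly how the slide's switch can jump through a bogus table entry.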
What about if we want racy programs? Sometimes, we intentionally write racy programs (Initial: foo = null):
  if (foo == null) {
    lock(m1);
    if (foo == null) {
      foo = new Foo();
    }
    unlock(m1);
  }
What about if we want racy programs? Double-checked locking: is this safe? (Initial: foo = null)
Thread 1:
  if (foo == null) {
    lock(m1);
    if (foo == null) {
      foo = new Foo();
      // which the compiler may split into:
      //   foo = malloc(sizeof(Foo));
      //   foo.construct();
    }
    unlock(m1);
  }
Thread 2:
  if (foo == null) { ... }
  // if foo was published before construction, Thread 2 uses an uninitialized Foo
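A common C++11 fix for this pattern (one standard approach, not taken verbatim from the slides; the names `get_foo` and `foo_instance` are invented) publishes the pointer with a release store and reads it with an acquire load, so the constructor is guaranteed to finish before any other thread can see the pointer:

```cpp
#include <atomic>
#include <mutex>

struct Foo { int value; Foo() : value(42) {} };

std::atomic<Foo*> foo_instance{nullptr};
std::mutex foo_mutex;

// Double-checked locking, made safe with acquire/release atomics.
Foo* get_foo() {
    Foo* p = foo_instance.load(std::memory_order_acquire);  // fast path
    if (p == nullptr) {
        std::lock_guard<std::mutex> g(foo_mutex);
        p = foo_instance.load(std::memory_order_relaxed);   // re-check under lock
        if (p == nullptr) {
            p = new Foo();  // fully constructed before publication...
            foo_instance.store(p, std::memory_order_release);  // ...then published
        }
    }
    return p;
}
```

The release store may not be reordered before the construction, so the "uninitialized Foo" interleaving on the slide is ruled out. (The singleton is intentionally never freed in this sketch.)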
How can we safely write racy code? C++11 has Weak DRF0 (or WDRF0): it allows defined atomic operations, with relaxed consistency semantics:
  - Acquire
  - Release
  - AcquireRelease
  - Sequential Consistency
  - Relaxed (no ordering restrictions)
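These orderings map directly onto C++11's `std::memory_order` enumerators. As a small sketch of the weakest one (the function name is invented for the example): a relaxed `fetch_add` is enough when only atomicity is needed and no other memory must be ordered by the operation.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Increment a shared counter from several threads using only
// memory_order_relaxed: the increments are atomic (no lost updates)
// even though they impose no ordering on surrounding memory accesses.
int count_relaxed(int n_threads, int per_thread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> ts;
    for (int i = 0; i < n_threads; ++i)
        ts.emplace_back([&] {
            for (int j = 0; j < per_thread; ++j)
                counter.fetch_add(1, std::memory_order_relaxed);
        });
    for (auto& t : ts) t.join();
    return counter.load(std::memory_order_relaxed);
}
```

A plain non-atomic `int` here would be a data race and could lose updates; the relaxed atomic fixes that while still being the cheapest ordering available.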
Why do compilers impact MMs? Can the following segfault in C++11? (Initial: ready = false, a = null)
Before:
  Thread 1: a = new int[10]; store(ready, true, ReleaseSemantics);
  Thread 2: if (load(ready, AcquireSemantics)) a[5] = 5;
After (stores reordered):
  Thread 1: store(ready, true, ReleaseSemantics); a = new int[10];
  Thread 2: if (load(ready, AcquireSemantics)) a[5] = 5;
No! This is impossible! The reordering breaks the happens-before ordering between the store and the release.
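The slide's publication pattern in real C++11 syntax (a minimal sketch; the globals `pub_ready`/`pub_a` and the spin loop are additions for the example): the release store cannot move before the array initialization, and the acquire load synchronizes with it, so the reader never dereferences a stale `a`.

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> pub_ready{false};
int pub_a[10];

// Thread 1 on the slide: initialize, then publish with release semantics.
void publisher() {
    pub_a[5] = 0;  // stands in for a = new int[10]
    pub_ready.store(true, std::memory_order_release);
}

// Thread 2 on the slide, with a spin loop so the example is deterministic:
// once the acquire load sees true, the initialization is visible.
void consumer() {
    while (!pub_ready.load(std::memory_order_acquire)) { }
    pub_a[5] = 5;  // safe: acquire synchronized with the release store
}
```

Swapping the two lines in `publisher` would be exactly the illegal reordering on the slide: the compiler (and hardware) must keep the initialization before the release store.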
Atomics are simple, right? Can we see a==1, b==0, c==1, d==0 in C++11? (Initial: x = 0, y = 0)
  Thread 1: store(x, 1, Release)
  Thread 2: store(y, 1, Release)
  Thread 3: a = load(x, Acquire); b = load(y, Acquire)
  Thread 4: c = load(y, Acquire); d = load(x, Acquire)
Atomics are simple, right? Can we see a==1, b==0, c==1, d==0 in C++11? (Initial: x = 0, y = 0)
  Thread 1: store(x, 1, Release)
  Thread 2: store(y, 1, Release)
  Thread 3: a = load(x, Acquire); b = load(y, Acquire)
  Thread 4: c = load(y, Acquire); d = load(x, Acquire)
Yes! There is no relationship between Thread 1 and Thread 2, so the stores are not globally ordered. (Think of our TSO example.)
Why do compilers impact MMs? Can we see a==1, b==0, c==1, d==0 in C++11? (Initial: x = 0, y = 0)
  Thread 1: store(x, 1, Release)
  Thread 2: store(y, 1, Release)
  Thread 3: a = load(x, Acquire); b = load(y, Acquire)
  Thread 4: c = load(y, Acquire); d = load(x, Acquire)
Messy, but: no broken dependencies!
Why do compilers impact MMs? Can we see a==1, b==0, c==1, d==0 in C++11? (Initial: x = 0, y = 0)
  Thread 1: store(x, 1, SC)
  Thread 2: store(y, 1, SC)
  Thread 3: a = load(x, Acquire); b = load(y, Acquire)
  Thread 4: c = load(y, Acquire); d = load(x, Acquire)
No! The request for an SC operation enforces ordering between Thread 1 and Thread 2.
Why do compilers impact MMs? Can we see a==1, b==0, c==1, d==0 in C++11? (Initial: x = 0, y = 0)
  Thread 1: store(x, 1, SC)
  Thread 2: store(y, 1, SC)
  Thread 3: a = load(x, Acquire); b = load(y, Acquire)
  Thread 4: c = load(y, Acquire); d = load(x, Acquire)
Why do compilers impact MMs? Can you spot the problem with these dependencies? Can we see a==1, b==0, c==1, d==0 in C++11? (Initial: x = 0, y = 0)
  Thread 1: store(x, 1, SC)
  Thread 2: store(y, 1, SC)
  Thread 3: a = load(x, Acquire); b = load(y, Acquire)
  Thread 4: c = load(y, Acquire); d = load(x, Acquire)
Weak DRF-0 Is Weak DRF0 really needed? What does the paper say? What do you think? Do you think this abstraction is programmable?
C++11 and volatile: What is the use of volatile? It essentially assumes the value may change outside of the thread, and it disables reordering, splitting, and value-caching optimizations. Would making a and/or ready volatile help this code?
  Thread 1: a = new int[10]; ready = true;
  Thread 2: if (ready) a[5] = 5;
Next class Sequential Consistency and compilers! Java Memory Model