
Cutting-Edge Program Analyses and Newton's Method Insights
Dive into innovative program analyses using Newton's Method by Thomas Reps from the University of Wisconsin and GrammaTech. Explore solutions for undecidability, dataflow analysis, general CS knowledge, and more in the realm of advanced computing. Uncover the big picture and ingredients of this fascinating story, along with practical applications in the field. Update your knowledge with the latest research in program analysis methodologies.
Download Presentation

Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.
E N D
Presentation Transcript
Program Analyses using Newton's Method Thomas Reps University of Wisconsin and GrammaTech, Inc.
Sidestepping Undecidability Bad States Reachable States Universe of States 2
Sidestepping Undecidability Overapproximate the reachable states Bad States Reachable States False positive! Universe of States 3
Big Picture: Dataflow Analysis Program Abstraction engine System of Dataflow Equations In this work, we combine two methods for solving equations to create an improved equation solver Equation solver Solution (= dataflow facts) 4
Five Pieces of General CS Knowledge Five Pieces of General CS Knowledge 1. Regular languages a. ? ??? = ? ?????? = ??? b. For a left-linear CFG, e.g., ? = ? ? ? ? ?, ? ???, e.g., ? = ? + ? ??? c. For the linear CFG ? = ? ? ? | ?, ? = ???? ??? d. For the linear CFG ? = ?1 ? ?1 | ?2 ? ?2 | ?, ? = ,?1?2?2?1,?2?1?1?2,?2?1?2?2?1?2, ??? 6
Ingredients of the Story Regular languages LCFLs, such as ???? Gottfried Wilhelm Leibniz Isaac Newton 7
Five Pieces of General CS Knowledge Five Pieces of General CS Knowledge 1. Regular languages a. ? ??? = ? ?????? = ??? b. For a left-linear CFG, e.g., ? = ? ? ? ? ?, ? ???, e.g., ? = ? + ? ??? c. For the linear CFG ? = ? ? ? | ?, ? = ???? ??? d. For the linear CFG ? = ?1 ? ?1 | ?2 ? ?2 | ?, ? = ,?1?2?2?1,?2?1?1?2,?2?1?2?2?1?2, ??? Composition of relations = two-step reachability = multiplication of Boolean adjacency matrices ? ?,? ? ?,? = ? ,?.(? ?,? ?(?,? ) ? = ? ) 2. = 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 = 8
Outline Outline Background on dataflow analysis Functional approach to interprocedural analysis Newtonian program analysis Newtonian program analysis via tensor product 9
Possibly Uninitialized Variables (PUV) May Contain Junk Start ??.{?,?,?} x = 3 {?,?,?} ??.? {?} if . . . {?,?} ??.? ??.? y = x {?,?} ??.?? ? ? ? ?? ? ? ???? ? {?} y = w {?,?} ??.?? ? ? ? ?? ? ? ???? ? {?} w = 8 {?} ??.? {?} printf(y) ?,? = ?,?
Possibly Uninitialized Variables (PUV) May Contain Junk Start ??.{?,?,?} x = 3 Reachable States {?,?,?} ??.? {?} if . . . {?,?} ??.? ??.? y = x {?,?} ??.?? ? ? ? ?? ? ? ???? ? {?} y = w ??.?? ? ? ? ?? ? ? ???? ? {?} {?,?} w = 8 {?} ??.? {?} printf(y) ?,? = ?,?
Possibly Uninitialized Variables (PUV) May Contain Junk Start ??.{?,?,?} x = 3 {?,?,?} ??.? {?} if . . . ?,? ?,?,? = {?,?,?} {?,?} ??.? ??.? y = x {?,?} {?,?,?} ??.?? ? ? ? ?? ? ? ???? ? {?} x = w {?,?} {?,?,?} ??.?? ? ? ? ?? ? ? ???? ? {?} w = 8 {?} {?,?,?} ??.? ??.? {?} printf(y) ?,?,? = ?,?,? {?,?} ?,?,? = ?,?,?
Path Semantics for Intraprocedural Analysis C ?1 ?2 ?? 1 ?? n start ???= ?1 ?2 ?? 1 ?? ??? ? = ???(?) ? ??? ???[?]
Path Semantics for Intraprocedural Analysis C ?1 ?2 ?? 1 ?? n start ???= ?1 ?2 ?? 1 ?? ?[?????,?] = ??? ? ??? ???[?] ? ?????,? : transition relation (two-vocabulary formula) To obtain the dataflow value at ?, perform ? ?????,? (C)
From Path Semantics to Equational Semantics ???? ?,? = ????,? when distributes over : for all ?,?,?, ? ? ? = ? ? ? ? ? ? ? = ? ? ? ? ?[?,?] = ??? Path semantics: ? ??? ???[?] ? ?,? = ?? Equational semantics: ? ?,? = ? ?,?? ?[??,?] ?? ? ? ??,? ????? s x s x m1 ?[s,s] = Id n mk Susan Graham Mark Wegman 15
Outline Outline Background on dataflow analysis Functional approach to interprocedural analysis Newtonian program analysis Newtonian program analysis via tensor product 16
start Swap Procedure ?? ( ) void swap(bool p1, bool p2) { if (*) { bool t = p1; p1 = p2; p2 = t; } else { p1 = p1 ^ p2; p2 = p1 ^ p2; p1 = p1 ^ p2; } } ? = ?1 ?1= ?1 ??? ?2 ?1= ?2 ?2= ?1 ??? ?2 ?2= ? ?1= ?1 ??? ?2 ???? ?[?????,????] Procedure summary for swap() 17
Inter Becomes Intra When Procedure Summaries are in Hand s x P ?2 ?1 ?2 ?1 ? ?[? ,? ] ? 18
Inter Becomes Intra When Procedure Summaries are in Hand s x P ?2 ?1 ?2 ?1 ??1 ? ? ,? ???1 ??2 ? ? ,? ???2 Now can solve using any intraprocedural method 19
Problem Statement Problem Statement Given a collection of possibly recursive procedures ? = {??}, and an abstract semantics, i.e., a transformer on each edge of each control-flow graph extend ( ) and combine ( ) operations, find a procedure summary for each ?? ? 20
start Swap Procedure ?? ( ) void swap(bool p1, bool p2) { if (*) { bool t = p1; p1 = p2; p2 = t; } else { p1 = p1 ^ p2; p2 = p1 ^ p2; p1 = p1 ^ p2; } } ? = ?1 ?1= ?1 ??? ?2 ?1= ?2 ?2= ?1 ??? ?2 ?2= ? ?1= ?1 ??? ?2 ???? 21
Predicate Abstraction (Boolean Programs) post-state assignment (?1 pre-state assignment (?1,?2) ,?2 ) (00) (01) (10) (11) (00) (01) (10) (11) (00) (01) (10) (11) (00) (01) (10) (11) (00) (01) (10) (11) (00) (01) (10) (11) ?1 ?1 ??? ?2 ?2 ?1 ??? ?2 ?1 ?1 ??? ?2 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0
Predicate Abstraction (Boolean Programs) (00) (01) (10) (11) (00) (01) (10) (11) (00) (01) (10) (11) (00) (01) (10) (11) (00) (01) (10) (11) (00) (01) (10) (11) ?1 ?1 ??? ?2 ?2 ?1 ??? ?2 ?1 ?1 ??? ?2 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 (00) (01) (10) (11) (00) (01) (10) (11) ?1 ?1 ??? ?2;?2 ?1 ??? ?2 1 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 23
Predicate Abstraction (Boolean Programs) (00) (01) (10) (11) (00) (01) (10) (11) (00) (01) (10) (11) (00) (01) (10) (11) (00) (01) (10) (11) (00) (01) (10) (11) (00) (01) (10) (11) (00) (01) (10) (11) ?1 ?1 ??? ?2 ?2 ?1 ??? ?2 ?1 ?1 ??? ?2 ????(?1,?2) 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 (00) (01) (10) (11) (00) (01) (10) (11) ?1 ?1 ??? ?2;?2 ?1 ??? ?2;?1 ?1 ??? ?2 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 24
start Swap Procedure ?? ( ) void swap(bool p1, bool p2) { if (*) { bool t = p1; p1 = p2; p2 = t; } else { p1 = p1 ^ p2; p2 = p1 ^ p2; p1 = p1 ^ p2; } } ? = ?1 ?1= ?1 ??? ?2 ?1= ?2 ?2= ?1 ??? ?2 ?2= ? ?1= ?1 ??? ?2 (00) (00) ? (01) (01) (10) (10) ???? (11) (11) 25
Predicate Abstraction (Boolean Programs) pre-state assignment (?1,?2,?) post-state assignment (?1 ,?2 ,? ) (000) (001) (010) (011) (100) (101) (110) (111) (000) (000) (001) (010) (011) (100) (101) (110) (111) (000) (001) (010) (011) (100) (101) (000) (001) (000) (001) (010) (011) (100) (101) (110) (111) (001) (010) (011) (100) (101) (110) (111) (010) (011) (100) (101) (110) (111) (110) (111) t ?1 ?1 ?2 ?2 ? 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 26
Predicate Abstraction (Boolean Programs) (000) (001) (010) (011) (100) (101) (110) (111) (000) (000) (001) (010) (011) (100) (101) (110) (111) (000) (001) (010) (011) (100) (101) (000) (001) (000) (001) (010) (011) (100) (101) (110) (111) (001) (010) (011) (100) (101) (110) (111) (010) (011) (100) (101) (110) (111) (110) (111) t ?1 ?1 ?2 ?2 ? 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 (000) (001) (000) (001) (010) (011) (100) (101) (110) (111) (010) (011) (100) (101) (110) (111) 27
Predicate Abstraction (Boolean Programs) (000) (001) (010) (011) (100) (101) (110) (111) (000) (000) (001) (010) (011) (100) (101) (110) (111) (000) (001) (010) (011) (100) (101) (000) (001) (000) (001) (010) (011) (100) (101) (110) (111) (001) (010) (011) (100) (101) (110) (111) (010) (011) (100) (101) (110) (111) (110) (111) t ?1 ?1 ?2 ?2 ? 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 (00) (00) (000) (001) (000) (001) (010) (011) (100) (101) (110) (111) (01) (01) (010) (011) (100) (101) (110) (111) exit scope (10) (10) (11) (11) ????(?1,?2) 1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 28
start Swap Procedure ?? ( ) void swap(bool p1, bool p2) { if (*) { bool t = p1; p1 = p2; p2 = t; } else { p1 = p1 ^ p2; p2 = p1 ^ p2; p1 = p1 ^ p2; } } ? = ?1 ?1= ?1 ??? ?2 ?1= ?2 ?2= ?1 ??? ?2 ?2= ? ?1= ?1 ??? ?2 (00) (00) (00) (00) (01) (01) (01) (01) (10) (10) (10) (10) ???? ?[?????,????] (11) (11) (11) (11) (00) (00) (01) (01) Procedure summary for swap() (10) (10) 29 (11) (11)
From Path Semantics to Equational Semantics ?[?,?] = ??? Path semantics: ? ??? ???[?] ? ?,? = ?? Equational semantics: ? ?,? = ? ?,?? ?[??,?] ?? ? ? ??,? ????? s x s x m1 ?[s,s] = Id n mk Susan Graham Mark Wegman 30
Interprocedural Equational Semantics s x ? ?,? = ?? ? ?,? = ? ?,?? ?[??,?] ?? ? ? ??,? ????? ?[s,s] = Id ? ?,? = ? ?,? ???,? ? ? ,? ???? ,? ? ? s x ?[?,?] m1 c r ???,? ???? ,? ? ? n mk ?[? ,? ] 31
Conventional Approach to Summary Functions Set up equation system: ? ?,? = ?? ??? ??? ? ?????????? ? ?,? = ? ?,?? ?[??,?] ??? ??? ? ?????????? ? ?????????? ??,? ????? Summary for called procedure ? ?,? = ? ?,? ???,? ? ? ,? ???? ,? ??? ??? ?,? ????????? ?,? ????? Summary for call site Solve using successive approximation (Kleene evaluation or chaotic iteration) Radhia Cousot 32 Micha Sharir Amir Pnueli Patrick Cousot
Outline Outline Background on dataflow analysis Functional approach to interprocedural analysis Newtonian program analysis Newtonian program analysis via tensor product 33
Big Picture: Dataflow Analysis Program Abstraction engine System of Dataflow Equations NPA: An alternative equation solver for finding procedure summaries Equation solver Solution (dataflow facts; procedure summaries) 34
Newtons Method for Finding Roots Newton s Method for Finding Roots y A way to find successively better approximations of a root of a function f(xi) f(xi+1) Given a function f, its derivative f and an initial x0, repeatedly perform x xi+1 xi xi+2 Newtonian Program Analysis: The same general idea repeatedly create and solve a linear model can be applied to programs, too Create a linear model of the function and use it to find a better approximation of the solution ? ?? ? ?? ??+1= ?? Isaac Newton 35
Newtons Method for Programs Kleene iteration ?(0)= ?(?+1)= ?( ?(?)) NPA iteration ?(0) = ?(?+1)= ? ?? ???????????????????? ?, ?(?)
Newtons Method for Programs Kleene iteration ?(0)= ?(?+1)= ?( ?(?)) NPA iteration ?(0) = ?(?+1)= ? ?? ???????????????????? ?, ?(?) Non-numeric equations with a commutative multiplication Numeric equations from probabilistic programs Dexter Kozen Mark Hopkins Mihalis Yannakakis Kousha Etessami 37 37
Newtons Method for Programs Kleene iteration ?(0)= ?(?+1)= ?( ?(?)) NPA iteration ?(0) = ?(?+1)= ? ?? ???????????????????? ?, ?(?) Non-numeric equations from interprocedural dataflow analysis: non-commutative multiplication 38 Javier Esparza Michael Luttenberger Stefan Kiefer 38
Newtons Method for Program Analysis Newton s Method for Program Analysis [Esparza et al.] Esparza et al. had to finesse certain differences Real-valued equations Algebraic semiring Mathematics Multiplication: x Program Analysis Abstract Compose: Join: Addition: + Root finding versus fixed-point finding? Derivatives? 39
Polynomial Equations for Summary Functions Polynomial Equations for Summary Functions proc X1 proc X2 x1() { a; x2(); } x2() { if (*) d; else{ b; x2(); x2(); c; } } b a X2 X2 d X2 X2 c Equation System ?1= ? ?2 ?2= ? ? ?2 ?2 ? 40
Polynomial Equations for Summary Functions Polynomial Equations for Summary Functions proc X1 proc X2 x1() { a; x2(); } x2() { if (*) d; else{ b; x2(); x2(); c; } } b a X2 X2 d X2 c Equation System This term is quadratic ?1= ? ?2 ?2= ? ? ?2 ?2 ? 41
Problematic Issues . . . Problematic Issues . . . We have no minus operation in dataflow analysis ? ? ? = 0 ?(?) = ? What the heck do we do about ? ? ? ?? ? ?? ??+1= ?? 42
Newtons Method for Programs Newton s Method for Programs [Esparza et al.] [Esparza et al.] Leibniz product rule Finesse the derivative question by fiat: ? ? = ?? ? ? ? = ?? ? ? Gottfried Wilhelm Leibniz Outer fixpoint computation Really a differential: ? ? NPA iteration ?(0) = ?(0) ?(?+1)= ?(?) where ?(?) is the least solution of ? = ? ?? ? ?|?? (?) ? ? ? ? ?? ? ?, modulo non-commutativity of if ?(?) is a constant if ?(?) = ? if ?(?) = ?(?) (?) if ? ? = ? ? (?) 0 ? ??(?)|?? ? (?)|?? ??(?)|?? ? ??(?)|?? = Inner fixpoint computation ? ? ? ? |?? ? Example 1: ? ? Linear correction term ? ? ? ? ? ? ? ? ? Example 2: ? ? ? ? 43
Polynomial Equations for Summary Functions Polynomial Equations for Summary Functions ?2 is the value of ?2 from the previous Newton round i.e., some constant for the current round Differentiated equations Original equations ?1= ? ?2 ?2= ? ? ?2 ?2 ? ? ?2 ?2 ? ? ?2 ?2 ? ? ?2 Apply the linearization operator ?1= ? ?2 ?2= ? ? ?2 ?2 ? y Each summand now only has one variable; this system of equations is linear! proc X1 proc X2 f(xi) b a X2 X2 xi x d xi+1 xi+2 X2 c 44
Polynomial Equations for Summary Functions Polynomial Equations for Summary Functions Differentiated equations Original equations ?1= ? ?2 ?2= ? ? ?2 ?2 ? ? ?2 ?2 ? ? ?2 ?2 ? procY1 ? ?2 Apply the linearization operator ?1= ? ?2 ?2= ? ? ?2 ?2 ? proc X1 Operational interpretation: ?2 is the current approximation At first call, use ?2; at second call, perform exploration (of the transformed procedure) At first call, perform exploration (of the transformed procedure); at second call, use ?2 Combine ( ) proc Y2 proc X2 a a b b b b a y2 y2 Y2 y2 X2 X2 Y2 d d X2 y2 Y2 y2 c c c c 45
Goal: Precise Solution of Linear Equations Pre-call action Post-call action start exit ??,? ??,? ??,2 Matched path ? ??,2 ??,1 ??,1 ???= ??,? ??,2 ??,1 ??,1 ??,2 ??,? = ??,? ??,2??,1??,1??,2 ??,? ? ?????,???? = ??? ? ???? ????? ??? ???? = ??,? ??,2??,1??,1??,2 ??,? ? ???? ????? ???[????]
Outline Outline Background on dataflow analysis Functional approach to interprocedural analysis Newtonian program analysis Newtonian program analysis via tensor product 47
Big Picture: Dataflow Analysis Program Abstraction engine System of Dataflow Equations NPA-TP: Combines NPA with Tarjan s method to create an improved NPA solver (for finding procedure summaries) Equation solver Solution (dataflow facts; procedure summaries) 48
A Misconception on My Part Polynomial Linear Interprocedural analysis Intraprocedural analysis On each Newton round, use a fast intraprocedural solver, such as Tarjan s path-expression method! 49
Tarjans Tarjan s Path Path- -Expression Method Expression Method FSM-to-regular-expression conversion == the most-general dataflow-analysis problem Dataflow analysis for a single control-flow graph (CFG) ( intraprocedural analysis ) For each ? ?????, find a regular expression ?? CFG = finite-state machine over alphabet of edges Convert FSM to a regular expression (Kleene s theorem) ?? characterizes all paths from the start node to ? For each client analysis Interpret each alphabet symbol as the abstract transformer of the corresponding CFG edge Interpret +, , and * using suitable operations , , and Evaluate each ??, ? ????? Robert Tarjan 50
A Misconception on My Part A Misconception on My Part Issue: Extend ( ) typically models the reversal of function composition: ? ? ? ? In general, is not commutative In a semiring ? ?2|?(?) = ? ? ? |?(?) = ? ? |?? ? ? ? ? |?? =? ? ? ? ? ? ? ? In calculus ? ?2|?(?) = ? ? ? |?(?) = =? ? +? ? = ? ? + ? ? = 2?? ? ? |?? ? +? ? ? |?? ? ?2 ?2 ? ?2 ? ?2 ? b ?2 ? ?2 = 2??? Esparza s linear equations are not left-linear or right- linear, which Tarjan requires 51