Importance of Data Visualisation in Analysis

on abstraction refinement for program analyses n.w
1 / 44
Embed
Share

Data visualisation plays a crucial role in the analysis process, aiding in exploring data, checking assumptions, generating hypotheses, and conveying results effectively. Understanding the principles and elements of good data visualisation is key to presenting information in a clear and concise manner.

  • Data Visualisation
  • Analysis
  • Principles
  • Effective Communication
  • Data Integrity

Uploaded on | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. On Abstraction Refinement for Program Analyses in Datalog Xin Zhang, Ravi Mangal, Mayur Naik Georgia Tech Radu Grigore, Hongseok Yang Oxford University

  2. Datalog for program analysis Datalog 2 Programming Language Design and Implementation, 2014 6/10/2014

  3. What is Datalog? Datalog 3 Programming Language Design and Implementation, 2014 6/10/2014

  4. What is Datalog? Input relations: Output relations: Rules: (1) path(i, i). (2) path(i, k) :- path(i, j), edge(j, k). edge(i, j). path(i, j). Least fixpoint computation: Input: edge(0, 1), edge(1, 2). path(0, 0). path(1, 1). path(2, 2). path(0, 1) :- path(0, 0), edge(0, 1). path(0, 2) :- path(0, 1), edge(1, 2). Datalog 4 Programming Language Design and Implementation, 2014 6/10/2014

  5. Why Datalog? If there exists a path from a to b, and there is an edge from b to c, then there exists a path from a to c: path(a, c) :- path(a, b), edge(b, c). Datalog 5 Programming Language Design and Implementation, 2014 6/10/2014

  6. Why Datalog? k-object-sensitivity, k = 2, ~100KLOC 6 Programming Language Design and Implementation, 2014 6/10/2014

  7. Limitation k-object-sensitivity, k = 10, ~500KLOC k-object-sensitivity, k = 2, ~100KLOC 7 Programming Language Design and Implementation, 2014 6/10/2014

  8. Program abstraction Abstraction Precision Scalability 8 Programming Language Design and Implementation, 2014 6/10/2014

  9. Parametric program abstraction 1 1 1 1 1 Abstraction Precision Scalability 9 Programming Language Design and Implementation, 2014 6/10/2014

  10. Parametric program abstraction 1 0 1 1 0 Abstraction Precision Scalability 10 Programming Language Design and Implementation, 2014 6/10/2014

  11. Parametric program abstraction: Example 1 Cloning depth K for each call site and allocation site 1 0 1 1 0 Pointer Analysis 11 Programming Language Design and Implementation, 2014 6/10/2014

  12. Parametric program abstraction: Example 2 Predicates to use as abstraction predicates 1 0 1 1 0 Shape Analysis 12 Programming Language Design and Implementation, 2014 6/10/2014

  13. Program abstraction 1 0 1 1 0 0 1 1 0 0 Datalog Program Datalog Program alias(p, q)? alias(m, n)? 13 Programming Language Design and Implementation, 2014 6/10/2014

  14. Program abstraction 1 0 1 1 0 0 1 1 0 0 Counterexample guided refinement (CEGAR) via MAXSAT Datalog Program Datalog Program alias(p, q)? alias(m, n)? 14 Programming Language Design and Implementation, 2014 6/10/2014

  15. Pointer analysis example f(){ v2 = id1(v1); v3 = id2(v2); q2:assert(v3!= v1); } g(){ v5 = id1(v4); v6 = id2(v5); q1:assert(v6!= v1); } v1 = new ...; v4 = new ...; id1(v){return v;} id2(v){return v;} 15 Programming Language Design and Implementation, 2014 6/10/2014

  16. Pointer analysis as graph reachability 0 3 a0 b0 b1 a1 6 6 6 a0 b0 b1 a1 1 4 c1 d0 c0 d1 7 7 7 c0 d1 d0 c1 2 5 16 Programming Language Design and Implementation, 2014 6/10/2014

  17. Graph reachability in Datalog 0 3 Input relations: edge(i, j, n), abs(n) a0 b0 b1 a1 6 6 6 Output relations: path(i, j) a0 b0 b1 a1 1 4 c1 d0 c0 d1 Rules: (1) path(i, i). (2) path(i, j) :- path(i, k), edge(k, j, n), abs(n). 7 7 7 c0 d1 d0 c1 2 5 Input tuples: edge(0, 6, a0), edge(0, 6 , a1), edge(3, 6, b0), abs(a0) abs(a1), abs(b0) abs(b1), abs(c0) abs(c1), abs(d0) abs(d1). Query Tuple q1: path(0, 5) assert(v6!= v1) q2: path(0, 2) assert(v3!= v1) Original Query 16 possible abstractions in total 17 Programming Language Design and Implementation, 2014 6/10/2014

  18. Desired result 0 3 Input relations: edge(i, j, n), abs(n) a0 b0 b1 a1 6 6 6 Output relations: path(i, j) a0 b0 b1 a1 1 4 c1 d0 c0 d1 Rules: (1) path(i, i). (2) path(i, j) :- path(i, k), edge(k, j, n), abs(n). 7 7 7 c0 d1 d0 c1 2 5 Input tuples: edge(0, 6, a0), edge(0, 6 , a1), edge(3, 6, b0), abs(a0) abs(a1), abs(b0) abs(b1), abs(c0) abs(c1), abs(d0) abs(d1). Query Answer a1b0c1d0 Impossibility q1: path(0, 5) q2: path(0, 2) 18 Programming Language Design and Implementation, 2014 6/10/2014

  19. Iteration 1 0 3 path(0, 0). path(0, 6) :- path(0, 0), edge(0, 6, a0), abs(a0). path(0, 1) :- path(0, 6), edge(6, 1, a0), abs(a0). path(0, 7) :- path(0, 1), edge(1, 7, c0), abs(c0). path(0, 2) :- path(0, 7), edge(7, 2, c0), abs(c0). path(0, 4) :- path(0, 6), edge(6, 4, b0), abs(b0). path(0, 7) :- path(0, 4), edge(4, 7, d0), abs(d0). path(0, 5) :- path(0, 7), edge(7, 5, d0), abs(d0). a0 b0 b1 a1 6 6 6 a0 b0 b1 a1 1 4 c1 d0 c0 d1 7 7 7 c0 d1 d0 c1 2 5 Query Eliminated Abstractions q1: path(0, 5) q2: path(0, 2) abs(a0) abs(a1), abs(b0) abs(b1), abs(c0) abs(c1), abs(d0) abs(d1). 19 Programming Language Design and Implementation, 2014 6/10/2014

  20. Iteration 1 - derivation graph 0 3 a0 b0 b1 a1 6 6 6 a0 b0 b1 a1 1 4 c1 d0 c0 d1 7 7 7 c0 d1 d0 c1 2 5 Query Eliminated Abstractions q1: path(0, 5) q2: path(0, 2) abs(a0) abs(a1), abs(b0) abs(b1), abs(c0) abs(c1), abs(d0) abs(d1). 20 Programming Language Design and Implementation, 2014 6/10/2014

  21. Iteration 1 - derivation graph abs(a0) path(0,0) edge(0,6,a0) abs(b0) abs(a0) edge(6,1,a0) path(0,6) edge(6,4,b0) abs(d0) edge(4,7,d0) abs(c0) edge(1,7,c0) path(0,1) path(0,4) abs(c0) edge(7,2,c0) edge(7,5,d0) abs(d0) path(0,7) path(0,2) path(0,5) 21 Programming Language Design and Implementation, 2014 6/10/2014

  22. Iteration 1 - derivation graph abs(a0) path(0,0) edge(0,6,a0) abs(b0) abs(a0) edge(6,1,a0) path(0,6) edge(6,4,b0) abs(d0) edge(4,7,d0) abs(c0) edge(1,7,c0) path(0,1) path(0,4) abs(c0) edge(7,2,c0) edge(7,5,d0) abs(d0) path(0,7) path(0,2) path(0,5) a0 c0 22 Programming Language Design and Implementation, 2014 6/10/2014

  23. Iteration 1 - derivation graph abs(a0) path(0,0) edge(0,6,a0) abs(b0) abs(a0) edge(6,1,a0) path(0,6) edge(6,4,b0) abs(d0) edge(4,7,d0) abs(c0) edge(1,7,c0) path(0,1) path(0,4) abs(c0) edge(7,2,c0) edge(7,5,d0) abs(d0) path(0,7) path(0,2) path(0,5) a0 c0d0 23 Programming Language Design and Implementation, 2014 6/10/2014

  24. Iteration 1 - derivation graph abs(a0) path(0,0) edge(0,6,a0) abs(b0) abs(a0) edge(6,1,a0) path(0,6) edge(6,4,b0) abs(d0) edge(4,7,d0) abs(c0) edge(1,7,c0) path(0,1) path(0,4) abs(c0) edge(7,2,c0) edge(7,5,d0) abs(d0) path(0,7) path(0,2) path(0,5) a0b0 d0 24 Programming Language Design and Implementation, 2014 6/10/2014

  25. Iteration 1 - derivation graph 0 3 a0 b0 b1 a1 6 6 6 a0 b0 b1 a1 1 4 c1 d0 c0 d1 7 7 7 c0 d1 d0 c1 2 5 Query Eliminated Abstractions a0 c0d0, a0b0 d0(4/16) a0 c0 (4/16) q1: path(0, 5) q2: path(0, 2) abs(a0) abs(a1), abs(b0) abs(b1), abs(c0) abs(c1), abs(d0) abs(d1). 25 Programming Language Design and Implementation, 2014 6/10/2014

  26. Encoded as MAXSAT Hard Constraints Soft Constraints MAXSAT(??, ??,??, ,(??,??)): Find solution ?that Maximize ?? ? ??,1 ? ?} Subject to ? ?0 26 Programming Language Design and Implementation, 2014 6/10/2014

  27. Encoded as MAXSAT Avoid all the counterexamples Hard constraints: ??? (0,0) (??? (0,6) ??? (0,0) ???(?0)) (??? (0,1) ??? (0,6) ???(?0)) (??? (0,7) ??? (0,1) ???(?0)) (??? (0,4) ??? (0,6) ???(?0)) Minimize the abstraction cost Soft constraints: (??? ?0 ?????? ?) (??? ?0 ?????? ?) (??? ?0 ?????? ?) (??? ?0 ?????? ?) ( ??? 0,2 ?????? ?) ( ??? 0,5 ?????? ?) abs(a0) abs(a1), abs(b0) abs(b1), abs(c0) abs(c1), abs(d0) abs(d1). 27 Programming Language Design and Implementation, 2014 6/10/2014

  28. Encoded as MAXSAT Solution: ??? 0,0 = true,??? 0,6 = false, ??? 0,1 = false,??? 0,4 = false, ??? 0,7 = false,??? 0,2 = false, ??? 0,5 = false,?,?? 0,6 = 0, ??? ?0 = false,??? ?0 = true, ??? ?0 = true,??? ?0 = true. Hard constraints: ??? (0,0) (??? (0,6) ??? (0,0) ???(?0)) (??? (0,1) ??? (0,6) ???(?0)) (??? (0,7) ??? (0,1) ???(?0)) (??? (0,4) ??? (0,6) ???(?0)) Soft constraints: a1b0c0d0 (??? ?0 ?????? ?) (??? ?0 ?????? ?) (??? ?0 ?????? ?) (??? ?0 ?????? ?) ( ??? 0,2 ?????? ?) ( ??? 0,5 ?????? ?) Query Eliminated Abstractions a0 c0d0, a0b0 d0(4/16) a0 c0 (4/16) q1: path(0, 5) q2: path(0, 2) 28 Programming Language Design and Implementation, 2014 6/10/2014

  29. Iteration 2 and beyond Iteration 1 Derivation ?? Constraints Datalog solver MAXSAT solver ?? ?? a1b0c0d0 Query Answer Eliminated Abstractions a0 c0d0, a0b0 d0, a1 c0d0 (4/16) a0 c0 , a1 c0 (4/16) q1: path(0, 5) q2: path(0, 2) 29 Programming Language Design and Implementation, 2014 6/10/2014

  30. Iteration 2 and beyond Iteration 2 Derivation ?? Constraints Datalog solver MAXSAT solver ?? ?? a1b0c0d0 Query Answer Eliminated Abstractions a0 c0d0, a0b0 d0, a1 c0d0 a0 c0 , a1 c0 (4/16) q1: path(0, 5) q2: path(0, 2) (4/16) 30 Programming Language Design and Implementation, 2014 6/10/2014

  31. Iteration 2 and beyond Iteration 2 Derivation ?? Constraints Datalog solver MAXSAT solver ?? ?? ?? a1b0c0d0 Query Answer Eliminated Abstractions a0 c0d0, a0b0 d0, a1 c0d0 (4/16) a0 c0 (4/16) q1: path(0, 5) q2: path(0, 2) 31 Programming Language Design and Implementation, 2014 6/10/2014

  32. Iteration 2 and beyond Iteration 2 Derivation ?? Constraints Datalog solver MAXSAT solver ?? ?? ?? a1b0c1d0 Query Answer Eliminated Abstractions a0 c0d0, a0b0 d0, a1 c0d0 (6/16) a0 c0 , a1 c0 (8/16) q1: path(0, 5) q2: path(0, 2) 32 Programming Language Design and Implementation, 2014 6/10/2014

  33. Iteration 2 and beyond Iteration 3 Derivation ?? Constraints Datalog solver MAXSAT solver ?? ?? ?? q1 is proven. a1b0c1d0 Query Answer Eliminated Abstractions a0 c0d0, a0b0 d0, a1 c0d0 a0 c0 , a1 c0 (8/16) q1: path(0, 5) q2: path(0, 2) (6/16) a1b0c1d0 33 Programming Language Design and Implementation, 2014 6/10/2014

  34. Iteration 2 and beyond Iteration 3 Derivation ?? Constraints Datalog solver MAXSAT solver ?? ?? ?? ?? q2 is impossible to prove. q1 is proven. a1b0c1d0 Query Answer Eliminated Abstractions a0 c0d0, a0b0 d0, a1 c0d0 a0 c0 , a1 c0 , a1 c1 , a0 c1 (16/16) q1: path(0, 5) q2: path(0, 2) (6/16) a1b0c1d0 Impossibility 34 Programming Language Design and Implementation, 2014 6/10/2014

  35. Mixing counterexamples Iteration 1 Iteration 3 Eliminated Abstractions: a0 c0 a1 c1 35 Programming Language Design and Implementation, 2014 6/10/2014

  36. Mixing counterexamples Iteration 1 Mixed! Iteration 3 Eliminated Abstractions: a0 c0 a0 c1 a1 c1 36 Programming Language Design and Implementation, 2014 6/10/2014

  37. Experimental setup Implemented in JChord using off-the-shelf solvers: Datalog: bddbddb MAXSAT: MiFuMaX Applied to two analyses that are challenging to scale: k-object-sensitivity pointer analysis: flow-insensitive, weak updates, cloning-based typestate analysis: flow-sensitive, strong updates, summary-based Evaluated on 8 Java programs from DaCapo and Ashes. 37 Programming Language Design and Implementation, 2014 6/10/2014

  38. Benchmark characteristics classes 1K 1K 1.2K 1K 1.1K 1.3K 1.2K 1.9k methods 6K 6.5K 8K 7K 7.7K 7.9K 8K 12K bytecode(KB) 423 434 504 442 532 508 511 708 KLOC 258 265 326 283 303 295 314 460 toba-s javasrc-p weblech hedc antlr luindex lusearch schroeder-m 38 Programming Language Design and Implementation, 2014 6/10/2014

  39. Results: pointer analysis 4-object-sensitivity < 50% queries abstraction size resolved iterations total current 7 46 5 47 143 138 322 51 baseline 0 0 2 6 5 67 29 25 final 170 470 140 730 970 1K 1K 450 max 18K 18K 31K 29K 29K 40K 39K 58K toba-s javasrc-p weblech hedc antlr luindex lusearch schroeder-m 7 46 5 47 143 138 322 51 10 13 10 18 15 26 17 15 < 3% of max 39 Programming Language Design and Implementation, 2014 6/10/2014

  40. Performance of Datalog: pointer analysis k = 4, 3h28m Baseline k = 3, 590s k = 2, 214s lusearch k = 1, 153s 40 Programming Language Design and Implementation, 2014 6/10/2014

  41. Performance of MAXSAT: pointer analysis lusearch 41 Programming Language Design and Implementation, 2014 6/10/2014

  42. Statistics of MAXSAT formulae pointer analysis variables clauses 0.7M 1.5M toba-s javasrc-p weblech hedc antlr luindex lusearch schroeder-m 0.5M 0.9M 1.6M 3.3M 1.2M 2.7M 3.6M 6.9M 2.4M 5.6M 2.1M 5M 6.7M 23.7M 42 Programming Language Design and Implementation, 2014 6/10/2014

  43. Conclusion Datalog MAXSAT Abstraction 43 Programming Language Design and Implementation, 2014 6/10/2014

  44. Conclusion Datalog MAXSAT A(x, y):- B(x, z), C(z, y) Soundness Hard Constraints ? ?,? ? ?,? ?(?,?) Tradeoffs 1 0 1 1 0 Scalability vs. Precision Soft Constraints ??? ?0 ?????? ? Sound vs. Complete 44 Programming Language Design and Implementation, 2014 6/10/2014

Related


More Related Content