Accelerating Program Analyses by Cross-Program Training

Accelerating Program Analyses by Cross-Program Training
Slide Note
Embed
Share

Introducing a new approach to accelerate inter-procedural static analysis by synergistically combining compositional and whole-program methods. The Polymer framework offers speedup benefits for call-graph and pointer analyses on Java programs from the Dacapo benchmark suite.

  • Program Analyses
  • Cross-Program Training
  • Polymer Framework
  • Static Analysis
  • Java Programs

Uploaded on Apr 13, 2025 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Accelerating Program Analyses by Cross-Program Training Sulekha Kulkarni, Ravi Mangal, Xin Zhang, Mayur Naik Georgia Tech

  2. Motivation App Library 2 4/13/2025

  3. Motivation App A App B Library 3 4/13/2025

  4. Existing Approach 1: Whole-Program Analysis App A App B App B Library 4 4/13/2025

  5. Existing Approach 2: Compositional Analysis App A App B App B Library 5 4/13/2025

  6. An Idea Can we synergistically combine the laziness of the whole-program approach with the reuse of the compositional approach? 6 4/13/2025

  7. Our Contributions New approach to accelerate inter-procedural static analysis Synergistically combines compositional and whole-program approaches Polymer: generic framework for analyses written in Datalog Requires pre/post condition specification per analysis Demonstrated speedup for two popular analyses 2.6x for call-graph analysis and 5.2x for pointer analysis on 10 Java programs from Dacapo benchmark suite 7 4/13/2025

  8. Overall Architecture Polymer Analysis + Pre/Post Spec Analysis Result Online Phase Offline Phase Summaries App D App A App B App C Library How to define a summary for an analysis in Datalog? Which summaries to learn in the Offline phase? Which summaries to reuse in the Online phase? 8 4/13/2025

  9. Graph Reachability Example A0 B0 L1 L3 L2 Input: A directed graph with: - a distinguished root node - each node in application or library L4 A1 B1 L5 Problem: Which application nodes are reachable from the root node? L6 B2 L7 L8 A2 B3 9 4/13/2025

  10. Our Definition of a Summary A0 B0 ????????? ?????????? L1 L3 L2 Examples: L4 {?1} {?2, ?3, ?4} (1) A1 B1 {?5} {?6, ?7, ?8} (2) {?2} ?4 L5 (3) L6 B2 L7 In general, there are multiple valid choices for summaries. L8 A2 B3 10 4/13/2025

  11. Choosing Good Summaries Examples: A0 B0 L1 {?0} ?1,?2,?3,?4 (1) L3 L2 {?2} {?4} (2) L4 A1 B1 L5 {?1} {?2,?3,?4} (3) {?5} {?6,?7, ?8} (4) L6 B2 L7 PickGoodPre is a specification for ????????? L8 PickGoodPre returns { ?1 , ?5 } A2 B3 11 4/13/2025

  12. Choosing Good Summaries (contd.) A0 B0 There are multiple choices for ?????????? L1 {?1} {?2, ?3, ?4} (1) L3 L2 L4 {?1} {?4} (2) A1 B1 Pruned! L5 Pruning allows better performance by eliminating intermediate analysis facts L6 B2 L7 L8 A2 B3 12 4/13/2025

  13. Ensuring Soundness More pruning scenarios: A0 B0 L1 {?5} {?6, ?7, ?8} (1) L3 L2 {?5} {?8} (2) L4 Unsound for B A1 B1 How to ensure that a pruned summary can be used soundly? L5 Soundness condition: A pruned summary is sound w.r.t to a particular program if every tuple that is pruned away is ONLY used to derive facts contained in the unpruned summary L6 B2 L7 L8 A2 B3 13 4/13/2025

  14. Ensuring Soundness (contd.) PickGoodPost is a specification for ?????????? what to prune away checking function for sound reuse A0 B0 L1 L3 L2 L4 Na ve:{?1} {?2, ?3, ?4} (1) After pruning: {?1} {?4} (2) Checking function: No edge from ?2, ?3 to app nodes A1 B1 L5 L6 Na ve:{?5} {?6, ?7, ?8} (3) After pruning: {?5} ?8 (4) Checking function: No edge from ?6, ?7 to app nodes B2 L7 L8 A2 B3 14 4/13/2025

  15. Putting it All Together PickGoodPre : Impacts performance A0 B0 L1 PickGoodPost : Impacts performance and soundness L3 L2 L4 Specifies what can be pruned A1 B1 Provides a checking function to prevent the unsound use of pruned summaries L5 Main Theorem: If PickGoodPost satisfies the soundness condition, then reusing learnt summaries produces the same analysis results as analyzing from scratch. L6 B2 L7 L8 A2 B3 15 4/13/2025

  16. Experimental Setup Implemented Polymer using bddbddb Datalog solver and Chord framework for analyzing Java bytecode Applied to two analyses: call-graph analysis flow- and context-insensitive pointer analysis flow- and context-sensitive, on-the-fly call graph Evaluated on 10 Java programs from Dacapo benchmark suite (208-419 KLOC each) 16 4/13/2025

  17. Evaluation Methodology Speedup Baseline: Uses no training data Ideal: Uses best possible training data Actual: Uses training data from a realistic corpus Sensitivity to training data Functionality: Train on subset of library modules Abstraction: Train on deeper library modules 17 4/13/2025

  18. Benchmark Characteristics classes app 109 78 277 181 189 193 173 348 165 42 methods app 873 523 2,651 1,461 2,441 1,316 1,119 2,590 1,328 372 bytecode(KB) app 81 35 195 101 190 99 77 186 117 28 KLOC app 26 16 59 53 96 38 33 46 25 9 total 1,091 1,062 1,269 1,756 1,341 1,175 1,157 1,357 1,894 1,036 total 7,220 6,905 9,133 11,450 10,223 7,741 7,601 9,105 13,356 6,772 total total 224 214 258 366 322 237 231 247 419 208 antlr avrora bloat chart hsqldb luindex lusearch pmd sunflow xalan 467 423 586 778 670 487 477 578 934 417 18 4/13/2025

  19. Pointer Analysis: Speedup 19 4/13/2025

  20. Pointer Analysis: Varying Library Functionality 20 4/13/2025

  21. Pointer Analysis: Varying Library Abstraction Sunflow 21 4/13/2025

  22. Conclusion Practical programs share many large modules of code Polymer accelerates declarative analyses on such programs Requires analysis-specific functions PickGoodPre and PickGoodPost to learn summaries Soundly reuses these summaries to accelerate the analysis of unseen programs Polymer achieves average speedups of 2.6x for call graph analysis and 5.2x for the points-to analysis A step towards scaling analyses for Big Code applications 22 4/13/2025

Related


More Related Content