Predicting a Correct Program in Programming by Examples

Predicting a Correct Program in Programming by Examples
Slide Note
Embed
Share

In this research, Rishabh Singh and Sumit Gulwani from Microsoft Research delve into the realm of predicting correct programs using Programming by Examples (PBE). They explore the challenges, ambiguities, and solutions involved in this innovative approach, shedding light on the nuances of Excel functions, FlashFill, handling ambiguity, and structuring the hypothesis space with sharing in Version-space Associative Expressions.

  • Programming
  • Examples
  • Research
  • Microsoft
  • Excel

Uploaded on Feb 17, 2025 | 1 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Predicting a Correct Program in PBE Rishabh Singh, Microsoft Research Sumit Gulwani, Microsoft Research

  2. Programming By Examples Ambiguity! Intuitive Natural Accessible

  3. Excel Forums 300_w1_aniSh_c1_b w1 =MID( 300_w1_aniSh_c1_b ,5,2)

  4. Excel Forums 300_w30_aniSh_c1_b w30 =MID($B:$B,FIND( _ ,$B:$B)+1, FIND( _ ,REPLACE($B:$B,1,FIND( _ ,$B:$B), ))-1)

  5. FlashFill[Gulwani POPL2011][Gulwani,Harris,Singh CACM 2012] Benchmarks Heuristics Program VSA DSL

  6. Benchmarks Ranking Program VSA DSL

  7. Handling Ambiguity Input Input Output Output Mr. Rick Rick Rashid Satya Nadella

  8. Prefer non-constants Input Input Output Output Mr. Rick Ms. Ms. Satya Satya Rick Rashid Satya Nadella Prefer smaller substrings as constants

  9. Prefer smaller constants Input Input Output Output S. Nadella Satya Nadella Bill Gates 2nd word, last word, 2nd capital followed by 2ndlowercase string .

  10. With great power comes great responsibility. Machine Learning for Ranking

  11. Three Challenges Labelled Training Data Machine Learning Algorithm Efficient Ranking Algorithm

  12. Training Data Generation Training Data Generation Input Input Output Output Mr. Rashid Mr. Nadella Mr. Lee Rick Rashid Satya Nadella Peter Lee

  13. Structuring Hypothesis Space with Sharing in Version-space Associative Expressions f(e1, f(e2, f(e3, e4))) DAG-based sharing Fixed-arity Expressions f(e1, e2, e3, e4) Set-based sharing

  14. Ranking Function f(p) Assume Linear Function f(p) = w1* f1 + w2*f2+ + wk*fk

  15. Learning To Rank Listwise Approach Logistic Regression All relevant pages over irrelevant Didn t work well Too strong a constraint

  16. Training Phase Input Input Output Output Mr. Rick Mr. Satya Mr. Lee Rick Rashid Satya Nadella Peter Lee Goal: Find ranking function f(p) over program features that ranks positive programs higher than negative programs Lower 1st uppercase letter Constant r Lower 2nd upper case letter .

  17. Learn DAGs ?50 ?10 ?9 ?13 ?12 ?11 3 6 8 0 1 2 4 5 7 ?1 ?2 ?4 ?5 ?3 ?8 ?6 ?7 Rick Rashid Mr. Rashid ?50 ?13 ?12 ?11 3 6 8 0 1 2 4 5 7 ?1 ?2 ?4 ?5 ?3 ?8 ?6 ?7 Satya Nadella Mr. Satya

  18. Intersect DAGs Rick Rashid Mr. Rick Satya Nadella Mr. Satya

  19. Assign Positive Labels Rick Rashid Mr. Rick Satya Nadella Mr. Satya

  20. Assign Negative Labels Rick Rashid Mr. Rick Satya Nadella Mr. Satya

  21. Rick Rashid Mr. Rick Satya Nadella Mr. Satya Learn ranking function f(p) that ranks programs higher than programs.

  22. Training Phase Rank any positive program over all negative programs Negative Programs Positive Programs

  23. Hierarchical Ranking Substring Expression Frequency of tokens, context, neighborhood, Atomic Expression Length of substring, input, output, constant, Concat Expression Number of Arguments, sum, max, min, prod

  24. Evaluation 175 benchmarks 30-70 train-test partition Baseline (Occam s razor): Smallest & Simplest programs

  25. Ranking Evaluation Number of Examples for Learning the Test Task 6 Baseline LearnRank Number of Examples 5 4 3 2 1 0 1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67 70 73 76 79 82 85 88 91 94 97 100103106109112115118121 Benchmarks LearnRank learns from 1 example for 79% benchmarks

  26. Efficiency of Ranking

  27. Ranking for PBE Machine Learning + Synthesis General Loss Function for PBE VSA Sharing Formalization Efficient Features & Algorithms Thanks! risin@microsoft.com

More Related Content