Predicting a Correct Program in Programming by Examples

Slide Note

In this research, Rishabh Singh and Sumit Gulwani from Microsoft Research delve into the realm of predicting correct programs using Programming by Examples (PBE). They explore the challenges, ambiguities, and solutions involved in this innovative approach, shedding light on the nuances of Excel functions, FlashFill, handling ambiguity, and structuring the hypothesis space with sharing in Version-space Associative Expressions.

terr_606 Follow

Uploaded on Feb 17, 2025 | 1 Views

Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

Download Presentation

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript

Predicting a Correct Program in PBE Rishabh Singh, Microsoft Research Sumit Gulwani, Microsoft Research

Programming By Examples Ambiguity! Intuitive Natural Accessible

Excel Forums 300_w1_aniSh_c1_b w1 =MID( 300_w1_aniSh_c1_b ,5,2)

Excel Forums 300_w30_aniSh_c1_b w30 =MID($B:$B,FIND( _ ,$B:$B)+1, FIND( _ ,REPLACE($B:$B,1,FIND( _ ,$B:$B), ))-1)

FlashFill[Gulwani POPL2011][Gulwani,Harris,Singh CACM 2012] Benchmarks Heuristics Program VSA DSL

Benchmarks Ranking Program VSA DSL

Handling Ambiguity Input Input Output Output Mr. Rick Rick Rashid Satya Nadella

Prefer non-constants Input Input Output Output Mr. Rick Ms. Ms. Satya Satya Rick Rashid Satya Nadella Prefer smaller substrings as constants

Prefer smaller constants Input Input Output Output S. Nadella Satya Nadella Bill Gates 2nd word, last word, 2nd capital followed by 2ndlowercase string .

With great power comes great responsibility. Machine Learning for Ranking

Three Challenges Labelled Training Data Machine Learning Algorithm Efficient Ranking Algorithm

Training Data Generation Training Data Generation Input Input Output Output Mr. Rashid Mr. Nadella Mr. Lee Rick Rashid Satya Nadella Peter Lee

Structuring Hypothesis Space with Sharing in Version-space Associative Expressions f(e1, f(e2, f(e3, e4))) DAG-based sharing Fixed-arity Expressions f(e1, e2, e3, e4) Set-based sharing

Ranking Function f(p) Assume Linear Function f(p) = w1* f1 + w2*f2+ + wk*fk

Learning To Rank Listwise Approach Logistic Regression All relevant pages over irrelevant Didn t work well Too strong a constraint

Training Phase Input Input Output Output Mr. Rick Mr. Satya Mr. Lee Rick Rashid Satya Nadella Peter Lee Goal: Find ranking function f(p) over program features that ranks positive programs higher than negative programs Lower 1st uppercase letter Constant r Lower 2nd upper case letter .

Learn DAGs ?50 ?10 ?9 ?13 ?12 ?11 3 6 8 0 1 2 4 5 7 ?1 ?2 ?4 ?5 ?3 ?8 ?6 ?7 Rick Rashid Mr. Rashid ?50 ?13 ?12 ?11 3 6 8 0 1 2 4 5 7 ?1 ?2 ?4 ?5 ?3 ?8 ?6 ?7 Satya Nadella Mr. Satya

Intersect DAGs Rick Rashid Mr. Rick Satya Nadella Mr. Satya

Assign Positive Labels Rick Rashid Mr. Rick Satya Nadella Mr. Satya

Assign Negative Labels Rick Rashid Mr. Rick Satya Nadella Mr. Satya

Rick Rashid Mr. Rick Satya Nadella Mr. Satya Learn ranking function f(p) that ranks programs higher than programs.

Training Phase Rank any positive program over all negative programs Negative Programs Positive Programs

Hierarchical Ranking Substring Expression Frequency of tokens, context, neighborhood, Atomic Expression Length of substring, input, output, constant, Concat Expression Number of Arguments, sum, max, min, prod

Evaluation 175 benchmarks 30-70 train-test partition Baseline (Occam s razor): Smallest & Simplest programs

Ranking Evaluation Number of Examples for Learning the Test Task 6 Baseline LearnRank Number of Examples 5 4 3 2 1 0 1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67 70 73 76 79 82 85 88 91 94 97 100103106109112115118121 Benchmarks LearnRank learns from 1 example for 79% benchmarks

Efficiency of Ranking

Ranking for PBE Machine Learning + Synthesis General Loss Function for PBE VSA Sharing Formalization Efficient Features & Algorithms Thanks! risin@microsoft.com

Predicting a Correct Program in Programming by Examples

Download Presentation

Presentation Transcript

Related

More Related Content