Synthesizing Programs with Semantically Distinct Solutions
This lecture explores the challenges of synthesizing programs under a distribution, emphasizing the underspecification problem: many semantically distinct programs can satisfy a given set of input-output examples. It covers priors over programs (length minimization and probabilistic grammars), bottom-up and top-down explicit search strategies, symbolic search and an extended CEGIS loop that find high-probability correct solutions, learning the parameters of a probabilistic grammar, and sampling programs subject to constraints.
Lecture 21: Synthesizing under a Distribution (Armando Solar-Lezama)
Programming by Example
Given only a handful of input-output examples, the problem is hopelessly underspecified: many semantically distinct programs can satisfy the examples. The goal becomes finding the most likely program, i.e. maximizing the posterior P(prog | examples) ∝ P(examples | prog) · P(prog).
Key elements
Any program that does not match the I/O examples has probability 0. All programs that do match the I/O examples have the same likelihood P(examples | prog), so among matching programs the posterior is determined entirely by the prior P(prog).
Length minimization
Shorter programs are preferred over longer ones. One such prior: P(p) = (1/Z) · 2^(-len(p)) if p matches the examples, and P(p) = 0 otherwise, where Z normalizes over the matching programs.
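As a concrete illustration, here is a minimal sketch (not from the lecture; the candidate set, lengths, and failing program are made up) of such a length-based prior normalized over a finite candidate set:

```python
# Sketch: a length-based prior over a finite candidate set,
# P(p) proportional to 2^-len(p) for programs that satisfy the
# examples and 0 otherwise, normalized by Z.

def length_prior(candidates, satisfies):
    """candidates: list of (name, length); satisfies: name -> bool."""
    weights = {name: (2.0 ** -length if satisfies(name) else 0.0)
               for name, length in candidates}
    z = sum(weights.values())  # normalization constant over matching programs
    return {name: w / z for name, w in weights.items()}

probs = length_prior(
    [("x+2", 3), ("x+1+1", 5), ("x*3", 3)],
    lambda p: p != "x*3",   # pretend x*3 fails the I/O examples
)
```

The shortest matching program ends up with most of the probability mass.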
Bottom-Up Explicit Search
When discarding observationally equivalent programs, keep the shortest one. This search naturally finds shortest programs first. Key property: every subprogram of the shortest program is itself a shortest program for its behavior.
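The idea can be sketched as follows (a toy arithmetic language over one input x is assumed; the operators and terminals are illustrative, not the lecture's):

```python
# Sketch: bottom-up enumeration that discards observationally
# equivalent programs, keeping the first (hence shortest) member of
# each equivalence class, so shortest programs are found first.

def bottom_up(inputs, target_outputs, max_size=8):
    # seen maps a tuple of outputs (the program's behavior on the
    # example inputs) to the first, hence shortest, program found.
    terminals = [("x", lambda x: x), ("1", lambda x: 1), ("2", lambda x: 2)]
    seen = {}
    by_size = {1: []}
    for expr, f in terminals:
        sig = tuple(f(i) for i in inputs)
        if sig not in seen:
            seen[sig] = expr
            by_size[1].append((expr, f, sig))
    for size in range(2, max_size + 1):
        by_size[size] = []
        # combine smaller programs with + and * (subtree sizes add, +1 for the op)
        for ls in range(1, size - 1):
            rs = size - 1 - ls
            for (le, lf, _) in by_size.get(ls, []):
                for (re, rf, _) in by_size.get(rs, []):
                    for op, g in [("+", lambda a, b: a + b),
                                  ("*", lambda a, b: a * b)]:
                        expr = f"({le}{op}{re})"
                        h = (lambda lf, rf, g: lambda x: g(lf(x), rf(x)))(lf, rf, g)
                        sig = tuple(h(i) for i in inputs)
                        if sig in seen:
                            continue  # observationally equivalent: discard
                        seen[sig] = expr
                        by_size[size].append((expr, h, sig))
                        if sig == tuple(target_outputs):
                            return expr
    return None

prog = bottom_up(inputs=[0, 1, 4], target_outputs=[1, 2, 5])
```

Because programs are enumerated in increasing size and each signature keeps its first representative, the returned program is a shortest one for the target behavior.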
Top-Down Explicit Search
Achieving minimality is trickier and more expensive. The algorithm keeps a wavefront of partially completed programs and must expand that wavefront in best-first order.
Top-Down Explicit Search
expr = var
     | λx. expr
     | filter expr expr
     | map expr expr
     | foldl expr expr expr
     | boolExpr
     | arithExpr
The wavefront holds partially expanded programs that have not been rejected, such as the partial program dropmins in = expr, whose hole can be expanded to any of the productions above. At each step there is a choice of which partial program to expand next.
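A minimal sketch of the wavefront idea, assuming a toy grammar and using partial-program size as the best-first cost (both assumptions for illustration):

```python
# Sketch: top-down search keeps a priority queue (the wavefront) of
# partially expanded programs; the cheapest partial program is popped
# and its leftmost hole "E" is expanded with every grammar production.

import heapq

GRAMMAR = {"E": [["E", "+", "E"], ["x"], ["1"]]}

def top_down(is_goal, max_steps=10000):
    # Each entry: (cost, tiebreak, sentence); "E" marks an unexpanded hole.
    counter = 0
    frontier = [(1, counter, ("E",))]
    while frontier and max_steps > 0:
        max_steps -= 1
        cost, _, sent = heapq.heappop(frontier)
        holes = [i for i, s in enumerate(sent) if s == "E"]
        if not holes:
            if is_goal("".join(sent)):   # complete program: test it
                return "".join(sent)
            continue
        i = holes[0]                      # expand the leftmost hole
        for rhs in GRAMMAR["E"]:
            new = sent[:i] + tuple(rhs) + sent[i + 1:]
            counter += 1
            heapq.heappush(frontier, (len(new), counter, new))
    return None

# Find some expression that equals 7 when x = 3 (e.g. x+x+1).
prog = top_down(lambda s: eval(s, {"x": 3}) == 7)
```

Because the queue is ordered by size, smaller complete programs are tested before larger ones, which is what makes the expansion best-first.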
Symbolic Search
The exact synthesis query is: find a program p such that ∀in. Q(p, in). A relaxed query replaces the universal quantifier with a finite set of inputs ins: find p such that ∀in ∈ ins. Q(p, in). The relaxed query is more permissive: it allows all correct p, but also some incorrect ones. It works if we can ensure the solution set only contains p that satisfy the full ∀in. Q(p, in). To synthesize under a distribution, we additionally require that the solution maximize P(p) subject to the constraint.
Extended CEGIS
The loop alternates between a Synthesize phase and a Check phase. Synthesize proposes a program p satisfying Q(p, in_i) for every counterexample input in_i collected so far, subject to score < P(p), where score is the probability of the best verified program found so far (initially 0). Check then searches for an input in on which p violates the spec. If Check finds one, in is added to the set of counterexamples; if Check returns UNSAT, p is verified and score is updated to P(p). When Synthesize itself returns UNSAT, no program consistent with the counterexamples beats the current score, so the best verified program found so far is optimal.
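The loop above can be sketched as follows; brute-force enumeration over a finite candidate list and finite input range stands in for the solver's Synthesize and Check queries, and the toy language and spec are illustrative assumptions:

```python
# Sketch of the extended CEGIS loop. Brute force replaces the SAT/SMT
# "Synthesize" and "Check" calls; the candidate programs, prior, and
# spec below are made up for illustration.

def cegis(candidates, prob, spec, inputs):
    """candidates: list of program strings; prob: program -> prior
    probability; spec(prog, x) -> bool tests the spec on input x."""
    counterexamples = []
    best, best_score = None, 0.0
    while True:
        # Synthesize: any candidate consistent with all counterexamples
        # so far whose probability beats the current best score.
        proposal = None
        for p in candidates:
            if prob(p) > best_score and all(spec(p, x) for x in counterexamples):
                proposal = p
                break
        if proposal is None:       # UNSAT: best verified program is optimal
            return best
        # Check: look for an input where the proposal violates the spec.
        cex = next((x for x in inputs if not spec(proposal, x)), None)
        if cex is None:            # verified; raise the score bar
            best, best_score = proposal, prob(proposal)
        else:
            counterexamples.append(cex)

progs = ["x+2", "x*2", "x+1+1"]
priors = {"x+2": 0.5, "x*2": 0.3, "x+1+1": 0.2}
spec = lambda p, x: eval(p, {"x": x}) == x + 2   # target behavior: x + 2
best = cegis(progs, priors.get, spec, inputs=range(-5, 6))
```

The `best_score` bar is what makes this "extended": once a program is verified, the synthesizer only accepts strictly more probable candidates, so termination with UNSAT certifies optimality.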
Probabilistic Grammars
expr = expr + expr   (0.1)
     | expr - expr   (0.1)
     | expr * expr   (0.05)
     | var           (0.25)
     | const         (0.5)
The probability of a program is the product of the probabilities of the rules used to derive it. For example, the tree (x + 2) + (x * 3) has P(exp) = 0.1 · 0.1 · 0.05 · 0.25 · 0.5 · 0.25 · 0.5 = 7.81 × 10⁻⁶.
Search with pCFGs
Probability is compositional: for a tree with root rule r and subtrees e₁ … e_k, P(tree) = P(r) · P(e₁) ⋯ P(e_k); for instance P(e₀ + e₁ * e₂) = P(+) · P(e₀) · P(*) · P(e₁) · P(e₂). Because the metric composes just like length does, the same search strategies as the length-based metric work.
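The compositional computation can be sketched directly; the rule probabilities follow the grammar slide (+, -, *, var, const), while the tree encoding is an assumption for illustration:

```python
# Sketch: the probability of a tree under a pCFG is the product of
# the rule probabilities at each node, so it composes exactly like a
# length metric does.

RULE_P = {"+": 0.1, "-": 0.1, "*": 0.05, "var": 0.25, "const": 0.5}

def tree_prob(tree):
    """tree: ('+', left, right) for operators, ('var',) or
    ('const', k) for leaves. Returns the product of rule probs."""
    rule = tree[0]
    p = RULE_P[rule]
    if rule in ("+", "-", "*"):
        p *= tree_prob(tree[1]) * tree_prob(tree[2])
    return p

# The slide's example tree (x + 2) + (x * 3):
t = ("+", ("+", ("var",), ("const", 2)),
          ("*", ("var",), ("const", 3)))
p = tree_prob(t)   # 0.1 * 0.1 * 0.05 * 0.25 * 0.5 * 0.25 * 0.5
```

The result matches the 7.81 × 10⁻⁶ quoted on the grammar slide.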
Context-Sensitive Probabilities
Index each occurrence of the nonterminal by the context in which it appears:
expr = expr₁ + expr₂ | expr₃ - expr₄ | expr₅ * expr₆ | var | const
Each context (Root, 1, …, 6) gets its own distribution over the five productions A-E (A: +, B: -, C: *, D: var, E: const):

Context   A      B      C      D      E
Root      0.2    0.2    0.2    0.2    0.2
1         0.2    0.2    0.2    0.2    0.2
2         0      0      1/3    1/3    1/3
3         0.2    0.2    0.2    0.2    0.2
4         0      0      1/3    1/3    1/3
5         0.2    0.2    0.2    0.2    0.2
6         0.25   0.25   0      0.25   0.25
Search
With context-sensitive probabilities, top-down and constraint-based search work much as before. Bottom-up search is significantly more challenging.
Bottom-Up Search
Example: x = 5 ⇒ out = 10. The programs x + x and x * 2 are semantically equivalent, so observational equivalence says to keep only one. But which one? Under context-sensitive probabilities each is preferable in different contexts: x + x has probability 0 in contexts 2 and 4 (which forbid + and -), while x * 2 has probability 0 in context 6 (which forbids *). Is keeping multiple representatives exponentially slower? No: given K contexts, we need at most K representatives for each equivalence class, namely the highest-probability member for each context. Keeping both x * 2 and x + x does not force us to keep all programs generated from both, because larger programs built from them are again pruned per context.
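The per-context pruning rule can be sketched as follows; the probability numbers are toy stand-ins, not the slide's exact tables:

```python
# Sketch: with context-sensitive probabilities, each observational-
# equivalence class keeps at most one representative per context --
# the member with the highest probability in that context -- so
# K contexts need at most K representatives, not exponentially many.

def best_per_context(programs, contexts, prob):
    """programs: mutually equivalent programs; prob(prog, ctx) ->
    probability. Returns {ctx: best representative for that ctx}."""
    return {ctx: max(programs, key=lambda p: prob(p, ctx))
            for ctx in contexts}

# Hypothetical probabilities for the equivalent programs x+x and x*2
# in two grammar contexts (under a '+' parent vs under a '*' parent).
P = {("x+x", "under+"): 0.0,   ("x*2", "under+"): 0.013,
     ("x+x", "under*"): 0.017, ("x*2", "under*"): 0.0}
reps = best_per_context(["x+x", "x*2"], ["under+", "under*"],
                        lambda p, c: P[(p, c)])
```

Each context keeps whichever equivalent program it can actually afford, and all other members of the class are discarded.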
Learning a pCFG
P(tree | θ_grammar): the probability of a tree/string given the probabilities of the individual rules/tokens. It is usually a simple function that traverses the tree/string and multiplies the probabilities (or adds their logs).
Learning a pCFG
P(θ_grammar) is parameterized by one value θ per (context, production) pair for the grammar
expr = expr₁ + expr₂ | expr₃ - expr₄ | expr₅ * expr₆ | var | const

Context   A        B        C        D        E
Root      θ₀,A     θ₀,B     θ₀,C     θ₀,D     θ₀,E
1         θ₁,A     θ₁,B     θ₁,C     θ₁,D     θ₁,E
2         θ₂,A     θ₂,B     θ₂,C     θ₂,D     θ₂,E
3         θ₃,A     θ₃,B     θ₃,C     θ₃,D     θ₃,E
4         θ₄,A     θ₄,B     θ₄,C     θ₄,D     θ₄,E
5         θ₅,A     θ₅,B     θ₅,C     θ₅,D     θ₅,E
6         θ₆,A     θ₆,B     θ₆,C     θ₆,D     θ₆,E
Learning a pCFG
Given a set of sample trees T = {tᵢ}ᵢ<N and a parametric distribution θ_grammar, the goal is to find parameters that maximize the probability of the samples:
θ* = argmax_θ P({tᵢ} | θ_grammar) = argmax_θ Σᵢ log P(tᵢ | θ_grammar) = argmax_θ Σᵢ Σₙ log θ(cᵢⁿ, sᵢⁿ)
where sᵢⁿ is node n of tree i in the dataset and cᵢⁿ is the context in which node n of tree i appears.
Learning a pCFG
Dataset example: for the tree (x * 2) * 2, the root * (rule C) appears in context Root, the subtree x * 2 (rule C) in context 5, and the right leaf 2 (rule E) in context 6; inside the subtree, x (rule D) appears in context 5 and 2 (rule E) in context 6. So
log P(t | θ_grammar) = log θ₀,C + log θ₅,C + log θ₅,D + log θ₆,E + log θ₆,E
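The maximum-likelihood solution to this objective has a closed form: θ for a (context, rule) pair is just the fraction of nodes in that context that used that rule. A minimal sketch, where trees are tuples and a node's context is taken to be its (parent rule, child slot) pair, an assumption matching the indexed-nonterminal grammar:

```python
# Sketch: MLE for pCFG parameters by counting. theta[c][r] is the
# fraction of nodes appearing in context c that used rule r.

from collections import Counter, defaultdict

def learn_pcfg(trees):
    counts = defaultdict(Counter)           # context -> rule -> count
    def visit(node, context):
        rule, children = node[0], node[1:]
        counts[context][rule] += 1
        for i, child in enumerate(children):
            if isinstance(child, tuple):    # skip literal leaf payloads
                visit(child, (rule, i))     # context = (parent rule, slot)
    for t in trees:
        visit(t, "Root")
    return {c: {r: n / sum(rc.values()) for r, n in rc.items()}
            for c, rc in counts.items()}

trees = [("+", ("var",), ("const", 2)),
         ("+", ("var",), ("var",))]
theta = learn_pcfg(trees)
```

With the two sample trees above, both roots use +, both left children are var, and the right child splits evenly between const and var.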
Neural Networks
The standard neural network model is a very general parametric function: each layer computes the next activations by multiplying a parameter matrix by the previous activations extended with a constant 1 (folding the bias into the matrix) and applying a nonlinearity. The matrices are the parameters to be discovered.
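A pure-Python sketch of that layer structure (the tanh nonlinearity, layer sizes, and weight values are illustrative assumptions):

```python
# Sketch: each layer multiplies a parameter matrix W by the previous
# activations extended with a constant 1 (bias folded into W), then
# applies a nonlinearity.

import math

def layer(W, x):
    x1 = x + [1.0]                               # append the constant 1
    return [math.tanh(sum(w * v for w, v in zip(row, x1))) for row in W]

def forward(weights, x):
    for W in weights:
        x = layer(W, x)
    return x

# Two layers: 2 inputs -> 3 hidden -> 1 output (weights are arbitrary).
W0 = [[0.5, -0.2, 0.1], [0.3, 0.8, 0.0], [-0.4, 0.2, 0.6]]
W1 = [[1.0, -1.0, 0.5, 0.2]]
y = forward([W0, W1], [1.0, 2.0])
```

Training amounts to searching for the entries of W0 and W1 that make the function fit the data, which is what makes the model so general.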
DeepCoder
Learn a conditional distribution for a simple probabilistic grammar: a neural network maps a vector representing the specification, through a latent representation of the spec, to a vector of probabilities for each component in the grammar.
Sampling
What if we do not want the single best program, but instead want to sample programs from the distribution?
Sampling with Constraints
Within the space of all programs, only a subset is correct. Rejection sampling: sample according to P(p) and reject any p that violates the correctness constraint. This is correct but very inefficient when correct programs are rare.
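Rejection sampling can be sketched directly; the toy expression sampler and the example constraint below are assumptions for illustration:

```python
# Sketch: rejection sampling. Draw programs from the prior and
# discard any that fail the examples -- correct but potentially very
# inefficient. The toy language is random expressions over x.

import random

def sample_program(rng, depth=0):
    if depth >= 2 or rng.random() < 0.5:         # leaf
        return rng.choice(["x", "1", "2"])
    op = rng.choice(["+", "*"])
    return f"({sample_program(rng, depth+1)}{op}{sample_program(rng, depth+1)})"

def rejection_sample(satisfies, seed=0, max_tries=10000):
    rng = random.Random(seed)
    for _ in range(max_tries):
        p = sample_program(rng)
        if satisfies(p):          # reject anything violating the examples
            return p
    return None

# Constraint: the program maps x=5 to 7 (so e.g. (x+2) qualifies).
p = rejection_sample(lambda p: eval(p, {"x": 5}) == 7)
```

The `max_tries` bound makes the inefficiency concrete: as the constraint gets tighter, more and more of the sampling budget is wasted on rejected draws.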
Sampling with Constraints
Let f : S → {p | correct(p)} map an auxiliary space onto the correct programs. If f is measure preserving, sampling uniformly from {p | correct(p)} is equivalent to sampling s uniformly from S and then solving for p such that p = f(s).
Sampling with Constraints
Approach: partition the space, sample uniformly from the partitions to pick a cell Sᵢ, then solve for a p such that p ∈ Sᵢ and p is correct. Tradeoff: a coarse partition is efficient but gives poor-quality sampling; a fine partition is expensive but gives high-quality sampling, approximating rejection sampling in the limit of a very fine partition.
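A toy sketch of the partition approach, where brute-force lookup stands in for the solver call and the partition is by program length (both assumptions for illustration):

```python
# Sketch: partition-based sampling. Partition the correct programs
# (here by length), pick a partition uniformly at random, then
# "solve" for one member of that partition. Coarse partitions are
# cheap but distort the distribution relative to uniform-over-
# programs; singleton partitions recover rejection-sampling quality.

import random

def partition_sample(correct_programs, rng):
    by_length = {}
    for p in correct_programs:                 # partition by length
        by_length.setdefault(len(p), []).append(p)
    bucket = rng.choice(sorted(by_length))     # uniform over partitions
    return rng.choice(by_length[bucket])       # one solution in the bucket

rng = random.Random(0)
correct = ["x+2", "2+x", "x+1+1", "1+x+1", "1+1+x"]
samples = [partition_sample(correct, rng) for _ in range(1000)]
```

With two length-3 programs and three length-5 programs, each partition gets half the mass, so each short program is oversampled relative to the uniform distribution over programs, which is exactly the coarse-partition distortion described above.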