Abductive Reasoning Using Random Examples
Dive into the concept of abductive reasoning through models, algorithms, and the challenges of abducing conjunctions. Discover why new models are needed for complex scenarios and learn about an approach that reasons from random examples.
Presentation Transcript
Learning abductive reasoning using random examples. Brendan Juba, Washington University in St. Louis.
Outline: 1. Models of abductive reasoning; 2. An abductive reasoning algorithm; 3. "Not-easiness" of abducing conjunctions.
Abductive reasoning: making plausible guesses. Abductive reasoning: given a conclusion c, find a plausible h that implies (or leads to) c. Two varieties of plausibility are in common use: logical plausibility, a small h from which c follows; and Bayesian plausibility, an h with large posterior probability given c (in symbols, Pr[h | c true] exceeds some threshold), which requires a prior distribution over representations.
Why might we want a new model? Existing models are only tractable in simple cases, e.g., Horn rules (a ∧ b ∧ c ⇒ d, with no negations) or nice (conjugate) priors. The choice of logical rules, prior distribution, etc. really matters, and these are difficult to specify by hand.
New model: abductive reasoning from random examples. Fix a set of attributes (propositional variables x_1, x_2, …, x_n). An environment is modeled by an arbitrary, unknown distribution D over examples, i.e., settings of the n propositional variables. Task: for a conclusion c, find an h such that
1. Plausibility: Pr_{x∼D}[h(x)=1] ≥ μ (for some given μ)
2. h almost entails c: Pr_{x∼D}[c(x)=1 | h(x)=1] ≥ 1−ε
Example: identifying a subgoal. Consider a blocks world over time steps t = 1, 2, …, T. Propositional state variables ("fluents"): ON_t(A,B), ON_t(A,TABLE), ON_t(C,A), etc. Actions are also encoded by propositional variables: PUT_t(B,A), PUT_t(C,TABLE), etc. We are given many examples of interaction. Our goal c: ON_T(A,TABLE) ∧ ON_T(B,A) ∧ ON_T(C,B). A perhaps plausibly good subgoal h: [ON_{T−1}(B,A) ∧ PUT_T(C,B)] ∨ [PUT_{T−1}(B,A) ∧ PUT_T(C,B)]. The goal c might even be given by examples rather than explicitly formulated.
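To make "examples are settings of the propositional variables" concrete, here is a minimal Python sketch (my own illustration, not from the talk; the atom names mirror the slide's fluents) of how one interaction trace could be flattened into a single 0/1 example:

    from itertools import product

    def encode_trace(true_atoms, T=2, blocks="ABC"):
        """Map the set of atoms that held in a trace to a 0/1 assignment
        over every ON_t / PUT_t variable (a hypothetical encoding)."""
        places = list(blocks) + ["TABLE"]
        atoms = [f"{kind}{t}({a},{b})"
                 for kind, t, a, b in product(("ON", "PUT"),
                                              range(1, T + 1),
                                              blocks, places)
                 if a != b]
        return {atom: int(atom in true_atoms) for atom in atoms}

    # One observed trace: B was put on A at step 1, then C on B at step 2.
    x = encode_trace({"PUT1(B,A)", "ON1(B,A)", "PUT2(C,B)", "ON2(C,B)"})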
In pictures: a diagram over the space of examples x ∈ {0,1}^n, with the region where h(x)=1 lying almost entirely inside the region where c(x)=1; here c is the goal/observation and h is the explanation/solution.
Formally: abductive reasoning from random examples for a class H. Fix a class of Boolean representations H. Given a Boolean formula c; parameters μ, ε, δ ∈ (0,1); and independent examples x^(1), …, x^(m) ∼ D, suppose that there exists an h* ∈ H such that
1. Plausibility: Pr_{x∼D}[h*(x)=1] ≥ μ
2. h* entails c: Pr_{x∼D}[c(x)=1 | h*(x)=1] = 1
Find an h (ideally in H) such that, with probability 1−δ,
1. Plausibility: Pr_{x∼D}[h(x)=1] ≥ C((1−ε)μ/n)^d for some constants C, d
2. h almost entails c: Pr_{x∼D}[c(x)=1 | h(x)=1] ≥ 1−ε
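As a sanity check on the definition, both success criteria can be estimated empirically from held-out samples. A small Python sketch (my own, with hypothetical names; h and c are Boolean functions of an example):

    def check_abduction(h, c, samples, mu, eps):
        """Estimate the two criteria on a sample of examples."""
        h_true = [x for x in samples if h(x)]
        if not h_true:
            return False  # h never fires: no plausibility at all
        plausibility = len(h_true) / len(samples)              # ~ Pr[h(x)=1]
        entailment = sum(c(x) for x in h_true) / len(h_true)  # ~ Pr[c=1 | h=1]
        return plausibility >= mu and entailment >= 1 - eps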
Outline: 1. Models of abductive reasoning; 2. An abductive reasoning algorithm; 3. "Not-easiness" of abducing conjunctions.
Theorem 1. If there is a k-DNF h* such that
1. Plausibility: Pr_{x∼D}[h*(x)=1] ≥ μ
2. h* entails c: Pr_{x∼D}[c(x)=1 | h*(x)=1] = 1
then using m = O((1/εμ)(n^k + log(1/δ))) examples, in time O(mn^k) we can find a k-DNF h such that with probability 1−δ,
1. Plausibility: Pr_{x∼D}[h(x)=1] ≥ μ
2. h almost entails c: Pr_{x∼D}[c(x)=1 | h(x)=1] ≥ 1−ε
(A k-DNF is an OR of terms of size at most k, i.e., ANDs of at most k literals, where a literal is an attribute or its negation.)
Algorithm for k-DNF abduction: start with h as an OR over all terms of size at most k. For each example x^(1), …, x^(m): if c(x^(i)) = 0, delete all terms T from h such that T(x^(i)) = 1. Return h. This simple elimination algorithm was first proposed by J.S. Mill (1843); its running time is clearly O(mn^k).
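The procedure is short enough to render directly in code. A minimal Python sketch (my rendering of the slide's algorithm; function names are my own), where an example is a 0/1 tuple, a term is a tuple of (variable index, required value) literals, and c is a Boolean function:

    from itertools import combinations, product

    def satisfies(term, x):
        return all(x[i] == v for i, v in term)

    def abduce_kdnf(examples, c, n, k):
        """Start from every term of at most k literals; delete any term
        that is true on an example where the conclusion c fails."""
        terms = [tuple(zip(idxs, vals))
                 for size in range(1, k + 1)
                 for idxs in combinations(range(n), size)
                 for vals in product((0, 1), repeat=size)]
        for x in examples:
            if not c(x):  # counterexample to c: prune the terms it satisfies
                terms = [t for t in terms if not satisfies(t, x)]
        # h is the OR of the surviving terms
        return lambda x: any(satisfies(t, x) for t in terms)

Note that every term of the promised h* survives (such a term is never true when c fails), which is exactly the observation used in part 1 of the analysis below.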
Analysis, part 1: Pr_{x∼D}[h(x)=1] ≥ μ. We are given that some k-DNF h* has (1) plausibility Pr_{x∼D}[h*(x)=1] ≥ μ and (2) h* entails c, i.e., Pr_{x∼D}[c(x)=1 | h*(x)=1] = 1. Initially, every term of h* is in h. By (2), terms of h* are never true when c(x)=0, so every term of h* remains in h. Thus h* implies h, and Pr_x[h(x)=1] ≥ Pr_x[h*(x)=1] ≥ μ.
Analysis, part 2: Pr_x[c(x)=1 | h(x)=1] ≥ 1−ε. Rewrite the conditional probability as Pr_{x∼D}[c(x)=0 ∧ h(x)=1] / Pr_{x∼D}[h(x)=1]. We will show Pr_{x∼D}[c(x)=0 ∧ h(x)=1] ≤ εμ, which is at most ε·Pr_{x∼D}[h(x)=1] by part 1. Consider any h′ such that Pr_x[c(x)=0 ∧ h′(x)=1] > εμ. Since each x^(i) is drawn independently from D, Pr_{x∼D}[no i has c(x^(i))=0 ∧ h′(x^(i))=1] < (1−εμ)^m. Whenever some example has c=0 and h′=1, a term of h′ is deleted, so such an h′ is only possibly output with probability < (1−εμ)^m.
Analysis, part 2, continued: Pr_x[c(x)=1 | h(x)=1] ≥ 1−ε. We will show Pr_{x∼D}[c(x)=0 ∧ h(x)=1] ≤ εμ. Any h′ with Pr_x[c(x)=0 ∧ h′(x)=1] > εμ is only possibly output with probability < (1−εμ)^m, and there are only 2^{O(n^k)} possible k-DNFs h′. Since (1−z)^{1/z} ≤ e^{−1}, m = O((1/εμ)(n^k + log(1/δ))) examples suffice to guarantee that each such h′ is only possibly output with probability < δ/2^{O(n^k)}. By a union bound, with probability > 1−δ, our h has Pr_x[c(x)=0 ∧ h(x)=1] ≤ εμ.
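For completeness, here is the arithmetic the slide compresses, written out in LaTeX (a sketch; the hidden constants are my own, not from the slides):

    % Each "bad" h' with Pr[c=0 and h'=1] > eps*mu survives all m examples
    % with probability at most
    \[
      (1-\epsilon\mu)^m \;\le\; e^{-\epsilon\mu m} \;\le\; \frac{\delta}{2^{O(n^k)}}
      \qquad \text{once } m \;\ge\; \frac{1}{\epsilon\mu}\Bigl(O(n^k) + \ln\tfrac{1}{\delta}\Bigr),
    \]
    % so a union bound over the 2^{O(n^k)} possible k-DNFs h' leaves a
    % total failure probability of at most delta.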
Extension: agnostic abductive reasoning. Given only that some h* achieves (1) plausibility Pr_{x∼D}[h*(x)=1] ≥ μ and (2) almost entails c, Pr_{x∼D}[c(x)=1 | h*(x)=1] ≥ 1−ε, find an h such that, for some other ε′ and μ′, (1) Pr_{x∼D}[h(x)=1] ≥ μ′ and (2) Pr_{x∼D}[c(x)=1 | h(x)=1] ≥ 1−ε′. An extension of the k-DNF algorithm achieves μ′ = μ and ε′ = O(n^k ε).
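The slides do not spell out the extension; one natural tolerant variant (a guess at its flavor, not necessarily the paper's algorithm) deletes a term only once it has been observed on sufficiently many counterexamples to c:

    from collections import Counter

    def abduce_kdnf_tolerant(examples, c, terms, threshold):
        """Hypothetical tolerant elimination: a term survives unless it is
        true on more than `threshold` examples where c fails. The term
        encoding (tuples of (index, value) literals) is as in the earlier
        sketch."""
        def satisfies(term, x):
            return all(x[i] == v for i, v in term)
        violations = Counter()
        for x in examples:
            if not c(x):
                for t in terms:
                    if satisfies(t, x):
                        violations[t] += 1
        kept = [t for t in terms if violations[t] <= threshold]
        return lambda x: any(satisfies(t, x) for t in kept)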
Outline: 1. Models of abductive reasoning; 2. An abductive reasoning algorithm; 3. "Not-easiness" of abducing conjunctions.
Theorem 2. Suppose that a polynomial-time algorithm exists for learning abduction from random examples for conjunctions. Then there is a polynomial-time algorithm for PAC-learning DNF. So what? 1. PAC-learning DNF is a central open problem in computational learning theory, raised in the original paper by Valiant (1984). 2. Recent work by Daniely and Shalev-Shwartz (2016) shows that algorithms for PAC-learning DNF would have other surprising consequences. In summary, an algorithm for our problem would constitute an unlikely breakthrough.
PAC learning: there is an unknown target c ∈ C over x = (x_1, x_2, …, x_n) and an unknown example distribution D. Given labeled examples (x^(1), c(x^(1))), (x^(2), c(x^(2))), …, (x^(m), c(x^(m))) with each x^(i) drawn from D, the learner must, with probability 1−δ over the examples, output an f that satisfies f(x) = c(x) with probability 1−ε over x ∼ D.
Key learning technique: boosting (Schapire, 1990). Given a weak learner, that is, a polynomial-time algorithm that, given examples of some c ∈ C, with probability 1−δ produces a circuit f such that Pr_x[f(x)=c(x)] ≥ ½ + 1/poly(n), we can construct a polynomial-time PAC-learning algorithm for the class C. That is, using the ability to produce such weak f's, we produce a g for which Pr_x[g(x)=c(x)] is boosted to any 1−ε we require.
Sketch of learning DNF using conjunctive abduction. If c is a DNF and Pr[c(x)=1] > ¼, then some term T of c has Pr[T(x)=1] > 1/(4·(# terms in c)); otherwise f ≡ 0 already satisfies Pr[f(x)=c(x)] ≥ ¾ = ½ + ¼. Note that Pr[c(x)=1 | T(x)=1] = 1 and T is a conjunction, so abductive reasoning finds some h with Pr[h(x)=1] > 1/poly(n) and Pr[c(x)=1 | h(x)=1] > 1−ε. Return the f defined by: f(x) = 1 whenever h(x) = 1; if h(x) = 0, f(x) = maj_{x: h(x)=0}{c(x)}. Then Pr[f(x)=c(x)] > ½ + 1/(4·poly(n)), as rendered in the sketch below.
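In code, the final step of the reduction (my rendering of the slide; names are my own) takes the abduced h and the labeled sample and returns the weak predictor f:

    def weak_hypothesis(h, labeled_samples):
        """labeled_samples: list of (x, c(x)) pairs. Predict 1 inside h;
        outside h, predict the majority label observed there."""
        outside = [label for x, label in labeled_samples if not h(x)]
        majority = 1 if outside and 2 * sum(outside) >= len(outside) else 0
        def f(x):
            return 1 if h(x) else majority
        return f

Inside h, f errs with probability at most ε (conditioned on h); outside h, the majority vote is correct at least half the time; together these give the ½ + 1/poly(n) advantage the weak learner needs.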
Stronger evidence that ideal algorithms for conjunctions don't exist. Theorem 3. If a (randomized) polynomial-time algorithm solves abductive reasoning for conjunctions by producing conjunctions with plausibility at least (1−ε)μ, then there exist randomized polynomial-time algorithms for every problem in NP.
Recap: learning abduction from random examples. The abductive reasoning problem: find an h such that
1. Plausibility: Pr_{x∼D}[h(x)=1] ≥ μ
2. h almost entails c: Pr_{x∼D}[c(x)=1 | h(x)=1] ≥ 1−ε
There are efficient algorithms for k-DNF, while algorithms for conjunctions would enable PAC-learning DNF, which is believed intractable.