
Complexity of Agnostic Learning and Approximate Resilience
Presentation Transcript
Approximate Resilience, Monotonicity, and the Complexity of Agnostic Learning
Vitaly Feldman (IBM Research Almaden), with Dana Dachman-Soled (UMD), Li-Yang Tan (Simons Inst.), Andrew Wan (IDA), Karl Wimmer (Duquesne). SODA 2015.
Learning from examples
[Slide figure: labeled examples (+ and −) are fed into a learning algorithm, which outputs a classifier.]
Agnostic learning [V 84; H 92; KSS 94]
D: a distribution over X × {−1,+1}; examples (x, y) ~ D.
Opt_D(C) = min_{f∈C} Pr_{(x,y)~D}[f(x) ≠ y]
Agnostic learning of a class of functions C with excess error ε: output h such that, w.h.p., Pr_{(x,y)~D}[h(x) ≠ y] ≤ Opt_D(C) + ε.
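To make the definitions concrete, here is a toy sketch (the distribution, the class, and all numbers are invented for this illustration, not taken from the talk) that computes the error of each classifier under D and the benchmark Opt_D(C):

```python
# Toy illustration of the agnostic-learning definitions: error of a
# classifier under D, and Opt_D(C) = minimum error over the class C.
# Everything here (D, C) is a made-up example.

# D: distribution over (x, y) with x in {0,1}^2 and y in {-1,+1}.
D = {
    ((0, 0), +1): 0.3,
    ((0, 1), +1): 0.2,
    ((1, 0), -1): 0.2,
    ((1, 1), -1): 0.2,
    ((1, 1), +1): 0.1,  # the point (1,1) carries conflicting labels
}

def error(f):
    """Pr_{(x,y)~D}[f(x) != y]."""
    return sum(p for (x, y), p in D.items() if f(x) != y)

# Toy class C: the "dictator" function on each coordinate and its negation.
C = [
    lambda x: +1 if x[0] == 0 else -1,
    lambda x: -1 if x[0] == 0 else +1,
    lambda x: +1 if x[1] == 0 else -1,
    lambda x: -1 if x[1] == 0 else +1,
]

opt = min(error(f) for f in C)
print(opt)  # 0.1 -- the agnostic goal is error at most Opt_D(C) + eps
```

Note that Opt is nonzero even for the best function in C, because the label of (1,1) is noisy; agnostic learning only asks to compete with that best function.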
Complexity of agnostic learning
Distribution-independent agnostic learning is too hard: for conjunctions over {0,1}ⁿ the best known running time is n^{Õ(√n)} for ε = 1/3 [KKMS 05], but no matching lower bounds can be proved.
Complexity of agnostic learning
Distribution-specific learning: learn over distributions D whose marginal on X equals a fixed distribution.
For the uniform distribution U over {0,1}ⁿ:
- Conjunctions: n^{O(log(1/ε))}
- Halfspaces: n^{O(1/ε⁴)} [KKMS 05]; n^{Õ(1/ε²)} [FKV 14]
- Monotone juntas?
Our characterization
Pol_d: polynomials of degree d over n variables.
Δ(c, Pol_d) = min_{p∈Pol_d} ‖c − p‖₁ = min_{p∈Pol_d} E_{x~U}[|c(x) − p(x)|]
THM: For degree d, let Δ = max_{c∈C} Δ(c, Pol_d)/2. Any SQ algorithm for agnostically learning C over U with excess error < Δ needs n^{Ω(d)} time.*
(* C is closed under renaming of variables.)
Polynomial ℓ₁ regression [KKMS 05]
THM: For degree d, let Δ = max_{c∈C} Δ(c, Pol_d)/2. There exists an SQ algorithm for agnostically learning C over U with excess error Δ + ε in time poly(n^d/ε).
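A hedged sketch of the low-degree regression idea behind the [KKMS 05] algorithm. The algorithm on the slide uses ℓ₁ regression; to keep this sketch dependency-free it uses least squares (ℓ₂) over the same degree-≤d multilinear features, followed by sign rounding — the same "fit a low-degree polynomial, then round" pattern, not the exact KKMS procedure. All data and parameters are illustrative:

```python
# Sketch of low-degree polynomial regression for agnostic learning:
# build all multilinear features chi_S(x) = prod_{i in S} x_i with |S| <= d,
# fit a polynomial to the labels (here by least squares rather than the
# slide's L1 regression), and classify with sign(p(x)).
from itertools import combinations
import numpy as np

def features(X, d):
    """Degree-<=d multilinear feature matrix over {-1,+1}^n."""
    n = X.shape[1]
    mons = [S for k in range(d + 1) for S in combinations(range(n), k)]
    return np.array([[np.prod(x[list(S)]) if S else 1.0 for S in mons]
                     for x in X])

rng = np.random.default_rng(0)
X = rng.choice([-1.0, 1.0], size=(60, 5))
y = X[:, 0] * X[:, 1]                    # target is itself a degree-2 function
A = features(X, d=2)
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
preds = np.sign(A @ coef)
print(np.mean(preds == y))               # 1.0: a degree-2 fit recovers it exactly
```

On noisy data the fit is no longer exact, but the KKMS analysis shows the rounded polynomial still achieves error Opt + Δ + ε when C is Δ-approximated in ℓ₁ by degree-d polynomials.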
Statistical queries [Kearns 93]
An SQ learning algorithm sends queries φ₁, φ₂, … to an SQ oracle and receives values v₁, v₂, … in return.
Each query is a function φᵢ: X × {−1,1} → [−1,1]; the oracle may return any vᵢ with |vᵢ − E_D[φᵢ(x, y)]| ≤ τᵢ, where τᵢ is the tolerance of the query.
Complexity t: at most t queries, each of tolerance at least 1/t.
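A minimal simulation of the SQ model (the distribution and query are toy choices of mine): the learner never sees individual examples, only tolerance-τ estimates of expectations E_D[φ(x, y)]:

```python
# Toy statistical query oracle: given a query phi and tolerance tau, it may
# return anything within tau of E_{(x,y)~D}[phi(x,y)]; here, truth plus noise.
import random

def sq_oracle(dist, phi, tau, rng):
    """Return a value within tau of E_D[phi(x, y)]."""
    true_value = sum(p * phi(x, y) for (x, y), p in dist.items())
    return true_value + rng.uniform(-tau, tau)

# D concentrated on two labeled points; the query phi(x, y) = y estimates E[y].
D = {((1, 1), +1): 0.75, ((0, 0), -1): 0.25}
rng = random.Random(0)
est = sq_oracle(D, lambda x, y: y, tau=0.01, rng=rng)
print(abs(est - 0.5) <= 0.01)  # True: E[y] = 0.75 - 0.25 = 0.5
```

SQ lower bounds exploit exactly this interface: they show that any oracle answer consistent with the tolerance is also consistent with many candidate distributions at once.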
SQ algorithms
Most PAC/agnostic learning algorithms can be implemented using statistical queries (Gaussian elimination being a notable exception): convex optimization (ellipsoid, iterative methods), expectation maximization (EM), SVM (with kernel), PCA, ICA, ID3, k-means, method of moments, MCMC, naïve Bayes, neural networks (backprop), perceptron, nearest neighbors, boosting [K 93; BDMN 05; CKLYBNO 06; FPV 14].
Roadmap
- Proof: approximate resilience
- Tools: analysis of approximate resilience
- Application: monotone functions
- Open problems
Approximate resilience
f is d-resilient if E_U[f(x)p(x)] = 0 for every polynomial p of degree at most d; equivalently, all Fourier coefficients of f of degree at most d are 0.
f is α-approximately d-resilient if there exists a d-resilient g: {0,1}ⁿ → [−1,1] such that ‖f − g‖₁ ≤ α.
THM: a Boolean f is α-approximately d-resilient if and only if Δ(f, Pol_d) ≥ 1 − α.
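Exact resilience can be checked directly from the definition by brute-force Fourier computation. The following sketch (example mine, not from the talk) verifies the classic fact that the parity of 3 out of 4 variables is 2-resilient but not 3-resilient:

```python
# Brute-force check of d-resilience over {-1,+1}^n: compute every Fourier
# coefficient f_hat(S) = E_{x~U}[f(x) chi_S(x)] of degree |S| <= d.
from itertools import combinations, product

def fourier_coef(f, n, S):
    """f_hat(S) under the uniform distribution on {-1,+1}^n."""
    total = 0.0
    for x in product([-1, 1], repeat=n):
        chi = 1
        for i in S:
            chi *= x[i]
        total += f(x) * chi
    return total / 2 ** n

def is_resilient(f, n, d):
    """True iff f_hat(S) = 0 for every S with |S| <= d (including S = {})."""
    return all(
        abs(fourier_coef(f, n, S)) < 1e-12
        for k in range(d + 1)
        for S in combinations(range(n), k)
    )

parity3 = lambda x: x[0] * x[1] * x[2]   # chi_{0,1,2} on n = 4 variables
print(is_resilient(parity3, 4, 2))       # True: all coefficients of degree <= 2 vanish
print(is_resilient(parity3, 4, 3))       # False: the degree-3 coefficient is 1
```

Approximate resilience is harder to certify directly — it quantifies over all bounded d-resilient witnesses g — which is why the theorem's equivalence with the ℓ₁ distance Δ(f, Pol_d) is useful.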
From approx. resilience to SQ hardness
- C is closed under variable renaming, so there exist t = n^{Ω(d)} functions f₁, …, f_t in C such that the corresponding d-resilient g₁, …, g_t are uncorrelated.
- There exist distributions D₁, …, D_t such that Pr_{(x,y)~Dᵢ}[fᵢ(x) ≠ y] ≤ α/2.
- The complexity of any SQ algorithm with error 1/2 − t^{−1/3} is at least t^{1/3} [BFJKMR 94; F 09].
- Excess error: (1/2 − t^{−1/3}) − α/2 = Δ(C, Pol_d)/2 − t^{−1/3}.
Bounds on approximate resilience
1. Convert bounds on the ℓ₂ distance to an unbounded d-resilient function into bounds on the ℓ₁ distance to a [−1,1]-bounded d-resilient function.
2. Amplify the resilience degree via composition: if f₁ is α₁-approximately d₁-resilient, f₂ is α₂-approximately d₂-resilient, and f₁ has low noise sensitivity, then f(x¹, x², …, x^k) = f₁(f₂(x¹), …, f₂(x^k)) is α₃-approximately (d₁d₂)-resilient.
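The composition step can be sketched in code (a toy check of mine with parities, not the construction from the talk): each input block is fed through the inner function f₂, and the outer function f₁ combines the results. Composing the 2-bit parity with itself yields the 4-bit parity, whose resilience degree (3) indeed exceeds that of each 1-resilient factor:

```python
# Sketch of composition: f(x^1, ..., x^k) = f1(f2(x^1), ..., f2(x^k)),
# where x is split into k disjoint blocks of n2 bits each (values in {-1,+1}).
from itertools import product

def compose(f1, f2, k, n2):
    """Build the composed function on k blocks of n2 bits."""
    def f(x):
        inner = tuple(f2(x[i * n2:(i + 1) * n2]) for i in range(k))
        return f1(inner)
    return f

parity2 = lambda x: x[0] * x[1]
f = compose(parity2, parity2, k=2, n2=2)
# Sanity check: (x0*x1)*(x2*x3) is exactly the parity of all 4 bits.
ok = all(f(x) == x[0] * x[1] * x[2] * x[3] for x in product([-1, 1], repeat=4))
print(ok)  # True
```

The substance of the slide's lemma is the approximate, bounded case: composition preserves approximate resilience provided the outer function has low noise sensitivity, which parities only illustrate in the exact setting.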
Monotone functions of n variables
Known results: ℓ₁-approximation by polynomials of degree √n/ε² [BT 96], giving agnostic learning in time n^{Õ(√n/ε²)} [KKMS 05].
THM: SQ complexity n^{Ω̃(√n)} for ε = 1/2 − o(1).
Proof: show the existence of an o(1)-approximately Ω̃(√n)-resilient monotone function, via lower bounds for Talagrand's DNF [BBL 98].
Explicit constructions
- An o(1)-approximately 2^{Ω(√log n)}-resilient monotone function: TRIBES (a disjunction of n/log n conjunctions of log n variables each) + amplification.
- A 2^{Ω(log n / log log n)}-resilient Boolean function that is o(1)-close to a monotone Boolean function: the CycleRun function [Wieder 12] + amplification.
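A toy instantiation (block sizes chosen by me for illustration) of the TRIBES function used above — a disjunction of conjunctions over disjoint blocks of bits, which is monotone by construction:

```python
# Toy TRIBES function on n = 16 input bits in {0,1}, split into 4 "tribes"
# of width 4: output +1 iff some tribe is all ones (an OR of ANDs).
def tribes(x, width):
    """Monotone OR-of-ANDs over consecutive blocks of `width` bits."""
    blocks = [x[i:i + width] for i in range(0, len(x), width)]
    return 1 if any(all(b == 1 for b in block) for block in blocks) else -1

print(tribes([1] * 4 + [0] * 12, width=4))  # 1: the first tribe fires
print(tribes([0] * 16, width=4))            # -1: no tribe is all ones
```

Monotonicity is immediate: flipping any bit from 0 to 1 can only help a tribe become all ones, never hurt. The slide's choice of tribe width (log n, with n/log n tribes) balances the function near Pr[f = 1] ≈ 1/2.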
Conclusions and open problems
- The characterization extends to general product distributions over more general domains; it does not hold for non-product distributions [FK 14].
- Can the lower bound be obtained via a reduction from a concrete problem, e.g. learning noisy parities?
- Are there other techniques for proving bounds on approximate resilience (or ℓ₁ approximation by polynomials)?
- Complexity of distribution-independent agnostic SQ learning?