Sample Complexity Bounds on Differentially Private Learning via Communication Complexity


This presentation studies sample complexity bounds for differentially private learning through the lens of communication complexity, relating the PAC learning model, differential privacy, and one-way communication, and surveying the cost of privacy and related results in the field.

  • Privacy
  • Learning
  • Complexity
  • Differentially Private Learning
  • Communication Complexity

Uploaded on Mar 09, 2025



Presentation Transcript


  1. Sample Complexity Bounds on Differentially Private Learning via Communication Complexity. Vitaly Feldman, IBM Research Almaden; David Xiao, CNRS, Université Paris 7. ITA 2015.

  2. Learning model. The learner has n i.i.d. examples (x_1, y_1), …, (x_n, y_n) over X × {0,1}. PAC model [V '84]: each x_i ~ D and y_i = f(x_i) for an unknown f ∈ C and distribution D over X. For every D, every f ∈ C, and every α > 0, given the examples, with probability ≥ 3/4 output h such that err_D(h, f) = Pr_{x~D}[h(x) ≠ f(x)] ≤ α.
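As an illustration of the setup above (our own sketch, not part of the slides), the following Python fragment draws labeled examples and measures error under the data distribution; for simplicity D is assumed uniform over a finite support, and the helper names are ours:

```python
import random

# Sketch of the PAC setup: examples (x_i, f(x_i)) are drawn i.i.d. from D,
# and a hypothesis h is judged by err_D(h, f) = Pr_{x~D}[h(x) != f(x)].
def draw_sample(f, support, n, rng):
    # D here is the uniform distribution over `support` (an assumption
    # made purely for illustration).
    xs = [rng.choice(support) for _ in range(n)]
    return [(x, f(x)) for x in xs]

def err(h, f, support):
    # Exact error of h against f under the uniform D on `support`.
    return sum(1 for x in support if h(x) != f(x)) / len(support)
```

The learner succeeds if, with probability at least 3/4 over the sample (and its own coins), `err` of its output is at most α.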

  3. Privacy. Each example is created from the personal data of an individual: (x_i, y_i) = (GTTCACG TC, YES). Differential privacy [DMNS '06]: a (randomized) algorithm A is ε-differentially private if for any two data sets S, S' that differ in a single element, and every Z ⊆ range(A): Pr[A(S) ∈ Z] ≤ e^ε · Pr[A(S') ∈ Z].
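To make the definition concrete (our illustration, not from the slides), randomized response on a single bit is the textbook ε-differentially private primitive; the function names are ours, the probabilities are standard:

```python
import math
import random

def randomized_response(bit, eps, rng):
    # Report the true bit with probability e^eps / (1 + e^eps), else flip it.
    p_true = math.exp(eps) / (1 + math.exp(eps))
    return bit if rng.random() < p_true else 1 - bit

def output_prob(bit, out, eps):
    # Probability that randomized_response(bit, eps, .) outputs `out`.
    p_true = math.exp(eps) / (1 + math.exp(eps))
    return p_true if out == bit else 1 - p_true
```

For the two neighboring inputs (bit = 0 vs. bit = 1), the probability of any output differs by a factor of at most e^ε, which is exactly the ε-DP condition for this single-bit mechanism.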

  4. What is the cost of privacy? SCDP(C) = sample complexity of PAC learning C with α = 1/4 and 1-differential privacy. Known bounds: VCDIM(C) ≤ SCDP(C) ≤ O(log|C|) [KLNRS '08]. Points_d: X = {1, …, 2^d}; IND_j(x) = 1 iff x = j; Points_d = {IND_j : j ∈ X}; SCDP(Points_d) = Θ(1) [F '09, BKN '10]. Thr_d: X = {1, …, 2^d}; THR_j(x) = 1 iff x ≥ j; Thr_d = {THR_j : j ∈ X}; VCDIM(Thr_d) = 1 while log|Thr_d| = d.
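A quick brute-force check of the claim VCDIM(Thr_d) = 1 (our own sketch, here for d = 3; the helper names are illustrative, and the range of j is extended by one so that the all-zero labeling is available):

```python
from itertools import combinations

d = 3
domain = list(range(1, 2 ** d + 1))            # X = {1, ..., 2^d}
# THR_j(x) = 1 iff x >= j; include j = 2^d + 1 so the all-zero labeling exists.
Thr = [(lambda j: lambda x: 1 if x >= j else 0)(j) for j in range(1, 2 ** d + 2)]

def shatters(concepts, pts):
    # pts is shattered if the concepts realize all 2^|pts| labelings on it.
    labelings = {tuple(f(x) for x in pts) for f in concepts}
    return len(labelings) == 2 ** len(pts)

# Every singleton is shattered, but no pair is (thresholds are monotone, so
# the labeling (1, 0) on x < x' is impossible): VCDIM(Thr_d) = 1.
singletons_ok = all(shatters(Thr, (x,)) for x in domain)
some_pair_shattered = any(shatters(Thr, pair) for pair in combinations(domain, 2))
```

This is the gap the slide highlights: the VC dimension is 1, yet log|Thr_d| = d, and the lower bounds below show the privacy cost tracks the latter.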

  5. Our results: lower bounds. LDIM(C) is Littlestone's dimension: the number of mistakes needed to learn C in the online model. We show VCDIM(C) ≤ LDIM(C) ≤ SCDP(C) ≤ O(log|C|) [KLNRS '08], with Points_d, Line_p, and Thr_d separating these quantities. Corollaries: SCDP(Thr_d) = Ω(d), since LDIM(Thr_d) = d [L '87]. For HS_d^b, the linear separators over {1, …, 2^b}^d: SCDP(HS_d^b) = Ω(d^2 b) [MT '94]. Line_p: X = F_p × F_p; Line_p = {ℓ_{a,b} : a, b ∈ F_p}, where ℓ_{a,b}(x, y) = 1 iff y = ax + b mod p; LDIM(Line_p) = 2, yet SCDP(Line_p) = Ω(log p).
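The gap for Line_p rests on simple geometry: two distinct lines over F_p intersect in at most one point, so two positive examples already pin the concept down. A small check of that geometry (our own, with p = 5 chosen arbitrarily):

```python
# Line_p: ell_{a,b}(x, y) = 1 iff y = a*x + b (mod p).
p = 5

def ell(a, b):
    return lambda x, y: 1 if y % p == (a * x + b) % p else 0

concepts = {(a, b): ell(a, b) for a in range(p) for b in range(p)}

# Positive set of each concept: the p points on the line y = a*x + b.
positives = {ab: {(x, y) for x in range(p) for y in range(p) if f(x, y) == 1}
             for ab, f in concepts.items()}

# |Line_p| = p^2; each concept labels exactly p points positively, and two
# distinct concepts share at most one positive point (two lines meet in at
# most one point) -- the geometry behind LDIM(Line_p) = 2.
```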

  6. Our results: characterization. Define Eval-C : C × X → {0,1} by Eval-C(f, x) = f(x). In the two-party one-way model, Alicia holds f ∈ C and Roberto holds x ∈ X; for every f, x the protocol must output f(x) with Pr[error] ≤ 1/4. Private coins: CC^→(Eval-C). Public coins: CC^{→,pub}(Eval-C). Theorem: SCDP(C) = Θ(CC^{→,pub}(Eval-C)).

  7. Related results. With distributional assumptions, with label privacy only, or when only labeled examples are counted, the sample complexity is Θ(VCDIM(C)) [CH '11, BNS '15]. Characterization in terms of distribution-independent covers: SCDP(C) = Θ(RCOVER(C)) [BNS '13a].

  8. Distribution-independent covers. h γ-covers f over distribution D if err_D(f, h) ≤ γ. H is a distribution-independent (DI) γ-cover for C if for every f ∈ C and every distribution D, some h ∈ H γ-covers f over D. COVER(C) = min{log|H| : H is a DI 1/4-cover for C}. Thm: SCDP(C) = O(COVER(C)) [KLNRS '08, BKN '10]. Proof: via the exponential mechanism [MT '07].
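The proof step is to pick a low-error hypothesis from the cover H privately. A minimal sketch of that exponential-mechanism selection (our code, not from the slides; the score and the ε-DP claim follow the standard McSherry–Talwar analysis, with sensitivity 1 because changing one example changes each score by at most 1):

```python
import math
import random

def exp_mechanism(H, sample, eps, rng):
    # score(h) = -(number of labeled examples h misclassifies); selecting h
    # with probability proportional to exp(eps * score / 2) is eps-DP since
    # the score has sensitivity 1.
    scores = [-sum(1 for x, y in sample if h(x) != y) for h in H]
    top = max(scores)
    weights = [math.exp(eps * (s - top) / 2) for s in scores]  # shift for stability
    r = rng.random() * sum(weights)
    for h, w in zip(H, weights):
        r -= w
        if r <= 0:
            return h
    return H[-1]
```

With enough examples, some h ∈ H has low empirical error, and the mechanism outputs a near-minimal-error hypothesis with high probability; log|H| examples suffice, giving SCDP(C) = O(COVER(C)).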

  9. Randomized DI covers. Let 𝓗 be a distribution over sets of hypotheses. 𝓗 is a DI (γ, β)-cover for C if for every f ∈ C and every distribution D, Pr_{H~𝓗}[some h ∈ H γ-covers f over D] ≥ 1 − β. size(𝓗) = max_{H ∈ supp(𝓗)} |H|. RCOVER(C) = min{log size(𝓗) : 𝓗 is a DI (1/4, 1/4)-cover for C}. RCOVER(C) = Θ(SCDP(C)) [BNS '13a].

  10. From covers to CC. A DI cover H guarantees: for every f ∈ C and distribution D there is h ∈ H such that err_D(f, h) ≤ 1/4. By the von Neumann minimax theorem, for every f ∈ C there is a distribution μ_f over H such that for every x ∈ X, Pr_{h~μ_f}[h(x) ≠ f(x)] ≤ 1/4. Protocol: Alicia samples h ~ μ_f and sends it using CC = log|H| bits; Roberto outputs h(x). For every f, x: Pr[error] ≤ 1/4.
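The protocol step can be sketched in a few lines (our illustration, not from the slides): Alicia, who knows f, samples a hypothesis index from μ_f and sends only that index, which costs log2|H| bits one-way; Roberto evaluates the named hypothesis on his x. The function names and the toy point-mass μ_f in the usage below are ours.

```python
import random

def alicia_message(mu_f, rng):
    # Alicia samples an index into the cover H according to mu_f and sends
    # it: log2(len(mu_f)) bits of one-way communication.
    return rng.choices(range(len(mu_f)), weights=mu_f, k=1)[0]

def roberto_output(H, idx, x):
    # Roberto evaluates the received hypothesis on his input x.
    return H[idx](x)
```

By the minimax step, for every x the sampled h disagrees with f(x) with probability at most 1/4, so this computes Eval-C(f, x) with error ≤ 1/4 at cost log|H|.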

  11. From covers to CC. COVER(C) = Θ(CC^→(Eval-C)) and RCOVER(C) = Θ(CC^{→,pub}(Eval-C)), using CC^→(Eval-C) ≤ CC^{→,pub}(Eval-C) + O(log log|C × X|) [N '91].

  12. Lower bound tools. Information theory [BJKS '02]: 1. Find a hard distribution over inputs to Eval-C. 2. Low communication implies low (mutual) information. 3. Low information implies large error. Augmented Index [BJKK '04, BIPW '10]. [Figure: an LDIM(C) mistake tree, a binary tree whose internal nodes are labeled by points x_0, x_1, x_{00}, x_{01}, …, with edges labeled 0/1.]

  13. Our results: upper bounds. Relaxed (ε, δ)-differential privacy: A is (ε, δ)-differentially private if for any two data sets S, S' that differ in a single element, and every Z ⊆ range(A): Pr[A(S) ∈ Z] ≤ e^ε · Pr[A(S') ∈ Z] + δ. SCDP_δ(Thr_d) = O(16^{log* d} · log(1/δ)) [BNS '13b]. SCDP_δ(Line_p) = O(log(1/δ)). There is an efficient (ε, δ)-DP algorithm that learns Line_p using O(log(1/δ)/(εα)) examples.

  14. Conclusions and open problems. 1. Characterization of SCDP in terms of communication complexity, using tools from information theory; are there additional applications? Open problems: Is the sample complexity of (ε, δ)-differentially private learning different from VCDIM? What is the sample complexity of efficient DP learning of HS_d^b?
