Sample Complexity Bounds on Differentially Private Learning via Communication Complexity


This presentation studies sample complexity bounds for differentially private learning through the lens of communication complexity, relating the PAC learning model, differential privacy, and one-way communication, and surveying the cost of privacy and related results in the field.

  • Privacy
  • Learning
  • Complexity
  • Differentially Private Learning
  • Communication Complexity

Uploaded on Mar 09, 2025



Presentation Transcript


  1. Sample Complexity Bounds on Differentially Private Learning via Communication Complexity. Vitaly Feldman, IBM Research Almaden; David Xiao, CNRS, Université Paris 7. ITA 2015.

  2. Learning model. The learner has n i.i.d. examples (x_1, y_1), …, (x_n, y_n) over X × {0,1}. PAC model [V '84]: each x_i ~ D and y_i = f(x_i) for an unknown f ∈ C and distribution D over X. For every D, every f ∈ C, and every α > 0, given the examples, with probability ≥ 3/4 output h such that err_D(h, f) = Pr_{x~D}[h(x) ≠ f(x)] ≤ α.
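As an illustration of the setup above (our own sketch, not part of the slides), the following Python fragment draws labeled examples and measures error under the data distribution; for simplicity D is assumed uniform over a finite support, and the helper names are ours:

```python
import random

# Sketch of the PAC setup: examples (x_i, f(x_i)) are drawn i.i.d. from D,
# and a hypothesis h is judged by err_D(h, f) = Pr_{x~D}[h(x) != f(x)].
def draw_sample(f, support, n, rng):
    # D here is the uniform distribution over `support` (an assumption
    # made purely for illustration).
    xs = [rng.choice(support) for _ in range(n)]
    return [(x, f(x)) for x in xs]

def err(h, f, support):
    # Exact error of h against f under the uniform D on `support`.
    return sum(1 for x in support if h(x) != f(x)) / len(support)
```

The learner succeeds if, with probability at least 3/4 over the sample (and its own coins), `err` of its output is at most α.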

  3. Privacy. Each example is created from the personal data of an individual: (x_i, y_i) = (GTTCACG TC, YES). Differential privacy [DMNS '06]: a (randomized) algorithm A is ε-differentially private if for any two data sets S, S' that differ in a single element, and every Z ⊆ range(A): Pr[A(S) ∈ Z] ≤ e^ε · Pr[A(S') ∈ Z].
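To make the definition concrete (our illustration, not from the slides), randomized response on a single bit is the textbook ε-differentially private primitive; the function names are ours, the probabilities are standard:

```python
import math
import random

def randomized_response(bit, eps, rng):
    # Report the true bit with probability e^eps / (1 + e^eps), else flip it.
    p_true = math.exp(eps) / (1 + math.exp(eps))
    return bit if rng.random() < p_true else 1 - bit

def output_prob(bit, out, eps):
    # Probability that randomized_response(bit, eps, .) outputs `out`.
    p_true = math.exp(eps) / (1 + math.exp(eps))
    return p_true if out == bit else 1 - p_true
```

For the two neighboring inputs (bit = 0 vs. bit = 1), the probability of any output differs by a factor of at most e^ε, which is exactly the ε-DP condition for this single-bit mechanism.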

  4. What is the cost of privacy? SCDP(C) = sample complexity of PAC learning C with α = 1/4 and 1-differential privacy. Known bounds: VCDIM(C) ≤ SCDP(C) ≤ O(log|C|) [KLNRS '08]. Points_d: X = {1, …, 2^d}; IND_j(x) = 1 iff x = j; Points_d = {IND_j : j ∈ X}; SCDP(Points_d) = Θ(1) [F '09, BKN '10]. Thr_d: X = {1, …, 2^d}; THR_j(x) = 1 iff x ≥ j; Thr_d = {THR_j : j ∈ X}; VCDIM(Thr_d) = 1 while log|Thr_d| = d.
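A quick brute-force check of the claim VCDIM(Thr_d) = 1 (our own sketch, here for d = 3; the helper names are illustrative, and the range of j is extended by one so that the all-zero labeling is available):

```python
from itertools import combinations

d = 3
domain = list(range(1, 2 ** d + 1))            # X = {1, ..., 2^d}
# THR_j(x) = 1 iff x >= j; include j = 2^d + 1 so the all-zero labeling exists.
Thr = [(lambda j: lambda x: 1 if x >= j else 0)(j) for j in range(1, 2 ** d + 2)]

def shatters(concepts, pts):
    # pts is shattered if the concepts realize all 2^|pts| labelings on it.
    labelings = {tuple(f(x) for x in pts) for f in concepts}
    return len(labelings) == 2 ** len(pts)

# Every singleton is shattered, but no pair is (thresholds are monotone, so
# the labeling (1, 0) on x < x' is impossible): VCDIM(Thr_d) = 1.
singletons_ok = all(shatters(Thr, (x,)) for x in domain)
some_pair_shattered = any(shatters(Thr, pair) for pair in combinations(domain, 2))
```

This is the gap the slide highlights: the VC dimension is 1, yet log|Thr_d| = d, and the lower bounds below show the privacy cost tracks the latter.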

  5. Our results: lower bounds. LDIM(C) is Littlestone's dimension: the number of mistakes needed to learn C in the online model. We show VCDIM(C) ≤ LDIM(C) ≤ SCDP(C) ≤ O(log|C|) [KLNRS '08], with Points_d, Line_p, and Thr_d separating these quantities. Corollaries: SCDP(Thr_d) = Ω(d), since LDIM(Thr_d) = d [L '87]. For HS_d^b, the linear separators over {1, …, 2^b}^d: SCDP(HS_d^b) = Ω(d^2 b) [MT '94]. Line_p: X = F_p × F_p; Line_p = {ℓ_{a,b} : a, b ∈ F_p}, where ℓ_{a,b}(x, y) = 1 iff y = ax + b mod p; LDIM(Line_p) = 2, yet SCDP(Line_p) = Ω(log p).
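The gap for Line_p rests on simple geometry: two distinct lines over F_p intersect in at most one point, so two positive examples already pin the concept down. A small check of that geometry (our own, with p = 5 chosen arbitrarily):

```python
# Line_p: ell_{a,b}(x, y) = 1 iff y = a*x + b (mod p).
p = 5

def ell(a, b):
    return lambda x, y: 1 if y % p == (a * x + b) % p else 0

concepts = {(a, b): ell(a, b) for a in range(p) for b in range(p)}

# Positive set of each concept: the p points on the line y = a*x + b.
positives = {ab: {(x, y) for x in range(p) for y in range(p) if f(x, y) == 1}
             for ab, f in concepts.items()}

# |Line_p| = p^2; each concept labels exactly p points positively, and two
# distinct concepts share at most one positive point (two lines meet in at
# most one point) -- the geometry behind LDIM(Line_p) = 2.
```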

  6. Our results: characterization. Define Eval-C : C × X → {0,1} by Eval-C(f, x) = f(x). In the two-party one-way model, Alicia holds f ∈ C and Roberto holds x ∈ X; for every f, x the protocol must output f(x) with Pr[error] ≤ 1/4. Private coins: CC^→(Eval-C). Public coins: CC^{→,pub}(Eval-C). Theorem: SCDP(C) = Θ(CC^{→,pub}(Eval-C)).

  7. Related results. With distributional assumptions, with label privacy only, or when only labeled examples are counted, the sample complexity is Θ(VCDIM(C)) [CH '11, BNS '15]. Characterization in terms of distribution-independent covers: SCDP(C) = Θ(RCOVER(C)) [BNS '13a].

  8. Distribution-independent covers. h γ-covers f over distribution D if err_D(f, h) ≤ γ. H is a distribution-independent (DI) γ-cover for C if for every f ∈ C and every distribution D, some h ∈ H γ-covers f over D. COVER(C) = min{log|H| : H is a DI 1/4-cover for C}. Thm: SCDP(C) = O(COVER(C)) [KLNRS '08, BKN '10]. Proof: via the exponential mechanism [MT '07].
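The proof step is to pick a low-error hypothesis from the cover H privately. A minimal sketch of that exponential-mechanism selection (our code, not from the slides; the score and the ε-DP claim follow the standard McSherry–Talwar analysis, with sensitivity 1 because changing one example changes each score by at most 1):

```python
import math
import random

def exp_mechanism(H, sample, eps, rng):
    # score(h) = -(number of labeled examples h misclassifies); selecting h
    # with probability proportional to exp(eps * score / 2) is eps-DP since
    # the score has sensitivity 1.
    scores = [-sum(1 for x, y in sample if h(x) != y) for h in H]
    top = max(scores)
    weights = [math.exp(eps * (s - top) / 2) for s in scores]  # shift for stability
    r = rng.random() * sum(weights)
    for h, w in zip(H, weights):
        r -= w
        if r <= 0:
            return h
    return H[-1]
```

With enough examples, some h ∈ H has low empirical error, and the mechanism outputs a near-minimal-error hypothesis with high probability; log|H| examples suffice, giving SCDP(C) = O(COVER(C)).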

  9. Randomized DI covers. Let 𝓗 be a distribution over sets of hypotheses. 𝓗 is a DI (γ, β)-cover for C if for every f ∈ C and every distribution D, Pr_{H~𝓗}[some h ∈ H γ-covers f over D] ≥ 1 − β. size(𝓗) = max_{H ∈ supp(𝓗)} |H|. RCOVER(C) = min{log size(𝓗) : 𝓗 is a DI (1/4, 1/4)-cover for C}. RCOVER(C) = Θ(SCDP(C)) [BNS '13a].

  10. From covers to CC. A DI cover H guarantees: for every f ∈ C and distribution D there is h ∈ H such that err_D(f, h) ≤ 1/4. By the von Neumann minimax theorem, for every f ∈ C there is a distribution μ_f over H such that for every x ∈ X, Pr_{h~μ_f}[h(x) ≠ f(x)] ≤ 1/4. Protocol: Alicia samples h ~ μ_f and sends it using CC = log|H| bits; Roberto outputs h(x). For every f, x: Pr[error] ≤ 1/4.
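The protocol step can be sketched in a few lines (our illustration, not from the slides): Alicia, who knows f, samples a hypothesis index from μ_f and sends only that index, which costs log2|H| bits one-way; Roberto evaluates the named hypothesis on his x. The function names and the toy point-mass μ_f in the usage below are ours.

```python
import random

def alicia_message(mu_f, rng):
    # Alicia samples an index into the cover H according to mu_f and sends
    # it: log2(len(mu_f)) bits of one-way communication.
    return rng.choices(range(len(mu_f)), weights=mu_f, k=1)[0]

def roberto_output(H, idx, x):
    # Roberto evaluates the received hypothesis on his input x.
    return H[idx](x)
```

By the minimax step, for every x the sampled h disagrees with f(x) with probability at most 1/4, so this computes Eval-C(f, x) with error ≤ 1/4 at cost log|H|.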

  11. From covers to CC. COVER(C) = Θ(CC^→(Eval-C)) and RCOVER(C) = Θ(CC^{→,pub}(Eval-C)), using CC^→(Eval-C) ≤ CC^{→,pub}(Eval-C) + O(log log|C × X|) [N '91].

  12. Lower bound tools. Information theory [BJKS '02]: 1. Find a hard distribution over inputs to Eval-C. 2. Low communication implies low (mutual) information. 3. Low information implies large error. Augmented Index [BJKK '04, BIPW '10]. [Figure: an LDIM(C) mistake tree, a binary tree whose internal nodes are labeled by points x_0, x_1, x_{00}, x_{01}, …, with edges labeled 0/1.]

  13. Our results: upper bounds. Relaxed (ε, δ)-differential privacy: A is (ε, δ)-differentially private if for any two data sets S, S' that differ in a single element, and every Z ⊆ range(A): Pr[A(S) ∈ Z] ≤ e^ε · Pr[A(S') ∈ Z] + δ. SCDP_δ(Thr_d) = O(16^{log* d} · log(1/δ)) [BNS '13b]. SCDP_δ(Line_p) = O(log(1/δ)). There is an efficient (ε, δ)-DP algorithm that learns Line_p using O(log(1/δ)/(εα)) examples.

  14. Conclusions and open problems. 1. Characterization of SCDP in terms of communication complexity, using tools from information theory; are there additional applications? Open problems: Is the sample complexity of (ε, δ)-differentially private learning different from VCDIM? What is the sample complexity of efficient DP learning of HS_d^b?
