Private Learning Algorithms and Hierarchy in Machine Learning

Explore private learning algorithms, differential privacy, PAC learnability, and the hierarchy of private learning algorithm classes in machine learning. Understand the goal of learning from data while minimizing error and preserving privacy.

  • Machine Learning
  • Private Learning
  • Differential Privacy
  • PAC Learnability
  • Algorithm Classes


Presentation Transcript


  1. What Can We Learn Privately? Shiva Prasad Kasiviswanathan, Homin K. Lee, Kobbi Nissim, Sofya Raskhodnikova, Adam Smith. First published 2008. Presentation by Or Shahaf.

  2. ML reminder: problem. $X$: inputs; $Y = \{0,1\}$: labels; a concept is a function $c : X \to Y$. For example, $x_1 = (\mathrm{age}\!: 27,\ \mathrm{gender}\!:\ \mathrm{male},\ \dots)$ with label $c(x_1) = 1$, $c(x_2) = 0$, and so on. Each input has size $d$ bits. Denote by $Z = \{(x, y) : x \in X,\ y \in Y\}$ the set of labeled examples.

  3. ML reminder: sample. $D = (z_1, \dots, z_n)$: the database. Each entry is a labeled example, $z_i = (x_i, y_i)$, e.g. $z_1 = (\mathrm{age}\!: 27,\ \mathrm{gender}\!:\ \mathrm{male},\ \dots,\ \mathrm{label}\!: 0)$. The examples are drawn $z_i \sim \mathcal{D}$, so $D$ is an i.i.d. sample of size $n$ from $\mathcal{D}$.

  4. ML reminder: goal. Hypothesis class: $H$. The true error of a hypothesis $h$ on distribution $\mathcal{D}$: $\mathrm{err}(h) = \Pr_{(x,y) \sim \mathcal{D}}[h(x) \ne y]$. Goal: find an algorithm $A : Z^n \to H$ that minimizes $\mathrm{err}(A(D))$.
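
To make the setup concrete, here is a minimal Python sketch (not from the slides; `true_error`, `sample_from_D`, and the toy distribution are illustrative assumptions) that estimates $\mathrm{err}(h)$ by sampling:

```python
import random

def true_error(h, sample_from_D, trials=100_000):
    """Monte Carlo estimate of err(h) = Pr_{(x,y)~D}[h(x) != y]."""
    mistakes = 0
    for _ in range(trials):
        x, y = sample_from_D()       # one labeled example (x, y) ~ D
        mistakes += (h(x) != y)
    return mistakes / trials

# Toy distribution: x uniform on [0, 1], label = [x > 0.5] flipped w.p. 0.1.
def sample_from_D():
    x = random.random()
    y = int(x > 0.5) if random.random() < 0.9 else int(x <= 0.5)
    return x, y

h = lambda x: int(x > 0.5)           # err(h) should be about 0.1
print(true_error(h, sample_from_D))
```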

  5. Adding Privacy. $D$ is private; $A(D)$ is public. We want $A$ to also be $\varepsilon$-differentially private: for all databases $D, D' \in Z^n$ that differ in a single entry, and for every output $h$: $\Pr[A(D) = h] \le e^{\varepsilon} \Pr[A(D') = h]$.
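
As a sanity check on the definition, here is the classic randomized-response mechanism in Python; it is a standard textbook example of an $\varepsilon$-DP function (not the learner from these slides), and for single-bit databases it satisfies the inequality above with equality:

```python
import math
import random

def randomized_response(bit, epsilon):
    """Report the true bit w.p. e^eps / (1 + e^eps), else flip it.
    For any output b and any inputs z, z':
    Pr[R(z) = b] <= e^eps * Pr[R(z') = b], i.e. epsilon-DP."""
    p_truth = math.exp(epsilon) / (1 + math.exp(epsilon))
    return bit if random.random() < p_truth else 1 - bit
```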

  6. Hierarchy of Private Learning Algorithm Classes. Each class is a certain set of classification problems. The hierarchy: PAC* = PPAC* $\supseteq$ LI* = SQ* $\supseteq$ LNI* = NASQ*. We'll prove part of this hierarchy.

  7. ML Reminder: PAC learnable. Let $\mathcal{P}$ be a set of distributions over $X$. Informally, $(C, \mathcal{P})$ is PAC learnable using $H$ if some $A$ achieves arbitrarily small error with arbitrarily high probability, using only polynomially many examples. Formally, $(C, \mathcal{P})$ is PAC learnable using $H$ if there exist a (possibly random) algorithm $A$ (a "PAC learner") and a polynomial $\mathrm{poly}(\cdot)$ s.t. for every $c \in C$, every distribution $\mathcal{D} \in \mathcal{P}$, and every $\alpha, \beta \in (0, 1/2)$, there exists $n = \mathrm{poly}(d, \mathrm{size}(c), 1/\alpha, \log(1/\beta))$ s.t. $\Pr[\mathrm{err}(A(D)) > \alpha] \le \beta$, where the probability is over the choice of $D$ of size $n$ and the randomness of $A$.

  8. ML Reminder: agnostically PAC learnable. Informally, $(C, \mathcal{P})$ is agnostically PAC learnable using $H$ if some $A$ achieves error arbitrarily close to optimal with arbitrarily high probability, using only polynomially many examples. Formally, $(C, \mathcal{P})$ is agnostically PAC learnable using $H$ if there exist a (possibly random) algorithm $A$ (a "PAC learner") and a polynomial $\mathrm{poly}(\cdot)$ s.t. for every $c \in C$, every $\mathcal{D} \in \mathcal{P}$, and every $\alpha, \beta \in (0, 1/2)$, there exists $n = \mathrm{poly}(d, \mathrm{size}(c), 1/\alpha, \log(1/\beta))$ s.t. $\Pr[\mathrm{err}(A(D)) > \mathrm{OPT} + \alpha] \le \beta$, where $\mathrm{OPT} = \min_{h \in H} \mathrm{err}(h)$ and the probability is over the choice of $D$ of size $n$ and the randomness of $A$. If some such $A$ runs in time polynomial in $n$, then $(C, \mathcal{P})$ is efficiently PAC learnable.

  9. PAC* = PPAC*. PAC* $= \{(C, \mathcal{P}) : (C, \mathcal{P})$ is (inefficiently) agnostically PAC learnable$\}$. PPAC* $= \{(C, \mathcal{P}) : (C, \mathcal{P})$ has an $\varepsilon$-differentially private PAC learner for every $\varepsilon > 0\}$. We'll show that PAC* = PPAC*.

  10. Generic Private Agnostic Learner. Define the utility function $q(D, h) = -\,|\{i : h(x_i) \ne y_i\}|$. Algorithm $A^{\varepsilon}$: output $h \in H$ with probability proportional to $\exp(\varepsilon\, q(D, h)/2)$. Theorem: $A^{\varepsilon}$ is $\varepsilon$-differentially private. Proof: $\Delta q = 1$; thus $A^{\varepsilon}$ implements the exponential mechanism for $q$ with $\varepsilon$-privacy. Theorem: assume $|H| \le \exp(\mathrm{poly}(\mathrm{size}(c)))$. Then $A^{\varepsilon}$ is an agnostic PAC learner using $H$. Also, it uses $n = O\!\left(\left(\ln|H| + \ln\frac{1}{\beta}\right) \max\!\left(\frac{1}{\varepsilon\alpha}, \frac{1}{\alpha^2}\right)\right)$ examples.
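
A minimal sketch of $A^{\varepsilon}$ in Python, assuming a finite list `H` of hypothesis callables and a database `D` of `(x, y)` pairs (names are illustrative):

```python
import math
import random

def private_agnostic_learner(D, H, epsilon):
    """Generic private agnostic learner: sample h from H with
    probability proportional to exp(-epsilon * n * emp_err(h) / 2)."""
    n = len(D)
    def emp_err(h):
        return sum(h(x) != y for x, y in D) / n
    # q(D, h) = -n * emp_err(h) changes by at most 1 when a single
    # record of D changes, so this sampling is epsilon-DP.
    weights = [math.exp(-epsilon * n * emp_err(h) / 2) for h in H]
    return random.choices(H, weights=weights, k=1)[0]
```

Note that the sketch enumerates all of $H$, so its running time is linear in $|H|$; this is exactly the efficiency issue raised on slide 15.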

  11. ML reminder: errors. The empirical error of $h$ on $D$: $\widehat{\mathrm{err}}_D(h) = \frac{1}{n}\,|\{i : h(x_i) \ne y_i\}|$. The minimal true/empirical errors: $\mathrm{OPT} = \min_{h \in H} \mathrm{err}(h)$ and $\widehat{\mathrm{OPT}} = \min_{h \in H} \widehat{\mathrm{err}}_D(h)$.

  12. Generic Private Agnostic Learner: Utility Proof 1. We need $\Pr[\mathrm{err}(A^{\varepsilon}(D)) > \mathrm{OPT} + \alpha] \le \beta$. Proof: Consider the bad event $B = \{\mathrm{err}(A^{\varepsilon}(D)) > \mathrm{OPT} + 3\alpha'\}$ and the event $E = \{\exists h \in H : |\widehat{\mathrm{err}}_D(h) - \mathrm{err}(h)| > \alpha'\}$. Observe that when $E$ does not occur, any $h$ with $\widehat{\mathrm{err}}_D(h) \le \widehat{\mathrm{OPT}} + \alpha'$ satisfies $\mathrm{err}(h) \le \mathrm{OPT} + 3\alpha'$, because $\mathrm{err}(h) \le \widehat{\mathrm{err}}_D(h) + \alpha' \le \widehat{\mathrm{OPT}} + 2\alpha' \le \widehat{\mathrm{err}}_D(h^*) + 2\alpha' \le \mathrm{err}(h^*) + 3\alpha' = \mathrm{OPT} + 3\alpha'$, where $h^*$ attains $\mathrm{OPT}$. Thus $\Pr[B] \le \Pr[E] + \Pr[\widehat{\mathrm{err}}_D(A^{\varepsilon}(D)) > \widehat{\mathrm{OPT}} + \alpha']$. From Chernoff-Hoeffding bounds, for each $h$: $\Pr[|\widehat{\mathrm{err}}_D(h) - \mathrm{err}(h)| > \alpha'] \le 2\exp(-2n\alpha'^2)$, so by a union bound $\Pr[E] \le 2|H|\exp(-2n\alpha'^2)$.

  13. Generic Private Agnostic Learner: Utility Proof 2. Also, for any $h \in H$: $\Pr[A^{\varepsilon}(D) = h] = \frac{\exp(-\varepsilon n \widehat{\mathrm{err}}_D(h)/2)}{\sum_{h' \in H} \exp(-\varepsilon n \widehat{\mathrm{err}}_D(h')/2)} \le \frac{\exp(-\varepsilon n \widehat{\mathrm{err}}_D(h)/2)}{\exp(-\varepsilon n \widehat{\mathrm{OPT}}/2)} = \exp\!\left(-\frac{\varepsilon n}{2}\left(\widehat{\mathrm{err}}_D(h) - \widehat{\mathrm{OPT}}\right)\right)$. Summing over the at most $|H|$ hypotheses with $\widehat{\mathrm{err}}_D(h) > \widehat{\mathrm{OPT}} + \alpha'$: $\Pr[\widehat{\mathrm{err}}_D(A^{\varepsilon}(D)) > \widehat{\mathrm{OPT}} + \alpha'] < |H| \exp(-\varepsilon n \alpha'/2)$.

  14. Generic Private Agnostic Learner: Utility Proof 3. Set $\alpha' = \alpha/3$ and finally: $\Pr[\mathrm{err}(A^{\varepsilon}(D)) > \mathrm{OPT} + \alpha] \le \Pr[E] + \Pr[\widehat{\mathrm{err}}_D(A^{\varepsilon}(D)) > \widehat{\mathrm{OPT}} + \alpha/3] < 2|H|\exp\!\left(-\frac{2n\alpha^2}{9}\right) + |H|\exp\!\left(-\frac{\varepsilon n \alpha}{6}\right)$. This is at most $\beta$ with $n \ge 6\left(\ln|H| + \ln\frac{1}{\beta}\right)\max\!\left(\frac{1}{\varepsilon\alpha}, \frac{1}{\alpha^2}\right)$ (up to constant factors).
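
Plugging numbers into the bound just derived (a worked example; the constant 6 is the one from this slide):

```python
import math

def sample_bound(H_size, alpha, beta, epsilon):
    """n >= 6 (ln|H| + ln(1/beta)) max(1/(eps*alpha), 1/alpha^2)."""
    return 6 * (math.log(H_size) + math.log(1 / beta)) \
             * max(1 / (epsilon * alpha), 1 / alpha ** 2)

# |H| = 2**20, alpha = 0.1, beta = 0.01, epsilon = 0.5:
print(round(sample_bound(2**20, 0.1, 0.01, 0.5)))  # about 11,100
```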

  15. Generic Private Learner vs. Non-private. Theorem: assume $|H| \le \exp(\mathrm{poly}(\mathrm{size}(c)))$. Then $A^{\varepsilon}$ is an agnostic PAC learner using $H$, and it uses $n = O\!\left(\left(\ln|H| + \ln\frac{1}{\beta}\right)\max\!\left(\frac{1}{\varepsilon\alpha}, \frac{1}{\alpha^2}\right)\right)$ examples. Implication: if a class is learnable, then it's privately learnable: PAC* = PPAC*. Sample complexity without privacy: $n = O\!\left(\frac{\ln|H| + \ln(1/\beta)}{\alpha^2}\right)$. The private learner differs by a factor of $\alpha/\varepsilon$ if $\varepsilon < \alpha$; otherwise, no difference. The time complexity of $A^{\varepsilon}$ is $O(|H| \cdot \mathrm{poly}(n))$. Possibly exponential! Can efficiently learnable problems be efficiently learnable privately?
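
The comparison boils down to the ratio of the two bounds (illustrative arithmetic, names hypothetical):

```python
def privacy_overhead(alpha, epsilon):
    """Ratio of the private factor max(1/(eps*alpha), 1/alpha^2)
    to the non-private 1/alpha^2: alpha/eps when eps < alpha."""
    return max(1.0, alpha / epsilon)

print(privacy_overhead(0.1, 0.01))  # 10.0: privacy costs a 10x factor
print(privacy_overhead(0.1, 0.5))   # 1.0: no overhead when eps >= alpha
```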

  16. The Local Model. An $\varepsilon$-local randomizer $R$ is an $\varepsilon$-DP function that accepts databases of size $n = 1$. Since $n = 1$, any $z, z' \in Z$ are neighbors: $\forall z, z' \in Z$ and every output $w$, $\Pr[R(z) = w] \le e^{\varepsilon} \Pr[R(z') = w]$. An $\varepsilon$-local algorithm can only access $D$ through local randomizers, s.t. if $R_1, \dots, R_k$ are the local randomizers applied to an entry $z_i$ and each $R_j$ is an $\varepsilon_j$-local randomizer, then $\sum_j \varepsilon_j \le \varepsilon$. $\varepsilon$-local algorithms are $\varepsilon$-DP (proof: composition). If a local algorithm prepares all its queries before it receives any answer, it's called non-interactive; otherwise, it's interactive. [Diagram: a local algorithm sending queries to per-entry randomizers $R_j(z_i)$ and collecting their answers.]
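
A sketch of the local model's data flow in Python (illustrative; `laplace_randomizer` and `run_local_protocol` are made-up names): the server only ever sees randomizer outputs, never raw records.

```python
import numpy as np

def laplace_randomizer(g, epsilon):
    """An epsilon-local randomizer for g: Z -> [-1, 1]: over databases
    of size 1, g has sensitivity 2, so Lap(2/eps) noise suffices."""
    return lambda z: g(z) + np.random.laplace(0.0, 2.0 / epsilon)

def run_local_protocol(records, randomizer):
    """Untrusted-server view: each user randomizes their own record
    once; only the noisy values leave the users' hands."""
    return [randomizer(z) for z in records]
```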

  17. The Statistical Query Model. Let $\mathcal{D}$ be a distribution over $Z$. An SQ oracle $SQ_{\mathcal{D}}$ takes a query $g : Z \to [-1, 1]$ and a tolerance $\tau \in (0, 1)$, and outputs $v$ such that $|v - \mathbb{E}_{z \sim \mathcal{D}}[g(z)]| \le \tau$. Note: $g$ doesn't have to be efficiently computable. An SQ algorithm accesses the distribution $\mathcal{D}$ only via the $SQ_{\mathcal{D}}$ oracle. If an SQ algorithm prepares all its queries to $SQ_{\mathcal{D}}$ before it receives any answer, it's called non-adaptive; otherwise, it's adaptive. [Diagram: an SQ algorithm sending queries $(g_1, \tau_1), (g_2, \tau_2), \dots$ to the $SQ_{\mathcal{D}}$ oracle and receiving answers $v_j \approx \mathbb{E}_{z \sim \mathcal{D}}[g_j(z)]$.]
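
A toy SQ oracle in Python for a finite, explicitly given distribution (illustrative; a real oracle only promises some answer within the tolerance, which the perturbation below models):

```python
import random

def make_sq_oracle(support, probs):
    """SQ oracle for a finite distribution D given as (support, probs).
    Given g: Z -> [-1, 1] and tolerance tau, returns a value within
    tau of E_{z~D}[g(z)]."""
    def oracle(g, tau):
        exact = sum(p * g(z) for z, p in zip(support, probs))
        return exact + random.uniform(-tau, tau)  # arbitrary slack <= tau
    return oracle
```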

  18. Local model vs. SQ model. We show that if $z_1, \dots, z_n$ are drawn i.i.d. from $\mathcal{D}$, then the two models are equivalent. Moreover, the expected query complexity is preserved up to a polynomial factor.

  19. Simulate SQ Algorithm by Local Algorithm. [Diagram: the SQ algorithm's queries $(g_j, \tau_j)$ pass through an SQ-oracle simulator, which queries local randomizers on the database entries and returns estimates of $\mathbb{E}_{z \sim \mathcal{D}}[g_j(z)]$.]

  20. Simulate SQ Oracle by Local Randomizers. Let $g : Z \to [-1, 1]$. Define $R_g$ s.t. $R_g(z) = g(z) + \eta$ where $\eta \sim \mathrm{Lap}(2/\varepsilon)$; $R_g$ is an $\varepsilon$-local randomizer. Let $T \subseteq [n]$. We construct a local algorithm $A_{g,T}$: return $\frac{1}{|T|} \sum_{i \in T} R_g(z_i)$. $A_{g,T}$ is $\varepsilon$-local: each $z_i$ with $i \in T$ is queried at most once, and if so, with the $\varepsilon$-local randomizer $R_g$. Define $\mathrm{Bad} = \{v : |v - \mathbb{E}_{z \sim \mathcal{D}}[g(z)]| > \tau\}$, i.e. the set of invalid SQ oracle outputs. Theorem: say $|T| = O\!\left(\frac{\log(1/\beta)}{\tau^2 \varepsilon^2}\right)$ and $z_1, \dots, z_n$ are drawn i.i.d. from $\mathcal{D}$. Then $\Pr[A_{g,T}(D) \in \mathrm{Bad}] \le \beta$.
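
A sketch of $A_{g,T}$ in Python (assuming, as above, a list `records` in the role of $D$ and an index set `T`):

```python
import numpy as np

def A_gT(records, T, g, epsilon):
    """Simulate one SQ answer locally: apply R_g(z) = g(z) + Lap(2/eps)
    to each record indexed by T (each record touched exactly once) and
    average. The sample mean concentrates around E[g(z)] and the
    Laplace noise averages out, as the proof on the next slide shows."""
    noisy = [g(records[i]) + np.random.laplace(0.0, 2.0 / epsilon)
             for i in T]
    return sum(noisy) / len(T)
```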

  21. $A_{g,T}$ is valid with probability $1 - \beta$: proof. Theorem: say $|T| = O\!\left(\frac{\log(1/\beta)}{\tau^2 \varepsilon^2}\right)$ and $z_1, \dots, z_n$ are drawn i.i.d. from $\mathcal{D}$. Then $\Pr[A_{g,T}(D) \in \mathrm{Bad}] \le \beta$. Proof: $A_{g,T}$ returns $\frac{1}{|T|} \sum_{i \in T} g(z_i) + \frac{1}{|T|} \sum_{i \in T} \eta_i$. The $g(z_i)$, $i \in T$, are i.i.d. in the range $[-1, 1]$ with expectation $\mathbb{E}_{z \sim \mathcal{D}}[g(z)]$. From the Chernoff-Hoeffding bound: $\Pr\!\left[\left|\frac{1}{|T|} \sum_{i \in T} g(z_i) - \mathbb{E}_{z \sim \mathcal{D}}[g(z)]\right| \ge \frac{\tau}{2}\right] \le 2\exp\!\left(-\frac{\tau^2 |T|}{8}\right)$. The noise term $\frac{1}{|T|} \sum_{i \in T} \eta_i$ is the average of $|T|$ i.i.d. RVs $\sim \mathrm{Lap}(2/\varepsilon)$; thus $\Pr\!\left[\left|\frac{1}{|T|} \sum_{i \in T} \eta_i\right| \ge \frac{\tau}{2}\right] \le \exp\!\left(-\frac{\tau^2 \varepsilon^2 |T|}{64}\right)$. Together: $\Pr[A_{g,T}(D) \in \mathrm{Bad}] \le 2\exp\!\left(-\frac{\tau^2 |T|}{8}\right) + \exp\!\left(-\frac{\tau^2 \varepsilon^2 |T|}{64}\right) \le \frac{\beta}{2} + \frac{\beta}{2} = \beta$.

  22. Simulate SQ Algorithm by Local Algorithm. Let $A_{SQ}$ be an SQ algorithm that makes at most $t$ queries to $SQ_{\mathcal{D}}$ on a given input. Define a local algorithm $A_L$ that simulates $A_{SQ}$: each query $(g_j, \tau_j)$ of $A_{SQ}$ is simulated by $A_L$ by running $A_{g_j, T_j}$, where $|T_j| = O\!\left(\frac{\log(t/\beta)}{\tau_j^2 \varepsilon^2}\right)$ and the $T_j$'s are pairwise disjoint. $A_L$ is $\varepsilon$-local. Proof: each entry $z_i$ belongs to at most one $T_j$, so it is queried at most once, with the $\varepsilon$-local randomizer $R_{g_j}$. [Diagram: each SQ query $(g_j, \tau_j)$ is answered by an oracle simulator $A_{g_j, T_j}$ over its own block of the database.]

  23. Simulate SQ Algorithm by Local Algorithm. Theorem: suppose an invocation of $A_{SQ}$ makes at most $t$ queries to $SQ_{\mathcal{D}}$, each with tolerance at least $\tau$. If $n = O\!\left(\frac{t \log(t/\beta)}{\tau^2 \varepsilon^2}\right)$, then $A_L$ outputs a valid output of $A_{SQ}$ w.p. $\ge 1 - \beta$. Proof: from the union bound, $\Pr[\exists j : A_{g_j, T_j}(D) \in \mathrm{Bad}_j] \le t \cdot \frac{\beta}{t} = \beta$. Also, $n = \sum_{j=1}^{t} |T_j| = t \cdot O\!\left(\frac{\log(t/\beta)}{\tau^2 \varepsilon^2}\right) = O\!\left(\frac{t \log(t/\beta)}{\tau^2 \varepsilon^2}\right)$. The number of queries increases from $t$ to $n$, i.e. by a polynomial factor. Notice that if $A_{SQ}$ is non-adaptive, then $A_L$ can prepare its responses before returning any of them to $A_{SQ}$, so $A_L$ can be non-interactive.
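
Putting the pieces together, a sketch of $A_L$ for a non-adaptive batch of queries (illustrative; constants in the block sizes are omitted, and the database is assumed large enough to give every query its own disjoint block):

```python
import math
import numpy as np

def simulate_sq_by_local(records, queries, epsilon, beta):
    """Answer SQ queries [(g_1, tau_1), ..., (g_t, tau_t)] locally.
    Query j gets a fresh block T_j of about log(t/beta)/(tau_j^2 eps^2)
    records, so each record is randomized at most once and the whole
    algorithm is epsilon-local (and non-interactive)."""
    t = len(queries)
    answers, start = [], 0
    for g, tau in queries:
        size = math.ceil(math.log(t / beta) / (tau**2 * epsilon**2))
        block = records[start:start + size]
        assert len(block) == size, "database too small for this sketch"
        start += size
        noisy = [g(z) + np.random.laplace(0.0, 2.0 / epsilon) for z in block]
        answers.append(sum(noisy) / size)
    return answers
```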

  24. Local and SQ Learnability. Let $\mathcal{P}$ be some set of distributions over $X$. $(C, \mathcal{P})$ is (non-interactively) locally learnable using $H$ if: $(C, \mathcal{P})$ is PAC learnable using $H$, and for all $\varepsilon > 0$ there's an $\varepsilon$-local (non-interactive) PAC learner that makes $\mathrm{poly}(d, \mathrm{size}(c), 1/\alpha, 1/\varepsilon, \log(1/\beta))$ queries. $(C, \mathcal{P})$ is (non-adaptively) SQ learnable using $H$ if: $(C, \mathcal{P})$ is PAC learnable using $H$, and there's a (non-adaptive) SQ PAC learner that makes $\mathrm{poly}(d, \mathrm{size}(c), 1/\alpha, \log(1/\beta))$ queries $(g_i, \tau_i)$, where $g_i : Z \to [-1, 1]$ and $\frac{1}{\tau_i} \le \mathrm{poly}(d, \mathrm{size}(c), 1/\alpha, \log(1/\beta))$.

  25. LI*, SQ*. LI* $= \{(C, \mathcal{P}) : (C, \mathcal{P})$ is (interactively) locally learnable$\}$. SQ* $= \{(C, \mathcal{P}) : (C, \mathcal{P})$ is (adaptively) SQ learnable$\}$. LNI* $= \{(C, \mathcal{P}) : (C, \mathcal{P})$ is non-interactively locally learnable$\}$. NASQ* $= \{(C, \mathcal{P}) : (C, \mathcal{P})$ is non-adaptively SQ learnable$\}$. (In the hierarchy: PAC* = PPAC* $\supseteq$ LI* = SQ* $\supseteq$ LNI* = NASQ*.)

  26. Proving SQ* $\subseteq$ LI*. Let $A_{SQ}$ be an SQ PAC learner of $(C, \mathcal{P})$ using $H$, i.e. for every $c \in C$, $\mathcal{D} \in \mathcal{P}$, and $\alpha, \beta \in (0, 1/2)$: making $t = \mathrm{poly}(d, \mathrm{size}(c), 1/\alpha, \log(1/\beta))$ queries $(g_i, \tau_i)$ (with $g_i, \tau_i$ as allowed), we have $\Pr[\mathrm{err}(A_{SQ}) > \alpha] \le \beta$. Let $A_L$ be the simulator of $A_{SQ}$ that we defined, with $n = O\!\left(\frac{t \log(t/\beta)}{\tau^2 \varepsilon^2}\right)$; $A_L$ outputs a valid output of $A_{SQ}$ w.p. $1 - \beta$. Then $\Pr[\mathrm{err}(A_L(D)) > \alpha] \le \Pr[\mathrm{err}(A_{SQ}) > \alpha] + \Pr[A_L$ returns an invalid output$] \le 2\beta$.

  27. Implications. Learning with noise (SQ*) is weaker than learning with privacy (PPAC*). Much of the attention that local models received was for the benefits of non-interactive protocols. Sadly, the article proves that LNI* $\subsetneq$ LI*.
