Understanding Differential Privacy: Lecture 8 Overview

coe 426 data privacy n.w
1 / 22
Embed
Share

Explore the concept of differential privacy in Lecture 8, covering topics such as the definition, global sensitivity, Laplace Mechanism, achieving DP, and more. Dive into examples, properties, and mechanisms to safeguard privacy in data processing.

  • Privacy
  • Data
  • Lecture
  • Differential Privacy
  • Laplace Mechanism

Uploaded on | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. COE 426 Data Privacy Lecture 8: Differential Privacy

  2. Outline Review of differential privacy definition Global sensitivity Laplace Mechanism Examples Properties of differential privacy 2 COE426: Lecture 8

  3. Definition of Differential Privacy Randomized sanitization function ? has ?-differential privacy if for all neighboring data sets D and D' and all possible outputs of ?, Pr ? ? = ? e?Pr ? ? = ? Pr ? ? = ? Pr ? ? = ? ?? ? > 0 ??? ? ?????(?) Small ? -> better privacy 3 COE426: Lecture 8

  4. Check Our Understanding of DP How to construct randomized algorithm ?? And what does Pr ? ? = ? mean? Give me an example? For a database of n records, how many possible neighboring databases are there? Are there any restriction on the added/removed records? In Werner's procedure, how did we achieve neighboring databses? What should be the value of ?? What does it mean exactly? Are there any drawback of DP? Why only big companies are using it? Is DP sensitive to the size of the table? 4 COE426: Lecture 8

  5. How to Achieve DP? Randomized Response Laplace Mechanism Exponential Mechanism Report Noisy Max 5 COE426: Lecture 8

  6. Global Sensitivity Recall No-Free-Lunch Theorem Sensitivity is a measure of how one record affects the possible output of the query Global sensitivity ???= ? = max ???? ?????? ?,? ? ? ? ? 1 is the first-norm ( 1- norm) Example ? ?1= ?=1 |?? ??| where p and q are vectors 1 Where . ? 6 COE426: Lecture 8

  7. Global Sensitivity: Example ?: # individuals with salary > $30K? ???= ? = 1 ID 1 2 3 4 5 5 6 ID 1 2 3 4 Salary 45K 20K 15K 50K Salary 45K 20K 15K 50K 60K 60K 75K ? : Frequency histogram of sensitive attribute Global Sensitivity of f = 1 4 4 3 3 2 2 1 1 0 0 7 COE426: Lecture 8

  8. Laplace Mechanism Laplace distribution ???(?;?) has density 1 2?? |? ?| ? ? : mean 2?2: variance When ?=0, ???(?) ? |?| ? Theorem: A mechanism ? that adds a random noise drawn from Laplace distribution with variance ? =??? (i.?.? ? = ? ? + ??? ? ) is ?-differentially private. ? 8 COE426: Lecture 8

  9. Laplace Mechanism Example # individuals with salary > $30K? Query f x1 xn ??? ? Database D Analyst ?(?) + ??? Recall that count queries has ???= 1 1 ? ?(?) + ??? Satisfies ?-differential privacy Small ? -> more noise -> bad accuracy 9 COE426: Lecture 8

  10. Laplace Mechanism: Sketch of Proof Recall, randomized function ? is ?-differentially private if Pr ? ? = ? e?Pr ?(? ) = ? Or Pr ? ? =? Pr ? ? =? e? For any possible output ?, the probability density is proportional to the probability density of the added noise ? ? , because ? ? = ? ? + ??? ? = ? |? ? ?| ? ? ? ? ??? ? ? ? ? ? Then Pr ? ? +???(?)=? Pr ? ? +???(?)=? ? ? ? ? ? ? ? ? ? Recall ? =??? ? ??? ? = ?? ? 10 COE426: Lecture 8

  11. Example: COUNT query Number of people having disease True answer = 3 Sensitivity = 1 D Disease (Y/N) Y Y N Y N N Answer: 3 + ?, where ? is drawn from ???(1/?) Mean = 0 Variance = 2/?2 11 COE426: Lecture 8 36

  12. Example: Sum (Average) query Sum of Age (suppose Age is in [a,b]) Sensitivity = ? Answer: 36 + ?, where ? is drawn from ???(?/?) Mean = 0 Variance = 2?2/?2 12 COE426: Lecture 8

  13. Properties of Differential Privacy Differential privacy has three properties that makes it very practical 1. Post-Processing invariance 2. Composition Sequential composition Parallel composition 3. Group privacy 1. 2. 13 COE426: Lecture 8

  14. Post-processing If ? is an ?-differentially private algorithm that accesses a private database D, then ? ? ? , where ? is any arbitrary function, is also satisfies ?- differential privacy In other words, differential privacy is robust against further process of a previous database output # individuals with salary > $30K? Query f x1 xn =? ?2 ? ? ? Database D Analyst 14 COE426: Lecture 8

  15. Composability Composability is the ability to join the output of two (or more) differentially privacy mechanisms There are two compositions Sequential composition Parallel composition 1. 2. Sequential composition Parallel composition 15 COE426: Lecture 8

  16. Composition guarantees for DP Arbitrary composition (sequential and/or parallel) of k differentially private algorithms is still differentially private 0.1 0.1 0.2 0.2 0.2 0.2 0.3 0.3 Sequential composition ??? differential privacy 0.8 Parallel composition max( i) differential privacy 0.3 16 COE426: Lecture 8

  17. Proof of Sequential Composability Claim: The algorithm A(D) = (A1(D), A2(D)) is ( 1+ 2)-differentially private. A1(D) 1-DP A2(D) 2-DP Proof (for discrete probability distributions): Fix neighboring D, D' and any two outputs r1 in the range of A1, and r2 in the range of A2. 17 COE426: Lecture 8

  18. Why Composition? Reasoning about privacy of a complex algorithm is hard Helps software design process If building blocks are proven to be private, it would be easy to reason about privacy of a complex algorithm built entirely using these building blocks 18 COE426: Lecture 8

  19. Differentially Private Histogram Histogram queries are an example of the parallel composition The addition or removal of a row can only change the count in a bucket by 1 This means we need only perturb the count in each bucket according to ???( Answer k queries by adding Lap(1/?) to each answer; Gives ? -DP over all ? ?) = ???(1 ?) Given queries q1,...qk: 1. Compute qi(D), i=1...k 2. Output qi(D) + Lap(1/?) 19 COE426: Lecture 8

  20. Differential privacy: an example Originalrecords 20 Perturbed histogram with differentialprivacy Originalhistogram COE426: Lecture 8

  21. Group Privacy Differential privacy is designed to protect the privacy of a single individual By considering neighboring databases differing in exactly one record Differential privacy can also protect the privacy neighboring databases differing in ? records Pr ? ? = ? Pr ? ? = ? ??? How to achieve ?-differential privacy for a group of ? records? 21 COE426: Lecture 8

  22. Next Attraction Exponential Mechanism Applying differential privacy Practicing Differential privacy with Python 22 COE426: Lecture 8

Related


More Related Content