Exploiting Unintended Feature Leakage in Collaborative Learning
Collaborative learning exposes a new attack surface: the model updates exchanged during training can inadvertently leak sensitive information about participants' training data, such as users' identities. This leakage occurs because models learn internal features that go beyond the learning task, revealing underlying data properties. By exploiting this unintended feature leakage, an adversary can infer details, such as gender, that are uncorrelated with the learning task itself. This research examines the implications and challenges posed by such leakage in collaborative learning scenarios.
Presentation Transcript
Exploiting Unintended Feature Leakage in Collaborative Learning
Luca Melis (UCL), Congzheng Song (Cornell), Emiliano De Cristofaro (UCL), Vitaly Shmatikov (Cornell Tech)
Overview
- Collaborative learning presents a new attack surface: model updates leak information about the training data.
- Example: a model trained to predict gender leaks users' identities, a property not correlated with the learning task.
Deep Learning Background
- Map an input x through layers of features h to an output y (e.g., y = Female), connected by parameters W.
- Learn the parameters that minimize a loss: L = loss(f(x; W), y).
- Gradient descent on parameters: in each iteration, train on a batch and update W ← W − η · ∂L/∂W.
- Key point: gradients reveal information about the data.
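The gradient-descent step above can be sketched as follows. This is a minimal illustration on synthetic data, assuming a simple logistic-regression model (not the slide's actual network): the update is a function of the batch, which is exactly why observing it reveals information about the data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))          # one batch: 8 examples, 5 features
y = rng.integers(0, 2, size=8)       # binary labels (e.g., Female / Male)
W = np.zeros(5)
eta = 0.1                            # learning rate

pred = sigmoid(X @ W)                # forward pass
grad = X.T @ (pred - y) / len(y)     # dL/dW for the cross-entropy loss
W_new = W - eta * grad               # W <- W - eta * dL/dW

# The gradient depends on X and y: anyone who observes (W_new - W)
# learns something about the batch that produced it.
```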
Distributed / Federated Learning
- Multiple clients want to collaboratively train a model via a central server (deployed examples: Gboard, TensorFlow Federated).
- In each iteration, every client downloads the global model from the server, trains on a local batch, and shares its model update.
- Clients share updates rather than data... but how much privacy is actually gained?
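One training round can be sketched as below. This is a hedged toy sketch: the slides do not name the aggregation rule, so FedAvg-style averaging of locally updated models is an assumption, and the linear model and synthetic data are illustrative only.

```python
import numpy as np

def local_update(global_w, X, y, eta=0.1):
    """One local SGD step on a linear model with squared loss."""
    grad = X.T @ (X @ global_w - y) / len(y)
    return global_w - eta * grad

rng = np.random.default_rng(1)
global_w = np.zeros(3)
# Five clients, each holding a private batch (never sent to the server).
clients = [(rng.normal(size=(4, 3)), rng.normal(size=4)) for _ in range(5)]

# One round: each client downloads the global model, trains locally,
# and shares only its updated parameters; the server averages them.
local_ws = [local_update(global_w, X, y) for X, y in clients]
global_w = np.mean(local_ws, axis=0)
```

Only the updated parameters cross the network, which is the privacy argument the next slides put to the test.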
Threat Model
- A malicious participant, or the server itself, has WHITE-BOX access to the model updates.
- The update observed between iteration t and iteration t + 1 is a gradient computed on a batch of the victim's data.
- Question: what information do these updates leak?
Leakage from Model Updates
- How to infer properties from observed updates? Model updates come from gradient descent, ΔW = −η · ∂L/∂W, and for a layer the gradient ∂L/∂W = (∂L/∂h) · xᵀ, so gradient updates reveal x.
- The features of x learned to predict y also leak properties of x which are UNCORRELATED with y (e.g., gender and facial IDs).
- If the adversary has examples of data with these properties, it can use supervised learning to infer them.
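The per-layer identity above is easy to verify numerically. The sketch below (synthetic numbers, a single fully connected layer h = W @ x) shows that the gradient of any loss with respect to W is the outer product (∂L/∂h) xᵀ, so every nonzero row of the observed gradient is a scaled copy of the input x.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=4)            # a single training input
W = rng.normal(size=(3, 4))       # weights of a fully connected layer
dL_dh = rng.normal(size=3)        # upstream gradient from deeper layers

dL_dW = np.outer(dL_dh, x)        # dL/dW = (dL/dh) x^T

# Each row of dL_dW is proportional to x: dividing row i by dL_dh[i]
# recovers the input exactly.
recovered = dL_dW[0] / dL_dh[0]
print(np.allclose(recovered, x))  # True
```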
Property Inference Attacks
- Inferring properties from observations is itself a learning problem.
- The adversary computes updates on auxiliary data labeled with and without the property, collecting labeled updates across iterations t, t + 1, ..., t + n.
- Train: a property classifier on the labeled updates.
- Infer: apply the classifier to the updates observed from the server.
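The pipeline can be sketched end to end. Everything here is an illustrative assumption: the updates are synthetic vectors with an artificial property signal, and a nearest-centroid rule stands in for the paper's actual property classifier.

```python
import numpy as np

rng = np.random.default_rng(3)

def fake_update(has_property):
    # Stand-in for a gradient computed on a batch; batches with the
    # property shift the update's mean (a synthetic, illustrative signal).
    shift = 0.5 if has_property else -0.5
    return rng.normal(loc=shift, size=10)

# Adversary's labeled training set of observed updates.
updates = np.array([fake_update(b) for b in [True] * 50 + [False] * 50])
labels = np.array([1] * 50 + [0] * 50)

# Train: a nearest-centroid property classifier on the labeled updates.
centroids = {c: updates[labels == c].mean(axis=0) for c in (0, 1)}

def infer_property(update):
    """Infer the property label of a freshly observed update."""
    return min(centroids, key=lambda c: np.linalg.norm(update - centroids[c]))
```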
Infer Property (Two-Party Experiment)
- Labeled Faces in the Wild: the participant trains on facial images with certain attributes.
- For each configuration of target label and inferred property, the attack achieves high AUC even though the main task and the property are not correlated.
Infer Occurrence (Two-Party Experiment)
- FaceScrub: target = gender, property = facial IDs.
- The participant trains on faces of different people in different iterations.
- From the probability of predicting each facial ID, the attacker infers when images of a certain person appear and disappear in the training data.
Active Attack Works Even Better
- FaceScrub: target = gender, property = facial IDs.
- The adversary uses multi-task learning to craft a local model that predicts both the task label and the property.
- By sending these crafted updates, the adversary actively biases the global model to leak the property, strengthening the attack.
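The active attacker's multi-task objective can be sketched as a joint loss L = L_task + α · L_prop minimized over a shared model, so the resulting updates encode the property as well as the task. The linear model, squared losses, α, and learning rate below are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(16, 5))
y_task = rng.normal(size=16)       # main-task labels (e.g., gender score)
y_prop = rng.normal(size=16)       # property labels (e.g., facial-ID score)

w = np.zeros(5)
alpha, eta = 1.0, 0.05

def joint_loss(w):
    l_task = np.mean((X @ w - y_task) ** 2)
    l_prop = np.mean((X @ w - y_prop) ** 2)
    return l_task + alpha * l_prop

loss_before = joint_loss(w)
for _ in range(100):
    g_task = X.T @ (X @ w - y_task) / len(y_task)
    g_prop = X.T @ (X @ w - y_prop) / len(y_prop)
    w -= eta * (g_task + alpha * g_prop)   # joint gradient step
loss_after = joint_loss(w)

# The crafted update now trades off the main task against leaking
# the property, pulling the shared model toward both objectives.
```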
Multi-Party Experiments
- The adversary now observes only aggregated updates.
- Inference is still accurate for some properties. Yelp Review: target = review score, property = authorship.
- Inference of unique properties is not affected by aggregation: for Yelp, unique word combinations identify authorship.
- Performance drops as the number of participants increases.
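Why performance drops with more participants can be sketched with a simple dilution argument: the adversary sees only the average of all n updates, so one participant's contribution is scaled by 1/n. All quantities below are synthetic and illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
signal = np.ones(10)                       # the target participant's update

def aggregated_update(n):
    """Average of the target's update and n-1 other participants' updates."""
    others = rng.normal(size=(n - 1, 10))  # other updates, modeled as noise
    return (signal + others.sum(axis=0)) / n

# The target's share of the aggregate shrinks as 1/n.
shares = {n: np.linalg.norm(signal) / n for n in (2, 10, 50)}
```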
Visualize Leakage in Feature Space
- Main task vs. inferred property, visualized at the Pool1, Pool2, and final layers.
- Blue points = Black, red points = not Black (inferred property); circle points = female, triangle points = male (main task).
Takeaways
- Models unintentionally learn all sorts of features beyond the training objective.
- Collaborative training reveals these features through white-box model updates.
- As a result, model updates leak information about the training data.
Thank you! Code available at: https://github.com/csong27/property-inference-collaborative-ml