Eunomia: Real-time Privacy Compliance Firewall for Alexa Skills Research

Explore the challenges and research efforts in ensuring privacy compliance in smart homes, focusing on the Eunomia firewall for Alexa skills. Learn about current issues with privacy policies and the gap in compliance detection, skill verification, and malicious skill prevention.

  • Privacy
  • Compliance
  • Alexa
  • Research
  • Smart Homes

Presentation Transcript


  1. Eunomia: A Real-time Privacy Compliance Firewall for Alexa Skills
     Javaria Ahmad, Fengjun Li, Razvan Beuran, and Bo Luo
     • CISA, University of Central Missouri, Warrensburg, MO, USA. Email: ahmad@ucmo.edu
     • EECS and I2S, The University of Kansas, Lawrence, KS, USA. Email: fli@ku.edu; bluo@ku.edu
     • Japan Advanced Institute of Science and Technology, Nomi, Ishikawa, Japan.

  2. Smart Homes: IoT Devices and Control
     [Figure panels: "Control via Alexa Skills" and "IoT Devices in Smart Homes"]
     Image sources: K. Adeyeye, et al. Integrating Photovoltaic Technologies in Smart Homes. In 2018 International Conference on Advances in Big Data, Computing and Data Communication Systems (icABCD), IEEE (2018); https://hdhtech.com; https://iotdesignpro.com

  3. Outline
     1 Background and Motivation
       i. Current Challenges
       ii. Research Efforts
       iii. Need for Investigating the Privacy Compliance in Smart Homes

  4. Background and Motivation: Current Challenges
     Privacy policies: a standard practice to notify users about the data collection, management, and/or sharing operations.
     Main challenges:
     • Policies are difficult to comprehend
     • Policies are incomplete or inconsistent with the actual practices
       • No disclosure
       • Inaccurate disclosure

  5. Background and Motivation: Research Efforts
     SOTA research efforts can be roughly categorized into the following directions for smart homes (source code accessible vs. no code):
     1. Privacy policy comprehension (1): read the policies through automation [1]
     2. Privacy compliance gap detection (gap between (1) and (2)): gap between privacy policies and data practices for websites and mobile apps [2, 3]
     3. Skill compliance check (gap between (1) and (3)): skill policy content against skill descriptions and skill actions using the interaction model [4, 5, 6]
     4. Malicious skills and certification: malicious skills and weak Amazon Alexa certification process, lack of re-verification, skill squatting attacks [7, 8]
     References:
     [1] B. Andow, et al. PolicyLint: Investigating Internal Privacy Policy Contradictions on Google Play. In USENIX Security Symposium (USENIX), August 2019.
     [2] S. Zimmeck, et al. MAPS: Scaling privacy compliance analysis to a million apps. Proc. Priv. Enhancing Tech. (PETS), 2019.
     [3] B. Andow, et al. Actions speak louder than words: Entity-sensitive privacy policy and data flow analysis with PoliCheck. In USENIX Security Symposium (USENIX), 2020.
     [4] Z. Guo, et al. SkillExplorer: understanding the behavior of skills in large scale. In 29th USENIX Security Symposium (USENIX), 2020.
     [5] F. Xie, et al. Scrutinizing privacy policy compliance ... In 37th IEEE/ACM International Conference on Automated Software Engineering (IEEE/ACM), 2022.
     [6] J. Young, et al. SkillDetective: automated policy-violation detection of voice assistant applications ... In USENIX Security Symposium (USENIX), 2022.
     [7] D. Kumar, et al. Skill squatting attacks on Amazon Alexa. In USENIX Security Symposium (USENIX), 2018.
     [8] L. Cheng, et al. Dangerous skills got certified. In ACM SIGSAC Conference on Computer and Communications Security (CCS), 2020.

  6. Background and Motivation: Need for Real-time Compliance Study
     Amazon Alexa voice assistant (VA) apps, called Skills, enhance VA capabilities but pose concerns:
     • Sensitive data may be transmitted to third parties
     • Amazon enforces policy requirements leniently
     • Users may not be able to review policies
     • Policies may be incomplete or inconsistent
     • Skills run outside the VA hardware
     [Figures: an example user and skill interaction; the privacy policy of the skill]
     Existing research validates privacy compliance by interacting with the skills to trigger all their behaviors:
     • Time-consuming, relies on coverage, less effective for logic bombs/time bombs
     There is a need for real-time monitoring:
     • Only adequate solution for effective compliance checks
     • A real-time firewall defends against all non-compliant data collection practices

  7. Outline
     2 Proposed Solution for Smart Homes Privacy Compliance without Source Code: Eunomia
       i. Research Questions
       ii. Architecture and Components
       iii. Prototype, Implementation Details and Design Choices
       iv. Evaluation and Analysis
       v. Case Studies
       vi. Discussions and Contributions

  8. Eunomia: Research Questions
     We develop Eunomia, a real-time monitor for skill actions and policy consistency analysis. Eunomia aims to answer the following questions:
     1. How effective is Eunomia in providing defense to the users?
     2. What is the overall compliance status of Alexa skills?
     3. Which particular compliance gaps are there in Alexa skills?
     4. Which types of compliance issues are more common than others?

  9. Eunomia: Policy Analysis
     Eunomia's architecture has the following components:
     1. Policy Analysis

  10. Eunomia: Ontology Definition
      Ontology definition and domain adaptation:
      • Identified ontologies, or is-a relationships
      • Annotated 150 policies with data objects and is-a relationships
      • Trained a Tok2Vec relation classifier to predict relationships
      • Fine-tuned a BERT model on 2000 policies
      • Applied BERT and Tok2Vec to extract objects/relationships
      • From the ontologies, created a data ontology graph
      • Identified synonyms for data objects
      [Figures: examples of data ontologies; a part of the data ontology graph]
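
The data ontology graph described in this slide can be pictured as a directed graph of is-a edges plus a synonym table. The Python sketch below is only an illustration under stated assumptions: the edges, the synonyms, and the names `IS_A`, `SYNONYMS`, `canonical`, and `is_a` are hypothetical and are not taken from Eunomia's actual ontology or code.

```python
from collections import deque

# Hypothetical is-a edges (child -> parents); not Eunomia's real ontology.
IS_A = {
    "zip code": {"address"},
    "address": {"location"},
    "location": {"personal information"},
    "email address": {"contact information"},
    "contact information": {"personal information"},
}

# Hypothetical synonym table mapping surface forms to canonical data objects.
SYNONYMS = {"postal code": "zip code", "e-mail": "email address"}


def canonical(term: str) -> str:
    """Normalize a term and resolve it to its canonical data object."""
    term = term.strip().lower()
    return SYNONYMS.get(term, term)


def is_a(child: str, ancestor: str) -> bool:
    """Return True if `child` equals `ancestor` or is subsumed by it."""
    child, ancestor = canonical(child), canonical(ancestor)
    seen, queue = {child}, deque([child])
    while queue:  # breadth-first walk up the is-a edges
        node = queue.popleft()
        if node == ancestor:
            return True
        for parent in IS_A.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return False
```

With such a graph, a question such as "is a postal code covered by a policy that mentions location?" becomes a reachability query: `is_a("Postal Code", "location")`.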

  11. Eunomia: Tuning a SOTA BERT Model for Domain Adaptation
      Fine-tuned a pre-trained transformer-based BERT model for policy analysis and compliance validation.
      Improved on PoliCheck/PolicyLint for policy analysis:
      • State-of-the-art NLP model: used BERT instead of CNN
      • Domain-specific ontology: used a domain-specific corpus
      The final model has 87.94% precision and 88.09% recall in identifying data objects.

  12. Eunomia: Policy Analysis
      The policy analysis module extracts the privacy policies from the Amazon skills store, performs NLP-based policy analysis, and identifies declared practices.
      Policy analysis steps:
      1. Tuning a SOTA BERT model for domain adaptation
      2. Creating dependency-based parse trees
      3. Recognizing negative practices
      4. Representing practices in tuples
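
To make steps 3 and 4 concrete, here is a minimal Python sketch of turning one policy sentence into practice tuples. It is a deliberate simplification: a keyword regex stands in for the dependency-parse-based negation detection, and a tiny fixed vocabulary stands in for the BERT-based data object extraction. The tuple layout and all names (`PracticeTuple`, `DATA_OBJECTS`, `to_tuples`) are assumptions for illustration, not Eunomia's actual representation.

```python
import re
from typing import List, NamedTuple


class PracticeTuple(NamedTuple):
    """Hypothetical layout for one declared practice."""
    entity: str       # who acts, e.g. "we"
    action: str       # "collect" or "not_collect"
    data_object: str  # e.g. "location"


# Naive stand-in for parse-tree-based negation detection.
NEGATION = re.compile(r"\b(do not|does not|don't|never|will not|won't)\b")

# Hypothetical vocabulary; the slides use a fine-tuned BERT model instead.
DATA_OBJECTS = ("email address", "location", "name")


def to_tuples(sentence: str, entity: str = "we") -> List[PracticeTuple]:
    """Return one tuple per known data object mentioned in the sentence."""
    text = sentence.lower()
    action = "not_collect" if NEGATION.search(text) else "collect"
    return [
        PracticeTuple(entity, action, obj)
        for obj in DATA_OBJECTS
        if re.search(rf"\b{re.escape(obj)}\b", text)
    ]
```

For example, `to_tuples("We do not collect your location.")` yields a single `not_collect` tuple for `location`. A sentence-level negation flag like this one misfires on mixed sentences, which is one reason a parse tree is the more robust choice.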

  13. Eunomia: Privacy Practice Monitor
      Eunomia's architecture has the following components:
      1. Policy Analysis
      2. Privacy Practice Monitor

  14. Eunomia: Privacy Practice Monitor
      Examine a skill's attempts to collect user information:
      • Capture skill outputs (text/audio) from the VA device
      • Extract privacy practices from skill actions
        • Extract practices by detecting data objects in skill responses
        • Use the fine-tuned BERT model and keyword-based term matching
        • Check the context of the output to identify data collection attempts

  15. Eunomia: Compliance Validation
      Eunomia's architecture has the following components:
      1. Policy Analysis
      2. Privacy Practice Monitor
      3. Compliance Validation

  16. Eunomia: Compliance Validation
      Validate the captured privacy practice against the declared practice.
      Compare the skill practice with each privacy policy tuple to identify one of four disclosure types.
      Consistent disclosure:
      • Compliant and Clear (Clear): practice exactly matches the tuple
      • Compliant but Unclear (Unclear): practice is a subset of the data object of the policy tuple
      Inconsistent disclosure:
      • Undisclosed and Non-compliant (Undisclosed): practice is irrelevant to the policy tuple
      • Inaccurate and Non-compliant (Inaccurate): practice matches the tuple but the policy is negated
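
The four disclosure types can be expressed as a small decision procedure. The sketch below is one possible reading of the slide, not the authors' implementation: the `PolicyTuple` layout, the `PARENT` is-a table, and the rule that a matching negated tuple takes precedence over any positive match are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, NamedTuple


class PolicyTuple(NamedTuple):
    """Hypothetical declared practice extracted from a privacy policy."""
    data_object: str
    negated: bool  # True if the policy says this data is NOT collected


# Hypothetical is-a edges (child -> parent); a real ontology is far larger.
PARENT = {"zip code": "address", "address": "location"}


def ancestors(obj: str) -> Iterator[str]:
    """Yield every broader data object that `obj` is a kind of."""
    while obj in PARENT:
        obj = PARENT[obj]
        yield obj


def classify(practice: str, policy: Iterable[PolicyTuple]) -> str:
    """Assign one of the four disclosure types to a captured practice."""
    broader = set(ancestors(practice))
    label = "Undisclosed"  # default: no tuple is relevant to the practice
    for tup in policy:
        exact = tup.data_object == practice
        if not (exact or tup.data_object in broader):
            continue  # tuple is irrelevant to this practice
        if tup.negated:
            return "Inaccurate"  # assumed precedence: denial beats disclosure
        if exact:
            label = "Clear"
        elif label != "Clear":
            label = "Unclear"
    return label
```

Under these assumptions, an empty policy (for example, a skill with no policy link) classifies every captured practice as `Undisclosed`, while a policy that denies collecting `location` makes a request for an `address` `Inaccurate`.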

  17. Prototype and Implementation Details
      Build a prototype for communication between users and skills while Eunomia watches:
      • Simulated with Amazon's Alexa developer console
      • Loads skills from AWS/other cloud servers
      • Invokes Alexa Voice Service (AVS) for user/skill communication
      • Eunomia runs on the same physical device as the VA simulator: embedded
      • Deploy the prototype on 4 Windows machines
      • Evaluate all skills on Amazon's skill store
      • Experiments ran for approx. 4 months
      • Simulate user activities with a chatbot

  18. Prototype and Implementation Details
      Intercepting the skill-user communication:
      • Selenium WebDriver to interact with the VA simulator (get method)
      • Capture the skill's response displayed on the simulator (getText())
      Compliance validation; the three main methods used are:
      • readPolicy(): loads and parses policies, extracts declared privacy practices
      • readOutput(): parses skill outputs obtained from WebDriver, uses BERT
      • complianceCheck(): validates consistency between the skill's practices and declarations. For non-compliance, invokes WebDriver to terminate the skill

  19. Prototype and Implementation Details
      Efficiently loading privacy policies:
      • For a previously cached skill, Eunomia first uses the cached policy
      • The current policy is also fetched
      • Eunomia determines whether the policy has changed by comparing hashes
      Reliably terminating non-compliant communication:
      • sendKeys() method of WebDriver to send stop/exit commands
      • Time from a non-compliant question being asked to successful termination of the skill: 1.14 sec on average
      • Eunomia also disables the element that encloses the user's text and audio inputs
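
The hash comparison used to detect policy changes can be sketched in a few lines of Python. The slide only says that hashes are compared; the choice of SHA-256, the in-memory dictionary, and the names `policy_digest` and `PolicyCache` are assumptions for illustration, and the sketch covers only the change check, not how the cached policy is served while the current one is being fetched.

```python
import hashlib


def policy_digest(policy_text: str) -> str:
    """Hash the policy text (SHA-256 is an assumed choice of hash)."""
    return hashlib.sha256(policy_text.encode("utf-8")).hexdigest()


class PolicyCache:
    """Hypothetical in-memory cache: skill id -> (digest, parsed practices)."""

    def __init__(self, parse):
        self._parse = parse  # callable that extracts declared practices
        self._entries = {}

    def load(self, skill_id: str, current_text: str):
        """Return (practices, reparsed) for the skill's current policy text."""
        digest = policy_digest(current_text)
        cached = self._entries.get(skill_id)
        if cached is not None and cached[0] == digest:
            return cached[1], False  # unchanged: reuse the cached analysis
        practices = self._parse(current_text)  # new or changed: re-analyze
        self._entries[skill_id] = (digest, practices)
        return practices, True
```

The point of the design is that the expensive NLP-based policy analysis runs only when the digest differs, so an unchanged policy costs one hash computation per invocation.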

  20. Design Choices
      Deployment of Eunomia: stand-alone model (I)
      • Eunomia sits externally on a device (Raspberry Pi/PC) placed near a VA device
      • Eunomia captures the VA's output via the PC's mic and converts audio to text
      • On non-compliance, Eunomia generates stop/exit text and converts it to speech
      Eunomia inside the VA device: embedded model (II)
      • Eunomia has direct access to skill/user communication
      • Requires support from Amazon and/or the device
      • Amazon has now commercialized the AVS Device SDK

  21. Evaluation and Analysis: Privacy Policy Landscape
      [Figure: privacy policy landscape]

  22. Evaluation and Analysis: Performance Evaluation
      Performance evaluation:
      • Eunomia's reaction time is between 0.5 and 2 seconds, with an average of 1.14 seconds
      • Precision: defined as correctly identified skills/data collection practices out of all skills/data collection practices
      • Eunomia identifies clear and undisclosed skills and practices with 100% precision, respectively
      Table legend: X (Y): number of skills (number of data collection practices); Manual Validation #: randomly selected skills and data collection practices for manual validation; Correctly Verified #: number of skills and data collection practices correctly verified through manual validation.

  23. Evaluation and Analysis: Performance Evaluation
      Performance evaluation:
      • Recall: defined as the proportion of correctly detected information collection actions out of all private information collections
      • SkillExplorer assembled a list of skills that collect private information, while SkillDetective adopted the list; both kindly shared their lists with us
      • Eunomia correctly detects 36 out of 37 private information collection actions in 26 out of the 27 skills (recall=96.3%)
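
The two ratios quoted on this slide give slightly different percentages. The short Python check below uses only the slide's own numbers; the helper name `ratio_percent` is illustrative. It suggests that the reported 96.3% corresponds to the skill-level ratio (26 of 27) rather than the action-level ratio (36 of 37).

```python
def ratio_percent(hit: int, total: int) -> float:
    """Return hit/total as a percentage rounded to one decimal place."""
    return round(100 * hit / total, 1)


action_level = ratio_percent(36, 37)  # collection actions detected (slide numbers)
skill_level = ratio_percent(26, 27)   # skills counted on the slide

print(action_level)  # 97.3
print(skill_level)   # 96.3
```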

  24. Evaluation and Analysis: Compliance Landscape
      Compliance landscape:
      • Only 25.9% of practices are correctly disclosed
      • 7.2% of data collection practices are compliant but unclear
      • 64.3% of practices are completely undisclosed
      • The 903 undisclosed collection practices belong to 622 skills; only 129 of these 622 skills have a privacy policy link

  25. Case Studies
      Boston Bike
      • The skill asks users for their address to find the bike stations close to them
      • Inaccurate disclosure (non-compliant)
      • The privacy policy states that the skill does not collect any data
      Body Mass Index
      • Asks users for health-related information to calculate the body mass index
      • Omitted disclosure (non-compliant)
      • The missing privacy policy makes the skill non-compliant

  26. Discussions: the Research Questions
      1. How effective is the Eunomia firewall in protecting the users?
         • Eunomia provides defense in real time: stops non-compliance in 1.14 sec
         • Eunomia is very effective in identifying private data actions: 96.3% recall
         • Accurately identifies disclosure types with high precision per category
      2. What is the overall compliance status of the Alexa skills?
         • The majority of skill practices are non-compliant: 940/1405
      3. Which particular compliance gaps are there in Alexa skills?
         • Undisclosed practices mainly result from missing policies
         • Inaccurate disclosures are rare: 37/1405
      4. Which types of compliance issues are more common than others?
         • Location-related and fitness activity violations occur most frequently
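
The counts quoted in the discussion and in the compliance landscape slide can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the transcript; the constant names and the `share` helper are illustrative.

```python
TOTAL = 1405         # data collection practices (slide 26)
NON_COMPLIANT = 940  # non-compliant practices (slide 26)
INACCURATE = 37      # inaccurate disclosures (slide 26)
UNDISCLOSED = 903    # undisclosed practices (slide 24)


def share(count: int, total: int = TOTAL) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Non-compliant practices are exactly the undisclosed plus the inaccurate ones.
assert NON_COMPLIANT == UNDISCLOSED + INACCURATE

print(share(UNDISCLOSED))    # 64.3, matching the 64.3% on slide 24
print(share(INACCURATE))     # 2.6
print(share(NON_COMPLIANT))  # 66.9
```

The 2.6% share of inaccurate disclosures is also what remains after subtracting the three percentages on slide 24 (25.9%, 7.2%, and 64.3%) from 100%.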

  27. Contributions
      Our primary contributions are as follows:
      1. Eunomia is the first real-time privacy-compliance monitor for VAs: works on-the-fly
      2. Uses state-of-the-art (SOTA) NLP algorithms and an automated ontology, identifies fine-grained disclosure types, achieves outstanding performance
      3. Provides the privacy policy landscape and the compliance state of Alexa skills through an automated large-scale assessment

  28. Acknowledgements
      Bo Luo and Fengjun Li were supported in part by NSF IIS-2014552, DGE-1565570, and the Ripple University Blockchain Research Initiative. The authors would like to thank the anonymous reviewers and the shepherd for their valuable comments and suggestions. We would also like to thank the authors of SkillExplorer and SkillDetective for generously sharing their data with us.
