Relevant Technologies and Data Protection
This presentation looks at technology for a system like NSDS from four angles: getting data in, protecting the data, connecting users, and getting results out. It highlights challenges in data provisioning, privacy, and trust, and lists areas that have not yet been pursued, such as metadata, secure query servers, and efficient linkages.
Presentation Transcript
Relevant Technologies
Amy O'Hara
November 20, 2020

Ways to think about technology
  Getting data to/in NSDS
  Protecting the data
  Connecting users to NSDS
  Getting results out of NSDS

View as RDC director
  Modern technology lacking
    Contrast Census vs. NCHS data provisioning
    Census Integrated Research Environment lacks scalable compute
    No virtual access
    Pilot underway, no business model or resource planning
  Roadblocks for non-academic users
    Varying definitions and requirements for authorized users
  No investment to increase trust of data providers
    Need process automation, clear and repeatable process for data sharing, transparency about access and products
  No investment to increase public trust
    Need transparency about uses, results, and benefits
  A misplaced focus on a Census Act-driven view of privacy
    Differential privacy, impact of noise

What RDC hasn't pursued/provided yet
  Metadata
  Synthetic data
  Secure query servers
  Double encryption
  Alternative authentication models
  Virtual enclaves
  Efficient, affordable linkages
  Federation with other data systems
  Smart contracts
  Small steps: Project metadata for active and completed Census projects

Getting data to/in NSDS
  Accessing data
    Directly (cleartext, in the clear) vs. encrypted
    Integrated vs. federated
    Required vs. voluntary, consented vs. not
  Terms and conditions of use
    Agreements (MOUs, DUAs, licenses, contracts), NDAs
    Automating the production, maintenance, and monitoring
  Transferring data
    Data with identifiers vs. other fields
    Protecting data in transit and at rest
    Multiparty computation
  Data preparation
    Linkage, matching, entity resolution
    De-identification to produce research files
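
The split between "data with identifiers" and "other fields", and the step from linkage to de-identified research files, can be illustrated with a small sketch. The slide names no method, so everything here is an assumption for illustration: the field names (`name`, `dob`), the helper names, and the choice of a keyed hash (HMAC-SHA256) as the linkage key. It uses exact matching on normalized fields, which is far simpler than the probabilistic entity resolution a real linkage service would likely need.

```python
import hashlib
import hmac

# Hypothetical identifier fields; the slide does not specify a schema.
ID_FIELDS = ("name", "dob")

def linkage_key(secret: bytes, name: str, dob: str) -> str:
    """Keyed hash of normalized identifiers (assumed approach, not the slide's)."""
    normalized = f"{name.strip().lower()}|{dob.strip()}"
    return hmac.new(secret, normalized.encode("utf-8"), hashlib.sha256).hexdigest()

def deidentify(record: dict, secret: bytes) -> dict:
    """Drop direct identifiers and keep only a keyed linkage token."""
    research = {k: v for k, v in record.items() if k not in ID_FIELDS}
    research["link_key"] = linkage_key(secret, record["name"], record["dob"])
    return research

def link(file_a: list, file_b: list) -> list:
    """Join two de-identified files on the linkage token (exact match only)."""
    index = {row["link_key"]: row for row in file_b}
    return [{**row, **index[row["link_key"]]}
            for row in file_a if row["link_key"] in index]

if __name__ == "__main__":
    secret = b"held-by-the-linkage-party-only"  # illustrative key, not a real one
    survey = [deidentify({"name": "Ada Lovelace ", "dob": "1815-12-10", "income": 52000}, secret)]
    admin = [deidentify({"name": "ada lovelace", "dob": "1815-12-10", "benefit": True}, secret)]
    print(link(survey, admin))
```

In this sketch the secret key would be held only by whoever performs the linkage, so researchers receiving the linked file see neither the identifiers nor a hash they could recompute themselves.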

Protecting the data
  Projects
    Proposals
    Review and approval process
    Provisioning
  People
    Vetting, training
    Reinstating/retaining status
    Non-citizens, non-academics, non-employees
  Settings
    Data storage, location
    Barriers
    Surveillance (employee, cameras)
    Encrypted computing

Connecting users to NSDS
  Authentication
    Something you have, you know, you are
    Token, challenge questions, biometrics
  Equipment
    Laptop, SD box
  Restricted data vs. synthetic data
  Any code vs. allowed queries
  Research team vs. inside programmer
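
As one concrete reading of the "something you have" factor, a token can be a device that generates time-based one-time passwords. The slide does not say which token scheme is meant, so the choice of TOTP (RFC 6238, built on HOTP from RFC 4226) is an assumption, and the function names below are illustrative. A minimal sketch using only the standard library:

```python
import hashlib
import hmac
import struct
import time

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def totp(secret: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over a time counter."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)

def verify(secret: bytes, submitted: str, at=None, step: int = 30, window: int = 1) -> bool:
    """Accept codes from adjacent time steps to tolerate small clock drift."""
    now = time.time() if at is None else at
    counter = int(now // step)
    return any(
        hmac.compare_digest(hotp(secret, counter + drift), submitted)
        for drift in range(-window, window + 1)
        if counter + drift >= 0
    )

if __name__ == "__main__":
    shared_secret = b"12345678901234567890"  # RFC test secret, for illustration only
    code = totp(shared_secret)
    print(code, verify(shared_secret, code))
```

A token like this would cover only one of the three factors on the slide; it would normally be combined with something the user knows or is.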

Getting results out of NSDS
  Statistical output
    Traditional disclosure avoidance vs. formal privacy
    Differential privacy
    Single vs. blended sources
    Descriptive vs. modeled statistics
    Summary vs. aggregated data
  Synthetic data
    Validation server
  Microdata
    Trusted third party services
    Cleaned, harmonized, linked
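
To make "differential privacy" and the "impact of noise" concrete, here is a minimal sketch of the Laplace mechanism applied to a single count. The slides do not describe a specific mechanism or privacy budget, so the function names, the epsilon values, and the sensitivity of 1 (one person changes a count by at most 1) are assumptions. Naive floating-point sampling like this has known weaknesses, so it should be read as an illustration rather than something to deploy.

```python
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    """Laplace(0, scale) sample: the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def dp_count(true_count: int, epsilon: float, rng=None) -> float:
    """Release a count with Laplace noise calibrated to sensitivity 1 (assumed)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    sensitivity = 1.0  # adding or removing one person changes the count by 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)

if __name__ == "__main__":
    rng = random.Random(2020)
    for eps in (0.1, 1.0, 10.0):
        # Smaller epsilon means stronger privacy and noisier released counts.
        releases = [round(dp_count(100, eps, rng), 1) for _ in range(5)]
        print(f"epsilon={eps}: {releases}")
```

The trade-off the slide alludes to shows up directly: the noise scale is `1 / epsilon`, so tightening privacy by a factor of ten makes each released count roughly ten times noisier.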

Ripe for focus
  Who can be an authorized user
  What are authorized uses
  Transparency - for data controllers and subjects
  Combined technologies
  Build or buy
  Policy work - clarify what law says vs. interpretation