GCP – Get started with differential privacy and privacy budgeting in BigQuery data clean rooms
We are excited to announce that differential privacy enforcement with privacy budgeting is now available in BigQuery data clean rooms to help organizations prevent data from being re-identified when it is shared.
Differential privacy is an anonymization technique that limits the personal information revealed in a query's output. It is considered one of the strongest privacy protections that exist today because it:
is provably private (the formal guarantee is sketched after this list)
supports multiple differentially private queries on the same dataset
can be applied to many data types
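To make "provably private" concrete, here is the standard formal definition. A randomized mechanism M is (ε, δ)-differentially private if, for any two datasets D and D′ that differ in the records of a single individual, and for any set of possible outputs S:

$$\Pr[M(D) \in S] \le e^{\varepsilon}\,\Pr[M(D') \in S] + \delta$$

In other words, adding or removing any one person's data changes the probability of any query result by at most a factor of e^ε, plus a small slack δ. The epsilon and delta parameters in the BigQuery examples below control exactly this guarantee.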
Differential privacy is used by advertisers, healthcare companies, and education companies to perform analysis without exposing individual records. It is also used by public sector organizations that comply with the General Data Protection Regulation (GDPR), the Health Insurance Portability and Accountability Act (HIPAA), the Family Educational Rights and Privacy Act (FERPA), and the California Consumer Privacy Act (CCPA).
What can I do with differential privacy?
With differential privacy, you can:
protect individual records from re-identification without moving or copying your data
protect against privacy leaks and re-identification
use one of the anonymization standards most favored by regulators
BigQuery customers can use differential privacy to:
share data in BigQuery data clean rooms while preserving privacy
anonymize query results on AWS and Azure data with BigQuery Omni
share anonymized results with Apache Spark stored procedures and Dataform pipelines so they can be consumed by other applications
enhance differential privacy implementations with technology from Google Cloud partners Gretel.ai and Tumult Analytics
call frameworks like PipelineDP.io
So what is BigQuery differential privacy exactly?
BigQuery differential privacy comprises three capabilities:
Differential privacy in GoogleSQL – You can use differential privacy aggregate functions directly in GoogleSQL (see the example after this list)
Differential privacy enforcement in BigQuery data clean rooms – You can apply a differential privacy analysis rule to enforce that all queries on your shared data use differential privacy in GoogleSQL with the parameters that you specify
Parameter-driven privacy budgeting in BigQuery data clean rooms – When you apply a differential privacy analysis rule, you also set a privacy budget to limit the data that is revealed when your shared data is queried. BigQuery uses parameter-driven privacy budgeting to give you more granular control over your data than query thresholds do and to prevent further queries on that data when the budget is exhausted.
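As an illustration of the first capability, here is a minimal sketch of a differentially private query in GoogleSQL. The dataset, table, and column names and the parameter values are made up for the example:

```sql
-- Minimal sketch of a GoogleSQL differentially private aggregation.
-- The table, columns, and parameter values below are illustrative.
SELECT WITH DIFFERENTIAL_PRIVACY
  OPTIONS (
    epsilon = 1.0,                  -- privacy loss allowed for this query
    delta = 1e-5,                   -- probability of exceeding that loss
    privacy_unit_column = user_id   -- the entity whose privacy is protected
  )
  region,
  AVG(purchase_amount, contribution_bounds_per_group => (0, 500)) AS avg_purchase
FROM mydataset.purchases
GROUP BY region;
```

Each differentially private aggregate adds calibrated noise to its result, so repeated runs of the same query return slightly different values.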
BigQuery differential privacy enforcement in action
Here’s how to enable the differential privacy analysis rule and configure a privacy budget when you add data to a BigQuery data clean room.
Subscribers of that clean room must then use differential privacy to query your shared data, and they can no longer query it once the privacy budget is exhausted.
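As a sketch of the subscriber side (the view name, columns, and parameter values here are assumptions, not an exact clean room setup), each query spends part of the epsilon and delta budget that the data owner configured:

```sql
-- Illustrative subscriber query against a shared view protected by a
-- differential privacy analysis rule; names and values are assumptions.
SELECT WITH DIFFERENTIAL_PRIVACY
  OPTIONS (
    epsilon = 0.5,                  -- deducted from the view's epsilon budget
    delta = 1e-7,                   -- deducted from the view's delta budget
    privacy_unit_column = user_id   -- the privacy unit set by the data owner
  )
  campaign,
  COUNT(*) AS reach
FROM exchange_dataset.shared_conversions
GROUP BY campaign;
```

Once the budget is spent, further queries like this one are rejected.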
Get started with BigQuery differential privacy
BigQuery differential privacy is configured when a data owner or contributor shares data in a BigQuery data clean room. A data owner or contributor can share data using any compute pricing model and does not incur compute charges when a subscriber queries that data. Subscribers of a data clean room incur compute charges when querying shared data that is protected with a differential privacy analysis rule. Those subscribers are required to use on-demand pricing (charged per TB) or the Enterprise Plus edition (charged per slot hour).
Create a clean room where all queries are protected with differential privacy, and let us know where you need help.