GCP – Use IAM custom roles to manage access to your BigQuery data warehouse
When migrating a data warehouse to BigQuery, one of the most critical tasks is mapping existing user permissions to equivalent Google Cloud Identity and Access Management (Cloud IAM) permissions and roles. This is especially true when migrating from large enterprise data warehouses like Teradata to BigQuery. Existing Teradata databases commonly contain multiple user-defined roles that combine access permissions and capture common data access patterns. Mapping those Teradata roles to predefined or custom BigQuery IAM roles requires a deeper understanding of your organization’s common data access patterns.
Based on our experiences helping customers migrate to BigQuery, we’ve identified some common data access patterns that our customers define as roles in their Teradata environments. In this post, you’ll learn how to map those common Teradata user-defined roles to BigQuery IAM custom roles. Those roles may be helpful not only to users who migrate from Teradata but also to any data admins who manage data warehouses on BigQuery. Understanding this concept ahead of your migration can help save time and ensure that your users and data are protected throughout the process.
Teradata access rights codes and user-defined roles
In Teradata, access rights codes describe a user’s access privileges on a particular database, table, or column. Certain combinations of access rights codes occur frequently and describe common actions that a user can perform on Teradata objects. For example, one user may only read and modify metadata, another user may read the data, and yet another may read and modify that data.
Here are the common combinations of access rights codes with their corresponding role names and descriptions:
Note that to build views or stored procedures in both Teradata and BigQuery, a user needs access to the objects referenced in those views or procedures, in addition to holding the schema editor or developer role.
Cloud IAM equivalent permissions
Our Cloud IAM controls are used by some of our most security-conscious customers, and you can map many of the concepts you’re used to in Teradata into Google Cloud. You can grant permissions to access BigQuery by granting roles to a user, a group, or a service account. There are three types of roles in Cloud IAM:
- Predefined roles are managed by Google Cloud and meant to support common use cases.
- Custom roles are defined by a user-specified list of permissions. You’ll use them to map Teradata user-defined roles to BigQuery IAM permissions.
- Primitive roles (Owner, Editor, and Viewer) existed prior to the introduction of Cloud IAM.
Below, you’ll see how to map identified Teradata roles to BigQuery Cloud IAM permissions:
Note that none of the above roles grants a user permission to create datasets or to grant permissions to other users. Those actions are best performed by the data warehouse admin, for whom BigQuery provides the predefined Cloud IAM role roles/bigquery.admin.
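If you want to compare the permissions you plan to include in your custom roles against what the admin (or any other predefined) role already contains, you can inspect a predefined role with the gcloud CLI, for example:

```
# List the permissions bundled into the predefined BigQuery admin role.
gcloud iam roles describe roles/bigquery.admin
```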
Create and assign Cloud IAM roles
Your next step is to create the corresponding Cloud IAM custom roles with the privileges listed above. The fastest way to assign multiple permissions to a role is to use the gcloud command-line tool, as described in the documentation.
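As a minimal sketch, creating a project-level custom role for read-only data access could look like the following. The role ID dataReader, the project ID my-project, and the exact permission list are illustrative placeholders; substitute the permissions you mapped from your own Teradata roles.

```
# Create a project-level custom role for read-only access to BigQuery data.
# Role ID, project ID, and the permission list are illustrative placeholders.
gcloud iam roles create dataReader \
  --project=my-project \
  --title="Data Reader" \
  --description="Read-only access to BigQuery table data and metadata" \
  --permissions=bigquery.datasets.get,bigquery.tables.get,bigquery.tables.list,bigquery.tables.getData \
  --stage=GA
```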
In Google Cloud, you can create a custom role on a project or organization level. If you decide to create a role on the organization level, consider adding resourcemanager.projects.get and resourcemanager.projects.list permissions to the schema reader and schema editor roles. Those additional permissions authorize a user to see information about projects in your organization, which fosters openness and transparency in the cloud environment.
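For example, a sketch of an organization-level schema reader role with those two extra permissions might look like this; the role ID, organization ID, and permission list are assumptions for illustration:

```
# Create an organization-level custom role for reading BigQuery metadata only,
# plus the Resource Manager permissions needed to browse projects.
# Role ID, organization ID, and permissions are illustrative placeholders.
gcloud iam roles create schemaReader \
  --organization=123456789012 \
  --title="Schema Reader" \
  --description="Read BigQuery dataset and table metadata" \
  --permissions=bigquery.datasets.get,bigquery.tables.get,bigquery.tables.list,resourcemanager.projects.get,resourcemanager.projects.list \
  --stage=GA
```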
After you define the custom roles, the next step is to bind those roles to a Google group (groups offer a convenient way to assign roles to many users at once). These bindings of roles to groups form a policy, and you can attach this policy to Google Cloud resources at any level of your organization’s resource hierarchy. Attaching policies at the appropriate level lets you share resources broadly while limiting the need to duplicate data just to share it.
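For instance, binding a project-level custom role to a Google group on a project could be done with a command like the one below; the group address, project ID, and role ID are placeholders tied to the earlier sketch.

```
# Grant the custom dataReader role (created above) to a Google group
# on the my-project project. The group and IDs are placeholders.
gcloud projects add-iam-policy-binding my-project \
  --member="group:bq-data-readers@example.com" \
  --role="projects/my-project/roles/dataReader"
```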
For example, an engineer in your organization may create stored procedures in the BI-dev project that read data stored in the Data-dev project, while running their BigQuery jobs from the Billed-dev project so you can gauge engineering spend from that project’s total on your invoice. To implement this in Google Cloud, grant your engineering group these roles (example commands follow the list):
- Developer role on BigQuery datasets in BI-dev
- Data reader role on BigQuery datasets, tables, and views in Data-dev
- The predefined roles/bigquery.jobUser role, or at least the bigquery.jobs.create permission, on the Billed-dev project
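One possible way to express those three grants with the gcloud CLI is sketched below. The project IDs, the group address, and the custom role IDs developer and dataReader are placeholders that assume you created custom roles like the earlier examples; you could also grant the data reader role on individual datasets in Data-dev instead of on the whole project.

```
# Developer custom role on the BI-dev project (role ID is illustrative).
gcloud projects add-iam-policy-binding bi-dev \
  --member="group:engineering@example.com" \
  --role="projects/bi-dev/roles/developer"

# Data reader custom role on the Data-dev project.
gcloud projects add-iam-policy-binding data-dev \
  --member="group:engineering@example.com" \
  --role="projects/data-dev/roles/dataReader"

# Predefined job user role on the Billed-dev project, where jobs run and are billed.
gcloud projects add-iam-policy-binding billed-dev \
  --member="group:engineering@example.com" \
  --role="roles/bigquery.jobUser"
```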
Give it a shot
In addition to trying out the roles we’ve described here, consider using BigQuery predefined roles, which are helpful in managing your data warehouse users.
Special thanks to Daryus Medora, who verified the permissions mapping and provided valuable feedback on this content.