, Author at Cloud bites from the grill

About

Posts by :

2024 07 10

GCP – How AlloyDB transformed Bayer’s data operations

Editor’s note: Bayer built its modern data solution, Field Answers, to store and analyze vast amounts of observational data. As part of the process of preparing to onboard new market segments, extensive load testing revealed that a new database solution was necessary to handle the dramatic increase in traffic. Migrating to AlloyDB for PostgreSQL has helped streamline operations, centralize solutions, and improve collaboration across the company.

Bayer uses the power of science to shape the future of farming. Our Global Data Assets team manages geospatial data for the company. We support hundreds of teams and applications with access to maps, phenotypic observations (observed physical characteristics of a plant), satellite imagery, and other environmental data like weather and soil strata.

Field Answers is a modern data solution we created to efficiently collect and compute billions of observations across field and greenhouse operations globally. This data is vital for the decisions made at various stages in our research and development (R&D) pipelines, including choosing the best seeds, optimizing the costs of production, and marketing our products to farmers. However, managing such a large-scale system presents its own challenges.

Weeding out database challenges

As we prepared to onboard a new market segment to Field Answers, we anticipated a dramatic increase in traffic to the tool. Field Answers is a distributed solution, and its sensitivity to order and replication lag can affect its performance. Based on extensive load testing, we knew our open-source PostgreSQL setup would not be able to meet latency and throughput demands. This would reduce access to the valuable datasets our teams require.

We needed a new database, and we needed it fast. After trying multiple products, AlloyDB for PostgreSQL emerged as our top choice. We received consistent support from the Google Cloud team as we tested AlloyDB, which assured us they would be there in case of any unforeseen migration issues. Because it was compatible with our existing Postgres database, we could migrate with zero application changes and hit our aggressive migration timelines. With the North American agricultural planting season just around the corner, that compatibility was huge!

Harvesting growth with AlloyDB

Migrating to AlloyDB has been transformative for our business. In our previous PostgreSQL setup, the primary writer was responsible for both write operations and replicating those changes to reader nodes. The anticipated increase in write traffic and reader count would have overwhelmed this node, leading to potential bottlenecks and increased replication lag. AlloyDB’s architecture, which utilizes a single source of truth for all nodes, significantly reduced the impact of scaling read traffic. After migrating, we saw a dramatic improvement in performance, ensuring our ability to meet growing demands and maintain consistently low replication delay. In parallel load tests, a smaller AlloyDB instance reduced response times by over 50% on average and increased throughput by 5x compared to our previous PostgreSQL solution.

By migrating to AlloyDB, we’ve ensured that our business growth won’t be hindered by database limitations, allowing us to focus on innovation. The true test of our migration came during our first peak harvest season, a time where performance is critical for product decision timelines. Due to agriculture’s seasonal nature, a delay of just a few days can postpone a product launch by an entire year. Our customers were understandably nervous, but thanks to Google Cloud and AlloyDB, the harvest season went as smoothly as we could have hoped for.

Cultivating a thriving data architecture

Partnering with Google Cloud has played a crucial role in implementing our data strategy, an adaptation of the data mesh approach where each asset serves a particular data domain. This approach allows us to decentralize data ownership and management, enabling domain-driven teams to take responsibility for their data while ensuring quality, accessibility, and governance.

To support our data strategy, we have adopted a consistent architecture across our Google Cloud projects. For a typical project, the stack consists of Google Kubernetes Engine (GKE) hosted pods and pipelines for publishing events and analytics data. While Bayer uses Apache Kafka across teams and cloud providers for data streaming, individual teams regularly use Pub/Sub internally for messaging and event-driven architectures. Data for analytics and reporting is generally stored in BigQuery, with custom processes for materialization once it lands. By using cross-project BigQuery datasets, we are able to work with a larger, real-time user group and enhance our operational capabilities.

Nurturing innovation through collaboration

Looking ahead, we’re excited about the potential to combine AlloyDB, Datastream, Pub/Sub, and BigQuery. With the built-in integrations between these tools, we see opportunities to reduce toil, increase reliability, and scale our applications more effectively. We’re also eager to explore AlloyDB’s integration with Vertex AI, which could open up new opportunities to use machine learning and advanced analytics.

As we continue our journey with Google Cloud, we’re confident we have the right tools to tackle the challenges and opportunities that lie ahead. By leveraging the power of AlloyDB and the Google Cloud ecosystem, we’re not only enhancing our own operational capabilities but also contributing to the future of farming. With more efficient and innovative solutions, we can help farmers make data-driven decisions, optimize their operations, and ultimately, feed the world more sustainably. The future of agriculture is digital, and we’re proud to be at the forefront of this transformation with Google Cloud by our side.

Next steps

Learn more about AlloyDB for PostgreSQL and get started for free today!

Discover the power of BigQuery.

Read More for the details.

2024 07 10

GCP – IAM so lost: A guide to identity in Google Cloud

Cloud, Google Cloud gcp

Identity and access management (IAM) can seem like a gentle hill at first. It’s not too difficult to manage, assigning logins and passwords, and granting access control based on a person’s job in an organization. However, that gentle hill can shift as you start to configure IAM. Suddenly, it can feel like a steep mountain slope: Terminology and concepts overlap and slip beneath your feet, which is a serious problem because security boundaries in the cloud are based on understanding crucial identity concepts.

To help you regain solid footing, let’s start by demystifying two foundational IAM access control principles: the concepts of least privilege and separation of duties. While often intermingled, they are distinct. Least-privilege is the idea that you must assign only the minimum rights needed to perform a task, while separation-of-duties requires that these rights are segregated by job responsibility.

Both are intended to reduce the impact of a negative act executed in your cloud environment. Just as when climbing Mount Everest, there are different climber classifications. A few of the climbers are guides, and even fewer are expedition leaders. While some of these functions might have overlapping responsibilities, specific jobs such as guides are focused on getting climbers to the top of the mountain.

For example, you wouldn’t want another random climber trying to lead the operation which could potentially put individuals in danger. You want an experienced expedition leader to be in charge.

How to summit the peak of access control

Now we can start to scale the mountain of access control. To build an IAM architecture that supports these concepts of least privilege and separation of duties, you need to map identities to the people in your organization.

That may seem obvious, but terms such as personas, principals, groups, roles, allow and deny policies can seem overwhelming. What do all these words mean? You don’t have to feel lost anymore, because we can help demystify these terms to help you ascend from base camp.

Part of the challenge of IAM is learning how to use I AM tools to apply IAM concepts.

Let’s review some terminology you’ll need to understand when configuring IAM in Google Cloud.

Principal: Something that needs access to perform an action on a resource in Google Cloud. These objects can be end-user or service accounts, groups, a Google Workspace account, or a Cloud Identity domain that can access a resource. To remember this term, use the mnemonic, “A principal is your pal in the cloud.”

Role: A collection of permissions that grants access to a resource.

Policy: A collection of roles attached to a resource. A policy will allow or deny a principal’s access to a resource.

Remember how we mentioned groups and personas? Fundamentally, these are the same thing. A persona is a job function. For example, you might have a persona called platform engineer or security analyst. These personas then become your groups when you map the humans in your organization to these job functions or personas. This matters because you want to assign permissions or roles to groups, but not directly to principals.

Managing IAM: Staying on top of the mountain

Let’s apply some of these concepts in a real-world scenario. Imagine you’re a platform engineer for a startup. Initially, you manually assign permissions (roles) to each individual based on their needs. Maybe you even use basic roles, because it seems easier to give everyone “editor” rights.

This approach works well when your team is small, but as your company expands and access control requirements increase, so does the complexity of managing all those permissions. New employees join, teams restructure, and responsibilities shift. Manually keeping up with these changes quickly turns into a nightmare, especially when you’re trying to achieve least-privilege and separation of duties.

Persona mapping can help address this scalability challenge with your IAM architecture. Instead of focusing on individual principals, you create groups based on job functions from your org chart. For example, you might have groups for developers, platform engineers, data scientists, and security engineers. Each group is then assigned the roles they need to perform their specific tasks.

By associating roles with groups, you simplify administration and ensure consistent access for users with similar job functions. Consider the scenario where Varsha, a software engineer, needs access to deploy a web application hosted in Google Cloud. With persona mapping, the IAM workflow would look like this:

Identify the persona: Varsha is classified as a software engineer for a web application deployed using Cloud Run.

Assign the group: Software engineers for this web application are mapped to the “web-app-developer” group.

Grant appropriate roles: The “web-app-developer” group is assigned the “Cloud Run Developer role,” providing the necessary read and write access to Cloud Run resources within Google Cloud.

This approach ensures that Varsha and other software engineers have the access they need while maintaining a manageable and scalable access control framework. Additionally, by assigning roles to groups, you create an IAM configuration that is self-documenting and easier to manage through automation.

A view of IAM building blocks.

To recap, by properly using the building blocks of IAM you will be able to:

Streamline onboarding: New employees are simply added to the appropriate group based on their job function, automatically granting them the necessary access.

Reduce administrative overhead: Manage permissions at the group level, saving you time and effort by creating a self-documenting IAM architecture.

Enhance security: By using consistent access control across similar roles, you minimize the risk of over-provisioning permissions for your principals.

Simplify auditing: Track access and identify potential issues more easily with clearly-defined group names based on job functions.

By understanding these foundational concepts and using techniques such as persona mapping, you can turn that IAM mountain back into a hill. Your organization will be on the path to better access control in the cloud, which is identity-first, using a scalable and adaptable IAM architecture that grows with your organization.

Read more about the building blocks of IAM in our documentation and in the manage identity and access section of the security, privacy, and compliance pillar of Google Cloud’s architecture framework. Also, remember our IAM policy intelligence tools, which can help you migrate to this modular, self-documenting approach.

Read More for the details.

2024 07 10

GCP – Faster API development with the Cloud Code plugin for API management

Cloud, Google Cloud gcp

Developers are at the heart of API delivery, and they need tools that make the process faster and easier. Apigee is a powerful API management platform that helps enterprises build, manage, monetize and scale APIs with tools that help developers simplify and speed up API development.

Apigee is accessible in the Google Cloud console, as well as via most commonly used IDEs via the Cloud Clode plug-in. With the Cloud Code plugin in VSCode, developers can create workspaces, build proxies, and even leverage Gemini Code Assist to streamline API delivery. To better understand the plug-in let’s first take a closer look at a workspace.

Workspace for organization

A workspace in Apigee is a collection of folders that is automatically created to help you develop APIs. Workspaces save time by providing you with a structured starting point to leverage all the tools you need for API development in one location. A workspace consists of:

API proxy: Proxies consist of a proxy endpoint (your API URL) and a target endpoint (connection point to your backend service). A proxy decouples your API from your backend. Additionally, you can enhance your proxy by extending functionality and management capabilities using policies which can be attached at different points as messages flow through your proxy.

Shared flow: A shared flow allows you to reuse common functionality like policies and resources across multiple APIs within an organization.

Environments: An environment is a dedicated space for creating, testing and deploying API proxies.

Test data: Test data allows you to bundle together a set of common resources that a developer might see in a production environment. Things like API products, developer accounts, test developer applications and values that your API would need to function.

Emulators: Emulators allow you to verify the functionality through unit and manual testing using a docker image (local runtime).

Gemini Code Assist for API management (Preview)

Gemini Code Assist unlocks the power of generative AI to help you build great software. Create an Open API specification from requirements that you describe in natural language within seconds, all from within the Cloud Code plugin. With tight integration with API Hub, you can register these specifications right from the plugin.

Harnessing the power of Gemini Code Assist starts with a prompt. A prompt allows you to describe your API specification in natural language. Gemini Code Assist offers contextual AI assistance because it understands objects in your organization’s API hub, which it uses to generate a tailored specification response in seconds based on your business requirements. It does this by reusing objects, metadata and security schemas from specifications that are already registered in your organization’s API hub, promoting consistency across all of your APIs.

Let’s say you want to create an API for placing new orders. In your prompt you may write, “Create an API for placing new orders.” Gemini then takes a look at all the APIs registered in API Hub and uses those business requirements to generate a draft. In the specification, Gemini pulls objects and schemas from the APIs in your registry to match the business requirements of other APIs. You can then refine the prompt until you get my desired specification using the editor. Additionally you can even see your prompt history all from the same screen.

Mock servers for increased velocity

Testing is a vital part of the API development process. The Cloud Code Plugin allows you to not only develop APIs but also test them locally or with others. Whenever an API specification is generated, a local mock server is also created alongside the specification. You can interact with your API from within the same UI you created your specification. Additionally you can modify parameters and insert mock values to simulate common scenarios and use cases. These auto generated mock servers allow for faster functionality verification of your API proxies.

In addition, you can easily deploy such a mock server to your Google Cloud project with a simple workflow. When you deploy your mock server to the cloud, you can share it with others on your team, speeding up the process of collecting feedback. For example, you as a service developer may share a mock service with your front-end counterparts who can try out the service before you write a single line of code implementing the API.

Get started

Ready to build and deploy an API with the Cloud Code plugin? Click here to get started. With Gemini Code Assist for API management you can reimagine API development.

Read More for the details.

2024 07 10

About

Posts by :

Weeding out database challenges

Harvesting growth with AlloyDB

Cultivating a thriving data architecture

Nurturing innovation through collaboration

How to summit the peak of access control

Managing IAM: Staying on top of the mountain

I Hate IAM: but I need it desperately

Workspace for organization

Gemini Code Assist for API management (Preview)

Mock servers for increased velocity

Get started