GCP – Get to know Workflows, Google Cloud’s serverless orchestration engine
Whether your company is processing e-commerce transactions, producing goods or delivering IT services, you need to manage the flow of work across a variety of systems. And while it’s possible to manage those workflows manually or with general-purpose tools, doing so is much easier with a purpose-built product.
Google Cloud has two workflow tools in its portfolio: Cloud Composer and the new Workflows. Introduced in August, Workflows is a fully managed workflow orchestration product running as part of Google Cloud. It’s fully serverless and requires no infrastructure management.
In this article we’ll discuss some of the use cases that Workflows enables, its features, and tips on using it effectively.
A sample workflow
First, consider the following workflow for generating an invoice:
A common way to orchestrate these steps is to call API services based on Cloud Functions, Cloud Run or a public SaaS API, e.g. SendGrid, which sends an e-mail with our PDF attachment. But real-life scenarios are typically much more complex than the example above and require continuous tracking of all workflow executions, error handling, decision points and conditional jumps, iterating arrays of entries, data conversions and many other advanced features.
Which is to say, while you can technically use general-purpose tools to manage this process, it’s not ideal. For example, consider some of the challenges you’d face processing this flow with an event-based compute platform like Cloud Functions. First, the maximum duration of a Cloud Function run is nine minutes, but workflows, especially those involving human interactions, can run for days; your workflow may need more time to complete, or you may need to pause between steps when polling for a response status. Second, chaining multiple Cloud Functions together with, for instance, Pub/Sub also works, but there’s no simple way to develop or operate such a workflow. In this model it’s very hard to associate step failures with workflow executions, which makes troubleshooting difficult. Also, understanding the state of all workflow executions requires a custom-built tracking model, further increasing the complexity of this architecture.
In contrast, workflow products provide support for exception handling and give visibility on executions and the state of individual steps, including successes and failures. Because the state of each step is individually managed, the workflow engine can seamlessly recover from errors, significantly improving reliability of the applications that use the workflows. Lastly, workflow products often come with built-in connectors to popular APIs and cloud products, saving time and letting you plug into existing API interfaces.
Workflow products on Google Cloud
Google Cloud’s first general-purpose workflow orchestration tool was Cloud Composer.
Based on Apache Airflow, Cloud Composer is great for data engineering pipelines like ETL orchestration, big data processing or machine learning workflows, and integrates well with data products like BigQuery or Dataflow. For example, Cloud Composer is a natural choice if your workflow needs to run a series of jobs in a data warehouse or big data cluster, and save results to a storage bucket.
However, if you want to process events or chain APIs in a serverless way—or have workloads that are bursty or latency-sensitive—we recommend Workflows.
Workflows scales to zero when you’re not using it, incurring no costs when it’s idle. Pricing is based on the number of steps executed, so you only pay when your workflow runs. And because Workflows doesn’t charge based on execution time, you don’t pay for the wait if a workflow pauses for a few hours between tasks.
Workflows scales up automatically with very low startup time and no “cold start” effect. It also transitions quickly between steps, which supports latency-sensitive applications.
Workflows use cases
When it comes to the number of processes and flows that Workflows can orchestrate, the sky’s the limit. Let’s take a look at some of the more popular use cases.
Processing customer transactions
Imagine you need to process customer orders and, in the case that an item is out of stock, trigger an inventory refill from an external supplier. During order processing you also want to notify your sales reps about large customer orders. Sales reps are more likely to react quickly if they get such notifications using Slack.
Here is an example workflow diagram.
The workflow above orchestrates calls to Google Cloud’s Firestore as well as external APIs including Slack, SendGrid or the inventory supplier’s custom API. It passes the data between the steps and implements decision points that execute steps conditionally, depending on other APIs’ outputs.
Each workflow execution—handling one transaction at a time—is logged so you can trace it back or troubleshoot it if needed. The workflow handles necessary retries or exceptions thrown by APIs, thus improving the reliability of the entire application.
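To make the flow concrete, here is a minimal Python sketch of the order-processing logic described above. It is an illustration only, not how Workflows is configured (workflow definitions are written in YAML or JSON). The names `Order`, `ExecutionLog` and `process_order`, the `LARGE_ORDER_TOTAL` threshold and the step names are all assumptions made for this example; the step names merely stand in for the Firestore, supplier, Slack and SendGrid calls, and no real API is contacted.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    order_id: str
    item: str
    quantity: int
    total: float


@dataclass
class ExecutionLog:
    """Records which steps ran for one order, mimicking a per-execution trace."""
    steps: list = field(default_factory=list)


# Hypothetical threshold above which sales reps are notified.
LARGE_ORDER_TOTAL = 1000.0


def process_order(order: Order, stock: dict, log: ExecutionLog) -> str:
    """Run the order-processing steps in sequence with two decision points."""
    log.steps.append("read_inventory")
    available = stock.get(order.item, 0)

    # Decision point 1: refill from the external supplier when out of stock.
    if available < order.quantity:
        log.steps.append("request_supplier_refill")
        status = "backordered"
    else:
        log.steps.append("reserve_stock")
        stock[order.item] = available - order.quantity
        status = "confirmed"

    # Decision point 2: notify sales reps about large orders only.
    if order.total >= LARGE_ORDER_TOTAL:
        log.steps.append("notify_sales_on_slack")

    log.steps.append("send_confirmation_email")
    return status
```

Each call to `process_order` corresponds to one execution handling one transaction, and the `ExecutionLog` plays the role of the per-execution trace you would use for troubleshooting.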
Processing uploaded files
Another case you may consider is a workflow that tags files that users have uploaded based on file contents. Because users can upload text files, images or videos, the workflow needs to use different APIs to analyze the content of these files.
In this scenario, a Cloud Storage trigger invokes a Cloud Function. The function then starts a workflow using the Workflows client library and passes the file path to the workflow as an argument.
In this example, a workflow decides which API to use depending on the file extension, and saves a corresponding tag to a Firestore database.
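The two halves of this scenario can be sketched in plain Python. The first function builds (but does not send) the kind of request a client library would issue to start an execution; the URL shape and the JSON-encoded `argument` field follow the Workflow Executions REST API as we understand it, so check the current API reference before relying on them. The second function mirrors the extension-based decision. The project, location and workflow names, the `file_path` key and the `ANALYZERS` mapping are all hypothetical.

```python
import json
from pathlib import PurePosixPath


def build_execution_request(project: str, location: str, workflow: str, file_path: str) -> tuple:
    """Return the (url, body) pair for starting one workflow execution."""
    parent = f"projects/{project}/locations/{location}/workflows/{workflow}"
    url = f"https://workflowexecutions.googleapis.com/v1/{parent}/executions"
    # Runtime arguments travel as a JSON-encoded string in the "argument" field.
    body = {"argument": json.dumps({"file_path": file_path})}
    return url, body


# Hypothetical mapping from file extension to the kind of analysis to run.
ANALYZERS = {
    ".txt": "text_analysis",
    ".png": "image_analysis",
    ".jpg": "image_analysis",
    ".mp4": "video_analysis",
}


def choose_analyzer(file_path: str) -> str:
    """Pick an analyzer from the file extension, ignoring letter case."""
    suffix = PurePosixPath(file_path).suffix.lower()
    return ANALYZERS.get(suffix, "unsupported")
```

In a real deployment the request would be sent with authenticated credentials, and the branching would live in the workflow definition itself rather than in Python.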
Workflows under the hood
You can implement all of these use cases out of the box with Workflows. Let’s take a deeper look at some key features you’ll find in Workflows.
Steps
Workflows handles sequencing of activities delivered as ‘steps’. If needed, a workflow can also be configured to pause between steps without generating time-related charges.
In particular, a workflow step can call practically any API that is reachable over the network via HTTP. You can make a call to any internet-based API, including SaaS APIs or your private endpoints, without having to wrap such calls in Cloud Functions or Cloud Run.
Authentication
When making calls to Google Cloud APIs, e.g., to invoke a Cloud Function or read data from Firestore, Workflows uses built-in IAM authentication. As long as your workflow has been granted IAM permission to use a particular Google Cloud API, you don’t need to worry about authentication protocols.
Communication between workflow steps
Most real-life workflows require that steps communicate with one another. Workflows supports built-in variables that steps can use to pass the result of their work to a subsequent step.
Automatic JSON conversion
As JSON is very common in API integrations, Workflows automatically converts API JSON responses to dictionaries, making it easy for the following steps to access this information.
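As a rough analogy in Python, the combination of variables and JSON conversion behaves like parsing a response body once and letting later steps read fields from the resulting dictionary. The response payload and the variable name `order_result` below are made up for illustration; in Workflows itself the conversion happens automatically and no parsing call is needed.

```python
import json

# A made-up API response body, as a step would receive it over HTTP.
raw_response = '{"order": {"id": "A-17", "items": [{"sku": "widget", "qty": 2}]}}'

# Step 1: parse the JSON body and store the result in a workflow-style variable.
variables = {"order_result": json.loads(raw_response)}

# Step 2: a later step reads nested fields from that variable.
order = variables["order_result"]["order"]
first_item = order["items"][0]
summary = f'{order["id"]}: {first_item["qty"]} x {first_item["sku"]}'
```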
Rich expression language
Workflows also comes with a rich expression language supporting arithmetic and logical operators, arrays, dictionaries and many other features. The ability to perform basic data manipulations directly in the workflow further simplifies API integrations. Because Workflows accepts runtime arguments, you can use a single workflow to react to different events or input data.
Decision points
With variables and expressions, we can implement another critical component of most workflows: decision points. Workflows can use custom expressions to decide whether to jump to another part of the workflow or conditionally execute a step.
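One way to picture a decision point that jumps between steps is a table of named steps, where each step returns the name of the step to run next. The Python sketch below is only a model of that idea, not the Workflows syntax; the invoice amounts, thresholds and step names are hypothetical. Conditions are tried in order and the first one that holds selects the jump target.

```python
def run(steps: dict, start: str, variables: dict) -> list:
    """Execute named steps until one returns None, recording the visit order."""
    visited, current = [], start
    while current is not None:
        visited.append(current)
        current = steps[current](variables)
    return visited


def route_invoice(v: dict):
    # Conditions are tried in order; the first one that holds picks the next step.
    branches = [
        (v["amount"] >= 10_000, "manager_approval"),
        (v["amount"] >= 1_000, "notify_finance"),
    ]
    for condition, target in branches:
        if condition:
            return target
    return "send_invoice"  # default when no condition matches


steps = {
    "route_invoice": route_invoice,
    "manager_approval": lambda v: "send_invoice",
    "notify_finance": lambda v: "send_invoice",
    "send_invoice": lambda v: None,  # returning None ends the run
}
```

Calling `run(steps, "route_invoice", {"amount": 1500})` visits `route_invoice`, `notify_finance` and `send_invoice` in that order, while a small amount skips straight to `send_invoice`.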
Sub-workflows
Frequently used parts of the logic can be coded as a sub-workflow and then called as a regular step, working similarly to routines in many programming languages.
Retries and exception handling
Sometimes, a step in a workflow fails, e.g., due to a network issue or because a particular API is down. This, however, shouldn’t immediately make the entire workflow execution fail.
Workflows avoids that problem with a combination of configurable retries and exception handling that together allow a workflow to react appropriately to an error returned by the API call.
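The interplay between retries and exception handling can be sketched in Python as follows. This is a generic retry-with-exponential-backoff pattern under our own assumptions, not the Workflows implementation: `TransientError`, `call_with_retries`, `run_step_safely` and the parameter names and defaults are invented for this example.

```python
import time


class TransientError(Exception):
    """Stands in for a retryable failure such as a timeout or an HTTP 503."""


def call_with_retries(step, max_retries=3, initial_delay=0.01, multiplier=2.0, sleep=time.sleep):
    """Call step(); on TransientError wait and retry, up to max_retries retries."""
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return step()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted: let the caller's handler deal with it
            sleep(delay)
            delay *= multiplier  # exponential backoff between attempts


def run_step_safely(step):
    """Exception handling around the retried step: report failure instead of crashing."""
    try:
        return {"status": "ok", "result": call_with_retries(step)}
    except TransientError as err:
        return {"status": "failed", "error": str(err)}
```

Retries absorb short-lived failures, and the surrounding handler decides what the workflow does once retries run out, so a single failing API call does not have to fail the whole execution.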
All features above are configurable as part of the Workflows source code. You can see practical examples of these configurations here.
Get started with Workflows today
Workflows is a powerful new addition to Google Cloud’s application development and management toolset, and you can try it out immediately on all your projects.
Have a look at the Workflows site or go right ahead to the Cloud Console to build your first workflow. Workflows comes with a free tier so you can give it a try at no cost. Also, watch out for exciting Workflows announcements coming soon!
Happy orchestrating! 🙂