GCP – Cloud Run adds support for gradual rollouts and rollbacks
Like all developers, you want the confidence that your next deployment will be healthy.
But sending 100% of traffic to a new version immediately is risky: if something goes wrong, every request is affected. A best practice is to roll out changes gradually and watch for regressions or errors in the new version of the code. In fact, at Google, we do rollouts over four days for many of our components.
Cloud Run, our fully managed container compute platform, now gives you more control over the rollout of your changes. As always, any change to the configuration of a Cloud Run service creates a new revision, and by default, Cloud Run automatically routes 100% of traffic to each newly created revision. But now you can also decide to manually split traffic between revisions, allowing you to gradually roll out revisions or roll back to an older revision.
A gradual rollout
Let’s take a look at an example of a gradual rollout, a so-called “blue/green” deployment. We’ll go gradually from the current revision (tagged “blue”) to a new revision (tagged “green”).
We build our new code into a container image that we tag with the hash of our git commit (“f5bd774” in our example):
gcloud builds submit . --tag gcr.io/project/image:f5bd774
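The commit hash does not have to be typed by hand. As a minimal sketch, assuming the build runs from inside a git checkout, the tag can be derived from the current commit. The helper name image_ref is hypothetical and only assembles the image reference used above:

```shell
# Hypothetical helper: assembles the image reference used in the build command.
# $1 = GCP project, $2 = image name, $3 = tag (here, a short git commit hash)
image_ref() {
  printf 'gcr.io/%s/%s:%s\n' "$1" "$2" "$3"
}

# Usage from inside a git checkout (shown as comments, not executed here):
#   commit="$(git rev-parse --short HEAD)"
#   gcloud builds submit . --tag "$(image_ref project image "$commit")"
```

Tagging by commit hash keeps each revision traceable to the exact source it was built from.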
We can now run integration tests on this container and deploy it to a staging environment (for example in a staging GCP project). Once this container has been vetted, it’s time to push it to production.
We deploy this new container to our production service, creating a new revision, but without sending any traffic to it. We also tag this revision as green.
gcloud beta run deploy myservice --image gcr.io/project/image:f5bd774 --no-traffic --tag green
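The walkthrough assumes that the revision currently serving traffic already carries the blue tag. If it does not, one way to attach it is the --update-tags flag of update-traffic. The sketch below only prints the command. The helper name tag_revision_cmd and the placeholder CURRENT_REVISION are assumptions for illustration, so substitute the real revision name before running the printed command:

```shell
# Hypothetical helper: prints the command that attaches a tag to an existing revision.
# $1 = service name, $2 = tag, $3 = revision name
tag_revision_cmd() {
  printf 'gcloud beta run services update-traffic %s --update-tags %s=%s\n' "$1" "$2" "$3"
}

# CURRENT_REVISION is a placeholder for the name of the stable revision.
tag_revision_cmd myservice blue CURRENT_REVISION
```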
The “green” tag allows us to test the new revision directly at a dedicated URL, without migrating any traffic to it.
As developers, we can test the tagged revision at:
https://green---myservice-abcdef.a.run.app
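The tagged URL in this example is the service URL with the tag and three hyphens prepended to the host name. As a rough sketch that assumes this pattern holds for your service, a small helper (tagged_url is a hypothetical name) can build the URL for a quick smoke test:

```shell
# Hypothetical helper: derives the tagged URL from a tag and the service URL,
# following the pattern shown in the example above.
# $1 = tag, $2 = service URL (must start with https://)
tagged_url() {
  printf 'https://%s---%s\n' "$1" "${2#https://}"
}

# Usage (shown as comments, not executed here): print only the HTTP status code.
#   curl -sS -o /dev/null -w '%{http_code}\n' \
#     "$(tagged_url green https://myservice-abcdef.a.run.app)"
```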
After confirming that the new revision works properly, we can start migrating traffic to it:
On the first day, we start by migrating 1% of the traffic to the new revision.
Here’s how to do that from the command line:
gcloud beta run services update-traffic myservice --to-tags green=1
The same traffic split can also be set from the Cloud Console.
We can now look at Cloud Monitoring, Logging or Error Reporting, filtering by revision to see if any new errors or latency show up.
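Filtering by revision can also be done from the command line. The sketch below builds a Cloud Logging filter for error-level entries from a single revision. The helper name revision_error_filter and the placeholder NEW_REVISION are assumptions, and the filter relies on the cloud_run_revision resource type with its service_name and revision_name labels:

```shell
# Hypothetical helper: builds a Cloud Logging filter for errors from one revision.
# $1 = service name, $2 = revision name
revision_error_filter() {
  printf 'resource.type="cloud_run_revision" AND resource.labels.service_name="%s" AND resource.labels.revision_name="%s" AND severity>=ERROR\n' "$1" "$2"
}

# Usage (shown as comments, not executed here); NEW_REVISION is a placeholder:
#   gcloud logging read "$(revision_error_filter myservice NEW_REVISION)" \
#     --limit 20 --freshness 1h
```

An empty result at 1% of traffic is a weak signal on a low-traffic service, so it is worth waiting for enough requests before moving to the next step.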
The next day we migrate 10% of the traffic to the new revision and keep an eye on health indicators:
gcloud beta run services update-traffic myservice --to-tags green=10
On Day 3, we move 50% of traffic to the new revision:
gcloud beta run services update-traffic myservice --to-tags green=50
And on Day 4, we route all traffic to this revision:
gcloud beta run services update-traffic myservice --to-tags green=100
The rollout is now complete: the new code serves all requests reaching the Cloud Run service.
Of course, you don’t need to perform a rollout over four days—you can gradually roll it out over a few hours using the same commands.
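For a shorter rollout, the same steps can be scripted. What follows is a minimal sketch rather than a production tool: the variable names, the rollout and shift_traffic functions, and the dry-run default are all assumptions made for illustration. With DRY_RUN=1 it only prints the commands it would run:

```shell
# Staged rollout sketch. All names below are illustrative assumptions.
SERVICE="${SERVICE:-myservice}"
TAG="${TAG:-green}"
STEPS="${STEPS:-1 10 50 100}"          # traffic percentages, in order
PAUSE_SECONDS="${PAUSE_SECONDS:-0}"    # wait between steps, e.g. 3600 for one hour
DRY_RUN="${DRY_RUN:-1}"                # 1 = print commands only, 0 = execute them

# Sends the given percentage of traffic to the tagged revision.
# $1 = percentage
shift_traffic() {
  set -- gcloud beta run services update-traffic "$SERVICE" --to-tags "$TAG=$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Walks through every step, pausing after each one except the last.
rollout() {
  for pct in $STEPS; do
    shift_traffic "$pct" || return 1
    [ "$pct" = "100" ] || sleep "$PAUSE_SECONDS"
  done
}
```

Calling rollout with the defaults prints the four update-traffic commands from the walkthrough. Note that this sketch pauses for a fixed time and does not check health between steps, so the monitoring described above still has to happen before each increase.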
Rolling back
If something goes wrong at any time during the deployment or afterwards, we can roll traffic back to the older stable revision tagged “blue” with:
gcloud beta run services update-traffic myservice --to-tags blue=100
Or, in the event the revision was not tagged, we can roll back to any specific revision by its name using gcloud or the user interface. Here’s the command line:
gcloud run services update-traffic myservice --to-revisions my-service-0002-joy=100
The same rollback can also be performed from the user interface.
Safety first
When adding new features to a service, the last thing you want to do is introduce errors. With support for gradual rollouts and rollback, you can feel confident using Cloud Run to host your most important, user-facing applications. Give Cloud Run a try today, and let us know what other features you’d like to see.