GCP – Centralized Multi Cluster Ingress with Anthos Service Mesh
We have never seen a proliferation of open source products sparked by a tool like we have seen with Kubernetes. These building blocks are what make Kubernetes the de facto container orchestrator to securely and reliably run services. Since organizations don’t run Kubernetes on its own, Kubernetes is more often than not deployed alongside other tools to solve real business challenges that Kuberntes doesn’t natively solve. Google offers a managed collection of these products under the Anthos umbrella.
In this article, we will demonstrate how organizations can leverage Anthos to centralize the management of internet traffic using Multi Cluster Ingress (MCI) and Anthos Service Mesh (ASM).
Problem statement
For large organizations, the separation of duty is a major concern. It’s this concern that drives the design of the cloud foundation. Projects are the boundaries to create least privileges strategy for most of the resources Google Cloud Platform provides. Hence, it’s fair to assume that separate teams would have separate projects to run their workloads.
This separation of duty applies also to the management of the networking. Shared VPC is one of the tools teams have to centrally manage networking. This allows multiple projects to share the same VPC. Having the same VPC is also required for some features Anthos provides.
Applications running in Kubernetes sometimes need to be exposed to the internet. Exposing the application can be achieved using a load balancer. The management of the lifecycle and configuration of the load balancer is done through annotations and custom resource definitions (CRD). MCI is a managed controller that manages the lifecycle of Google Cloud Global Load Balancer (GLBC) in order to expose services running on Kubernetes to the internet. Unfortunately, GLBC doesn’t support cross-project backends. Since projects are the foundation to implementing least privilege strategy, one solution is to deploy a GLBC per project. The management overhead that this would create for the network and security teams is substantial. The second solution is to deploy one GLBC using MCI and use ASM to route the traffic cross-projects. This can also enable security teams to achieve the required separation of duty with a seamless separation of responsibilities while minimizing confusion and accidental configuration errors.
Now that we have talked about the problem we are solving and what tools we are going to use, let’s deep dive into how we are going to do it.
Architecture Diagram
This architecture shows the suggested solution to support MCI in a multi project setup. Both the fleet project and the service project share the same VPC. The shared VPC host project is not depicted in the picture but it’s a separate project. The fleet project refers to the project where the Fleet API is enabled. All the clusters that are part of the same fleet are also part of the same mesh. One of the fleet GKE clusters hosts the MCI configuration, but the configurations are deployed in both Fleet GKE clusters for high availability. Deploying the same configuration in both clusters makes it both possible and easy to swap the configuration cluster.
The service projects on the other hand host workloads. Development teams would have access to a namespace to deploy services they manage.
How to Implement this Solution
To implement this solution, a deep understanding of how the traffic flows is key. The following steps describe the process to make sure everything is working. Before starting, take the following actions:
Make sure MCI is already set up.
Create a namespace for ingress gateway and other resources.
Deploy the multi cluster service resource and use the selector that would send the traffic to the ingress controller that was deployed. The cluster spec should only link Fleet GKE clusters because this is where the ASM ingress resources are deployed. The multi cluster service resource creates Network Endpoint Groups (NEG).
Conclusion
In this blog post, we have seen that even though MCI doesn’t support backends in different projects we can overcome this limitation using ASM. The combination of both products makes it easy to centralize traffic management and reuse the same GLBC for workloads running in different projects. Make sure to check the GKE Network Recipes repository for more GKE ingress examples.
Read More for the details.