Hands-on with Anthos on Bare Metal
In this blog post I want to walk you through my experience installing Anthos on bare metal (ABM) in my home lab. It covers the benefits of deploying Anthos on bare metal, the necessary prerequisites, the installation process, and using Google Cloud operations capabilities to inspect the health of the deployed cluster. This post isn’t meant to be a complete guide to installing Anthos on bare metal; for that, I’d point you to the tutorial I posted on our community site.
What is Anthos and Why Run it on Bare Metal?
We recently announced that Anthos on bare metal is generally available. I don’t want to rehash the entirety of that post, but I do want to recap some key benefits of running Anthos on your own systems, in particular:
- Removing the dependency on a hypervisor can lower both the cost and complexity of running your applications.
- In many use cases, there are performance advantages to running workloads directly on the server.
- Having the flexibility to deploy workloads closer to the customer can open up new use cases by lowering latency and increasing application responsiveness.
Environment Overview
In my home lab I have a couple of Intel Next Unit of Computing (NUC) machines. Each is equipped with an i7 processor, 32GB of RAM, and a single 250GB SSD. Anthos on bare metal requires 32GB of RAM and at least 128GB of free disk space.
Both of these machines are running Ubuntu Server 20.04 LTS, one of the supported distributions for Anthos on bare metal (the others being Red Hat Enterprise Linux 8.1 and CentOS 8.1).
One of these machines will act as the Kubernetes control plane, and the other will be my worker node. Additionally, I will use the worker node to run bmctl, the Anthos on bare metal command line utility used to provision and manage the Anthos on bare metal Kubernetes cluster.
On Ubuntu machines, AppArmor and UFW both need to be disabled. Additionally, since I’m using the worker node to run bmctl, I need to make sure that gcloud, gsutil, and Docker 19.03 or later are all installed.
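To give a sense of the prep involved, here’s a rough sketch of what I ran on each node. This is illustrative rather than the official procedure; it assumes the docker.io package on Ubuntu 20.04 is recent enough to satisfy the Docker 19.03+ requirement:

# Disable AppArmor and UFW (both required on Ubuntu nodes)
sudo systemctl stop apparmor && sudo systemctl disable apparmor
sudo ufw disable

# Install Docker (docker.io on Ubuntu 20.04 ships a 19.03+ release)
sudo apt-get update && sudo apt-get install -y docker.io

# Confirm the tooling bmctl depends on is present
docker --version
gcloud --version
gsutil --version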
On the Google Cloud side, I need to make sure I have a project created where I have the owner and editor roles. Anthos on bare metal also makes use of three service accounts and requires a handful of APIs. Rather than creating the service accounts and enabling the APIs myself, I chose to let bmctl do that work for me.
Since I want to take a look at the Cloud Operations dashboards that Anthos on bare metal creates, I need to provision a Cloud Monitoring Workspace.
When you run bmctl to perform the installation, it uses SSH to execute commands on the target nodes. For this to work, I need to configure passwordless SSH between the worker node and the control plane node. If I were using more than two nodes, I’d need to configure connectivity between the node where I run bmctl and all of the target nodes.
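Setting up passwordless SSH is the usual key dance; something like the following, where the user name and IP are placeholders for my own values:

# On the node that will run bmctl: generate a key pair if you don't have one
ssh-keygen -t rsa -b 4096

# Copy the public key to each target node (the control plane node, in my case)
ssh-copy-id user@192.168.86.51

# Verify you can run a command without a password prompt
ssh user@192.168.86.51 hostname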
With all the prerequisites met, I was ready to download bmctl and set up my cluster.
Deploying Your Cluster
To actually deploy a cluster I need to perform the following high-level steps:
- Install bmctl
- Verify my network settings
- Create a cluster configuration file
- Modify the cluster configuration file
- Deploy the cluster using bmctl and my customized cluster configuration file
Installing bmctl is pretty straightforward. I used gsutil to copy it down from a Google Cloud Storage bucket to my worker machine, and set the execution bit.
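In practice that’s just a couple of commands. The bucket path below includes the release version; 1.6.0 was current when I set this up, so substitute whatever version you’re targeting:

# Download bmctl from the release bucket and make it executable
gsutil cp gs://anthos-baremetal-release/bmctl/1.6.0/linux-amd64/bmctl .
chmod a+x bmctl
./bmctl --help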
Anthos on Bare Metal Networking
When configuring Anthos on bare metal, you will need to specify three distinct IP subnets.
Two are fairly standard to Kubernetes: the pod network and the services network.
The third subnet is used for ingress and load balancing. The IPs associated with this network must be on the same local L2 network as your load balancer node (which in my case is the same as the control plane node). You will need to specify a VIP for the control plane, one for ingress, and then a range of addresses for the load balancers to draw from when exposing your services outside the cluster. The ingress VIP must fall within the range you specify for the load balancer pool, but the control plane VIP must not be in that range.
The CIDR range for my local network is 192.168.86.0/24. Furthermore, I have my Intel NUCs all on the same switch, so they are all on the same L2 network.
One thing to note is that the default pod network (192.168.0.0/16) overlapped with my home network. To avoid any conflicts, I set my pod network to use 172.16.0.0/16. Because there is no conflict, my services network is using the default (10.96.0.0/12). It’s important to ensure that your chosen local network doesn’t conflict with the bmctl defaults.
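If you want to double-check for a collision before committing to a range, a quick one-liner does the trick (this is just an illustrative check, not part of the bmctl tooling):

# Prints True if the two CIDR ranges overlap
python3 -c 'import ipaddress as ip; print(ip.ip_network("192.168.0.0/16").overlaps(ip.ip_network("192.168.86.0/24")))'  # True: default pod CIDR collides with my LAN
python3 -c 'import ipaddress as ip; print(ip.ip_network("172.16.0.0/16").overlaps(ip.ip_network("192.168.86.0/24")))'   # False: safe to use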
Given this configuration, I’ve set my control plane VIP to 192.168.86.99. The ingress VIP, which needs to be part of the range that you specify for your load balancer pool, is 192.168.86.100. And I’ve set my pool of addresses for my load balancers to 192.168.86.100-192.168.86.150.
In addition to the IP ranges, you will also need to specify the IP address of the control plane node and the worker node. In my case the control plane is 192.168.86.51 and the worker node IP is 192.168.86.52.
Create the Cluster Configuration File
To create the cluster configuration file, I connected to my worker node via SSH. Once connected I authenticated to Google Cloud.
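Authenticating is the standard gcloud flow; roughly:

# Log in and point gcloud at the project bmctl should use
gcloud auth login
gcloud auth application-default login
gcloud config set project $PROJECT_ID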
The command below will create a cluster configuration file for a new cluster named demo-cluster. Notice that I used the --enable-apis and --create-service-accounts flags. These flags tell bmctl to create the necessary service accounts and enable the appropriate APIs.
./bmctl create config -c demo-cluster \
  --enable-apis \
  --create-service-accounts \
  --project-id=$PROJECT_ID
Edit the Cluster Configuration File
The output from the bmctl create config command is a YAML file that defines how my cluster should be built. I needed to edit this file to provide the networking details I mentioned above, the location of the SSH key to be used to connect to the target nodes, and the type of cluster I want to deploy.
With Anthos on bare metal, you can create standalone and multi-cluster deployments:
- Standalone: This deployment model has a single cluster that serves as both a user cluster and an admin cluster.
- Multi-cluster: Used to manage fleets of clusters; includes both admin and user clusters.
Since I’m deploying just a single cluster, I needed to choose standalone.
Here are the specific changes I made to the cluster definition file.
Under the list of access keys at the top of the file:
- For the sshPrivateKeyPath variable, I specified the path to my SSH private key
Under the Cluster definition:
- Changed the type to standalone
- Set the IP address of the control plane node
- Adjusted the CIDR range for the pod network
- Specified the control plane VIP
- Uncommented and specified the ingress VIP
- Uncommented the addressPools section (excluding the actual comments) and specified the load balancer address pool
Under the NodePool definition:
- Specified the IP address of the worker node
For reference, I’ve created a GitLab snippet for my cluster definition yaml (with the comments removed for the sake of brevity).
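To make those edits concrete, here’s an abridged sketch of what the relevant fields end up looking like. Field names follow the Anthos on bare metal config format as I remember it for this release; treat this as illustrative rather than a drop-in file:

# Path to the private key bmctl uses to SSH into the nodes
sshPrivateKeyPath: /home/user/.ssh/id_rsa
---
apiVersion: baremetal.cluster.gke.io/v1
kind: Cluster
metadata:
  name: demo-cluster
  namespace: cluster-demo-cluster
spec:
  type: standalone                  # single cluster acting as admin + user
  controlPlane:
    nodePoolSpec:
      nodes:
      - address: 192.168.86.51      # control plane node
  clusterNetwork:
    pods:
      cidrBlocks:
      - 172.16.0.0/16               # changed from the 192.168.0.0/16 default
    services:
      cidrBlocks:
      - 10.96.0.0/12                # default services network
  loadBalancer:
    mode: bundled
    vips:
      controlPlaneVIP: 192.168.86.99
      ingressVIP: 192.168.86.100    # must fall inside the address pool
    addressPools:
    - name: pool1
      addresses:
      - 192.168.86.100-192.168.86.150
---
apiVersion: baremetal.cluster.gke.io/v1
kind: NodePool
metadata:
  name: node-pool-1
  namespace: cluster-demo-cluster
spec:
  clusterName: demo-cluster
  nodes:
  - address: 192.168.86.52          # worker node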
Create the Cluster
Once I had modified the configuration file, I was ready to deploy the cluster with the bmctl create cluster command.
./bmctl create cluster -c demo-cluster
bmctl will complete a series of preflight checks before creating your cluster. If any of the checks fail, check the log files specified in the output.
Once the installation is complete, the kubeconfig file is written to /bmctl-workspace/demo-cluster/demo-cluster-kubeconfig.
Using the supplied kubeconfig file, I can operate against the cluster as I would any other Kubernetes cluster.
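For example:

# Point kubectl at the new cluster and take a look around
export KUBECONFIG=/bmctl-workspace/demo-cluster/demo-cluster-kubeconfig
kubectl get nodes
kubectl get pods --all-namespaces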
Exploring Logging and Monitoring
Anthos on bare metal automatically creates three Google Cloud Operations (formerly Stackdriver) logging and monitoring dashboards when a cluster is provisioned: node status, pod status, and control plane status. These dashboards enable you to quickly gain visual insight into the health of your cluster. In addition to the three dashboards, you can use Google Cloud Operations Metrics Explorer to create custom queries for a wide variety of performance data points.
To view the dashboards, return to Google Cloud Console, navigate to the Operations section, and then choose Monitoring and Dashboards.
You should see the three dashboards in the list in the middle of the screen. Choose each of the three dashboards and examine the available graphs.
Conclusion
That’s it! Anthos on bare metal enables you to create centrally managed Kubernetes clusters with a few commands. Once deployed, you can view your clusters in the Google Cloud Console and deploy applications as you would with any other GKE cluster. If you’ve got the hardware available, I’d encourage you to run through my hands-on tutorial.