A simple Kubernetes Operator to return VMware ESXi host information

This repository contains a very simple Kubernetes Operator that uses VMware's govmomi to return some simple ESXi host information through the status field of a Custom Resource (CR), which is called HostInfo. This will require us to extend Kubernetes with a new Custom Resource Definition (CRD). The code shown here is for educational purposes only, showing one way in which a Kubernetes controller / operator can access the underlying vSphere infrastructure for the purposes of querying resources.

You can think of a CRD as representing the desired state of a Kubernetes object or Custom Resource, and the function of the operator is to run the logic or code to make that desired state happen - in other words the operator has the logic to do whatever is necessary to achieve the object's desired state.
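The desired-state idea can be sketched in a few lines of Go. This is only an illustration, not part of the HostInfo operator: the `resource` type, its replica counters and the `reconcile` function are all hypothetical names made up for this example.

```go
package main

import "fmt"

// resource is a hypothetical stand-in for a Custom Resource: spec holds the
// desired state, status holds the state the controller last observed.
type resource struct {
	specReplicas   int
	statusReplicas int
}

// reconcile compares desired and observed state, moves the observed state one
// step closer to the desired state, and reports the action it took.
func reconcile(r *resource) string {
	switch {
	case r.statusReplicas < r.specReplicas:
		r.statusReplicas++
		return "scale up"
	case r.statusReplicas > r.specReplicas:
		r.statusReplicas--
		return "scale down"
	default:
		return "in sync"
	}
}

func main() {
	r := &resource{specReplicas: 2}
	// Keep reconciling until observed state matches desired state.
	for {
		action := reconcile(r)
		fmt.Println(action, r.statusReplicas)
		if action == "in sync" {
			break
		}
	}
}
```

A real controller does not loop like this itself; the controller runtime calls the reconcile function whenever it sees a relevant change.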

What are we going to do in this tutorial?

In this example, we will create a CRD called HostInfo. HostInfo will contain the name of an ESXi host in its specification. When a Custom Resource (CR) is created and subsequently queried, we will call an operator (logic in a controller) whereby the Total CPU and Free CPU from the ESXi host will be returned via the status fields of the object through govmomi API calls.

The following will be created as part of this tutorial:

  • A Custom Resource Definition (CRD)

    • Group: topology
      • Kind: HostInfo
      • Version: v1
      • Specification will include a single item: Spec.Hostname
  • One or more HostInfo Custom Resources (objects) will be created through yaml manifests, each manifest containing the hostname of an ESXi host that we wish to query. The fields which will be updated to contain the relevant information from the ESXi host (visible when the CR is queried) are:

    • Status.TotalCPU
    • Status.FreeCPU
  • An Operator (or business logic) to retrieve the Total and Free CPU from the ESXi host specified in the CR will be coded in the controller for this CR.

Note: A similar exercise creates an operator to query virtual machine information. This can be found here. A third exercise creates an operator that gets First Class Disk (FCD) information from a PV that is deployed on vSphere storage. The FCD operator is available here.

What is not covered in this tutorial?

The assumption is that you already have a working Kubernetes cluster. Installation and deployment of Kubernetes is outside the scope of this tutorial. If you do not have a Kubernetes cluster available, consider using Kubernetes in Docker (shortened to Kind) which uses containers as Kubernetes nodes. A quickstart guide can be found here:

The assumption is that you also have a VMware vSphere environment comprising at least one ESXi hypervisor which is managed by a vCenter server. While the thought process is that your Kubernetes cluster will be running on vSphere infrastructure, and thus this operator will help you examine how the underlying vSphere resources are being consumed by the Kubernetes clusters running on top, it is not necessary for this to be the case for the purposes of this tutorial. You can use this code to query any vSphere environment from Kubernetes.

What if I just want to understand some basic CRD concepts?

If even this sounds too daunting at this stage, I strongly recommend checking out the excellent tutorial on CRDs from my colleague, Rafael Brito. His RockBand CRD tutorial uses some very simple concepts to explain how CRDs, CRs, Operators, spec and status fields work.

Step 1 - Software Requirements

You will need the following components pre-installed on your desktop or workstation before we can build the CRD and operator.

  • A git client/command line
  • Go (v1.15+) - earlier versions may work but I used v1.15.
  • Docker Desktop
  • Kubebuilder
  • Kustomize
  • Access to a Container Image Repository (docker.io, quay.io, harbor)
  • A make binary - used by Kubebuilder

If you are interested in learning more about Golang basics, I found this site very helpful.

Step 2 - KubeBuilder Scaffolding

The CRD is built using kubebuilder. I'm not going to spend a great deal of time talking about KubeBuilder. Suffice to say that KubeBuilder builds a directory structure containing all of the templates (or scaffolding) necessary for the creation of CRDs. Once this scaffolding is in place, this tutorial will show you how to add your own specification fields and status fields, as well as how to add your own operator logic. In this example, our logic will log in to vSphere, then query and return ESXi host CPU statistics via a Kubernetes CR / object / Kind called HostInfo, the values of which will be used to populate status fields in our CRs.

The following steps will create the scaffolding to get started.

$ mkdir hostinfo
$ cd hostinfo

Next, define the Go module name of your CRD. In my case, I have called it hostinfo. This creates a go.mod file with the name of the module and the Go version (v1.15 here).

$ go mod init hostinfo
go: creating new go.mod: module hostinfo
$ ls
go.mod
$ cat go.mod
module hostinfo

go 1.15

Now we can proceed with building out the rest of the directory structure. The following kubebuilder commands (init and create api) create all the scaffolding necessary to build our CRD and operator. You may choose an alternate domain here if you wish. Simply make note of it as you will be referring to it later in the tutorial.

kubebuilder init --domain corinternal.com

Here is what the output from the command looks like:

$ kubebuilder init --domain corinternal.com
Writing scaffold for you to edit...
Get controller runtime:
$ go get sigs.k8s.io/controller-runtime@v0.5.0
Update go.mod:
$ go mod tidy
Running make:
$ make
go: creating new go.mod: module tmp
go: found sigs.k8s.io/controller-tools/cmd/controller-gen in sigs.k8s.io/controller-tools v0.2.5
/usr/share/go/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
go fmt ./...
go vet ./...
go build -o bin/manager main.go
Next: define a resource with:
$ kubebuilder create api
$

As the output from the previous command states, we must now define a resource. To do that, we again use kubebuilder to create the resource, specifying the API group, its version and supported kind. My group is called topology, my kind is called HostInfo and my initial version is v1.

kubebuilder create api \
--group topology       \
--version v1           \
--kind HostInfo        \
--resource=true        \
--controller=true

Here is the output from that command:

$ kubebuilder create api --group topology --version v1 --kind HostInfo --resource=true --controller=true
Writing scaffold for you to edit...
api/v1/hostinfo_types.go
controllers/hostinfo_controller.go
Running make:
$ make
go: creating new go.mod: module tmp
go: found sigs.k8s.io/controller-tools/cmd/controller-gen in sigs.k8s.io/controller-tools v0.2.5
/usr/share/go/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
go fmt ./...
go vet ./...
go build -o bin/manager main.go

Our operator scaffolding (directory structure) is now in place. The next step is to define the specification and status fields in our CRD. After that, we create the controller logic which will watch our Custom Resources, and bring them to desired state (called a reconcile operation). More on this shortly.

Step 3 - Create the CRD

Custom Resource Definitions (CRDs) are a way to extend Kubernetes through Custom Resources. We are going to extend a Kubernetes cluster with a new custom resource called HostInfo, which will retrieve information from an ESXi host whose name is specified in a Custom Resource. Thus, I will need to create a field called hostname in the CRD - this defines the specification of the custom resource. We also add two status fields, which will be used to return the TotalCPU and FreeCPU information from the ESXi host.

This is done by modifying the api/v1/hostinfo_types.go file. Here is the initial scaffolding / template provided by kubebuilder:

// HostInfoSpec defines the desired state of HostInfo
type HostInfoSpec struct {
        // INSERT ADDITIONAL SPEC FIELDS - desired state of cluster
        // Important: Run "make" to regenerate code after modifying this file

        // Foo is an example field of HostInfo. Edit HostInfo_types.go to remove/update
        Foo string `json:"foo,omitempty"`
}

// HostInfoStatus defines the observed state of HostInfo
type HostInfoStatus struct {
        // INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
        // Important: Run "make" to regenerate code after modifying this file
}

// +kubebuilder:object:root=true

This file is modified to include a single spec.hostname field and to return two status fields. There are also a number of kubebuilder directives added, which are used to do validation and other kubebuilder related functions. The shortname "hi" will be used later on in our controller logic. This can also be used with kubectl, e.g. kubectl get hi rather than kubectl get hostinfo. Also, when we query any Custom Resources created with the CRD, e.g. kubectl get hostinfo, we want the output to display the hostname of the ESXi host.

Note that what we are doing here is for educational purposes only. Typically, the spec and status fields would be similar, and it is the function of the controller to reconcile any differences between the two to achieve eventual consistency. But we are keeping things simple, as the purpose here is to show how vSphere can be queried from a Kubernetes Operator. Below is a snippet of the hostinfo_types.go showing the code changes. The code-complete hostinfo_types.go is here.

// HostInfoSpec defines the desired state of HostInfo
type HostInfoSpec struct {
        Hostname string `json:"hostname"`
}

// HostInfoStatus defines the observed state of HostInfo
type HostInfoStatus struct {
        TotalCPU int64 `json:"totalCPU"`
        FreeCPU  int64 `json:"freeCPU"`
}

// +kubebuilder:validation:Optional
// +kubebuilder:resource:shortName={"hi"}
// +kubebuilder:printcolumn:name="Hostname",type=string,JSONPath=`.spec.hostname`
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
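To see how the json tags map the Go fields onto the keys of a CR, here is a small standalone sketch. It uses local copies of the two structs so that it compiles without the kubebuilder scaffolding. The `hostInfo` wrapper is a simplified, hypothetical stand-in for the generated HostInfo type, and the CPU values are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the two structs from api/v1/hostinfo_types.go.
type HostInfoSpec struct {
	Hostname string `json:"hostname"`
}

type HostInfoStatus struct {
	TotalCPU int64 `json:"totalCPU"`
	FreeCPU  int64 `json:"freeCPU"`
}

// hostInfo is a simplified stand-in for the generated HostInfo type, which
// also carries the usual Kubernetes type and object metadata.
type hostInfo struct {
	Spec   HostInfoSpec   `json:"spec"`
	Status HostInfoStatus `json:"status"`
}

// render marshals the object using the key names given in the json tags.
func render(h hostInfo) (string, error) {
	b, err := json.Marshal(h)
	return string(b), err
}

func main() {
	out, err := render(hostInfo{
		Spec:   HostInfoSpec{Hostname: "esxi-dell-e.rainpole.com"},
		Status: HostInfoStatus{TotalCPU: 1000, FreeCPU: 900}, // made-up values
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The key names printed here (hostname, totalCPU, freeCPU) are the ones that appear under spec and status when a CR is displayed.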

We are now ready to create the CRD. There is one final step however, and this involves updating the Makefile which kubebuilder has created for us. In the default Makefile created by kubebuilder, the following CRD_OPTIONS line appears:

# Produce CRDs that work back to Kubernetes 1.11 (no version conversion)
CRD_OPTIONS ?= "crd:trivialVersions=true"

This CRD_OPTIONS entry should be changed to the following:

# Produce CRDs that work back to Kubernetes 1.11 (no version conversion)
CRD_OPTIONS ?= "crd:preserveUnknownFields=false,crdVersions=v1,trivialVersions=true"

Now we can build our CRD with the spec and status fields that we have placed in the api/v1/hostinfo_types.go file.

make manifests && make generate

Here is the output from the make:

$ make manifests && make generate
go: creating new go.mod: module tmp
go: found sigs.k8s.io/controller-tools/cmd/controller-gen in sigs.k8s.io/controller-tools v0.2.5
/usr/share/go/bin/controller-gen "crd:trivialVersions=true" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
go: creating new go.mod: module tmp
go: found sigs.k8s.io/controller-tools/cmd/controller-gen in sigs.k8s.io/controller-tools v0.2.5
/usr/share/go/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."

Step 4 - Install the CRD

The CRD is not currently installed in the Kubernetes Cluster.

$ kubectl get crd
NAME                                                               CREATED AT
antreaagentinfos.clusterinformation.antrea.tanzu.vmware.com        2020-11-18T17:14:03Z
antreacontrollerinfos.clusterinformation.antrea.tanzu.vmware.com   2020-11-18T17:14:03Z
clusternetworkpolicies.security.antrea.tanzu.vmware.com            2020-11-18T17:14:03Z
traceflows.ops.antrea.tanzu.vmware.com                             2020-11-18T17:14:03Z

To install the CRD, run the following make command:

make install

The output should look something like this:

$ make install
go: creating new go.mod: module tmp
go: found sigs.k8s.io/controller-tools/cmd/controller-gen in sigs.k8s.io/controller-tools v0.2.5
/usr/share/go/bin/controller-gen "crd:trivialVersions=true" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
kustomize build config/crd | kubectl apply -f -
customresourcedefinition.apiextensions.k8s.io/hostinfoes.topology.corinternal.com created

Now check to see if the CRD is installed by running the same command as before.

$ kubectl get crd
NAME                                                               CREATED AT
antreaagentinfos.clusterinformation.antrea.tanzu.vmware.com        2020-11-18T17:14:03Z
antreacontrollerinfos.clusterinformation.antrea.tanzu.vmware.com   2020-11-18T17:14:03Z
clusternetworkpolicies.security.antrea.tanzu.vmware.com            2020-11-18T17:14:03Z
hostinfoes.topology.corinternal.com                                2020-12-31T15:30:17Z
traceflows.ops.antrea.tanzu.vmware.com                             2020-11-18T17:14:03Z

Our new CRD hostinfoes.topology.corinternal.com is now visible. Another useful way to check if the CRD has successfully deployed is to use the following command against our API group. Remember back in step 2 we specified the domain as corinternal.com and the group as topology. Thus the command to query api-resources for this CRD is as follows:

$ kubectl api-resources --api-group=topology.corinternal.com
NAME         SHORTNAMES   APIGROUP                   NAMESPACED   KIND
hostinfoes   hi           topology.corinternal.com   true         HostInfo
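The names seen above are simple compositions of the plural resource name, the group and the domain. The sketch below shows that composition. The helper names `crdName` and `apiVersion` are hypothetical, and the plural hostinfoes is taken from the kubectl output rather than computed.

```go
package main

import "fmt"

// crdName joins the plural resource name, the API group and the domain,
// giving the name under which the CRD is listed by kubectl get crd.
func crdName(plural, group, domain string) string {
	return fmt.Sprintf("%s.%s.%s", plural, group, domain)
}

// apiVersion builds the apiVersion string used in CR manifests.
func apiVersion(group, domain, version string) string {
	return fmt.Sprintf("%s.%s/%s", group, domain, version)
}

func main() {
	fmt.Println(crdName("hostinfoes", "topology", "corinternal.com"))
	fmt.Println(apiVersion("topology", "corinternal.com", "v1"))
}
```

If you chose a different domain in step 2, substitute it here and in the --api-group argument.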

Step 5 - Test the CRD

At this point, we can do a quick test to see if our CRD is in fact working. To do that, we can create a manifest file with a Custom Resource that uses our CRD, and see if we can instantiate such an object (or custom resource) on our Kubernetes cluster. Fortunately kubebuilder provides us with a sample manifest that we can use for this. It can be found in config/samples.

$ cd config/samples
$ ls
topology_v1_hostinfo.yaml
$ cat topology_v1_hostinfo.yaml
apiVersion: topology.corinternal.com/v1
kind: HostInfo
metadata:
  name: hostinfo-sample
spec:
  # Add fields here
  foo: bar

We need to slightly modify this sample manifest so that the specification field matches what we added to our CRD. Note the spec: above where it states 'Add fields here'. We have removed the foo field and added a spec.hostname field, as per the api/v1/hostinfo_types.go modification earlier. Thus, after a simple modification, the CR manifest looks like this, where esxi-dell-e.rainpole.com is the name of the ESXi host that we wish to query.

$ cat topology_v1_hostinfo.yaml
apiVersion: topology.corinternal.com/v1
kind: HostInfo
metadata:
  name: hostinfo-host-e
spec:
  # Add fields here
  hostname: esxi-dell-e.rainpole.com

To see if it works, we need to create this HostInfo Custom Resource.

$ kubectl create -f topology_v1_hostinfo.yaml
hostinfo.topology.corinternal.com/hostinfo-host-e created
$ kubectl get hostinfo
NAME              HOSTNAME
hostinfo-host-e   esxi-dell-e.rainpole.com

Or use the shortcut, "hi":

$ kubectl get hi
NAME              HOSTNAME
hostinfo-host-e   esxi-dell-e.rainpole.com

Note that the hostname field is also printed, as per the kubebuilder directive that we placed in the api/v1/hostinfo_types.go. As a final test, we will display the CR in yaml format.

$ kubectl get hostinfo -o yaml
apiVersion: v1
items:
- apiVersion: topology.corinternal.com/v1
  kind: HostInfo
  metadata:
    creationTimestamp: "2020-12-31T15:48:49Z"
    generation: 1
    managedFields:
    - apiVersion: topology.corinternal.com/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          .: {}
          f:hostname: {}
      manager: kubectl
      operation: Update
      time: "2020-12-31T15:48:49Z"
    name: hostinfo-host-e
    namespace: default
    resourceVersion: "20716173"
    selfLink: /apis/topology.corinternal.com/v1/namespaces/default/hostinfoes/hostinfo-host-e
    uid: c7ff0546-b9f0-49b5-8ea6-748b1f10d039
  spec:
    hostname: esxi-dell-e.rainpole.com
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

Step 6 - Create the controller / manager

This appears to be working as expected. However, there are no Status fields displayed with our CPU information in the yaml output above. To see this information, we need to implement our operator / controller logic. The controller implements the desired business logic. In this controller, we first read the vCenter server credentials from a Kubernetes secret (which we will create shortly). We then open a session to the vCenter server, and get a list of ESXi hosts that it manages. We then look for the ESXi host that is specified in the spec.hostname field in the CR, and retrieve the Total CPU and Free CPU statistics for this host. Finally, we update the appropriate Status fields with this information, and we should be able to query it using the kubectl get hostinfo -o yaml command seen previously.

Step 6.1 - Open a session to vSphere

Note: The initial version of the code was not very optimized, as the code for creating the vCenter session was in the reconciler and was triggered on every reconcile request, which is not ideal. The login function has now been moved out of the reconciler function in the controller, and into main.go. Here is the new vlogin function for creating the vSphere session in main.go. One thing to note is that I am enabling insecure logins (true) by default. This is something that you may wish to change in your code.

func vlogin(ctx context.Context, vc, user, pwd string) (*vim25.Client, error) {

        //
        // Create a vSphere/vCenter client
        //
        //    The govmomi client requires a URL object, u.
        //    It is not just a string representation of the vCenter URL.
        //

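        u, err := soap.ParseURL(vc)

        if err != nil {
                setupLog.Error(err, "URL parsing not successful", "controller", "HostInfo")
                os.Exit(1)
        }

        // A nil URL without an error most likely means vc was empty;
        // exit here rather than dereference a nil pointer below.
        if u == nil {
                fmt.Println("could not parse URL (environment variables set?)")
                os.Exit(1)
        }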

        u.User = url.UserPassword(user, pwd)

        // Share govc's session cache
        s := &cache.Session{
                URL:      u,
                Insecure: true,
        }

        c := new(vim25.Client)

        err = s.Login(ctx, c, nil)

        if err != nil {
                setupLog.Error(err, " login not successful", "controller", "HostInfo")
                os.Exit(1)
        }

        return c, nil
}
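The URL handling can be tried in isolation with the standard library only. The sketch below uses `parseVCenterURL`, a hypothetical helper that is not part of govmomi. It assumes defaults similar to those I believe soap.ParseURL applies (https scheme, /sdk path), and it returns an error for an empty value instead of a nil URL.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// parseVCenterURL is a hypothetical, stdlib-only helper. It defaults the
// scheme to https and the path to /sdk, then attaches the credentials.
func parseVCenterURL(raw, user, pwd string) (*url.URL, error) {
	if raw == "" {
		return nil, errors.New("vCenter URL is empty (is GOVMOMI_URL set?)")
	}
	if !strings.Contains(raw, "://") {
		raw = "https://" + raw
	}
	u, err := url.Parse(raw)
	if err != nil {
		return nil, fmt.Errorf("parsing vCenter URL: %w", err)
	}
	if u.Path == "" {
		u.Path = "/sdk"
	}
	u.User = url.UserPassword(user, pwd)
	return u, nil
}

func main() {
	// Example values only; replace with your own vCenter details.
	u, err := parseVCenterURL("192.168.0.100", "administrator@vsphere.local", "My_VC_Password")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u.Scheme, u.Host, u.Path)
}
```

Returning an error instead of exiting leaves the decision to the caller, which can make a login helper easier to test.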

Within the main function, there is a call to the vlogin function with the parameters received from the environment variables shown below. There is also an updated HostInfoReconciler call with a new field (VC) which has the vSphere session details. This login info can now be used from within the HostInfoReconciler controller function, as we will see shortly.

        vc := os.Getenv("GOVMOMI_URL")
        user := os.Getenv("GOVMOMI_USERNAME")
        pwd := os.Getenv("GOVMOMI_PASSWORD")

        ctx, cancel := context.WithCancel(context.Background())
        defer cancel()

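        c, err := vlogin(ctx, vc, user, pwd)
        if err != nil {
                setupLog.Error(err, "unable to log in to vCenter", "controller", "HostInfo")
                os.Exit(1)
        }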

        if err = (&controllers.HostInfoReconciler{
                Client: mgr.GetClient(),
                VC:     c,
                Log:    ctrl.Log.WithName("controllers").WithName("HostInfo"),
                Scheme: mgr.GetScheme(),
        }).SetupWithManager(mgr); err != nil {
                setupLog.Error(err, "unable to create controller", "controller", "HostInfo")
                os.Exit(1)
        }

Click here for the complete main.go code.
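Since all three variables must be present for the login to succeed, it can help to check them up front. The following is a minimal sketch of such a check, not something taken from the complete main.go. The `missingEnv` helper and the `requiredVars` list are hypothetical names, and the lookup function is passed in so that the check can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// requiredVars lists the environment variables read in main.go.
var requiredVars = []string{"GOVMOMI_URL", "GOVMOMI_USERNAME", "GOVMOMI_PASSWORD"}

// missingEnv returns the names of required variables that are unset or blank.
// lookup is normally os.Getenv; it is a parameter so tests can supply a fake.
func missingEnv(lookup func(string) string) []string {
	var missing []string
	for _, name := range requiredVars {
		if strings.TrimSpace(lookup(name)) == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// A real manager would probably log this and exit; here we only report.
	if missing := missingEnv(os.Getenv); len(missing) > 0 {
		fmt.Println("missing environment variables:", strings.Join(missing, ", "))
		return
	}
	fmt.Println("all vCenter connection variables are set")
}
```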

Step 6.2 - Controller Reconcile Logic

Now we turn our attention to the business logic of the controller. Once the business logic is added in the controller, it will need to be able to run in a Kubernetes cluster. To achieve this, a container image to run the controller logic must be built. This will be provisioned in the Kubernetes cluster using a Deployment manifest. The deployment contains a single Pod that runs the container (it is called manager). The deployment ensures that the controller manager Pod is restarted in the event of a failure.

This is what kubebuilder provides as controller scaffolding - it is found in controllers/hostinfo_controller.go. We are most interested in the Reconcile function of the HostInfoReconciler:

func (r *HostInfoReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
        _ = context.Background()
        _ = r.Log.WithValues("hostinfo", req.NamespacedName)

        // your logic here

        return ctrl.Result{}, nil
}

Considering the business logic that I described above, this is what my updated Reconcile function looks like. Hopefully the comments make it easy to understand, but at the end of the day, when this controller gets a reconcile request (creating or updating a HostInfo CR will trigger this), the TotalCPU and FreeCPU fields in the status of the Custom Resource are updated for the specific ESXi host in the spec.hostname field. Note that I have omitted a number of required imports that also need to be added to the controller. Refer to the code for the complete hostinfo_controller.go code.

func (r *HostInfoReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
        ctx := context.Background()
        log := r.Log.WithValues("hostinfo", req.NamespacedName)

        hi := &topologyv1.HostInfo{}
        if err := r.Client.Get(ctx, req.NamespacedName, hi); err != nil {
                // add some debug information if it's not a NotFound error
                if !k8serr.IsNotFound(err) {
                        log.Error(err, "unable to fetch HostInfo")
                }
                return ctrl.Result{}, client.IgnoreNotFound(err)
        }

        msg := fmt.Sprintf("received reconcile request for %q (namespace: %q)", hi.GetName(), hi.GetNamespace())
        log.Info(msg)

        //
        // Create a view manager, using vCenter session detail passed to Reconciler
        //

        m := view.NewManager(r.VC)

        //
        // Create a container view of HostSystem objects
        //

        v, err := m.CreateContainerView(ctx, r.VC.ServiceContent.RootFolder, []string{"HostSystem"}, true)

        if err != nil {
                msg := fmt.Sprintf("unable to create container view for HostSystem: error %s", err)
                log.Info(msg)
                return ctrl.Result{}, err
        }

        defer v.Destroy(ctx)

        //
        // Retrieve summary property for all hosts
        //

        var hss []mo.HostSystem

        err = v.Retrieve(ctx, []string{"HostSystem"}, []string{"summary"}, &hss)

        if err != nil {
                msg := fmt.Sprintf("unable to retrieve HostSystem summary: error %s", err)
                log.Info(msg)
                return ctrl.Result{}, err
        }

        //
        // Record CPU statistics only for the host named in the HostInfo spec
        //

        for _, hs := range hss {
                if hs.Summary.Config.Name == hi.Spec.Hostname {
                        hi.Status.TotalCPU = int64(hs.Summary.Hardware.CpuMhz) * int64(hs.Summary.Hardware.NumCpuCores)
                        hi.Status.FreeCPU = (int64(hs.Summary.Hardware.CpuMhz) * int64(hs.Summary.Hardware.NumCpuCores)) - int64(hs.Summary.QuickStats.OverallCpuUsage)
                }
        }

        //
        // Update the HostInfo status fields
        //

        if err := r.Status().Update(ctx, hi); err != nil {
                log.Error(err, "unable to update HostInfo status")
                return ctrl.Result{}, err
        }

        return ctrl.Result{}, nil
}
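The CPU arithmetic in the loop above can be checked on its own. The sketch below pulls it into a hypothetical `cpuStats` helper. The input values are made up, and the units are assumed to be MHz, as the CpuMhz field name suggests.

```go
package main

import "fmt"

// cpuStats mirrors the arithmetic used in Reconcile: total capacity is the
// per-core clock speed multiplied by the number of cores, and free capacity
// is the total minus the current overall usage.
func cpuStats(cpuMhz, numCores, usageMhz int64) (total, free int64) {
	total = cpuMhz * numCores
	free = total - usageMhz
	return total, free
}

func main() {
	// Made-up example: 20 cores at 2200 MHz with 1500 MHz currently in use.
	total, free := cpuStats(2200, 20, 1500)
	fmt.Printf("total=%d MHz free=%d MHz\n", total, free)
}
```

With these inputs the total is 44000 MHz and the free capacity is 42500 MHz.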

With the controller logic in place, we can now proceed to build the controller / manager.

Step 7 - Test the controller

Instead of building a container for the controller and adding our manager logic to it, we can test the manager in standalone mode. This will allow us to verify the functionality of our controller before we go to the trouble of building a container, pushing it to a repo, and then pulling it back from the repo when we deploy it in our Kubernetes cluster. We will do that in the next step.

To build the manager code locally, you can run the following make command:

make manager

The make output should resemble the following:

$ make manager
/usr/share/go/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
go fmt ./...
go vet ./...
go build -o bin/manager main.go

This should have built the manager binary in bin/manager. Before running the manager in standalone mode, we need to set three environment variables to allow us to connect to the vCenter Server. These are:

export GOVMOMI_URL=192.168.0.100
export GOVMOMI_USERNAME=administrator@vsphere.local
export GOVMOMI_PASSWORD='My_VC_Password'

These are retrieved in the main.go code which we built previously. The manager can now be started in standalone mode:

bin/manager

The output should be similar to the following:

2020-12-31T16:54:55.633Z        INFO    controller-runtime.metrics      metrics server is starting to listen    {"addr": "127.0.0.1:8080"}
2020-12-31T16:54:55.634Z        INFO    setup   starting manager
I1231 16:54:55.634543       1 leaderelection.go:242] attempting to acquire leader lease  hostinfo-system/0df5945b.corinternal.com...
2020-12-31T16:54:55.635Z        INFO    controller-runtime.manager      starting metrics server {"path": "/metrics"}
I1231 16:55:13.035397       1 leaderelection.go:252] successfully acquired lease hostinfo-system/0df5945b.corinternal.com
2020-12-31T16:55:13.035Z        DEBUG   controller-runtime.manager.events       Normal  {"object": {"kind":"ConfigMap","namespace":"hostinfo-system","name":"0df5945b.corinternal.com","uid":"f1f46185-77f5-43d2-ba33-192caed82409","apiVersion":"v1","resourceVersion":"20735459"}, "reason": "LeaderElection", "message": "hostinfo-controller-manager-6484c486ff-8vwsn_510f151d-4e35-4f42-966e-31ddcec34bcb became leader"}
2020-12-31T16:55:13.035Z        INFO    controller-runtime.controller   Starting EventSource    {"controller": "hostinfo", "source": "kind source: /, Kind="}
2020-12-31T16:55:13.135Z        INFO    controller-runtime.controller   Starting Controller     {"controller": "hostinfo"}
2020-12-31T16:55:13.135Z        INFO    controller-runtime.controller   Starting workers        {"controller": "hostinfo", "worker count": 1}
2020-12-31T16:55:13.136Z        INFO    controllers.HostInfo    received reconcile request for "hostinfo-host-e" (namespace: "default") {"hostinfo": "default/hostinfo-host-e"}
2020-12-31T16:55:13.625Z        DEBUG   controller-runtime.controller   Successfully Reconciled {"controller": "hostinfo", "request": "default/hostinfo-host-e"}
2020-12-31T16:55:13.625Z        INFO    controllers.HostInfo    received reconcile request for "hostinfo-host-e" (namespace: "default") {"hostinfo": "default/hostinfo-host-e"}

Monitor the startup messages from the manager. If there are no errors, jump to step 11.4 and see if the relevant host information is now available in the status fields of the CR. If the information is present, come back to step 8 where we will begin to build the controller container image.

Step 8 - Build the controller

At this point everything is in place to enable us to deploy the controller to the Kubernetes cluster. If you remember back to the prerequisites in step 1, we said that you need access to a container image registry, such as docker.io or quay.io, or VMware's own Harbor registry. This is where we need this access to a registry, as we need to push the controller's container image somewhere that can be accessed from your Kubernetes cluster.

The Dockerfile with the appropriate directives is already in place to build the container image and include the controller / manager logic. This was once again taken care of by kubebuilder. You must ensure that you log in to your image repository, i.e. docker login, before proceeding with the make commands, e.g.

$ docker login
Login with your Docker ID to push and pull images from Docker Hub. If you dont have a Docker ID, head over to https://hub.docker.com to create one.
Username: cormachogan
Password: `***********`
WARNING! Your password will be stored unencrypted in /home/cormac/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded
$

Next, set an environment variable called IMG to point to your container image repository along with the name and version of the container image, e.g:

export IMG=docker.io/cormachogan/hostinfo-controller:v1

Next, to create the container image of the controller / manager, and push it to the image container repository in a single step, run the following make command. You could of course run this as two separate commands as well, make docker-build followed by make docker-push if you so wished.

make docker-build docker-push IMG=docker.io/cormachogan/hostinfo-controller:v1

The output has been shortened in this example:

$ make docker-build docker-push IMG=docker.io/cormachogan/hostinfo-controller:v1
go: creating new go.mod: module tmp
go: found sigs.k8s.io/controller-tools/cmd/controller-gen in sigs.k8s.io/controller-tools v0.2.5
/usr/share/go/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
go fmt ./...
go vet ./...
/usr/share/go/bin/controller-gen "crd:trivialVersions=true" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
go test ./... -coverprofile cover.out
?       hostinfo        [no test files]
?       hostinfo/api/v1 [no test files]
ok      hostinfo/controllers    8.401s  coverage: 0.0% of statements
docker build . -t docker.io/cormachogan/hostinfo-controller:v1
Sending build context to Docker daemon  53.31MB
Step 1/14 : FROM golang:1.13 as builder
 ---> d6f3656320fe
Step 2/14 : WORKDIR /workspace
 ---> Running in 30a535f6a3de
Removing intermediate container 30a535f6a3de
 ---> 0f6c055c6fc8
Step 3/14 : COPY go.mod go.mod
 ---> 11d0f2eda936
Step 4/14 : COPY go.sum go.sum
 ---> ccec3c47ed5a
Step 5/14 : RUN go mod download
 ---> Running in a25193d9d72c
go: finding cloud.google.com/go v0.38.0
go: finding github.com/Azure/go-ansiterm v0.0.0-20170929234023-d6e3b3328b78
go: finding github.com/Azure/go-autorest/autorest v0.9.0
go: finding github.com/Azure/go-autorest/autorest/adal v0.5.0
.
. <-- snip!
.
go: finding sigs.k8s.io/controller-runtime v0.5.0
go: finding sigs.k8s.io/structured-merge-diff v1.0.1-0.20191108220359-b1b620dd3f06
go: finding sigs.k8s.io/yaml v1.1.0
Removing intermediate container a25193d9d72c
 ---> 7e556d5ee595
Step 6/14 : COPY main.go main.go
 ---> 1f0a5564360d
Step 7/14 : COPY api/ api/
 ---> 658146b97c2e
Step 8/14 : COPY controllers/ controllers/
 ---> 5c494bc11a2d
Step 9/14 : RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -a -o manager main.go
 ---> Running in 39a4ae69c02d
Removing intermediate container 39a4ae69c02d
 ---> 465a22e4df85
Step 10/14 : FROM gcr.io/distroless/static:nonroot
 ---> aa99000bc55d
Step 11/14 : WORKDIR /
 ---> Using cache
 ---> 8bcbc4c15403
Step 12/14 : COPY --from=builder /workspace/manager .
 ---> 9323cb1f88c5
Step 13/14 : USER nonroot:nonroot
 ---> Running in 0d85b3457944
Removing intermediate container 0d85b3457944
 ---> 7d038e0d82f5
Step 14/14 : ENTRYPOINT ["/manager"]
 ---> Running in 5f5569796b9a
Removing intermediate container 5f5569796b9a
 ---> 05133c0de2d9
Successfully built 05133c0de2d9
Successfully tagged cormachogan/hostinfo-controller:v1
docker push docker.io/cormachogan/hostinfo-controller:v1
The push refers to repository [docker.io/cormachogan/hostinfo-controller]
5758f4a008b9: Pushed
7a5b9c0b4b14: Pushed
v1: digest: sha256:f970a9610304c885ffd03edc0c7ddd485fb399279511054a578ade406224ad6b size: 739
$

The container image of the controller is now built and pushed to the container image registry. But we have not yet deployed it. We need to make one or two further modifications before we take that step.

Step 9 - Modify the Manager manifest to include environment variables

Kubebuilder provides a manager manifest scaffold file for deploying the controller. However, since we need to provide vCenter details to our controller, we need to add these to the controller/manager manifest file. This is found in config/manager/manager.yaml. This manifest contains the deployment for the controller. In the manager container's spec, we need to add an env section which defines the environment variables and references the secret that holds their values (we will create this secret shortly). Below is a snippet of that code. The complete file is config/manager/manager.yaml in this repository.

    spec:
      .
      .
        env:
          - name: GOVMOMI_USERNAME
            valueFrom:
              secretKeyRef:
                name: vc-creds
                key: GOVMOMI_USERNAME
          - name: GOVMOMI_PASSWORD
            valueFrom:
              secretKeyRef:
                name: vc-creds
                key: GOVMOMI_PASSWORD
          - name: GOVMOMI_URL
            valueFrom:
              secretKeyRef:
                name: vc-creds
                key: GOVMOMI_URL
      volumes:
        - name: vc-creds
          secret:
            secretName: vc-creds
      terminationGracePeriodSeconds: 10

Note that the secret, called vc-creds above, contains the vCenter credentials. This secret needs to be deployed in the same namespace that the controller is going to run in, which is hostinfo-system. Thus, the namespace and secret are created using the following commands, with the values changed to match your own vSphere environment:

$ kubectl create ns hostinfo-system
namespace/hostinfo-system created
$ kubectl create secret generic vc-creds \
--from-literal='GOVMOMI_USERNAME=administrator@vsphere.local' \
--from-literal='GOVMOMI_PASSWORD=My_VC_Password' \
--from-literal='GOVMOMI_URL=192.168.0.100' \
-n hostinfo-system
secret/vc-creds created
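
For reference, here is a minimal sketch of how the controller side might consume these three variables once the manager Pod is running. It uses only the Go standard library. The helper name vCenterURL and the "https://<host>/sdk" URL layout are assumptions made for illustration and are not taken from this repository's code, which may build the connection differently (for example with govmomi's own URL helpers).

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// vCenterURL builds a vCenter endpoint URL from the three GOVMOMI_*
// variables that the manager container receives from the vc-creds secret.
// Hypothetical helper: the name and the "/sdk" path are assumptions.
func vCenterURL(getenv func(string) string) (*url.URL, error) {
	host := getenv("GOVMOMI_URL")
	user := getenv("GOVMOMI_USERNAME")
	pass := getenv("GOVMOMI_PASSWORD")
	if host == "" || user == "" || pass == "" {
		return nil, fmt.Errorf("GOVMOMI_URL, GOVMOMI_USERNAME and GOVMOMI_PASSWORD must all be set")
	}
	u := &url.URL{Scheme: "https", Host: host, Path: "/sdk"}
	u.User = url.UserPassword(user, pass)
	return u, nil
}

func main() {
	u, err := vCenterURL(os.Getenv)
	if err != nil {
		// Outside the Pod the variables are usually unset, so just report it.
		fmt.Println("not configured:", err)
		return
	}
	// Print only the non-sensitive parts of the URL.
	fmt.Printf("vCenter endpoint: %s (user %s)\n", u.Host, u.User.Username())
}
```

Passing the lookup function as a parameter keeps the helper easy to exercise without touching the real process environment.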

We are now ready to deploy the controller to the Kubernetes cluster.

Step 10 - Deploy the controller

To deploy the controller, we run another make command. This will take care of all of the RBAC, cluster roles and role bindings necessary to run the controller, as well as picking up the correct image, etc.

make deploy IMG=docker.io/cormachogan/hostinfo-controller:v1

The output looks something like this:

$ make deploy IMG=docker.io/cormachogan/hostinfo-controller:v1
go: creating new go.mod: module tmp
go: found sigs.k8s.io/controller-tools/cmd/controller-gen in sigs.k8s.io/controller-tools v0.2.5
/usr/share/go/bin/controller-gen "crd:trivialVersions=true" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
cd config/manager && kustomize edit set image controller=docker.io/cormachogan/hostinfo-controller:v1
kustomize build config/default | kubectl apply -f -
namespace/hostinfo-system unchanged
customresourcedefinition.apiextensions.k8s.io/hostinfoes.topology.corinternal.com configured
role.rbac.authorization.k8s.io/hostinfo-leader-election-role created
clusterrole.rbac.authorization.k8s.io/hostinfo-manager-role created
clusterrole.rbac.authorization.k8s.io/hostinfo-proxy-role created
clusterrole.rbac.authorization.k8s.io/hostinfo-metrics-reader created
rolebinding.rbac.authorization.k8s.io/hostinfo-leader-election-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/hostinfo-manager-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/hostinfo-proxy-rolebinding created
service/hostinfo-controller-manager-metrics-service created
deployment.apps/hostinfo-controller-manager created

Step 11 - Check controller functionality

Now that our controller has been deployed, let's see if it is working. There are a few different commands that we can run to verify the operator is working.

Step 11.1 - Check the deployment and replicaset

The deployment should be READY. Remember to specify the namespace correctly when checking it.

$ kubectl get rs -n hostinfo-system
NAME                                     DESIRED   CURRENT   READY   AGE
hostinfo-controller-manager-66bdb8f5bd   1         1         0       9m48s

$ kubectl get deploy -n hostinfo-system
NAME                          READY   UP-TO-DATE   AVAILABLE   AGE
hostinfo-controller-manager   1/1     1            1           14m

Step 11.2 - Check the Pods

The deployment manages a single controller Pod. There should be 2 containers READY in the controller Pod. One is the controller / manager and the other is the kube-rbac-proxy. The kube-rbac-proxy is a small HTTP proxy that can perform RBAC authorization against the Kubernetes API. It restricts requests to authorized Pods only.

$ kubectl get pods -n hostinfo-system
NAME                                           READY   STATUS    RESTARTS   AGE
hostinfo-controller-manager-6484c486ff-8vwsn   2/2     Running   0          72s

If you experience issues with one of the Pods not coming online, use the following command to display the Pod status and examine the events.

kubectl describe pod hostinfo-controller-manager-6484c486ff-8vwsn -n hostinfo-system

Step 11.3 - Check the controller / manager logs

If we query the logs on the manager container, we should be able to observe successful startup messages as well as successful reconcile requests from the HostInfo CR that we already deployed back in step 5. These reconcile requests should update the Status fields with CPU information as per our controller logic. The command to query the manager container logs in the controller Pod is as follows:

kubectl logs hostinfo-controller-manager-6484c486ff-8vwsn -n hostinfo-system manager

The output should be somewhat similar to this:

$ kubectl logs hostinfo-controller-manager-6484c486ff-8vwsn -n hostinfo-system manager
2020-12-31T16:54:55.633Z        INFO    controller-runtime.metrics      metrics server is starting to listen    {"addr": "127.0.0.1:8080"}
2020-12-31T16:54:55.634Z        INFO    setup   starting manager
I1231 16:54:55.634543       1 leaderelection.go:242] attempting to acquire leader lease  hostinfo-system/0df5945b.corinternal.com...
2020-12-31T16:54:55.635Z        INFO    controller-runtime.manager      starting metrics server {"path": "/metrics"}
I1231 16:55:13.035397       1 leaderelection.go:252] successfully acquired lease hostinfo-system/0df5945b.corinternal.com
2020-12-31T16:55:13.035Z        DEBUG   controller-runtime.manager.events       Normal  {"object": {"kind":"ConfigMap","namespace":"hostinfo-system","name":"0df5945b.corinternal.com","uid":"f1f46185-77f5-43d2-ba33-192caed82409","apiVersion":"v1","resourceVersion":"20735459"}, "reason": "LeaderElection", "message": "hostinfo-controller-manager-6484c486ff-8vwsn_510f151d-4e35-4f42-966e-31ddcec34bcb became leader"}
2020-12-31T16:55:13.035Z        INFO    controller-runtime.controller   Starting EventSource    {"controller": "hostinfo", "source": "kind source: /, Kind="}
2020-12-31T16:55:13.135Z        INFO    controller-runtime.controller   Starting Controller     {"controller": "hostinfo"}
2020-12-31T16:55:13.135Z        INFO    controller-runtime.controller   Starting workers        {"controller": "hostinfo", "worker count": 1}
2020-12-31T16:55:13.136Z        INFO    controllers.HostInfo    received reconcile request for "hostinfo-host-e" (namespace: "default") {"hostinfo": "default/hostinfo-host-e"}
2020-12-31T16:55:13.625Z        DEBUG   controller-runtime.controller   Successfully Reconciled {"controller": "hostinfo", "request": "default/hostinfo-host-e"}
2020-12-31T16:55:13.625Z        INFO    controllers.HostInfo    received reconcile request for "hostinfo-host-e" (namespace: "default") {"hostinfo": "default/hostinfo-host-e"}

Step 11.4 - Check if CPU statistics are returned in the status

Last but not least, let's see if we can see the CPU information in the status fields of the HostInfo object created earlier.

$ kubectl get hostinfo hostinfo-host-e -o yaml
apiVersion: topology.corinternal.com/v1
kind: HostInfo
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"topology.corinternal.com/v1","kind":"HostInfo","metadata":{"annotations":{},"name":"hostinfo-host-e","namespace":"default"},"spec":{"hostname":"esxi-dell-e.rainpole.com"}}
  creationTimestamp: "2020-12-31T15:48:49Z"
  generation: 1
  managedFields:
  - apiVersion: topology.corinternal.com/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:kubectl.kubernetes.io/last-applied-configuration: {}
      f:spec:
        .: {}
        f:hostname: {}
    manager: kubectl
    operation: Update
    time: "2020-12-31T16:46:51Z"
  - apiVersion: topology.corinternal.com/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        .: {}
        f:freeCPU: {}
        f:totalCPU: {}
    manager: manager
    operation: Update
    time: "2020-12-31T16:55:13Z"
  name: hostinfo-host-e
  namespace: default
  resourceVersion: "20735464"
  selfLink: /apis/topology.corinternal.com/v1/namespaces/default/hostinfoes/hostinfo-host-e
  uid: c7ff0546-b9f0-49b5-8ea6-748b1f10d039
spec:
  hostname: esxi-dell-e.rainpole.com
status:
  freeCPU: 40514
  totalCPU: 43980

Success!!! Note that the output above is showing us freeCPU and totalCPU as per our business logic implemented in the controller. How cool is that? You can now go ahead and create additional HostInfo manifests for different hosts in your vSphere environment managed by your vCenter server. This is done by specifying different hostnames in these additional manifests. This will allow you to get free and total CPU from those ESXi hosts as well.
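
As a reminder of what these two numbers mean, here is a minimal sketch of the arithmetic, assuming the controller derives totalCPU as per-core clock speed multiplied by core count, and freeCPU as that total minus the host's current overall CPU usage (all in MHz). The function name cpuTotals and the sample inputs are hypothetical. The inputs were chosen only so that the result matches the status shown above, and they are not measurements from a real host.

```go
package main

import "fmt"

// cpuTotals returns the total and free CPU capacity in MHz.
// Hypothetical helper: its inputs stand in for the per-core clock speed,
// the core count and the current overall usage reported for a host.
func cpuTotals(cpuMhz, numCores, usageMhz int64) (total, free int64) {
	total = cpuMhz * numCores
	free = total - usageMhz
	return total, free
}

func main() {
	// Illustrative inputs only.
	total, free := cpuTotals(2199, 20, 3466)
	fmt.Printf("totalCPU: %d\n", total) // totalCPU: 43980
	fmt.Printf("freeCPU: %d\n", free)   // freeCPU: 40514
}
```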

Cleanup

To remove the hostinfo CR, operator and CRD, run the following commands.

Remove the HostInfo CR

$ kubectl delete hostinfo hostinfo-host-e
hostinfo.topology.corinternal.com "hostinfo-host-e" deleted

Remove the Operator/Controller deployment

Deleting the deployment will remove the ReplicaSet and Pods associated with the controller.

$ kubectl get deploy -n hostinfo-system
NAME                          READY   UP-TO-DATE   AVAILABLE   AGE
hostinfo-controller-manager   1/1     1            1           2d8h
$ kubectl delete deploy hostinfo-controller-manager -n hostinfo-system
deployment.apps "hostinfo-controller-manager" deleted

Remove the CRD

Next, remove the Custom Resource Definition, hostinfoes.topology.corinternal.com.

$ kubectl get crds
NAME                                                               CREATED AT
antreaagentinfos.clusterinformation.antrea.tanzu.vmware.com        2021-01-14T16:31:58Z
antreacontrollerinfos.clusterinformation.antrea.tanzu.vmware.com   2021-01-14T16:31:58Z
clusternetworkpolicies.security.antrea.tanzu.vmware.com            2021-01-14T16:31:59Z
hostinfoes.topology.corinternal.com                                2021-01-14T16:52:11Z
traceflows.ops.antrea.tanzu.vmware.com                             2021-01-14T16:31:59Z
$ make uninstall
go: creating new go.mod: module tmp
go: found sigs.k8s.io/controller-tools/cmd/controller-gen in sigs.k8s.io/controller-tools v0.2.5
/home/cormac/go/bin/controller-gen "crd:preserveUnknownFields=false,crdVersions=v1,trivialVersions=true" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
kustomize build config/crd | kubectl delete -f -
customresourcedefinition.apiextensions.k8s.io "hostinfoes.topology.corinternal.com" deleted
$ kubectl get crds
NAME                                                               CREATED AT
antreaagentinfos.clusterinformation.antrea.tanzu.vmware.com        2021-01-14T16:31:58Z
antreacontrollerinfos.clusterinformation.antrea.tanzu.vmware.com   2021-01-14T16:31:58Z
clusternetworkpolicies.security.antrea.tanzu.vmware.com            2021-01-14T16:31:59Z
traceflows.ops.antrea.tanzu.vmware.com                             2021-01-14T16:31:59Z

The CRD is now removed. At this point, you can also delete the namespace created for the exercise, in this case hostinfo-system. Removing this namespace will also remove the vc-creds secret created earlier.

What next?

One thing you could do is to extend the HostInfo fields and Operator logic so that it returns even more information about the ESXi host. You could add additional Status fields that return memory, host type, host tags, etc. There is a lot of information that can be retrieved via the govmomi HostSystem API call.
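
To make that idea concrete, here is a rough sketch of what an extended status type could look like. The totalCPU and freeCPU JSON names match the status output shown earlier. The two memory fields, their names and the integer types are purely hypothetical additions for illustration. The sample values in main are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// HostInfoStatus mirrors the two status fields used in this tutorial and
// adds two hypothetical memory fields as an example of an extension.
type HostInfoStatus struct {
	TotalCPU    int64 `json:"totalCPU"`
	FreeCPU     int64 `json:"freeCPU"`
	TotalMemory int64 `json:"totalMemory,omitempty"` // hypothetical field
	FreeMemory  int64 `json:"freeMemory,omitempty"`  // hypothetical field
}

// statusJSON renders the status the way it would be serialized in the CR.
func statusJSON(s HostInfoStatus) (string, error) {
	b, err := json.Marshal(s)
	return string(b), err
}

func main() {
	// Made-up sample values.
	out, err := statusJSON(HostInfoStatus{
		TotalCPU:    43980,
		FreeCPU:     40514,
		TotalMemory: 98304,
		FreeMemory:  65536,
	})
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(out)
}
```

After a change like this, the CRD would need to be regenerated and the controller image rebuilt and redeployed, following the same steps used earlier in this tutorial.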

You can now use kustomize to package the CRD and controller and distribute it to other Kubernetes clusters. Simply point the kustomize build command at the location of the kustomization.yaml file, which is in config/default.

kustomize build config/default/ > /tmp/hostinfo.yaml

This newly created hostinfo.yaml manifest includes the CRD, RBAC, Service and Deployment for rolling out the operator on other Kubernetes clusters. Nice, eh?
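
If you want a quick sanity check of what ended up in such a combined manifest, a small helper like the following can list the kind of each document. This is only a sketch: it assumes that top-level keys start in column 0, as they do in typical kustomize output, so that nested kind fields (which are indented) are ignored. The function name topLevelKinds and the inline sample manifest are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// topLevelKinds returns the kind of every document in a multi-document
// YAML stream. It only looks at "kind:" keys that start in column 0.
func topLevelKinds(manifest string) []string {
	var kinds []string
	for _, line := range strings.Split(manifest, "\n") {
		if strings.HasPrefix(line, "kind:") {
			kind := strings.TrimSpace(strings.TrimPrefix(line, "kind:"))
			kinds = append(kinds, kind)
		}
	}
	return kinds
}

func main() {
	// A tiny stand-in for the real manifest.
	sample := `apiVersion: v1
kind: Namespace
metadata:
  name: hostinfo-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
roleRef:
  kind: ClusterRole
  name: hostinfo-manager-role
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hostinfo-controller-manager
`
	for _, k := range topLevelKinds(sample) {
		fmt.Println(k)
	}
}
```

To run it against the real file, read the manifest with os.ReadFile and pass its contents to the function as a string.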

You can also check out my other operator tutorials which interact with vSphere. There is a VM Operator here. This allows you to query virtual machine information from K8s. There is also a First Class Disk (FCD) operator which gets information from a PV that is deployed on vSphere storage. This is available here.

Finally, if this exercise has given you a desire to do more exciting stuff with Kubernetes Operators when Kubernetes is running on vSphere, check out the vmGroup operator that my colleague Michael Gasch created. It will let you deploy and manage a set of virtual machines on your vSphere infrastructure via a Kubernetes operator. Cool stuff for sure.