Platforms on k8s with Golang - Watch any CRD

Written by ryandawsonuk | Published 2020/11/12
Tech Story Tags: kubernetes | golang | custom-resource-definitions | platform | k8s | operators | programming | coding

TLDR: A common pattern is to extend the Kubernetes API by creating your own Custom Resource Definition. An operator, such as the MongoDB operator, manages the custom resource and ensures the right kind of instance is created in response. The best options for building an operator are kubebuilder or operator-sdk (which is in the process of refactoring to be based on kubebuilder). One way to work with custom resources in Go is to use a ClientSet, which many projects offer as an easy way to import code for working with their custom resources.

Let’s say you want to do more with Kubernetes than run off-the-shelf apps. Perhaps you want to stitch apps together into a bespoke platform. Imagine that when your user clicks a button you want to provision a new database or open up a new public-facing endpoint.
If you’re only running off-the-shelf tools and apps, then probably k8s yaml manifests and helm charts have you covered. Interacting with Kubernetes itself is the realm of custom resources and operators. Let’s understand this before getting into a particular custom resource platform-type use case.

Custom Resources and Operators

Let’s say that what you provide is more of a tool than an app. Maybe you’ve got one department running one instance of the tool and you want more departments to automatically provision their own instances. Databases, monitoring tools, and storage tools might all be cases like this. A common pattern is to extend the Kubernetes API by creating your own Custom Resource Definition.
MongoDB offers the below example of a custom resource to create a MongoDB instance:
apiVersion: mongodb.com/v1
kind: MongoDB
metadata:
  name: my-standalone
spec:
  version: 4.4.0-ent
  service: my-service

  opsManager:
    configMapRef:
      name: my-project
  credentials: my-credentials
  type: Standalone

  persistent: true
This can be created with kubectl apply. The MongoDB operator manages the custom resource and ensures the right kind of MongoDB instance will be created in response. There are many other such operators listed on operatorhub.
If you’re looking to create an operator then the best support for doing so is with Golang. The best options are kubebuilder or operator-sdk (which is in the process of refactoring to be based on kubebuilder). I’m not going to talk about creating an operator as there’s good material out there - kubebuilder has an official ebook on it.
What is harder to find information on is how to build projects that interact with multiple custom resources. Imagine you want a platform to provision instances of monitoring tools and also expose new external endpoints for them. Or you want to watch for when new databases are created and automatically add monitoring around them. Then you’re going to have to go beyond kubebuilder’s scaffolding tools or even what the kubebuilder book tells you.

Interacting with Multiple CRDs

For true operators you want to keep the responsibility isolated to managing a single CRD. But you might need to interact with other CRDs that the operator doesn’t manage. For example, maybe you’re providing a way to provision instances of a content management system and the content management system itself uses MongoDB as a dependency. Then you might want your content management operator to submit MongoDB instances to the Kubernetes API.
Working with multiple custom resources in Golang code can get interesting.
One way to work with custom resources in Go is to use a ClientSet. Many projects offer a ClientSet as an easy way to import code to work with the custom resources. For example, the Seldon project provides custom resources to deploy machine learning models to Kubernetes and it currently exposes a ClientSet. I can then import the package, instantiate a ClientSet and use it to list or create SeldonDeployments:
import (
	"context"

	machinelearningv1 "github.com/seldonio/seldon-core/operator/apis/machinelearning.seldon.io/v1"
	seldonclientset "github.com/seldonio/seldon-core/operator/client/machinelearning.seldon.io/v1/clientset/versioned"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	_ "k8s.io/client-go/plugin/pkg/client/auth"
	"k8s.io/client-go/tools/clientcmd"
)

var clientset *seldonclientset.Clientset

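func init() {
	var err error
	clientset, err = GetSeldonClientSet()
	if err != nil {
		// Fail fast: the functions below would otherwise call methods on a nil clientset
		panic(err)
	}
}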

func GetSeldonClientSet() (*seldonclientset.Clientset, error) {
	config, err := clientcmd.BuildConfigFromFlags("", "")
	if err != nil {
		return nil, err
	}
	kubeClientset, err := seldonclientset.NewForConfig(config)
	if err != nil {
		return nil, err
	}
	return kubeClientset, nil
}

func ListSeldonDeployments(namespace string) (result *machinelearningv1.SeldonDeploymentList, err error) {
	return clientset.MachinelearningV1().SeldonDeployments(namespace).List(context.TODO(), metav1.ListOptions{})
}

func CreateSeldonDeployment(deployment *machinelearningv1.SeldonDeployment, namespace string) (sdep *machinelearningv1.SeldonDeployment, err error) {
	return clientset.MachinelearningV1().SeldonDeployments(namespace).Create(context.TODO(), deployment, metav1.CreateOptions{})
}
Read and update are very similar. But what if you want to track changes to resources of the type? Perhaps you want to know immediately when the resource is ready. That calls for a watch and its implementation is a bit different.
Watching a SeldonDeployment is possible by importing github.com/seldonio/seldon-core/operator/client/machinelearning.seldon.io/v1/informers/externalversions and using an Informer. Simplifying a little, that basically looks like:
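import (
	SeldonVersion "github.com/seldonio/seldon-core/operator/apis/machinelearning.seldon.io/v1"
	SeldonInformers "github.com/seldonio/seldon-core/operator/client/machinelearning.seldon.io/v1/informers/externalversions"
	"k8s.io/client-go/tools/cache"
)

// These two lines belong inside the server's setup code, where s holds the Seldon ClientSet.
// The factory is scoped to one namespace; runSeldonInformer blocks, so it is typically started in its own goroutine.
seldonFactory := SeldonInformers.NewSharedInformerFactoryWithOptions(s.Settings.SeldonAPI.Clientset, 0, SeldonInformers.WithNamespace(namespace))
runSeldonInformer(seldonFactory)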

func runSeldonInformer(seldonFactory SeldonInformers.SharedInformerFactory) {
	seldoninformer := seldonFactory.Machinelearning().V1().SeldonDeployments().Informer()
	stopper := make(chan struct{})
	defer close(stopper)
	seldoninformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			d := obj.(*SeldonVersion.SeldonDeployment)
			//...
		},
		DeleteFunc: func(obj interface{}) {
			d := obj.(*SeldonVersion.SeldonDeployment)
			//...
		},
		UpdateFunc: func(oldObj, newObj interface{}) {
			d := newObj.(*SeldonVersion.SeldonDeployment)
			//...
		},
	})

	seldoninformer.Run(stopper)
}
Note here that we have an import called SeldonVersion as well as the SeldonInformers. The SeldonVersion here gives us Go code to represent what a SeldonDeployment custom resource is. It’s the Go type, whereas the ClientSet is the way to interact with instances of the type.
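To make the handler callbacks above a little more concrete, here is a toy, standard-library-only sketch of the idea behind an informer: keep a local cache and call the add, update or delete handler when that cache changes. The toyInformer type, its Observe method and the recordEvents helper are all hypothetical names invented for illustration. A real informer does far more (listing, watching, resyncing, indexing), so treat this as a mental model rather than how client-go is implemented.

```go
package main

import "fmt"

// Handlers mirrors the shape of cache.ResourceEventHandlerFuncs from client-go.
type Handlers struct {
	AddFunc    func(obj interface{})
	UpdateFunc func(oldObj, newObj interface{})
	DeleteFunc func(obj interface{})
}

// toyInformer is a hypothetical, in-memory stand-in for a real informer.
// It keeps a local cache and calls the matching handler when the cache changes.
type toyInformer struct {
	cache    map[string]interface{}
	handlers Handlers
}

func newToyInformer(h Handlers) *toyInformer {
	return &toyInformer{cache: map[string]interface{}{}, handlers: h}
}

// Observe records the latest state for key; a nil obj means the resource is gone.
func (t *toyInformer) Observe(key string, obj interface{}) {
	old, known := t.cache[key]
	switch {
	case obj == nil && known:
		delete(t.cache, key)
		if t.handlers.DeleteFunc != nil {
			t.handlers.DeleteFunc(old)
		}
	case obj == nil:
		// Deleting something we never saw: nothing to report.
	case known:
		t.cache[key] = obj
		if t.handlers.UpdateFunc != nil {
			t.handlers.UpdateFunc(old, obj)
		}
	default:
		t.cache[key] = obj
		if t.handlers.AddFunc != nil {
			t.handlers.AddFunc(obj)
		}
	}
}

// recordEvents drives the toy informer through an add, an update and a delete,
// and returns a description of each handler call in order.
func recordEvents() []string {
	var events []string
	inf := newToyInformer(Handlers{
		AddFunc: func(obj interface{}) {
			events = append(events, fmt.Sprintf("add %v", obj))
		},
		UpdateFunc: func(oldObj, newObj interface{}) {
			events = append(events, fmt.Sprintf("update %v -> %v", oldObj, newObj))
		},
		DeleteFunc: func(obj interface{}) {
			events = append(events, fmt.Sprintf("delete %v", obj))
		},
	})
	inf.Observe("default/iris", "iris@Creating")
	inf.Observe("default/iris", "iris@Available")
	inf.Observe("default/iris", nil)
	return events
}

func main() {
	for _, e := range recordEvents() {
		fmt.Println(e)
	}
}
```

The update handler is the one to hook into if you want to know when a resource becomes ready, since readiness usually shows up as a status change on an object that already exists.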
There’s also a helpful fake clientset for testing (github.com/seldonio/seldon-core/operator/client/machinelearning.seldon.io/v1/clientset/versioned/fake).
All of this effectively comes from kubebuilder as the clientset code being imported was generated by kubebuilder. However, kubebuilder no longer generates clientsets.

Interacting with CRDs - The New Way

Now the suggestion from kubebuilder is to use the Kubernetes controller-runtime package for interacting with CRDs. The controller-runtime package provides ways to interact with Kubernetes that are a bit more flexible but also more technical than kubernetes client-go. (FWIW controller-runtime client uses client-go.)
It is still suggested to import the Go types for representing custom resources. Typically the types are in a package named after the API version for that type. What's different now is using controller-runtime for the interactions instead of a ClientSet. Here is an example for v1 of the SeldonDeployment custom resource with functions to list and create SeldonDeployments.
import (
  "context"
  "log"

  SeldonVersion "github.com/seldonio/seldon-core/operator/apis/machinelearning.seldon.io/v1"
  "k8s.io/apimachinery/pkg/runtime"
  ctrl "sigs.k8s.io/controller-runtime"
  "sigs.k8s.io/controller-runtime/pkg/client"
)

var kclient client.Client

func init() {
  kclient = GetClient()
}

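func GetClient() client.Client {
  // The scheme tells the client how the SeldonDeployment Go types map to API kinds
  scheme := runtime.NewScheme()
  if err := SeldonVersion.AddToScheme(scheme); err != nil {
     log.Fatal(err)
  }
  kubeconfig := ctrl.GetConfigOrDie()
  controllerClient, err := client.New(kubeconfig, client.Options{Scheme: scheme})
  if err != nil {
     // log.Fatal exits the process, so no return is needed after it
     log.Fatal(err)
  }
  return controllerClient
}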

func ListSeldonDeployments(namespace string) (result *SeldonVersion.SeldonDeploymentList, err error) {
  list := &SeldonVersion.SeldonDeploymentList{}
  err = kclient.List(context.TODO(), list, &client.ListOptions{Namespace: namespace})
  return list, err
}

func CreateSeldonDeployment(deployment *SeldonVersion.SeldonDeployment) (sdep *SeldonVersion.SeldonDeployment, err error) {
  err = kclient.Create(context.TODO(), deployment)
  return deployment, err
}
The controller-runtime client also has a fake implementation available for test purposes: sigs.k8s.io/controller-runtime/pkg/client/fake. There’s a NewFakeClientWithScheme function that can be used to create a fake client that’s the equivalent for unit tests of the real client from GetClient above. (Newer controller-runtime releases deprecate that function in favor of fake.NewClientBuilder, so check which one your version provides.)
A watch without a clientset is quite different from a watch with a clientset. Most notably, the informer needs to be told which type to watch and the objects it tells us about are generic unstructured types that have to be converted to the intended type.
import (
  "fmt"

  SeldonVersion "github.com/seldonio/seldon-core/operator/apis/machinelearning.seldon.io/v1"

  corev1 "k8s.io/api/core/v1"
  "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
  "k8s.io/apimachinery/pkg/runtime"
  "k8s.io/apimachinery/pkg/runtime/schema"
  "k8s.io/client-go/dynamic"
  "k8s.io/client-go/dynamic/dynamicinformer"
  "k8s.io/client-go/informers"
  "k8s.io/client-go/tools/cache"
  ctrl "sigs.k8s.io/controller-runtime"
)

func GetDynamicInformer(resourceType string) (informers.GenericInformer, error) {
  cfg := ctrl.GetConfigOrDie()

  // Grab a dynamic interface that we can create informers from
  dc, err := dynamic.NewForConfig(cfg)
  if err != nil {
     return nil, err
  }
  // Create a factory object that can generate informers for resource types
  factory := dynamicinformer.NewFilteredDynamicSharedInformerFactory(dc, 0, corev1.NamespaceAll, nil)
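  // "GroupVersionResource" to say what to watch e.g. "deployments.v1.apps" or "seldondeployments.v1.machinelearning.seldon.io"
  // ParseResourceArg returns a nil GroupVersionResource if the string is not in resource.version.group form
  gvr, _ := schema.ParseResourceArg(resourceType)
  if gvr == nil {
     return nil, fmt.Errorf("could not parse %q as resource.version.group", resourceType)
  }
  // Finally, create our informer for the requested resource type
  informer := factory.ForResource(*gvr)
  return informer, nil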
}

func (s *SeldonDeployServer) seldonCRDWatcher(namespace string) {
  //dynamic informer needs to be told which type to watch
  seldoninformer, err := GetDynamicInformer("seldondeployments.v1.machinelearning.seldon.io")
  if err != nil {
     fmt.Println("could not create SeldonDeployment informer:", err)
     return
  }
  stopper := make(chan struct{})
  defer close(stopper)
  runSeldonCRDInformer(stopper, seldoninformer.Informer(), namespace)
}

func runSeldonCRDInformer(stopCh <-chan struct{}, s cache.SharedIndexInformer, namespace string) {
  handlers := cache.ResourceEventHandlerFuncs{
     AddFunc: func(obj interface{}) {

        d := &SeldonVersion.SeldonDeployment{}
        // try following https://erwinvaneyk.nl/kubernetes-unstructured-to-typed/
        err := runtime.DefaultUnstructuredConverter.
           FromUnstructured(obj.(*unstructured.Unstructured).UnstructuredContent(), d)
        if err != nil {
           fmt.Println("could not convert obj to SeldonDeployment")
           fmt.Print(err)
           return
        }
        // do what we want with the SeldonDeployment/event
     },
     DeleteFunc: func(obj interface{}) {
        // convert the obj as above, then do what we want with the SeldonDeployment/event
        // (on a missed delete, obj can be a cache.DeletedFinalStateUnknown wrapper instead)
     },
     UpdateFunc: func(oldObj, newObj interface{}) {
        // convert newObj (and oldObj if needed) as above, then do what we want with the SeldonDeployment/event
     },
  }
  s.AddEventHandler(handlers)
  s.Run(stopCh)
}
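Two steps in the listing above are easy to gloss over: the "resource.version.group" string that names the type to watch, and the conversion from a generic map into a Go type. Below is a small standard-library-only sketch of both ideas. Everything in it is hypothetical and for illustration only: the gvr struct stands in for schema.GroupVersionResource, parseResourceArg splits the string the way I understand schema.ParseResourceArg to treat it, toyDeployment is a heavily trimmed made-up type (not the real SeldonDeployment), and fromUnstructured uses a JSON round trip rather than the reflection-based converter in apimachinery.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// gvr is a hypothetical stand-in for schema.GroupVersionResource.
type gvr struct{ Group, Version, Resource string }

// parseResourceArg splits "resource.version.group" into its three parts.
// The group itself may contain dots, so only the first two dots are split on.
// ok is false when the argument has fewer than three segments.
func parseResourceArg(arg string) (g gvr, ok bool) {
	parts := strings.SplitN(arg, ".", 3)
	if len(parts) != 3 {
		return gvr{}, false
	}
	return gvr{Group: parts[2], Version: parts[1], Resource: parts[0]}, true
}

// toyDeployment is a hypothetical, heavily trimmed typed resource.
type toyDeployment struct {
	Kind     string `json:"kind"`
	Metadata struct {
		Name      string `json:"name"`
		Namespace string `json:"namespace"`
	} `json:"metadata"`
	Spec struct {
		Replicas int `json:"replicas"`
	} `json:"spec"`
}

// fromUnstructured fills out from generic map content via a JSON round trip.
// This is a stand-in for runtime.DefaultUnstructuredConverter.FromUnstructured.
func fromUnstructured(content map[string]interface{}, out interface{}) error {
	raw, err := json.Marshal(content)
	if err != nil {
		return err
	}
	return json.Unmarshal(raw, out)
}

func main() {
	g, ok := parseResourceArg("seldondeployments.v1.machinelearning.seldon.io")
	fmt.Println(g.Resource, g.Version, g.Group, ok)

	// This map plays the role of UnstructuredContent() in the handler above.
	content := map[string]interface{}{
		"kind":     "SeldonDeployment",
		"metadata": map[string]interface{}{"name": "iris", "namespace": "seldon"},
		"spec":     map[string]interface{}{"replicas": 2},
	}
	var d toyDeployment
	if err := fromUnstructured(content, &d); err != nil {
		fmt.Println("could not convert:", err)
		return
	}
	fmt.Println(d.Metadata.Namespace, d.Metadata.Name, d.Spec.Replicas)
}
```

The practical upshot is that the dynamic informer only needs the string to know what to watch, while your own code only needs the Go type at the point where it wants typed fields.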

Handling Go Dependencies

One upside of the controller-runtime approach to handling CRDs is it gives you more flexibility when importing Go libraries. If you don't use a ClientSet then you can import the types for a CRD with minimal coupling to a k8s version.
ClientSets are generated to use a particular version of client-go and that can be subject to breaking changes across versions, such as when the context parameter was introduced on many key functions. Importing multiple ClientSets can lead to dependency clashes. This can be avoided with the controller-runtime approach.

Summary

Using Go gives us a lot of flexibility for interacting with Kubernetes. We can manage custom resources and interact with custom resources we don’t manage. We could even interact with custom resources that we know some properties of but don’t know their full definitions.
Here we’ve only covered cases where we know the full type as that’s more common but there are articles out there for cases where you don’t know the types either. The range of options can be a little confusing. Hopefully, this article helps with navigating the choices.
(Title image: Binoculars Water View by kisistvan77 on Pixabay.)

Written by ryandawsonuk | Principal Data Consultant at ThoughtWorks. Hackernoon Contributor of the Year - Engineering.
Published by HackerNoon on 2020/11/12