Confidence is one of the many benefits of adopting a GitOps-based approach. By storing manifests in a declarative fashion, one has the assurance that the state of an environment has been accurately captured and can be applied and enforced as needed. However, how do you know that the captured manifests are syntactically valid? Are required fields missing? Are there invalid data types present? In many cases, these types of errors only become known at runtime. When using a templating tool, such as Kustomize or Helm, this process becomes even more difficult. Wouldn’t it be nice to be able to verify the rendered manifests beforehand to avoid heartache later on? Fortunately, with the help of specialized tools and a proactive approach toward ensuring stability, it is possible to verify Kubernetes manifests before they are applied to a cluster or made available to a GitOps tool. This article introduces strategies for validating Kubernetes manifests along with the tools that make it possible to increase the stability of managing a Kubernetes environment and the surrounding ecosystem of components.

On the surface, verifying the state of Kubernetes manifests may seem like a trivial task. Even the Kubernetes CLI (kubectl) has the ability to verify resources before they are applied to a cluster. (This is the default behavior, though it can be disabled by specifying the --validate=false option.) Validation can be performed without applying the resources to the cluster by adding the --dry-run=client option to the kubectl create or kubectl apply commands. However, one of the key challenges with this approach is that a connection to a running Kubernetes cluster is required to obtain the schema for the set of resources being validated. This presents an issue when incorporating manifest verification into a CI process, as it requires managing connectivity and credentials against a running environment to perform the validation.
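
As a quick illustration, the following commands show client-side validation in action; the deployment.yaml file name is simply a placeholder for one of your own manifests:

# Validate a manifest without persisting it to the cluster
$ kubectl apply --dry-run=client -f deployment.yaml

# Skip validation entirely (not recommended)
$ kubectl apply --dry-run=client --validate=false -f deployment.yaml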

Schemas are a key component of the verification process in Kubernetes, providing the structure along with the fields defined for each API resource. Schemas are not unique to Kubernetes alone; a number of tools in the Kubernetes ecosystem also use them. The Helm package manager, for example, provides the ability to leverage JSON Schema validation to define the set of values that must be specified when invoking a chart. This optional feature lets the author place a values.schema.json file at the root of their chart containing the set of required properties and their structure. Because the Kubernetes ecosystem is extensible and kubectl alone cannot easily validate resources at scale, another tool is needed.
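
For reference, a minimal values.schema.json might look like the following sketch, which requires a replicaCount value to be a non-negative integer (the property name is illustrative rather than taken from any particular chart):

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": ["replicaCount"],
  "properties": {
    "replicaCount": {
      "type": "integer",
      "minimum": 0
    }
  }
}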

Kubeval is a command line tool developed with the sole purpose of validating Kubernetes manifests, and a running Kubernetes environment is not required to use it. Verification is performed against the supplied manifests using JSON schemas generated from the OpenAPI specifications for a particular Kubernetes version. All you need to do is point kubeval at a single manifest, a directory, or an external set of resources piped to it to confirm their validity, similar to the following:

$ curl -sL https://raw.githubusercontent.com/kubernetes/examples/master/guestbook/all-in-one/guestbook-all-in-one.yaml | kubeval
PASS - stdin contains a valid Service (redis-master)
PASS - stdin contains a valid Deployment (redis-master)
PASS - stdin contains a valid Service (redis-slave)
PASS - stdin contains a valid Deployment (redis-slave)
PASS - stdin contains a valid Service (frontend)
PASS - stdin contains a valid Deployment (frontend)

Sounds great, right? However, kubeval only validates schemas for the default set of Kubernetes resources. Recall that extensibility is one of the core principles of Kubernetes, and there will likely be API resources outside the included set on a given cluster. These often take the form of Custom Resource Definitions (CRDs), but may also include platform-specific resources. When running in an OpenShift environment, these additional resources include Routes, Builds, and DeploymentConfigs, just to name a few. Attempting to validate these types of resources with kubeval will fail. For example, using the OpenShift CLI to process one of the default OpenShift templates and piping the result into kubeval produces an error similar to the following:

$ oc process openshift//postgresql-ephemeral | kubeval

ERR  - stdin: Failed initializing schema https://kubernetesjsonschema.dev/master-standalone/deploymentconfig-apps-v1.json: Could not read schema from HTTP, response status is 404 Not Found

So, how do we work around this issue? While kubeval does contain the option to ignore missing schemas through the use of the --ignore-missing-schemas flag, ignoring schema failures defeats the purpose of performing validation in the first place.

You may have noticed that within the error presented above, a schema could not be found on the site https://kubernetesjsonschema.dev for the DeploymentConfig resource. Kubeval attempts to locate JSON schemas in several locations, both on the local file system and at a set of remote locations. kubernetesjsonschema.dev is a site that contains the Kubernetes JSON schemas for each released version.

The order and locations from which schemas are sourced are listed below, with a short example following the list:

  1. A set of OpenShift schemas (remote location) when the --openshift flag is passed
  2. A location (file path or URL) specified by the --schema-location flag
  3. A location specified by the KUBEVAL_SCHEMA_LOCATION environment variable
  4. kubernetesjsonschema.dev
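
For instance, a local schema collection can be referenced either through the flag or the environment variable; when both are set, the flag takes precedence (the schemas path and manifest name below are placeholders):

$ kubeval --schema-location=file://$(pwd)/schemas deployment.yaml
$ KUBEVAL_SCHEMA_LOCATION=file://$(pwd)/schemas kubeval deployment.yaml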

While the --openshift flag may appear to be a viable option, it unfortunately uses an outdated schema definition that may not include recent changes to the OpenShift related resources. Most importantly, it will not include any of the custom resources found in your environment. Finally, it also relies on an external dependency that may not be available in some restricted environments.

An alternative approach to resolving these challenges is to build a collection of schemas based on your own environment. Benefits of this approach include:

  1. Elimination of any network dependencies at validation time
  2. Ability to leverage resources defined in your own environment, including those provided by OpenShift along with those associated with any Custom Resource Definitions

Building the Schema Collection

To build a collection of schemas for a particular OpenShift environment, one must be able to determine which resources are present within a cluster. Fortunately, OpenShift and the underlying Kubernetes platform expose an OpenAPI endpoint that provides the set of operations along with the model definitions that are available within the cluster. This authenticated endpoint is located at https://<api_server>/openapi/v2 and contains the following top level properties:

  • definitions
  • info
  • paths
  • security
  • securityDefinitions
  • swagger

The set of resources within the definitions property contains the schemas kubeval will ultimately use to verify our GitOps manifests. However, these definitions are in OpenAPI format, not the JSON Schema format that kubeval requires. The good news is that a tool called openapi2jsonschema is available to assist with the conversion between the OpenAPI specification and JSON Schema.
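
To give a sense of what kubeval will ultimately work from, here is a heavily trimmed example of a single entry within the definitions property for the core Deployment resource (descriptions have been elided for brevity):

"io.k8s.api.apps.v1.Deployment": {
  "properties": {
    "apiVersion": { "type": "string" },
    "kind": { "type": "string" },
    "metadata": { "$ref": "#/definitions/io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta" },
    "spec": { "$ref": "#/definitions/io.k8s.api.apps.v1.DeploymentSpec" },
    "status": { "$ref": "#/definitions/io.k8s.api.apps.v1.DeploymentStatus" }
  },
  "x-kubernetes-group-version-kind": [
    { "group": "apps", "kind": "Deployment", "version": "v1" }
  ]
}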

Before we can start to build our own set of schemas against an OpenShift environment, let’s ensure our machine has all of the required software. These include:

  • OpenShift command line tool (oc)
  • kubeval
  • openapi2jsonschema

First, download and install kubeval to your machine using the steps illustrated in this document.

Next, install openapi2jsonschema with pip using the following command:

$ pip install openapi2jsonschema

This article provides an overview of the steps required to install pip.

Once installed, the next step is to obtain the OpenAPI specification for your cluster. Log in to your OpenShift cluster and execute the following command to download the file called openapi.json to your machine:

$ curl -kL -H "Authorization: Bearer $(oc whoami -t)" $(oc whoami --show-server)/openapi/v2 > openapi.json

As the file downloads, you may notice the sheer size of the document (>8MB for a minimal OpenShift environment). Given that the file contains every API endpoint and object definition for the entire cluster, the size is not entirely surprising.
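
If you have the jq utility available, a quick way to get a feel for the document is to count the entries in the definitions section; the exact number will vary based on the cluster version and any installed operators:

$ jq '.definitions | length' openapi.json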

With the OpenAPI schema downloaded, openapi2jsonschema can be used to convert it to JSON Schema:

$ openapi2jsonschema openapi.json
Downloading schema
Parsing schema
Generating shared definitions
Generating individual schemas
Processing alertmanager
Generating alertmanager.json
Processing alertmanagerlist
Generating alertmanagerlist.json
...
Generating schema for all types

By default, the resulting JSON schemas are written to a directory called schemas within the current working directory.

$ ls -1 schemas/
_definitions.json
actiondescriptor.json
affinity.json
aggregationrule.json
alertmanager.json
alertmanagerlist.json

openapi2jsonschema also contains special processors that detect Kubernetes-specific OpenAPI nodes, such as x-kubernetes-int-or-string. To enable these processors, specify the --kubernetes flag when invoking the openapi2jsonschema command, as shown below:

$ openapi2jsonschema --kubernetes openapi.json
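
The effect of this flag can be seen in fields that accept either an integer or a string, such as a Service's targetPort. A sketch of the kind of JSON Schema fragment produced for such a field (illustrative rather than verbatim tool output) looks like the following:

"targetPort": {
  "oneOf": [
    { "type": "string" },
    { "type": "integer" }
  ]
}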

Resolving Errors During JSON Schema Generation

While generating JSON schemas against a baseline installation of OpenShift may finish without error, there is a potential for errors when executing against an environment that may have additional operators or Custom Resource Definitions installed. This is because the OpenAPI endpoint may not have the schemas for all the custom resources defined.

Custom Resource Definitions let end users define new resources within an OpenShift environment. Within the Custom Resource Definition object itself is a section called openAPIV3Schema where creators can specify the OpenAPI schema for the custom resource, which can then be used for validation purposes. Once defined, the schema is eligible to be exposed by the cluster's OpenAPI endpoint.

Structural schemas defined through the openAPIV3Schema property became mandatory with the apiextensions.k8s.io/v1 version of Custom Resource Definitions. Since schema validation was not required in the apiextensions.k8s.io/v1beta1 version, schemas may not be defined on all resources. Also, for CRDs that are converted from apiextensions.k8s.io/v1beta1 to apiextensions.k8s.io/v1, there is a good likelihood that spec.preserveUnknownFields is set to true. When this value is true, any fields on custom resources that are not part of the schema continue to be retained. As a result, since strict conformance against the structural schema cannot be guaranteed, the resource is not exposed by the cluster OpenAPI endpoint.

You can determine whether a schema is missing by inspecting the downloaded OpenAPI file and noting the omission of the properties field within a resource's entry in the definitions section.
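
Assuming jq is available, the following one-liner lists any entries in the definitions section that lack a properties field and would therefore cause trouble during conversion:

$ jq -r '.definitions | to_entries[] | select(.value | has("properties") | not) | .key' openapi.json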

Attempting to execute openapi2jsonschema when this type of condition is present will result in the following error:

$ openapi2jsonschema --kubernetes openapi-slb.json
Downloading schema
Parsing schema
Generating shared definitions
Traceback (most recent call last):
File "<kubeval_location>/bin/openapi2jsonschema", line 8, in <module>
  sys.exit(default())
File "<python_location>/site-packages/click/core.py", line 829, in __call__
  return self.main(*args, **kwargs)
File "<python_location>/site-packages/click/core.py", line 782, in main
  rv = self.invoke(ctx)
File "<python_location>/site-packages/click/core.py", line 1066, in invoke
  return ctx.invoke(self.callback, **ctx.params)
File "<python_location>/site-packages/click/core.py", line 610, in invoke
  return callback(*args, **kwargs)
File "<python_location>/site-packages/openapi2jsonschema/command.py", line 111, in default
  if "kind" in type_def["properties"]:
KeyError: 'properties'

There are a few options for resolving this issue:

  1. Ignore the offending CRDs
  2. Investigate resources that do not provide a schema definition via OpenAPI

The first option involves manipulating the retrieved OpenAPI definition and removing the resources that do not provide a valid schema so that openapi2jsonschema can process the file properly. Automation to perform this task can be found in a Python script called build_schema.py in a GitHub repository, along with several other assets used in the remainder of this discussion.

https://github.com/sabre1041/k8s-manifest-validation 

Clone this repository to your local machine to have access to the assets:

$ git clone https://github.com/sabre1041/k8s-manifest-validation
$ cd k8s-manifest-validation

The build_schema.py script, found within the scripts directory, will first connect to an OpenShift cluster, much in the same way as performed previously, then download the OpenAPI specification and execute the transformation.

To build the schema, navigate to the scripts directory and execute the following command:

$ python build_schema.py -u $(oc whoami --show-server) -t $(oc whoami -t)
Downloading and parsing API schema from: https://<api_url>:6443/openapi/v2
The following API resources do not have valid OpenAPI specifications:
io.cncf.cni.whereabouts.v1alpha1.OverlappingRangeIPReservation
io.metal3.v1alpha1.Provisioning
io.openshift.cloudcredential.v1.CredentialsRequest
Processing schemas into <directory>/openshift-json-schema/master-standalone
...

Once the script completes, the set of schemas will be stored in a directory called openshift-json-schema/master-standalone. (Kubeval expects a certain directory structure during execution; more about that naming convention later.)

As denoted by the output above, three of the on-cluster resources do not provide schemas and will be removed from the resulting set of JSON schemas. If you do not anticipate validating against any of these resource types, no further action is required.

However, there are options available if you want those resources to be exposed via the OpenAPI endpoint. As noted previously, any CRD with spec.preserveUnknownFields set to true, which is common for CRDs converted from apiextensions.k8s.io/v1beta1 to apiextensions.k8s.io/v1, will not be exposed via the OpenAPI endpoint. You can patch the CRD and set spec.preserveUnknownFields to false in order to capture the schema.

Warning: Modifying CRDs, especially those provided and managed by OpenShift, should only be performed in a sandbox/lab environment with the sole purpose of capturing the schema definition. Once the schemas have been captured, it is recommended that any modifications revert to their original state.

To patch the spec.preserveUnknownFields field of a CRD to be false, execute the following command:

$ oc patch crd <crd> --type "json" -p '[{"op":"add","path":"/spec/preserveUnknownFields","value": false}]'
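
You can confirm the change took effect by querying the field directly (substitute your CRD name for the placeholder):

$ oc get crd <crd> -o jsonpath='{.spec.preserveUnknownFields}'
false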

To confirm that the JSON schemas can be built without any of the resources being removed, you can use the --dry-run flag of the build_schema.py script as shown below:

$ python build_schema.py -u $(oc whoami --show-server) -t $(oc whoami -t) --dry-run
Downloading and parsing API schema from: https://api.ablock.devcluster.openshift.com:6443/openapi/v2
Dry run activated: Not building schema

At this point, you can remove the --dry-run flag and build the schema collection for all of the resources in the OpenShift cluster. This will enable the collection to be referenced by kubeval, which will be described in the following section.

Once complete, it is recommended that any modifications to the CRDs revert to their original state.
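
For example, if a CRD originally had spec.preserveUnknownFields set to true, it can be restored with a patch mirroring the one shown earlier:

$ oc patch crd <crd> --type "json" -p '[{"op":"replace","path":"/spec/preserveUnknownFields","value": true}]'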

Schema Verification Using Kubeval

Now that the collection of schemas for the OpenShift cluster has been generated, let's return to kubeval, the tool used to validate the manifests managed by GitOps. As mentioned previously, kubeval can operate on Kubernetes manifests within individual files, directories, or piped to standard input, ideally after being processed by templating tools such as Kustomize or Helm.

To take advantage of the schema collection built in the prior section, the --schema-location flag allows a file path or URL containing these resources to be specified. If you recall, the build_schema.py script that was used to build the collection placed files into a folder called openshift-json-schema/master-standalone. Kubeval allows schemas for multiple Kubernetes versions to be stored as part of a collection library; if there were multiple versions of OpenShift running in your environment, each version would have its own unique set of schemas. The desired version is selected with the --kubernetes-version flag when invoking kubeval, and master is used when no version is specified. For OpenShift 4.6, Kubernetes version 1.18 would be specified, and the corresponding folder would be 1.18-standalone instead of master-standalone.
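
As a sketch, assuming the collection had been generated into a 1.18-standalone folder, an invocation pinned to that version would look like the following (the manifest name is a placeholder):

$ kubeval --schema-location=file://$(pwd)/openshift-json-schema --kubernetes-version 1.18 manifest.yaml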

In addition, to further restrict the properties that can be used within manifest files, kubeval also contains a "strict" mode. To use this feature, the schema collection itself must be generated appropriately by openapi2jsonschema. The build_schema.py tool can be used to pass this instruction down to openapi2jsonschema and create the directory structure that kubeval expects. Strict schemas are stored in a folder of the form <kubernetes_version>-standalone-strict.
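
If you are invoking openapi2jsonschema directly rather than through build_schema.py, strict schemas can be produced with its --strict flag and then consumed by passing --strict to kubeval as well. A rough sketch follows; note that the generated output would still need to be arranged into the folder layout described above:

$ openapi2jsonschema --kubernetes --strict openapi.json
$ kubeval --strict --schema-location=file://$(pwd)/openshift-json-schema manifest.yaml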

With all of the pieces in place, validation can be performed on a set of Kubernetes resources. Execute the following command, passing the manifests to validate as arguments or piping them to standard input:

$ kubeval --schema-location=file://$(pwd)/openshift-json-schema

The set of provided manifests should then be validated successfully.
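
Rendered output from templating tools can be validated in the same fashion. For example, assuming the chart and Kustomization paths from the repository introduced earlier:

$ helm template examples/example-app-helm | kubeval --schema-location=file://$(pwd)/openshift-json-schema
$ kustomize build examples/argocd/base | kubeval --schema-location=file://$(pwd)/openshift-json-schema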

If you were unable to generate JSON schemas for all of your desired resources, you can also specify the --ignore-missing-schemas flag, which presents a warning instead of an error when an associated schema is not found for one of the provided manifests. As more and more CRDs graduate to apiextensions.k8s.io/v1, this type of issue will become less common.

Validating Schemas in a CI/CD Pipeline

To fully take advantage of these tools in a GitOps context, validations should be performed at various points in time as part of a Continuous Integration and Continuous Delivery (CI/CD) pipeline. When a GitOps tool, such as Argo CD, is being used to manage the configuration of a Kubernetes cluster, the last line of defense is just prior to when resources are integrated into a branch targeted by the tool. This typically takes the form of a pull (or merge) request (PR) in any collaborative development project. In this situation, when a PR is submitted, kubeval can target manifests contained within the repository and confirm that they validate against the schemas for the target environment. Schemas for the target cluster can either be stored within the repository or sourced from an external location.

To demonstrate how to integrate manifest verification as part of a CI/CD pipeline, refer to the k8s-manifest-validation repository introduced in a prior section. Two folders are of note: schemas and ci.

The schemas folder contains a sample schema collection (within the openshift-json-schema directory) for an OpenShift cluster and will aid in the demonstration within this section. It was built using the same set of tools and processes described previously.

The ci directory contains artifacts to use within CI/CD tools. One of the examples provided demonstrates using Tekton (or OpenShift Pipelines), a cloud native CI/CD solution built for Kubernetes.

For the Tekton-based assets used in an OpenShift environment, OpenShift Pipelines must first be installed and deployed.

Log in to the OpenShift Web Console and, from the Administrator perspective, select OperatorHub. Search for the OpenShift Pipelines Operator, click Install, confirm the appropriate version, and then click Install once more. In a few moments, OpenShift Pipelines will be deployed and available within the cluster.

Once OpenShift Pipelines has been installed, navigate to the examples directory of the repository, which includes the following set of assets separated into their own respective directories:

  • argocd-operator - Deploys an instance of the Argo CD operator
  • argocd - Assets to stand up an instance of Argo CD and a sample Application
  • example-app - A simple Apache HTTPD based application
  • example-app-helm - A simple Apache HTTPD based application using Helm

We will not complete the process of deploying these assets into the cluster, given that the goal here is to demonstrate how manifests such as these can be validated using OpenShift Pipelines.

An overview of the key components within OpenShift Pipelines can be found here.

First, using the OpenShift CLI, create a new project called manifest-validation to contain our Tekton resources:

$ oc new-project manifest-validation

Next, navigate to the ci/tekton directory containing the Tekton resources. At a high level, the pipeline will perform two primary tasks:

  1. Clone the repository
  2. Use kubeval to verify the manifests in the examples directory using the generated schemas in the schemas directory

Now, create the custom Tekton Task called kubeval, which performs the verification, by executing the following command:

$ oc apply -f tasks/kubeval-task.yml

Next, create the pipeline that will clone the repository using the included git-clone ClusterTask and the previously created task:

$ oc apply -f pipelines/kubeval-pipeline.yml

Finally, since the pipeline makes use of a shared workspace between the tasks, create a new PersistentVolumeClaim to store the cloned repository:

$ oc apply -f storage/pvc.yml

With the components of the pipeline now added to the cluster, let's start a new invocation of the pipeline. Within the OpenShift Web Console, select Pipelines in the left-hand navigation bar, which will display the kubeval-pipeline created previously.

Select the kebab menu on the right-hand side and click Start.

Leave the Git url and revision parameters at their defaults. Under the Workspaces section, select the dropdown for the git-source workspace, select PVC, and then select the git-tekton PVC created previously. Click Start to initiate the pipeline.
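
Alternatively, if you have the Tekton CLI (tkn) installed, the same run can be started from a terminal. This sketch assumes a recent version of tkn and that the workspace and PVC names match those shown in the console flow above:

$ tkn pipeline start kubeval-pipeline \
    --use-param-defaults \
    -w name=git-source,claimName=git-tekton \
    --showlog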

By clicking on the Logs tab, you can track the progress of the pipeline as it clones the source in the git-clone task and then performs the manifest validation in the kubeval task.

The resulting output of the kubeval task, confirming the successful validation of the manifests in the examples directory, is shown below:

Downloading kubeval and Helm

Validating Manifests

Validating Kustomization: /workspace/source/examples/argocd-operator/base

PASS - stdin contains a valid Namespace (argocd)
PASS - stdin contains a valid OperatorGroup (argocd.argocd)
PASS - stdin contains a valid Subscription (argocd.argocd-operator)

Validating Kustomization: /workspace/source/examples/argocd/base

PASS - stdin contains a valid ClusterRoleBinding (argocd-application-controller-cluster-admin)
PASS - stdin contains a valid ConfigMap (argocd.argocd-gpg-keys-cm)
PASS - stdin contains a valid Application (argocd.example-app)
PASS - stdin contains a valid ArgoCD (argocd.argocd)

Validating Kustomization: /workspace/source/examples/example-app

PASS - stdin contains a valid Service (argocd.httpd-example)
PASS - stdin contains a valid DeploymentConfig (argocd.httpd-example)
PASS - stdin contains a valid BuildConfig (argocd.httpd-example)
PASS - stdin contains a valid ImageStream (argocd.httpd-example)
PASS - stdin contains a valid Route (argocd.httpd-example)
PASS - example-app-helm/templates/serviceaccount.yaml contains a valid ServiceAccount (example-app-helm)
PASS - example-app-helm/templates/service.yaml contains a valid Service (example-app-helm)
PASS - example-app-helm/templates/deployment.yaml contains a valid Deployment (example-app-helm)
PASS - example-app-helm/templates/build.yml contains a valid BuildConfig (example-app-helm)
PASS - example-app-helm/templates/imagestream.yml contains a valid ImageStream (example-app-helm)
PASS - example-app-helm/templates/route.yaml contains a valid Route (example-app-helm)

Manifests successfully validated!

While Tekton was used as the tool of choice for this demonstration, just about any other Continuous Integration and Continuous Delivery tool can make use of this approach. A GitHub Actions workflow is also available in the k8s-manifest-validation repository and demonstrates how validation can be incorporated into the Pull Request process to further implement the principles of GitOps.
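
For reference, the following is a minimal, illustrative workflow (not the one shipped in the repository) showing the general shape of running kubeval against repository manifests on each pull request; the kubeval release URL and manifest paths are assumptions that would need to be adapted:

name: validate-manifests
on: pull_request
jobs:
  kubeval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Install kubeval
        run: |
          curl -sL https://github.com/instrumenta/kubeval/releases/latest/download/kubeval-linux-amd64.tar.gz | tar xz
          sudo mv kubeval /usr/local/bin/
      - name: Validate manifests
        run: |
          kubeval --schema-location=file://$(pwd)/schemas/openshift-json-schema \
            --ignore-missing-schemas \
            examples/example-app/*.yaml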

Wrapping Up

As more teams adopt a declarative approach to managing their Kubernetes resources, it becomes even more important to take additional measures to ensure conformance and validity before those resources are applied. By using tools such as openapi2jsonschema and kubeval, errors can be reduced and confidence gained, ultimately resulting in faster delivery of software and infrastructure.


About the author

Andrew Block is a Distinguished Architect at Red Hat, specializing in cloud technologies, enterprise integration and automation.
