On the heels of releasing OpenShift 4.6, OpenShift Service Mesh 2.0 provides a significant update to OpenShift’s service mesh capabilities. OpenShift Service Mesh includes an ecosystem of upstream projects, bundled to provide additional value to OpenShift customers: Istio, Envoy, Kiali and Jaeger. In addition to the re-architected control plane, security, performance and resource utilization improvements, OpenShift Service Mesh 2.0 also brings in new features in Kiali and Jaeger to make it easier to observe and route traffic within the service mesh. This also includes expansion of the suite of wizards in Kiali that make it easy for developers and operators to manage traffic in the mesh.

Background

OpenShift Service Mesh provides a layer on top of OpenShift for securely connecting services in a consistent manner. This provides centralized control, security and observability across your services without requiring changes to your applications.

This is achieved by connecting applications using lightweight Envoy proxies to form a mesh of communications - a “Service Mesh”. These proxies are managed by the Service Mesh Control Plane.

 

The ability to centrally view and manage service interactions provides multiple benefits, including:

  • Visibility into service communication with out-of-the-box tracing, metrics and logging.
  • Automatic encryption of communications with mTLS, along with authentication and authorization.
  • The ability to enforce zero-trust networking, where services must opt in to be able to communicate with each other.
  • Fine-grained control of service traffic, enabling a range of deployment options, A/B testing, chaos engineering, and many more scenarios.
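To make the mTLS and zero-trust points concrete, here is a minimal Python sketch that builds the kind of upstream Istio security resources involved: a PeerAuthentication that requires mTLS, an AuthorizationPolicy with an empty spec that allows nothing, and a policy that lets one caller opt in. The namespace, app label and service account principal are hypothetical values chosen for illustration, and this is only one possible way to express these policies.

```python
import json


def strict_mtls(namespace):
    """PeerAuthentication requiring mTLS for every workload in the namespace."""
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "PeerAuthentication",
        "metadata": {"name": "default", "namespace": namespace},
        "spec": {"mtls": {"mode": "STRICT"}},
    }


def allow_nothing(namespace):
    """AuthorizationPolicy with an empty spec: no request in the namespace is allowed."""
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "AuthorizationPolicy",
        "metadata": {"name": "allow-nothing", "namespace": namespace},
        "spec": {},
    }


def allow_caller(namespace, app, principal, methods):
    """AuthorizationPolicy that lets one caller identity reach one workload."""
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "AuthorizationPolicy",
        "metadata": {"name": f"allow-{app}", "namespace": namespace},
        "spec": {
            "selector": {"matchLabels": {"app": app}},
            "action": "ALLOW",
            "rules": [{
                "from": [{"source": {"principals": [principal]}}],
                "to": [{"operation": {"methods": list(methods)}}],
            }],
        },
    }


# Hypothetical namespace, workload and caller identity, for illustration only.
policies = [
    strict_mtls("bookinfo"),
    allow_nothing("bookinfo"),
    allow_caller(
        "bookinfo",
        "reviews",
        "cluster.local/ns/bookinfo/sa/bookinfo-productpage",
        ["GET"],
    ),
]
print(json.dumps(policies, indent=2))
```

Printing the manifests as JSON keeps the sketch dependency-free; the output can be applied to a cluster like any other manifest.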

A Centralized Control Plane

New in OpenShift Service Mesh 2.0, the upgrade from Istio 1.4 to 1.6 brings with it a significant architectural update to the control plane. The three central Istio Control Plane components - Pilot, Citadel and Galley - have been consolidated into a single binary called “istiod” (the “d” stands for daemon). Operators will notice that where there was a deployment for each of these services, there is now a single istiod deployment. This change brings many benefits, including simplifying how Istio is configured, installed, upgraded and managed. It also reduces resource usage, startup time and the cost of coordinating between the different control plane components.

 

Improved ServiceMeshControlPlane Resource

OpenShift Service Mesh 2.0 includes an updated (v2) ServiceMeshControlPlane resource for installing and configuring the control plane. This includes configuring mutual TLS, custom signing keys, distributed tracing, Jaeger, Kiali and Grafana configuration, resource configuration and more.
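As a rough illustration of what a v2 resource can look like, the Python sketch below assembles a small ServiceMeshControlPlane manifest that turns on data plane mutual TLS, Jaeger tracing, Kiali and Grafana. The field names follow the v2 schema as we understand it, and the resource name, namespace and sampling value are assumptions; check them against the product documentation before relying on them.

```python
import json


def control_plane(name="basic", namespace="istio-system", mtls=True, sampling=10000):
    """Build a minimal v2 ServiceMeshControlPlane manifest as a dict.

    Assumption: sampling is expressed on a 0-10000 scale, where 10000
    corresponds to tracing 100% of requests.
    """
    if not 0 <= sampling <= 10000:
        raise ValueError("sampling must be between 0 and 10000")
    return {
        "apiVersion": "maistra.io/v2",
        "kind": "ServiceMeshControlPlane",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "version": "v2.0",
            "security": {"dataPlane": {"mtls": mtls}},
            "tracing": {"type": "Jaeger", "sampling": sampling},
            "addons": {
                "kiali": {"enabled": True},
                "grafana": {"enabled": True},
            },
        },
    }


# Name and namespace are illustrative defaults, not requirements.
smcp = control_plane()
print(json.dumps(smcp, indent=2))
```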

This new resource includes an updated validation schema, which aims to catch configuration errors prior to deployment, reducing the probability of a “silent” runtime problem.

Improved Certificate Management with SDS

Secret Discovery Service (SDS) provides a significantly improved mechanism for delivering secrets (keys and identity certificates) to sidecar Envoy proxies. In previous releases (1.1.x and earlier) of OpenShift Service Mesh, Kubernetes Secrets were used to mount these keys and certificates directly into proxy containers. This had significant drawbacks, as Kubernetes Secrets have well-known security risks. It also had a performance impact when rotating certificates, as the proxy containers needed to be redeployed to activate the new certificate. With SDS, a central server pushes the certificates directly to all Envoy proxies, and they can be used immediately without needing a redeployment.

Goodbye Mixer, Hello WebAssembly Extensions

Until recently, extensions to Istio have been made using the centralized Mixer Telemetry and Policy components, while extending the Envoy proxy itself meant writing native filters in C++. Envoy and Istio now allow extensions using WebAssembly (“Wasm”) - a format that allows extensions to be written in more than 15 programming languages. As of Service Mesh 2.0, this is a tech-preview feature, with full support planned in a future release.

As Mixer extensions are now deprecated, they are disabled by default in Service Mesh 2.0 and will be removed in a future release. Note that Red Hat’s 3scale API Management adapter will continue to use Mixer extensions in Service Mesh 2.0. It will be converted to a WebAssembly extension prior to Mixer’s removal.

Improved Metrics Collection Latency

One of the big benefits of Service Mesh is the ability to automatically obtain metrics for services, without having to make source code changes. The metrics collection functionality has been re-architected, moving to Istio’s new “Telemetry V2” architecture, which is based on Wasm extensions (described above).

This brings performance improvements with it - significantly reducing metric collection latency. Note that there are a handful of feature gaps compared with the previous telemetry architecture, which are expected to be addressed in future releases.

Dive into Kiali from the OpenShift Developer Console

Kiali provides a management console for OpenShift Service Mesh, including dashboards, observability, wizards and configuration validation to ensure the Service Mesh is functioning properly, and aid troubleshooting when problems occur.

When OpenShift Service Mesh is enabled, Kiali can be accessed from the topology, project and service views of the OpenShift Developer Console. The screenshot below shows the project view.

 

This release includes several new features and enhancements to Kiali. In this blog post, we’ll focus on the most significant additions. An upcoming blog post will provide a deeper dive on new Kiali features in Service Mesh 2.0.

Distributed Tracing Topology view in Kiali

As services evolve, the impact of changes on communication patterns can be difficult to predict. Slow response times are often difficult to diagnose and bottlenecks can appear where you least suspect them. To help with this, Kiali now provides a distributed tracing overlay in the Graph view. This enables users to visualize the interactions that take place within the service mesh when particular requests occur.

 

From this view, you can inspect individual traces and drill down into the span-oriented Gantt chart view in Jaeger, which can help to identify bottlenecks in the request chain:

 

For a more complete demo of this functionality, see Kiali in Graph Tracing.

New Kiali Wizards: Fault Injection and Timeouts

The reality of microservices and distributed systems is that networks are unreliable: failures can and do occur. It’s therefore important to understand how your system will respond to failures before they happen.

OpenShift Service Mesh provides the ability to simulate network faults and timeouts in Kiali with the new Fault Injection wizard. This makes it easier for developers and operators to simulate faults without having to manipulate service mesh configuration files.

 

The Fault Injection wizard lets you add a delay to the responses of requests, inject faulty responses, or both. In each case, you can specify the approximate percentage of requests to be impacted. Behind the scenes, Kiali will create a VirtualService to enable the fault injection scenario.
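For readers curious what such a VirtualService might contain, here is a minimal Python sketch that builds one using the upstream Istio fault injection fields (delay and abort, each with a percentage). It is an approximation for illustration, not necessarily the exact resource Kiali generates; the function name, the host "ratings" and the percentages are hypothetical.

```python
import json


def fault_injection(host, delay_percent=0.0, fixed_delay="5s",
                    abort_percent=0.0, abort_status=503):
    """Build a VirtualService that delays and/or aborts a share of requests."""
    fault = {}
    if delay_percent > 0:
        fault["delay"] = {
            "percentage": {"value": float(delay_percent)},
            "fixedDelay": fixed_delay,
        }
    if abort_percent > 0:
        fault["abort"] = {
            "percentage": {"value": float(abort_percent)},
            "httpStatus": abort_status,
        }
    if not fault:
        raise ValueError("specify a delay, an abort, or both")
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "VirtualService",
        "metadata": {"name": host},
        "spec": {
            "hosts": [host],
            "http": [{
                "fault": fault,
                # Requests that are not aborted still go to the real service.
                "route": [{"destination": {"host": host}}],
            }],
        },
    }


# Hypothetical scenario: delay 10% of requests by 5s and fail 5% with HTTP 503.
vs = fault_injection("ratings", delay_percent=10, abort_percent=5)
print(json.dumps(vs, indent=2))
```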

The Fault Injection wizard joins an ever-evolving suite of wizards in Kiali that make it easy for developers and operators to manage traffic in the mesh. These wizards provide the ability to:

  • Create resilient services without modifying code by configuring Request Timeouts, Retries and Circuit Breakers.
  • Conduct A/B testing, canary deployments and more using Traffic Shifting between versions of a service.
  • Manage Request Routing rules based on headers, URIs and more.
  • Expose services externally by connecting them to a Gateway.
  • Configure mTLS and load balancer settings.
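As an example of the kind of configuration these wizards manage, the Python sketch below builds a VirtualService that shifts traffic between two versions of a service and sets a request timeout and retries. The function name, the host "reviews", the subset names and the weights are hypothetical, and the subsets would also need to be defined in a matching DestinationRule, which is omitted here.

```python
import json


def traffic_shift(host, weights, timeout=None, retry_attempts=None,
                  per_try_timeout="2s"):
    """Build a VirtualService splitting traffic across subsets by weight."""
    if sum(weights.values()) != 100:
        raise ValueError("route weights must add up to 100")
    http_route = {
        "route": [
            {"destination": {"host": host, "subset": subset}, "weight": weight}
            for subset, weight in weights.items()
        ],
    }
    if timeout is not None:
        http_route["timeout"] = timeout
    if retry_attempts is not None:
        http_route["retries"] = {
            "attempts": retry_attempts,
            "perTryTimeout": per_try_timeout,
        }
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "VirtualService",
        "metadata": {"name": host},
        "spec": {"hosts": [host], "http": [http_route]},
    }


# Hypothetical canary: send 10% of traffic to v2, with a 10s timeout and 3 retries.
canary = traffic_shift("reviews", {"v1": 90, "v2": 10},
                       timeout="10s", retry_attempts=3)
print(json.dumps(canary, indent=2))
```

Validating that the weights add up to 100 before building the manifest catches a common mistake early, rather than leaving it to be rejected when the resource is applied.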

Support for External Elasticsearch Instances

When installing OpenShift Service Mesh, users install Red Hat’s Elasticsearch operator. Elasticsearch is used to provide persistent storage for Jaeger. That said, some users have existing Elasticsearch clusters that they may want to use to reduce their overall storage costs. OpenShift Service Mesh now supports integration with external Elasticsearch instances.

Migrating to Service Mesh 2.0

Due to substantial control plane changes, as well as some API changes, OpenShift Service Mesh 2.0 is not backward compatible with 1.1.x deployments. Thus, a new control plane will have to be configured, with workloads migrated to the new service mesh. For a full list of the steps to move to Service Mesh 2.0, see the migration guide.

Going Forward

OpenShift Service Mesh brings significant new functionality, observability, and resiliency to your applications without adding any development work. The ecosystem of projects that contribute to OpenShift Service Mesh (Istio, Envoy, Kiali and Jaeger) is evolving at a rapid pace, promising even more functionality in the future. Red Hat’s deep integration with these projects, and its contributions to them, help ensure that OpenShift Service Mesh delivers that functionality to enterprises in a safe and secure way. In fact, Fortune 500 companies around the world are already taking advantage of a service mesh in their applications.

Get started now and get fine-grained traffic control, monitor the health of your services, improve security, and know the state of your applications from a single pane of glass.


About the author

Jamie Longmuir is a product manager at Red Hat focused on OpenShift Service Mesh. Prior to his journey as a product manager, Jamie spent much of his career as a software developer, often focusing on distributed systems and cloud infrastructure. Along the way, he has had stints as a field engineer and training developer working for both small startups and large enterprises.
