Istio - Service Mesh to provide Traffic Control, Security and Observability for Kubernetes

You may be hearing the term "service mesh" more and more often in discussions of container technology and cloud-native systems. According to the CNCF Survey 2020, conducted in May and June 2020 by the Cloud Native Computing Foundation (CNCF), which is part of the Linux Foundation, 27% of organizations already use a service mesh in production, and 42% are evaluating or planning to adopt one.

Service mesh usage survey results (CNCF SURVEY 2020)

So, what is this remarkable service mesh?

What is a "service mesh"?

In a nutshell, a service mesh can be described as "a set of functions to solve network challenges in microservices architecture". Therefore, before explaining service mesh, we first explain what microservices architecture is and what challenges it introduces.

Microservices architecture

In conventional system development, the norm has been "monolithic architecture," in which all functions required for an application are implemented in a single component (for example, a single server). With the rise of container technology in recent years, microservices architecture has become attractive: each function required for an application is treated as an independent service and split into its own component, and the entire application is assembled by linking these services together. Microservices architecture makes it easier to keep dependencies among services loosely coupled, which brings advantages such as improved reusability, faster release cycles thanks to smaller development units, and the ability to scale each service in or out independently.

Challenges of microservices architecture

Because multiple services work together to form a single system, microservices architecture raises the following issues, which did not need to be considered in monolithic architecture.

Traffic control between services

  1. How to control routing from one service to a service with multiple versions (blue-green deployment, canary release)
  2. How to isolate a failed service in order to localize the impact
  3. How to handle requests that exceed the capacity of a particular service

Ensure security of inter-service communication

  1. How to protect communications between services
  2. How to determine if communication is allowed or denied

Understanding the overall architecture and the connections between services

  1. How to understand system configurations and communication flows that have become complex due to the coordination of numerous services

Once again, what is service mesh?

Taken literally, a "service mesh" is a "net of services": a state in which a large number of services are intricately interconnected, as in a microservices architecture.

In addition, as mentioned above, microservices architecture requires the ability to flexibly launch and manage a large number of services. It is therefore generally built on containers, which have lower launch and scaling costs than virtual machines, with service mesh functionality added to them.

For this reason, the following explanation is based on the assumption that microservice architecture and service meshes are deployed on Kubernetes.

Basic service mesh architecture

To solve the aforementioned microservice architecture challenges, each microservice unit must have the following capabilities:

  • Control ingress traffic for the service.
  • Control egress traffic for the service.
  • Send metrics for the service's ingress and egress traffic.

The simplest way to achieve this would be to implement the logic governing inter-service communication in the applications that make up each service. However, since it is difficult to maintain the relevant logic across all services as development progresses, this method of absorbing everything at the application layer is not very realistic.
Therefore, early microservices practitioners tried to keep application developers focused on business logic by providing this communication logic as a shared library for each language.

Inter-service communication using library

However, this method had the following issues:

  • Supported programming languages are limited.
  • Even if libraries are prepared for each language, it is difficult to keep functionality consistent across languages (e.g., if a change is made to the Java library, the Go library also needs to be modified).

These issues matter because a microservices architecture, in particular, often uses different languages for different services.

Against the backdrop of these issues, the current approach to implementing service meshes is to insert functionality into the infrastructure layer rather than the application layer. Inserting functionality in the infrastructure layer has the advantage of implementing consistent functionality regardless of language, compared to using libraries.

Service mesh architecture

A typical service mesh consists of an L7 proxy running as a sidecar alongside each service and a management process for those proxies. The sidecar proxies form the "data plane" of the service mesh, and the management process is called the "control plane".

Control Plane

The control plane has components that support management of the service mesh. Examples include components that manage data plane traffic rules, components that manage security aspects such as certificates, and components that collect and aggregate metrics.

Data Plane

The data plane consists of proxies that are usually inserted transparently as sidecars. Each proxy is configured to handle all network traffic to and from its application, which allows it to control traffic, apply security protections, and collect metrics.
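
As a rough illustration of this idea, the following Python sketch wraps an application in a toy "sidecar" that sees every inbound request, enforces a simple traffic rule, and records metrics, all without changing the application itself. The names `SidecarProxy` and `orders_app` and the blocked `/admin` path are hypothetical and exist only for this example; a real data plane proxy such as Envoy works at the network level rather than as an in-process wrapper.

```python
import time
from collections import Counter
from typing import Callable

# Hypothetical stand-in for an application: takes a request path,
# returns an HTTP-like status code.
Handler = Callable[[str], int]


class SidecarProxy:
    """Toy data-plane proxy that sits in front of exactly one service."""

    def __init__(self, service_name: str, app: Handler,
                 blocked_paths: frozenset = frozenset()):
        self.service_name = service_name
        self.app = app
        self.blocked_paths = blocked_paths
        self.status_counts = Counter()  # number of responses per status code
        self.total_seconds = 0.0        # time spent handling requests

    def handle(self, path: str) -> int:
        start = time.perf_counter()
        if path in self.blocked_paths:
            status = 403  # rule enforced by the proxy, not by the application
        else:
            status = self.app(path)
        self.total_seconds += time.perf_counter() - start
        self.status_counts[status] += 1
        return status


def orders_app(path: str) -> int:
    """The application only contains business logic."""
    return 200 if path == "/orders" else 404


proxy = SidecarProxy("orders", orders_app, blocked_paths=frozenset({"/admin"}))
results = [proxy.handle(p) for p in ["/orders", "/admin", "/missing", "/orders"]]
print(results)
print(dict(proxy.status_counts))
```

The application function knows nothing about the blocking rule or the metrics, which is the property that lets a mesh add these features without code changes.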

What is "Istio"?

Istio is an open source software (OSS) service mesh developed by Google, IBM, and Lyft, which reached version 1.0 in July 2018. Istio was initially designed for Kubernetes only, but it is now platform-independent and can run in a variety of environments, including cloud, on-premises, and Kubernetes.
In recent years, commercial support has also become available through Istio-based products such as Google's Anthos Service Mesh and Red Hat OpenShift Service Mesh.

Istio architecture overview

Istio's architecture is shown in the figure below. It consists of the data plane and control plane introduced in "Basic service mesh architecture."

Architecture diagram (from Istio's official website)

Let's go through the Istio components.

Data plane (Envoy)

Istio's data plane uses an extended version of the Envoy proxy. Envoy is a high-performance OSS proxy written in C++ that covers L3/L4/L7 and supports HTTP, HTTPS, gRPC, and TCP. It became a CNCF-hosted project in September 2017 and was certified as a Graduated project in November 2018.

Istio deploys an Envoy proxy into each application Pod as a sidecar. Envoy mediates all communication for the Pod, which lets it control that traffic and collect telemetry (communication content, response times, trace information, etc.).

Istio can deploy the Envoy proxy automatically. In addition, since Envoy runs as a sidecar proxy, the application does not need to be aware of the proxy's existence. Thus, developers can apply Istio to existing applications without any modification to their code or manifests.

Control plane (Istiod)

A component called "Istiod" serves as the control plane in Istio. Istiod is provided as a single binary and offers functions such as service discovery, configuration management, and certificate management. These functions are performed by the "Pilot," "Galley," and "Citadel" components within Istiod, respectively.

  • Pilot
    Provides service discovery for the Envoy proxies. It also provides traffic control (A/B testing, canary releases, etc.) by propagating routing rules to each Envoy proxy.
  • Galley
    Validates and distributes Istio configuration. It serves to isolate Istio from the underlying platform (Kubernetes, VMs, etc.).
  • Citadel
    Acts as a Certificate Authority (CA) to enable service-to-service and end-user authentication.
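
The way a control plane propagates routing rules to every proxy can be sketched in a few lines of Python. This is only a conceptual model under simplifying assumptions: the `ControlPlane` and `Proxy` classes, the version counter, and the service names are hypothetical, and the sketch does not reproduce how Istiod actually talks to Envoy.

```python
from typing import Dict, List


class Proxy:
    """Toy sidecar proxy that only remembers the last configuration it was sent."""

    def __init__(self, name: str):
        self.name = name
        self.version = 0
        self.routes: Dict[str, str] = {}

    def apply(self, version: int, routes: Dict[str, str]) -> None:
        self.version = version
        self.routes = dict(routes)  # keep a private copy of the rules


class ControlPlane:
    """Toy control plane: the single source of truth for routing rules."""

    def __init__(self) -> None:
        self.version = 0
        self.routes: Dict[str, str] = {}
        self.proxies: List[Proxy] = []

    def register(self, proxy: Proxy) -> None:
        # A newly started proxy immediately receives the current rules.
        self.proxies.append(proxy)
        proxy.apply(self.version, self.routes)

    def set_route(self, service: str, destination: str) -> None:
        # Every change bumps the version and is pushed to all known proxies.
        self.routes[service] = destination
        self.version += 1
        for proxy in self.proxies:
            proxy.apply(self.version, self.routes)


control_plane = ControlPlane()
frontend = Proxy("frontend")
control_plane.register(frontend)
control_plane.set_route("reviews", "reviews-v1")
control_plane.set_route("reviews", "reviews-v2")

orders = Proxy("orders")  # started after the rule changes
control_plane.register(orders)
print(frontend.routes, orders.routes, orders.version)
```

Because the rules live in one place and are pushed outward, a proxy that starts late ends up with the same view as one that has been running all along.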

Istio features

Istio's capabilities allow you to control traffic and improve security and observability without modifying the application.

Traffic management (traffic control)

Istio's traffic control features allow you to control the flow of traffic and API calls between services.
The following table shows typical resources used for Istio's traffic control functions. For other Istio resources related to traffic control, please refer to Istio's official website.

Istio resource: Gateway
Overview: Manages traffic entering and leaving the mesh. Ingress gateways sit at the boundary of the mesh and receive traffic from outside the mesh.
Functions:
  • Configuration of ports to be published
  • Per-host routing configuration
  • TLS configuration for endpoints
  • Redirection of HTTP connections to HTTPS

Istio resource: Virtual Service
Overview: Configures how requests are routed to services in the service mesh.
Functions:
  • Path-based routing
  • Percentage-based traffic splitting
  • Addition or deletion of HTTP request headers
  • HTTP request URL rewriting
  • HTTP request timeout configuration
  • HTTP request retry configuration
  • Traffic mirroring
  • Fault injection

Istio resource: Destination Rule
Overview: Configures how traffic behaves once it reaches a routing destination.
Functions:
  • Configuration of load balancer algorithms
  • Routing control based on the region where a workload runs
  • Configuration of connection pooling over TCP or HTTP
  • Configuration of circuit breakers for abnormal hosts
  • Configuration of TLS for connections
  • Control of HTTP connections (maximum number, retry count, timeout control, etc.)

Istio resource: Service Entry
Overview: Adds entries for external services to the internal service registry.
Functions:
  • Add instances outside the mesh to the registry
  • Route to the appropriate external services
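
To make percentage-based traffic splitting more concrete, here is a minimal Python sketch of weighted routing for a canary release. It is not Istio's implementation: the function `pick_subset`, the subset names `v1` and `v2`, and the rule that weights must add up to 100 are assumptions made for this example.

```python
import random
from collections import Counter
from typing import List, Tuple


def pick_subset(routes: List[Tuple[str, int]], rng: random.Random) -> str:
    """Pick a destination subset with probability proportional to its weight."""
    if sum(weight for _, weight in routes) != 100:
        raise ValueError("weights must sum to 100 in this sketch")
    point = rng.random() * 100  # uniform in [0, 100)
    cumulative = 0
    for subset, weight in routes:
        cumulative += weight
        if point < cumulative:
            return subset
    raise AssertionError("unreachable: cumulative weight reaches 100")


# Canary release: 90% of requests stay on v1, 10% go to the new v2.
canary_routes = [("v1", 90), ("v2", 10)]
rng = random.Random(0)  # fixed seed so the run is repeatable
counts = Counter(pick_subset(canary_routes, rng) for _ in range(10_000))
print(counts)
```

Over many requests the observed shares approach the configured weights, so raising the canary's weight step by step shifts traffic gradually without touching either version of the application.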

Schematic diagram of traffic control
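
The "circuit breakers for abnormal hosts" listed under Destination Rule can be pictured with the following Python sketch, which temporarily removes a host from the load-balancing pool after several consecutive errors. The class `OutlierDetector`, its thresholds, and the host names are hypothetical, and the logic is a simplification rather than Envoy's actual algorithm.

```python
from typing import Dict, List


class OutlierDetector:
    """Eject a host for a fixed time after N consecutive failed requests."""

    def __init__(self, consecutive_errors: int = 3, ejection_seconds: float = 30.0):
        self.consecutive_errors = consecutive_errors
        self.ejection_seconds = ejection_seconds
        self._errors: Dict[str, int] = {}
        self._ejected_until: Dict[str, float] = {}

    def record(self, host: str, ok: bool, now: float) -> None:
        """Record the outcome of one request sent to `host` at time `now`."""
        if ok:
            self._errors[host] = 0  # a success resets the streak
            return
        self._errors[host] = self._errors.get(host, 0) + 1
        if self._errors[host] >= self.consecutive_errors:
            self._ejected_until[host] = now + self.ejection_seconds
            self._errors[host] = 0

    def healthy_hosts(self, hosts: List[str], now: float) -> List[str]:
        """Return the hosts that are currently allowed to receive traffic."""
        return [h for h in hosts if self._ejected_until.get(h, 0.0) <= now]


hosts = ["reviews-a", "reviews-b"]
detector = OutlierDetector(consecutive_errors=3, ejection_seconds=30.0)
for _ in range(3):
    detector.record("reviews-b", ok=False, now=0.0)

print(detector.healthy_hosts(hosts, now=1.0))   # while reviews-b is ejected
print(detector.healthy_hosts(hosts, now=31.0))  # after the ejection period
```

Passing the time in as `now` keeps the sketch deterministic. The important behavior is that a failing host stops receiving requests for a while, which localizes the impact of the failure.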

Security

Istio allows you to secure microservices without changing application code or infrastructure. Istio's security features protect services and data by providing strong identities, strong policies, transparent TLS encryption, and authentication, authorization, and audit tools. This makes it possible to build so-called "zero-trust networks."
The table below is a representative list of Istio's resources related to security. For other resources, please refer to Istio's official website.

Istio resource: Authorization Policy
Summary: Enables access control for workloads within the mesh

Istio resource: Peer Authentication
Summary: Defines mutual TLS settings for workloads

Istio resource: Request Authentication
Summary: Defines the request authentication methods supported by a workload
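
The kind of decision an authorization policy expresses can be sketched as follows. This is a simplified allow-list model in Python, not Istio's policy engine: the `AllowRule` class, the `is_allowed` function, and the identity strings are hypothetical, and real policies support more actions and match conditions than shown here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AllowRule:
    principal: str    # caller identity, e.g. taken from its mutual TLS certificate
    method: str       # HTTP method the caller may use
    path_prefix: str  # paths the caller may access


def is_allowed(rules: List[AllowRule], principal: str, method: str, path: str) -> bool:
    """Allow a request only if at least one rule matches; deny everything else."""
    return any(
        rule.principal == principal
        and rule.method == method
        and path.startswith(rule.path_prefix)
        for rule in rules
    )


# Hypothetical policy for an "orders" workload: only the frontend may read orders.
orders_rules = [
    AllowRule(principal="cluster.local/ns/shop/sa/frontend",
              method="GET", path_prefix="/orders"),
]

print(is_allowed(orders_rules, "cluster.local/ns/shop/sa/frontend", "GET", "/orders/42"))
print(is_allowed(orders_rules, "cluster.local/ns/shop/sa/frontend", "DELETE", "/orders/42"))
print(is_allowed(orders_rules, "cluster.local/ns/shop/sa/batch", "GET", "/orders/42"))
```

The decision is based on who the caller is, not on where in the network the request comes from, which is the core idea behind a zero-trust approach.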

Observability

Since traffic between microservices becomes more complex in microservices architecture, observability, such as gathering traffic information, is more important than in legacy architectures. Istio provides metrics, logging, and tracing capabilities that support observability, making it easy to get a complete picture of the system and its traffic.

Type of telemetry: Metrics
Summary:
  • Request and response metrics are collected to help you understand the state of communication
  • Metrics are collected by Prometheus and can be visualized with tools such as Grafana

Type of telemetry: Logs
Summary:
  • Sidecar proxies can collect access logs for traffic into and out of containers

Type of telemetry: Tracing
Summary:
  • The call flow between services can be traced, which helps with bottleneck analysis when performance degrades
  • Tools such as Kiali and Jaeger can be used to visualize the overall architecture and communication
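
As an example of how trace data supports bottleneck analysis, the Python sketch below computes how much time each service spent on its own work, excluding time spent waiting on the services it called. The `Span` structure, the service names, and the durations are made up for illustration and do not follow any particular tracing tool's format. The sketch also assumes that child calls run one after another rather than in parallel.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Span:
    span_id: str
    parent_id: Optional[str]  # None for the root span of the trace
    service: str
    duration_ms: int          # total time, including calls to other services


def self_time_by_service(spans: List[Span]) -> Dict[str, int]:
    """Return each service's own time: span duration minus its direct children."""
    child_total: Dict[str, int] = {}
    for span in spans:
        if span.parent_id is not None:
            child_total[span.parent_id] = (
                child_total.get(span.parent_id, 0) + span.duration_ms
            )
    result: Dict[str, int] = {}
    for span in spans:
        own = span.duration_ms - child_total.get(span.span_id, 0)
        result[span.service] = result.get(span.service, 0) + own
    return result


# One hypothetical request: frontend -> orders -> db
trace = [
    Span("a", None, "frontend", 120),
    Span("b", "a", "orders", 100),
    Span("c", "b", "db", 80),
]
own_times = self_time_by_service(trace)
bottleneck = max(own_times, key=own_times.get)
print(own_times, bottleneck)
```

Looking only at total durations would point at the frontend, since its span is the longest. Subtracting the time spent in downstream calls shows where the time is actually consumed.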

Summary

In this article, we introduced the concept of service mesh and gave an overview of Istio, an OSS service mesh, in relation to the challenges of microservices architecture. Although the term "service mesh" is abstract and covers many functions, it is an important technology that has been attracting attention in recent years.

Original Article

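This article is a translation and adaptation of an October 2021 ITmedia @IT article:
サービスメッシュ、Istioがマイクロサービスのトラフィック制御、セキュリティ、可観測性に欠かせない理由:Cloud Nativeチートシート(9) - @IT (itmedia.co.jp)
(English: "Why service mesh and Istio are essential for microservices traffic control, security, and observability: Cloud Native Cheat Sheet (9)")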