What is "Kubernetes Native" CI/CD: History, Key Tools, Pipeline and Flow to the Cloud Native Era

In this article, we introduce CI/CD (Continuous Integration/Continuous Delivery). In particular, we go into depth on "Kubernetes Native" CI/CD, which is built around Kubernetes (k8s).

With the wide variety of CI/CD tools available, it can be hard to know where to start. In the following sections, we explain the history and background of CI/CD and the functions a CI/CD setup should have, and then compare tools and look at usage trends.

History and Background – The Road to Kubernetes Native CI/CD

We present the history and background of CI/CD as the following four steps (Figure 1), looking separately at CI and CD and at the application (AP) layer and the platform layer.

Table of Contents

  1. From CI to CI/CD
  2. Platform layer: on-premise to cloud to containers (and k8s)
  3. AP layer: from monolith to microservices
  4. From CIOps to GitOps

Figure 1: History of CI/CD evolution

1. From CI to CI/CD

How do readers perceive the keywords CI/CD and DevOps, which have now become commonplace? When an article titled "Jenkins" came out, CI focused on compilation and testing, but deployment gradually became more important. The practice then came to be called "CI/CD" rather than "CI" to match the actual situation.

Contents Command
CI
  • Compile
  • Test
  • Package
Example commands with "Apache Maven"
  • mvn compile
  • mvn test
  • mvn package
CD
  • Deploy
Deployment examples using CLI
  • ftp (file placement)
  • curl (deployment to "Apache Tomcat")

Another distinctive evolution at this time would be the pipeline function. The pipeline function is a feature that defines a series of build and deployment flows such as A job -> B job -> C job.
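
As a rough sketch of this idea, the following Python snippet chains three jobs and stops at the first failure. The function name `run_pipeline` and the job names are hypothetical and chosen only for illustration; they are not the API of Jenkins or any other CI tool.

```python
from typing import Callable, List, Tuple

# (job name, function that returns True on success)
Job = Tuple[str, Callable[[], bool]]


def run_pipeline(jobs: List[Job]) -> List[str]:
    """Run jobs in order and stop at the first failure.

    Returns the names of the jobs that completed successfully.
    """
    completed: List[str] = []
    for name, job in jobs:
        if not job():
            break  # later jobs are skipped once one job fails
        completed.append(name)
    return completed


# Hypothetical jobs: A succeeds, B fails, so C is never started.
pipeline = [
    ("A: compile", lambda: True),
    ("B: test", lambda: False),
    ("C: deploy", lambda: True),
]
print(run_pipeline(pipeline))  # ['A: compile']
```

Because job B fails here, job C never runs, which is the behavior a pipeline provides without wiring individual jobs together by hand.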

Before the pipeline feature was introduced, in the Jenkins 1.x days when CI was just starting to emerge, we had to work hard to chain Maven jobs together in Jenkins. Since then, I feel that pipelines have become both more widespread and more convenient.

2. Platform layer: from on-premise to cloud to containers (and k8s)

In the on-premise era, if you wanted to achieve full automation through CI/CD or DevOps, you needed to be aware of each of the following layers.

  • AP: Deploy Java war file
  • Middleware: Configuration of application servers such as Tomcat and "Oracle WebLogic" and RDBMS configuration such as "MySQL" and "Oracle Database"
  • OS: Placement of configuration files and others in the file system

With the advancement of virtualization and cloud computing, these AP/middleware/OS configurations are being replaced by containers and cloud resources. In other words, the configurations directly connected to CI/CD tools are centralized on containers and cloud resources, and they can be managed at another abstracted layer.

Of course, it cannot be said that "everything will be managed by manifests", but it will be possible to manage them in a text-based manner at a more abstract and centralized layer. It also facilitates the realization of IaC (Infrastructure as Code, the coding of system resources).

Figure 2: Centralized and abstracted management

3. AP layer: From Monolith to Microservices

In the era of Digital Transformation (DX), there is an increasing demand for high-frequency service releases. Supporting technologies include cloud, agile, and microservices. Application units are smaller than in the monolithic era, allowing for more flexible releases.

Specifically, the scope of releases is limited to small microservices, making it easier to apply the various CI/CD best practices that have become commonplace in recent years.

  • A/B test
  • Blue/Green deployment
  • Canary release

On the other hand, applications can easily become siloed, and failures of individual microservices can lead to failures of the entire system or make it difficult to see the operation of the entire system. In addition, the behavior of the system becomes more complex, so more ingenuity is required in testing to ensure quality.

  • Prevention of system-wide failures: Circuit breakers
  • Quality Assurance: Chaos engineering, Contract tests
  • Visualization: Distributed tracing, Logging, Metrics
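
To make the circuit breaker idea more concrete, here is a minimal sketch in Python. It assumes a simple policy (open the circuit after a fixed number of consecutive failures, then allow one trial call after a cool-down period). The class, exception, and parameter names are hypothetical; real libraries and service meshes offer richer policies.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds
        self.clock = clock                # injectable so the sketch can be tested
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, func: Callable[[], object]) -> object:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Fail fast instead of waiting on a service known to be failing.
                raise CircuitOpenError("downstream call skipped")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # a success closes the circuit again
        return result
```

Wrapping calls to another microservice in `CircuitBreaker.call` means that a failing dependency is skipped quickly rather than dragging down every caller with it.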

4. From CIOps to GitOps

Recently, the concept of "GitOps" has emerged, and "software development techniques" are being incorporated into the CD domain.

Until now, CI tools have performed everything from build to deployment in one go (called "CIOps"). With GitOps, on the other hand, deployment is left to the CD tool, and the CI tool focuses on the build part.

Figure 3: Difference between CIOps and GitOps

The following four points are key to achieving GitOps with k8s. All of these techniques are commonly used in software development.

  • Delegation
  • Declarative programming
  • Single Source of Truth (Event Listener)
  • DRY

As a slight digression, the concept of "ChatOps," in which a chat tool serves as the developer's front end for operations, is also emerging, and the world of CI/CD and DevOps is evolving in extremely diverse ways.

Each of the above four points will be explained in the following sections.

Delegation

GitOps delegates the process to a CD tool, which has advantages in terms of security and avoiding complexity.

First, security. In the CIOps world, a single CI tool such as Jenkins or "CircleCI" handles everything from regression testing through build to deploy. Many of you may be using them in this way. However, this style of CI/CD requires granting the CI tool the permissions and credentials to deploy to the environment (that is, to call k8s APIs), which creates a security risk. With GitOps, the CD tool runs on the k8s cluster, so credentials can be managed within the cluster, which removes the need to hand them to an external CI tool.

Second, avoiding complexity. In CIOps, when deployments to different environments become necessary, CI tool configuration becomes more complex, and maintenance becomes more difficult. In GitOps, even if you need to deploy to different environments, you only need to monitor the same Git repository for the CD tools in each environment, so complexity is less likely to increase.

Declarative programming

The greatest value of k8s is its ability to define states (ToBe) instead of behaviors. If all deployments, including recoveries from failures, were defined as behaviors, you would need to define branching logic (if/else) for each possible state, which would make the configuration very complex.

By defining the state (ToBe) "declaratively" in a manifest file, you can concentrate entirely on defining the desired final state. Also, manifest files can be managed directly in a Git repository, making them highly compatible with Git.
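
The difference can be sketched with a toy example in Python. The user only writes down the desired state, and a generic function works out which actions are needed from the gap between desired and actual state. This is an illustration under simplified assumptions (state is just a replica count per workload, and `plan_actions` is a hypothetical name); it is not how k8s controllers are actually implemented.

```python
from typing import Dict, List


def plan_actions(desired: Dict[str, int], actual: Dict[str, int]) -> List[str]:
    """Compare desired and actual replica counts and list the actions needed."""
    actions: List[str] = []
    for name in sorted(desired.keys() | actual.keys()):
        want = desired.get(name, 0)  # not declared means "should not exist"
        have = actual.get(name, 0)
        if want > have:
            actions.append(f"start {want - have} x {name}")
        elif want < have:
            actions.append(f"stop {have - want} x {name}")
    return actions


# The declared state (ToBe) and a hypothetical observed state.
desired = {"web": 3, "api": 2}
actual = {"web": 1, "api": 2, "legacy": 1}
print(plan_actions(desired, actual))  # ['stop 1 x legacy', 'start 2 x web']
```

The same `desired` definition works whether the system is being deployed for the first time or recovering from a failure; no per-situation if/else has to be written by the user.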

Single Source of Truth (Event Listener)

Have you ever taken a temporary, one-off action on an automatically configured server and then forgotten to revert it? This is called "Configuration Drift". To avoid it, it is important to ensure "Immutable Infrastructure", which does not allow users to make changes without permission.

To put immutable infrastructure into practice, you must manage your k8s manifests in a Git repository, treat the manifests stored there as the "Single Source of Truth," and keep the deployed state always in sync with the Git repository.

"ArgoCD" and "Flux2" have a built-in Single Source of Truth and event listener to synchronize source and environment, allowing users to practice immutable infrastructure without being aware of it.

DRY (Don't Repeat Yourself)

If you copy and paste similar definitions and have to modify every duplicated configuration file for each change, productivity drops and quality becomes hard to maintain. For example, if manifest files for the development and production environments are maintained separately, every update to one file must also be reflected in the other.

Practicing DRY, that is, consolidating shared content so that the same thing is never described in more than one place, keeps CI/CD simple.

Tools such as "Kustomize" have the DRY concept built-in, allowing users to benefit from DRY without being aware of it.
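
The underlying idea of a shared base plus small per-environment patches can be sketched in Python as follows. This is only an illustration of the principle: the merge function `apply_overlay` and the sample settings are hypothetical, and Kustomize's actual patching rules are more elaborate than this recursive dictionary merge.

```python
import copy
from typing import Any, Dict


def apply_overlay(base: Dict[str, Any], patch: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of base with patch merged in recursively (patch wins)."""
    merged = copy.deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overlay(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Shared definition, written once.
base = {
    "image": "app:v1.0",
    "replicas": 1,
    "resources": {"cpu": "100m", "memory": "128Mi"},
}
# Each environment describes only what differs from the base.
overlays = {
    "dev": {},
    "prod": {"replicas": 3, "resources": {"cpu": "500m"}},
}
manifests = {env: apply_overlay(base, patch) for env, patch in overlays.items()}
print(manifests["prod"])
```

A change to the shared part, such as the image version, is made once in `base` and reaches every environment, while the prod-specific values stay in one small patch.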

Comparison of CI/CD tools

Based on the previous section, we compare typical tools (Figure 4).

Figure 4: Comparison table of typical CI/CD

First, we introduce CI-based tools. In Open Source Software (OSS), there are tools such as Jenkins and Concourse CI. There are also k8s-specific tools such as Tekton and Jenkins X.

Managed services from cloud vendors include Amazon Web Services' (AWS) "AWS CodePipeline" and the other "Code" series services, Microsoft Azure's "Azure Pipelines," and Google Cloud Platform's (GCP) "Google Cloud Build."

SaaS-type tools such as CircleCI and Travis CI are available. Tools like GitLab (GitLab CI/CD) and GitHub, which have Git at their core, are also available. There are also tools that specialize in CD such as Spinnaker, ArgoCD, and Flux2.

Usage of CI/CD tools

Let us get an idea of how tools are being used based on CNCF's survey reports. The most recent report at the time of writing is the November 2020 report (PDF), so we quote that here (Figure 5).

Figure 5: CNCF CI/CD Tool Usage Report

The top CI tools used are Jenkins (53%), GitLab CI/CD (36%), and GitHub Actions (20%), and this indicates that Jenkins, the original CI/CD tool, is used by many users as a cloud-native CI/CD tool.

In addition, while the use of "Git" is becoming more common in development, tools that integrate CI/CD with Git's repository services are also gaining popularity because they are easy to start.

Among CD tools, ArgoCD is clearly very widely used.

In the next installments of this series, we will focus on GitLab CI/CD, a CI tool that integrates with Git repositories, makes it easy to practice CI triggered by merge requests (pull requests), and offers plenty of information on third-party integration scripts such as image scanning for DevSecOps. ArgoCD will be discussed as the CD tool.

CI/CD Pipeline in the Cloud Native Era

We have discussed CI/CD tools; now we will examine how to assemble a CI/CD flow using them. We will take a step-by-step look, with reference to cloud-native, container-based practices and branching strategies.

CI/CD Practices in Containers

Similarly, for container applications, it is desirable to keep the Git repository (application code) as the Single Source of Truth. In other words, the CI/CD flow should be such that there is exactly one container image corresponding to a particular version of the Git repository, and deployments to dev and prod environments should promote that same container image.

Identical container images can be verified by checking if the digest values of the images are identical.
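
One way to picture the digest check is shown below. Container registries typically identify image content by a SHA-256 digest written as "sha256:<hex>". This sketch simply hashes byte strings that stand in for image manifests, so the sample data and the function names `digest` and `same_image` are assumptions for illustration only.

```python
import hashlib


def digest(content: bytes) -> str:
    """Return a content digest in the 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def same_image(dev_digest: str, prod_digest: str) -> bool:
    """Two deployments use the identical image only if the digests match."""
    return dev_digest == prod_digest


# Stand-ins for image manifests: one built by CI, one rebuilt later by hand.
built_by_ci = b'{"layers": ["aaa", "bbb"]}'
rebuilt_later = b'{"layers": ["aaa", "ccc"]}'

print(same_image(digest(built_by_ci), digest(built_by_ci)))    # True
print(same_image(digest(built_by_ci), digest(rebuilt_later)))  # False
```

A tag such as "v1.0" can be moved to point at different content, whereas a digest changes whenever the content changes, which is why the digest is the value to compare.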

Figure 6: Distribution of identical container images

You may also want to consider the following other practices:

  • Keep container images small (use ".dockerignore" to prevent unnecessary files from being included)
  • Utilize tags to keep version, stability, purpose, etc. easy to understand
  • Use multi-stage builds and caching to speed up builds and minimize images
  • Enable features such as container scanning to keep container images secure

Branch strategy

In most cases where Git is used, either Git Flow, GitHub Flow, or GitLab Flow is employed. This article does not provide a detailed comparison, but they are used in the following cases.

Flow: Use case
  • Git Flow: Complex application code needs to be managed and released over some period of time
  • GitHub Flow: The application is simpler than in the Git Flow case and is deployed multiple times a day
  • GitLab Flow: Complex management is required to isolate application code for each environment

Adopting the GitOps concept described above separates CI from CD, which also means that the lifecycle of application code in a Git repository is different from the lifecycle of deploying it to the environment. Therefore, managing the application code and the deployment manifest (e.g. k8s' deployment.yaml) in separate Git repositories keeps lifecycle management simple. For environment information, you can also use k8s' "ConfigMap" and "Secret" resources to eliminate environment information from the application code repository.
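
On the application side, this can look like the Python sketch below. k8s can expose ConfigMap and Secret values to a container as environment variables, so the code reads its settings from the environment instead of hard-coding them per environment. The variable names `DATABASE_URL` and `LOG_LEVEL` and the `Settings` class are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from environment variables injected at deploy time."""
    return Settings(
        database_url=env["DATABASE_URL"],        # required: KeyError if missing
        log_level=env.get("LOG_LEVEL", "INFO"),  # optional, with a default
    )


# The same code serves every environment; only the injected values differ.
dev = load_settings({"DATABASE_URL": "postgres://dev-db/app", "LOG_LEVEL": "DEBUG"})
prod = load_settings({"DATABASE_URL": "postgres://prod-db/app"})
print(dev.log_level, prod.log_level)  # DEBUG INFO
```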

In the case of developing microservice-like applications, it is recommended to use GitHub Flow for the Git repository that manages the application to ensure easy deployment of microservices. Tagging is a good way to manage the version of the application.

In the Git repository that manages deployment manifests, you can also use Kustomize to manage environment differences more easily. With tools like Kustomize, environment information is managed in patch files rather than in separate branches, so there is no need for environment branches as in GitLab Flow. GitHub Flow is also a good choice for this case.

CI/CD Flow in the Cloud-Native Era

We will examine the CI/CD flow with reference to branching strategies and other factors. First, after explaining the CI flow and CD flow in detail, we will explain the overall CI/CD flow from a higher point of view.

CI Flow

Pipelines run by CI tools are required to perform tasks for verifying quality and container image creation. Typical tasks in the CI pipeline include the following:

  • Static analysis of source code (e.g., "SonarQube")
  • Regression test
  • Container build
  • Container scan
  • Container push

It is a good idea to perform CI at each point in the (GitHub Flow) development workflow (pull/merge requests, master merge, etc.) so that you can always confirm that quality is being maintained. Also, for the container build, you may want to assign Git branches, tags, or commit hash values to image tags to make them easier to manage by matching the state of the Git repository at a specific point in time with the container image.
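
A possible naming scheme is sketched below in Python: use the Git tag when the build is a release, and otherwise combine the branch name with a short commit hash. The scheme and the function name `image_tag` are assumptions for illustration. The character replacement reflects the usual image tag rules (letters, digits, underscores, periods, and hyphens, not starting with a period or hyphen, at most 128 characters).

```python
import re


def image_tag(branch: str, commit_sha: str, git_tag: str = "") -> str:
    """Derive a container image tag from the state of the Git repository."""
    if git_tag:
        return git_tag  # release builds reuse the Git tag as-is
    # Replace characters such as "/" that are not allowed in image tags.
    safe_branch = re.sub(r"[^A-Za-z0-9_.-]", "-", branch)
    tag = f"{safe_branch}-{commit_sha[:7]}"
    return tag.lstrip(".-")[:128]


sha = "9fceb02d0ae598e95dc970b74767f19372d61af8"
print(image_tag("feature/login", sha))   # feature-login-9fceb02
print(image_tag("master", sha, "v1.0"))  # v1.0
```

With a scheme like this, anyone looking at a running container can trace its image back to the exact commit that produced it.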

Let's look at an image of CI implementation (Figure 7).

Figure 7: CI flow

  1. Developer creates a branch from master and commits source code developed in the branch
  2. Developer pushes branch to Git repository and creates pull/merge request when source code development is complete
  3. CI tool is triggered by the change and executes CI ([1] in Fig. 7)
  4. Quality verification by regression testing, etc. with CI tools ([2] in Fig. 7)
  5. If there is no problem with quality verification, container build & push with CI tools ([3] in Fig. 7)

CD Flow

Next, I will explain the CD flow based on GitOps, which, as mentioned above, manages the Git repository for deploy manifests separately from the application code. Also, Kustomize makes it easier to manage manifests because it can manage only the diff patches for each environment.

Figure 8 shows an example of updating the prod environment from an already deployed container image tagged "v0.x" to a new container image tagged "v1.0". It is assumed that the v1.0 container image has already been stored in the container registry.

Figure 8: CD flow

  1. Create a branch from master and modify the deploy manifest in the branch (modify container image tag to v1.0)
  2. Push branch to Git repository and create pull/merge request
  3. Merge pull/merge request to master
  4. GitOps tools monitor the Git repository and detect changes ([1] in Figure 8)
  5. GitOps tool detects differences from the current state (v0.x) of the k8s cluster and synchronizes the cluster state with the Git repository state (container image v1.0 deployment manifest is applied) ([2] in Figure 8)
  6. Deploy container image v1.0 to k8s cluster ([3] in Figure 8)
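
Steps 4 to 6 can be pictured with the toy Python model below. It assumes a GitOps tool that polls the repository, remembers the last revision it synchronized, and applies the manifest when a new revision appears. The class `GitOpsAgent`, its method, and the dictionaries standing in for the repository and the cluster are all hypothetical and are not the API of ArgoCD or Flux2.

```python
from typing import Callable, Dict, Optional


class GitOpsAgent:
    """Toy stand-in for a GitOps tool running inside the cluster."""

    def __init__(self, read_repo: Callable[[], Dict[str, str]],
                 cluster: Dict[str, str]) -> None:
        self.read_repo = read_repo  # returns {"revision": ..., "image": ...}
        self.cluster = cluster      # mutable dict standing in for cluster state
        self.synced_revision: Optional[str] = None

    def poll_once(self) -> bool:
        """Check the repository once; return True if the cluster was changed."""
        repo = self.read_repo()
        if repo["revision"] == self.synced_revision:
            return False  # step 4: no new commit detected
        changed = self.cluster.get("image") != repo["image"]  # step 5: compare
        self.cluster["image"] = repo["image"]                 # step 6: deploy
        self.synced_revision = repo["revision"]
        return changed


repo = {"revision": "c1", "image": "app:v0.9"}
cluster = {"image": "app:v0.9"}
agent = GitOpsAgent(lambda: repo, cluster)

agent.poll_once()                             # initial sync, nothing to change
repo.update(revision="c2", image="app:v1.0")  # pull/merge request merged to master
print(agent.poll_once(), cluster["image"])    # True app:v1.0
```

Note that nothing outside the cluster pushes the change; the agent pulls it, which matches the delegation described earlier.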

Overall CI/CD flow

Figure 9 shows the overall flow including the CI and CD flows described above.

Figure 9: CI/CD flow

  1. Push code to Git repository (application code), pull/merge requests, master merge, tag creation
  2. CI tool executes CI triggered by changes in the Git repository (application code) ([1] in Figure 9)
  3. Quality verification testing by regression testing, etc. with CI tools ([2] in Figure 9)
  4. Container build & push with CI tools if quality verification is OK ([3] in Figure 9)
  5. Create branch from master Git repository (deploy manifest), modify deploy manifest in branch, push branch to Git repository, create pull/merge request.
  6. Merge pull/merge requests from Git repository (deploy manifest) into master.
  7. GitOps tools monitor the Git repository to detect changes ([4] in Figure 9)
  8. GitOps tool detects differences from the current state of the k8s cluster and synchronizes the cluster state with the Git repository state ([5] in Figure 9)
  9. Deploy the new container image to the k8s cluster ([6] in Figure 9)

What we have introduced in this article is just one example of a CI/CD flow, but if you develop a flow based on this, I believe you can build a CI/CD flow with a cloud-native approach.

Original Article

This article is a translation and adaptation of a May 2021 ITmedia@IT article 「Kubernetes Native」なCI/CDとは何か――クラウドネイティブ時代に至る歴史、主要ツール、パイプラインとフローの在り方:Cloud Nativeチートシート(5) - @IT (itmedia.co.jp)