docs: add k8s 101 (#8107)
weicao committed Sep 9, 2024
1 parent 1a92d24 commit 5c6f584
Showing 6 changed files with 113 additions and 7 deletions.
Binary file modified docs/img/kubeblocks-architecture.png
2 changes: 1 addition & 1 deletion docs/user_docs/overview/about-this-manual.md
@@ -2,6 +2,6 @@
title: About this manual
description: KubeBlocks, kbcli, how to
keywords: [kubeblocks, overview, introduction]
sidebar_position: 4
sidebar_position: 5
---
This manual introduces how to operate KubeBlocks with `kbcli`. For advanced users familiar with Kubernetes, this manual also includes guidance on how to operate KubeBlocks using `helm` and `kubectl`.
2 changes: 1 addition & 1 deletion docs/user_docs/overview/concept.md
@@ -128,7 +128,7 @@ Only Addon developers need to understand the ClusterDefinition and ComponentDefi
#### ClusterDefinition
ClusterDefinition is an API used to define all available topologies of a database cluster, offering a variety of topological configurations to meet diverse deployment needs and scenarios.

Each topology includes a list of component, each linked to a ComponentDefinition, which enhances reusability and reduce redundancy. For example, widely used components such as etcd and Zookeeper can be defined once and reused across multiple ClusterDefinitions, simplifying the setup of new systems.
Each topology includes a list of components, each linked to a ComponentDefinition, which enhances reusability and reduces redundancy. For example, the ComponentDefinitions of widely used components such as etcd and Zookeeper can be defined once and reused across multiple ClusterDefinitions, simplifying the setup of new systems.

Additionally, ClusterDefinition also specifies the sequence of startup, upgrade, and shutdown for components, ensuring a controlled and predictable management of component lifecycles.

8 changes: 4 additions & 4 deletions docs/user_docs/overview/introduction.md
@@ -1,6 +1,6 @@
---
title: KubeBlocks overview
description: KubeBlocks, kbcli, multicloud
title: Introduction
description: introduction to KubeBlocks
keywords: [kubeblocks, overview, introduction]
sidebar_position: 1
---
@@ -12,7 +12,7 @@ import TabItem from '@theme/TabItem';

## What is KubeBlocks

KubeBlocks is an open-source Kubernetes operator for databases, enabling users to run and manage multiple types of databases on Kubernetes. As far as we know, most database operators typically manage only one specific type of database. For example:
KubeBlocks is an open-source Kubernetes operator for databases (more specifically, for stateful applications, including databases and middleware like message queues), enabling users to run and manage multiple types of databases on Kubernetes. As far as we know, most database operators typically manage only one specific type of database. For example:
- CloudNativePG, Zalando, CrunchyData, StackGres operator can manage PostgreSQL
- Strimzi manages Kafka
- Oracle and Percona MySQL operator manage MySQL
@@ -217,7 +217,7 @@ This means that managing multiple databases on Kubernetes becomes simple, effici
- Rolling upgrades
- Decommission a specific replica
- Minor version upgrades
- In addition to the declarative API, KubeBlocks also offers an Ops API for executing one-time operational tasks on database clusters. The Ops API supports additional features such as queuing, concurrency control, progress tracking, and operation rollback.
- In addition to the declarative API, KubeBlocks also offers an OpsRequest API for executing one-time operational tasks on database clusters. The OpsRequest API supports additional features such as queuing, concurrency control, progress tracking, and operation rollback.
- Observability: Supports integration with Prometheus and Grafana.
- Includes a powerful and intuitive command-line tool `kbcli`, which makes operating KubeBlocks CRs on Kubernetes more straightforward and reduces keystrokes. For those well-versed in Kubernetes, kbcli can be used alongside kubectl to provide a more streamlined way of performing operations.

106 changes: 106 additions & 0 deletions docs/user_docs/overview/kubernetes_and_operator_101.md
@@ -0,0 +1,106 @@
---
title: Kubernetes and Operator 101
description: things about K8s you need to know
keywords: [K8s, operator, concept]
sidebar_position: 3
---

# Kubernetes and Operator 101

# K8s
What is Kubernetes? Some say it's a container orchestration system, others describe it as a distributed operating system, while some view it as a multi-cloud PaaS (Platform as a Service) platform, and others consider it a platform for building PaaS solutions.

This article will introduce the key concepts and building blocks within Kubernetes.

## K8s Control Plane
The Kubernetes Control Plane is the brain and heart of Kubernetes. It manages the overall operation of the cluster, including processing API requests, storing configuration data, and ensuring the cluster's desired state. Key components include the API Server (which handles communication), etcd (which stores all cluster data), the Controller Manager (which enforces the desired state), the Scheduler (which assigns workloads to Nodes), and the Cloud Controller Manager (which manages cloud-specific integrations, such as load balancers, storage, and networking). Together, these components orchestrate the deployment, scaling, and management of containers across the cluster.


## Node
Some describe Kubernetes as a distributed operating system, capable of managing many Nodes. A Node is a physical or virtual machine that acts as a worker within the cluster. Each Node runs essential services, including the container runtime (such as Docker or containerd), the kubelet, and the kube-proxy. The kubelet ensures that containers are running as specified in a Pod, the smallest deployable unit in Kubernetes. The kube-proxy maintains network rules on the Node, enabling communication between Pods and Services. Nodes provide the computational resources needed to run containerized applications and are managed by the Kubernetes Control Plane, which distributes tasks, monitors Node health, and maintains the desired state of the cluster.

:::note
In certain contexts, the term "Node" can be confusing when discussing Kubernetes (K8s) alongside databases. In Kubernetes, a "Node" refers to a physical or virtual machine that is part of the Kubernetes cluster and serves as a worker to run containerized applications. However, when a database is running within Kubernetes, the term "Database Node" typically refers to a Pod that hosts a database instance.

In the KubeBlocks documentation, "Node" generally refers to a Database Node. If we are referring to a Kubernetes Node, we will explicitly specify it as a "K8s Node" to avoid any confusion.
:::

## kubelet
The kubelet is the agent that the Kubernetes Control Plane uses to manage each Node in the cluster. It ensures that containers are running in a Pod as defined by the Kubernetes control plane. The kubelet continuously monitors the state of the containers, making sure they are healthy and running as expected. If a container fails, the kubelet attempts to restart it according to the specified policies.


## Pod
In Kubernetes, a Pod is somewhat analogous to a virtual machine but is much more lightweight and specialized. It is the smallest deployable unit in Kubernetes.
It represents one or more containers that are tightly coupled and need to work together, along with shared storage (volumes), network resources, and a specification for how to run the containers. These containers share the Pod's network namespace, so they can communicate with each other using localhost, and they can share data through the Pod's volumes.

Kubernetes dynamically manages Pods, ensuring they are running as specified and automatically restarting or replacing them if they fail. Pods can be distributed across Nodes for redundancy, making them fundamental to deploying and managing containerized applications (including databases) in Kubernetes.
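
As a rough illustration of the description above, the following manifest sketches a Pod with two tightly coupled containers that share one volume. The Pod name, container names, images, and paths are assumptions chosen for the example and do not come from KubeBlocks.

```yaml
# Hypothetical Pod: all names, images, and paths are illustrative only.
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
  labels:
    app: demo
spec:
  containers:
    - name: web                      # main container, serves files from the shared volume
      image: nginx:1.27
      ports:
        - containerPort: 80
      volumeMounts:
        - name: shared-data
          mountPath: /usr/share/nginx/html
    - name: content-writer           # sidecar, writes a file that the main container serves
      image: busybox:1.36
      command: ["sh", "-c", "while true; do date > /work/index.html; sleep 5; done"]
      volumeMounts:
        - name: shared-data
          mountPath: /work
  volumes:
    - name: shared-data              # lives as long as the Pod does
      emptyDir: {}
```

Both containers run on the same Node and in the same network namespace, so the sidecar could also reach the web server at `localhost:80`.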


## Storage Class
When creating disks for workloads inside a Pod, such as databases, you may need to specify the type of disk media, whether it's HDD or SSD. In cloud environments, there are often more options available. For example, AWS EBS offers various volume types, such as General Purpose SSD (gp2/gp3), Provisioned IOPS SSD (io1/io2), and Throughput Optimized HDD (st1). In Kubernetes, you can select the desired disk type through a StorageClass.
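
A minimal sketch of a StorageClass that selects an SSD volume type is shown below. It assumes the AWS EBS CSI driver is installed in the cluster; the class name `fast-ssd` is a made-up example.

```yaml
# Hypothetical StorageClass; assumes the AWS EBS CSI driver is installed.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd                          # illustrative name
provisioner: ebs.csi.aws.com              # the CSI driver that creates the volumes
parameters:
  type: gp3                               # EBS General Purpose SSD
reclaimPolicy: Delete                     # delete the disk when the claim is deleted
volumeBindingMode: WaitForFirstConsumer   # provision only once a Pod is scheduled
allowVolumeExpansion: true
```

On other platforms the `provisioner` and `parameters` fields would differ, since `parameters` is interpreted by the driver rather than by Kubernetes itself.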

## PVC
A Persistent Volume Claim (PVC) in Kubernetes is a request for storage by a user. A PVC is essentially a way to ask for storage with specific characteristics, such as storage class, size, and access modes (e.g., ReadWriteOnce or ReadOnlyMany). PVCs enable Pods to use storage without needing to know the details of the underlying infrastructure.

To use storage in K8s, users create a PVC. When the PVC names a StorageClass, Kubernetes uses that StorageClass to automatically provision the storage according to its defined parameters, whether the backing media is SSD, HDD, EBS, or NAS. If a PVC does not specify a StorageClass, Kubernetes will use the default StorageClass (if one is configured) to provision storage.
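
For illustration, a PVC requesting 20Gi from a StorageClass could look like the sketch below. The claim name and the StorageClass name `fast-ssd` are assumptions; a class with that name would have to exist in the cluster.

```yaml
# Hypothetical PVC; "fast-ssd" must match an existing StorageClass.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-db-data
spec:
  storageClassName: fast-ssd   # omit this field to fall back to the default StorageClass
  accessModes:
    - ReadWriteOnce            # mounted read-write by a single Node
  resources:
    requests:
      storage: 20Gi
```

A Pod then refers to the claim by name under `spec.volumes[].persistentVolumeClaim.claimName`, without knowing which disk type backs it.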

## CSI
In Kubernetes, various StorageClasses are provided through the Container Storage Interface (CSI), which is responsible for provisioning the underlying storage "disks" used by applications. CSI functions similarly to a "disk driver" in Kubernetes, enabling the platform to adapt to and integrate with a wide range of storage systems, such as local disks, AWS EBS, and Ceph. These StorageClasses, and the associated storage resources, are provisioned by specific CSI drivers that handle the interaction with the underlying storage infrastructure.

CSI is a standard API that enables Kubernetes to interact with various storage systems in a consistent and extensible manner. CSI drivers, created by storage vendors or the Kubernetes community, expose essential storage functions like dynamic provisioning, attaching, mounting, and snapshotting to Kubernetes.

When you define a StorageClass in Kubernetes, it typically specifies a CSI driver as its provisioner. This driver automatically provisions Persistent Volumes (PVs) based on the parameters in the StorageClass and associated Persistent Volume Claims (PVCs), ensuring the appropriate type and configuration of storage—whether SSD, HDD, or otherwise—is provided for your applications.

## PV
In Kubernetes, a Persistent Volume (PV) represents a storage resource that can be backed by various systems like local disks, NFS, or cloud-based storage (e.g., AWS EBS, Google Cloud Persistent Disks), typically managed by different CSI drivers.

A PV has its own lifecycle, independent of the Pod, and is managed by the Kubernetes control plane. It allows data to persist even if the associated Pod is deleted. PVs are bound to Persistent Volume Claims (PVCs), which request specific storage characteristics like size and access modes, ensuring that applications receive the storage they require.

In summary, a PV is the actual storage resource, while a PVC is a request for storage. The StorageClass named in the PVC determines which CSI driver provisions the PV that the PVC is bound to.
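
Most PVs are created dynamically, but a cluster administrator can also create one by hand. The sketch below shows such a statically defined PV backed by NFS; the server address, export path, and the `manual` class name are placeholders for the example.

```yaml
# Hypothetical statically provisioned PV; server and path are placeholders.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: demo-nfs-pv
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteMany                        # NFS can be mounted by many Nodes at once
  persistentVolumeReclaimPolicy: Retain    # keep the data after the claim is released
  storageClassName: manual                 # a PVC must request the same class to bind
  nfs:
    server: nfs.example.internal
    path: /exports/demo
```

A PVC that requests `storageClassName: manual`, a compatible access mode, and no more than 20Gi can bind to this PV.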

## Service
In Kubernetes, a Service acts as a load balancer. It defines a logical set of Pods and provides a policy for accessing them. Since Pods are ephemeral and can be dynamically created and destroyed, their IP addresses are not stable. A Service resolves this issue by providing a stable network endpoint (a virtual IP address, known as a ClusterIP) that remains constant, allowing other Pods or external clients to communicate with the set of Pods behind the Service without needing to know their specific IP addresses.

A Service can be one of several types: ClusterIP (internal cluster access), NodePort (external access via `<NodeIP>:<NodePort>`), LoadBalancer (exposes the Service externally using a cloud provider's load balancer), and ExternalName (maps the Service to an external DNS name).
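
As a sketch, a ClusterIP Service that fronts all Pods labeled `app: demo-db` might be declared as follows. The Service name, label, and port are assumptions for the example.

```yaml
# Hypothetical Service; the selector must match labels on the target Pods.
apiVersion: v1
kind: Service
metadata:
  name: demo-db
spec:
  type: ClusterIP          # change to NodePort or LoadBalancer for external access
  selector:
    app: demo-db           # traffic goes to Pods carrying this label
  ports:
    - name: sql
      protocol: TCP
      port: 5432           # port exposed on the Service's virtual IP
      targetPort: 5432     # port the container listens on
```

Clients inside the cluster can then connect to the stable DNS name `demo-db` (or `demo-db.<namespace>.svc`) instead of tracking individual Pod IPs.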


## ConfigMap

A ConfigMap is used to store configuration data in key-value pairs, allowing you to decouple configuration from application code. This way, you can manage application settings separately and reuse them across multiple environments. ConfigMaps can be used to inject configuration data into Pods as environment variables, command-line arguments, or configuration files. They provide a flexible and convenient way to manage application configurations without hardcoding values directly into your application container.
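
The following sketch shows one ConfigMap consumed in two of the ways mentioned above: as an environment variable and as mounted files. All names, keys, and values are hypothetical.

```yaml
# Hypothetical ConfigMap and a Pod that consumes it.
apiVersion: v1
kind: ConfigMap
metadata:
  name: demo-config
data:
  LOG_LEVEL: info                # simple key-value entry
  app.conf: |                    # an entire file stored under one key
    max_connections = 200
---
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "echo $LOG_LEVEL; cat /etc/demo/app.conf; sleep 3600"]
      env:
        - name: LOG_LEVEL        # injected as an environment variable
          valueFrom:
            configMapKeyRef:
              name: demo-config
              key: LOG_LEVEL
      volumeMounts:
        - name: config           # each key appears as a file under /etc/demo
          mountPath: /etc/demo
  volumes:
    - name: config
      configMap:
        name: demo-config
```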


## Secret

A Secret is used to store sensitive data such as passwords, tokens, or encryption keys. Secrets allow you to manage confidential information separately from your application code and avoid exposing sensitive data in your container images. Kubernetes Secrets can be injected into Pods as environment variables or mounted as files, ensuring that sensitive information is handled in a secure and controlled manner.

However, Secrets are not encrypted by default—they are simply base64-encoded, which does not provide real encryption. They should still be used with care, ensuring proper access controls are in place.
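
A minimal sketch of a Secret and a Pod that reads it is shown below. The names and credentials are placeholders. The `stringData` field accepts plain text for convenience; the API server stores the values base64-encoded under `data`.

```yaml
# Hypothetical Secret; never commit real credentials to version control.
apiVersion: v1
kind: Secret
metadata:
  name: demo-db-credentials
type: Opaque
stringData:                      # plain text here, stored base64-encoded under .data
  username: admin
  password: change-me
---
apiVersion: v1
kind: Pod
metadata:
  name: demo-client
spec:
  containers:
    - name: client
      image: busybox:1.36
      command: ["sh", "-c", "sleep 3600"]
      env:
        - name: DB_PASSWORD      # injected from the Secret at container start
          valueFrom:
            secretKeyRef:
              name: demo-db-credentials
              key: password
```

Anyone who can read the Secret object can decode its values, which is why access controls on Secrets matter.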

## CRD
If you want to manage database objects using Kubernetes, you need to extend the Kubernetes API to describe the database objects you're managing. This is where the CRD (Custom Resource Definition) mechanism comes in, allowing you to define custom resources specific to your use case, such as database clusters or backups, and manage them just like native Kubernetes resources.
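
To make this concrete, the sketch below defines a CRD for an invented `DatabaseCluster` resource. The group `example.com`, the kind, and the fields under `spec` are all assumptions for illustration and are not the KubeBlocks API.

```yaml
# Hypothetical CRD; group, kind, and schema are invented for this example.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databaseclusters.example.com   # must be <plural>.<group>
spec:
  group: example.com
  scope: Namespaced
  names:
    kind: DatabaseCluster
    plural: databaseclusters
    singular: databasecluster
    shortNames:
      - dbc
  versions:
    - name: v1alpha1
      served: true                     # exposed through the API server
      storage: true                    # version used for persistence in etcd
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              required: ["engine", "replicas"]
              properties:
                engine:
                  type: string
                replicas:
                  type: integer
                  minimum: 1
                storage:
                  type: string
```

Once such a CRD is applied, `kubectl get databaseclusters` works like it does for any built-in resource type.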

## CR
A Custom Resource (CR) is an instance of a Custom Resource Definition (CRD). It represents a specific configuration or object that extends the Kubernetes API. CRs allow you to define and manage custom resources, such as databases or applications, using Kubernetes' native tools. Once a CR is created, Kubernetes controllers or Operators monitor it and perform actions to maintain the desired state.

CRD and CR are the foundation for developing a Kubernetes Operator. CRDs are typically paired with custom controllers or Operators that continuously watch for changes to CRs (representing, for example, database clusters) and automatically perform actions in response.
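
As an illustration, a CR could look like the sketch below. It assumes a CRD that defines a hypothetical `DatabaseCluster` kind in the group `example.com` with `engine`, `replicas`, and `storage` fields; none of these names come from KubeBlocks.

```yaml
# Hypothetical CR; assumes a matching DatabaseCluster CRD is installed.
apiVersion: example.com/v1alpha1
kind: DatabaseCluster
metadata:
  name: orders-db
  namespace: default
spec:
  engine: postgresql   # desired database engine
  replicas: 3          # desired number of database instances
  storage: 20Gi        # desired disk size per instance
```

The CR only records the desired state. A controller watching `DatabaseCluster` objects is what would turn it into Pods, PVCs, and Services.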

## What is a Kubernetes Operator?

A Kubernetes Operator is software, typically composed of one or more controllers, that automates the management of complex applications by translating changes made to a Custom Resource (CR) into actions on native Kubernetes objects, such as Pods, Services, PVCs, ConfigMaps, and Secrets.

- Input: User modifications to the CR.
- Output: Corresponding changes to underlying Kubernetes resources or interactions with external systems (e.g., writing to a database or calling APIs), depending on the requirements of the managed application.

The Operator continuously watches the state of these Kubernetes objects. When changes occur (e.g., a Pod crashes), the Operator automatically takes corrective actions, like recreating the Pod or adjusting traffic (e.g., updating Service Endpoints).

In essence, a Kubernetes Operator encapsulates complex operational knowledge into software, automating tasks like deployment, scaling, upgrades, and backups, ensuring the application consistently maintains its desired state without manual intervention.

## Helm and Helm Chart

Helm is a popular package manager for Kubernetes that helps manage and deploy applications. It packages all the necessary Kubernetes resources into a single Helm Chart, allowing you to install applications with a single command (`helm install`). Helm also handles configuration management and updates (`helm upgrade`), making the entire lifecycle of the application much easier to manage.

Key components of a Helm Chart:

- Templates: YAML files with placeholders that define Kubernetes resources (like Pods, Services, and ConfigMaps).
- values.yaml: A file that holds the default values for the templates, allowing easy customization. You can take an existing chart and override these defaults with your own values file or command-line flags, which lets you provide environment-specific configurations without modifying the underlying templates.
- Chart.yaml: Metadata about the chart, including the name, version, and description.
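
The relationship between templates and values can be sketched with a small hypothetical chart. The chart name `demo-chart`, the value keys, and the image are assumptions for the example. A template fragment might read:

```yaml
# templates/deployment.yaml in a hypothetical chart named demo-chart
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-web
spec:
  replicas: {{ .Values.replicaCount }}          # filled in from values.yaml
  selector:
    matchLabels:
      app: {{ .Release.Name }}-web
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}-web
    spec:
      containers:
        - name: web
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
```

The chart's defaults for those placeholders would then live in `values.yaml`:

```yaml
# values.yaml: defaults shipped with the hypothetical chart
replicaCount: 1
image:
  repository: nginx
  tag: "1.27"
```

With these files, `helm install my-release ./demo-chart --set replicaCount=3` would render the Deployment with three replicas, and `-f my-values.yaml` would override the defaults from a separate file instead.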

Helm integrates well with CI/CD tools like Jenkins, GitLab CI, and GitHub Actions. It can be used to automate deployments and rollbacks as part of a continuous delivery pipeline, ensuring that applications are consistently deployed across different environments.
2 changes: 1 addition & 1 deletion docs/user_docs/overview/supported-addons.md
@@ -2,7 +2,7 @@
title: Supported addons
description: Addons supported by KubeBlocks
keywords: [addons, enable, KubeBlocks, prometheus, s3, alertmanager,]
sidebar_position: 3
sidebar_position: 4
sidebar_label: Supported addons
---
