
KNE with a Multi Node Cluster

Background

A k8s cluster is made up of 1 or more nodes. Each node can hold up to 110 pods. See the official large cluster considerations here. An emulated DUT in KNE brings up 1 pod. An emulated ATE in KNE brings up 1 pod per port. Together with the controller pods and other dependency pods, this restricts a KNE user running kind (a single-node cluster) to fewer than ~100 DUTs + ATE ports. For large testbeds, this is not an acceptable restriction. Additionally, each emulated device consumes CPU and other resources shared from the single host.

A multi-node KNE cluster setup addresses these limitations through a controller + worker(s) setup spread across multiple VMs.

(Figure: cluster nodes)

In this cluster diagram we can see that there is a single central Control Plane along with 3 Nodes.

This guide will show you how to use KNE on an existing multi-node cluster, as well as provide steps to set up a multi-node cluster on GCP.

External cluster type

The kne deploy command is used to set up a cluster as well as the ingress, CNI, and vendor controllers inside the cluster. For a multi-node cluster we will be using the External cluster type in the deployment config.

External is essentially a no-op cluster type: it assumes a k8s cluster has already been deployed, so KNE does no cluster lifecycle management and only sets up the dependencies. This guide will show you how to utilize the External cluster type option to get KNE up and running on a multi-node cluster.
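For orientation, a deployment config using an External cluster roughly takes the shape below. This is a hedged sketch based on the checked-in kne/deploy/kne/external-multinode.yaml; exact field names, manifest paths, and values may differ between KNE versions, so treat the real file as the source of truth.

cluster:
  kind: External        # no cluster lifecycle management, an existing k8s cluster is assumed
  spec:
    network: multinode  # docker network used for external addressing (assumed field name)
ingress:
  kind: MetalLB
  spec:
    manifest: ../../manifests/metallb/manifest.yaml
    ip_count: 200       # pool of external IPs handed out to topology services
cni:
  kind: Meshnet
  spec:
    manifest: ../../manifests/meshnet/grpc/manifest.yaml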

The kne CLI will be run on the host of the controller VM and the created topology will be automatically provisioned across the worker nodes depending on available resources on each. This setup can easily be scaled up by adding more worker VMs with increased resources (CPU, etc.).

To conclude this guide we will bring up a 150-node topology in our multi-node cluster.

Create a topology in a multi-node cluster

This guide assumes a multi-node cluster has already been set up. The cluster should adhere to these restrictions:

The following optional section shows how to create a multi-node cluster using kubeadm on GCP. Skip directly to the topology creation step if your existing cluster adheres to the above guidelines.

Cluster setup

Using GCP, we will create a 3-VM setup: 1 VM serving as the Control Plane and 2 VMs each serving as a worker Node.

VPC

A custom VPC is required to handle the k8s routing between the VMs. The following commands will set up a custom VPC with a known CIDR range in an existing GCP project.

gcloud compute networks create multinode --subnet-mode custom
gcloud compute networks subnets create multinode-nodes \
  --network multinode \
  --range 10.240.0.0/24 \
  --region us-central1
gcloud compute firewall-rules create multinode-allow-internal \
  --allow tcp,udp,icmp,ipip \
  --network multinode \
  --source-ranges 10.240.0.0/24
gcloud compute firewall-rules create multinode-allow-external \
  --allow tcp:22,tcp:6443,icmp \
  --network multinode \
  --source-ranges 0.0.0.0/0
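Optionally, sanity-check the VPC before creating any VMs. These are standard gcloud list commands added here as a convenience; they are not required by the rest of the guide.

gcloud compute networks subnets list --filter="network:multinode"
gcloud compute firewall-rules list --filter="network:multinode"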

VM Instances

We run the official production k8s solution kubeadm to create our cluster. We chose dockerd as our Container Runtime Interface (CRI) and flannel as our pod networking add-on. These tools are already installed on the KNE VM image built on Ubuntu. The VMs created in the next step will use this base image to reduce the manual setup required.

Import the KNE VM image:

gcloud compute images import kne-cb8d6252-14aa-4f68-bbd9-a97d9443d795 \
  --os=ubuntu-2004 \
  --source-file=gs://kne-vm-image/cb8d6252-14aa-4f68-bbd9-a97d9443d795.tar.gz
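Optionally, confirm the import succeeded by describing the image (a standard gcloud command, not part of the original flow):

gcloud compute images describe kne-cb8d6252-14aa-4f68-bbd9-a97d9443d795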

Create an SSH key pair to use for all of the VMs created below:

ssh-keygen -f /tmp/multinode-key -C user -N ""
sed -i '1s/^/user:/' /tmp/multinode-key.pub

Controller

Create the controller VM using the gcloud CLI. Note that the controller VM is assigned the internal IP address 10.240.0.11 in the custom VPC.

gcloud compute instances create controller \
  --zone=us-central1-a \
  --image=kne-cb8d6252-14aa-4f68-bbd9-a97d9443d795 \
  --machine-type=n2-standard-8 \
  --enable-nested-virtualization \
  --scopes=https://www.googleapis.com/auth/cloud-platform \
  --metadata-from-file=ssh-keys=/tmp/multinode-key.pub \
  --can-ip-forward \
  --private-network-ip 10.240.0.11 \
  --subnet multinode-nodes

SSH to the VM:

ssh -i /tmp/multinode-key user@<EXTERNAL IP OF VM>

Now run the following commands to set up the cluster:

sudo kubeadm init --cri-socket unix:///var/run/cri-dockerd.sock --pod-network-cidr 10.244.0.0/16
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Set up the flannel pod networking add-on:

kubectl apply -f flannel/Documentation/kube-flannel.yml
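The command above assumes the flannel repository is already present on the VM, which is the case for the KNE VM image. If you are starting from a different base image, you can fetch it first; note that newer flannel releases may place the manifest at the repository root instead of under Documentation/, so adjust the path if needed.

git clone https://github.com/flannel-io/flannel.git
kubectl apply -f flannel/Documentation/kube-flannel.yml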
Workers

For the purposes of this codelab, create 2 worker VMs. Run this command 2 times, replacing {n} with 1 and 2.

gcloud compute instances create worker-{n} \
  --zone=us-central1-a \
  --image=kne-cb8d6252-14aa-4f68-bbd9-a97d9443d795 \
  --machine-type=n2-standard-64 \
  --enable-nested-virtualization \
  --scopes=https://www.googleapis.com/auth/cloud-platform \
  --metadata-from-file=ssh-keys=/tmp/multinode-key.pub \
  --can-ip-forward \
  --private-network-ip 10.240.0.2{n} \
  --subnet multinode-nodes

SSH to each VM:

ssh -i /tmp/multinode-key user@<EXTERNAL IP OF VM>

And run the following command, using the token and SHA output from cluster setup on the controller VM:

sudo kubeadm join 10.240.0.11:6443 \
  --token {token} \
  --discovery-token-ca-cert-hash sha256:{sha} \
  --cri-socket unix:///var/run/cri-dockerd.sock
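If the token and SHA from kubeadm init were not saved, or the token has expired, a fresh join command can be printed on the controller VM using standard kubeadm tooling; append the --cri-socket flag shown above when running it on the worker.

sudo kubeadm token create --print-join-command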

Topology Creation

SSH to the VM acting as the controller. Confirm that the worker nodes all successfully joined the cluster:

$ kubectl get nodes
NAME         STATUS   ROLES           AGE     VERSION
controller   Ready    control-plane   4m37s   v1.25.4
worker-1     Ready    <none>          134s    v1.25.4
worker-2     Ready    <none>          110s    v1.25.4

Create a new docker network for use in the cluster:

docker network create multinode
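The External deployment config used in the next step is expected to reference this docker network (see the sketch earlier in this guide), and the MetalLB external IP pool is typically carved out of its subnet. To see which subnet docker allocated, inspect the network:

docker network inspect multinode | grep Subnet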

Now deploy the KNE dependencies (CNI, ingress, vendor controllers):

kne deploy kne/deploy/kne/external-multinode.yaml

IMPORTANT: Contact Arista to get access to the cEOS image.

Create the 150-node cEOS topology:

kne create kne/examples/arista/ceos-150/ceos-150.pb.txt

Open a second terminal on the controller VM to track the topology creation progress:

$ kubectl get pods -n ceos-150 -o wide --watch
NAME   READY   STATUS     RESTARTS   AGE     IP             NODE       NOMINATED NODE   READINESS GATES
r1     1/1     Running    0          9m22s   10.244.1.66    worker-1   <none>           <none>
r10    0/1     Init:0/1   0          9m43s   10.244.1.59    worker-2   <none>           <none>
r100   0/1     Init:0/1   0          8m17s   10.244.1.91    worker-1   <none>           <none>
r101   1/1     Running    0          8m36s   10.244.1.84    worker-1   <none>           <none>
r102   0/1     Init:0/1   0          9m4s    10.244.1.72    worker-2   <none>           <none>
r103   0/1     Pending    0          5m20s   <none>         <none>     <none>           <none>
...

This command shows all of the pods with additional information, including which worker node they are running on. If any of the pods are stuck in a Pending state for a long time with no NODE assigned, see the troubleshooting section.

Once the kne create command completes with success, your topology is ready. Confirm by checking that all pods are Running using kubectl get pods -n ceos-150.
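To quickly spot any stragglers, you can list only the pods that are not yet Running; field selectors are standard kubectl functionality and this check is optional.

kubectl get pods -n ceos-150 --field-selector=status.phase!=Running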

Verify that a gRPC connection can be made to a pod on one of the worker VMs. The service external IP can be determined from kubectl get services -n ceos-150:

TIP: gnmi_cli can be installed by running go install github.com/openconfig/gnmi/cmd/gnmi_cli@latest if the CLI is missing from the VM.

export GNMI_USER=admin
export GNMI_PASS=admin
gnmi_cli -a <service external ip>:6030 -q "/interfaces/interface/state" -tls_skip_verify -with_user_pass

Troubleshooting

Too many pods

If the topology being deployed has too many pods for the number of worker nodes, you may see a warning event like the one below when inspecting a Pending pod with no assigned node:

$ kubectl describe pods r25 -n ceos-150
...
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  34s   default-scheduler  0/2 nodes are available: 1 Too many pods, 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }. preemption: 0/2 nodes are available: 1 No preemption victims found for incoming pod, 1 Preemption is not helpful for scheduling.

To fix this, add another worker VM and run the necessary commands to have it join the cluster, as shown in the sketch below. No further action is needed on the controller; the pending pods should then be scheduled automatically.
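As a sketch grounded in the earlier steps of this guide, adding a third worker reuses the same VM creation and join commands with {n} set to 3; substitute your own machine type, token, and SHA as appropriate.

gcloud compute instances create worker-3 \
  --zone=us-central1-a \
  --image=kne-cb8d6252-14aa-4f68-bbd9-a97d9443d795 \
  --machine-type=n2-standard-64 \
  --enable-nested-virtualization \
  --scopes=https://www.googleapis.com/auth/cloud-platform \
  --metadata-from-file=ssh-keys=/tmp/multinode-key.pub \
  --can-ip-forward \
  --private-network-ip 10.240.0.23 \
  --subnet multinode-nodes

Then SSH to the new VM and run the kubeadm join command from the Workers section.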

Service external IP pending

If you see some services with a <pending> EXTERNAL-IP, then your MetalLB configuration does not have enough IPs to assign. This codelab uses a configuration with a pool of 200 unique external IP addresses, so you will see this issue if your topology has more than 200 nodes. Increase the value to accommodate your number of nodes and redeploy, as sketched after the output below.

$ kubectl get services -n ceos-150
NAME           TYPE           CLUSTER-IP       EXTERNAL-IP    PORT(S)                                     AGE
service-r1     LoadBalancer   10.106.154.237   172.18.0.110   6030:31407/TCP,22:30550/TCP,443:30990/TCP   29m
service-r10    LoadBalancer   10.98.147.137    172.18.0.103   6030:31914/TCP,22:31900/TCP,443:30555/TCP   30m
service-r100   LoadBalancer   10.101.4.72      172.18.0.135   6030:32364/TCP,22:30243/TCP,443:32621/TCP   28m
service-r101   LoadBalancer   10.98.45.234     172.18.0.128   6030:32515/TCP,22:30159/TCP,443:32329/TCP   29m
service-r102   LoadBalancer   10.107.163.133   172.18.0.116   6030:32344/TCP,22:30100/TCP,443:31541/TCP   29m
service-r103   LoadBalancer   10.96.185.215    <pending>      6030:31092/TCP,22:30549/TCP,443:32424/TCP   25m
...
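The knob to adjust lives in the ingress section of the deployment config (see the sketch near the top of this guide). As a hedged example, assuming the ip_count field used by the MetalLB ingress spec, a larger pool would look like:

ingress:
  kind: MetalLB
  spec:
    manifest: ../../manifests/metallb/manifest.yaml
    ip_count: 250  # should exceed the number of topology nodes needing external services

After editing the config, re-run kne deploy with the updated file.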