This repository has been archived by the owner on Jul 19, 2024. It is now read-only.

Merge branch 'feature/update-operator'
displague committed Mar 28, 2019
2 parents 8a02287 + ec47b88 commit 683f0d1
Showing 16 changed files with 284 additions and 45 deletions.
28 changes: 23 additions & 5 deletions README.md
@@ -1,6 +1,6 @@
# Kubernetes Terraform installer for Linode Instances

-This Terraform module creates a Kubernetes v1.13 Cluster on Linode Cloud infrastructure using the ContainerLinux operating system. The cluster is designed to take advantage of the Linode regional private network, and is equiped with Linode specific cluster enhancements.
+This Terraform module creates a Kubernetes v1.14.0 Cluster on Linode Cloud infrastructure using the ContainerLinux operating system. The cluster is designed to take advantage of the Linode regional private network, and is equipped with Linode-specific cluster enhancements.

Cluster size and instance types are configurable through Terraform variables.

@@ -20,8 +20,8 @@ Before running the project you'll have to create an access token for Terraform t
Using the token and your access key, create the `LINODE_TOKEN` environment variable:

```bash
-read -sp "Linode Token: " LINODE_TOKEN # Enter your Linode Token (it will be hidden)
-export LINODE_TOKEN
+read -sp "Linode Token: " TF_VAR_linode_token # Enter your Linode Token (it will be hidden)
+export TF_VAR_linode_token
```

This variable will need to be supplied to every Terraform `apply`, `plan`, and `destroy` command using `-var linode_token=$LINODE_TOKEN` unless a `terraform.tfvars` file is created with this secret token.
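As a sketch, a minimal `terraform.tfvars` holding this secret might look like the following (the value is a placeholder; keep this file out of version control):

```hcl
linode_token = "your-linode-api-v4-token"
```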
@@ -74,8 +74,14 @@ This will do the following:
* installs a Calico network between Linode Instances
* runs kubeadm init on the master server and configures kubectl
* joins the nodes in the cluster using the kubeadm token obtained from the master
-* installs Linode add-ons: CSI (Linode Block Storage Volumes), CCM (Linode NodeBalancers), External-DNS (Linode Domains)
-* installs cluster add-ons: Kubernetes dashboard, metrics server and Heapster
+* installs Linode add-ons:
+  * [CSI](https://github.com/linode/linode-blockstorage-csi-driver/) (Linode Block Storage Volumes)
+  * [CCM](https://github.com/linode/linode-cloud-controller-manager) (Linode NodeBalancers)
+  * [External-DNS](https://github.com/kubernetes-incubator/external-dns/blob/master/docs/tutorials/linode.md) (Linode Domains)
+* installs cluster add-ons:
+  * Kubernetes dashboard
+  * metrics server
+  * [Container Linux Update Operator](https://github.com/coreos/container-linux-update-operator)
* copies the kubectl admin config file for local `kubectl` use via the public IP of the API server

A full list of the supported variables is available in the [Terraform Module Registry](https://registry.terraform.io/modules/linode/k8s/linode/?tab=inputs).
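As a sketch, invoking this module from your own configuration might look like the following; only `linode_token` appears in this document, so treat the registry inputs page as the authoritative reference for any other variables:

```hcl
module "k8s" {
  # Module published to the Terraform Module Registry
  source = "linode/k8s/linode"

  # The Linode APIv4 token described above
  linode_token = "${var.linode_token}"
}
```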
@@ -161,6 +167,18 @@ As configured in this Terraform module, any service or ingress with a specific a

[Learn more at the External-DNS Github project.](https://github.com/kubernetes-incubator/external-dns)

### [**Container Linux Update Operator**](https://github.com/coreos/container-linux-update-operator/)

The Update Operator deploys an agent to all of the nodes (including the master) to schedule Container Linux reboots when an update has been prepared. The Update Operator prevents multiple nodes from rebooting at the same time, and cordon and drain commands are sent to each node before it reboots. **System update reboots are paused by default** to prevent new clusters from rebooting in the first five minutes of their life-cycle, which could have an adverse effect on the Terraform provisioning process.

Set the `update_agent_reboot_paused` variable using the `-var` argument, the `TF_VAR_update_agent_reboot_paused` environment variable, or an `update_agent.tfvars` file with the following contents:

```hcl
update_agent_reboot_paused = "false"
```

In practice, rebooted nodes will be unavailable for a minute or two once a reboot has started. Take advantage of the Linode Block Storage CSI driver so that Persistent Volumes can be rescheduled along with their workloads onto the available nodes.
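A sketch of what the corresponding root-module variable declaration might look like, assuming the paused-by-default behavior described above (the actual declaration lives in the module source):

```hcl
variable "update_agent_reboot_paused" {
  description = "Whether Container Linux reboots scheduled by the update agent start out paused"
  default     = "true"
}
```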

## Development

To make changes to this project, verify that you have the prerequisites and then clone the repo. Instead of using the Terraform `module` syntax and being confined to the variables it provides, you'll be able to make any changes necessary.
Expand Down
19 changes: 18 additions & 1 deletion main.tf
@@ -1,6 +1,6 @@
provider "linode" {
token = "${var.linode_token}"
-  version = "1.4.0"
+  version = "1.5.0"
}

provider "external" {
@@ -64,3 +64,20 @@ resource "null_resource" "local_kubectl" {
on_failure = "continue"
}
}

resource "null_resource" "update-agent" {
depends_on = ["module.masters", "module.nodes"]

triggers {
    cluster_ips = "${module.masters.k8s_master_public_ip} ${join(" ", module.nodes.nodes_public_ip)}"
}

provisioner "remote-exec" {
connection {
host = "${module.masters.k8s_master_public_ip}"
user = "core"
}

inline = ["/opt/bin/kubectl annotate node --all --overwrite container-linux-update.v1.coreos.com/reboot-paused=${var.update_agent_reboot_paused}"]
}
}
19 changes: 15 additions & 4 deletions modules/instances/main.tf
@@ -29,9 +29,20 @@ resource "linode_instance" "instance" {
}
}

provisioner "remote-exec" {
inline = [
"mkdir -p /home/core/init/",
]

connection {
user = "core"
timeout = "300s"
}
}

provisioner "file" {
source = "${path.module}/scripts/"
-    destination = "/tmp"
+    destination = "/home/core/init/"

connection {
user = "core"
@@ -42,9 +53,9 @@ resource "linode_instance" "instance" {
provisioner "remote-exec" {
inline = [
"set -e",
-      "chmod +x /tmp/start.sh && sudo /tmp/start.sh",
-      "chmod +x /tmp/linode-network.sh && sudo /tmp/linode-network.sh ${self.private_ip_address} ${self.label}",
-      "chmod +x /tmp/kubeadm-install.sh && sudo /tmp/kubeadm-install.sh ${var.k8s_version} ${var.cni_version} ${var.crictl_version} ${self.label} ${var.use_public ? self.ip_address : self.private_ip_address} ${var.k8s_feature_gates}",
+      "chmod +x /home/core/init/start.sh && sudo /home/core/init/start.sh",
+      "chmod +x /home/core/init/linode-network.sh && sudo /home/core/init/linode-network.sh ${self.private_ip_address} ${self.label}",
+      "chmod +x /home/core/init/kubeadm-install.sh && sudo /home/core/init/kubeadm-install.sh ${var.k8s_version} ${var.cni_version} ${var.crictl_version} ${self.label} ${var.use_public ? self.ip_address : self.private_ip_address} ${var.k8s_feature_gates}",
]

connection {
15 changes: 9 additions & 6 deletions modules/instances/outputs.tf
@@ -1,16 +1,19 @@
// todo: ha, return nb address
output "public_ip_address" {
-  depends_on = ["linode_instance.instance.0"]
-  value      = "${element(concat(linode_instance.instance.*.ip_address, list("")), 0)}"
+  depends_on  = ["linode_instance.instance.0"]
+  description = "Public IP Address of the first instance in the group"
+  value       = "${element(concat(linode_instance.instance.*.ip_address, list("")), 0)}"
}

// todo: this doesnt make sense in ha -- return all?
output "private_ip_address" {
-  depends_on = ["linode_instance.instance.0"]
-  value      = "${element(concat(linode_instance.instance.*.private_ip_address, list("")), 0)}"
+  description = "Private IP Address of the first instance in the group"
+  depends_on  = ["linode_instance.instance.0"]
+  value       = "${element(concat(linode_instance.instance.*.private_ip_address, list("")), 0)}"
}

output "nodes_public_ip" {
-  depends_on = ["linode_instance.instance.*"]
-  value      = "${concat(linode_instance.instance.*.ip_address)}"
+  depends_on  = ["linode_instance.instance.*"]
+  description = "Public IP Address of the instance(s)"
+  value       = "${concat(linode_instance.instance.*.ip_address)}"
}
3 changes: 0 additions & 3 deletions modules/instances/scripts/end.sh
@@ -1,6 +1,3 @@
#!/usr/bin/env bash
set -e

-# TODO: https://github.com/coreos/container-linux-update-operator
-# systemctl start update-engine

6 changes: 3 additions & 3 deletions modules/instances/scripts/linode-addons.sh
@@ -8,14 +8,14 @@ LINODE_TOKEN="$2"
sed -i -E \
-e 's/\$\(LINODE_REGION\)/'$LINODE_REGION'/g' \
-e 's/\$\(LINODE_TOKEN\)/'$LINODE_TOKEN'/g' \
-  /tmp/linode-token.yaml
+  /home/core/init/linode-token.yaml

# TODO swap these for helm charts
for yaml in \
linode-token.yaml \
ccm-linode.yaml \
csi-linode.yaml \
external-dns.yaml \
-; do kubectl apply -f /tmp/${yaml}; done
+; do kubectl apply -f /home/core/init/${yaml}; done

-rm /tmp/linode-token.yaml
+rm /home/core/init/linode-token.yaml
6 changes: 3 additions & 3 deletions modules/instances/scripts/monitoring-install.sh
@@ -4,7 +4,7 @@ set -e

# TODO swap these for helm charts

-kubectl apply -f /tmp/dashboard-rbac.yaml
-kubectl apply -f /tmp/dashboard.yaml
+kubectl apply -f /home/core/init/dashboard-rbac.yaml
+kubectl apply -f /home/core/init/dashboard.yaml

-kubectl apply -f /tmp/metrics-server.yaml
+kubectl apply -f /home/core/init/metrics-server.yaml
6 changes: 5 additions & 1 deletion modules/instances/scripts/start.sh
@@ -3,4 +3,8 @@ set -e

for mod in ip_vs_sh ip_vs ip_vs_rr ip_vs_wrr nf_conntrack_ipv4; do echo $mod | sudo tee /etc/modules-load.d/$mod.conf; done

-systemctl stop update-engine
+# Enable update-engine, and stop and mask locksmithd so the update operator coordinates reboots
+sudo systemctl unmask update-engine.service || true
+sudo systemctl start update-engine.service || true
+sudo systemctl stop locksmithd.service || true
+sudo systemctl mask locksmithd.service || true
7 changes: 7 additions & 0 deletions modules/instances/scripts/update-operator.sh
@@ -0,0 +1,7 @@
#!/usr/bin/env bash

set -e

# TODO swap these for helm charts

kubectl apply -f /home/core/init/update-operator.yaml
26 changes: 20 additions & 6 deletions modules/masters/main.tf
@@ -19,9 +19,21 @@ module "master_instance" {
resource "null_resource" "masters_provisioner" {
depends_on = ["module.master_instance"]

provisioner "remote-exec" {
inline = [
"mkdir -p /home/core/init/",
]

connection {
user = "core"
timeout = "300s"
host = "${module.master_instance.public_ip_address}"
}
}

provisioner "file" {
source = "${path.module}/manifests/"
-    destination = "/tmp"
+    destination = "/home/core/init/"

connection {
user = "core"
@@ -34,13 +46,15 @@ resource "null_resource" "masters_provisioner" {
  # TODO advertise on public address
inline = [
"set -e",
-      "chmod +x /tmp/kubeadm-init.sh && sudo /tmp/kubeadm-init.sh ${var.cluster_name} ${var.k8s_version} ${module.master_instance.public_ip_address} ${module.master_instance.private_ip_address} ${var.k8s_feature_gates}",
+      "chmod +x /home/core/init/kubeadm-init.sh && sudo /home/core/init/kubeadm-init.sh ${var.cluster_name} ${var.k8s_version} ${module.master_instance.public_ip_address} ${module.master_instance.private_ip_address} ${var.k8s_feature_gates}",
"mkdir -p $HOME/.kube && sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config && sudo chown core $HOME/.kube/config",
"export PATH=$${PATH}:/opt/bin",
-      "kubectl apply -f /tmp/calico.yaml",
-      "chmod +x /tmp/linode-addons.sh && /tmp/linode-addons.sh ${var.region} ${var.linode_token}",
-      "chmod +x /tmp/monitoring-install.sh && /tmp/monitoring-install.sh",
-      "chmod +x /tmp/end.sh && sudo /tmp/end.sh",
+      "kubectl apply -f /home/core/init/calico.yaml",
+      "chmod +x /home/core/init/linode-addons.sh && /home/core/init/linode-addons.sh ${var.region} ${var.linode_token}",
+      "chmod +x /home/core/init/monitoring-install.sh && /home/core/init/monitoring-install.sh",
+      "chmod +x /home/core/init/update-operator.sh && /home/core/init/update-operator.sh",
+      "kubectl annotate node $${HOSTNAME} --overwrite container-linux-update.v1.coreos.com/reboot-paused=true",
+      "chmod +x /home/core/init/end.sh && sudo /home/core/init/end.sh",
]

connection {
153 changes: 153 additions & 0 deletions modules/masters/manifests/update-operator.yaml
@@ -0,0 +1,153 @@
# xref: https://raw.githubusercontent.com/coreos/container-linux-update-operator/master/examples/deploy/00-namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
name: reboot-coordinator
---
# xref: https://raw.githubusercontent.com/coreos/container-linux-update-operator/master/examples/deploy/update-operator.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: container-linux-update-operator
namespace: reboot-coordinator
spec:
replicas: 1
template:
metadata:
labels:
app: container-linux-update-operator
spec:
containers:
- name: update-operator
image: quay.io/coreos/container-linux-update-operator:v0.7.0
command:
- "/bin/update-operator"
env:
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
tolerations:
- key: node-role.kubernetes.io/master
operator: Exists
effect: NoSchedule
---
# xref: https://raw.githubusercontent.com/coreos/container-linux-update-operator/master/examples/deploy/update-agent.yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
name: container-linux-update-agent
namespace: reboot-coordinator
spec:
updateStrategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
template:
metadata:
labels:
app: container-linux-update-agent
spec:
containers:
- name: update-agent
image: quay.io/coreos/container-linux-update-operator:v0.7.0
command:
- "/bin/update-agent"
volumeMounts:
- mountPath: /var/run/dbus
name: var-run-dbus
- mountPath: /etc/coreos
name: etc-coreos
- mountPath: /usr/share/coreos
name: usr-share-coreos
- mountPath: /etc/os-release
name: etc-os-release
env:
# read by update-agent as the node name to manage reboots for
- name: UPDATE_AGENT_NODE
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
tolerations:
- key: node-role.kubernetes.io/master
operator: Exists
effect: NoSchedule
volumes:
- name: var-run-dbus
hostPath:
path: /var/run/dbus
- name: etc-coreos
hostPath:
path: /etc/coreos
- name: usr-share-coreos
hostPath:
path: /usr/share/coreos
- name: etc-os-release
hostPath:
path: /etc/os-release
---
# xref: https://raw.githubusercontent.com/coreos/container-linux-update-operator/master/examples/deploy/rbac/cluster-role.yaml
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: reboot-coordinator
rules:
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- list
- watch
- update
- apiGroups:
- ""
resources:
- configmaps
verbs:
- create
- get
- update
- list
- watch
- apiGroups:
- ""
resources:
- events
verbs:
- create
- watch
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- list
- delete
- apiGroups:
- "extensions"
resources:
- daemonsets
verbs:
- get
---
# xref: https://raw.githubusercontent.com/coreos/container-linux-update-operator/master/examples/deploy/rbac/cluster-role-binding.yaml
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: reboot-coordinator
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: reboot-coordinator
subjects:
- kind: ServiceAccount
namespace: reboot-coordinator
name: default

