Improve README.md
compumike committed Oct 23, 2020
1 parent 1d13489 commit b955fcf
Showing 2 changed files with 58 additions and 11 deletions.
63 changes: 56 additions & 7 deletions README.md
@@ -1,6 +1,6 @@
# hairpin-proxy

-PROXY protocol support for internal-to-LoadBalancer traffic for Kubernetes Ingress users.
+PROXY protocol support for internal-to-LoadBalancer traffic for Kubernetes Ingress users, specifically for cert-manager self-checks.

If you've had problems with ingress-nginx, cert-manager, LetsEncrypt ACME HTTP01 self-check failures, and the PROXY protocol, read on.

@@ -21,9 +21,9 @@ In this case, Kubernetes networking is too smart for its own good. [See upstream

An ingress controller service deploys a LoadBalancer, which is provisioned by your cloud provider. Kubernetes notices the LoadBalancer's external IP address. As an "optimization", kube-proxy on each node writes iptables rules that rewrite all outbound traffic to the LoadBalancer's external IP address to instead be redirected to the cluster-internal Service ClusterIP address. If your cloud load balancer doesn't modify the traffic, then indeed this is a helpful optimization.

-However, when you have the PROXY protocol enabled, the external load balancer _does_ modify the traffic, prepending the PROXY line before each TCP connection. If you connect directly to the web server internally, bypassing the external load balancer, then it will receive traffic _without_ the PROXY line. In the case of ingress-nginx with `use-proxy-protocol: "true"`, you'll find that NGINX fails when receiving a bare GET request. As a result, accessing http://your-site/ from inside the cluster fails!
+However, when you have the PROXY protocol enabled, the external load balancer _does_ modify the traffic, prepending the PROXY line before each TCP connection. If you connect directly to the web server internally, bypassing the external load balancer, then it will receive traffic _without_ the PROXY line. In the case of ingress-nginx with `use-proxy-protocol: "true"`, you'll find that NGINX fails when receiving a bare GET request. As a result, accessing http://subdomain.example.com/ from inside the cluster fails!

-This is particularly a problem when using cert-manager for provisioning SSL certificates. Cert-manager uses HTTP01 validation, and before asking LetsEncrypt to hit http://your-site/some-special-url, it tries to access this URL itself as a self-check. This fails. Cert-manager does not allow you to skip the self-check. As a result, your certificate is never provisioned, even though the verification URL would be perfectly accessible externally. See upstream cert-manager issues: [proxy_protocol mode breaks HTTP01 challenge Check stage](https://github.com/jetstack/cert-manager/issues/466), [http-01 self check failed for domain](https://github.com/jetstack/cert-manager/issues/656), [Self check always fail](https://github.com/jetstack/cert-manager/issues/863)
+This is particularly a problem when using cert-manager for provisioning SSL certificates. Cert-manager uses HTTP01 validation, and before asking LetsEncrypt to hit http://subdomain.example.com/.well-known/acme-challenge/some-special-code, it tries to access this URL itself as a self-check. This fails. Cert-manager does not allow you to skip the self-check. As a result, your certificate is never provisioned, even though the verification URL would be perfectly accessible externally. See upstream cert-manager issues: [proxy_protocol mode breaks HTTP01 challenge Check stage](https://github.com/jetstack/cert-manager/issues/466), [http-01 self check failed for domain](https://github.com/jetstack/cert-manager/issues/656), [Self check always fail](https://github.com/jetstack/cert-manager/issues/863)
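Concretely, the PROXY protocol v1 header is a single text line that the load balancer prepends before the client's bytes, so a request arriving through the load balancer looks like this (the addresses and challenge path are illustrative):

```
PROXY TCP4 198.51.100.22 203.0.113.7 51234 80
GET /.well-known/acme-challenge/some-special-code HTTP/1.1
Host: subdomain.example.com
```

A cluster-internal connection that bypasses the load balancer begins directly with the `GET` line, which NGINX, when configured with `use-proxy-protocol: "true"`, rejects because it is not a valid PROXY header.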

## Possible Solutions

@@ -38,14 +38,63 @@ None of these are particularly easy without modifying upstream packages, and the

## The hairpin-proxy Solution

-1. hairpin-proxy intercepts and modifies cluster-internal DNS lookups for hostnames that are served by your ingress controller, pointing them to the IP of an internal `hairpin-proxy-haproxy` service instead. (This is managed by `hairpin-proxy-controller`, which simply watches the Kubernetes API for new/modified Ingress resources and updates the CoreDNS ConfigMap when necessary.)
+1. hairpin-proxy intercepts and modifies cluster-internal DNS lookups for hostnames that are served by your ingress controller, pointing them to the IP of an internal `hairpin-proxy-haproxy` service instead. (This is managed by `hairpin-proxy-controller`, which simply watches the Kubernetes API for new/modified Ingress resources, examines their `spec.tls.hosts`, and updates the CoreDNS ConfigMap when necessary.)
2. The internal `hairpin-proxy-haproxy` service runs a minimal HAProxy instance which is configured to append the PROXY line and forward the traffic on to the internal ingress controller.
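On the HAProxy side, only a few lines of configuration are needed. Here is a hypothetical minimal sketch; the backend address assumes ingress-nginx's default in-cluster service name, and the actual config shipped in the hairpin-proxy image may differ:

```
# Hypothetical minimal haproxy.cfg sketch (not the actual hairpin-proxy config)
defaults
    mode tcp
    timeout connect 5s
    timeout client 30s
    timeout server 30s

frontend http_in
    bind *:80
    default_backend ingress_http

backend ingress_http
    # send-proxy prepends the PROXY protocol line to each forwarded connection
    server ingress ingress-nginx-controller.ingress-nginx.svc.cluster.local:80 send-proxy
```

The `send-proxy` server option is what makes HAProxy emit the PROXY line, so the ingress controller sees the same framing it expects from the external load balancer.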

As a result, when pods in your cluster (such as cert-manager) try to access http://your-site/, they resolve to the hairpin-proxy, which adds the PROXY line and forwards the traffic to your `ingress-nginx`. NGINX parses the PROXY protocol just as it would if the request had come from the external load balancer, so it sees a valid request and handles it identically to external requests.

-## Deployment
+## Quick Start Deployment

### Step 0: Confirm that HTTP does NOT work from containers in your cluster

Suppose `http://subdomain.example.com/` is served by ingress-nginx in your cluster, behind a cloud load balancer with the PROXY protocol enabled. You've just tried to add `cert-manager`, but your certificates are stuck because the self-check is failing.

Get a shell within your cluster and try to access the site to confirm that it isn't working:

```shell
kubectl run my-test-container --image=alpine -it --rm -- /bin/sh
apk add bind-tools curl
dig subdomain.example.com
curl http://subdomain.example.com/
curl http://subdomain.example.com/ --haproxy-protocol
```

The `dig` should show the external load balancer's IP address. The first `curl` should fail with `Empty reply from server` because NGINX expects the PROXY protocol. However, the second `curl` with `--haproxy-protocol` should succeed, indicating that despite the external-looking IP address, Kubernetes is rewriting the traffic to bypass the external load balancer.
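What `--haproxy-protocol` does can also be reproduced by hand, which makes the failure mode concrete: open a TCP connection and write a PROXY protocol v1 line before the HTTP request. A minimal Python sketch (the addresses in the PROXY line are illustrative placeholders; this is a diagnostic aid, not part of hairpin-proxy itself):

```python
import socket


def proxy_v1_line(src: str, dst: str, sport: int, dport: int) -> bytes:
    """Build a PROXY protocol v1 header line."""
    return f"PROXY TCP4 {src} {dst} {sport} {dport}\r\n".encode()


def get_with_proxy_line(host: str, port: int = 80) -> bytes:
    """Mimic `curl --haproxy-protocol`: send the PROXY line, then a GET,
    and return the raw response bytes."""
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(proxy_v1_line("10.0.0.1", "10.0.0.2", 40000, port))
        s.sendall(f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n".encode())
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
        return b"".join(chunks)
```

Against a PROXY-protocol-enabled NGINX, `get_with_proxy_line("subdomain.example.com")` should return a normal HTTP response, while the same request without the PROXY line gets an empty reply.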

### Step 1: Install hairpin-proxy in your Kubernetes cluster

```shell
kubectl apply -f https://raw.githubusercontent.com/compumike/hairpin-proxy/master/deploy.yml
```

If you're using [ingress-nginx](https://kubernetes.github.io/ingress-nginx/) and optionally [cert-manager](https://github.com/jetstack/cert-manager), it will work out of the box.

### Step 2: Confirm that your CoreDNS configuration was updated

```shell
-kubectl apply -f deploy.yml
+kubectl get configmap -n kube-system coredns -o=jsonpath='{.data.Corefile}'
```
-Coming soon.

Once the hairpin-proxy-controller pod starts, you should immediately see one [rewrite](https://coredns.io/plugins/rewrite/) line per TLS-enabled ingress host, such as:

```
rewrite name subdomain.example.com hairpin-proxy.hairpin-proxy.svc.cluster.local # Added by hairpin-proxy
```

Note that the comment `# Added by hairpin-proxy` is used to prevent hairpin-proxy-controller from modifying any other rewrites you may have.
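The marker comment makes updates idempotent: the controller can drop every line it previously added and regenerate them from the current set of Ingress hosts, leaving all other rewrites alone. A simplified, hypothetical Python sketch of that logic (the actual controller implementation may differ):

```python
# Hypothetical sketch of hairpin-proxy-controller's Corefile update logic.
MARKER = "# Added by hairpin-proxy"
TARGET = "hairpin-proxy.hairpin-proxy.svc.cluster.local"


def updated_corefile(corefile: str, hosts: list) -> str:
    """Return a Corefile with exactly one rewrite line per host,
    preserving every line that hairpin-proxy did not add."""
    # Drop only our own previously-added lines, identified by the marker.
    kept = [line for line in corefile.splitlines() if MARKER not in line]
    rewrites = [f"    rewrite name {h} {TARGET} {MARKER}" for h in hosts]
    out = []
    for line in kept:
        out.append(line)
        # Insert the rewrite rules just inside the main server block.
        if line.strip().startswith(".:53"):
            out.extend(rewrites)
    return "\n".join(out) + "\n"
```

Running it twice with the same hosts yields the same Corefile, and rewrite lines without the marker are left untouched.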

### Step 3: Confirm that your DNS has propagated and that HTTP now works from containers in your cluster

```shell
kubectl run my-test-container --image=alpine -it --rm -- /bin/sh
apk add bind-tools curl
dig subdomain.example.com
dig hairpin-proxy.hairpin-proxy.svc.cluster.local
curl http://subdomain.example.com/
```

The first `dig` should now show an internal service IP address (generally `10.x.y.z`), matching the second `dig`, and the `curl` should succeed.

NOTE: CoreDNS reloads its configuration periodically and caches answers, so even after the `rewrite` rules appear in Step 2, it may take another minute or two before queries resolve correctly. Be patient. You may wish to run `watch -n 1 dig subdomain.example.com` to see when the changeover happens.

At this point, cert-manager's self-check will pass, and you'll get valid LetsEncrypt certificates within a few minutes.
6 changes: 2 additions & 4 deletions deploy.yml
@@ -23,8 +23,7 @@ spec:
app: hairpin-proxy-haproxy
spec:
containers:
-      - image: compumike/hairpin-proxy-haproxy:latest
-        imagePullPolicy: Always
+      - image: compumike/hairpin-proxy-haproxy:0.1
name: main

---
@@ -146,6 +145,5 @@ spec:
runAsUser: 405
runAsGroup: 65533
containers:
-      - image: compumike/hairpin-proxy-controller:latest
-        imagePullPolicy: Always
+      - image: compumike/hairpin-proxy-controller:0.1
name: main
