GitHub - mikhailknyazev/kube-course-litmus2-gke: Litmus 2.x resources for Udemy course "Configuring Kubernetes for Reliability with LitmusChaos"

Litmus 2.x resources for "Configuring Kubernetes for Reliability with LitmusChaos"

This repo contains the resources used in the Litmus 2.x demonstration in this video tutorial, which is part of the Udemy course Configuring Kubernetes for Reliability with LitmusChaos. The resources are referenced in the instructions for replicating the demo environment. Basic details about the bank-of-anthos & podtato-kill-with-http-probe chaos workflows described in the tutorial are also provided.

Prerequisites

Create a multi-node (preferably 3-node) GKE cluster with the node image type Ubuntu with Docker (ubuntu). Configure cluster access for kubectl.

Demo Environment Configuration

The following steps provide the testbed configuration instructions.

Step-1: Application Deployment

Deploy the test applications that will be subjected to chaos.

Step-2: Install LitmusChaos

Install the LitmusChaos 2.x control plane (chaos center) & the local cluster-mode chaos (self) agent.

Note: Update your GCP firewall rules to allow traffic to/from the litmusportal server NodePort so that the chaos (self) agent can function correctly.

Step-3: Set up Observability Infra

Set up the observability infrastructure with kube-prometheus-stack.

Step-4: Begin Monitoring Litmus & Application Metrics

- Deploy the blackbox exporter to track the podtato-head service's operational characteristics
- Create servicemonitor custom resources mapped to the chaos exporter and the blackbox exporter
- Add the newly created servicemonitors to the prometheus CR instance & apply it to start scraping the metrics
- Launch the chaos-instrumented dashboard on Grafana to visualize service metrics
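
As a rough illustration of the servicemonitor step above, the sketch below builds a ServiceMonitor manifest as a Python dictionary and prints it as JSON, which kubectl apply -f also accepts. The resource name, the namespaces, the label selector (app: chaos-exporter), the port name (tcp) and the scrape interval are assumptions made for illustration only; prefer the values from the manifests shipped with this repo.

```python
import json


def service_monitor(name, namespace, target_namespace, match_labels, port, interval="10s"):
    """Build a ServiceMonitor manifest (monitoring.coreos.com/v1) as a dict."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # Labels the Prometheus CR can select on (assumed value).
            "labels": {"name": name},
        },
        "spec": {
            # Namespace where the target Service lives.
            "namespaceSelector": {"matchNames": [target_namespace]},
            # Labels carried by the target Service.
            "selector": {"matchLabels": match_labels},
            # Named Service port to scrape, and how often.
            "endpoints": [{"port": port, "interval": interval}],
        },
    }


# All values below are hypothetical placeholders.
chaos_exporter_monitor = service_monitor(
    name="chaos-exporter",
    namespace="monitoring",
    target_namespace="litmus",
    match_labels={"app": "chaos-exporter"},
    port="tcp",
)

print(json.dumps(chaos_exporter_monitor, indent=2))
```

The Prometheus custom resource picks up ServiceMonitors through its serviceMonitorSelector field, so the labels under metadata.labels have to match whatever that selector expects in your installation.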

Chaos Usecases

This section describes the intent & functioning behind the two sample chaos workflows used in the demonstration.

Prerequisites

Period: 0m0s-5m22s
Objectives:
- Introduction to the LitmusChaos control plane (chaos center, viz litmus portal)
- Feature Overview

Bank of Anthos BlackHole Chaos Workflow

Period: 5m23s-11m20s
Objectives:
- Creation of a chaos workflow by selecting & tuning an experiment from the integrated chaoshub
- Execution & visualization of workflow progress
- Examination of experiment logs & chaosresults
Usecase: The workflow injects 100% network packet loss in the balancereader pod, causing a degraded user experience and a semi-operational/faulty e-banking app.
Possible Mitigation/Resilience Fix: Configure services with liveness probes/health-checks that surface accessibility errors (by killing/crashing the containers), and run additional replicas of the microservice to serve requests. Further fixes could include middleware that re-routes requests to replicas on other nodes/geo-locations based on the (degraded) performance characteristics of the service.
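
To make the fault concrete, here is a minimal sketch of what a ChaosEngine for a 100% packet-loss experiment could look like, built as a Python dictionary and printed as JSON. It is not the workflow generated by the chaos center in the demo. The namespace (bank), the label (app=balancereader), the service account (litmus-admin) and the 60-second duration are assumptions; the experiment name pod-network-loss and the NETWORK_PACKET_LOSS_PERCENTAGE and TOTAL_CHAOS_DURATION tunables come from the Litmus experiment catalogue.

```python
import json


def network_loss_engine(app_ns, app_label, loss_percent, duration_s, service_account):
    """Build a ChaosEngine manifest for the pod-network-loss experiment."""
    return {
        "apiVersion": "litmuschaos.io/v1alpha1",
        "kind": "ChaosEngine",
        "metadata": {"name": "balancereader-network-loss", "namespace": app_ns},
        "spec": {
            "engineState": "active",
            # Which workload the fault is injected into.
            "appinfo": {"appns": app_ns, "applabel": app_label, "appkind": "deployment"},
            "chaosServiceAccount": service_account,
            "experiments": [
                {
                    "name": "pod-network-loss",
                    "spec": {
                        "components": {
                            # Kubernetes env values must be strings.
                            "env": [
                                {"name": "NETWORK_PACKET_LOSS_PERCENTAGE", "value": str(loss_percent)},
                                {"name": "TOTAL_CHAOS_DURATION", "value": str(duration_s)},
                            ]
                        }
                    },
                }
            ],
        },
    }


# Hypothetical values; adjust to the namespace and labels of your deployment.
engine = network_loss_engine(
    app_ns="bank",
    app_label="app=balancereader",
    loss_percent=100,
    duration_s=60,
    service_account="litmus-admin",
)

print(json.dumps(engine, indent=2))
```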

Podtato-Head Pod Kill Chaos Workflow

Period: 11m21s-16m41s
Objectives:
- Creation of a chaos workflow from a pre-existing template
- Steady-state hypothesis validation throughout the chaos duration using the Litmus HTTP Probe
- Visualization of chaos impact/manual SLO checks via chaos-interleaved Grafana dashboards
- Examination of experiment logs & chaosresults (with probe success/failures)
Usecase: The workflow injects a pod kill/deletion fault on a single-replica podtato-head application causing the availability percentage to drop below the set threshold & also violating access latency limits until the pod is rescheduled and initialized.
Possible Mitigation/Resilience Fix: Follow deployment best practices with multi-replica deployments so that kube-proxy can route requests to other live end-points.
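
The manual SLO check described in this use case can be sketched as a small calculation over probe samples. This is only an illustration of the idea, not how the Litmus HTTP Probe or the Grafana dashboard computes its result. The sample data, the 95% availability threshold and the 500 ms latency limit are made-up values, and ProbeSample and slo_report are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ProbeSample:
    status_code: int   # HTTP status returned to the probe
    latency_ms: float  # time taken to answer, in milliseconds


def availability_percent(samples):
    """Share of samples answered with HTTP 200, as a percentage."""
    if not samples:
        raise ValueError("at least one sample is required")
    ok = sum(1 for s in samples if s.status_code == 200)
    return 100.0 * ok / len(samples)


def slo_report(samples, min_availability, max_latency_ms):
    """Check availability and latency against the given limits."""
    availability = availability_percent(samples)
    # Only successful responses count towards the latency check.
    slow = sum(
        1 for s in samples
        if s.status_code == 200 and s.latency_ms > max_latency_ms
    )
    return {
        "availability": availability,
        "availability_ok": availability >= min_availability,
        "slow_responses": slow,
        "latency_ok": slow == 0,
    }


# Made-up timeline: healthy, pod killed (errors), slow first response
# while the rescheduled pod initializes.
samples = (
    [ProbeSample(200, 40.0)] * 6
    + [ProbeSample(503, 0.0)] * 3
    + [ProbeSample(200, 900.0)]
)

report = slo_report(samples, min_availability=95.0, max_latency_ms=500.0)
print(report)
```

With a single replica, every request sent while the pod is being rescheduled fails, so both checks are violated in this made-up timeline; with several replicas the failed samples would be expected to shrink or disappear.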

Conclusion

To learn more about LitmusChaos 2.x, refer to the documentation. Have a look at the Udemy course Configuring Kubernetes for Reliability with LitmusChaos.
