This app allows you to handle complex modelling and forecast various scenarios against your cloud application.
Everything is defined in terms of the amount of resource required to service a single request, in addition to a baseline for that component.
It also allows you to describe a penality for failed requests like what you might be used to with an SLA.
See the demo.ts for a complete example of a simple three tier (frontend, app, database) application.
You can model all of your application tiers, you could even have this read from Kubernetes manifests if you like.
It assumes you've got an accurate performance benchmark on what your applications can perform, and assumes that things scale linearly.
These are the the key events in a timeline, it is an array of timestamps to the requests a second.
const pullers = [
{
time: 1234, // unix timestamp
requests: 500 // number of requests a second
},
...
]
The units of time to calculate in, minutes probably makes sense for most use cases.
const units = 60000
These are tiers of your application
Name | Definition |
---|---|
name | Friendly name of the component e.g. frontend , backend , database , cache , searchindex |
requestToCPU | Number of milicores required to handle a single request |
requestToMemory | Number of mb of memory required to handle a single request |
baselineCpu | Baseline milicores required for the application to run, standing still, before any requests are added |
baselineMemory | Baseline mb of memory required for the application to run, standing still, before any requests are added |
limitMemory | Limit of memory in mb |
limitCpu | Limit of the CPU milicores |
minReplica | Minimum replica count configured in the Horizontal Pod Autoscaler |
maxReplica | Maximum replica count configured in the Horizontal Pod Autoscaler |
scalingThresholdCpu | Percentage of CPU configured in the Horizontal Pod Autoscaler |
scalingIntervals | Number of units required for the component to be ready for requests |
const components = [
{
name: 'backend',
requestToCpu: 10,
requestToMemory: 43,
baselineCpu: 110,
baselineMemory: 700,
limitMemory: 1024,
limitCpu: 900,
minReplica: 5,
maxReplica: 1000,
scalingThresholdCpu: 75,
scalingIntervals: 2
}
...
]
Parameters for the Node type you've selected to use
Name | Definition |
---|---|
maxPods | Maximum number of pods that can run |
availableCpu | Number of milicores available to use |
availableMemory | Number mb of memory available to use |
cost | Cost per unit of time |
scalingIntervals | Number of units required for the node to be available and ready to schedule workload |
maxNodes | 1000 |
minNodes | 30 |
const node = {
// m5.8xlarge
maxPods: 234,
availableCpu: 31750,
availableMemory: 124971,
cost: 1.536 / 60,
scalingIntervals: 5,
maxNodes: 1000,
minNodes: 30,
}
The penalty to apply per request that fails, useful to consider your cost saving aspirations against the effective business cost of failing a request.
const failedRequestPenalty = 0.02
const output = calculate(pullers, units, components, node, failedRequestPenalty)
Name | Definition |
---|---|
time | Timestamp |
requests | Requests/second received in this interval |
components[].needCpuForRequests | CPU required to handle the requests (without replica*baseline applied) |
components[].needMemoryForRequests | Memory required to handle the requests (without replica*baseline applied) |
components[].needCpuReplica | Needed replica based on CPU need |
components[].needMemoryReplica | Needed Replica based on memory need |
components[].needReplica | Needed Replica |
components[].needCpu | Needed CPU |
components[].needMemory | Needed Memory |
components[].desiredReplica | Desired Replica |
components[].desiredCpu | Desired Cpu |
components[].desiredMemory | Desired Memory |
components[].pendingReplica | Number of replicas that are pending |
components[].readyReplica | Number replicas that are ready |
components[].readyRequestCapacity | Capacity for requests/second |
components[].failedRequests | Failed requests |
needPods | Total number of Pods needed |
needCpu | Total CPU needed |
needMemory | Total memory needed |
desiredPods | Total desired pods |
desiredCpu | Total desired cpu in mb |
desiredMemory | Total desired memory in mb |
needNodesByCpu | Number of nodes by CPU that are needed |
needNodesByMemory | Number of nodes by memory that are needed |
needNodesByPods | Number of nodes by pods that are needed |
desiredNodesByCpu | Number of nodes by cpu need that are desired |
desiredNodesByMemory | Number of nodes by memory need that are desired |
desiredNodesByPods | Number of nodes by pods that are desired |
desiredNodes | Number of nodes that are desired |
readyNodes | Number of nodes that are ready |
readyRequestCapacity | Capacity for requests/second |
failedRequests | Failed requests/second in this interval |
cost | Cost for this interval |
failedRequestPenalty | Total penalty for this interval |
[
{
time: 1609465200000,
requests: 473,
components: [
{
name: 'backend',
requestToCpu: 10,
requestToMemory: 43,
baselineCpu: 110,
baselineMemory: 700,
limitMemory: 1024,
limitCpu: 900,
minReplica: 5,
maxReplica: 1000,
scalingThresholdCpu: 75,
scalingIntervals: 2,
needCpuForRequests: 4730,
needMemoryForRequests: 20339,
needCpuReplica: 6,
needMemoryReplica: 63,
needReplica: 63,
needCpu: 11660,
needMemory: 64439,
desiredReplica: 8,
desiredCpu: 7200,
desiredMemory: 8192,
pendingReplica: 8,
readyReplica: 7,
readyRequestCapacity: 52,
failedRequests: 421
},
{
name: 'frontend',
requestToCpu: 5,
requestToMemory: 0.2,
baselineCpu: 38,
baselineMemory: 48,
limitMemory: 512,
limitCpu: 800,
minReplica: 50,
maxReplica: 2000,
scalingThresholdCpu: 99,
scalingIntervals: 2,
needCpuForRequests: 2365,
needMemoryForRequests: 94.60000000000001,
needCpuReplica: 4,
needMemoryReplica: 1,
needReplica: 50,
needCpu: 4265,
needMemory: 2494.6,
desiredReplica: 50,
desiredCpu: 40000,
desiredMemory: 25600,
pendingReplica: 50,
readyReplica: 50,
readyRequestCapacity: 7620,
failedRequests: 0
},
{
name: 'database',
requestToCpu: 32,
requestToMemory: 1.5,
baselineCpu: 2500,
baselineMemory: 2048,
limitMemory: 4096,
limitCpu: 15000,
minReplica: 8,
maxReplica: 50,
scalingThresholdCpu: 25,
scalingIntervals: 120,
needCpuForRequests: 15136,
needMemoryForRequests: 709.5,
needCpuReplica: 2,
needMemoryReplica: 1,
needReplica: 8,
needCpu: 35136,
needMemory: 17093.5,
desiredReplica: 8,
desiredCpu: 120000,
desiredMemory: 32768,
pendingReplica: 8,
readyReplica: 8,
readyRequestCapacity: 3125,
failedRequests: 0
}
],
nodes: {
maxPods: 234,
availableCpu: 31750,
availableMemory: 124971,
cost: 0.0256,
scalingIntervals: 5,
maxNodes: 1000,
minNodes: 30
},
needPods: 121,
needCpu: 51061,
needMemory: 84027.1,
desiredPods: 66,
desiredCpu: 167200,
desiredMemory: 66560,
needNodesByCpu: 2,
needNodesByMemory: 1,
needNodesByPods: 1,
desiredNodesByCpu: 6,
desiredNodesByMemory: 1,
desiredNodesByPods: 1,
desiredNodes: 30,
readyNodes: 30,
readyRequestCapacity: 52,
failedRequests: 421,
cost: 0.768,
failedRequestPenalty: 8.42
},
...
]
You can graph the output to help visualize it, I used Grafana here.