appvia/cloud-spend-forecaster

Cloud Spend Forecaster

This app lets you build complex models and forecast various scenarios for your cloud application.

Everything is defined in terms of the amount of resource required to service a single request, in addition to a baseline for that component.

It also allows you to describe a penalty for failed requests, similar to what you might be used to with an SLA.

See demo.ts for a complete example of a simple three-tier (frontend, app, database) application.

You can model all of your application tiers; you could even have this read from Kubernetes manifests if you like.

It assumes you have an accurate performance benchmark of what your applications can handle, and that resource usage scales linearly with the number of requests.

Usage

Pullers

These are the key events in a timeline: an array that maps timestamps to the number of requests per second.

const pullers = [
  {
    time: 1234, // unix timestamp
    requests: 500 // number of requests a second
  },
  // ...
]
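
As an illustration, a pullers array can be generated rather than written by hand. The sketch below builds one entry per step from a list of request rates. `buildPullers`, the start time, and the rates are hypothetical and not part of the library; timestamps are assumed to be in milliseconds, matching the `time` values in the example output.

```typescript
// Hypothetical helper, not part of the library: turn a list of request
// rates into the { time, requests } entries described above.
interface Puller {
  time: number      // timestamp in milliseconds (assumed, as in the example output)
  requests: number  // requests per second from this point in the timeline
}

function buildPullers(start: number, stepMs: number, rates: number[]): Puller[] {
  return rates.map((requests, i) => ({ time: start + i * stepMs, requests }))
}

const HOUR_MS = 60 * 60 * 1000

// Made-up traffic shape: quiet, ramp up, peak, decline.
const examplePullers = buildPullers(1609459200000, HOUR_MS, [50, 200, 500, 150])
console.log(examplePullers)
```

Only the points where the request rate changes need an entry.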

Units

The unit of time to calculate in, expressed in milliseconds. One minute (60000) probably makes sense for most use cases.

const units = 60000
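
Settings that are counted in units, such as `scalingIntervals`, depend on this value. The sketch below converts between intervals and wall-clock time, assuming the unit is given in milliseconds. The helper names are illustrative only and not part of the library.

```typescript
const unitMs = 60000 // one minute, as in the line above

// Convert a number of intervals (e.g. a scalingIntervals value) to minutes.
function intervalsToMinutes(intervals: number, unit: number): number {
  return (intervals * unit) / 60000
}

// Number of whole intervals between two millisecond timestamps.
function intervalsBetween(startMs: number, endMs: number, unit: number): number {
  return Math.floor((endMs - startMs) / unit)
}

console.log(intervalsToMinutes(2, unitMs))   // 2 minutes
console.log(intervalsToMinutes(120, unitMs)) // 120 minutes, i.e. two hours
console.log(intervalsBetween(0, 24 * 60 * 60 * 1000, unitMs)) // 1440 intervals in a day
```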

Components

These are the tiers of your application.

| Name | Definition |
| --- | --- |
| name | Friendly name of the component, e.g. frontend, backend, database, cache, searchindex |
| requestToCpu | Number of millicores required to handle a single request |
| requestToMemory | Number of MB of memory required to handle a single request |
| baselineCpu | Baseline millicores required for the application to run, standing still, before any requests are added |
| baselineMemory | Baseline MB of memory required for the application to run, standing still, before any requests are added |
| limitMemory | Memory limit in MB |
| limitCpu | CPU limit in millicores |
| minReplica | Minimum replica count configured in the Horizontal Pod Autoscaler |
| maxReplica | Maximum replica count configured in the Horizontal Pod Autoscaler |
| scalingThresholdCpu | CPU percentage configured in the Horizontal Pod Autoscaler |
| scalingIntervals | Number of units required for the component to be ready for requests |

const components = [
  {
    name: 'backend',
    requestToCpu: 10,
    requestToMemory: 43,
    baselineCpu: 110,
    baselineMemory: 700,
    limitMemory: 1024,
    limitCpu: 900,
    minReplica: 5,
    maxReplica: 1000,
    scalingThresholdCpu: 75,
    scalingIntervals: 2
  },
  // ...
]
]
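
To see how these fields interact, the sketch below estimates how many requests per second one replica can serve before it hits its CPU or memory limit. It assumes the headroom of a replica is its limit minus its baseline, which is an inference from the example output further down and not taken from the library source. `requestsPerReplica` is a hypothetical helper.

```typescript
interface ComponentLimits {
  requestToCpu: number    // millicores per request
  requestToMemory: number // MB per request
  baselineCpu: number
  baselineMemory: number
  limitCpu: number
  limitMemory: number
}

// Values from the 'backend' component above.
const backendLimits: ComponentLimits = {
  requestToCpu: 10,
  requestToMemory: 43,
  baselineCpu: 110,
  baselineMemory: 700,
  limitCpu: 900,
  limitMemory: 1024
}

// Assumption: usable capacity per replica is (limit - baseline).
function requestsPerReplica(c: ComponentLimits) {
  const byCpu = (c.limitCpu - c.baselineCpu) / c.requestToCpu
  const byMemory = (c.limitMemory - c.baselineMemory) / c.requestToMemory
  return { byCpu, byMemory, bottleneck: byCpu <= byMemory ? 'cpu' : 'memory' }
}

const backendHeadroom = requestsPerReplica(backendLimits)
console.log(backendHeadroom) // byCpu is 79, byMemory is about 7.5, bottleneck is 'memory'
```

With these numbers the backend is memory bound, so raising `limitMemory` would change the replica count far more than raising `limitCpu`.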

Node

Parameters for the node type you've selected.

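| Name | Definition |
| --- | --- |
| maxPods | Maximum number of pods that can run on a node |
| availableCpu | Number of millicores available to use |
| availableMemory | Number of MB of memory available to use |
| cost | Cost per unit of time |
| scalingIntervals | Number of units required for the node to be available and ready to schedule workload |
| maxNodes | Maximum number of nodes (1000 in the example below) |
| minNodes | Minimum number of nodes (30 in the example below) |

const node = {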
  // m5.8xlarge
  maxPods: 234,
  availableCpu: 31750,
  availableMemory: 124971,
  cost: 1.536 / 60,
  scalingIntervals: 5,
  maxNodes: 1000,
  minNodes: 30,
}
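
The sketch below shows one way the node parameters could translate into a node count and a cost per interval. The sizing rule (round up per resource, take the largest, clamp between `minNodes` and `maxNodes`) is inferred from the example output and is an assumption, not the library's confirmed logic. `nodesFor` is a hypothetical helper, and the totals passed in are the `desiredCpu`, `desiredMemory`, and `desiredPods` values from the example output. The `1.536 / 60` cost looks like an hourly price divided into one-minute units.

```typescript
interface NodeType {
  maxPods: number
  availableCpu: number    // millicores
  availableMemory: number // MB
  cost: number            // cost per unit of time
  minNodes: number
  maxNodes: number
}

const nodeType: NodeType = {
  maxPods: 234,
  availableCpu: 31750,
  availableMemory: 124971,
  cost: 1.536 / 60, // assumed: hourly price spread over one-minute units
  minNodes: 30,
  maxNodes: 1000
}

// Assumed sizing rule: the scarcest resource decides, within min/max bounds.
function nodesFor(totalCpu: number, totalMemory: number, totalPods: number, n: NodeType) {
  const byCpu = Math.ceil(totalCpu / n.availableCpu)
  const byMemory = Math.ceil(totalMemory / n.availableMemory)
  const byPods = Math.ceil(totalPods / n.maxPods)
  const wanted = Math.max(byCpu, byMemory, byPods)
  const nodes = Math.min(n.maxNodes, Math.max(n.minNodes, wanted))
  return { byCpu, byMemory, byPods, nodes }
}

const sized = nodesFor(167200, 66560, 66, nodeType)
const intervalCost = sized.nodes * nodeType.cost

console.log(sized)        // byCpu 6, byMemory 1, byPods 1, nodes 30
console.log(intervalCost) // roughly 0.768
```

Here `minNodes` dominates: the workload needs only 6 nodes by CPU, but 30 are paid for.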

Failed Request Penalty

The penalty to apply for each request that fails. This is useful for weighing your cost-saving aspirations against the effective business cost of failing a request.

const failedRequestPenalty = 0.02
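
As a rough illustration of that trade-off, the sketch below compares the penalty for one interval with the node cost for the same interval. The figures (421 failed requests, a node cost of 0.768) are taken from the example output; multiplying failed requests by the penalty matches that output, but treat it as an assumption about how the library computes it. `penaltyFor` is a hypothetical helper.

```typescript
const penaltyPerFailedRequest = 0.02

// Assumed rule: penalty for an interval = failed requests * penalty per request.
function penaltyFor(failedRequests: number, penaltyPerRequest: number): number {
  return failedRequests * penaltyPerRequest
}

const exampleNodeCost = 0.768                                   // node cost for the interval
const examplePenalty = penaltyFor(421, penaltyPerFailedRequest) // about 8.42

console.log({
  exampleNodeCost,
  examplePenalty,
  penaltyDominates: examplePenalty > exampleNodeCost
})
```

In this interval the penalty is roughly ten times the node cost, which suggests the cluster is under-provisioned for the business cost of a failure.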

RUN!

const output = calculate(pullers, units, components, node, failedRequestPenalty)
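
Once you have the output, a quick summary is often more useful than the raw intervals. The sketch below totals cost and penalty across intervals. It assumes only the `cost`, `failedRequests`, and `failedRequestPenalty` fields documented below, and it uses stand-in data rather than a real call to `calculate`: the first entry mirrors the example output and the second is made up. `summarise` is a hypothetical helper.

```typescript
// Subset of the documented output fields that the summary needs.
interface IntervalResult {
  time: number
  requests: number
  failedRequests: number
  cost: number
  failedRequestPenalty: number
}

function summarise(results: IntervalResult[]) {
  return results.reduce(
    (acc, r) => ({
      cost: acc.cost + r.cost,
      penalty: acc.penalty + r.failedRequestPenalty,
      failingIntervals: acc.failingIntervals + (r.failedRequests > 0 ? 1 : 0)
    }),
    { cost: 0, penalty: 0, failingIntervals: 0 }
  )
}

// Stand-in data shaped like the output of calculate().
const sampleOutput: IntervalResult[] = [
  { time: 1609465200000, requests: 473, failedRequests: 421, cost: 0.768, failedRequestPenalty: 8.42 },
  { time: 1609465260000, requests: 473, failedRequests: 0, cost: 0.768, failedRequestPenalty: 0 }
]

const totals = summarise(sampleOutput)
console.log(totals)
```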

Example output

| Name | Definition |
| --- | --- |
| time | Timestamp |
| requests | Requests/second received in this interval |
| components[].needCpuForRequests | CPU required to handle the requests (without replica*baseline applied) |
| components[].needMemoryForRequests | Memory required to handle the requests (without replica*baseline applied) |
| components[].needCpuReplica | Needed replicas based on CPU need |
| components[].needMemoryReplica | Needed replicas based on memory need |
| components[].needReplica | Needed replicas |
| components[].needCpu | Needed CPU |
| components[].needMemory | Needed memory |
| components[].desiredReplica | Desired replicas |
| components[].desiredCpu | Desired CPU |
| components[].desiredMemory | Desired memory |
| components[].pendingReplica | Number of replicas that are pending |
| components[].readyReplica | Number of replicas that are ready |
| components[].readyRequestCapacity | Capacity for requests/second |
| components[].failedRequests | Failed requests |
| needPods | Total number of pods needed |
| needCpu | Total CPU needed |
| needMemory | Total memory needed |
| desiredPods | Total desired pods |
| desiredCpu | Total desired CPU in millicores |
| desiredMemory | Total desired memory in MB |
| needNodesByCpu | Number of nodes needed, based on CPU |
| needNodesByMemory | Number of nodes needed, based on memory |
| needNodesByPods | Number of nodes needed, based on pod count |
| desiredNodesByCpu | Number of nodes desired, based on CPU |
| desiredNodesByMemory | Number of nodes desired, based on memory |
| desiredNodesByPods | Number of nodes desired, based on pod count |
| desiredNodes | Number of nodes that are desired |
| readyNodes | Number of nodes that are ready |
| readyRequestCapacity | Capacity for requests/second |
| failedRequests | Failed requests/second in this interval |
| cost | Cost for this interval |
| failedRequestPenalty | Total penalty for this interval |

[
  {
    time: 1609465200000,
    requests: 473,
    components: [
      {
        name: 'backend',
        requestToCpu: 10,
        requestToMemory: 43,
        baselineCpu: 110,
        baselineMemory: 700,
        limitMemory: 1024,
        limitCpu: 900,
        minReplica: 5,
        maxReplica: 1000,
        scalingThresholdCpu: 75,
        scalingIntervals: 2,
        needCpuForRequests: 4730,
        needMemoryForRequests: 20339,
        needCpuReplica: 6,
        needMemoryReplica: 63,
        needReplica: 63,
        needCpu: 11660,
        needMemory: 64439,
        desiredReplica: 8,
        desiredCpu: 7200,
        desiredMemory: 8192,
        pendingReplica: 8,
        readyReplica: 7,
        readyRequestCapacity: 52,
        failedRequests: 421
      },
      {
        name: 'frontend',
        requestToCpu: 5,
        requestToMemory: 0.2,
        baselineCpu: 38,
        baselineMemory: 48,
        limitMemory: 512,
        limitCpu: 800,
        minReplica: 50,
        maxReplica: 2000,
        scalingThresholdCpu: 99,
        scalingIntervals: 2,
        needCpuForRequests: 2365,
        needMemoryForRequests: 94.60000000000001,
        needCpuReplica: 4,
        needMemoryReplica: 1,
        needReplica: 50,
        needCpu: 4265,
        needMemory: 2494.6,
        desiredReplica: 50,
        desiredCpu: 40000,
        desiredMemory: 25600,
        pendingReplica: 50,
        readyReplica: 50,
        readyRequestCapacity: 7620,
        failedRequests: 0
      },
      {
        name: 'database',
        requestToCpu: 32,
        requestToMemory: 1.5,
        baselineCpu: 2500,
        baselineMemory: 2048,
        limitMemory: 4096,
        limitCpu: 15000,
        minReplica: 8,
        maxReplica: 50,
        scalingThresholdCpu: 25,
        scalingIntervals: 120,
        needCpuForRequests: 15136,
        needMemoryForRequests: 709.5,
        needCpuReplica: 2,
        needMemoryReplica: 1,
        needReplica: 8,
        needCpu: 35136,
        needMemory: 17093.5,
        desiredReplica: 8,
        desiredCpu: 120000,
        desiredMemory: 32768,
        pendingReplica: 8,
        readyReplica: 8,
        readyRequestCapacity: 3125,
        failedRequests: 0
      }
    ],
    nodes: {
      maxPods: 234,
      availableCpu: 31750,
      availableMemory: 124971,
      cost: 0.0256,
      scalingIntervals: 5,
      maxNodes: 1000,
      minNodes: 30
    },
    needPods: 121,
    needCpu: 51061,
    needMemory: 84027.1,
    desiredPods: 66,
    desiredCpu: 167200,
    desiredMemory: 66560,
    needNodesByCpu: 2,
    needNodesByMemory: 1,
    needNodesByPods: 1,
    desiredNodesByCpu: 6,
    desiredNodesByMemory: 1,
    desiredNodesByPods: 1,
    desiredNodes: 30,
    readyNodes: 30,
    readyRequestCapacity: 52,
    failedRequests: 421,
    cost: 0.768,
    failedRequestPenalty: 8.42
  },
  // ...
]
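
The `need*` values for the backend above can be reproduced by hand, which helps when reading the output. The formulas in this sketch are inferred from the example numbers (headroom per replica taken as limit minus baseline, replica counts rounded up, capacity rounded down); they are assumptions, not the library's confirmed implementation. The `desired*`, `pending*`, and `ready*` replica counts depend on the scaling simulation and are not modelled here, so the ready replica count of 7 is simply copied from the output.

```typescript
interface Tier {
  requestToCpu: number
  requestToMemory: number
  baselineCpu: number
  baselineMemory: number
  limitCpu: number
  limitMemory: number
  minReplica: number
  maxReplica: number
}

const backendTier: Tier = {
  requestToCpu: 10, requestToMemory: 43,
  baselineCpu: 110, baselineMemory: 700,
  limitCpu: 900, limitMemory: 1024,
  minReplica: 5, maxReplica: 1000
}

// Assumed linear model: per-request need plus a baseline for every replica.
function need(requests: number, c: Tier) {
  const needCpuForRequests = requests * c.requestToCpu
  const needMemoryForRequests = requests * c.requestToMemory
  const needCpuReplica = Math.ceil(needCpuForRequests / (c.limitCpu - c.baselineCpu))
  const needMemoryReplica = Math.ceil(needMemoryForRequests / (c.limitMemory - c.baselineMemory))
  const needReplica = Math.min(c.maxReplica, Math.max(c.minReplica, needCpuReplica, needMemoryReplica))
  return {
    needCpuForRequests, needMemoryForRequests,
    needCpuReplica, needMemoryReplica, needReplica,
    needCpu: needCpuForRequests + needReplica * c.baselineCpu,
    needMemory: needMemoryForRequests + needReplica * c.baselineMemory
  }
}

// Requests/second that a number of ready replicas can serve (assumed rule).
function readyCapacity(readyReplica: number, c: Tier): number {
  const byCpu = (readyReplica * (c.limitCpu - c.baselineCpu)) / c.requestToCpu
  const byMemory = (readyReplica * (c.limitMemory - c.baselineMemory)) / c.requestToMemory
  return Math.floor(Math.min(byCpu, byMemory))
}

const backendNeed = need(473, backendTier)
const backendCapacity = readyCapacity(7, backendTier)
const backendFailed = Math.max(0, 473 - backendCapacity)

console.log(backendNeed)     // needReplica 63, needCpu 11660, needMemory 64439
console.log(backendCapacity) // 52
console.log(backendFailed)   // 421
```

The gap between `needReplica` (63) and `readyReplica` (7) is what produces the 421 failed requests in this interval.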

Graphing

You can graph the output to help visualize it. I used Grafana here.

Grafana Example