POL-1392 AWS Rightsize RDS Instances: Downsize Multiple Tiers (#2763)
* update

* fix

* update

* fix

* update

* fix

* update

* update

* update

* update

* update
XOmniverse authored Oct 21, 2024
1 parent 764f574 commit dfabd5f
Showing 6 changed files with 126 additions and 104 deletions.
4 changes: 4 additions & 0 deletions cost/aws/rightsize_rds_instances/CHANGELOG.md
@@ -1,5 +1,9 @@
# Changelog

## v5.5.0

- Added support for downsizing multiple sizes where appropriate

## v5.4.2

- Minor code improvements to conform with current standards. Functionality unchanged.
1 change: 1 addition & 0 deletions cost/aws/rightsize_rds_instances/README.md
@@ -41,6 +41,7 @@ This policy has the following input parameters required when launching the polic
- `Key=~/Regex/` - Filter all resources where the value for the specified key matches the specified regex string.
- `Key!~/Regex/` - Filter all resources where the value for the specified key does not match the specified regex string. This will also filter all resources missing the specified tag key.
- *Exclusion Tags: Any / All* - Whether to filter instances containing any of the specified tags or only those that contain all of them. Only applicable if more than one value is entered in the `Exclusion Tags` field.
- *Skip Instance Sizes* - Whether to recommend downsizing multiple sizes. When set to 'No', only the next smaller size will ever be recommended for downsizing. When set to 'Yes', more aggressive downsizing recommendations will be made when appropriate.
- *Report Unused or Underutilized* - Whether to report on unused instances, underutilized instances, or both. If both are selected, unused instances will not appear in the list of underutilized instances regardless of CPU usage.
- *Underutilized Instance CPU Threshold (%)* - The CPU threshold at which to consider an instance to be underutilized and therefore be flagged for downsizing.
- *Statistic Lookback Period* - How many days back to look at statistical data for instances to determine if they are underutilized or unused. This value cannot be set higher than 90 because AWS does not retain metrics for longer than 90 days.
210 changes: 109 additions & 101 deletions cost/aws/rightsize_rds_instances/aws_rightsize_rds_instances.pt
@@ -7,7 +7,7 @@ severity "low"
category "Cost"
default_frequency "weekly"
info(
version: "5.4.2",
version: "5.5.0",
provider: "AWS",
service: "RDS",
policy_set: "Rightsize Database Instances",
@@ -43,6 +43,15 @@ parameter "param_min_savings" do
default 0
end

parameter "param_downsize_multiple" do
type "string"
category "Policy Settings"
label "Skip Instance Sizes"
description "Whether to recommend downsizing multiple sizes. When set to 'No', only the next smaller size will ever be recommended for downsizing. When set to 'Yes', more aggressive downsizing recommendations will be made when appropriate."
allowed_values "Yes", "No"
default "No"
end

parameter "param_regions_allow_or_deny" do
type "string"
category "Filters"
@@ -1164,146 +1173,145 @@ EOS
end

datasource "ds_rds_nonidle_instances_with_metrics" do
run_script $js_rds_nonidle_instances_with_metrics, $ds_rds_nonidle_instances, $ds_cloudwatch_underutil_cpu_sorted, $ds_cloudwatch_underutil_mem_sorted, $ds_cloudwatch_underutil_netin_sorted, $ds_cloudwatch_underutil_netout_sorted, $ds_instance_costs_grouped, $ds_aws_instance_size_map
run_script $js_rds_nonidle_instances_with_metrics, $ds_rds_nonidle_instances, $ds_cloudwatch_underutil_cpu_sorted, $ds_cloudwatch_underutil_mem_sorted, $ds_cloudwatch_underutil_netin_sorted, $ds_cloudwatch_underutil_netout_sorted
end

script "js_rds_nonidle_instances_with_metrics", type: "javascript" do
parameters "ds_rds_nonidle_instances", "ds_cloudwatch_underutil_cpu_sorted", "ds_cloudwatch_underutil_mem_sorted", "ds_cloudwatch_underutil_netin_sorted", "ds_cloudwatch_underutil_netout_sorted", "ds_instance_costs_grouped", "ds_aws_instance_size_map"
parameters "ds_rds_nonidle_instances", "ds_cloudwatch_underutil_cpu_sorted", "ds_cloudwatch_underutil_mem_sorted", "ds_cloudwatch_underutil_netin_sorted", "ds_cloudwatch_underutil_netout_sorted"
result "result"
code <<-EOS
// Merge original instance list with cloudwatch data into a single list
result = []
_.each(ds_rds_nonidle_instances, function(instance) {
region = instance['region']
id = instance['instanceId']
cloudwatch_name = instance['name'].toLowerCase()
resourceType = instance['resourceType']
newResourceType = null
if (typeof(ds_aws_instance_size_map[resourceType]) == 'object') {
newResourceType = ds_aws_instance_size_map[resourceType]["down"]
}
// Only proceed if the CloudWatch data actually has the region and instance id.
// Otherwise, we have no usage data on the instance and thus dont include it in the results.
if (ds_cloudwatch_underutil_cpu_sorted[region] != undefined) {
if (ds_cloudwatch_underutil_cpu_sorted[region][cloudwatch_name] != undefined) {
instance_cpu_metrics = ds_cloudwatch_underutil_cpu_sorted[region][cloudwatch_name]
instance_mem_metrics = ds_cloudwatch_underutil_mem_sorted[region][cloudwatch_name]
instance_netin_metrics = ds_cloudwatch_underutil_netin_sorted[region][cloudwatch_name]
instance_netout_metrics = ds_cloudwatch_underutil_netout_sorted[region][cloudwatch_name]
savings = 0.0
if (typeof(newResourceType) == 'string') {
cost_name = "db:" + instance['name'].toLowerCase()
cost = 0.0
if (ds_cloudwatch_underutil_cpu_sorted[region] && ds_cloudwatch_underutil_cpu_sorted[region][cloudwatch_name]) {
instance_cpu_metrics = ds_cloudwatch_underutil_cpu_sorted[region][cloudwatch_name]
instance_mem_metrics = ds_cloudwatch_underutil_mem_sorted[region][cloudwatch_name]
instance_netin_metrics = ds_cloudwatch_underutil_netin_sorted[region][cloudwatch_name]
instance_netout_metrics = ds_cloudwatch_underutil_netout_sorted[region][cloudwatch_name]
// Create object we're going to return
merged_instance = {
instanceId: instance['instanceId'],
instanceArn: instance['instanceArn'],
resourceType: instance['resourceType'],
name: instance['name'],
status: instance['status'],
databaseEngine: instance['databaseEngine'],
engineVersion: instance['engineVersion'],
privateDnsName: instance['privateDnsName'],
region: instance['region'],
tags: instance['tags'],
availabilityZone: instance['availabilityZone'],
licenseModel: instance['licenseModel'],
processorFeatures: instance['processorFeatures'],
vcpus: instance['vcpus']
}
if (typeof(ds_instance_costs_grouped[cost_name]) == 'number') {
cost = ds_instance_costs_grouped[cost_name]
}
// Grab usage data for the instance if it is present.
// Note: We don't simply name them the same as Cloudwatch does because
// prior versions of this policy also didn't, and we want to ensure
// that exported incident data looks the same as it used to.
cpu_stats_list = [ "Average", "Minimum", "Maximum", "p99", "p95", "p90" ]
cost_per_nfu = cost / ds_aws_instance_size_map[resourceType]['nfu']
_.each(cpu_stats_list, function(stat) {
incident_statname = "cpu" + "_" + stat.toLowerCase()
new_cost = cost_per_nfu * ds_aws_instance_size_map[newResourceType]['nfu']
savings = cost - new_cost
}
merged_instance[incident_statname] = null
// Create object we're going to return
merged_instance = {
instanceId: instance['instanceId'],
instanceArn: instance['instanceArn'],
resourceType: instance['resourceType'],
name: instance['name'],
status: instance['status'],
databaseEngine: instance['databaseEngine'],
engineVersion: instance['engineVersion'],
privateDnsName: instance['privateDnsName'],
region: instance['region'],
tags: instance['tags'],
availabilityZone: instance['availabilityZone'],
licenseModel: instance['licenseModel'],
processorFeatures: instance['processorFeatures'],
vcpus: instance['vcpus'],
savings: savings,
newResourceType: newResourceType
if (typeof(instance_cpu_metrics[stat]) == 'number') {
merged_instance[incident_statname] = Math.round(instance_cpu_metrics[stat] * 100) / 100
}
})
// Grab usage data for the instance if it is present.
// Note: We don't simply name them the same as Cloudwatch does because
// prior versions of this policy also didn't, and we want to ensure
// that exported incident data looks the same as it used to.
cpu_stats_list = [ "Average", "Minimum", "Maximum", "p99", "p95", "p90" ]
_.each(cpu_stats_list, function(stat) {
incident_statname = "cpu" + "_" + stat.toLowerCase()
merged_instance[incident_statname] = null
if (typeof(instance_cpu_metrics[stat]) == 'number') {
merged_instance[incident_statname] = Math.round(instance_cpu_metrics[stat] * 100) / 100
}
})
mem_stats_list = [ "Average", "Minimum", "Maximum" ]
mem_stats_list = [ "Average", "Minimum", "Maximum" ]
_.each(mem_stats_list, function(stat) {
incident_statname = "mem" + "_" + stat.toLowerCase()
_.each(mem_stats_list, function(stat) {
incident_statname = "mem" + "_" + stat.toLowerCase()
merged_instance[incident_statname] = null
merged_instance[incident_statname] = null
if (typeof(instance_mem_metrics[stat]) == 'number') {
merged_instance[incident_statname] = Math.round(instance_mem_metrics[stat] / 1024 / 1024 * 100) / 100
}
})
if (typeof(instance_mem_metrics[stat]) == 'number') {
merged_instance[incident_statname] = Math.round(instance_mem_metrics[stat] / 1024 / 1024 * 100) / 100
}
})
netin_stats_list = [ "Average", "Minimum", "Maximum" ]
netin_stats_list = [ "Average", "Minimum", "Maximum" ]
_.each(netin_stats_list, function(stat) {
incident_statname = "netin" + "_" + stat.toLowerCase()
_.each(netin_stats_list, function(stat) {
incident_statname = "netin" + "_" + stat.toLowerCase()
merged_instance[incident_statname] = null
merged_instance[incident_statname] = null
if (typeof(instance_netin_metrics[stat]) == 'number') {
merged_instance[incident_statname] = Math.round(instance_netin_metrics[stat] * 100) / 100
}
})
if (typeof(instance_netin_metrics[stat]) == 'number') {
merged_instance[incident_statname] = Math.round(instance_netin_metrics[stat] * 100) / 100
}
})
netout_stats_list = [ "Average", "Minimum", "Maximum" ]
netout_stats_list = [ "Average", "Minimum", "Maximum" ]
_.each(netout_stats_list, function(stat) {
incident_statname = "netout" + "_" + stat.toLowerCase()
_.each(netout_stats_list, function(stat) {
incident_statname = "netout" + "_" + stat.toLowerCase()
merged_instance[incident_statname] = null
merged_instance[incident_statname] = null
if (typeof(instance_netout_metrics[stat]) == 'number') {
merged_instance[incident_statname] = Math.round(instance_netout_metrics[stat] * 100) / 100
}
})
if (typeof(instance_netout_metrics[stat]) == 'number') {
merged_instance[incident_statname] = Math.round(instance_netout_metrics[stat] * 100) / 100
}
})
// Send the instance information with the CloudWatch data into the final result.
// Also adds in the account ID and currency symbol since itll be needed for the incident.
result.push(merged_instance)
}
// Send the instance information with the CloudWatch data into the final result.
// Also adds in the account ID and currency symbol since itll be needed for the incident.
result.push(merged_instance)
}
})
EOS
end

datasource "ds_rds_underutil_instances" do
run_script $js_rds_underutil_instances, $ds_rds_nonidle_instances_with_metrics, $param_stats_threshold, $param_stats_underutil_threshold_cpu_value
run_script $js_rds_underutil_instances, $ds_rds_nonidle_instances_with_metrics, $ds_aws_instance_size_map, $ds_instance_costs_grouped, $param_stats_threshold, $param_stats_underutil_threshold_cpu_value, $param_downsize_multiple
end

script "js_rds_underutil_instances", type: "javascript" do
parameters "ds_rds_nonidle_instances_with_metrics", "param_stats_threshold", "param_stats_underutil_threshold_cpu_value"
parameters "ds_rds_nonidle_instances_with_metrics", "ds_aws_instance_size_map", "ds_instance_costs_grouped", "param_stats_threshold", "param_stats_underutil_threshold_cpu_value", "param_downsize_multiple"
result "result"
code <<-'EOS'
// Filter above list to just underutilized instances
result = _.filter(ds_rds_nonidle_instances_with_metrics, function(instance) {
result = []
_.each(ds_rds_nonidle_instances_with_metrics, function(instance) {
cpu_value = instance['cpu_' + param_stats_threshold.toLowerCase()]
cpu_undertilized = cpu_value < param_stats_underutil_threshold_cpu_value || cpu_value == null
return cpu_undertilized && typeof(instance['newResourceType']) == 'string'
if (typeof(cpu_value) != 'number') { cpu_value = 0 }
if (cpu_value < param_stats_underutil_threshold_cpu_value && ds_aws_instance_size_map[instance["resourceType"]]["down"]) {
newResourceType = ds_aws_instance_size_map[instance["resourceType"]]["down"]
if (param_downsize_multiple == "Yes") {
while (ds_aws_instance_size_map[newResourceType]["down"] && cpu_value * 2 < param_stats_underutil_threshold_cpu_value) {
cpu_value = cpu_value * 2
newResourceType = ds_aws_instance_size_map[newResourceType]['down']
}
}
cost_name = "db:" + instance['name'].toLowerCase()
savings = 0.0
cost = 0.0
if (typeof(ds_instance_costs_grouped[cost_name]) == 'number') { cost = ds_instance_costs_grouped[cost_name] }
cost_per_nfu = cost / ds_aws_instance_size_map[instance["resourceType"]]['nfu']
new_cost = cost_per_nfu * ds_aws_instance_size_map[newResourceType]['nfu']
savings = cost - new_cost
new_instance = { newResourceType: newResourceType, savings: savings }
_.each(_.keys(instance), function(key) { new_instance[key] = instance[key] })
result.push(new_instance)
}
})
EOS
end
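The new `js_rds_underutil_instances` logic above can be read as: start one size down, then, when `param_downsize_multiple` is "Yes", keep stepping down for as long as doubling the observed CPU value still stays under the threshold. The following is a minimal standalone sketch of that loop, not the policy's actual script. It assumes a hypothetical `sizeMap` shaped like `ds_aws_instance_size_map` (each entry has a `down` and an `nfu` field); the instance types, NFU values, and the `recommendDownsize` name are illustrative only. Unlike the committed code, the sketch also guards against an instance type that is missing from the map.

```javascript
// Hypothetical size map for illustration; the real ds_aws_instance_size_map
// is loaded elsewhere in the policy template and is not shown in this diff.
var sizeMap = {
  "db.m5.4xlarge": { down: "db.m5.2xlarge", nfu: 32 },
  "db.m5.2xlarge": { down: "db.m5.xlarge",  nfu: 16 },
  "db.m5.xlarge":  { down: "db.m5.large",   nfu: 8 },
  "db.m5.large":   { down: null,            nfu: 4 }
};

// Returns { newResourceType, savings } or null when no downsize applies.
// recommendDownsize is a hypothetical helper name, not part of the policy.
function recommendDownsize(resourceType, cpuValue, threshold, downsizeMultiple, cost) {
  // Missing CPU data is treated as 0, mirroring the diff.
  if (typeof cpuValue != 'number') { cpuValue = 0 }

  var entry = sizeMap[resourceType];
  // Guard added in this sketch: unknown types and types with no smaller size are skipped.
  if (!entry || !entry.down || cpuValue >= threshold) { return null }

  var newType = entry.down;
  if (downsizeMultiple == "Yes") {
    // Each step down is assumed to double CPU utilization; keep going
    // while the projected value stays under the threshold.
    while (sizeMap[newType] && sizeMap[newType].down && cpuValue * 2 < threshold) {
      cpuValue = cpuValue * 2;
      newType = sizeMap[newType].down;
    }
  }

  // Savings are estimated by scaling the current cost by the NFU ratio.
  var costPerNfu = cost / entry.nfu;
  var savings = cost - costPerNfu * sizeMap[newType].nfu;
  return { newResourceType: newType, savings: savings };
}

console.log(recommendDownsize("db.m5.4xlarge", 15, 40, "No", 320));
console.log(recommendDownsize("db.m5.4xlarge", 15, 40, "Yes", 320));
```

In this sketch a missing CPU value becomes 0, so with "Yes" selected the loop walks all the way to the smallest size in the map, since doubling 0 never reaches the threshold.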
@@ -1474,7 +1482,7 @@ script "js_rds_idle_incident", type: "javascript" do
licenseModel: instance['licenseModel'],
processorFeatures: instance['processorFeatures'],
vcpus: instance['vcpus'],
savings: parseFloat(savings.toFixed(3)),
savings: Math.round(savings * 1000) / 1000,
savingsCurrency: ds_currency['symbol'],
lookbackPeriod: param_stats_lookback,
newResourceType: "Terminate RDS Instance",
@@ -1515,7 +1523,7 @@ script "js_rds_idle_incident", type: "javascript" do
savings_message = [
ds_currency['symbol'], ' ',
formatNumber(parseFloat(total_savings).toFixed(2), ds_currency['separator'])
formatNumber(Math.round(total_savings * 100) / 100, ds_currency['separator'])
].join('')
// Sort by descending order of savings value
@@ -1615,7 +1623,7 @@ script "js_rds_underutil_incident", type: "javascript" do
licenseModel: instance['licenseModel'],
processorFeatures: instance['processorFeatures'],
vcpus: instance['vcpus'],
savings: parseFloat(savings.toFixed(3)),
savings: Math.round(savings * 1000) / 1000,
savingsCurrency: ds_currency['symbol'],
lookbackPeriod: param_stats_lookback,
service: "AmazonRDS",
@@ -1661,7 +1669,7 @@ script "js_rds_underutil_incident", type: "javascript" do
savings_message = [
ds_currency['symbol'], ' ',
formatNumber(parseFloat(total_savings).toFixed(2), ds_currency['separator'])
formatNumber(Math.round(total_savings * 100) / 100, ds_currency['separator'])
].join('')
// Sort by descending order of savings value
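The incident scripts in this commit replace `toFixed`-based rounding with `Math.round`. One visible effect is on types: `toFixed` returns a string, so the earlier `total_savings` line appears to have passed a string to `formatNumber`, while the replacement passes a number. A small sketch of the two styles follows; `roundViaToFixed` and `roundViaMath` are hypothetical helper names used only for illustration, and `formatNumber` itself is not reproduced because its definition is not shown in this diff.

```javascript
// Earlier style: format to a fixed-decimal string, then parse back to a number.
function roundViaToFixed(value, places) {
  return parseFloat(value.toFixed(places));
}

// Replacement style: scale, round, and unscale; the value stays numeric throughout.
function roundViaMath(value, places) {
  var factor = Math.pow(10, places);
  return Math.round(value * factor) / factor;
}

var savings = 12.3456;

console.log(roundViaToFixed(savings, 3));
console.log(roundViaMath(savings, 3));

// toFixed on its own yields a string, while Math.round yields a number.
console.log(typeof savings.toFixed(2));
console.log(typeof (Math.round(savings * 100) / 100));
```

The two helpers agree for this input, but they are not guaranteed to agree on every value, since both operate on binary floating-point numbers and round at different stages.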
Original file line number Diff line number Diff line change
@@ -7,7 +7,7 @@ category "Meta"
default_frequency "15 minutes"
info(
provider: "AWS",
version: "5.4.2", # This version of the Meta Parent Policy Template should match the version of the Child Policy Template as it appears in the Catalog for best reliability
version: "5.5.0", # This version of the Meta Parent Policy Template should match the version of the Child Policy Template as it appears in the Catalog for best reliability
publish: "true",
deprecated: "false"
)
@@ -77,6 +77,15 @@ parameter "param_min_savings" do
default 0
end

parameter "param_downsize_multiple" do
type "string"
category "Policy Settings"
label "Skip Instance Sizes"
description "Whether to recommend downsizing multiple sizes. When set to 'No', only the next smaller size will ever be recommended for downsizing. When set to 'Yes', more aggressive downsizing recommendations will be made when appropriate."
allowed_values "Yes", "No"
default "No"
end

parameter "param_regions_allow_or_deny" do
type "string"
category "Filters"
Original file line number Diff line number Diff line change
@@ -2347,7 +2347,7 @@
{
"id": "./cost/aws/rightsize_rds_instances/aws_rightsize_rds_instances.pt",
"name": "AWS Rightsize RDS Instances",
"version": "5.4.2",
"version": "5.5.0",
"providers": [
{
"name": "aws",
Original file line number Diff line number Diff line change
@@ -1346,7 +1346,7 @@
required: true
- id: "./cost/aws/rightsize_rds_instances/aws_rightsize_rds_instances.pt"
name: AWS Rightsize RDS Instances
version: 5.4.2
version: 5.5.0
:providers:
- :name: aws
:permissions: