Commit

Merge pull request #181 from CBIIT/standard-opensearch
Standardizing OpenSearch Implementations
michael-fleming authored Feb 21, 2024
2 parents 855eff4 + 8d9eae8 commit 5deae74
Showing 10 changed files with 550 additions and 135 deletions.
80 changes: 70 additions & 10 deletions terraform/modules/opensearch/README.md
@@ -1,15 +1,20 @@
## Notes on module configuration:
# Amazon OpenSearch Implementation Guide:

- Project decides if a service-linked role is created by the module.
- HTTPS is enforced using TLS 1.2 by default for security purposes.
- Encryption is applied at rest by default for security purposes.
- Node-to-node encryption is applied by default for security purposes.
- A security group is created by the module - but the project must specify the security group rules to attach to the security group. The security group identifiers are exported as outputs to use as a reference in rule development at the project level.
- Automated snapshots are on by default
- Auto-tune is enabled by default, but can be disabled by passing "DISABLED" from the project for the opensearch_autotune_desired_state argument. Auto-tune is set to occur daily at 11:59 PM EST.
- Logs are sent to CloudWatch, and the CloudWatch log group is created in the module. The team still needs to decide which logs to send from OpenSearch.
# Basic Usage

<pre><code>module "opensearch" {
source = "../.."

attach_permissions_boundary = true
cluster_tshirt_size = "md"
engine_version = "OpenSearch_2.11"
resource_prefix = "program-tier-app"
s3_snapshot_bucket_arn = "arn:aws:s3:::basic-example-snapshot-bucket"
subnet_ids = ["subnet-01234567891011121", "subnet-abcdefghijklmnopq"]
vpc_id = "vpc-01234567891011121"
}</code></pre>

# Terraform Documentation

<!-- BEGIN_TF_DOCS -->
## Requirements
@@ -71,4 +76,59 @@ No modules.
| <a name="output_opensearch_endpoint"></a> [opensearch\_endpoint](#output\_opensearch\_endpoint) | the opensearch domain endpoint url |
| <a name="output_opensearch_security_group_arn"></a> [opensearch\_security\_group\_arn](#output\_opensearch\_security\_group\_arn) | the arn of the security group associated with the OpenSearch cluster |
| <a name="output_opensearch_security_group_id"></a> [opensearch\_security\_group\_id](#output\_opensearch\_security\_group\_id) | the id of the security group associated with the OpenSearch cluster |
<!-- END_TF_DOCS -->
<!-- END_TF_DOCS -->

# Implementation Guide
The following guide provides a reference for OpenSearch service implementations aligned with NCI, NIST, and CTOS standards.

## Service-Linked Role
The OpenSearch service requires a service-linked role in order to manage the OpenSearch cluster. The module does not create this role, so Terraform returns an error on the first apply in an account where the role does not yet exist. The failed initial apply causes the service-linked role to be created, so executing the Terraform Apply workflow a second time succeeds.
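
A project that prefers to avoid the failed first apply could create the role itself before calling the module. The following is a minimal sketch and not part of this module: it assumes the AWS provider's `aws_iam_service_linked_role` resource with the `opensearchservice.amazonaws.com` service name, and it only applies cleanly in accounts where the role does not already exist. The module arguments are copied from the basic usage example.

```hcl
# Hypothetical project-level configuration (not part of the module).
# Creates the OpenSearch service-linked role up front so the first apply
# does not rely on a retry. This resource fails if the role already exists.
resource "aws_iam_service_linked_role" "opensearch" {
  aws_service_name = "opensearchservice.amazonaws.com"
}

module "opensearch" {
  source = "../.."

  attach_permissions_boundary = true
  cluster_tshirt_size         = "md"
  engine_version              = "OpenSearch_2.11"
  resource_prefix             = "program-tier-app"
  s3_snapshot_bucket_arn      = "arn:aws:s3:::basic-example-snapshot-bucket"
  subnet_ids                  = ["subnet-01234567891011121", "subnet-abcdefghijklmnopq"]
  vpc_id                      = "vpc-01234567891011121"

  # Make sure the role exists before the domain is created.
  depends_on = [aws_iam_service_linked_role.opensearch]
}
```

If the role already exists in the account, leave the resource out and rely on the retry behavior described above.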

## OpenSearch Manual Snapshots
By default, this module creates the IAM resources required to perform OpenSearch manual snapshot operations. To skip creating them, set the variable named `create_snapshot_role` to `false`. The module does not create an S3 bucket to store the snapshots. The project must create the bucket and pass its ARN to the module through the variable named `s3_snapshot_bucket_arn`.
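
As a rough sketch of that split of responsibilities, the project could declare the bucket next to the module call and pass its ARN in. The bucket name and the resource label `snapshots` are placeholders chosen for illustration; the remaining arguments are copied from the basic usage example.

```hcl
# Hypothetical project-level bucket; the module does not create it.
resource "aws_s3_bucket" "snapshots" {
  bucket = "basic-example-snapshot-bucket" # placeholder name
}

module "opensearch" {
  source = "../.."

  attach_permissions_boundary = true
  cluster_tshirt_size         = "md"
  engine_version              = "OpenSearch_2.11"
  resource_prefix             = "program-tier-app"
  subnet_ids                  = ["subnet-01234567891011121", "subnet-abcdefghijklmnopq"]
  vpc_id                      = "vpc-01234567891011121"

  # Pass the project-owned bucket to the module's snapshot IAM policy.
  s3_snapshot_bucket_arn = aws_s3_bucket.snapshots.arn
}
```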

## Security Group
By default, this module creates a security group for the OpenSearch cluster. To use existing security groups instead, set the variable named `create_security_group` to `false` and provide one or more security group identifiers for the variable named `security_group_ids`. The module-created security group has no ingress rules attached, but it does include an egress rule allowing all outbound traffic. The project must attach ingress rules to the security group by referencing the `security_group_id` output.
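
A project-level ingress rule might look like the sketch below. The CIDR block and the resource label are made up for illustration, and the output name follows the wording of this guide; the generated Terraform documentation above lists the output as `opensearch_security_group_id`, so check the module's outputs before copying this.

```hcl
# Hypothetical project-level rule attached to the module-created security group.
resource "aws_security_group_rule" "opensearch_https_ingress" {
  type        = "ingress"
  description = "HTTPS from the application network to the OpenSearch cluster"
  from_port   = 443
  to_port     = 443
  protocol    = "tcp"
  cidr_blocks = ["10.0.0.0/16"] # placeholder CIDR, replace with the real source range

  # Output name assumed from this guide; verify it against the module outputs.
  security_group_id = module.opensearch.security_group_id
}
```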

## Standard Cluster Sizing Configurations
The CTOS program establishes pre-determined cluster sizing configurations for OpenSearch to simplify implementation and achieve cross-program standardization. Each T-Shirt size determines the instance type and the EBS storage volume size. Please note: the storage sizes are per node, and the default node count is 1 data node.

| Size | vCPU | Memory | Storage |
| :---: | :---: | :---: | :---: |
| `xs` | 2 | 2 GB | 10 GB |
| `sm` | 2 | 4 GB | 20 GB |
| `md` | 2 | 8 GB | 40 GB |
| `lg` | 4 | 16 GB | 80 GB |
| `xl` | 8 | 32 GB | 160 GB |

If you select a T-Shirt size for the variable named `cluster_tshirt_size`, then you do not have to specify the `volume_size` or `instance_type` variables. The module will automatically select the appropriate instance type and storage volume size based on the T-Shirt size selected. Providing a value for the `volume_size` or `instance_type` variables will override the T-Shirt size selection. You can also specify `instance_count` to increase the number of data nodes in the cluster.
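
The override behavior can be sketched as follows. The values for `volume_size` and `instance_count` are arbitrary examples; the other arguments are copied from the basic usage example.

```hcl
module "opensearch" {
  source = "../.."

  attach_permissions_boundary = true
  engine_version              = "OpenSearch_2.11"
  resource_prefix             = "program-tier-app"
  s3_snapshot_bucket_arn      = "arn:aws:s3:::basic-example-snapshot-bucket"
  subnet_ids                  = ["subnet-01234567891011121", "subnet-abcdefghijklmnopq"]
  vpc_id                      = "vpc-01234567891011121"

  # "md" selects the instance type and a 40 GB volume per node.
  cluster_tshirt_size = "md"

  # An explicit volume_size takes precedence over the T-Shirt default.
  volume_size = 100 # example value, in GB per node

  # Optional: run more than the default single data node.
  instance_count = 3 # example value
}
```

Because `instance_type` is not set here, the instance type still comes from the `md` T-Shirt size.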

## Choosing a Cluster T-Shirt Size
Consider the following factors when choosing a cluster T-Shirt size:
1. The target storage utilization should be no more than 70% of the total storage capacity.
2. The estimated size of any index should not be larger than 50% of a single node's storage capacity.

### Example Scenario 1:
A project anticipates `5` indexes that will be stored in OpenSearch. Each of the five indexes is anticipated to be 1 GB in size.
- Total Storage Required: `5 GB`
- Largest Index Size: `1 GB`
- T-Shirt Size Recommendation: `xs` (5 GB is within 70% of the 10 GB volume, and 1 GB is within 50% of it)

### Example Scenario 2:
A project anticipates 2 indexes that will be stored in OpenSearch. One index is anticipated to be 6 GB in size. The other index is anticipated to be 0.5 GB in size.
- Total Storage Required: `6.5 GB`
- Largest Index Size: `6 GB`
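- T-Shirt Size Recommendation: `sm` (6.5 GB fits within 70% of the 10 GB `xs` volume, but the 6 GB index exceeds 50% of it; the 20 GB `sm` volume satisfies both rules)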
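
The two rules can also be expressed as a small standalone Terraform configuration that picks the smallest size satisfying both. This is an illustrative sketch, not part of the module: the local names are hypothetical, the per-node storage values are taken from the sizing table, and a single data node is assumed. The inputs shown are those of Example Scenario 2.

```hcl
locals {
  # Per-node storage in GB for each T-Shirt size, from the sizing table.
  volume_size_lookup = { xs = 10, sm = 20, md = 40, lg = 80, xl = 160 }

  # Sizes ordered from smallest to largest.
  size_order = ["xs", "sm", "md", "lg", "xl"]

  # Inputs from Example Scenario 2.
  total_storage_gb = 6.5
  largest_index_gb = 6

  # Keep only sizes where total storage is at most 70% of capacity
  # and the largest index is at most 50% of a single node's capacity.
  candidates = [
    for size in local.size_order : size
    if(
      local.total_storage_gb <= 0.7 * local.volume_size_lookup[size] &&
      local.largest_index_gb <= 0.5 * local.volume_size_lookup[size]
    )
  ]

  # Smallest qualifying size, or null if even "xl" is too small.
  recommended_size = length(local.candidates) > 0 ? local.candidates[0] : null
}

output "recommended_tshirt_size" {
  value = local.recommended_size
}
```

With these inputs, `xs` fails the 50% rule and the output is `sm`, matching the recommendation above.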


## Multi-AZ Deployments
By default, this module creates a single data node in a single AWS Availability Zone (AZ). To configure the cluster for a Multi-AZ deployment, set the variable named `zone_awareness_enabled` to `true`. Be aware that a Multi-AZ deployment doubles the number of data nodes in the cluster.
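
A Multi-AZ configuration might be sketched as below. The arguments other than `zone_awareness_enabled` are copied from the basic usage example, and the two placeholder subnets are assumed to be in different Availability Zones.

```hcl
module "opensearch" {
  source = "../.."

  attach_permissions_boundary = true
  cluster_tshirt_size         = "md"
  engine_version              = "OpenSearch_2.11"
  resource_prefix             = "program-tier-app"
  s3_snapshot_bucket_arn      = "arn:aws:s3:::basic-example-snapshot-bucket"
  vpc_id                      = "vpc-01234567891011121"

  # Assumed to be subnets in two different Availability Zones.
  subnet_ids = ["subnet-01234567891011121", "subnet-abcdefghijklmnopq"]

  # Spread data nodes across AZs; per this guide, the data node count doubles.
  zone_awareness_enabled = true
}
```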

## Domain Access Policies
By default, this module creates and attaches an OpenSearch Domain Access Policy that allows HTTPS-based actions to be performed by any AWS principal. The policy can be disabled by setting the variable named `create_access_policies` to `false`. If you disable the access policy, you must provide your own access policy for the variable named `access_policies`. The value of the `access_policies` variable must be a valid JSON string, which can be produced by referencing a `data.aws_iam_policy_document.{name}.json` data source attribute.
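
The following sketch shows one way a project could supply a narrower policy. The account ID, region, role name, and domain name in the ARNs are placeholders invented for illustration; in particular, the domain name is an assumption and should be replaced with the actual name of the deployed domain.

```hcl
# Hypothetical read-only access policy defined at the project level.
data "aws_iam_policy_document" "custom_access" {
  statement {
    effect = "Allow"
    actions = [
      "es:ESHttpGet",
      "es:ESHttpHead"
    ]

    principals {
      type = "AWS"
      # Placeholder role ARN.
      identifiers = ["arn:aws:iam::123456789012:role/example-reader-role"]
    }

    # Placeholder domain ARN; the domain name here is an assumption.
    resources = ["arn:aws:es:us-east-1:123456789012:domain/program-tier-app-opensearch/*"]
  }
}

module "opensearch" {
  source = "../.."

  attach_permissions_boundary = true
  cluster_tshirt_size         = "md"
  engine_version              = "OpenSearch_2.11"
  resource_prefix             = "program-tier-app"
  s3_snapshot_bucket_arn      = "arn:aws:s3:::basic-example-snapshot-bucket"
  subnet_ids                  = ["subnet-01234567891011121", "subnet-abcdefghijklmnopq"]
  vpc_id                      = "vpc-01234567891011121"

  # Turn off the default policy and pass the custom JSON document instead.
  create_access_policies = false
  access_policies        = data.aws_iam_policy_document.custom_access.json
}
```

The resource ARN is written as a literal rather than taken from a module output, which avoids a dependency cycle between the policy document and the module.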

## Cluster Log Monitoring
By default, the cluster has logging configured, and all types of logs generated by the cluster are forwarded to a CloudWatch Log Group created by the module. The logs that are generated are described below.
- Index Slow Logs: Logs that are generated when indexing operations take longer than the index.indexing\_slowlog.threshold.index.warn value.
- Search Slow Logs: Logs that are generated when search operations take longer than the index.search\_slowlog.threshold.query.warn value.
- Application Logs: Error logs that are generated by the OpenSearch cluster.
84 changes: 78 additions & 6 deletions terraform/modules/opensearch/data.tf
@@ -1,22 +1,94 @@
data "aws_region" "region" {}
data "aws_caller_identity" "current" {}

data "aws_caller_identity" "caller" {}
data "aws_iam_policy_document" "logs" {
count = var.create_cloudwatch_log_policy ? 1 : 0

data "aws_iam_policy_document" "os" {
statement {
effect = "Allow"
actions = [
"logs:PutLogEvents",
"logs:PutLogEventsBatch",
"logs:CreateLogStream"
]
principals {
type = "Service"
identifiers = ["es.amazonaws.com"]
}
resources = [
aws_cloudwatch_log_group.os.arn,
"${aws_cloudwatch_log_group.os.arn}:*"
"${aws_cloudwatch_log_group.this[0].arn}",
"${aws_cloudwatch_log_group.this[0].arn}:*"
]
}
}

data "aws_iam_policy_document" "access_policy" {
count = var.create_access_policies ? 1 : 0

statement {
effect = "Allow"
actions = [
"es:ESHttpPut",
"es:ESHttpPost",
"es:ESHttpPatch",
"es:ESHttpHead",
"es:ESHttpGet",
"es:ESHttpDelete"
]
principals {
type = "AWS"
identifiers = ["*"]
}
resources = ["${aws_opensearch_domain.this.arn}/*"]
}
}

data "aws_iam_policy_document" "trust" {
count = var.create_snapshot_role ? 1 : 0

statement {
effect = "Allow"
actions = ["sts:AssumeRole"]

principals {
type = "Service"
identifiers = ["es.amazonaws.com"]
}
}
}

data "aws_iam_policy_document" "snapshot" {
count = var.create_snapshot_role ? 1 : 0

statement {
effect = "Allow"
actions = ["s3:ListBucket"]
resources = [var.s3_snapshot_bucket_arn]
}

statement {
effect = "Allow"
actions = [
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject"
]
resources = ["${var.s3_snapshot_bucket_arn}/*"]
}

statement {
effect = "Allow"
actions = [
"iam:PassRole",
"iam:GetRole"
]
resources = [aws_iam_role.snapshot[0].arn]
}

statement {
effect = "Allow"
actions = ["es:ESHttpPut"]
resources = [
"${aws_opensearch_domain.this.arn}/*",
"${aws_opensearch_domain.this.arn}/*/*"
]
}
}
Empty file.
Empty file.
11 changes: 11 additions & 0 deletions terraform/modules/opensearch/examples/basic-configuration/main.tf
@@ -0,0 +1,11 @@
module "opensearch" {
source = "../.."

attach_permissions_boundary = true
cluster_tshirt_size = "md"
engine_version = "OpenSearch_2.11"
resource_prefix = "program-tier-app"
s3_snapshot_bucket_arn = "arn:aws:s3:::basic-example-snapshot-bucket"
subnet_ids = ["subnet-01234567891011121", "subnet-abcdefghijklmnopq"]
vpc_id = "vpc-01234567891011121"
}
Empty file.
38 changes: 32 additions & 6 deletions terraform/modules/opensearch/locals.tf
@@ -1,7 +1,33 @@
locals {
domain_name = "${var.resource_prefix}-opensearch"
subnets = var.multi_az_enabled ? var.opensearch_subnet_ids : [ var.opensearch_subnet_ids[0]]
sg_description = "The security group regulating network access to the OpenSearch cluster"
log_retention = var.env == "dev" || var.env == "qa" ? 30 : 90
now = timestamp()
}
access_policies = var.create_access_policies ? data.aws_iam_policy_document.access_policy[0].json : var.access_policies
permissions_boundary = var.attach_permissions_boundary ? "arn:aws:iam::${data.aws_caller_identity.current.account_id}:policy/PermissionBoundary_PowerUser" : null
security_group_ids = var.create_security_group ? aws_security_group.this[0].id : var.security_group_ids
custom_instance_type = var.instance_type == null && var.cluster_tshirt_size != null ? lookup(local.instance_type_lookup, var.cluster_tshirt_size, null) : var.instance_type
custom_instance_count = var.instance_count == null ? 1 : var.instance_count
custom_volume_size = var.volume_size == null && var.cluster_tshirt_size != null ? lookup(local.volume_size_lookup, var.cluster_tshirt_size, null) : var.volume_size


instance_type_lookup = {
xs = "t3.small.search"
sm = "t3.medium.search"
md = "m6g.large.search"
lg = "m6g.xlarge.search"
xl = "m6g.2xlarge.search"
}

volume_size_lookup = {
xs = 10
sm = 20
md = 40
lg = 80
xl = 160
}

outputs = {
security_group_arn = var.create_security_group ? aws_security_group.this[0].arn : "A Security Group was not created"
security_group_id = var.create_security_group ? aws_security_group.this[0].id : "A Security Group was not created"
role_arn = var.create_snapshot_role ? aws_iam_role.snapshot[0].arn : "A Snapshot Role was not created"
role_id = var.create_snapshot_role ? aws_iam_role.snapshot[0].id : "A Snapshot Role was not created"
role_name = var.create_snapshot_role ? aws_iam_role.snapshot[0].name : "A Snapshot Role was not created"
}
}