AWS EKS Terraform module

Terraform module which creates Amazon EKS (Kubernetes) resources

External Documentation

Please note that we strive to provide a comprehensive suite of documentation for configuring and utilizing the module(s) defined here. Documentation regarding EKS (including EKS managed node groups, self-managed node groups, and Fargate profiles) and Kubernetes features, usage, etc. is better left to the respective sources, namely the AWS EKS documentation and the Kubernetes documentation.

Usage

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0"

  cluster_name    = "my-cluster"
  cluster_version = "1.30"

  cluster_endpoint_public_access = true

  cluster_addons = {
    coredns                = {}
    eks-pod-identity-agent = {}
    kube-proxy             = {}
    vpc-cni                = {}
  }

  vpc_id                   = "vpc-1234556abcdef"
  subnet_ids               = ["subnet-abcde012", "subnet-bcde012a", "subnet-fghi345a"]
  control_plane_subnet_ids = ["subnet-xyzde987", "subnet-slkjf456", "subnet-qeiru789"]

  # EKS Managed Node Group(s)
  eks_managed_node_group_defaults = {
    instance_types = ["m6i.large", "m5.large", "m5n.large", "m5zn.large"]
  }

  eks_managed_node_groups = {
    example = {
      # Starting on 1.30, AL2023 is the default AMI type for EKS managed node groups
      ami_type       = "AL2023_x86_64_STANDARD"
      instance_types = ["m5.xlarge"]

      min_size     = 2
      max_size     = 10
      desired_size = 2
    }
  }

  # Cluster access entry
  # To add the current caller identity as an administrator
  enable_cluster_creator_admin_permissions = true

  access_entries = {
    # One access entry with a policy associated
    example = {
      kubernetes_groups = []
      principal_arn     = "arn:aws:iam::123456789012:role/something"

      policy_associations = {
        example = {
          policy_arn = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy"
          access_scope = {
            namespaces = ["default"]
            type       = "namespace"
          }
        }
      }
    }
  }

  tags = {
    Environment = "dev"
    Terraform   = "true"
  }
}

Cluster Access Entry

When authentication_mode = "API_AND_CONFIG_MAP" is enabled, EKS automatically creates an access entry for the IAM role(s) used by managed node group(s) and Fargate profile(s). For self-managed node groups and the Karpenter sub-module, this project adds the access entry on behalf of users. In both cases, no additional action is required.

On clusters that were created prior to cluster access management (CAM) support, there will be an existing access entry for the cluster creator. This entry was not visible while only the aws-auth ConfigMap was in use, but it becomes visible once access entries are enabled.

Bootstrap Cluster Creator Admin Permissions

Setting bootstrap_cluster_creator_admin_permissions is a one-time operation performed when the cluster is created; it cannot be modified later through the EKS API. This project hardcodes it to false. Users who want the same functionality can get it through an access entry, which can be enabled or disabled at any time using the variable enable_cluster_creator_admin_permissions.
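
As a rough sketch of the alternative, the configuration below leaves the cluster creator entry disabled and grants administrator access to a separate role through an explicit access entry. The role name platform-admin and the map keys are hypothetical placeholders; the policy ARN and access scope follow the same shape as the access entry in the Usage example.

```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0"

  # Truncated for brevity ...

  # Do not add the identity running Terraform as an administrator
  enable_cluster_creator_admin_permissions = false

  access_entries = {
    # Hypothetical role; replace with a principal from your own account
    platform_admin = {
      principal_arn = "arn:aws:iam::123456789012:role/platform-admin"

      policy_associations = {
        admin = {
          policy_arn = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"
          access_scope = {
            # Cluster-wide scope, as opposed to the namespace scope in the Usage example
            type = "cluster"
          }
        }
      }
    }
  }
}
```

Enabling enable_cluster_creator_admin_permissions while also defining an explicit access entry for the same principal would likely conflict, since both would try to create an entry for one principal, so it is probably safest to pick one approach per identity.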

Enabling EFA Support

When enabling EFA support via enable_efa_support = true, there are two locations this can be specified - one at the cluster level, and one at the node group level. Enabling at the cluster level will add the EFA required ingress/egress rules to the shared security group created for the node group(s). Enabling at the node group level will do the following (per node group where enabled):

  1. All EFA interfaces supported by the instance will be exposed on the launch template used by the node group
  2. A placement group with strategy = "cluster" per EFA requirements is created and passed to the launch template used by the node group
  3. Data sources will reverse lookup the availability zones that support the instance type selected based on the subnets provided, ensuring that only the associated subnets are passed to the launch template and therefore used by the placement group. This avoids the placement group being created in an availability zone that does not support the instance type selected.

Tip

Use the aws-efa-k8s-device-plugin Helm chart to expose the EFA interfaces on the nodes as an extended resource, and allow pods to request the interfaces be mounted to their containers.

The EKS AL2 GPU AMI comes with the necessary EFA components pre-installed. You only need to expose the EFA devices on the nodes via their launch templates, ensure the required EFA security group rules are in place, and deploy the aws-efa-k8s-device-plugin to start utilizing EFA within your cluster. Your application container will need to have the necessary libraries and runtime in order to utilize communication over the EFA interfaces (NCCL, aws-ofi-nccl, hwloc, libfabric, aws-neuronx-collectives, CUDA, etc.).

If you disable the creation and use of the managed node group custom launch template (create_launch_template = false and/or use_custom_launch_template = false), this will interfere with the EFA functionality provided. In addition, if you do not supply an instance_type for self-managed node group(s), or instance_types for the managed node group(s), this will also interfere with the functionality. In order to support the EFA functionality provided by enable_efa_support = true, you must utilize the custom launch template created/provided by this module, and supply an instance_type/instance_types for the respective node group.

The EFA support logic uses a data source to look up the selected instance type and retrieve the number of interfaces it supports, so that those interfaces can be enumerated and exposed on the launch template created. For managed node groups, where a list of instance types is supported, the first instance type in the list is used to calculate the number of EFA interfaces supported. Mixing instance types with varying numbers of interfaces is not recommended for EFA, and in some cases mixing instance types is not supported at all (e.g. p5.48xlarge and p4d.24xlarge). In addition to exposing the EFA interfaces and updating the security group rules, a placement group is created per the EFA requirements, and only the subnets in availability zones that support the selected instance type are passed to the node group.

In order to enable EFA support, you will have to specify enable_efa_support = true on both the cluster and each node group that you wish to enable EFA support for:

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0"

  # Truncated for brevity ...

  # Adds the EFA required security group rules to the shared
  # security group created for the node group(s)
  enable_efa_support = true

  eks_managed_node_groups = {
    example = {
      instance_types = ["p5.48xlarge"]

      # Exposes all EFA interfaces on the launch template created by the node group(s)
      # This would expose all 32 EFA interfaces for the p5.48xlarge instance type
      enable_efa_support = true

      pre_bootstrap_user_data = <<-EOT
        # Mount NVME instance store volumes since they are typically
        # available on instance types that support EFA
        setup-local-disks raid0
      EOT

      # EFA should only be enabled when connecting 2 or more nodes
      # Do not use EFA on a single node workload
      min_size     = 2
      max_size     = 10
      desired_size = 2
    }
  }
}
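
For self-managed node groups the shape would presumably be the same, with the single instance_type attribute taking the place of instance_types. The following is a minimal sketch under the assumption that the self-managed node group definition accepts the same enable_efa_support flag, which the instance_type requirement described above implies; the group name example is a placeholder.

```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0"

  # Truncated for brevity ...

  # Adds the EFA required security group rules to the shared
  # security group created for the node group(s)
  enable_efa_support = true

  self_managed_node_groups = {
    example = {
      # A single instance type must be supplied so the number of
      # EFA interfaces can be looked up
      instance_type = "p5.48xlarge"

      # Exposes all EFA interfaces on the launch template created by the node group
      enable_efa_support = true

      # EFA should only be enabled when connecting 2 or more nodes
      min_size     = 2
      max_size     = 10
      desired_size = 2
    }
  }
}
```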

Examples

Contributing

We are grateful to the community for contributing bugfixes and improvements! Please see below to learn how you can take part.

Requirements

Name Version
terraform >= 1.3.2
aws >= 5.61
time >= 0.9
tls >= 3.0

Providers

Name Version
aws >= 5.61
time >= 0.9
tls >= 3.0

Modules

Name Source Version
eks_managed_node_group ./modules/eks-managed-node-group n/a
fargate_profile ./modules/fargate-profile n/a
kms terraform-aws-modules/kms/aws 2.1.0
self_managed_node_group ./modules/self-managed-node-group n/a

Resources

Name Type
aws_cloudwatch_log_group.this resource
aws_ec2_tag.cluster_primary_security_group resource
aws_eks_access_entry.this resource
aws_eks_access_policy_association.this resource
aws_eks_addon.before_compute resource
aws_eks_addon.this resource
aws_eks_cluster.this resource
aws_eks_identity_provider_config.this resource
aws_iam_openid_connect_provider.oidc_provider resource
aws_iam_policy.cluster_encryption resource
aws_iam_policy.cni_ipv6_policy resource
aws_iam_role.this resource
aws_iam_role_policy_attachment.additional resource
aws_iam_role_policy_attachment.cluster_encryption resource
aws_iam_role_policy_attachment.this resource
aws_security_group.cluster resource
aws_security_group.node resource
aws_security_group_rule.cluster resource
aws_security_group_rule.node resource
time_sleep.this resource
aws_caller_identity.current data source
aws_eks_addon_version.this data source
aws_iam_policy_document.assume_role_policy data source
aws_iam_policy_document.cni_ipv6_policy data source
aws_iam_session_context.current data source
aws_partition.current data source
tls_certificate.this data source

Inputs

Name Description Type Default Required
access_entries Map of access entries to add to the cluster any {} no
attach_cluster_encryption_policy Indicates whether or not to attach an additional policy for the cluster IAM role to utilize the encryption key provided bool true no
authentication_mode The authentication mode for the cluster. Valid values are CONFIG_MAP, API or API_AND_CONFIG_MAP string "API_AND_CONFIG_MAP" no
bootstrap_self_managed_addons Indicates whether or not to bootstrap self-managed addons after the cluster has been created bool null no
cloudwatch_log_group_class Specifies the log class of the log group. Possible values are: STANDARD or INFREQUENT_ACCESS string null no
cloudwatch_log_group_kms_key_id If a KMS Key ARN is set, this key will be used to encrypt the corresponding log group. Please be sure that the KMS Key has an appropriate key policy (https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/encrypt-log-data-kms.html) string null no
cloudwatch_log_group_retention_in_days Number of days to retain log events. Default retention - 90 days number 90 no
cloudwatch_log_group_tags A map of additional tags to add to the cloudwatch log group created map(string) {} no
cluster_additional_security_group_ids List of additional, externally created security group IDs to attach to the cluster control plane list(string) [] no
cluster_addons Map of cluster addon configurations to enable for the cluster. Addon name can be the map keys or set with name any {} no
cluster_addons_timeouts Create, update, and delete timeout configurations for the cluster addons map(string) {} no
cluster_enabled_log_types A list of the desired control plane logs to enable. For more information, see Amazon EKS Control Plane Logging documentation (https://docs.aws.amazon.com/eks/latest/userguide/control-plane-logs.html) list(string) ["audit", "api", "authenticator"] no
cluster_encryption_config Configuration block with encryption configuration for the cluster. To disable secret encryption, set this value to {} any {"resources": ["secrets"]} no
cluster_encryption_policy_description Description of the cluster encryption policy created string "Cluster encryption policy to allow cluster role to utilize CMK provided" no
cluster_encryption_policy_name Name to use on cluster encryption policy created string null no
cluster_encryption_policy_path Cluster encryption policy path string null no
cluster_encryption_policy_tags A map of additional tags to add to the cluster encryption policy created map(string) {} no
cluster_encryption_policy_use_name_prefix Determines whether cluster encryption policy name (cluster_encryption_policy_name) is used as a prefix bool true no
cluster_endpoint_private_access Indicates whether or not the Amazon EKS private API server endpoint is enabled bool true no
cluster_endpoint_public_access Indicates whether or not the Amazon EKS public API server endpoint is enabled bool false no
cluster_endpoint_public_access_cidrs List of CIDR blocks which can access the Amazon EKS public API server endpoint list(string) ["0.0.0.0/0"] no
cluster_identity_providers Map of cluster identity provider configurations to enable for the cluster. Note - this is different/separate from IRSA any {} no
cluster_ip_family The IP family used to assign Kubernetes pod and service addresses. Valid values are ipv4 (default) and ipv6. You can only specify an IP family when you create a cluster; changing this value will force a new cluster to be created string "ipv4" no
cluster_name Name of the EKS cluster string "" no
cluster_security_group_additional_rules List of additional security group rules to add to the cluster security group created. Set source_node_security_group = true inside rules to set the node_security_group as source any {} no
cluster_security_group_description Description of the cluster security group created string "EKS cluster security group" no
cluster_security_group_id Existing security group ID to be attached to the cluster string "" no
cluster_security_group_name Name to use on cluster security group created string null no
cluster_security_group_tags A map of additional tags to add to the cluster security group created map(string) {} no
cluster_security_group_use_name_prefix Determines whether cluster security group name (cluster_security_group_name) is used as a prefix bool true no
cluster_service_ipv4_cidr The CIDR block to assign Kubernetes service IP addresses from. If you don't specify a block, Kubernetes assigns addresses from either the 10.100.0.0/16 or 172.20.0.0/16 CIDR blocks string null no
cluster_service_ipv6_cidr The CIDR block to assign Kubernetes pod and service IP addresses from if ipv6 was specified when the cluster was created. Kubernetes assigns service addresses from the unique local address range (fc00::/7) because you can't specify a custom IPv6 CIDR block when you create the cluster string null no
cluster_tags A map of additional tags to add to the cluster map(string) {} no
cluster_timeouts Create, update, and delete timeout configurations for the cluster map(string) {} no
cluster_upgrade_policy Configuration block for the cluster upgrade policy any {} no
cluster_version Kubernetes <major>.<minor> version to use for the EKS cluster (e.g. 1.27) string null no
control_plane_subnet_ids A list of subnet IDs where the EKS cluster control plane (ENIs) will be provisioned. Used for expanding the pool of subnets used by nodes/node groups without replacing the EKS control plane list(string) [] no
create Controls if resources should be created (affects nearly all resources) bool true no
create_cloudwatch_log_group Determines whether a log group is created by this module for the cluster logs. If not, AWS will automatically create one if logging is enabled bool true no
create_cluster_primary_security_group_tags Indicates whether or not to tag the cluster's primary security group. This security group is created by the EKS service, not the module, and therefore tagging is handled after cluster creation bool true no
create_cluster_security_group Determines if a security group is created for the cluster. Note: the EKS service creates a primary security group for the cluster by default bool true no
create_cni_ipv6_iam_policy Determines whether to create an AmazonEKS_CNI_IPv6_Policy bool false no
create_iam_role Determines whether an IAM role is created or to use an existing IAM role bool true no
create_kms_key Controls if a KMS key for cluster encryption should be created bool true no
create_node_security_group Determines whether to create a security group for the node groups or use the existing node_security_group_id bool true no
custom_oidc_thumbprints Additional list of server certificate thumbprints for the OpenID Connect (OIDC) identity provider's server certificate(s) list(string) [] no
dataplane_wait_duration Duration to wait after the EKS cluster has become active before creating the dataplane components (EKS managed node group(s), self-managed node group(s), Fargate profile(s)) string "30s" no
eks_managed_node_group_defaults Map of EKS managed node group default configurations any {} no
eks_managed_node_groups Map of EKS managed node group definitions to create any {} no
enable_cluster_creator_admin_permissions Indicates whether or not to add the cluster creator (the identity used by Terraform) as an administrator via access entry bool false no
enable_efa_support Determines whether to enable Elastic Fabric Adapter (EFA) support bool false no
enable_irsa Determines whether to create an OpenID Connect Provider for EKS to enable IRSA bool true no
enable_kms_key_rotation Specifies whether key rotation is enabled bool true no
fargate_profile_defaults Map of Fargate Profile default configurations any {} no
fargate_profiles Map of Fargate Profile definitions to create any {} no
iam_role_additional_policies Additional policies to be added to the IAM role map(string) {} no
iam_role_arn Existing IAM role ARN for the cluster. Required if create_iam_role is set to false string null no
iam_role_description Description of the role string null no
iam_role_name Name to use on IAM role created string null no
iam_role_path Cluster IAM role path string null no
iam_role_permissions_boundary ARN of the policy that is used to set the permissions boundary for the IAM role string null no
iam_role_tags A map of additional tags to add to the IAM role created map(string) {} no
iam_role_use_name_prefix Determines whether the IAM role name (iam_role_name) is used as a prefix bool true no
include_oidc_root_ca_thumbprint Determines whether to include the root CA thumbprint in the OpenID Connect (OIDC) identity provider's server certificate(s) bool true no
kms_key_administrators A list of IAM ARNs for key administrators. If no value is provided, the current caller identity is used to ensure at least one key admin is available list(string) [] no
kms_key_aliases A list of aliases to create. Note - due to the use of toset(), values must be static strings and not computed values list(string) [] no
kms_key_deletion_window_in_days The waiting period, specified in number of days. After the waiting period ends, AWS KMS deletes the KMS key. If you specify a value, it must be between 7 and 30, inclusive. If you do not specify a value, it defaults to 30 number null no
kms_key_description The description of the key as viewed in AWS console string null no
kms_key_enable_default_policy Specifies whether to enable the default key policy bool true no
kms_key_override_policy_documents List of IAM policy documents that are merged together into the exported document. In merging, statements with non-blank sids will override statements with the same sid list(string) [] no
kms_key_owners A list of IAM ARNs for those who will have full key permissions (kms:*) list(string) [] no
kms_key_service_users A list of IAM ARNs for key service users list(string) [] no
kms_key_source_policy_documents List of IAM policy documents that are merged together into the exported document. Statements must have unique sids list(string) [] no
kms_key_users A list of IAM ARNs for key users list(string) [] no
node_security_group_additional_rules List of additional security group rules to add to the node security group created. Set source_cluster_security_group = true inside rules to set the cluster_security_group as source any {} no
node_security_group_description Description of the node security group created string "EKS node shared security group" no
node_security_group_enable_recommended_rules Determines whether to enable recommended security group rules for the node security group created. This includes node-to-node TCP ingress on ephemeral ports and allows all egress traffic bool true no
node_security_group_id ID of an existing security group to attach to the node groups created string "" no
node_security_group_name Name to use on node security group created string null no
node_security_group_tags A map of additional tags to add to the node security group created map(string) {} no
node_security_group_use_name_prefix Determines whether node security group name (node_security_group_name) is used as a prefix bool true no
openid_connect_audiences List of OpenID Connect audience client IDs to add to the IRSA provider list(string) [] no
outpost_config Configuration for the AWS Outpost to provision the cluster on any {} no
prefix_separator The separator to use between the prefix and the generated timestamp for resource names string "-" no
putin_khuylo Do you agree that Putin doesn't respect Ukrainian sovereignty and territorial integrity? More info: https://en.wikipedia.org/wiki/Putin_khuylo! bool true no
self_managed_node_group_defaults Map of self-managed node group default configurations any {} no
self_managed_node_groups Map of self-managed node group definitions to create any {} no
subnet_ids A list of subnet IDs where the nodes/node groups will be provisioned. If control_plane_subnet_ids is not provided, the EKS cluster control plane (ENIs) will be provisioned in these subnets list(string) [] no
tags A map of tags to add to all resources map(string) {} no
vpc_id ID of the VPC where the cluster security group will be provisioned string null no
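
To illustrate how a few of the inputs above combine, here is a hedged sketch that narrows public endpoint access, keeps control plane logging on with a shorter retention, and spells out the default secrets encryption settings. The CIDR block 203.0.113.0/24 is a documentation placeholder and the 30-day retention is an arbitrary choice; neither comes from this project.

```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0"

  # Truncated for brevity ...

  # Keep the private endpoint (default) and expose the public endpoint
  # only to a placeholder network instead of the default 0.0.0.0/0
  cluster_endpoint_private_access      = true
  cluster_endpoint_public_access       = true
  cluster_endpoint_public_access_cidrs = ["203.0.113.0/24"]

  # Default control plane log types, with retention lowered from the default 90 days
  cluster_enabled_log_types              = ["audit", "api", "authenticator"]
  cloudwatch_log_group_retention_in_days = 30

  # Defaults shown explicitly: a module-created KMS key encrypting secrets
  create_kms_key = true
  cluster_encryption_config = {
    resources = ["secrets"]
  }
}
```

Only the endpoint CIDRs and the retention period differ from the defaults here; the remaining values are listed to make the resulting behavior visible in one place.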

Outputs

Name Description
access_entries Map of access entries created and their attributes
access_policy_associations Map of EKS cluster access policy associations created and their attributes
cloudwatch_log_group_arn ARN of cloudwatch log group created
cloudwatch_log_group_name Name of cloudwatch log group created
cluster_addons Map of attribute maps for all EKS cluster addons enabled
cluster_arn The Amazon Resource Name (ARN) of the cluster
cluster_certificate_authority_data Base64 encoded certificate data required to communicate with the cluster
cluster_endpoint Endpoint for your Kubernetes API server
cluster_iam_role_arn IAM role ARN of the EKS cluster
cluster_iam_role_name IAM role name of the EKS cluster
cluster_iam_role_unique_id Stable and unique string identifying the IAM role
cluster_id The ID of the EKS cluster. Note: currently a value is returned only for local EKS clusters created on Outposts
cluster_identity_providers Map of attribute maps for all EKS identity providers enabled
cluster_ip_family The IP family used by the cluster (e.g. ipv4 or ipv6)
cluster_name The name of the EKS cluster
cluster_oidc_issuer_url The URL on the EKS cluster for the OpenID Connect identity provider
cluster_platform_version Platform version for the cluster
cluster_primary_security_group_id Cluster security group that was created by Amazon EKS for the cluster. Managed node groups use this security group for control-plane-to-data-plane communication. Referred to as 'Cluster security group' in the EKS console
cluster_security_group_arn Amazon Resource Name (ARN) of the cluster security group
cluster_security_group_id ID of the cluster security group
cluster_service_cidr The CIDR block where Kubernetes pod and service IP addresses are assigned from
cluster_status Status of the EKS cluster. One of CREATING, ACTIVE, DELETING, FAILED
cluster_tls_certificate_sha1_fingerprint The SHA1 fingerprint of the public key of the cluster's certificate
cluster_version The Kubernetes version for the cluster
eks_managed_node_groups Map of attribute maps for all EKS managed node groups created
eks_managed_node_groups_autoscaling_group_names List of the autoscaling group names created by EKS managed node groups
fargate_profiles Map of attribute maps for all EKS Fargate Profiles created
kms_key_arn The Amazon Resource Name (ARN) of the key
kms_key_id The globally unique identifier for the key
kms_key_policy The IAM resource policy set on the key
node_security_group_arn Amazon Resource Name (ARN) of the node shared security group
node_security_group_id ID of the node shared security group
oidc_provider The OpenID Connect identity provider (issuer URL without leading https://)
oidc_provider_arn The ARN of the OIDC Provider if enable_irsa = true
self_managed_node_groups Map of attribute maps for all self managed node groups created
self_managed_node_groups_autoscaling_group_names List of the autoscaling group names created by self-managed node groups
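
One common way to consume these outputs is to configure the Kubernetes provider against the new cluster. The sketch below assumes the module is instantiated as module "eks", as in the Usage example, that the hashicorp/kubernetes provider is used, and that the AWS CLI is available wherever Terraform runs; none of this is prescribed by the module itself.

```hcl
provider "kubernetes" {
  # API server endpoint and CA bundle come straight from the module outputs
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  # Fetch a short-lived token at plan/apply time instead of storing one in state
  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", module.eks.cluster_name]
  }
}
```

The cluster_certificate_authority_data output is Base64 encoded, which is why it is passed through base64decode before being handed to the provider.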

License

Apache 2 Licensed. See LICENSE for full details.

Additional information for users from Russia and Belarus