feat(cluster-autoscaler/exoscale): add support for --nodes #2

Merged · 1 commit · Apr 27, 2024
2 changes: 1 addition & 1 deletion charts/cluster-autoscaler/Chart.yaml
@@ -11,4 +11,4 @@ name: cluster-autoscaler
sources:
- https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler
type: application
version: 9.36.0
version: 9.36.1
22 changes: 15 additions & 7 deletions charts/cluster-autoscaler/README.md
@@ -238,10 +238,20 @@ Additional config parameters available, see the `values.yaml` for more details

### Exoscale

The following parameters are required:
Create a `values.yaml` file with the following content:
```yaml
cloudProvider: exoscale
autoDiscovery:
clusterName: cluster.local # this value is not used, but must be set
```

- `cloudProvider=exoscale`
- `autoDiscovery.clusterName=<CLUSTER NAME>`
Optionally, you may specify the minimum and maximum size of a particular nodepool by adding the following to the `values.yaml` file:
```yaml
autoscalingGroups:
- name: your-nodepool-name
maxSize: 10
minSize: 1
```

Create an Exoscale API key with appropriate permissions as described in [cluster-autoscaler/cloudprovider/exoscale/README.md](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/exoscale/README.md).
A secret of name `<release-name>-exoscale-cluster-autoscaler` needs to be created, containing the api key and secret, as well as the zone.
Expand All @@ -255,9 +265,7 @@ $ kubectl create secret generic my-release-exoscale-cluster-autoscaler \
After creating the secret, the chart may be installed:

```console
$ helm install my-release autoscaler/cluster-autoscaler \
--set cloudProvider=exoscale \
--set autoDiscovery.clusterName=<CLUSTER NAME>
$ helm install my-release autoscaler/cluster-autoscaler -f values.yaml
```

Read [cluster-autoscaler/cloudprovider/exoscale/README.md](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/exoscale/README.md) for further information on the setup without helm.
@@ -391,7 +399,7 @@ vpa:
| autoDiscovery.namespace | string | `nil` | Enable autodiscovery via cluster namespace for `cloudProvider=clusterapi` |
| autoDiscovery.roles | list | `["worker"]` | Magnum node group roles to match. |
| autoDiscovery.tags | list | `["k8s.io/cluster-autoscaler/enabled","k8s.io/cluster-autoscaler/{{ .Values.autoDiscovery.clusterName }}"]` | ASG tags to match, run through `tpl`. |
| autoscalingGroups | list | `[]` | For AWS, Azure AKS or Magnum. At least one element is required if not using `autoDiscovery`. For example: <pre> - name: asg1<br /> maxSize: 2<br /> minSize: 1 </pre> For Hetzner Cloud, the `instanceType` and `region` keys are also required. <pre> - name: mypool<br /> maxSize: 2<br /> minSize: 1<br /> instanceType: CPX21<br /> region: FSN1 </pre> |
| autoscalingGroups | list | `[]` | For AWS, Azure AKS, Exoscale or Magnum. At least one element is required if not using `autoDiscovery`. For example: <pre> - name: asg1<br /> maxSize: 2<br /> minSize: 1 </pre> For Hetzner Cloud, the `instanceType` and `region` keys are also required. <pre> - name: mypool<br /> maxSize: 2<br /> minSize: 1<br /> instanceType: CPX21<br /> region: FSN1 </pre> |
| autoscalingGroupsnamePrefix | list | `[]` | For GCE. At least one element is required if not using `autoDiscovery`. For example: <pre> - name: ig01<br /> maxSize: 10<br /> minSize: 0 </pre> |
| awsAccessKeyID | string | `""` | AWS access key ID ([if AWS user keys used](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md#using-aws-credentials)) |
| awsRegion | string | `"us-east-1"` | AWS region (required if `cloudProvider=aws`) |
20 changes: 14 additions & 6 deletions charts/cluster-autoscaler/README.md.gotmpl
@@ -238,10 +238,20 @@ Additional config parameters available, see the `values.yaml` for more details

### Exoscale

The following parameters are required:
Create a `values.yaml` file with the following content:
```yaml
cloudProvider: exoscale
autoDiscovery:
clusterName: cluster.local # this value is not used, but must be set
```

- `cloudProvider=exoscale`
- `autoDiscovery.clusterName=<CLUSTER NAME>`
Optionally, you may specify the minimum and maximum size of a particular nodepool by adding the following to the `values.yaml` file:
```yaml
autoscalingGroups:
- name: your-nodepool-name
maxSize: 10
minSize: 1
```

Create an Exoscale API key with appropriate permissions as described in [cluster-autoscaler/cloudprovider/exoscale/README.md](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/exoscale/README.md).
A secret of name `<release-name>-exoscale-cluster-autoscaler` needs to be created, containing the api key and secret, as well as the zone.
Expand All @@ -255,9 +265,7 @@ $ kubectl create secret generic my-release-exoscale-cluster-autoscaler \
After creating the secret, the chart may be installed:

```console
$ helm install my-release autoscaler/cluster-autoscaler \
--set cloudProvider=exoscale \
--set autoDiscovery.clusterName=<CLUSTER NAME>
$ helm install my-release autoscaler/cluster-autoscaler -f values.yaml
```

Read [cluster-autoscaler/cloudprovider/exoscale/README.md](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/exoscale/README.md) for further information on the setup without helm.
2 changes: 1 addition & 1 deletion charts/cluster-autoscaler/values.yaml
@@ -33,7 +33,7 @@ autoDiscovery:
labels: []
# - color: green
# - shape: circle
# autoscalingGroups -- For AWS, Azure AKS or Magnum. At least one element is required if not using `autoDiscovery`. For example:
# autoscalingGroups -- For AWS, Azure AKS, Exoscale or Magnum. At least one element is required if not using `autoDiscovery`. For example:
# <pre>
# - name: asg1<br />
# maxSize: 2<br />
34 changes: 28 additions & 6 deletions cluster-autoscaler/cloudprovider/exoscale/README.md
@@ -3,9 +3,19 @@
The Cluster Autoscaler (CA) for Exoscale scales worker nodes running in
Exoscale SKS Nodepools or Instance Pools.

- [Cluster Autoscaler for Exoscale](#cluster-autoscaler-for-exoscale)
- [Configuration](#configuration)
- [Authenticating to the Exoscale API](#authenticating-to-the-exoscale-api)
- [Optional configuration](#optional-configuration)
- [Deployment](#deployment)
- [Helm](#helm)
- [Manifest](#manifest)
- [⚠️ Important Notes](#️--important-notes)

## Configuration

### Authenticating to the Exoscale API

> Note: the following guide assumes you have the permissions to create
> resources in the `kube-system` namespace of the target Kubernetes cluster.

@@ -49,7 +59,7 @@ environment.
You can restrict the API operations your IAM key can perform:

* When deploying the Cluster Autoscaler in SKS, you can restrict your IAM access key
to these API operations:

```
evict-sks-nodepool-members
Expand All @@ -74,7 +84,19 @@ get-quota
scale-instance-pool
```

### Deploying the Cluster Autoscaler
### Optional configuration

By default, all nodepools in the k8s cluster are considered for scaling.
The flag `--nodes=<min>:<max>:<nodepool-name>` may be specified to limit the minimum and
maximum size of a particular nodepool.
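
As an illustration, here is a minimal sketch of how the flag might be passed to the autoscaler container in a Deployment spec; the container name, the nodepool names (`workers`, `batch`) and the size bounds are placeholders:

```yaml
containers:
  - name: cluster-autoscaler
    command:
      - ./cluster-autoscaler
      - --cloud-provider=exoscale
      # limit the "workers" nodepool to between 1 and 10 nodes
      - --nodes=1:10:workers
      # the flag may be repeated for additional nodepools
      - --nodes=2:20:batch
```

Nodepools that are not named in any `--nodes` flag keep the defaults described in the Important Notes below.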

## Deployment

### Helm

See the [Helm Chart README](https://github.com/kubernetes/autoscaler/tree/master/charts/cluster-autoscaler).

### Manifest

To deploy the CA on your Kubernetes cluster, you can use the manifest provided as an example:

Expand All @@ -92,10 +114,10 @@ kubectl apply -f ./examples/cluster-autoscaler.yaml

## ⚠️ Important Notes

* The minimum node group size is 1
* The maximum node group size is computed based on the current [Compute
instances limit][exo-limits] of the Exoscale account the Cluster Autoscaler
is running in.
* The minimum and maximum node group size of particular nodepools
may be specified via the `--nodes` flag. If omitted (the default),
the minimum is 1 and the maximum is computed based on the current [Compute instances limit][exo-limits]
of the Exoscale account the Cluster Autoscaler is running in.
* The Instance Pool candidate for scaling is determined based on the Compute
instance the Kubernetes node is running on, depending on cluster resource
constraining events emitted by the Kubernetes scheduler.
@@ -24,6 +24,7 @@ import (
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider"
egoscale "k8s.io/autoscaler/cluster-autoscaler/cloudprovider/exoscale/internal/github.com/exoscale/egoscale/v2"
"k8s.io/autoscaler/cluster-autoscaler/config"
"k8s.io/autoscaler/cluster-autoscaler/config/dynamic"
"k8s.io/autoscaler/cluster-autoscaler/utils/errors"
"k8s.io/autoscaler/cluster-autoscaler/utils/gpu"
)
@@ -98,10 +99,38 @@ func (e *exoscaleCloudProvider) NodeGroupForNode(node *apiv1.Node) (cloudprovide
)
}

// nodeGroupSpec contains the configuration spec from the '--nodes' flag
// which includes the min and max size of the node group.
var nodeGroupSpec *dynamic.NodeGroupSpec
for _, spec := range e.manager.discoveryOpts.NodeGroupSpecs {
s, err := dynamic.SpecFromString(spec, scaleToZeroSupported)
if err != nil {
return nil, fmt.Errorf("failed to parse node group spec: %v", err)
}

if s.Name == *sksNodepool.Name {
nodeGroupSpec = s
break
}
}
var minSize, maxSize int
if nodeGroupSpec != nil {
minSize = nodeGroupSpec.MinSize
maxSize = nodeGroupSpec.MaxSize
} else {
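// No '--nodes' spec was provided for this nodepool: fall back to the
// defaults (a minimum of 1 and a maximum derived from the account's
// Compute instance quota).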
minSize = 1
maxSize, err = e.manager.computeInstanceQuota()
if err != nil {
return nil, err
}
}

nodeGroup = &sksNodepoolNodeGroup{
sksNodepool: sksNodepool,
sksCluster: sksCluster,
m: e.manager,
minSize: minSize,
maxSize: maxSize,
}
debugf("found node %s belonging to SKS Nodepool %s", toNodeID(node.Spec.ProviderID), *sksNodepool.ID)
} else {
@@ -196,15 +225,15 @@ func (e *exoscaleCloudProvider) Refresh() error {
}

// BuildExoscale builds the Exoscale cloud provider.
func BuildExoscale(_ config.AutoscalingOptions, _ cloudprovider.NodeGroupDiscoveryOptions, rl *cloudprovider.ResourceLimiter) cloudprovider.CloudProvider {
manager, err := newManager()
func BuildExoscale(_ config.AutoscalingOptions, discoveryOpts cloudprovider.NodeGroupDiscoveryOptions, rl *cloudprovider.ResourceLimiter) cloudprovider.CloudProvider {
manager, err := newManager(discoveryOpts)
if err != nil {
fatalf("failed to initialize manager: %v", err)
}

// The cloud provider automatically uses all Instance Pools in the k8s cluster.
// This means we don't use the cloudprovider.NodeGroupDiscoveryOptions
// flags (which can be set via '--node-group-auto-discovery' or '-nodes')
// The flag '--nodes=1:5:nodepoolname' may be specified to limit the size of a nodepool.
// The flag '--node-group-auto-discovery' is not implemented.
provider, err := newExoscaleCloudProvider(manager, rl)
if err != nil {
fatalf("failed to create Exoscale cloud provider: %v", err)
@@ -128,7 +128,7 @@ func (ts *cloudProviderTestSuite) SetupTest() {
ts.T().Setenv("EXOSCALE_API_KEY", "x")
ts.T().Setenv("EXOSCALE_API_SECRET", "x")

manager, err := newManager()
manager, err := newManager(cloudprovider.NodeGroupDiscoveryOptions{})
if err != nil {
ts.T().Fatalf("error initializing cloud provider manager: %v", err)
}
@@ -214,6 +214,17 @@ func (ts *cloudProviderTestSuite) TestExoscaleCloudProvider_NodeGroupForNode_Ins
}

func (ts *cloudProviderTestSuite) TestExoscaleCloudProvider_NodeGroupForNode_SKSNodepool() {
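// Without a '--nodes' spec, the node group's maximum size is derived from
// the account's Compute instance quota, so the mocked client must answer GetQuota.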
ts.p.manager.client.(*exoscaleClientMock).
On("GetQuota", ts.p.manager.ctx, ts.p.manager.zone, testComputeInstanceQuotaName).
Return(
&egoscale.Quota{
Resource: &testComputeInstanceQuotaName,
Usage: &testComputeInstanceQuotaUsage,
Limit: &testComputeInstanceQuotaLimit,
},
nil,
)

ts.p.manager.client.(*exoscaleClientMock).
On("ListSKSClusters", ts.p.manager.ctx, ts.p.manager.zone).
Return(
@@ -313,6 +324,17 @@ func (ts *cloudProviderTestSuite) TestExoscaleCloudProvider_NodeGroups() {
// Nodegroup. If everything works as expected, the
// cloudprovider.NodeGroups() method should return 2 Nodegroups.

ts.p.manager.client.(*exoscaleClientMock).
On("GetQuota", ts.p.manager.ctx, ts.p.manager.zone, testComputeInstanceQuotaName).
Return(
&egoscale.Quota{
Resource: &testComputeInstanceQuotaName,
Usage: &testComputeInstanceQuotaUsage,
Limit: &testComputeInstanceQuotaLimit,
},
nil,
)

ts.p.manager.client.(*exoscaleClientMock).
On("GetInstancePool", ts.p.manager.ctx, ts.p.manager.zone, instancePoolID).
Return(
18 changes: 10 additions & 8 deletions cluster-autoscaler/cloudprovider/exoscale/exoscale_manager.go
@@ -43,13 +43,14 @@ const defaultAPIEnvironment = "api"
// Manager handles Exoscale communication and data caching of
// node groups (Instance Pools).
type Manager struct {
ctx context.Context
client exoscaleClient
zone string
nodeGroups []cloudprovider.NodeGroup
ctx context.Context
client exoscaleClient
zone string
nodeGroups []cloudprovider.NodeGroup
discoveryOpts cloudprovider.NodeGroupDiscoveryOptions
}

func newManager() (*Manager, error) {
func newManager(discoveryOpts cloudprovider.NodeGroupDiscoveryOptions) (*Manager, error) {
var (
zone string
apiKey string
@@ -82,9 +83,10 @@ func newManager() (*Manager, error) {
debugf("initializing manager with zone=%s environment=%s", zone, apiEnvironment)

m := &Manager{
ctx: exoapi.WithEndpoint(context.Background(), exoapi.NewReqEndpoint(apiEnvironment, zone)),
client: client,
zone: zone,
ctx: exoapi.WithEndpoint(context.Background(), exoapi.NewReqEndpoint(apiEnvironment, zone)),
client: client,
zone: zone,
discoveryOpts: discoveryOpts,
}

return m, nil
@@ -19,18 +19,19 @@ package exoscale
import (
"os"

"k8s.io/autoscaler/cluster-autoscaler/cloudprovider"
egoscale "k8s.io/autoscaler/cluster-autoscaler/cloudprovider/exoscale/internal/github.com/exoscale/egoscale/v2"
)

func (ts *cloudProviderTestSuite) TestNewManager() {
manager, err := newManager()
manager, err := newManager(cloudprovider.NodeGroupDiscoveryOptions{})
ts.Require().NoError(err)
ts.Require().NotNil(manager)

os.Unsetenv("EXOSCALE_API_KEY")
os.Unsetenv("EXOSCALE_API_SECRET")

manager, err = newManager()
manager, err = newManager(cloudprovider.NodeGroupDiscoveryOptions{})
ts.Require().Error(err)
ts.Require().Nil(manager)
}
@@ -28,6 +28,10 @@ import (
schedulerframework "k8s.io/kubernetes/pkg/scheduler/framework"
)

const (
scaleToZeroSupported = false
)

// sksNodepoolNodeGroup implements cloudprovider.NodeGroup interface for Exoscale SKS Nodepools.
type sksNodepoolNodeGroup struct {
sksNodepool *egoscale.SKSNodepool
@@ -36,21 +40,19 @@ type sksNodepoolNodeGroup struct {
m *Manager

sync.Mutex

minSize int
maxSize int
}

// MaxSize returns maximum size of the node group.
func (n *sksNodepoolNodeGroup) MaxSize() int {
limit, err := n.m.computeInstanceQuota()
if err != nil {
return 0
}

return limit
return n.maxSize
}

// MinSize returns minimum size of the node group.
func (n *sksNodepoolNodeGroup) MinSize() int {
return 1
return n.minSize
}

// TargetSize returns the current target size of the node group. It is possible that the