[ACM-14076]: Ensure CMO ConfigMap is reconciled on any event #1610
Conversation
philipgough commented Sep 6, 2024 (edited)
- Adds an integration test for merging of CMO ConfigMap with other clients
- Adds the Delete event to the Watch for CMO ConfigMap
- Explicitly adds the CMO ConfigMap reference to the filtered cache
Signed-off-by: Philip Gough <philip.p.gough@gmail.com>
/retest-required
/retest
/test test-e2e
Quality Gate failed: Failed conditions
@philipgough: The following test failed.
/test test-e2e
LGTM thanks for this Philip! Heroic work.
@@ -555,7 +557,7 @@ func (r *ObservabilityAddonReconciler) SetupWithManager(mgr ctrl.Manager) error
 		Watches(
 			&corev1.ConfigMap{},
 			&handler.EnqueueRequestForObject{},
-			builder.WithPredicates(getPred(clusterMonitoringConfigName, promNamespace, true, true, false)),
+			builder.WithPredicates(getPred(clusterMonitoringConfigName, promNamespace, true, true, true)),
We discussed this; it probably makes sense to create the ConfigMap when it's missing, to ensure that alerts are being forwarded.
mgr, err := ctrl.NewManager(testEnvHub.Config, ctrl.Options{
	Scheme:  k8sClient.Scheme(),
	Metrics: metricsserver.Options{BindAddress: "0"}, // Avoids port conflict with the default port 8080
👍🏽 wildcard port, but it is reserved and we lose the ability to discover the port if we need it. In this case that's not a problem.
}()
cm := &corev1.ConfigMap{}
err = wait.Poll(1*time.Second, time.Minute, func() (bool, error) {
This might make the unit suite take a bit longer to run. Is it worth checking less often, perhaps every 10s?
foundClusterMonitoringConfiguration := &cmomanifests.ClusterMonitoringConfiguration{}
err = yaml2.Unmarshal([]byte(cm.Data[clusterMonitoringConfigDataKey]), foundClusterMonitoringConfiguration)
assert.NoError(t, err)
super nit (readability): I'm not sure if I'm remembering incorrectly, but I think you already spun this into a util function in the ocp_monitoring_config.go refactor.
assert.Len(t, foundClusterMonitoringConfiguration.PrometheusK8sConfig.AlertmanagerConfigs, 1)
assert.Equal(t, foundClusterMonitoringConfiguration.PrometheusK8sConfig.AlertmanagerConfigs[0].Scheme, "https")

foundClusterMonitoringConfiguration.PrometheusK8sConfig.AlertmanagerConfigs[0].Scheme = "http"
super nit: it might be worth having a JSON-to-JSON comparison for these diffs to improve readability, instead of field extracts/asserts, but that would require a refactor of this entire test case, which neither of us is willing to do 😆
foundUpdatedClusterMonitoringConfiguration := &cmomanifests.ClusterMonitoringConfiguration{}
err = yaml2.Unmarshal([]byte(updated.Data[clusterMonitoringConfigDataKey]), foundUpdatedClusterMonitoringConfiguration)
if err != nil {
	return false, nil
}

if foundUpdatedClusterMonitoringConfiguration.PrometheusK8sConfig.AlertmanagerConfigs[0].Scheme != "https" {
	return false, nil
}

if foundUpdatedClusterMonitoringConfiguration.PrometheusK8sConfig.Retention != "infinity-and-beyond" {
	return false, nil
super nit: same as the above comment; it might be better for test readability to have this as a straight expected/actual comparison. But not a must for this PR.
@@ -394,16 +394,20 @@ func createOrUpdateClusterMonitoringConfig(
 	// check if alertmanagerConfigs exists
 	if foundClusterMonitoringConfiguration.PrometheusK8sConfig.AlertmanagerConfigs != nil {
 		additionalAlertmanagerConfigExists := false
-		for _, v := range foundClusterMonitoringConfiguration.PrometheusK8sConfig.AlertmanagerConfigs {
+		var atIndex int
+		for i, v := range foundClusterMonitoringConfiguration.PrometheusK8sConfig.AlertmanagerConfigs {
Would it be easier here to have a func that just checks for the alertmanager URL that we care about, and returns true if it finds the object? Or conversely, we could always just add it when there's a reconcile. Is there a cost to updating this field? (Does Alertmanager restart?)
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: moadz, philipgough. The full list of commands accepted by this bot can be found here. The pull request process is described here.
@@ -95,6 +95,8 @@ func main() {
 		{FieldSelector: namespaceSelector},
 		{FieldSelector: fmt.Sprintf("metadata.name==%s,metadata.namespace!=%s",
 			operatorconfig.AllowlistCustomConfigMapName, "open-cluster-management-observability")},
+		{FieldSelector: fmt.Sprintf("metadata.name==%s,metadata.namespace==%s",
+			operatorconfig.OCPClusterMonitoringConfigMapName, operatorconfig.OCPClusterMonitoringNamespace)},
Is it worth an inline comment describing the issue we ran into when not specifying the namespace on a watched resource? Or is it neither here nor there?