[ACM-14076]: Ensure CMO ConfigMap is reconciled on any event #1610

Merged: 5 commits into stolostron:main on Sep 10, 2024

Conversation

@philipgough (Contributor) commented Sep 6, 2024:

  1. Adds an integration test for merging of the CMO ConfigMap with other clients
  2. Adds the Delete event to the Watch for the CMO ConfigMap (see the predicate sketch below)
  3. Explicitly adds the CMO ConfigMap reference to the filtered cache
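
For context, a minimal sketch of what item 2 amounts to, assuming controller-runtime's predicate API. This is not the repo's actual getPred implementation, and the flag order is inferred from the diff further down:

```go
package predicatesketch

import (
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/event"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
)

// getPred narrows events to a single name/namespace; the three booleans
// gate Create, Update, and Delete events. This PR flips the last flag to
// true so deleting the CMO ConfigMap also triggers a reconcile.
func getPred(name, namespace string, create, update, del bool) predicate.Funcs {
	match := func(o client.Object) bool {
		return o.GetName() == name && o.GetNamespace() == namespace
	}
	return predicate.Funcs{
		CreateFunc: func(e event.CreateEvent) bool { return create && match(e.Object) },
		UpdateFunc: func(e event.UpdateEvent) bool { return update && match(e.ObjectNew) },
		DeleteFunc: func(e event.DeleteEvent) bool { return del && match(e.Object) },
	}
}
```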

@philipgough (Contributor, Author) commented:

/retest-required

@coleenquadros (Contributor) commented:

/retest

@philipgough (Contributor, Author) commented:

/test test-e2e

sonarcloud bot commented Sep 9, 2024:

Quality Gate failed

Failed conditions
62.5% Coverage on New Code (required ≥ 70%)

See analysis details on SonarCloud

openshift-ci bot commented Sep 9, 2024:

@philipgough: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: ci/prow/sonarcloud
Commit: 7b0dbb2
Required: false
Rerun command: /test sonarcloud

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@philipgough (Contributor, Author) commented:

/test test-e2e

@philipgough changed the title from "Test 2" to "[ACM-14076]: Ensure CMO ConfigMap is reconciled on any event" on Sep 10, 2024
@moadz (Contributor) left a comment:

LGTM thanks for this Philip! Heroic work.

@@ -555,7 +557,7 @@ func (r *ObservabilityAddonReconciler) SetupWithManager(mgr ctrl.Manager) error
Watches(
&corev1.ConfigMap{},
&handler.EnqueueRequestForObject{},
-			builder.WithPredicates(getPred(clusterMonitoringConfigName, promNamespace, true, true, false)),
+			builder.WithPredicates(getPred(clusterMonitoringConfigName, promNamespace, true, true, true)),
@moadz (Contributor) commented:

We discussed this; it probably makes sense to create the ConfigMap when it's missing, to ensure that alerts are still being forwarded.
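
A rough sketch of that recreate-on-missing behavior inside the reconciler. Illustrative only: ensureCMOConfigMap and desiredClusterMonitoringConfig are hypothetical names, and the client/field layout is assumed.

```go
// Assumes: apierrors "k8s.io/apimachinery/pkg/api/errors" and
// "k8s.io/apimachinery/pkg/types" are imported.
func (r *ObservabilityAddonReconciler) ensureCMOConfigMap(ctx context.Context) error {
	cm := &corev1.ConfigMap{}
	err := r.Client.Get(ctx, types.NamespacedName{
		Name:      clusterMonitoringConfigName,
		Namespace: promNamespace,
	}, cm)
	if apierrors.IsNotFound(err) {
		// The ConfigMap was deleted: recreate it so the additional
		// Alertmanager config (and alert forwarding) is restored.
		return r.Client.Create(ctx, desiredClusterMonitoringConfig())
	}
	return err
}
```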


mgr, err := ctrl.NewManager(testEnvHub.Config, ctrl.Options{
Scheme: k8sClient.Scheme(),
Metrics: metricsserver.Options{BindAddress: "0"}, // Avoids port conflict with the default port 8080
@moadz (Contributor) commented:

👍🏽 wildcard port, but it is reserved and we lose the ability to discover the port if we need it. In this case that's not a problem.
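
For reference on the discoverability point: a plain Go listener bound to ":0" lets the kernel pick a free port and still reports which one it chose. A generic sketch, separate from the controller-runtime options above:

```go
package main

import (
	"fmt"
	"log"
	"net"
)

func main() {
	// Bind to ":0": the kernel picks any free port, and the listener
	// can tell us which one it chose.
	ln, err := net.Listen("tcp", ":0")
	if err != nil {
		log.Fatal(err)
	}
	defer ln.Close()
	fmt.Println("listening on port", ln.Addr().(*net.TCPAddr).Port)
}
```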

}()

cm := &corev1.ConfigMap{}
err = wait.Poll(1*time.Second, time.Minute, func() (bool, error) {
@moadz (Contributor) commented:

This might make the unit suite take a bit longer to run. Is it worth checking less often? Perhaps every 10s.
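
The suggested cadence would be a one-line change. A sketch; checkConfigMap is a hypothetical helper standing in for the condition body above:

```go
err = wait.Poll(10*time.Second, time.Minute, func() (done bool, err error) {
	// Same condition as above, evaluated every 10s instead of every 1s.
	return checkConfigMap() // hypothetical helper wrapping the existing checks
})
```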

Comment on lines +157 to +159
foundClusterMonitoringConfiguration := &cmomanifests.ClusterMonitoringConfiguration{}
err = yaml2.Unmarshal([]byte(cm.Data[clusterMonitoringConfigDataKey]), foundClusterMonitoringConfiguration)
assert.NoError(t, err)
@moadz (Contributor) commented:

super nit: readability. I'm not sure if I'm remembering incorrectly, but I think you already spun this into a util function in the ocp_monitoring_config.go refactor.
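
If such a util doesn't already exist, it could look something like this (a sketch; the helper name is illustrative, the identifiers are the ones used in this test):

```go
// parseClusterMonitoringConfig unmarshals the CMO config from the
// ConfigMap's data key and fails the test on error.
func parseClusterMonitoringConfig(t *testing.T, cm *corev1.ConfigMap) *cmomanifests.ClusterMonitoringConfiguration {
	t.Helper()
	cfg := &cmomanifests.ClusterMonitoringConfiguration{}
	err := yaml2.Unmarshal([]byte(cm.Data[clusterMonitoringConfigDataKey]), cfg)
	assert.NoError(t, err)
	return cfg
}
```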

assert.Len(t, foundClusterMonitoringConfiguration.PrometheusK8sConfig.AlertmanagerConfigs, 1)
assert.Equal(t, foundClusterMonitoringConfiguration.PrometheusK8sConfig.AlertmanagerConfigs[0].Scheme, "https")

foundClusterMonitoringConfiguration.PrometheusK8sConfig.AlertmanagerConfigs[0].Scheme = "http"
@moadz (Contributor) commented:

super nit: might be worth having a JSON->JSON comparison for these diffs to improve readability, instead of field extracts/asserts, but that would require a refactor of this entire test case, which neither of us is willing to do 😆
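
One way to get that diff-style readability with testify is to marshal both sides and let assert.JSONEq print the structural difference. A sketch: the expected struct literal is illustrative, and the element type is assumed to live in the cmomanifests package:

```go
want, err := json.Marshal(cmomanifests.AdditionalAlertmanagerConfig{Scheme: "https"})
assert.NoError(t, err)
got, err := json.Marshal(foundClusterMonitoringConfiguration.PrometheusK8sConfig.AlertmanagerConfigs[0])
assert.NoError(t, err)
// On mismatch, JSONEq reports a readable expected/actual diff.
assert.JSONEq(t, string(want), string(got))
```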

Comment on lines +181 to +192
foundUpdatedClusterMonitoringConfiguration := &cmomanifests.ClusterMonitoringConfiguration{}
err = yaml2.Unmarshal([]byte(updated.Data[clusterMonitoringConfigDataKey]), foundUpdatedClusterMonitoringConfiguration)
if err != nil {
return false, nil
}

if foundUpdatedClusterMonitoringConfiguration.PrometheusK8sConfig.AlertmanagerConfigs[0].Scheme != "https" {
return false, nil
}

if foundUpdatedClusterMonitoringConfiguration.PrometheusK8sConfig.Retention != "infinity-and-beyond" {
return false, nil
@moadz (Contributor) commented:

super nit: same as the comment above; it might be better for test readability to have this as a straight expected/actual comparison. But not a must for this PR.

@@ -394,16 +394,20 @@ func createOrUpdateClusterMonitoringConfig(
// check if alertmanagerConfigs exists
if foundClusterMonitoringConfiguration.PrometheusK8sConfig.AlertmanagerConfigs != nil {
additionalAlertmanagerConfigExists := false
-	for _, v := range foundClusterMonitoringConfiguration.PrometheusK8sConfig.AlertmanagerConfigs {
+	var atIndex int
+	for i, v := range foundClusterMonitoringConfiguration.PrometheusK8sConfig.AlertmanagerConfigs {
@moadz (Contributor) commented:

Would it be easier here to have a func that just checks for the Alertmanager URL that we care about, and returns true if it finds the object? Or, conversely, could we always just add it on every reconcile? Is there a cost to updating this field? (Does Alertmanager restart?)
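
A sketch of the first suggestion; the element type and the StaticConfigs field name are assumptions about the cmomanifests package:

```go
// indexOfAlertmanager returns the index of the first config whose static
// targets include the given URL, or -1 if none matches.
func indexOfAlertmanager(configs []cmomanifests.AdditionalAlertmanagerConfig, url string) int {
	for i, c := range configs {
		for _, target := range c.StaticConfigs {
			if target == url {
				return i
			}
		}
	}
	return -1
}
```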

openshift-ci bot commented Sep 10, 2024:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: moadz, philipgough

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@@ -95,6 +95,8 @@ func main() {
{FieldSelector: namespaceSelector},
{FieldSelector: fmt.Sprintf("metadata.name==%s,metadata.namespace!=%s",
operatorconfig.AllowlistCustomConfigMapName, "open-cluster-management-observability")},
{FieldSelector: fmt.Sprintf("metadata.name==%s,metadata.namespace==%s",
operatorconfig.OCPClusterMonitoringConfigMapName, operatorconfig.OCPClusterMonitoringNamespace)},
@moadz (Contributor) commented:

Is it worth an inline comment to describe the issue we ran into when not specifying the namespace on a watched resource? Or is it neither here nor there?
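
The suggested inline comment might read roughly like this; the wording is a guess, since the exact issue is only alluded to above:

```go
// Scope the selector to both name and namespace: filtering the watched
// CMO ConfigMap by name alone caused the cache to miss events for it.
{FieldSelector: fmt.Sprintf("metadata.name==%s,metadata.namespace==%s",
	operatorconfig.OCPClusterMonitoringConfigMapName, operatorconfig.OCPClusterMonitoringNamespace)},
```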

@philipgough merged commit 59654ff into stolostron:main on Sep 10, 2024
19 of 22 checks passed
@philipgough deleted the test-2 branch on September 10, 2024 at 11:29