Support disable DR when cluster is unavailable #1147
Conversation
Force-pushed from 22dff67 to f404c45
Note from discussion with @ShyamsundarR
Wanted behavior when drcluster is deleted:
Force-pushed from 8b29b1f to 355263f
@ShyamsundarR @BenamarMk please review.
In the cluster replacement flow, we may need to disable DR when a drcluster is not available, which breaks the disable DR flow in many places. To allow cleanup to proceed on the hub and on the remaining managed cluster, the user will have to delete the drcluster for the failed cluster. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
When getting or updating vrgs from managed clusters, we passed the drpolicy and used its drcluster names for fetching the vrg. To handle an unavailable drcluster, we need the DRCluster type instead of its name, but we don't have the required context and client to get it in getVRGsFromManagedClusters(). Fixed by getting the drclusters in the caller, where we have the context and client, and passing down the drcluster list instead of the drpolicy. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
When a drcluster is unreachable, the ManagedClusterView may continue to report the last status for the VRG and no error. This fails disable DR when we wait until all VRGs are deleted. Fixed by ignoring VRGs on deleted drclusters. Once the VRG on the available drcluster is deleted, the VRG count reaches zero and we can complete drpc deletion. Since this change is in getVRGsFromManagedClusters(), it also affects creating the DRPCInstance and maintenance mode. I'm not sure about these changes yet, but it seems like a good idea not to handle a VRG on a deleted drcluster. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
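A minimal sketch of the reshaped lookup described above, within the controllers package; the exact signature differs (the real function also takes the context, client, and VRG namespace/name), and the fetch via ManagedClusterView is elided:

// getVRGsFromManagedClusters now receives the drcluster list instead of the
// drpolicy, so it can skip clusters that are marked for deletion.
func getVRGsFromManagedClusters(drClusters []rmn.DRCluster) (map[string]*rmn.VolumeReplicationGroup, error) {
	vrgs := map[string]*rmn.VolumeReplicationGroup{}

	for i := range drClusters {
		drCluster := &drClusters[i]

		// A deleted drcluster may be unreachable, and its ManagedClusterView
		// can keep reporting a stale VRG with no error. Skip it so the VRG
		// count can reach zero and drpc deletion can complete.
		if drClusterIsDeleted(drCluster) {
			continue
		}

		// ... fetch the VRG for drCluster.Name via ManagedClusterView and
		// add it to vrgs ...
	}

	return vrgs, nil
}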
Force-pushed from 355263f to dd536d8
Reviewed changes done. Still reviewing impact to other parts of the code because of the changes and other use-cases that may need to handle DRCluster in a deleted state.
controllers/drplacementcontrol.go
Outdated
@@ -1453,16 +1453,8 @@ func (d *DRPCInstance) createVRGManifestWork(homeCluster string, repState rmn.Re
	return nil
}

// ensureVRGManifestWork ensures that the VRG manifestwork exists and matches the current VRG state.
We now update the MW for the VRG from the cached MCV VRG replicationState when this function is invoked.
This is not a problem when called from ensureActionCompleted, where its callers ensure that the cached VRG is primary before we invoke ensureVRGManifestWork.
There are some gaps in the invocation from RunInitialDeployment that need closer inspection.
For example, the code that decides a VRG exists on one of the clusters does not check whether that VRG exists as primary; if that decision used the replication state from the MCV it would be fine, as it would always be primary when this function is called.
This function, ensureVRGManifestWork, is currently only safe when the VRG MCV is Primary and the desired VRG MW is also Primary.
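A minimal sketch of a guard that would make this invariant explicit (hypothetical, not existing code; rmn.Primary is the Ramen API replication state constant):

// ensureCachedVRGIsPrimary is a hypothetical helper: callers of
// ensureVRGManifestWork could use it to assert that the cached MCV VRG is
// primary before its state is copied into the ManifestWork.
func ensureCachedVRGIsPrimary(vrg *rmn.VolumeReplicationGroup, cluster string) error {
	if vrg.Spec.ReplicationState != rmn.Primary {
		return fmt.Errorf("cached VRG on cluster %s is %q, expected primary",
			cluster, vrg.Spec.ReplicationState)
	}

	return nil
}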
I'll update the documentation and look at callers.
Is this issue new? This change only calls it when the vrg exists - it could have been called for the wrong vrg state before this change if the vrg manifestwork was deleted, no?
controllers/drplacementcontrol.go
Outdated
	for i := range d.drClusters {
		c := &d.drClusters[i]
		if !drClusterIsDeleted(c) {
			r = append(r, c.Spec.S3ProfileName)
This works, but it took 8 minutes before the drpc controller detected that the cluster was deleted when updating the vrg. Looks like we have a caching issue.
- drcluster controller finds that the drcluster was deleted
2023-12-12T16:15:34.356Z INFO controllers.DRCluster controllers/drcluster_controller.go:272 reconcile enter {"name": "dr1", "rid": "bf896718-4c69-41a0-bb95-9182e04a42ac"}
2023-12-12T16:15:34.356Z INFO controllers.DRCluster controllers/drcluster_controller.go:453 delete {"name": "dr1", "rid": "bf896718-4c69-41a0-bb95-9182e04a42ac"}
2023-12-12T16:15:34.357Z INFO controllers.DRCluster controllers/drcluster_controller.go:298 reconcile exit {"name": "dr1", "rid": "bf896718-4c69-41a0-bb95-9182e04a42ac"}
2023-12-12T16:15:34.357Z ERROR controller/controller.go:329 Reconciler error {"controller": "drcluster", "controllerGroup": "ramendr.openshift.io", "controllerKind": "DRCluster", "DRCluster": {"name":"dr1"}, "namespace": "", "name": "dr1", "reconcileID": "33b6081b-3d31-48a0-866c-88ab6a038e7b", "error": "drclusters undeploy: drcluster 'dr1' referenced in one or more existing drPolicy resources"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235
- 7 minutes later - drplacement controller detects that the cluster was deleted
2023-12-12T16:23:29.231Z INFO controllers.DRPlacementControl controllers/drplacementcontrol_controller.go:594 Entering reconcile loop {"DRPC": "busybox-regional-rbd-deploy/busybox-regional-rbd-deploy-drpc", "rid": "2e059f82-a46a-48bd-b975-a0619a608aa7"}
...
2023-12-12T16:23:29.234Z INFO controllers.DRPlacementControl controllers/drplacementcontrol_controller.go:1486 Skipping VRG on deleted drcluster {"DRPC": "busybox-regional-rbd-deploy/busybox-regional-rbd-deploy-drpc", "rid": "2e059f82-a46a-48bd-b975-a0619a608aa7", "drcluster": "dr1", "vrg": "busybox-regional-rbd-deploy-drpc"}
- In the same reconcile loop - it updates the vrg manifestwork - including s3 store on the deleted cluster
2023-12-12T16:23:29.236Z INFO controllers.DRPlacementControl controllers/drplacementcontrol.go:1606 ensureNamespaceExistsOnManagedCluster: namespace 'busybox-regional-rbd-deploy' exists on cluster dr2: true {"DRPC": "busybox-regional-rbd-deploy/busybox-regional-rbd-deploy-drpc", "rid": "2e059f82-a46a-48bd-b975-a0619a608aa7"}
2023-12-12T16:23:29.236Z INFO controllers.DRPlacementControl controllers/drplacementcontrol.go:1434 Creating VRG ManifestWork {"DRPC": "busybox-regional-rbd-deploy/busybox-regional-rbd-deploy-drpc", "rid": "2e059f82-a46a-48bd-b975-a0619a608aa7", "Last State:": "FailedOver", "cluster": "dr2"}
2023-12-12T16:23:29.236Z INFO controllers.DRPlacementControl util/mw_util.go:121 Create or Update manifestwork busybox-regional-rbd-deploy-drpc:busybox-regional-rbd-deploy:dr2:{TypeMeta:{Kind:VolumeReplicationGroup APIVersion:ramendr.openshift.io/v1alpha1} ObjectMeta:{Name:busybox-regional-rbd-deploy-drpc GenerateName: Namespace:busybox-regional-rbd-deploy SelfLink: UID: ResourceVersion: Generation:0 CreationTimestamp:0001-01-01 00:00:00 +0000 UTC DeletionTimestamp:<nil> DeletionGracePeriodSeconds:<nil> Labels:map[] Annotations:map[] OwnerReferences:[] Finalizers:[] ManagedFields:[]} Spec:{PVCSelector:{MatchLabels:map[appname:busybox] MatchExpressions:[]} ReplicationState:primary S3Profiles:[minio-on-dr2] Async:0xc000fe3a90 Sync:<nil> VolSync:{RDSpec:[] Disabled:false} PrepareForFinalSync:false RunFinalSync:false Action:Failover KubeObjectProtection:<nil>} Status:{State: ProtectedPVCs:[] Conditions:[] ObservedGeneration:0 LastUpdateTime:0001-01-01 00:00:00 +0000 UTC KubeObjectProtection:{CaptureToRecoverFrom:<nil>} PrepareForFinalSyncComplete:false FinalSyncComplete:false LastGroupSyncTime:<nil> LastGroupSyncDuration:nil LastGroupSyncBytes:<nil>}} {"DRPC": "busybox-regional-rbd-deploy/busybox-regional-rbd-deploy-drpc", "rid": "2e059f82-a46a-48bd-b975-a0619a608aa7"}
2023-12-12T16:23:29.236Z INFO controllers.DRPlacementControl util/mw_util.go:499 Updating Manifestwork {"DRPC": "busybox-regional-rbd-deploy/busybox-regional-rbd-deploy-drpc", "rid": "2e059f82-a46a-48bd-b975-a0619a608aa7", "name": "busybox-regional-rbd-deploy-drpc-busybox-regional-rbd-deploy-vrg-mw", "namespace": "dr2"}
- A few milliseconds later - we update the VRG without the s3 profile of the failed cluster
2023-12-12T16:23:29.236Z INFO controllers.DRPlacementControl util/mw_util.go:121 Create or Update manifestwork busybox-regional-rbd-deploy-drpc:busybox-regional-rbd-deploy:dr2:{TypeMeta:{Kind:VolumeReplicationGroup APIVersion:ramendr.openshift.io/v1alpha1} ObjectMeta:{Name:busybox-regional-rbd-deploy-drpc GenerateName: Namespace:busybox-regional-rbd-deploy SelfLink: UID: ResourceVersion: Generation:0 CreationTimestamp:0001-01-01 00:00:00 +0000 UTC DeletionTimestamp:<nil> DeletionGracePeriodSeconds:<nil> Labels:map[] Annotations:map[] OwnerReferences:[] Finalizers:[] ManagedFields:[]} Spec:{PVCSelector:{MatchLabels:map[appname:busybox] MatchExpressions:[]} ReplicationState:primary S3Profiles:[minio-on-dr2] Async:0xc000fe3a90 Sync:<nil> VolSync:{RDSpec:[] Disabled:false} PrepareForFinalSync:false RunFinalSync:false Action:Failover KubeObjectProtection:<nil>} Status:{State: ProtectedPVCs:[] Conditions:[] ObservedGeneration:0 LastUpdateTime:0001-01-01 00:00:00 +0000 UTC KubeObjectProtection:{CaptureToRecoverFrom:<nil>} PrepareForFinalSyncComplete:false FinalSyncComplete:false LastGroupSyncTime:<nil> LastGroupSyncDuration:nil LastGroupSyncBytes:<nil>}} {"DRPC": "busybox-regional-rbd-deploy/busybox-regional-rbd-deploy-drpc", "rid": "2e059f82-a46a-48bd-b975-a0619a608aa7"}
return false
log.Info("Delete event")

return true
This does not work; it looks like FilterDRCluster is filtering only drclusters that are the failover clusters for some drpcs, including skipping for metro dr. This code is complex and we don't have enough time to modify it now.
I'll work on this later.
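For reference, a sketch of the kind of change being deferred here, using the standard controller-runtime predicate API. This is not the existing FilterDRCluster logic, and mapping the delete event to DRPC reconcile requests would still be needed separately:

package controllers

import (
	"sigs.k8s.io/controller-runtime/pkg/event"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
)

// drClusterDeletePredicate lets DRCluster delete events through, so drpcs
// referencing a deleted drcluster can be reconciled promptly instead of
// waiting several minutes for a periodic resync.
func drClusterDeletePredicate() predicate.Funcs {
	return predicate.Funcs{
		DeleteFunc: func(e event.DeleteEvent) bool {
			return true
		},
	}
}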
Force-pushed from 8bf9444 to fd37ec5
Force-pushed from fd37ec5 to 0e46740
From the use-case perspective of deleting a lost DRCluster, and prior to that failing over workloads from the lost cluster to the surviving cluster, the code changes are fine.
There are a set of things we need to take care of in the future, as follows:
- (existing issue?) DRCluster delete should delete volsync addon created at the hub
- We should fail to process newer DRPC instances created referencing a DRPolicy with a deleted DRCluster
- e.g.
// TODO: What if the DRCluster is deleted? If new DRPC fail reconciliation - We cannot error above, as existing DRPC would also detect the same and error out without processing newer actions (say failover from deleted cluster)
- Need to do this at a point before we apply the finalizer to the DRPC instance, such that we refuse to process it at all
- e.g.
- If the current DRPC is Primary on the deleted cluster, then secondary cluster resources may not get garbage collected if the DRPC is deleted in this state:
- RBD secondary images may get left behind
- MetroDR case may have PVs left behind on the surviving primary cluster?
- Volsync resources on Secondary should be cleaned up for such situations
- Should we insist on failover before DRPC deletion? But how? The DRPC is already deleted when detecting this, so further spec changes are not allowed. (unsure how to handle this at the moment)
- (good to have) On DRPC deletion it can garbage collect the ManifestWork for the namespace as well on the deleted DRCluster to leave a cleaner setup
- (tidier status) cleanupSecondaries may keep waiting for MCV to report that the VRG is deleted on the deleted DRCluster (which will never happen). This leads to a FailedOver state, but the DRPC is stuck in the cleaning up state, which can be avoided by handling a deleted DRCluster in this scenario
- Similarly, cleanupForVolSync may also be stuck attempting to clean up. When we handle the above we would also need to ensure we do not attempt to set up volsync again in EnsureVolSyncReplicationSetup
- Failover to deleted DRCluster should be prevented
- Relocate to/from deleted cluster is currently allowed and should be prevented in both cases
- createVRGManifestWorkAsPrimary MAY update the VRG as primary from secondary (volsync case), and it is later updated with the correct S3 stores in ensureVRGManifestWork. This can be corrected, or marked as known for future code reading to avoid the question
Force-pushed from 0e46740 to 1bf23a5
When creating s3 profile names, include only the s3 profiles from available drclusters. This affects generating a VRG for a manifestwork. When validating failover prerequisites we already use only the failover cluster, and this cluster cannot be deleted. Another case is the drcluster deploy/undeploy flow, which still uses all drclusters and may require more work. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
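A sketch of the resulting helper, matching the diff fragment above; the method name is illustrative and may not match the code:

// availableS3Profiles returns the s3 profile names of drclusters that are
// not marked for deletion, so the generated VRG does not reference an s3
// store on an unavailable cluster.
func (d *DRPCInstance) availableS3Profiles() []string {
	profiles := []string{}

	for i := range d.drClusters {
		c := &d.drClusters[i]
		if drClusterIsDeleted(c) {
			continue // skip s3 stores on deleted drclusters
		}

		profiles = append(profiles, c.Spec.S3ProfileName)
	}

	return profiles
}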
- More consistent logging - "Creating ManifestWork" or "Updating ManifestWork". - Extract key variable to clean up the Client.Get() calls - Eliminate unneeded temporary err and retryErr variables - Remove commented code Signed-off-by: Nir Soffer <nsoffer@redhat.com>
When a drcluster is deleted, the VRG s3 profiles list changes, so we need to update the manifestwork. In ensureVRGManifestWork() we skipped updating the manifestwork if the VRG exists; now we always update the manifestwork. This requires a little more work for every reconcile, but should be efficient enough since the vrg and manifestwork are cached. Issues: - It is not clear if ensureVRGManifestWork() is safe only for a primary vrg, and whether its callers are using it correctly. But if we had wrong callers before, this change increases the impact of a wrong call, updating an existing VRG which was not updated before. Add a TODO to look at this later, so we can test the replace cluster flow now. - The VRG is updated a few minutes after detecting a deleted drcluster, since we don't watch drcluster delete events, and we filter drcluster events only for drpcs failing over to updated clusters. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
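A minimal sketch of the new behavior described above, assuming the DRPCInstance fields (d.vrgs, d.log) and the createVRGManifestWork signature shown in the diff; error wrapping, imports, and other details are elided:

func (d *DRPCInstance) ensureVRGManifestWork(homeCluster string) error {
	d.log.Info("Ensure VRG ManifestWork", "cluster", homeCluster)

	cachedVrg := d.vrgs[homeCluster]
	if cachedVrg == nil {
		return fmt.Errorf("failed to get vrg from cluster %s", homeCluster)
	}

	// Previously we returned early when the VRG already existed. Now we
	// always regenerate the ManifestWork, so changes such as a smaller s3
	// profile list (after a drcluster was deleted) are propagated. This is
	// cheap because both the cached VRG and the ManifestWork come from caches.
	return d.createVRGManifestWork(homeCluster, cachedVrg.Spec.ReplicationState)
}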
Force-pushed from 1bf23a5 to a841158
When a cluster is not available, disabling DR gets stuck waiting for the vrgs.
On the remaining cluster, the vrg is stuck trying to delete data from the s3 store on the unavailable cluster.
On the hub, the drpc is stuck waiting for the stuck vrg, and for the stale vrg reported by the managedclusterview of the unavailable cluster.
To fix the issue we require the admin to delete the failed drcluster after failing over to another cluster. The system then avoids accessing s3 profiles on the deleted drcluster and does not wait for VRGs on deleted clusters.
Disable DR flow when a cluster is not available
- For each application on the failed cluster, fail over to the secondary cluster
- Mark the failed drcluster for deletion on the hub
  - Use --wait=false since the drcluster will not be deleted until the drpolicy referencing it is deleted, and the drpolicy will not be deleted before the drpc referencing it is deleted.
- For each application, disable DR:
  - Ensure the Placement is pointing to the failover cluster
  - Wait until the vrg for the application has only one s3 profile; this can take a few minutes after the drcluster was deleted.
  - Delete the drpc resource for the OCM application on the hub
    - If the drpc delete does not finish, check that the vrg does not have the s3 profile for the failed cluster. If it does, you need to edit the vrg and remove it manually.
  - Change the Placement to be reconciled by OCM
Known issues
Docs on disabling DR needed
It can take a few minutes after deleting the drcluster until the vrg on the failover cluster updates to exclude the s3 profile. If the drpc is deleted after the vrg is updated, the delete will complete quickly.
If disable DR is started before the vrg is updated on the failover cluster, disabling DR will not complete, since deleting a drpc deletes the manifestwork for the vrg, and after that no change in the manifest work is propagated to the managed cluster.
With volsync (cephfs) disabling dr also delete the PVC: [VolSync] Ensure Application PVC is not Automatically Garbage Collected upon VRG Deletion #1150
Testing
Update the ramen-operator image in the csv on the hub and drcluster:
If upgrading from an older version (e.g. 4.14), apply the CRDs from main:
Status
Tested using RamenDR/ocm-ramen-samples#41 on top of #1153
basic-test
drenv:
ocp:
Disable DR flow
Tested without the commit updating the vrg.
drenv:
ocp:
Todo:
Fixes #1139