Enhancing Hub Recovery: Reworking DRPC State Rebuilding Algorithm #1165

Merged

Commits on Dec 20, 2023

  1. Enhancing Hub Recovery: Reworking DRPC State Rebuilding Algorithm

    This commit addresses hub recovery issues by reworking the algorithm that rebuilds
    the DRPC state. The reworked algorithm follows these rules:
    
    1. Stop Condition for Both Failed Queries:
       If 2 clusters are queried and both queries fail, stop.
    
    2. Initial Deployment without VRGs:
       If 2 clusters are successfully queried, and no VRGs are found, proceed with the
       initial deployment.
    
    3. Handling Failures with S3 Store Check:
       If 2 clusters are queried, 1 query fails, and 0 VRGs are found, perform the following
       checks:
          - If the VRG is found in the S3 store, ensure that the DRPC action matches the VRG
            action. If it does not, stop until the action is corrected, but allow failover
            if necessary (set PeerReady).
          - If the VRG is not found in the S3 store and the failed cluster is not the
            destination cluster, continue with the initial deployment.
    
    4. Verification and Failover for VRGs on Failover Cluster:
       If 2 clusters are queried, 1 query fails, and 1 VRG is found on the failover cluster,
       check the action:
          - If the actions don't match, stop until the user corrects the action.
          - If they match, check the VRG in hand: if it is a secondary, also stop but allow
            failover; otherwise, continue.
    
    5. Handling VRGs on Destination Cluster:
       If 2 clusters are queried successfully, 1 or more VRGs are found, and one of the
       VRGs is on the destination cluster, perform the following checks:
          - If the DRPC action and the found VRG action match, continue with the action.
          - If they do not match, stop until someone investigates, but allow failover to
            take place (set PeerReady).
    
    6. Otherwise, default to allowing Failover:
       If none of the above conditions apply, allow failover (set PeerReady) but stop until
       someone makes the necessary change.
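
The six rules above can be read as one decision function. The following Go code is a minimal sketch of that reading, not the code from this PR: every type and function name in it (`Action`, `VRG`, `Input`, `Decision`, `Decide`) is hypothetical, and the real controller works with Kubernetes objects rather than these simplified structs. Two points are assumptions because the commit message leaves them open: when the S3 VRG action matches the DRPC action in rule 3, the sketch continues; and where a rule says only "stop" (rules 1 and 4), the sketch leaves PeerReady unset.

```go
package main

import "fmt"

// Action is the DR action recorded on a DRPC or a VRG (hypothetical type).
type Action string

// VRG is a simplified stand-in for a VolumeReplicationGroup.
type VRG struct {
	Action    Action
	Secondary bool // true if the VRG is in the secondary replication state
}

// Input gathers what the hub knows after trying to query 2 managed clusters.
type Input struct {
	DRPCAction         Action
	FailedQueries      int            // number of failed cluster queries: 0, 1, or 2
	FailedCluster      string         // name of the failed cluster when FailedQueries == 1
	FailoverCluster    string         // cluster referenced by rule 4
	DestinationCluster string         // cluster referenced by rules 3 and 5
	VRGs               map[string]VRG // VRGs found, keyed by cluster name
	S3VRG              *VRG           // VRG found in the S3 store, nil if none
}

// Decision says whether reconciliation continues and whether failover
// is allowed (PeerReady) while it is stopped.
type Decision struct {
	Proceed   bool
	PeerReady bool
	Reason    string
}

func proceed(reason string) Decision { return Decision{Proceed: true, Reason: reason} }

func stop(peerReady bool, reason string) Decision {
	return Decision{PeerReady: peerReady, Reason: reason}
}

// Decide applies rules 1 to 6 in order. A case that does not return
// falls out of the switch and reaches the rule 6 default.
func Decide(in Input) Decision {
	switch {
	case in.FailedQueries >= 2:
		return stop(false, "rule 1: both cluster queries failed")
	case in.FailedQueries == 0 && len(in.VRGs) == 0:
		return proceed("rule 2: no VRGs found, initial deployment")
	case in.FailedQueries == 1 && len(in.VRGs) == 0:
		if in.S3VRG != nil {
			if in.S3VRG.Action != in.DRPCAction {
				return stop(true, "rule 3: S3 VRG action differs from DRPC action")
			}
			// Assumption: the commit message does not state the matching outcome.
			return proceed("rule 3: S3 VRG action matches DRPC action")
		}
		if in.FailedCluster != in.DestinationCluster {
			return proceed("rule 3: no S3 VRG, failed cluster is not the destination")
		}
	case in.FailedQueries == 1 && len(in.VRGs) == 1:
		if vrg, ok := in.VRGs[in.FailoverCluster]; ok {
			if vrg.Action != in.DRPCAction {
				return stop(false, "rule 4: action mismatch, user must correct it")
			}
			if vrg.Secondary {
				return stop(true, "rule 4: matching VRG is a secondary")
			}
			return proceed("rule 4: matching VRG is not a secondary")
		}
	case in.FailedQueries == 0 && len(in.VRGs) >= 1:
		if vrg, ok := in.VRGs[in.DestinationCluster]; ok {
			if vrg.Action == in.DRPCAction {
				return proceed("rule 5: destination VRG action matches")
			}
			return stop(true, "rule 5: destination VRG action mismatch")
		}
	}
	return stop(true, "rule 6: default, allow failover")
}

func main() {
	// Both clusters answered, and the destination VRG disagrees with the DRPC.
	d := Decide(Input{
		DRPCAction:         "Relocate",
		DestinationCluster: "east",
		VRGs:               map[string]VRG{"east": {Action: "Failover"}},
	})
	fmt.Printf("%+v\n", d)
}
```

Keeping the rules in a single ordered switch mirrors the numbered list: each case either returns or falls through to the default, so a combination that no rule covers always ends in "stop, but allow failover".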
    
    Signed-off-by: Benamar Mekhissi <bmekhiss@ibm.com>
    Benamar Mekhissi committed Dec 20, 2023
    1107ea6
  2. Fix one place where drpcNamespace is used instead of vrgNamespace when using AppSet

    
    Signed-off-by: Benamar Mekhissi <bmekhiss@ibm.com>
    Benamar Mekhissi committed Dec 20, 2023
    72f7af9
  3. Fix unit test failures

    Signed-off-by: Benamar Mekhissi <bmekhiss@ibm.com>
    Benamar Mekhissi committed Dec 20, 2023
    f348ce1

Commits on Dec 22, 2023

  1. Add unit tests for hub recovery

    Signed-off-by: Benamar Mekhissi <bmekhiss@ibm.com>
    Benamar Mekhissi committed Dec 22, 2023
    ea6fdba

Commits on Dec 23, 2023

  1. Check access to VRG on a MC before deleting the MW

    Signed-off-by: Benamar Mekhissi <bmekhiss@ibm.com>
    Benamar Mekhissi committed Dec 23, 2023
    accaed3