Fixes issue when multiple VRGs conflict with the PVCs being protected #1535

Draft
wants to merge 117 commits into base: main

Conversation

@asn1809 (Contributor) commented Aug 30, 2024

Issue:
When multiple VRGs are created in the same namespace, PVCs carrying the same labels will be protected by more than one VRG, which can lead to errors while restoring.

The idea in this PR is to:

  1. Acquire a lock on the namespace before processing a VRG when the VRG is in a non-admin namespace and there are multiple VRGs present in that namespace.
  2. Update the Error condition on the VRG when the lock is already held by another VRG.
  3. Let the DRPC, which monitors errors on the VRG, surface the error to the user (see the sketch below).
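
A rough, illustrative sketch of the intended flow; the NamespaceLock type discussed in the review comments is abstracted behind an interface here, and all names are assumptions rather than the PR's final code:

package controllers

import (
    "fmt"

    ctrl "sigs.k8s.io/controller-runtime"
)

// namespaceLocker is the minimal behaviour assumed from the namespace lock
// discussed in this PR; the real type would live in the util package.
type namespaceLocker interface {
    TryToAcquireLock(namespace string) bool
    Release(namespace string)
}

// serializeVRGProcessing requeues the VRG when another VRG in the same
// namespace is already being processed. Illustrative sketch only.
func serializeVRGProcessing(locks namespaceLocker, namespace string, vrgCount int) (ctrl.Result, error) {
    if vrgCount <= 1 {
        return ctrl.Result{}, nil // a single VRG needs no serialization
    }

    if !locks.TryToAcquireLock(namespace) {
        // Recorded as an error condition on the VRG; the DRPC surfaces it
        // to the user, and the reconcile is retried later.
        return ctrl.Result{Requeue: true},
            fmt.Errorf("another VRG is being processed in namespace %s", namespace)
    }

    // The lock is released only after the PVCs have been claimed with the
    // VRG ownership label.
    return ctrl.Result{}, nil
}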

Signed-off-by: Annaraya Narasagond <annarayanarasagond@gmail.com>

asn1809 commented Sep 4, 2024

Flowchart of the proposed solution:
[flowchart image: MultipleDRPC]

@ShyamsundarR (Member) commented:

@asn1809 Please do not refer to downstream tickets/trackers/links, as access may be restricted or not generally available.

Also, using the "Fixes" keyword in the commit message causes GitHub to assume it fixes an issue in this repository or organization, so avoiding that keyword at the bottom helps.

The commit message also needs some massaging.

@asn1809 changed the title from "Fixes https://issues.redhat.com/browse/OCSBZM-4691" to "Fixes issue when multiple VRGs conflict with the PVCs being protected" on Sep 5, 2024
Signed-off-by: Annaraya Narasagond <annarayanarasagond@gmail.com>
Signed-off-by: Annaraya Narasagond <annarayanarasagond@gmail.com>
Contributor Author

@asn1809 left a comment

Comments from the discussion with @raghavendra-talur need to be addressed.

}

// NewNamespaceLock returns new NamespaceLock
func NewNamespaceLock() *NamespaceLock {
Contributor Author

@raghavendra-talur: Pass the ns string and use it for struct initialization.

Contributor Author

Gave some thought to what we discussed. In volumereplicationgroup_controller, SetupWithManager initializes the locks via r.locks = rmnutil.NewNamespaceLock(); the namespace cannot be passed there because that call happens well before Reconcile is invoked. So instead of a single namespace string, a set of (locked) namespaces would be better.

Member

In that case, it could be:

type NamespaceLock struct {
	mux       map[string]sync.Mutex
}
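
Since sync.Mutex values stored directly in a map are not addressable and cannot be locked through the map, a minimal sketch of the set-based variant discussed above could look like the following (an illustration, not the PR's final code):

package util

import "sync"

// NamespaceLock tracks which namespaces currently have a VRG being processed.
type NamespaceLock struct {
    mux    sync.Mutex          // guards the locked set
    locked map[string]struct{} // namespaces currently locked
}

// NewNamespaceLock returns a new, empty NamespaceLock.
func NewNamespaceLock() *NamespaceLock {
    return &NamespaceLock{locked: map[string]struct{}{}}
}

// TryToAcquireLock returns true if the namespace was free and is now locked,
// and false if another VRG in the namespace is already being processed.
func (nl *NamespaceLock) TryToAcquireLock(namespace string) bool {
    nl.mux.Lock()
    defer nl.mux.Unlock()

    if _, locked := nl.locked[namespace]; locked {
        return false
    }

    nl.locked[namespace] = struct{}{}

    return true
}

// Release unlocks the namespace so the next VRG can be processed.
func (nl *NamespaceLock) Release(namespace string) {
    nl.mux.Lock()
    defer nl.mux.Unlock()

    delete(nl.locked, namespace)
}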

// TryToAcquireLock tries to acquire the lock for processing VRG in a namespace having
// multiple VRGs and returns true if successful.
// If processing has already begun in the namespace, returns false.
func (nl *NamespaceLock) TryToAcquireLock(namespace string) bool {
Contributor Author

@raghavendra-talur: The namespace arg is not required, as it would already be acquired.

// If processing has already begun in the namespace, returns false.
func (nl *NamespaceLock) TryToAcquireLock(namespace string) bool {
nl.mux.Lock()
defer nl.mux.Unlock()
Contributor Author

@raghavendra-talur : This should be removed.

nl.mux.Lock()
defer nl.mux.Unlock()

if nl.namespace == namespace {
Contributor Author

@raghavendra-talur: Handle this according to the struct.


// Release removes lock on the namespace
func (nl *NamespaceLock) Release(namespace string) {
nl.mux.Lock()
Contributor Author

@raghavendra-talur : Remove lock line.


var _ = Describe("Testing Locks", func() {
nsLock := util.NewNamespaceLock()
Expect(nsLock.TryToAcquireLock("test")).To(BeTrue())
Contributor Author

@raghavendra-talur : May have to use go func or anonymous functions.
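
One way to exercise the contention path with a goroutine, as suggested; this sketch assumes the non-blocking TryToAcquireLock/Release semantics sketched earlier:

var _ = Describe("NamespaceLock", func() {
    It("allows only one holder per namespace at a time", func() {
        nsLock := util.NewNamespaceLock()
        Expect(nsLock.TryToAcquireLock("test")).To(BeTrue())

        // A second attempt from another goroutine must fail while the
        // namespace is still locked.
        acquired := make(chan bool)
        go func() {
            acquired <- nsLock.TryToAcquireLock("test")
        }()
        Expect(<-acquired).To(BeFalse())

        // After releasing, the namespace can be locked again.
        nsLock.Release("test")
        Expect(nsLock.TryToAcquireLock("test")).To(BeTrue())
    })
})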

@@ -115,6 +116,8 @@ func (r *VolumeReplicationGroupReconciler) SetupWithManager(
r.Log.Info("Kube object protection disabled; don't watch kube objects requests")
}

r.locks = rmnutil.NewNamespaceLock()
Contributor Author

Pass the vrg ns as arg.


// if the number of vrgs in the ns is more than 1, lock is needed.
if len(vrgList.Items) > 1 {
if isLockAcquired := v.reconciler.locks.TryToAcquireLock(ns); !isLockAcquired {
Contributor Author

Since TryToAcquireLock is a blocking call, the negation of its response will not be helpful. Think of an alternative.

// Acquiring lock failed, VRG reconcile should be requeued
return err
}
defer v.reconciler.locks.Release(ns)
Contributor Author

The release of the lock should be done once the PVC is verified for the ownership label.

@@ -437,6 +440,12 @@ func (r *VolumeReplicationGroupReconciler) Reconcile(ctx context.Context, req ct
"Please install velero/oadp and restart the operator", v.instance.Namespace, v.instance.Name)
}

err = v.vrgParallelProcessingCheck(adminNamespaceVRG)

if err != nil {
Contributor Author

@asn1809 commented Sep 5, 2024

The err will never be non-nil here.

ELENAGER and others added 21 commits October 8, 2024 23:10
Previously we used the VolumeReplication name and namespace for updating PVC conditions, because we had a one-to-one relationship between VolumeReplication and PVC. Now that we are going to use consistency groups, this handling will no longer be right, because several PVCs can be related to one VolumeReplication. So we need to use the PVC name and namespace for updating PVC conditions.

Signed-off-by: Elena Gershkovich <elenage@il.ibm.com>
Signed-off-by: rakeshgm <rakeshgm@redhat.com>
Using one item per line to make future changes easier.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Current Python on Fedora 39 and macOS is 3.12, and the 3.13 development version has been available on GitHub actions for a while. Add these versions so we can ensure compatibility with the current and next Python versions.

We keep python3.9 for compatibility with systems using an old Python, like RHEL 9. At some point we will want to drop the old version and consume newer features in current Python, but we don't have anything requiring this yet, so let's try to stay compatible.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
For ramen, the capabilities field for the hub and cluster bundles currently states "Basic Install". Update it to "Seamless Upgrades".

Fixes: [Bug-2303823](https://bugzilla.redhat.com/show_bug.cgi?id=2303823)
Signed-off-by: Abhijeet Shakya <abhijeetshakya21@gmail.com>
Signed-off-by: Abhijeet Shakya <abhijeetshakya21@gmail.com>
The changeset includes a new DeleteNamespaceManifestWork() func, which first checks whether mw.Spec has the delete option or whether the manifestwork already has a DeletionTimestamp, and accordingly proceeds to delete the namespace manifestwork.
It also updates the namespace manifestwork with a deleteOption and propagationPolicy of type orphan whenever the createOrUpdateNamespaceManifest() func is called.

Fixes: [Bug 2059669](https://bugzilla.redhat.com/show_bug.cgi?id=2059669)
Signed-off-by: Abhijeet Shakya <abhijeetshakya21@gmail.com>
Signed-off-by: Abhijeet Shakya <abhijeetshakya21@gmail.com>
Signed-off-by: Raghavendra Talur <raghavendra.talur@gmail.com>
Signed-off-by: Elena Gershkovich <elenage@il.ibm.com>
The newer controller-runtime version 0.19.0 adds a check for unique controller names. In our tests we register the same controller (the drcluster controller) several times: in suite_test, in drcluster_mmode_test, and in drcluster_drcconfig_tests. As a temporary solution I added a flag for skipping the unique controller name validation. Another solution could be adding a name as a parameter to the SetupWithManager function.

Signed-off-by: Elena Gershkovich <elenage@il.ibm.com>
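
For reference, a hedged sketch of what the temporary flag can look like with controller-runtime 0.19's manager options; the exact wiring in the ramen test suites may differ:

package controllers_test

import (
    "k8s.io/client-go/rest"
    "k8s.io/utils/ptr"

    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/config"
)

// newTestManager skips controller-runtime 0.19's unique-name validation so
// the same controller can be registered by more than one test suite.
func newTestManager(cfg *rest.Config) (ctrl.Manager, error) {
    return ctrl.NewManager(cfg, ctrl.Options{
        Controller: config.Controller{
            SkipNameValidation: ptr.To(true),
        },
    })
}
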
The issue where the "velero.io/exclude-from-backup" label was not applied
originates from our deployment workflow. Initially, we deploy the workload,
followed by enabling DR. During the first deployment, the Placement Operator
creates the "PlacementDecision" with default labels. However, when "Ramen" tries
to add the "velero.io/exclude-from-backup" label during DR setup, it skips because
the "PlacementDecision" already exists. Consequently, "Velero" backs up the
"PlacementDecision". And during hub recovery, it is restored without its status,
leading to the unintended deletion of the workload. This situation only occurs
when the current state wasn't updated before hub recovery was applied.

The fix in this PR does not address the scenario where the workload is deployed,
a hub backup is taken, DR is enabled, and then the hub is recovered before another
backup is created.

Fixes bug: 2308801

Signed-off-by: Benamar Mekhissi <bmekhiss@ibm.com>
Fixing this error seen in the CI:

    drenv.commands.Error: Command failed:
       command: ('addons/argocd/test', 'rdr-hub', 'rdr-dr1', 'rdr-dr2')
       exitcode: 1
       error:
          Traceback (most recent call last):
            ...
          drenv.commands.Error: Command failed:
             command: ('kubectl', 'delete', '--context', 'rdr-dr2', 'namespace', 'argocd-test', '--wait=false')
             exitcode: 1
             error:
                Error from server (NotFound): namespaces "argocd-test" not found

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
This commit allows PVs to be resized by enabling the
'AllowVolumeExpansion' flag in the StorageClass.
When this flag is set to true, users can dynamically
adjust the size of PVCs by specifying the desired
capacity in the PVC specification.

Signed-off-by: rakeshgm <rakeshgm@redhat.com>
Signed-off-by: rakeshgm <rakeshgm@redhat.com>
We updated the controller SDK to the 0.19 version, moving envtest to the same version for the required testing.

This does not fix any test; it is just a hygiene change.

Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
Developers can enable it by copying the file from the hack directory
instead of copying and pasting the code from docs/devel-quick-start.md
and making the hook executable.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
k8s.io/component-base should have been a direct dependency.
Fixes the go mod error.

Signed-off-by: Raghavendra Talur <raghavendra.talur@gmail.com>
- Integrated the latest CRD from external-snapshotter.
- Updated dependencies and relevant configuration files.
- Changed VolumeSnapshotRefList in VolumeGroupSnapshotStatus to PVCVolumeSnapshotRefList,
  which allows mapping a VolumeSnapshot to the application PVC.

Signed-off-by: Benamar Mekhissi <bmekhiss@ibm.com>
Signed-off-by: Benamar Mekhissi <bmekhiss@ibm.com>
This is mainly to ensure that the VRG on the primary and secondary are in sync with regard to labels and annotations.

Signed-off-by: Benamar Mekhissi <bmekhiss@ibm.com>
nirs and others added 30 commits October 8, 2024 23:10
This was forgotten when we replaced the minikube volumesnapshot addons.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
The comment is still unclear, but at least it is easier to read.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
- Replace "VR under deletion" with "deleted VR", matching the
  terminology used in the code.
- Replace "detected as missing or deleted" with "is missing" when we
  know the VR does not exist.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Moved out of validateVRStatus() to make room for checking the VR Validated status. This also makes the function easier to understand, keeping the same level of abstraction and getting rid of the uninteresting details.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Rename msg to errorMsg and document that this is an error message used when we could not get the condition value because it is missing, stale, or unknown.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Log after important changes to the system in the delete VR flow to make it easier to understand what the system is doing, and how ramen changed the system.

New logs:
- delete the VR resource
- remove annotations from PV

Improve logs:
- remove annotations, labels, and finalizers from PVC

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
When deleting a primary VRG, we wait until the VR Completed condition is met. However, if a VR precondition failed, for example using a drpolicy without flattening enabled when the PVC needs flattening, the VR will never complete and the vrg and drpc deletion will never complete.

Since csi-addons 0.10.0 we have a new Validated VR condition, set to true if the preconditions are met, and false if not. The VR can be deleted safely in this state, since mirroring was not enabled.

This change modifies deleted VRG processing to check the new VR Validated status. If the condition exists and its status is false, validateVRStatus() returns true, signaling that the VR is in the desired state, and ramen completes the delete flow.

If the VR does not report the Validated condition (e.g. an old csi-addons version) or the condition status is true (mirroring in progress), we continue in the normal flow. The VR is deleted only when the Completed condition status is true.

Tested with discovered deployment and vm using a pvc created from a
volume snapshot.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
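
A hedged sketch of the check described above; the function name and wiring are illustrative, while in ramen this logic sits around validateVRStatus():

package controllers

import (
    "k8s.io/apimachinery/pkg/api/meta"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// vrValidationFailed returns true when the VR reports Validated=False for the
// current generation, meaning preconditions failed, mirroring was never
// enabled, and the VR can be deleted without waiting for Completed.
func vrValidationFailed(conditions []metav1.Condition, generation int64) bool {
    validated := meta.FindStatusCondition(conditions, "Validated")
    if validated == nil || validated.ObservedGeneration != generation {
        // Old csi-addons version or stale status: fall back to the normal
        // flow and wait for the Completed condition.
        return false
    }

    return validated.Status == metav1.ConditionFalse
}
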
Signed-off-by: Abhijeet Shakya <abhijeetshakya21@gmail.com>
To refresh the cache we need to check out the ramen source and run drenv cache with the environment files. Using a workflow for this makes the job easy to implement and manage without accessing the runner directly.

The job can also be run manually from the GitHub UI. This is likely to work for people with write access.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Signed-off-by: Elena Gershkovich <elenage@il.ibm.com>
Resolved an issue causing sporadic failures in the Ramen/VolSync related unit test due to delayed creation of the PVC resource.

Signed-off-by: Benamar Mekhissi <bmekhiss@ibm.com>
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
This is useful when you want to avoid drenv start getting stuck, and the
environment does not provide a way to time out.

Example run:

    % drenv start --timeout 60 envs/vm.yaml
    2024-09-21 04:27:43,555 INFO    [vm] Starting environment
    2024-09-21 04:27:43,581 INFO    [cluster] Starting lima cluster
    2024-09-21 04:28:43,785 ERROR   [cluster] did not receive an event with the "running" status
    2024-09-21 04:28:43,790 ERROR   Command failed
    Traceback (most recent call last):
      ...
    drenv.commands.Error: Command failed:
       command: ('limactl', '--log-format=json', 'start', '--timeout=60s', 'cluster')
       exitcode: 1
       error:

For the lima provider, we pass the timeout to limactl. For minikube we use the commands.watch() timeout. The external provider does not use the timeout since we don't start the cluster.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
On GitHub actions we cannot start a vm with more than one cpu, and our test cluster is very small so it should work with 1g of ram. This makes the test cluster consume fewer resources if a developer leaves it running for a long time.

Using 1 cpu conflicts with kubeadm preflight checks:

    [init] Using Kubernetes version: v1.31.0
    ...
    error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR NumCPU]: the number of available CPUs 1 is less than the required 2
    [preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`

Since we know what we are doing, we suppress this preflight error.

Running top shows that the cluster consumes only 9% cpu and 33.5% of the
available memory.

    top - 04:51:02 up 10 min,  1 user,  load average: 0.60, 0.29, 0.16
    Tasks: 123 total,   1 running, 122 sleeping,   0 stopped,   0 zombie
    %Cpu(s):   6.2/3.1     9[||||||                                                        ]
    MiB Mem : 33.5/1959.9   [|||||||||||||||||||||                                         ]
    MiB Swap:  0.0/0.0      [                                                              ]

        PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
       4124 root      20   0 1449480 244352  68224 S   2.3  12.2   0:18.47 kube-apiserver
       4120 root      20   0 1309260 102456  60800 S   0.7   5.1   0:06.17 kube-controller
       4273 root      20   0 1902692  92040  59904 S   2.0   4.6   0:12.47 kubelet
       4159 root      20   0 1288880  60156  44288 S   0.3   3.0   0:02.20 kube-scheduler
       3718 root      20   0 1267660  57880  31020 S   0.3   2.9   0:09.28 containerd
       4453 root      20   0 1289120  53540  42880 S   0.0   2.7   0:00.15 kube-proxy
       5100 65532     20   0 1284608  49656  37504 S   0.0   2.5   0:01.14 coredns
       5000 65532     20   0 1284608  49528  37376 S   0.0   2.5   0:00.86 coredns
       4166 root      20   0   11.2g  47360  21504 S   1.3   2.4   0:08.69 etcd

We could make the cluster even smaller, but the kubeadm preflight check requires 1700 MiB, and we don't have memory issues on GitHub or on developers' machines.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
We need rosetta only for complex setups where a component does not provide arm64 images. For testing drenv we have an empty cluster with busybox, and our busybox image is multi-arch.

Disabling rosetta may speed up the cluster, and this may be significant on GitHub actions since the runners are about 6.5 times slower compared to an M1 mac.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
This is useful for running the tests on a GitHub runner, and it also supports one option for macOS.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
This is the same optimization used in minikube. Minikube uses 2 replicas only when using an HA configuration.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
It takes about 30 seconds until coredns is ready, but we don't depend on it in the current code. By removing this wait we can start deploying 30 seconds earlier. This reduces the time for starting a cluster from 155 seconds to 125 seconds, and the regional-dr environment from 450 to 420 seconds.

Example run with vm environment:

    % drenv start envs/vm.yaml
    2024-09-22 18:39:55,726 INFO    [vm] Starting environment
    2024-09-22 18:39:55,743 INFO    [cluster] Starting lima cluster
    2024-09-22 18:41:57,818 INFO    [cluster] Cluster started in 122.07 seconds
    2024-09-22 18:41:57,819 INFO    [cluster/0] Running addons/example/start
    2024-09-22 18:42:18,966 INFO    [cluster/0] addons/example/start completed in 21.15 seconds
    2024-09-22 18:42:18,966 INFO    [cluster/0] Running addons/example/test
    2024-09-22 18:42:19,120 INFO    [cluster/0] addons/example/test completed in 0.15 seconds
    2024-09-22 18:42:19,121 INFO    [vm] Environment started in 143.40 seconds

    % drenv stop envs/vm.yaml
    2024-09-22 18:42:44,244 INFO    [vm] Stopping environment
    2024-09-22 18:42:44,317 INFO    [cluster] Stopping lima cluster
    2024-09-22 18:42:44,578 WARNING [cluster] [hostagent] dhcp: unhandled message type: RELEASE
    2024-09-22 18:42:49,441 INFO    [cluster] Cluster stopped in 5.13 seconds
    2024-09-22 18:42:49,441 INFO    [vm] Environment stopped in 5.20 seconds

    % drenv start envs/vm.yaml
    2024-09-22 18:42:53,132 INFO    [vm] Starting environment
    2024-09-22 18:42:53,156 INFO    [cluster] Starting lima cluster
    2024-09-22 18:43:34,436 INFO    [cluster] Cluster started in 41.28 seconds
    2024-09-22 18:43:34,437 INFO    [cluster] Looking up failed deployments
    2024-09-22 18:43:34,842 INFO    [cluster/0] Running addons/example/start
    2024-09-22 18:43:35,208 INFO    [cluster/0] addons/example/start completed in 0.37 seconds
    2024-09-22 18:43:35,208 INFO    [cluster/0] Running addons/example/test
    2024-09-22 18:43:35,371 INFO    [cluster/0] addons/example/test completed in 0.16 seconds
    2024-09-22 18:43:35,372 INFO    [vm] Environment started in 42.24 seconds

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
- Use `/readyz` endpoint for waiting until kubernetes cluster is ready
  instead of `kubectl version`. This matches the way we wait for the
  cluster in drenv.

- When waiting for coredns deployment, use `kubectl rollout status`,
  matching other code in drenv and addons.

- Nicer probe description, visible in drenv when using --verbose.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
We can ignore all ports for all protocols for all addresses using a
simpler rule.

With this change we see this log in --verbose mode:

    2024-09-23 01:12:18,130 DEBUG   [cluster] [hostagent] TCP (except for SSH) and UDP port forwarding is disabled

And no "Not forwarding port ..." message. Previously we had lot of these
messages[1]. This requires lima commit[2].  pull current master if you
run older version.

[1] lima-vm/lima#2577
[2] lima-vm/lima@9a09350

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
And remove an unneeded export - it was used to run kubectl commands before we set up root/.kube/config, but this step does not run any kubectl commands.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Provision scripts using `mode: system` run as root and do not need to use sudo. It looks like the scripts were copied from code not running as root.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
This is now fixed in lima [1], so we don't need to keep the fix in drenv. Developers should pull the latest lima from git to use this fix.

[1] lima-vm/lima#2632

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Added a check to stop relocation reconciliation if one of the clusters is
unreachable. This prevents potential misclassification after hub recovery,
which could lead to undesired results and inconsistencies.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2304182

Signed-off-by: Benamar Mekhissi <bmekhiss@ibm.com>
Signed-off-by: Annaraya Narasagond <annarayanarasagond@gmail.com>
Signed-off-by: Annaraya Narasagond <annarayanarasagond@gmail.com>
Signed-off-by: Annaraya Narasagond <annarayanarasagond@gmail.com>
Signed-off-by: Annaraya Narasagond <annarayanarasagond@gmail.com>
Signed-off-by: Annaraya Narasagond <annarayanarasagond@gmail.com>
Signed-off-by: Annaraya Narasagond <annarayanarasagond@gmail.com>