
Add design for data mover preserve local snapshot #7002

Closed
1 change: 1 addition & 0 deletions changelogs/unreleased/7002-Lyndon-Li
@@ -0,0 +1 @@
Add design for data mover preserve local snapshot
[Three binary image files added in this PR (likely the backup/restore sequence diagrams referenced in the design below) cannot be displayed in the diff view.]
@@ -28,12 +28,14 @@ Moreover, we would like to create a general workflow to variations during the da
- Support different snapshot types, i.e., CSI snapshot, volume snapshot API from storage vendors
- Support different snapshot accesses, i.e., through PV generated from snapshots, and through direct access API from storage vendors
- Reuse the existing Velero generic data path as created in [Unified Repository design][1]
- Allow users to retain configurable number of native snapshots after data movement completes

## Non-Goals

- The current support for block level access is through file system uploader, so it is not aimed to deliver features of an ultimate block level backup. Block level backup will be included in a future design
- Most of the components are generic, but the Exposer is snapshot type specific or snapshot access specific. The current design covers the implementation details for exposing CSI snapshot to host path access only, for other types or accesses, we may need a separate design
- The current workflow focuses on snapshot-based data movements. For some application/SaaS level data sources, snapshots may not be taken explicitly. We don’t take them into consideration, though we believe that some workflows or components may still be reusable.
- It is possible that native snapshot creation succeeds but the data movement fails. In this case, we don't support retaining the native snapshot. The backup will be set to PartiallyFailed and no data will be restored for the volume of the failed data movement. We don't foresee cases where snapshot creation succeeds but data movement always fails, so we will leave this for further design based on future requirements.

## Architecture of Volume Snapshot Data Movement

@@ -269,6 +271,10 @@ spec:
description: OperationTimeout specifies the time used to wait internal
operations, e.g., wait the CSI snapshot to become readyToUse.
type: string
retainSnapshot:
description: RetainSnapshot specifies whether to retain the snapshot
after backup completes.
type: boolean
snapshotType:
description: SnapshotType is the type of the snapshot to be backed
up.
@@ -335,6 +341,10 @@ spec:
format: int64
type: integer
type: object
retainedSnapshot:
description: RetainedSnapshot is the name of the snapshot that has been
retained.
type: string
snapshotID:
description: SnapshotID is the identifier for the snapshot in the
backup repository.
@@ -637,9 +647,45 @@ In DUCR/DDCR’s status, we have fields like ```totalBytes``` and ```doneBytes```
- Call ```kubectl get dataupload -n velero xxx``` or ```kubectl get datadownload -n velero xxx```.
- Call ```velero backup describe --details```. This is implemented as part of BIA/RIA V2: the above values are transferred to the async operation, and this command retrieves them from the async operation instead of from the DUCR/DDCR. See the [general progress monitoring design][2] for details

## Retain Native Snapshots
Users are allowed to specify a global limit on the number of native snapshots to be retained for each volume:
- If the number is not specified, or if it is not a positive value, no native snapshot is retained
- Otherwise, the value defines the maximum number of snapshots to be retained for each volume, a.k.a. the limit. If the limit is exceeded, the oldest retained snapshot will be removed
@anshulahuja98 (Collaborator), Oct 25, 2023:
Can we add a nuance here of setting limits per driver type? For example, disk.csi.cloud.com might support retaining up to 100 snapshots, while file.csi.cloud.com might only support 10, and a backup might contain both.

@Lyndon-Li (Contributor, Author):
We can add this as a CSI-plugin-specific configuration overwriting the global number, as the CSI driver is not generic for all DMPs.

@Lyndon-Li (Contributor, Author), Oct 26, 2023:
I added the below statement to the design:

> For the same reason, if a DMP supports snapshots from more than one kind of storage, it may need to set different snapshot limits for volumes from different storages, based on different criteria (e.g., the CSI plugin may need to set different numbers per CSI driver type). In this case, the DMP could overwrite the global limit number based on its own configurations. The configurations are private to the DMP itself since the storage information is only known by the specific DMP, and how the configurations are structured and interpreted is also private to the specific DMP.

This indicates that a DMP-specific configuration is supported to overwrite the global number, and that the configuration is managed by the DMP itself, so we don't need to define a generic configuration structure for all DMPs.
For the CSI plugin, I don't want to mention a concrete configuration structure for now; the requirement is still obscure, e.g., sometimes we want to differentiate by CSI driver type, sometimes by volume type. Let's wait and see the future requirements after this feature is released, and then come back to design the structure for CSI.

@anshulahuja98 Let me know if you are good with this.

@anshulahuja98 (Collaborator):
This should be good enough for now. I will keep this comment open for others to have a look; otherwise, consider this resolved for now.


There may be a snapshot number limit on the storage side; different storages may or may not have such a limit and may define different numbers as a trade-off among resource consumption, performance, data redundancy, etc. This means users should be aware of the limit of their storage and always configure a Velero retained-snapshot number that is no larger than their storage limit; otherwise, the snapshot creation and the data movement for the volume may fail.
For the same reason, if a DMP supports snapshots from more than one kind of storage, it may need to set different snapshot limits for volumes from different storages, based on different criteria (e.g., the CSI plugin may need to set different numbers per CSI driver type). In this case, the DMP could overwrite the global limit number based on its own configurations. The configurations are private to the DMP itself since the storage information is only known by the specific DMP, and how the configurations are structured and interpreted is also private to the specific DMP.

The global limit number is set as an annotation on the backup CR, named ```backup.velero.io/data-mover-snapshot-to-retain```. In this way, DMPs are able to query the number from the backup CR.
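For illustration, a minimal sketch of a backup CR carrying this annotation (the backup name, spec, and the value of 3 are hypothetical):

```yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: daily-backup                                      # hypothetical backup name
  namespace: velero
  annotations:
    # global limit: retain up to 3 native snapshots per volume; DMPs query this value
    backup.velero.io/data-mover-snapshot-to-retain: "3"
spec:
  includedNamespaces:
    - my-ns                                               # hypothetical namespace
```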

The DMP is responsible for maintaining the native snapshots. Specifically, it must be able to list/count the retained snapshots for each volume and remove the old ones once the limit is exceeded.
Practically, the DMP creates a snapshot associated object (SAO) for each retained snapshot; it can then list/count the retained snapshots by listing/counting the SAOs through the Kubernetes API at any time after the backup. For some DMPs, Kubernetes objects are already created as part of the snapshot creation, and they could choose any of those objects as SAOs. For others, if no Kubernetes objects are necessarily created during snapshot creation, they can create Kubernetes objects (e.g., configMaps) purposefully as SAOs. For the Velero CSI plugin, the VolumeSnapshotContent objects will act as SAOs.
To assist with the listing/counting, several labels are applied to SAOs:
- ```velero.io/snapshot-alias-name```: The DMP gives an alias to each SAO. As mentioned above, Velero/DMPs should not expect DMs to keep the snapshots and their associated objects unchanged; e.g., a DM may delete the VS/VSC created by the CSI plugin and create new ones from them (so the new ones also represent the same snapshots). By giving an alias, DMPs are guaranteed to find the SAOs by the alias label even though they have no way to know whether or where the DMs have cloned the SAOs. This also means that DMs should inherit all the labels from the original SAOs
@blackpiglet (Contributor), Oct 25, 2023:
Should the snapshot alias name be an SAO's name or a snapshot name?
Since it's inherited from the original one, what is the original SAO's snapshot alias name?

@Lyndon-Li (Contributor, Author):
itemToUpdate is returned from the DMP; the DMP doesn't exchange the SAO's name with DMs, and since DMs could create the SAO on their own, the DMP will not be able to know the exact name.

The alias could be anything that uniquely identifies an SAO, e.g., a random UID.

- ```velero.io/snapshot-volume-name```: This label stores the identity of the volume for which the snapshot is created. Its value could be anything that uniquely identifies a volume; for example, for a PVC, the value could be its namespaced name or UID. In this way, the DMP is able to tell all the retained snapshots for a specific volume
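As an illustration, a VolumeSnapshotContent acting as an SAO for the CSI plugin might be labeled as below (a minimal sketch; the object name and label values are hypothetical, and the spec is omitted):

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotContent
metadata:
  name: velero-mysql-data-snap-xxxxx                  # hypothetical SAO/VSC name
  labels:
    # alias assigned by the DMP; DMs inherit it even if they re-create the VSC
    velero.io/snapshot-alias-name: 9b1f2c3d-4e5a-46b7-8c9d-0e1f2a3b4c5d
    # identity of the volume the snapshot was taken from, e.g. the PVC's UID
    velero.io/snapshot-volume-name: 1a2b3c4d-5e6f-47a8-9b0c-d1e2f3a4b5c6
# spec omitted for brevity
```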

DMPs then return the SAOs as ```itemToUpdate``` with their aliases to the Velero backup; in this way, when the DM execution finishes, Velero gets the latest SAOs by their aliases and persists them to the backup storage.
Velero, specifically the backup sync controller, is responsible for syncing the SAOs to the target cluster for restore if they are not there. In this way, it is guaranteed that all the SAOs are available as long as their associated backups are there, i.e., for every restore, DMPs always see the full list of SAOs for all the volumes to be restored.

After the retained snapshots are counted against the limit, if the limit is exceeded, the DMP needs to make sure the old snapshots are removed before creating the new snapshot; otherwise, the snapshot creation may fail, as some storages have a hard limit on the number of snapshots.
Reviewer comment:
I understand that the storage snapshot limit will cause snapshot creation to fail. However, if the backup finally fails, deleting the old snapshot first may cause the user to lose a recovery point.

What will be the behavior if the configured global limit number > the storage-supported snapshot number? Does it make sense to limit the global number to a relatively reasonable value (a number that most storages won't exceed)?

@Lyndon-Li (Contributor, Author), Oct 25, 2023:
> if the backup finally fails, deleting the old snapshot firstly may cause the user to lose a recovery point

We delete the native snapshots from an old completed backup. Since that backup is completed, its data mover has also completed; this means a completed backup contains the native snapshots plus the data mover backup data.
So here we delete the native snapshots for that backup only, and it won't lose a recovery point; the recovery point will refer to the data mover data.
This approach is actually to avoid the situation where snapshot creation fails due to the snapshot limit for the current backup, so it deletes the native snapshot of the old backup first.

@Lyndon-Li (Contributor, Author), Oct 25, 2023:
> What will be the behavior if the configured global limit number > the storage supported snapshot numbers?

The snapshot creation will fail if the number of snapshots exceeds the storage limit. This means users need to be aware of their storage limit and set a number below the storage limit.

@Lyndon-Li (Contributor, Author):
> does it make sense to limit the global number to a relative reasonable value (a number that most storages won't exceed)

I am afraid we cannot deduce a number that works for most storages. The snapshot limit is storage specific. Actually, for a storage, there is no technical reason to support a fixed number as the limit; it is a compromise between resource consumption and performance, so generally this number is configurable on the storage side.

@Lyndon-Li (Contributor, Author):
Let me add a part to the doc discussing the storage limit.

@Lyndon-Li (Contributor, Author):
Done on the addition.

Since the retained snapshot from an old backup is deleted before the completion of the new backup, it is possible that the new backup fails and the retained snapshot from the old backup has been deleted unnecessarily. In this case, a restore from the old backup has to go through data movement since the retained snapshot has been deleted, which may take longer. However, the data should still be restored appropriately.

DMPs also need to tell DMs to retain a snapshot; this is done through the ```retainSnapshot``` field in the DUCR's spec. This is a boolean value, and if it is true, DMs will not delete the snapshots after their execution finishes.
After data movement completes, the retained snapshot name is recorded in the DUCR's ```retainedSnapshot``` field; the name varies by snapshot type. For CSI snapshots, it is the VSC name of the retained snapshot.
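For illustration, a minimal sketch of how these fields might appear on a DUCR for the CSI snapshot type (assuming the ```velero.io/v2alpha1``` API version for DUCRs; the object names are hypothetical and unrelated fields are omitted):

```yaml
apiVersion: velero.io/v2alpha1
kind: DataUpload
metadata:
  name: daily-backup-xxxxx                     # hypothetical DUCR name
  namespace: velero
spec:
  snapshotType: CSI
  retainSnapshot: true                         # tell the DM not to delete the snapshot when it finishes
  # ... other spec fields omitted
status:
  phase: Completed
  # for CSI, the VSC name of the retained snapshot
  retainedSnapshot: velero-mysql-data-snap-xxxxx
```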

During restore, DMPs check the existence of the SAO for a volume of a backup; if it exists, they restore the data from the native snapshot, otherwise, they submit DDCRs to do a data movement.

For the Velero CSI plugin, the existing logic is reused for restoring retained native snapshots, with some adjustments in the workflow.
The diagram below shows how a snapshot-retained backup happens for Velero and the Velero CSI plugin:
![backup-sequence-retained-snapshot.png](backup-sequence-retained-snapshot.png)
The diagram below shows how a snapshot-retained restore happens for Velero and the Velero CSI plugin when the native snapshot is not available:
![restore-sequence-retained-snapshot-not-avai.png](restore-sequence-retained-snapshot-not-avai.png)
The diagram below shows how a snapshot-retained restore happens for Velero and the Velero CSI plugin when the native snapshot is available:
![restore-sequence-retained-snapshot-avai.png](restore-sequence-retained-snapshot-avai.png)

## Backup Sync
A DUCR contains the information that is required during restore, but as mentioned above, it will not be synced because during restore its information is retrieved dynamically. Therefore, there is no change to Backup Sync for DUCRs.
For retained snapshots, as mentioned above, the backup sync controller finds the SAOs for a backup by the ```velero.io/snapshot-alias-name``` label in the backup storage and syncs them to the target cluster during backup sync. For CSI snapshots, Velero has existing logic to sync VolumeSnapshotContent objects; this logic will be reused.


## Backup Deletion
Once a backup is deleted, the data in the backup repository should be deleted as well. On the other hand, the data is created by the specific DM and Velero doesn't know how to delete it; therefore, Velero relies on the DM to delete the backup data.
@@ -655,6 +701,9 @@ As the current workflow, when ```velero backup delete``` CLI is called, a ```del
- Otherwise, if any error happens during the processing, the ```deletebackuprequests``` CR will be left there with the ```velero.io/dm-delete-backup``` finalizer, as well as the failed DUCRs
- DMs may retry the failed delete requests periodically

Once the backup is deleted, the retained native snapshots should also be deleted. Velero has the ability to delete the SAOs since they are part of the backup; if any particular operation is required for deleting the snapshots, the DMP that creates the snapshots needs to implement a DIA (DeleteItemAction) on the SAOs.
For CSI snapshots, the CSI plugin implements a DIA on VolumeSnapshotContent objects so that the snapshots can be removed appropriately.

## Restarts
If Velero restarts during a data movement activity, the backup/restore will be marked as failed when the Velero server comes back; at that time, Velero will request a cancellation of the ongoing data movement.
If a DM restarts, Velero has no way to detect this, so the DM is expected to:
@@ -890,11 +939,12 @@ Conclusively, below are the steps plugin DMs need to do in order to integrate to
- Set PV's ```claimRef``` to the provided PVC and set ```velero.io/dynamic-pv-restore``` label

## Working Mode
It doesn’t mean that once the data movement feature is enabled, users must move every snapshot. We will support the below three working modes:
- Don’t move snapshots. This is the same as the existing CSI snapshot feature, that is, native snapshots are taken and kept
- Move snapshot data and delete native snapshots. This means that once the data movement completes, the native snapshots will be deleted.
- Move snapshot data and keep X native snapshots for each volume. This means snapshot data is moved first, and several native snapshots will also be kept according to the user's configuration.

For this purpose, we need to add new options to the backup command, the Backup CRD, and the Velero server parameters.
The same option for restore will be retrieved from the specified backup, so that the working mode is consistent.

## Backup and Restore CRD Changes
@@ -911,10 +961,13 @@ We add below new fields in the Backup CRD:
DataMover string `json:"datamover,omitempty"`
```
SnapshotMoveData will be used to decide the Working Mode.
DataMover will be used to decide the data mover to handle the DUCR. DUCR's DataMover value is derived from this value.
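For reference, a sketch of how these fields could look in a Backup spec; the ```datamover``` json name follows the struct tag above, while the ```snapshotMoveData``` json name and the ```velero``` value for the built-in data mover are assumptions:

```yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: daily-backup               # hypothetical name
  namespace: velero
spec:
  snapshotMoveData: true           # selects the "move snapshot data" working mode (json name assumed)
  datamover: velero                # which DM handles the DUCRs; "velero" assumed to denote the built-in DM
```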

As mentioned in the Plugin Data Movers section, the data movement information for a restore should be the same as for the backup. Therefore, the working mode for restore should be decided by checking the corresponding Backup CR; when creating a DDCR, the DataMover value should be retrieved from the corresponding Backup Result.

## Velero Server Parameter Changes
We add a new Velero server flag, ```data-mover-snapshot-to-retain```, as a global configuration for users to specify how many native snapshots should be retained for each volume. The default value is 0, which means snapshots are not retained. For more information on how native snapshot retention works, see the Retain Native Snapshots section.
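As an illustration, and assuming the flag follows the usual Velero server argument syntax, it could be set on the Velero server Deployment like this (the value 3 is arbitrary):

```yaml
# Excerpt of the velero Deployment pod spec (flag syntax assumed)
containers:
  - name: velero
    image: velero/velero:latest
    command:
      - /velero
    args:
      - server
      - --data-mover-snapshot-to-retain=3   # retain up to 3 native snapshots per volume
```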

## Logging
The logs during the data movement are categorized as below:
- Logs generated by Velero