-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Perf improvements for existing resource restore #6723
Perf improvements for existing resource restore #6723
Conversation
76b72c1
to
346f742
Compare
7c07591
to
e2f2d83
Compare
Codecov Report
@@ Coverage Diff @@
## main #6723 +/- ##
==========================================
+ Coverage 60.69% 60.81% +0.11%
==========================================
Files 249 249
Lines 26493 26607 +114
==========================================
+ Hits 16081 16180 +99
- Misses 9268 9278 +10
- Partials 1144 1149 +5
|
e2f2d83
to
0425f4a
Compare
@shubham-pampattiwar @kaovilai please review as you have time :) |
0425f4a
to
25ff2b4
Compare
25ff2b4
to
106074b
Compare
106074b
to
b9b3105
Compare
@sseago |
@Lyndon-Li issue is here: #6845 |
pkg/cmd/cli/install/install.go
Outdated
@@ -120,6 +121,7 @@ func (o *Options) BindFlags(flags *pflag.FlagSet) { | |||
flags.BoolVar(&o.DefaultVolumesToFsBackup, "default-volumes-to-fs-backup", o.DefaultVolumesToFsBackup, "Bool flag to configure Velero server to use pod volume file system backup by default for all volumes on all backups. Optional.") | |||
flags.StringVar(&o.UploaderType, "uploader-type", o.UploaderType, fmt.Sprintf("The type of uploader to transfer the data of pod volumes, the supported values are '%s', '%s'", uploader.ResticType, uploader.KopiaType)) | |||
flags.BoolVar(&o.DefaultSnapshotMoveData, "default-snapshot-move-data", o.DefaultSnapshotMoveData, "Bool flag to configure Velero server to move data by default for all snapshots supporting data movement. Optional.") | |||
flags.BoolVar(&o.UseInformerCacheForGet, "use-informer-cache-for-get", o.UseInformerCacheForGet, "Use informer cache for Get calls on restore. This will speed up restore in cases where there are backup resources which already exist in the cluster, but for very large clusters this will increase velero memory usage. Default is true. Optional.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we give a user friendly name for this flag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Lyndon-Li @kaovilai I agree it's a bit awkward. But I couldn't come up with a better name at the time. Is shortening it to "use-informer-cache" OK? Maybe the "for get" is implied, since that's really the only call where the informer cache helps us much. I'll make that change unless someone comes up with a better name before I get to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah use-informer-cache
sounds good, or maybe enable-informer-cache
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Actually, since we're defaulting it to enabled, maybe we should call it disable-informer-cache
and if false, we use informer cache, if true we don't?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah works for me.
f0f8f58
b9b3105
to
f0f8f58
Compare
Updated based on review feedback. |
cf7904f
to
930f0dd
Compare
@@ -122,6 +123,7 @@ func (o *Options) BindFlags(flags *pflag.FlagSet) { | |||
flags.BoolVar(&o.DefaultVolumesToFsBackup, "default-volumes-to-fs-backup", o.DefaultVolumesToFsBackup, "Bool flag to configure Velero server to use pod volume file system backup by default for all volumes on all backups. Optional.") | |||
flags.StringVar(&o.UploaderType, "uploader-type", o.UploaderType, fmt.Sprintf("The type of uploader to transfer the data of pod volumes, the supported values are '%s', '%s'", uploader.ResticType, uploader.KopiaType)) | |||
flags.BoolVar(&o.DefaultSnapshotMoveData, "default-snapshot-move-data", o.DefaultSnapshotMoveData, "Bool flag to configure Velero server to move data by default for all snapshots supporting data movement. Optional.") | |||
flags.BoolVar(&o.DisableInformerCache, "disable-informer-cache", o.DisableInformerCache, "Disable informer cache for Get calls on restore. With this enabled, it will speed up restore in cases where there are backup resources which already exist in the cluster, but for very large clusters this will increase velero memory usage. Default is false (don't disable). Optional.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think cobra already append default info to help
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually no it doesnt. it just appends what type things are.. ie string or bool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right, so better to keep the default text, thanks @kaovilai !
@sseago Could you help to rebase from the main branch? The codespell errors have been fixed in the latest code. |
Use informer cache with dynamic client for Get calls on restore When enabled, also make the Get call before create. Add server and install parameter to allow disabling this feature, but enable by default Signed-off-by: Scott Seago <sseago@redhat.com>
930f0dd
to
7750e12
Compare
I set DCO to pass because the errors that showed up weren't for my PR.
|
@Lyndon-Li done |
@sseago I believe we wanted to CP this to v.1.11 for oadp-1.2 |
@weshayutin @sseago keep in mind that there's an issue with the functionality noted in #7263 |
Use informer cache with dynamic client for Get calls on restore
When enabled, also make the Get call before create attempt.
Add server and install parameter to allow disabling this feature, but enable by default
This change results in significant restore performance improvements in cases where resources to be restored already exist in the cluster. As a test example, backed up a namespace with 31k secrets in it (the number was chosen to approximately match the total number of items in a real backup of a user who was concerned with restore performance numbers for the existing resource case).
Without this change:
restore into new namespace: 52 minutes
restore with all resources existing (existingResourcePolicy=none): 52 minutes
restore with all resources existing (existinResourcePolicy=update): 1 hour, 18 minutes
With this change:
restore into new namespace: 52 minutes
restore with all resources existing (existingResourcePolicy=none): 26 minutes
restore with all resources existing (existinResourcePolicy=update): 26 minutes
Does your change fix a particular issue?
Fixes #(issue)
Please indicate you've done the following:
/kind changelog-not-required
as a comment on this pull request.site/content/docs/main
.