Pass assume_storage_prezeroed option when formatting ext4 if possible #4948

Open
idryomov opened this issue Nov 8, 2024 · 3 comments · May be fixed by #4996
Labels: component/rbd, good first issue, keepalive

@idryomov
Contributor

idryomov commented Nov 8, 2024

Describe the feature you'd like to have

Instead of passing lazy_itable_init=1 and lazy_journal_init=1 to mkfs.ext4, Ceph CSI should pass assume_storage_prezeroed=1. This option is stronger: it is a superset of lazy_itable_init=1 and lazy_journal_init=1, and it allows the filesystem to skip inode table zeroing completely instead of simply doing it lazily. As before with lazy_itable_init=1 and lazy_journal_init=1, this should be limited to dynamically provisioned volumes, a case where Ceph CSI can guarantee that the mkfs.ext4 invocation immediately follows the creation of the RBD image.
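
To make the proposed selection logic concrete, here is a minimal sketch of how the value for the -E flag of mkfs.ext4 could be chosen. The function name and both parameters are hypothetical and are not taken from the Ceph CSI code base; only the option strings themselves come from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// ext4ExtendedOptions returns a value for mkfs.ext4's -E flag.
// Hypothetical helper for illustration only:
//   - dynamicallyProvisioned: the image was created just before mkfs runs
//   - prezeroedSupported: the installed mkfs.ext4 accepts assume_storage_prezeroed
func ext4ExtendedOptions(dynamicallyProvisioned, prezeroedSupported bool) string {
	if !dynamicallyProvisioned {
		// The image contents are unknown, so no zeroing shortcuts are requested.
		return ""
	}
	if prezeroedSupported {
		// Superset of the two lazy options, so they are not passed as well.
		return "assume_storage_prezeroed=1"
	}
	// Fall back to the options that are passed today.
	return strings.Join([]string{"lazy_itable_init=1", "lazy_journal_init=1"}, ",")
}

func main() {
	fmt.Printf("new e2fsprogs: -E %s\n", ext4ExtendedOptions(true, true))
	fmt.Printf("old e2fsprogs: -E %s\n", ext4ExtendedOptions(true, false))
	fmt.Printf("static volume: %q\n", ext4ExtendedOptions(false, true))
}
```

Keeping the decision in one small pure function would make the fallback path easy to unit test without invoking mkfs.ext4.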

The assume_storage_prezeroed option became available in e2fsprogs 1.47.0 last year, so support for it should probably be "discovered" at runtime, similar to xfsSupportsReflink().

What is the value to the end user? (why is it a priority?)

Freshly created RBD volumes would consume less space. Quoting the assume_storage_prezeroed implementation patch:

    - Avoiding zeroing out the inode table and journal reduces the
      initial metadata space allocation from 0.48% to 0.01%.
    - Lazy inode table zeroing results in a further 1.45% of logical
      volume space getting allocated for inode tables, even if no file
      data is added to the filesystem. With assume_storage_prezeroed,
      the metadata allocation remains at 0.01%.

How will we know we have a good solution? (acceptance criteria)

Run before-and-after tests that create and attach batches of e.g. 100G, 500G, 1T and 5T RBD volumes, noting space usage as reported by ceph df. In the "before" case, the volumes would need to stay attached for a while to allow lazy inode table zeroing to complete.
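
For a rough idea of what such a test might show, the percentages quoted from the patch can be applied to the suggested volume sizes. This is only a back-of-the-envelope sketch: it assumes the quoted figures (0.48% initial plus a further 1.45% with lazy zeroing, 0.01% with assume_storage_prezeroed) are all fractions of the volume size and carry over unchanged to RBD, which may not hold. Actual ceph df numbers are what count.

```go
package main

import "fmt"

// Percentages quoted from the assume_storage_prezeroed patch description.
// Applying them to RBD volumes is an assumption made for this estimate.
const (
	initialLazyPct = 0.48 // initial metadata allocation without the option
	furtherLazyPct = 1.45 // further allocation once lazy zeroing has run
	prezeroedPct   = 0.01 // metadata allocation with assume_storage_prezeroed
)

// allocatedGiB returns pct percent of sizeGiB.
func allocatedGiB(sizeGiB, pct float64) float64 {
	return sizeGiB * pct / 100
}

func main() {
	// 100G, 500G, 1T and 5T expressed in GiB.
	sizes := []float64{100, 500, 1024, 5120}
	for _, size := range sizes {
		before := allocatedGiB(size, initialLazyPct+furtherLazyPct)
		after := allocatedGiB(size, prezeroedPct)
		fmt.Printf("%6.0f GiB volume: ~%.2f GiB before, ~%.2f GiB after\n",
			size, before, after)
	}
}
```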

@nixpanic nixpanic added component/rbd Issues related to RBD good first issue Good for newcomers labels Nov 8, 2024
@Madhu-1
Collaborator

Madhu-1 commented Nov 11, 2024

cc @ceph/ceph-csi-contributors

@black-dragon74 black-dragon74 self-assigned this Nov 26, 2024
black-dragon74 added a commit to black-dragon74/ceph-csi that referenced this issue Nov 27, 2024
Instead of passing lazy_itable_init=1 and lazy_journal_init=1 to
mkfs.ext4, pass assume_storage_prezeroed=1 which is
stronger and allows the filesystem to skip inode table zeroing
completely instead of simply doing it lazily.

Closes: ceph#4948

Signed-off-by: Niraj Yadav <niryadav@redhat.com>

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the wontfix This will not be worked on label Dec 26, 2024
@idryomov
Contributor Author

This appears to be in progress in #4996. @Madhu-1 Please don't let this issue get closed.

@black-dragon74 black-dragon74 added the keepalive This label can be used to disable stale bot activity in the repo label Dec 27, 2024
@Madhu-1 Madhu-1 removed the wontfix This will not be worked on label Jan 3, 2025
black-dragon74 added a commit to black-dragon74/ceph-csi that referenced this issue Jan 6, 2025
Instead of passing lazy_itable_init=1 and lazy_journal_init=1 to
mkfs.ext4, pass assume_storage_prezeroed=1 which is
stronger and allows the filesystem to skip inode table zeroing
completely instead of simply doing it lazily.

The support for this flag is checked by trying to format a fake
temporary image with mkfs.ext4 and checking its STDERR.

Closes: ceph#4948

Signed-off-by: Niraj Yadav <niryadav@redhat.com>
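
The probe described in the commit message above could look roughly like the following. This is a sketch and not the code from the referenced commit: the function names, the 64 MiB image size, and the rule for interpreting STDERR are assumptions. In particular, it assumes that an mkfs.ext4 without support for the option fails and names the rejected option in its error output.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"os/exec"
	"strings"
)

const prezeroedOpt = "assume_storage_prezeroed=1"

// optionRejected reports whether mkfs.ext4's stderr names the probed option.
// Assumption: a quiet, successful run never mentions it, while an e2fsprogs
// that does not know the option echoes it in a "bad option" style message.
func optionRejected(stderr string) bool {
	return strings.Contains(stderr, "assume_storage_prezeroed")
}

// probePrezeroed formats a temporary sparse file to find out whether the
// installed mkfs.ext4 accepts assume_storage_prezeroed.
func probePrezeroed() (bool, error) {
	f, err := os.CreateTemp("", "ext4-probe-*.img")
	if err != nil {
		return false, err
	}
	defer os.Remove(f.Name())

	// Sparse 64 MiB file; the size is an arbitrary choice for this sketch.
	if err := f.Truncate(64 << 20); err != nil {
		f.Close()
		return false, err
	}
	if err := f.Close(); err != nil {
		return false, err
	}

	var stderr bytes.Buffer
	// -F: allow a regular file as the target, -q: quiet output.
	cmd := exec.Command("mkfs.ext4", "-F", "-q", "-E", prezeroedOpt, f.Name())
	cmd.Stderr = &stderr
	runErr := cmd.Run()
	if runErr == nil {
		return true, nil
	}
	if optionRejected(stderr.String()) {
		return false, nil
	}
	// Failed for another reason (for example, mkfs.ext4 is not installed).
	return false, fmt.Errorf("mkfs.ext4 probe failed: %w: %s", runErr, stderr.String())
}

func main() {
	supported, err := probePrezeroed()
	if err != nil {
		fmt.Println("probe inconclusive:", err)
		return
	}
	fmt.Println("assume_storage_prezeroed supported:", supported)
}
```

Since the result cannot change while the process is running, a caller would probably want to run the probe once and cache the answer.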
black-dragon74 added a commit to black-dragon74/ceph-csi that referenced this issue Jan 8, 2025
Instead of passing lazy_itable_init=1 and lazy_journal_init=1 to
mkfs.ext4, pass assume_storage_prezeroed=1 which is
stronger and allows the filesystem to skip inode table zeroing
completely instead of simply doing it lazily.

The support for this flag is checked by trying to format a fake
temporary image with mkfs.ext4 and checking its STDERR.

Closes: ceph#4948

Signed-off-by: Niraj Yadav <niryadav@redhat.com>