OSD replication factor applies to all pools #385

Open
masnax opened this issue Aug 28, 2024 · 2 comments
Labels
Bug (confirmed to be a bug), Jira (allows the synchronization of a GitHub issue in Jira)

Comments

@masnax
Contributor

masnax commented Aug 28, 2024

We adjust the replication factor for OSDs up to 3 when adding systems to a MicroCloud.

We do this by specifying Pools: []string{"*"} which applies the replication factor to all pools. However, if these pools' sizes were manually modified, MicroCloud will not have this context and will overwrite the user configuration on each run of microcloud add, microcloud service add, and even microcloud init with an existing MicroCeph.
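To make the overwrite concrete, here is a minimal Go sketch that simulates a wildcard request against a set of pool sizes. The `poolPut` type, its `Size` field, the `applyPoolPut` helper, and the `custom_pool` name are all hypothetical stand-ins for illustration; only `Pools: []string{"*"}` comes from the description above.

```go
package main

import "fmt"

// poolPut is a hypothetical local stand-in for the request MicroCloud sends.
// Only the Pools field appears in the issue; Size is an assumption.
type poolPut struct {
	Pools []string
	Size  int64
}

// applyPoolPut simulates how a request changes pool sizes.
// It returns a new map and leaves the input untouched.
func applyPoolPut(current map[string]int64, req poolPut) map[string]int64 {
	updated := make(map[string]int64, len(current))
	for name, size := range current {
		updated[name] = size
	}

	for _, target := range req.Pools {
		if target == "*" {
			// The wildcard hits every pool, including manually tuned ones.
			for name := range updated {
				updated[name] = req.Size
			}
			continue
		}
		if _, ok := updated[target]; ok {
			updated[target] = req.Size
		}
	}

	return updated
}

func main() {
	pools := map[string]int64{"lxd_remote": 2, "custom_pool": 5}
	after := applyPoolPut(pools, poolPut{Pools: []string{"*"}, Size: 3})

	// The manually chosen size of custom_pool is replaced by the requested size.
	fmt.Println(after["custom_pool"])
}
```

Naming the pools explicitly instead of using the wildcard would leave `custom_pool` alone in this simulation, which is the behavior the rest of this issue is trying to reach.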

We will need to add a GET /1.0/pools-op endpoint to MicroCeph to fetch the current pool sizes from Ceph.

With this endpoint, we can inspect the current pool configuration and make inferences about whether we should change it:

  • on microcloud init, if the total OSD count is fewer than 3, we will set all "managed" OSD pool sizes to min(3, count(OSDs)).

  • on microcloud add and microcloud service add, present a list of "unmanaged" OSD pools whose size is smaller (or larger) than the new OSD count. The user will have to select the pools they want to increase in replication size. "Managed" pools will automatically be updated.

A "managed" pool is one that MicroCloud sets up. So that means lxd_remote, lxd_cephfs, lxd_cephfs_meta, lxd_cephfs_data, and .mgr.
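The decision described above could be sketched roughly as follows in Go. This is only an illustration under stated assumptions: the input map stands in for whatever the proposed endpoint would return, `targetSize` and `planAdd` are hypothetical helper names, `custom_pool` is an invented example pool, and the target size is read as the OSD count capped at 3, following the "up to 3" wording in the opening sentence.

```go
package main

import (
	"fmt"
	"sort"
)

// managedPools lists the pools this issue describes as set up by MicroCloud.
var managedPools = map[string]bool{
	"lxd_remote":      true,
	"lxd_cephfs":      true,
	"lxd_cephfs_meta": true,
	"lxd_cephfs_data": true,
	".mgr":            true,
}

// targetSize caps the replication size at 3 (assumed reading of "up to 3").
func targetSize(osdCount int) int {
	if osdCount < 3 {
		return osdCount
	}
	return 3
}

// planAdd splits pools into those updated automatically (managed) and those
// the user would be asked about (unmanaged pools whose size differs from the
// new OSD count). Both result slices are sorted for stable output.
func planAdd(sizes map[string]int, osdCount int) (auto []string, ask []string) {
	for name, size := range sizes {
		switch {
		case managedPools[name]:
			auto = append(auto, name)
		case size != osdCount:
			ask = append(ask, name)
		}
	}
	sort.Strings(auto)
	sort.Strings(ask)
	return auto, ask
}

func main() {
	// Hypothetical pool sizes, as the proposed endpoint might report them.
	sizes := map[string]int{"lxd_remote": 2, ".mgr": 2, "custom_pool": 5}

	auto, ask := planAdd(sizes, 3)
	fmt.Println("target size:", targetSize(3))
	fmt.Println("update automatically:", auto)
	fmt.Println("ask the user about:", ask)
}
```

Keeping the managed set as an explicit list means any pool MicroCloud did not create is never touched without the user opting in.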

@masnax added the Bug label Aug 28, 2024
@roosterfish
Contributor

Can we use the LXD client to query for both the remote and remote-fs pools?
This way we don't have to query Ceph directly.

@masnax
Contributor Author

masnax commented Aug 29, 2024

Even that isn't quite enough because we also need to make sure the replication size isn't already customized to a higher value. Otherwise we would reset them all back to 3 each time a node is added.
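One possible guard, sketched in Go under the assumption that the current size of each pool is known: raise a pool that is below the target, but never lower one that was customized to a higher value. The `nextSize` helper is a hypothetical name, not existing MicroCloud code.

```go
package main

import "fmt"

// nextSize returns the size a pool should have after a node is added.
// It raises a pool that is below the target but never lowers one that
// is already at or above it.
func nextSize(current, target int) int {
	if current >= target {
		return current
	}
	return target
}

func main() {
	fmt.Println(nextSize(2, 3)) // below the target, so it is raised
	fmt.Println(nextSize(5, 3)) // customized higher value is kept
}
```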

@mseralessandri added the Jira label Sep 4, 2024

3 participants