Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update disk resize to work with lvms-storage-operator/topolvm #4114

Closed
wants to merge 8 commits into from

Conversation

anjannath
Copy link
Member

@anjannath anjannath commented Apr 9, 2024

fixes #4117

needs crc-org/snc#877

@openshift-ci openshift-ci bot requested review from cfergeau and gbraad April 9, 2024 12:30
Copy link

openshift-ci bot commented Apr 9, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from anjannath. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

pkg/crc/config/settings.go Outdated Show resolved Hide resolved
cmd/crc/cmd/start.go Outdated Show resolved Hide resolved
@anjannath

This comment was marked as outdated.

@praveenkumar
Copy link
Member

/wip

@praveenkumar
Copy link
Member

/hold

@anjannath anjannath force-pushed the extradisk branch 3 times, most recently from 92fa114 to efefed3 Compare June 6, 2024 11:38
@anjannath anjannath changed the title Install LVMCluster resource during crc start Update disk resize to work with lvms-storage-operator/topolvm Jun 6, 2024
Copy link
Contributor

@cfergeau cfergeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused by ' update disk resize to work with new disk layout ', could you detail more in the commit log why we need to move the topolvm partition (ie what fails if we don't do this)? What is the order before, what is the order after? Could this be done proactively in snc? And can we resize both the root partition and the PVs without any issue?

pkg/crc/machine/start.go Outdated Show resolved Hide resolved
pkg/crc/machine/start.go Show resolved Hide resolved
@cfergeau
Copy link
Contributor

cfergeau commented Jul 1, 2024

this PR is not needed anymore, the snc PRs:

I assume this is no longer accurate as you updated the PR?

@anjannath
Copy link
Member Author

I'm a bit confused by ' update disk resize to work with new disk layout ', could you detail more in the commit log why we need to move the topolvm partition (ie what fails if we don't do this)? What is the order before, what is the order after? Could this be done proactively in snc? And can we resize both the root partition and the PVs without any issue?

i've updated the commit message.. but mentioning here as well: after adding a new partition at the end of disk, the partition layout is |efi|boot|root|topolvm|, now if we want to increase the size of the root partition we have to first move the topolvm partition to create the required free space after the root partition, earlier the root partition was the last partition on the disk and after increasing the disk size the additional free space was added after it, which is not the case anymore because the topolvm partition is there now

@cfergeau
Copy link
Contributor

cfergeau commented Jul 8, 2024

I'm a bit confused by ' update disk resize to work with new disk layout ', could you detail more in the commit log why we need to move the topolvm partition (ie what fails if we don't do this)? What is the order before, what is the order after? Could this be done proactively in snc? And can we resize both the root partition and the PVs without any issue?

i've updated the commit message.. but mentioning here as well: after adding a new partition at the end of disk, the partition layout is |efi|boot|root|topolvm|, now if we want to increase the size of the root partition we have to first move the topolvm partition to create the required free space after the root partition, earlier the root partition was the last partition on the disk and after increasing the disk size the additional free space was added after it, which is not the case anymore because the topolvm partition is there now

Ah ok, we need to go from |efi|boot|root|topolvm|free space| to |efi|boot|root|free space|topolvm|. The topolvm partition is always the last one, but its end also has to be aligned with the end of the disk so that the freed space comes before rather than after it.

return 15
default:
// should not be reached
return 3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it be enough? because registry also going to use the pvc which is going to part of this partition and if a user mirror an image to openshift or even build a image from openshift then this space is going to be filled very fast :(

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, we might have to increase the default disk-size, but there's very limited free space available, and currently that space is shared for both image-registry and host container image storage, with this since we have to create a separate partition this becomes more apparent:

[core@crc ~]$ df -h /dev/vda4
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda4        31G   23G  7.6G  76% /

We only have 7.6G free space, of which now 4.6G is for host container images and 3G for PVs

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can increase the default disk-size to 40 and default persistent-volume-size to 7, these are just arbitrary numbers, but it'd leave around 7gb free space on the root partition (closer to the current situation)

(the bundle sizes are also increasing which would mean less free space for workloads, right? i saw the 4.16.3 bundle from CI in 400MB larger than the 4.16.0 bundle)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what it's worth, increasing the guest disk size should not increase the bundle size nor the used disk space after crc start as long as the additional space is empty.

return errors.Wrap(err, "Failed to connect to the CRC VM with SSH -- virtual machine might be unreachable")
}
return nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This restart means add more time to startup even it takes ~1 mins.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, its less than 1 min, but that's on linux, on windows i expect it to be even longer than 1 minute

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the restart required? If I run the sfdisk command manually, I get

The partition table has been altered.
Calling ioctl() to re-read partition table.
Re-reading the partition table failed.: Device or resource busy
The kernel still uses the old table. The new table will be used at the next reboot or after you run partprobe(8) or partx(8).
Syncing disks.

Running partprobe could be enough?

I'd really prefer to avoid restarts of the VM/... during a crc start, this feels like a complication which will cause us problems later on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

running partprobe was not enough, it needed the reboot, yeah i agree that this makes it complicated and now it works because the VM ip doesn't change after a reboot


if preset == crcPreset.OpenShift {
pvPartition := "/dev/vda5"
if _, _, err := sshRunner.RunPrivileged(fmt.Sprintf("Growing %s partition", pvPartition), "/usr/bin/growpart", pvPartition[:len("/dev/.da")], pvPartition[len("/dev/.da"):]); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The add helper to grow root partition should come before this commit, this way you can reuse it here instead of duplicating already existing code.

}
if exitErr.ExitStatus() != 1 {
return err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to ignore errors when the command exited with 1 as its exit code? Or is it a copy and paste error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we need to ignore it, after moving the partition it returns exit code 1 (i think because of Re-reading the partition table failed.: Device or resource busy but couldn't find anything about exit codes in sfdisk man page)

func growRootFileSystem(ctx context.Context, startConfig types.StartConfig, vm *virtualMachine, sshRunner *crcssh.Runner) error {
if startConfig.Preset == crcPreset.OpenShift {
sizeToMove := ocpGetPVShiftSize(startConfig.DiskSize, startConfig.PersistentVolumeSize)
_, _, err := sshRunner.RunPrivileged("move topolvm partition to end of disk", fmt.Sprintf("echo '+%dG,' | sudo sfdisk --move-data /dev/vda -N 5 --force", sizeToMove))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would squash add helper to move topolvm partition with this commit, less code to review overall :)

return errors.Wrap(err, "Failed to connect to the CRC VM with SSH -- virtual machine might be unreachable")
}
return nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the restart required? If I run the sfdisk command manually, I get

The partition table has been altered.
Calling ioctl() to re-read partition table.
Re-reading the partition table failed.: Device or resource busy
The kernel still uses the old table. The new table will be used at the next reboot or after you run partprobe(8) or partx(8).
Syncing disks.

Running partprobe could be enough?

I'd really prefer to avoid restarts of the VM/... during a crc start, this feels like a complication which will cause us problems later on.

return diskSize - constants.DefaultDiskSize
}

func growRootFileSystem(ctx context.Context, startConfig types.StartConfig, vm *virtualMachine, sshRunner *crcssh.Runner) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already the case before this commit, but growRootFileSystem is badly named, it grows the root filesystem, and the PVs filesystem. growFilesystems would be a better name for it.

if pvSize > defaultPvSize {
return (diskSize - constants.DefaultDiskSize) - (pvSize - defaultPvSize)
}
return diskSize - constants.DefaultDiskSize
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend making use of github.com/containers/common/pkg/strongunits here (and in more places in our codebase :). At the very least, add a GiB suffix to the function name so that there is no confusion about what we are manipulating.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd favour calculating this from sfdisk output though, if the disk layout is in an unexpected state, these calculations could break as they rely on external values, not on the actual disk state. What we want to do is to align the last partition to the end of the disk, so we can do it without any knowledge of the default disk size, the disk size stored in json metadata ,...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What we want to do is to align the last partition to the end of the disk, so we can do it without any knowledge of the default disk size, the disk size stored in json metadata

that is not the case always, sometime we want to move the partition not to the very end and leave space to increase the PV partition size

suppose disk-size is 40 and persistent-volume-size is 5 (all in GB) then after increasing the entire disk to 40GB from 31GB the last partition needs to be moved by 7GB (not 9GB) leaving 2GB of space at the end so the PV partition can grow to become 5GB (default size is 3GB)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could grow the partition before moving it since it's at the end, or we can take into account its new size desired size when moving it, this is fine as this is data we cannot know from the current layout.

My impression from the current is that we determine the offset we need to move the partition without ever looking at the actual on disk layout, but only using hardcoded constants and crc configuration values. If the disk layout (unexpectedly) goes out of sync with this, bad things will happen I think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we can replace constant.DefaultDiskSize and defaultPvSize with the current disk size and current pv partition size fetching it with sfdisk, but i think it'll be an edge case if the config values do not reflect the actual values

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added d129b86 to address this

i'll try to incorporate this into the original commit, but since the functions gets moved to another file, its taking a bit of time to do this

return err
}
_, _, err = sshRunner.RunPrivileged("move topolvm partition to end of disk",
fmt.Sprintf("echo '+%dG,' | sudo sfdisk --move-data /dev/vda -N %s --force", shiftSize, pvPartition[len("/dev/.da"):]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use len("/dev/.da"), it might be worth also making the /dev/vda more generic? Iirc it's /dev/sda on Windows?

return getDeviceID(sshRunner, "-t PARTLABEL=topolvm")
}

func getDeviceID(sshRunner *crcssh.Runner, query string) (string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the api to this function a bit odd, this query string is quite magic. Might be better to name it runBlkidQuery or such?

}

func getTopolvmPartition(sshRunner *crcssh.Runner) (string, error) {
return getDeviceID(sshRunner, "-t PARTLABEL=topolvm")
Copy link
Contributor

@cfergeau cfergeau Jul 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also rework getDeviceID to accept a ...string rather than just a string as the correct (in theory) way of calling it is getDeviceID(sshRunner, "-t", "PARTLABEL=topolvm")

I made this change on top of yours:

diff --git a/pkg/crc/machine/disks.go b/pkg/crc/machine/disks.go
index 4283d1501..2ea169de0 100644
--- a/pkg/crc/machine/disks.go
+++ b/pkg/crc/machine/disks.go
@@ -129,19 +131,20 @@ func growPersistentVolume(sshRunner *crcssh.Runner, preset crcPreset.Preset, per
 }
 
 func getrootPartition(sshRunner *crcssh.Runner, preset crcPreset.Preset) (string, error) {
-	query := "--label root"
 	if preset == crcPreset.Microshift {
-		query = "-t TYPE=LVM2_member"
+		return getDeviceID(sshRunner, "-t", "TYPE=LVM2_member")
 	}
-	return getDeviceID(sshRunner, query)
+	return getDeviceID(sshRunner, "--label", "root")
 }
 
 func getTopolvmPartition(sshRunner *crcssh.Runner) (string, error) {
-	return getDeviceID(sshRunner, "-t PARTLABEL=topolvm")
+	return getDeviceID(sshRunner, "-t", "PARTLABEL=topolvm")
 }
 
-func getDeviceID(sshRunner *crcssh.Runner, query string) (string, error) {
-	part, _, err := sshRunner.RunPrivileged("Get device id", "/usr/sbin/blkid", query, "-o", "device")
+func getDeviceID(sshRunner *crcssh.Runner, query ...string) (string, error) {
+	cmd := []string{"/usr/sbin/blkid", "-o", "device"}
+	cmd = append(cmd, query...)
+	part, _, err := sshRunner.RunPrivileged("Get device id", cmd...)
 	if err != nil {
 		return "", err
 	}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, i applied this patch on the commit, and rename the function to runBlkidQuery

after using topolvm as storage provisioner there's
two partitions of type xfs and its ambiguos  which
is the root partition but the root partition  also
has a label 'root' in case of the ocp bundle
this allows users to also configure the persistent volume's partition
size for the ocp preset, in case of ocp preset the default size of the
persistent volume partition (topolvm partition) is 3G
to be able to use topolvm we need a separate parition this new
partition is added at the end of the disk and used by  topolvm
so the new disk layout is |efi|boot|root|pv|

earlier the root parition could be easily extended after  increasing
the disk size as there was no patition after it, with the new layout
after increasing the disk size we have to move the "pv" partition to
create free space after the "root" partition to increase its size

this adds code to move topolvm partition and restart the VM which is
needed to re-read the changed partition table, after moving the part
the root partition can be grown
this makes it a bit easier to navigate the start.go file by moving
some of the disk size and partition modification functions into  a
separate disks.go file in the same machine package
this uses `blkid -t PARTLABEL=topolvm` to get the topolvm
partition name and introduces a helper 'runBlkidQuery' to
run different forms of the `blkid` command
since it does more than growing only the root file system
it additionally grows the partition used by  topolvm/lvms
operator
instead of relying on the config values and the constants for default
disk-size and pv partition size, this fetches the size of disk,  root
and pv partition using `lsblk` and uses these values to calculate the
size by which the pv partition needs to be shifted to apply requested
new values for 'disk-size' and 'persistent-volume-size' configs

it uses the 'strongunits' package from containers/common for bytes to
GiB conversions and its types to deal with storage sizes
Copy link

openshift-ci bot commented Jul 31, 2024

@anjannath: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/security d129b86 link false /test security
ci/prow/e2e-crc d129b86 link true /test e2e-crc
ci/prow/integration-crc d129b86 link true /test integration-crc

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@anjannath
Copy link
Member Author

seems this is not a good return on investment, closing it, sorry about the noise!

if users want to use topolvm they can manually enable/install it, we'll add a wiki entry with the steps

@anjannath anjannath closed this Aug 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Use LVM storage operator for dynamic PV provisioning
4 participants