Cleanups and minor improvements in lima provider #1568
Merged
Member
nirs
commented
Sep 22, 2024
- Remove unneeded sudo usage
- Group kubeconfig setup in same provision step
- Simplify port forwarding rules
- Clean up probing for completion
- Don't wait for coredns deployment
- Scale down coredns to 1 replica
- Use lima also on darwin/x86_64
- Disable rosetta for vm environment
- Make the vm environment smaller
- Add drenv start --timeout option
- Quote addresses to avoid yaml parsing surprises
nirs force-pushed the lima-cleanups branch 2 times, most recently from b62ec4d to 0049f65 on September 22, 2024 22:18
rakeshgm
approved these changes
Sep 23, 2024
3 tasks
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
This is useful when you want to avoid drenv start getting stuck, and the environment does not provide a way to time out.

Example run:

    % drenv start --timeout 60 envs/vm.yaml
    2024-09-21 04:27:43,555 INFO [vm] Starting environment
    2024-09-21 04:27:43,581 INFO [cluster] Starting lima cluster
    2024-09-21 04:28:43,785 ERROR [cluster] did not receive an event with the "running" status
    2024-09-21 04:28:43,790 ERROR Command failed
    Traceback (most recent call last):
      ...
    drenv.commands.Error: Command failed:
       command: ('limactl', '--log-format=json', 'start', '--timeout=60s', 'cluster')
       exitcode: 1
       error:

For the lima provider, we pass the timeout to limactl. For minikube we use the commands.watch() timeout. The external provider does not use the timeout since we don't start the cluster.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
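The way the timeout reaches limactl can be sketched roughly as follows. This is a minimal illustration, not the drenv implementation: `limactl_start_cmd` is a hypothetical helper, and only the resulting argument list is taken from the error output above.

```python
def limactl_start_cmd(name, timeout=None):
    """Build the limactl command for starting a cluster.

    Hypothetical helper for illustration; drenv's real code may differ.
    """
    cmd = ["limactl", "--log-format=json", "start"]
    if timeout is not None:
        # limactl takes a duration string such as "60s".
        cmd.append(f"--timeout={timeout}s")
    cmd.append(name)
    return cmd


print(limactl_start_cmd("cluster", timeout=60))
```

When no timeout is given, the option is omitted and limactl falls back to its own default.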
On github actions we cannot start a vm with more than one cpu, and our test cluster is very small so it should work with 1g of ram. This makes the test cluster consume fewer resources if a developer leaves it running for a long time.

Using 1 cpu conflicts with kubeadm preflight checks:

    [init] Using Kubernetes version: v1.31.0
    ...
    error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR NumCPU]: the number of available CPUs 1 is less than the required 2
    [preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`

Since we know what we are doing, we suppress this preflight error.

Running top shows that the cluster consumes only 9% cpu and 33.5% of the available memory.

    top - 04:51:02 up 10 min, 1 user, load average: 0.60, 0.29, 0.16
    Tasks: 123 total, 1 running, 122 sleeping, 0 stopped, 0 zombie
    %Cpu(s):  6.2/3.1     9[||||||                    ]
    MiB Mem : 33.5/1959.9  [|||||||||||||||||||||     ]
    MiB Swap:  0.0/0.0     [                          ]

     PID USER  PR NI    VIRT    RES   SHR S %CPU %MEM   TIME+ COMMAND
    4124 root  20  0 1449480 244352 68224 S  2.3 12.2 0:18.47 kube-apiserver
    4120 root  20  0 1309260 102456 60800 S  0.7  5.1 0:06.17 kube-controller
    4273 root  20  0 1902692  92040 59904 S  2.0  4.6 0:12.47 kubelet
    4159 root  20  0 1288880  60156 44288 S  0.3  3.0 0:02.20 kube-scheduler
    3718 root  20  0 1267660  57880 31020 S  0.3  2.9 0:09.28 containerd
    4453 root  20  0 1289120  53540 42880 S  0.0  2.7 0:00.15 kube-proxy
    5100 65532 20  0 1284608  49656 37504 S  0.0  2.5 0:01.14 coredns
    5000 65532 20  0 1284608  49528 37376 S  0.0  2.5 0:00.86 coredns
    4166 root  20  0   11.2g  47360 21504 S  1.3  2.4 0:08.69 etcd

We could make the cluster even smaller, but the kubeadm preflight check requires 1700 MiB, and we don't have memory issues in github or on developers' machines.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
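One way to suppress the NumCPU check is kubeadm's `--ignore-preflight-errors` option, which accepts a comma-separated list of check names. The sketch below only builds such a command line; `kubeadm_init_cmd` is a hypothetical helper, and the provider may well configure this differently (for example through a kubeadm configuration file).

```python
def kubeadm_init_cmd(ignored_checks=()):
    """Build a kubeadm init command that downgrades the given preflight
    checks from fatal errors to warnings.

    Hypothetical helper for illustration only.
    """
    cmd = ["kubeadm", "init"]
    if ignored_checks:
        # Check names match the names in the error output, e.g. "NumCPU".
        cmd.append("--ignore-preflight-errors=" + ",".join(ignored_checks))
    return cmd


print(" ".join(kubeadm_init_cmd(["NumCPU"])))
```

Listing only the specific check keeps the other preflight checks fatal, which is safer than ignoring all of them.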
We need rosetta only for complex setups where a component does not provide arm64 images. For testing drenv we have an empty cluster with busybox, and our busybox image is multi-arch. Disabling rosetta may speed up the cluster, and this may be significant on github actions since the runners are about 6.5 times slower compared to an M1 mac. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
This is useful for running the tests on github runners, and it also means we support a single option on macOS. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Same optimization used in minikube. Minikube uses 2 replicas only when using HA configuration. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
It takes about 30 seconds until coredns is ready but we don't depend on it in the current code. By removing this wait we can start deploying 30 seconds earlier. This reduces the time for starting a cluster from 155 seconds to 125 seconds, and the regional-dr environment from 450 to 420 seconds.

Example run with vm environment:

    % drenv start envs/vm.yaml
    2024-09-22 18:39:55,726 INFO [vm] Starting environment
    2024-09-22 18:39:55,743 INFO [cluster] Starting lima cluster
    2024-09-22 18:41:57,818 INFO [cluster] Cluster started in 122.07 seconds
    2024-09-22 18:41:57,819 INFO [cluster/0] Running addons/example/start
    2024-09-22 18:42:18,966 INFO [cluster/0] addons/example/start completed in 21.15 seconds
    2024-09-22 18:42:18,966 INFO [cluster/0] Running addons/example/test
    2024-09-22 18:42:19,120 INFO [cluster/0] addons/example/test completed in 0.15 seconds
    2024-09-22 18:42:19,121 INFO [vm] Environment started in 143.40 seconds

    % drenv stop envs/vm.yaml
    2024-09-22 18:42:44,244 INFO [vm] Stopping environment
    2024-09-22 18:42:44,317 INFO [cluster] Stopping lima cluster
    2024-09-22 18:42:44,578 WARNING [cluster] [hostagent] dhcp: unhandled message type: RELEASE
    2024-09-22 18:42:49,441 INFO [cluster] Cluster stopped in 5.13 seconds
    2024-09-22 18:42:49,441 INFO [vm] Environment stopped in 5.20 seconds

    % drenv start envs/vm.yaml
    2024-09-22 18:42:53,132 INFO [vm] Starting environment
    2024-09-22 18:42:53,156 INFO [cluster] Starting lima cluster
    2024-09-22 18:43:34,436 INFO [cluster] Cluster started in 41.28 seconds
    2024-09-22 18:43:34,437 INFO [cluster] Looking up failed deployments
    2024-09-22 18:43:34,842 INFO [cluster/0] Running addons/example/start
    2024-09-22 18:43:35,208 INFO [cluster/0] addons/example/start completed in 0.37 seconds
    2024-09-22 18:43:35,208 INFO [cluster/0] Running addons/example/test
    2024-09-22 18:43:35,371 INFO [cluster/0] addons/example/test completed in 0.16 seconds
    2024-09-22 18:43:35,372 INFO [vm] Environment started in 42.24 seconds

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
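When comparing runs like the ones above, it can help to pull the durations out of the log instead of reading them by eye. The following is a small sketch that assumes the log line layout shown in the example run; `parse_durations` is a hypothetical helper and not part of drenv.

```python
import re

# Matches lines such as:
#   ... INFO [cluster] Cluster started in 122.07 seconds
DURATION = re.compile(
    r"\[(?P<name>[^\]]+)\] (?P<what>.+?) in (?P<seconds>\d+\.\d+) seconds$"
)


def parse_durations(lines):
    """Return (name, event, seconds) for every log line reporting a duration."""
    result = []
    for line in lines:
        match = DURATION.search(line)
        if match:
            result.append(
                (match["name"], match["what"], float(match["seconds"]))
            )
    return result


log = [
    "2024-09-22 18:39:55,743 INFO [cluster] Starting lima cluster",
    "2024-09-22 18:41:57,818 INFO [cluster] Cluster started in 122.07 seconds",
    "2024-09-22 18:42:18,966 INFO [cluster/0] addons/example/start completed in 21.15 seconds",
    "2024-09-22 18:42:19,121 INFO [vm] Environment started in 143.40 seconds",
]

for name, what, seconds in parse_durations(log):
    print(f"{name:10} {what:35} {seconds:8.2f}")
```

Lines that do not report a duration, such as "Starting lima cluster", are skipped.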
- Use `/readyz` endpoint for waiting until the kubernetes cluster is ready instead of `kubectl version`. This matches the way we wait for the cluster in drenv.
- When waiting for coredns deployment, use `kubectl rollout status`, matching other code in drenv and addons.
- Nicer probe description, visible in drenv when using --verbose.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
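The two waiting strategies can be sketched in Python as below. This is only an illustration under stated assumptions: the actual probes are provision scripts in the lima configuration, all function names here are hypothetical, and the `coredns` deployment in the `kube-system` namespace is assumed from a standard kubeadm cluster. The kubectl invocations (`kubectl get --raw /readyz` and `kubectl rollout status`) are standard kubectl commands.

```python
import subprocess
import time


def api_server_ready(run=subprocess.run):
    """Return True if the API server /readyz endpoint reports "ok"."""
    cp = run(
        ["kubectl", "get", "--raw", "/readyz"],
        capture_output=True,
        text=True,
    )
    return cp.returncode == 0 and cp.stdout.strip() == "ok"


def wait_until_ready(timeout=300, delay=1, check=api_server_ready,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it succeeds or timeout seconds have passed."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(delay)


def coredns_rollout_cmd(timeout=300):
    """Command that blocks until the coredns deployment is rolled out."""
    return [
        "kubectl", "rollout", "status", "deployment/coredns",
        "--namespace=kube-system", f"--timeout={timeout}s",
    ]
```

Calling `wait_until_ready(timeout=60)` polls the endpoint once per second and returns False if the cluster is not ready in time. The `run`, `check`, `sleep` and `clock` parameters exist only so the sketch can be exercised without a real cluster.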
We can ignore all ports for all protocols for all addresses using a simpler rule. With this change we see this log in --verbose mode:

    2024-09-23 01:12:18,130 DEBUG [cluster] [hostagent] TCP (except for SSH) and UDP port forwarding is disabled

And no "Not forwarding port ..." message. Previously we had a lot of these messages[1].

This requires lima commit[2]. Pull current master if you run an older version.

[1] lima-vm/lima#2577
[2] lima-vm/lima@9a09350

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Also remove the unneeded export. It was used to run kubectl commands before we set up root/.kube/config, but this step does not run any kubectl commands. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Provision scripts using `mode: system` run as root and do not need to use sudo. Looks like the scripts were copied from code not running as root. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
This is fixed now in lima[1] so we don't need to keep the fix in drenv. Developers should pull the latest lima from git to use this fix. [1] lima-vm/lima#2632 Signed-off-by: Nir Soffer <nsoffer@redhat.com>
raghavendra-talur
approved these changes
Oct 7, 2024