
Upgrade Linux kernel from 6.6 to 6.12 #2300

Draft: ader1990 wants to merge 22 commits into base: main
Conversation

@ader1990 (Contributor) commented Sep 10, 2024

Upgrade the Linux kernel from the 6.6.y stable branch to the 6.12.y stable branch (once 6.12 is released).

See: flatcar/Flatcar#1527

This PR mostly serves to reveal any big blockers before we get to the new 6.12 LTS release.

Currently, the upstream Gentoo package offers 6.10 and 6.11.

Tested 6.10.y and it works as expected.

Now testing 6.11.y.

Testing done


  • Changelog entries added in the respective changelog/ directory (user-facing change, bug fix, security fix, update)
  • Inspected CI output for image differences: /boot and /usr sizes, packages, and list files for any missing binaries, kernel modules, config files, etc.

Boot partition size:

arm64: /dev/nvme0n1p1     129039    63368     65672  50% /boot
amd64: /dev/vda1          129039    62852     66187  49% /boot

@ader1990 (Contributor, Author)

ZFS 2.2.5 does not support kernel 6.10; the ZFS upgrade patches will be dropped once the portage-stable update PR (with ZFS 2.2.6) gets merged: #2298

github-actions bot commented Sep 10, 2024

Test report for 4138.0.0+nightly-20241029-2100 / amd64 arm64

Platforms tested : qemu_uefi-amd64 qemu_update-amd64 qemu_uefi-arm64 qemu_update-arm64

ok bpf.execsnoop 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok bpf.local-gadget 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.basic 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.cgroupv1 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.cloudinit.basic 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.cloudinit.multipart-mime 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.cloudinit.script 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.disk.raid0.data 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.disk.raid0.root 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.disk.raid1.data 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.disk.raid1.root 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.etcd-member.discovery 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.etcd-member.etcdctlv3 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.etcd-member.v2-backup-restore 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.filesystem 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.flannel.udp 🟢 Succeeded: qemu_uefi-amd64 (1)

ok cl.flannel.vxlan 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.instantiated.enable-unit 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.kargs 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.luks 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.oem.indirect 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.oem.indirect.new 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.oem.regular 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (2) ❌ Failed: qemu_uefi-arm64 (1)

                Diagnostic output for qemu_uefi-arm64, run 1
    L1: "  "
    L2: " Error: oem.go:199: Couldn't reboot machine: machine "3d00a1f0-361d-480b-9f8d-df018a4ae36e" failed basic checks: some systemd units failed:"
    L3: "● ldconfig.service loaded failed failed Rebuild Dynamic Linker Cache"
    L4: "status: "
    L5: "journal:-- No entries --"
    L6: "harness.go:602: Found systemd unit failed to start (ldconfig.service - Rebuild Dynamic Linker Cache.  ) on machine 3d00a1f0-361d-480b-9f8d-df018a4ae36e console"
    L7: " "

ok cl.ignition.oem.regular.new 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (2) ❌ Failed: qemu_uefi-arm64 (1)

                Diagnostic output for qemu_uefi-arm64, run 1
    L1: "  "
    L2: " Error: oem.go:199: Couldn't reboot machine: machine "a8d8ccdb-a3ab-46a8-b4df-52f613003741" failed basic checks: some systemd units failed:"
    L3: "● ldconfig.service loaded failed failed Rebuild Dynamic Linker Cache"
    L4: "status: "
    L5: "journal:-- No entries --"
    L6: "harness.go:602: Found systemd unit failed to start (ldconfig.service - Rebuild Dynamic Linker Cache.  ) on machine a8d8ccdb-a3ab-46a8-b4df-52f613003741 console"
    L7: " "

ok cl.ignition.oem.reuse 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.oem.wipe 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.partition_on_boot_disk 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.symlink 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.translation 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v1.btrfsroot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v1.ext4root 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v1.groups 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v1.once 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v1.sethostname 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v1.users 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v1.xfsroot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v2.btrfsroot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v2.ext4root 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v2.users 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v2.xfsroot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v2_1.ext4checkexisting 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v2_1.swap 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v2_1.vfat 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.install.cloudinit 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.internet 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.locksmith.cluster 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.misc.falco 🟢 Succeeded: qemu_uefi-amd64 (1)

ok cl.network.initramfs.second-boot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.network.listeners 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.network.wireguard 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.omaha.ping 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.osreset.ignition-rerun 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.overlay.cleanup 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.swap_activation 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.sysext.boot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.sysext.fallbackdownload # SKIP 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.tang.nonroot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.tang.root 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.toolbox.dnf-install 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.tpm.eventlog 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.tpm.nonroot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.tpm.root 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.tpm.root-cryptenroll 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.tpm.root-cryptenroll-pcr-noupdate 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.tpm.root-cryptenroll-pcr-withupdate 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.update.badverity 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.update.grubnop 🟢 Succeeded: qemu_uefi-amd64 (1)

ok cl.update.payload 🟢 Succeeded: qemu_update-amd64 (1); qemu_update-arm64 (1)

ok cl.update.reboot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.users.shells 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.verity 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.auth.verify 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.ignition.groups 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.ignition.once 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.ignition.resource.local 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.ignition.resource.remote 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.ignition.resource.s3.versioned 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.ignition.security.tls 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.ignition.sethostname 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.ignition.systemd.enable-service 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.locksmith.reboot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.locksmith.tls 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.selinux.boolean 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.selinux.enforce 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.tls.fetch-urls 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.update.badusr 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok devcontainer.docker 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok devcontainer.systemd-nspawn 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok docker.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok docker.btrfs-storage 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok docker.containerd-restart 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok docker.enable-service.sysext 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok docker.lib-coreos-dockerd-compat 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok docker.network-openbsd-nc 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok docker.selinux 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok docker.userns 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok extra-test.[first_dual].cl.update.docker-btrfs-compat 🟢 Succeeded: qemu_update-amd64 (1); qemu_update-arm64 (1)

ok extra-test.[first_dual].cl.update.payload 🟢 Succeeded: qemu_update-amd64 (1); qemu_update-arm64 (1)

ok kubeadm.v1.29.2.calico.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.29.2.calico.cgroupv1.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.29.2.cilium.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.29.2.cilium.cgroupv1.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.29.2.flannel.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.29.2.flannel.cgroupv1.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.30.1.calico.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.30.1.cilium.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.30.1.flannel.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.31.0.calico.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.31.0.cilium.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.31.0.flannel.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok linux.nfs.v3 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok linux.nfs.v4 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok linux.ntp 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok misc.fips 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok packages 🟢 Succeeded: qemu_uefi-amd64 (3); qemu_uefi-arm64 (1) ❌ Failed: qemu_uefi-amd64 (1, 2)

                Diagnostic output for qemu_uefi-amd64, run 2
    L1: " packages/sys-block/open-iscsi (19.51s)"
    L2: "cluster.go:125: Unable to find image 'ghcr.io/flatcar/targetcli-fb:latest' locally"
    L3: "cluster.go:125: latest: Pulling from flatcar/targetcli-fb"
    L4: "cluster.go:125: a2318d6c47ec: Pulling fs layer"
    L5: "cluster.go:125: 3d3086a1439f: Pulling fs layer"
    L6: "cluster.go:125: a2318d6c47ec: Download complete"
    L7: "cluster.go:125: 3d3086a1439f: Verifying Checksum"
    L8: "cluster.go:125: 3d3086a1439f: Download complete"
    L9: "cluster.go:125: a2318d6c47ec: Pull complete"
    L10: "cluster.go:125: 3d3086a1439f: Pull complete"
    L11: "cluster.go:125: Digest: sha256:b6cd65db981974e8b74938617218dd023775b969f9a059ced21e6ce6fa4763c1"
    L12: "cluster.go:125: Status: Downloaded newer image for ghcr.io/flatcar/targetcli-fb:latest"
    L13: "cluster.go:125: mke2fs 1.47.1 (20-May-2024)"
    L14: "cluster.go:125: Created symlink /etc/systemd/system/remote-fs.target.wants/iscsi.service → /usr/lib/systemd/system/iscsi.service."
    L15: "cluster.go:145: "sudo /check" failed: output no /dev/sda device after reboot, status Process exited with status 1"
    L16: " "
                Diagnostic output for qemu_uefi-amd64, run 1
    L1: " packages/sys-block/open-iscsi (19.27s)"
    L2: "cluster.go:125: Unable to find image 'ghcr.io/flatcar/targetcli-fb:latest' locally"
    L3: "cluster.go:125: latest: Pulling from flatcar/targetcli-fb"
    L4: "cluster.go:125: a2318d6c47ec: Pulling fs layer"
    L5: "cluster.go:125: 3d3086a1439f: Pulling fs layer"
    L6: "cluster.go:125: a2318d6c47ec: Download complete"
    L7: "cluster.go:125: 3d3086a1439f: Verifying Checksum"
    L8: "cluster.go:125: 3d3086a1439f: Download complete"
    L9: "cluster.go:125: a2318d6c47ec: Pull complete"
    L10: "cluster.go:125: 3d3086a1439f: Pull complete"
    L11: "cluster.go:125: Digest: sha256:b6cd65db981974e8b74938617218dd023775b969f9a059ced21e6ce6fa4763c1"
    L12: "cluster.go:125: Status: Downloaded newer image for ghcr.io/flatcar/targetcli-fb:latest"
    L13: "cluster.go:125: mke2fs 1.47.1 (20-May-2024)"
    L14: "cluster.go:125: Created symlink /etc/systemd/system/remote-fs.target.wants/iscsi.service → /usr/lib/systemd/system/iscsi.service."
    L15: "cluster.go:145: "sudo /check" failed: output no /dev/sda device after reboot, status Process exited with status 1"
    L16: " "
    L17: "  "

ok sysext.custom-docker.sysext 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok sysext.custom-oem 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok sysext.disable-containerd 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok sysext.disable-docker 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok sysext.simple 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok systemd.journal.remote 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok systemd.journal.user 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok systemd.sysusers.gshadow 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

@@ -36,6 +36,5 @@ IUSE=""
 # local patches overlap with the upstream patch.
 UNIPATCH_LIST="
 ${PATCH_DIR}/z0001-kbuild-derive-relative-path-for-srctree-from-CURDIR.patch \
-${PATCH_DIR}/z0002-revert-pahole-flags.patch \
Member

Have you tested this?
When pahole is executed with -j (parallel) then btf metadata order is non-deterministic and the built kernel and modules don't match.

It doesn't have to be a revert, but we need to carry some patch (unless something significant changed).
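
One way to check this concern is to build twice and compare the BTF output; a minimal sketch follows (bpftool and the vmlinux paths are standard, but treating a raw-dump diff as the determinism check is my assumption):

```sh
# Dump the BTF of two builds from identical sources and compare; any
# ordering non-determinism introduced by pahole -j shows up as a diff here.
bpftool btf dump file build1/vmlinux format raw > btf1.txt
bpftool btf dump file build2/vmlinux format raw > btf2.txt
diff -q btf1.txt btf2.txt && echo "BTF output is deterministic"
```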

Contributor (Author)

Definitely, working on it. The pahole flags moved to scripts/Makefile.btf, so that needs to be addressed; I am working on a patch now.

Member

We recently updated pahole to a newer version (1.27) that was supposed to be reproducible regardless of how many threads it uses, but dropping the patch didn't work for me.

Member

Seems like we may still need a kernel patch to pass --btf_features=all,reproducible_build: https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?h=v1.27&id=43bd3efa85656565129063cdd6dd7499e44a7867
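
For reference, a minimal sketch of what the kernel-side change could look like in scripts/Makefile.btf; the exact flag list and version gating differ across kernel releases, so treat the surrounding lines as assumptions rather than the final patch:

```diff
--- a/scripts/Makefile.btf
+++ b/scripts/Makefile.btf
@@
-pahole-flags-y := -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func
+pahole-flags-y := -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,reproducible_build
```

Per the pahole commit linked above, reproducible_build makes the multi-threaded encoder emit BTF in a deterministic order, so -j would not have to be dropped.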

Member

this could be upstreamed

Contributor (Author)

I will test it asap and send it to LKML if it works.

Contributor (Author)

Added the reproducible_build flag to the pahole params, although I don't like how that addition is done; that file is a beautiful soup and needs better management.

@jepio (Member) commented Sep 10, 2024

Things like this make me want to hold off on the upgrade to 6.10:
"[regression] significant delays when secureboot is enabled since 6.10" https://lore.kernel.org/lkml/92fbcc4c252ec9070d71a6c7d4f1d196ec67eeb0.camel@huaweicloud.com/T/#mb17f32470541d54f7ee45987d510aa45b7557969

It takes a couple of minor releases on a new stable branch before it is ready to make its way into Flatcar.

@ader1990 ader1990 marked this pull request as draft September 10, 2024 10:36
@ader1990 (Contributor, Author)

> Things like this make me want to hold off on the upgrade to 6.10: "[regression] significant delays when secureboot is enabled since 6.10" https://lore.kernel.org/lkml/92fbcc4c252ec9070d71a6c7d4f1d196ec67eeb0.camel@huaweicloud.com/T/#mb17f32470541d54f7ee45987d510aa45b7557969
>
> It takes a couple of minor releases on a new stable branch before it is ready to make its way into Flatcar.

Adding the blocker bug here: https://bugzilla.kernel.org/show_bug.cgi?id=219229

A possible resolution from the bug discussion is the following kernel config change:

CONFIG_TCG_TPM2_HMAC=n

@ader1990 (Contributor, Author)

The CONFIG_TCG_TPM2_HMAC feature was introduced in 6.10 as an extra security layer: https://github.com/torvalds/linux/blob/master/drivers/char/tpm/Kconfig#L37
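
For context, a quick way to see whether an image is affected by the delay (both commands are standard systemd/util-linux tooling; nothing here is Flatcar-specific):

```sh
# Compare the boot-time breakdown with CONFIG_TCG_TPM2_HMAC enabled vs.
# disabled; the reported regression shows up as extra kernel/initrd time.
systemd-analyze time
# TPM driver messages from the current boot; the HMAC session setup added
# in 6.10 happens in the tpm driver during early boot.
dmesg | grep -i tpm
```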

@ader1990 (Contributor, Author)

Managed to get the ARM64 image built, but the AMD64 image fails at the initrd/GRUB stage with the error cpio: premature end of file.

Full error below:

2024-09-11T13:33:24.1344201Z INFO    grub_install.sh: Installing GRUB x86_64-xen in flatcar_production_image.bin
2024-09-11T13:33:24.1537425Z INFO    grub_install.sh: Compressing modules in flatcar/grub/x86_64-xen
2024-09-11T13:33:25.2833839Z INFO    grub_install.sh: Generating flatcar/grub/x86_64-xen/load.cfg
2024-09-11T13:33:25.3866147Z INFO    grub_install.sh: Generating flatcar/grub/x86_64-xen/core.elf
2024-09-11T13:33:25.4519108Z INFO    grub_install.sh: Installing default x86_64 Xen bootloader.
2024-09-11T13:33:25.5266195Z INFO    grub_install.sh: Elapsed time (grub_install.sh): 0m2s
2024-09-11T13:33:25.5754372Z INFO    build_image: Generating flatcar_production_image_pcr_policy.zip
2024-09-11T13:33:25.8790423Z INFO    build_image: Writing flatcar_production_image_contents.txt
2024-09-11T13:33:26.7383193Z INFO    build_image: Writing flatcar_production_image_contents_wtd.txt
2024-09-11T13:33:26.9908326Z cpio: premature end of file
2024-09-11T13:33:26.9914934Z rmdir: failed to remove '/home/sdk/trunk/src/scripts/artifacts/amd64-usr/developer-4089.0.0+nightly-20240910-2100-12-g6dd0a5b3f7-a1/tmp_initrd_contents/rootfs-0': Directory not empty
2024-09-11T13:33:27.0063435Z ERROR   build_image: script called: build_image '--board=amd64-usr' '--group=developer' '--output_root=/home/sdk/trunk/src/scripts/artifacts' 'prodtar' 'container' 'sysext'
2024-09-11T13:33:27.0069179Z ERROR   build_image: Backtrace:  (most recent call is last)
2024-09-11T13:33:27.0086074Z ERROR   build_image:   file build_image, line 173, called: create_prod_image 'flatcar_production_image.bin' 'base' 'developer' 'coreos-base/coreos' 'containerd-flatcar:app-containers/containerd,docker-flatcar:app-containers/docker&app-containers/docker-cli&app-containers/docker-buildx'
2024-09-11T13:33:27.0103055Z ERROR   build_image:   file prod_image_util.sh, line 169, called: finish_image 'flatcar_production_image.bin' 'base' '/home/sdk/trunk/src/scripts/artifacts/amd64-usr/developer-4089.0.0+nightly-20240910-2100-12-g6dd0a5b3f7-a1/rootfs' 'flatcar_production_image_contents.txt' 'flatcar_production_image_contents_wtd.txt' 'flatcar_production_image.vmlinuz' 'flatcar_production_image_pcr_policy.zip' 'flatcar_production_image.grub' 'flatcar_production_image.shim' 'flatcar_production_image_kernel_config.txt' 'flatcar_production_image_initrd_contents.txt' 'flatcar_production_image_initrd_contents_wtd.txt' 'flatcar_production_image_disk_usage.txt'
2024-09-11T13:33:27.0112878Z ERROR   build_image:   file build_image_util.sh, line 903, called: die_err_trap '"${BUILD_LIBRARY_DIR}/extract-initramfs-from-vmlinuz.sh" "${root_fs_dir}/boot/flatcar/vmlinuz-a" "${BUILD_DIR}/tmp_initrd_contents"' '1'
2024-09-11T13:33:27.0118365Z ERROR   build_image: 
2024-09-11T13:33:27.0124923Z ERROR   build_image: Command failed:
2024-09-11T13:33:27.0132629Z ERROR   build_image:   Command '"${BUILD_LIBRARY_DIR}/extract-initramfs-from-vmlinuz.sh" "${root_fs_dir}/boot/flatcar/vmlinuz-a" "${BUILD_DIR}/tmp_initrd_contents"' exited with nonzero code: 1

@ader1990 (Contributor, Author)

Successful build for AMD64:

 uname -a
Linux localhost 6.10.9-flatcar #1 SMP PREEMPT_DYNAMIC Wed Sep 11 17:33:15 -00 2024 x86_64 Intel(R) Xeon(R) Gold 6134 CPU @ 3.20GHz GenuineIntel GNU/Linux
root@localhost ~ # cat /etc/os-release
NAME="Flatcar Container Linux by Kinvolk"
ID=flatcar
ID_LIKE=coreos
VERSION=4089.0.0+nightly-20240910-2100-14-g5595c96aa4
VERSION_ID=4089.0.0
BUILD_ID=nightly-20240910-2100-14-g5595c96aa4
SYSEXT_LEVEL=1.0
PRETTY_NAME="Flatcar Container Linux by Kinvolk 4089.0.0+nightly-20240910-2100-14-g5595c96aa4 (Oklo)"
ANSI_COLOR="38;5;75"
HOME_URL="https://flatcar.org/"
BUG_REPORT_URL="https://issues.flatcar.org"
FLATCAR_BOARD="amd64-usr"
CPE_NAME="cpe:2.3:o:flatcar-linux:flatcar_linux:4089.0.0+nightly-20240910-2100-14-g5595c96aa4:*:*:*:*:*:*:*"

@ader1990 (Contributor, Author)

The amd64 bpf.execsnoop mantle test should be fixed by a new iovisor/bcc image (iovisor/bcc@5d2ef17). I triggered an image update: https://github.com/flatcar/mantle/actions/runs/10832754186/job/30057878029.

@ader1990 (Contributor, Author)

@t-lo I observed that starting with Linux kernel 6.10 one of the Hyper-V daemon binaries has a new name - see torvalds/linux@82b0945. Should we keep the same systemd unit name, though?

I wonder how https://github.com/microsoft/azurelinux will handle it (I have not seen any patch yet).

I am oscillating between 4ad039e and changing the name in all places.

@t-lo (Member) commented Sep 17, 2024

> @t-lo I observed that starting with Linux kernel 6.10 one of the Hyper-V daemon binaries has a new name - see torvalds/linux@82b0945. Should we keep the same systemd unit name, though?
>
> I wonder how https://github.com/microsoft/azurelinux will handle it (I have not seen any patch yet).
>
> I am oscillating between 4ad039e and changing the name in all places.

I think we should rename the systemd service to prevent confusion down the road.

@ader1990 (Contributor, Author)

> @t-lo I observed that starting with Linux kernel 6.10 one of the Hyper-V daemon binaries has a new name - see torvalds/linux@82b0945. Should we keep the same systemd unit name, though?
> I wonder how https://github.com/microsoft/azurelinux will handle it (I have not seen any patch yet).
> I am oscillating between 4ad039e and changing the name in all places.
>
> I think we should rename the systemd service to prevent confusion down the road.

The thing is that the binaries do the same thing and have the same interface, but internally they have a different implementation (uio_hv_generic). The weird part is that the old implementation is still present, but its build is disabled.

I will add a new service definition for the new version (as it also has a different device path trigger), to keep things separate; a sketch of what that could look like follows.
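
For illustration only; the unit name, binary path, device condition, and the -n flag below are assumptions based on the kernel rename, not the final Flatcar unit:

```ini
# hv_fcopy_uio_daemon.service (hypothetical sketch)
[Unit]
Description=Hyper-V FCOPY UIO daemon
# The new implementation binds a uio_hv_generic device instead of the old
# /dev/vmbus/hv_fcopy node; the exact trigger path here is an assumption.
ConditionPathExists=/sys/bus/vmbus/drivers/uio_hv_generic

[Service]
# -n keeps the daemon in the foreground, mirroring the other hv daemons.
ExecStart=/usr/sbin/hv_fcopy_uio_daemon -n

[Install]
WantedBy=multi-user.target
```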

@ader1990 (Contributor, Author)

The /boot partition is very close to a critical level: 49% is already used, leaving only around 1.5 MB free to use:

/dev/vda1          129039    62852     66187  49% /boot
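
A quick way to keep an eye on this on a booted image (plain coreutils; the vmlinuz path matches the one in the build error above):

```sh
df -k /boot                      # the numbers quoted above are 1K blocks
du -sk /boot/flatcar/vmlinuz-*   # size of each kernel slot that has to fit
```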

@ader1990 (Contributor, Author)

Note: on AMD64, build_library/extract-initramfs-from-vmlinuz.sh fails on vmlinuz-a because the script now finds the corrupted CPIO archive first. More debugging is needed to understand why this happens in the first place (what changed upstream).
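
For background, a minimal sketch of the kind of scan such a script performs: find every cpio "newc" magic (070701) in the decompressed image and try to unpack from each offset; a corrupted leading archive then fails before the real one is reached. This is an illustration under those assumptions, not the actual build_library script:

```sh
#!/bin/bash
# Hypothetical scan: attempt a cpio extraction at every "newc" magic offset.
set -euo pipefail
img="$1"   # decompressed kernel image / initramfs blob
out="$2"   # output directory
for off in $(grep -abo '070701' "$img" | cut -d: -f1); do
    dir="$out/rootfs-$off"
    mkdir -p "$dir"
    # tail -c is 1-indexed, grep -b offsets are 0-indexed, hence off+1.
    if tail -c +"$((off + 1))" "$img" | cpio --quiet -id -D "$dir" 2>/dev/null; then
        echo "valid cpio archive at offset $off"
    else
        echo "cpio failed at offset $off (e.g. premature end of file)"
    fi
done
```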

@ader1990 ader1990 self-assigned this Sep 20, 2024
@ader1990 ader1990 changed the title Upgrade Linux kernel from 6.6 to 6.10 Upgrade Linux kernel from 6.6 to 6.12 Oct 28, 2024
@@ -17,7 +17,7 @@ if [[ -z ${NUM_JOBS} ]] || [[ ${NUM_JOBS} -eq 0 ]]; then
 NUM_JOBS=$(grep -c "^processor" /proc/cpuinfo)
 fi
 # Ensure that any sub scripts we invoke get the max proc count.
-export NUM_JOBS
+export NUM_JOBS=350
Contributor (Author)

Needs removal after testing.

@@ -900,7 +900,8 @@ EOF
 write_contents_with_technical_details "${root_fs_dir}" "${BUILD_DIR}/${image_contents_wtd}"

 if [[ -n "${image_initrd_contents}" ]] || [[ -n "${image_initrd_contents_wtd}" ]]; then
 "${BUILD_LIBRARY_DIR}/extract-initramfs-from-vmlinuz.sh" "${root_fs_dir}/boot/flatcar/vmlinuz-a" "${BUILD_DIR}/tmp_initrd_contents"
+echo ">>>>DEBUG<<<${root_fs_dir}/boot/flatcar/vmlinuz-a"
Contributor (Author)

Needs removal.

@ader1990 (Contributor, Author)

A bug report and fix information for the arm64 cross-build of the Hyper-V (LIS) daemons were sent to the mailing list: https://lore.kernel.org/linux-hyperv/PR3PR09MB54119DB2FD76977C62D8DD6AB04D2@PR3PR09MB5411.eurprd09.prod.outlook.com/T/#u

@ader1990 (Contributor, Author)

The latest OpenZFS release, 2.2.6, is not compatible with Linux kernel 6.11; the releases that are compatible with 6.11 are still at the RC stage: https://github.com/openzfs/zfs/releases.

The upgrade to 6.11 is blocked until OpenZFS release 2.3.0 is available.
