Skip to content

manifests/fedora-coreos-base: apply dracut fix in postcript #3508

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: testing-devel
Choose a base branch
from

Conversation

jcapiitao
Copy link
Member

We are awaiting some dracut patches to land in Fedora mirror for [1]. We could wait for it, but we'd need it also for older rhcos branches and those branches won't have the new dracut. So we decided to carry part of the dracut-ng patch [2] in a post-script. This will be backported in older rhcos branches, and will be reverted on testing-devel once new dracut 107 available in mirror.

[1] coreos/fedora-coreos-tracker#1937
[2] dracut-ng/dracut-ng#1306

@jcapiitao jcapiitao requested review from dustymabe and jlebon May 16, 2025 17:10
Copy link
Member

@dustymabe dustymabe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add review here in a bit.

Copy link
Member

@dustymabe dustymabe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the long wait for the review here. I wanted to put some of the groundwork in place for you to be able to implement my suggestion.

This patch is something we want to carry in FCOS and RHCOS and is also something that we foresee we will drop progressively over time as it enters our various streams (i.e. we'll drop it from rawhide first, then next-devel/next, eventually we'll drop it from el10 and maybe never drop it from el9??

Anyway we need to set ourselves up here for success. Let's move this into a manifest that is shared with el9/10.

I should have made this easier with #3511 If you don't mind review that and test it out (both an FCOS build and downstream el9/10 build would be useful).

@jcapiitao jcapiitao force-pushed the fix_mpath_postscript branch from eba1a79 to 9737c5c Compare May 19, 2025 15:02
@jcapiitao
Copy link
Member Author

Sorry for the long wait for the review here. I wanted to put some of the groundwork in place for you to be able to implement my suggestion.

This patch is something we want to carry in FCOS and RHCOS and is also something that we foresee we will drop progressively over time as it enters our various streams (i.e. we'll drop it from rawhide first, then next-devel/next, eventually we'll drop it from el10 and maybe never drop it from el9??

Anyway we need to set ourselves up here for success. Let's move this into a manifest that is shared with el9/10.

I should have made this easier with #3511 If you don't mind review that and test it out (both an FCOS build and downstream el9/10 build would be useful).

Understood, I'll rebase my PR on top of #3511 and put the post-script in shared-el.yaml manifest

@dustymabe dustymabe changed the title NO-JIRA: manifests/fedora-coreos-base: apply dracut fix in postcript manifests/fedora-coreos-base: apply dracut fix in postcript May 19, 2025
@jcapiitao jcapiitao force-pushed the fix_mpath_postscript branch from 9737c5c to 2e42d95 Compare May 20, 2025 10:07
We are awaiting some dracut patches to land in Fedora mirror for [1].
We could wait for it, but we'd need it also for older rhcos branches
and those branches won't have the new dracut. So we decided to
carry part of the dracut-ng patch [2] in a post-script.
This will be backported in older rhcos branches, and will be reverted
on testing-devel once new dracut 107 available in mirror.

[1] coreos/fedora-coreos-tracker#1937
[2] dracut-ng/dracut-ng#1306
@jcapiitao jcapiitao force-pushed the fix_mpath_postscript branch from 2e42d95 to 9d804c9 Compare May 20, 2025 10:09
@jcapiitao jcapiitao changed the title manifests/fedora-coreos-base: apply dracut fix in postcript manifests/shared-el: apply dracut fix in postcript May 20, 2025
@dustymabe
Copy link
Member

Can we add some text to the body of the 2nd commit. A nice description of why we can re-enable this test would be useful.

@dustymabe dustymabe changed the title manifests/shared-el: apply dracut fix in postcript manifests/fedora-coreos-base: apply dracut fix in postcript May 20, 2025
@dustymabe
Copy link
Member

CI failures here:

[2025-05-20T13:14:34.721Z] --- FAIL: multipath.day2 (84.86s)
[2025-05-20T13:14:34.721Z]         multipath.go:155: mount /sysroot has non-multipath source /dev/mapper/0x150a7519b1eaaa71p4
[2025-05-20T13:15:01.175Z] --- FAIL: multipath.day1 (36.71s)
[2025-05-20T13:15:01.175Z]         multipath.go:155: mount /sysroot has non-multipath source /dev/mapper/0x79b7cca15d2f4a68p4

which is interesting. @jlebon would you have expected multipath regressions as a result of dracut-ng/dracut-ng#1306 ?

@jlebon
Copy link
Member

jlebon commented May 20, 2025

CI failures here:

[2025-05-20T13:14:34.721Z] --- FAIL: multipath.day2 (84.86s)
[2025-05-20T13:14:34.721Z]         multipath.go:155: mount /sysroot has non-multipath source /dev/mapper/0x150a7519b1eaaa71p4
[2025-05-20T13:15:01.175Z] --- FAIL: multipath.day1 (36.71s)
[2025-05-20T13:15:01.175Z]         multipath.go:155: mount /sysroot has non-multipath source /dev/mapper/0x79b7cca15d2f4a68p4

which is interesting. @jlebon would you have expected multipath regressions as a result of dracut-ng/dracut-ng#1306 ?

I think it's more that the test needs to be adapted so it doesn't verify that we're on multipath based on whether the device is /dev/mapper/mpath*. Instead do something like this.

Aside: a lot (probably all) of those multipath tests could nowadays be moved to external tests.

@dustymabe
Copy link
Member

I think it's more that the test needs to be adapted so it doesn't verify that we're on multipath based on whether the device is /dev/mapper/mpath*. Instead do something like this.

That seems reasonable, but is it possible here we're going to break people who somehow made assumptions based on what the device names were in the past?

@dustymabe
Copy link
Member

Aside: a lot (probably all) of those multipath tests could nowadays be moved to external tests.

I would love that

This test can be re-enable as we are applying part of the dracut-ng
patch [1] in a postprocess script. This script is a workaround
solution while awaiting the bump of dracut RPM NVR including [1] in
Fedora, and will be removed afterward.

[1] dracut-ng/dracut-ng#1306
@jcapiitao
Copy link
Member Author

Can we add some text to the body of the 2nd commit. A nice description of why we can re-enable this test would be useful.

Done in last patch set.

which is interesting. @jlebon would you have expected multipath regressions as a result of dracut-ng/dracut-ng#1306 ?

I think it's more that the test needs to be adapted so it doesn't verify that we're on multipath based on whether the device is /dev/mapper/mpath*. Instead do something like this.

Aside: a lot (probably all) of those multipath tests could nowadays be moved to external tests.

I proposed a PR to do so coreos/coreos-assembler#4113 (i.e still in go tests though).
I added a task to move them to external test in my TODO list.

With my cosa PR, it's passing the verification step, but still fails while rebooting:

Timed out waiting for device dev-disk-by\x2dlabel-dm\x2dmpath\x2dboot.device - /dev/disk/by-label/dm-mpath-boot.
Warning: /dev/disk/by-label/dm-mpath-root does not exist

@jcapiitao jcapiitao force-pushed the fix_mpath_postscript branch from 4344bd0 to 0f9e91c Compare May 21, 2025 10:59
@dustymabe
Copy link
Member

@jlebon with user_friendly_names=no we are going to need to adapt all of our tests to not hardcode in mpatha, correct?

@jlebon
Copy link
Member

jlebon commented May 21, 2025

I think it's more that the test needs to be adapted so it doesn't verify that we're on multipath based on whether the device is /dev/mapper/mpath*. Instead do something like this.

That seems reasonable, but is it possible here we're going to break people who somehow made assumptions based on what the device names were in the past?

The problem is that those names were not stable to begin with and so not recommended (in favour of the WWID, but the problem with that is that it's node-specific). I guess it is reliable if you only have a single multipathed device (in which case you're guaranteed I think to get /dev/mapper/mpatha) and don't plan to ever connect another multipathed device.

That said, a really important factor is that upgrading nodes will not get this change because the /etc/multipath.conf generated here is propagated into the real root. So any existing nodes that did hardcode /dev/mapper/mpath* paths in some settings will "keep working" (the symlinks will exist, but this changes nothing about them being unreliable).

That said, there's definitely an argument to be made for not backporting this change because it's too significant to change mid-release (even if it would only take effect with new boot media). I think the workaround if you don't want friendly names then on those older releases is to add an /etc/multipath.conf in your Ignition config equivalent to what mpathconf --enable --user_friendly_names n would output (and thus neuter the propagation logic). We should sanity-check this flow though before recommending it to users.

cc @bmarzins in case I got some details wrong.

@jlebon
Copy link
Member

jlebon commented May 21, 2025

@jlebon with user_friendly_names=no we are going to need to adapt all of our tests to not hardcode in mpatha, correct?

Yes, indeed. Which... I think those previously fell under:

I guess it is reliable if you only have a single multipathed device (in which case you're guaranteed I think to get /dev/mapper/mpatha) and don't plan to ever connect another multipathed device.

But at the same time, in realistic scenarios you very often do have multiple multpath devices and our provisioning flows should be able to handle those cases without having to jump through hoops. I.e. those should not be treated in a special way.

Anyway, arguing with myself a bit here but I think this is still the right change to make overall even though it might/likely will cause discomfort.

@dustymabe
Copy link
Member

so that means we need to update all the uses of mpatha in

  • ext.config.root-reprovision.luks.multipath from f-c-c
  • mantle/kola/tests/misc/multipath.go from COSA

Is that correct?

@dustymabe
Copy link
Member

dustymabe commented May 21, 2025

That said, there's definitely an argument to be made for not backporting this change because it's too significant to change mid-release (even if it would only take effect with new boot media).

So should we kill #3506 (the backport to 4.18 that we were holding up to pull in this patch) then?

Note that what we land here will slip straight into rhel-9.6 stream which will affect OCP 4.19.

@bmarzins
Copy link

I think it's more that the test needs to be adapted so it doesn't verify that we're on multipath based on whether the device is /dev/mapper/mpath*. Instead do something like this.

That seems reasonable, but is it possible here we're going to break people who somehow made assumptions based on what the device names were in the past?

The problem is that those names were not stable to begin with and so not recommended (in favour of the WWID, but the problem with that is that it's node-specific). I guess it is reliable if you only have a single multipathed device (in which case you're guaranteed I think to get /dev/mapper/mpatha) and don't plan to ever connect another multipathed device.

That depends on what you mean when you say "stable". Multipath stores the mapping between user_friendly_name and WWID in /etc/multipath/bindings, and once there is a mapping there, multipath will always use it unless there already is a device with that name. That's actually the origin of the problem that dracut-ng/dracut-ng#1306 is trying to fix. Since there is no /etc/multipath/bindings in the initramfs, the devices get their user_friendly_name there by the order that they are discovered. However once you switch-root, multipath can't always switch a device to the user_friendly_name it is supposed to have, since it might be in use by another device. In this case it will fail back to using the WWID as the name.

As long as you only have one multipath device, it will always be named mpatha. If you later add another device, the original device will still always be named mpatha, unless the other device gets named mpatha in the initramfs, in which case the original device will be named by its WWID for that boot.

That said, a really important factor is that upgrading nodes will not get this change because the /etc/multipath.conf generated here is propagated into the real root. So any existing nodes that did hardcode /dev/mapper/mpath* paths in some settings will "keep working" (the symlinks will exist, but this changes nothing about them being unreliable).

That said, there's definitely an argument to be made for not backporting this change because it's too significant to change mid-release (even if it would only take effect with new boot media). I think the workaround if you don't want friendly names then on those older releases is to add an /etc/multipath.conf in your Ignition config equivalent to what mpathconf --enable --user_friendly_names n would output (and thus neuter the propagation logic). We should sanity-check this flow though before recommending it to users.

Since the multipath user_friendly_names are supposed to be persistent (in RHEL, we tell users that they must remake their initramfs whenever a new multipath device is added, to propagate /etc/multipath/bindings to the initramfs and avoid the name mismatch issue) users may well use that specific name in their settings. As long as you don't change /etc/multipath.conf on existing nodes, you should be safe. But having multipath devices suddenly named differently on new nodes may well confuse users, so I'm not sure that you should make this change mid-release.

@jlebon
Copy link
Member

jlebon commented May 26, 2025

@bmarzins Thanks for the additional context!

We actually do have facilities to inject back the bindings into the initramfs as well. Though I'm not sure of the payoff of trying to wire that through in all the right places. It just seems to me like user friendly names should be avoided in general anyway.

That said, a really important factor is that upgrading nodes will not get this change because the /etc/multipath.conf generated here is propagated into the real root. So any existing nodes that did hardcode /dev/mapper/mpath* paths in some settings will "keep working" (the symlinks will exist, but this changes nothing about them being unreliable).

In discussing this with @dustymabe, we realized it's a little more nuanced than that. Upgrading nodes will indeed have a carried over /etc/multipath.conf, but not in the initramfs. Which means that any node using root=/dev/mapper/mpatha4 for example will fail after upgrade because the /etc/multipath.conf generated in the initramfs will now turn off those symlinks.

However, before this latest push to drop the root= karg requirement, it's been pretty well documented that you must use root=/dev/disk/by-label/dm-mpath-root (see e.g. https://github.com/openshift/os/blob/master/docs/faq.md#q-does-rhcos-support-multipath-on-the-primary-disk and https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html/installing_on_bare_metal/user-provisioned-infrastructure#rhcos-enabling-multipath_installing-bare-metal).

So in practice, there should be no or few systems using root=/dev/mapper/mpatha4. The fix there is simple of course if you do hit this: change your root karg, or better, use its UUID.

So should we kill #3506 (the backport to 4.18 that we were holding up to pull in this patch) then?

Yeah, let's not backport it.

Note that what we land here will slip straight into rhel-9.6 stream which will affect OCP 4.19.

Yeah, I think doing this in 4.19+ is fine. We're fast-tracking a patch which will land in 9.6 anyway eventually.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants