
CORENET-5972: Consume openvswitch-ipsec systemd service for OVN IPsec deployment #2662


Open
wants to merge 2 commits into master from use-openvswitch-ipsec-systemd-service

Conversation

pperiyasamy
Member

@pperiyasamy pperiyasamy commented Mar 10, 2025

The IPsec machine config extension now installs the openvswitch3.5-ipsec package on the node, so this PR consumes that package to configure, enable, and start the openvswitch-ipsec systemd service, which moves the ovs-monitor-ipsec process from the container to the host.

It fixes the following issues.

  1. There is no longer a need to patch the auto=start parameter into each IPsec connection, which was introduced by the PR OCPBUGS-52280, SDN-5330: Add ipsec connect wait service machine-config-operator#4854.
  2. ovs-monitor-ipsec now solves the authentication problem seen while using the default crypto policies loaded by libreswan.

In order to consume the openvswitch-ipsec systemd service, this PR does the following:

  • Stop spawning ovs-monitor-ipsec as a foreground process in the ovn-ipsec container. Instead, set up the required IPsec configuration parameters in the /etc/sysconfig/openvswitch file, then enable and start the openvswitch-ipsec service on the host. This is done when the ovn-ipsec-host pod comes up for the first time; on pod restarts, the container only checks that the openvswitch-ipsec service is running on the host, otherwise it exits with an error (see the sketch after this list).

  • Enable and start the openvswitch-ipsec systemd service from the IPsec machine configs when IPsec is already configured for east-west traffic.

  • Keep the ovn-ipsec container running and redirect /var/log/openvswitch/ovs-monitor-ipsec.log to the ovn-ipsec container's stdout.

  • There is no need to do IPsec state and policy cleanup in the ovn-ipsec-cleanup container when OVN IPsec is handled via the openvswitch-ipsec systemd service.

  • During an OCP upgrade, the new ipsec OS extension takes a while to deploy the openvswitch3.5-ipsec package, so the ovn-ipsec-host daemonset may be rendered before the package is available; that scenario is handled by running ovs-monitor-ipsec in the container. This PR therefore also covers the transition phase while the process moves from container to host.

  • The ovn-keys init container configures OVS with the IPsec certificate paths. Because the ovs-monitor-ipsec process now runs on the host, the container stores the certificates under the same host directory path and configures OVS with those paths.
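To make the first bullet concrete, here is a minimal sketch of what the ovn-ipsec container flow could look like with these changes. It is a hedged illustration only: the sysconfig option name, the chroot-based access to the host's systemd, the first-start detection, and the fallback helper are assumptions made for this sketch, not the PR's actual script.

```bash
#!/bin/bash
# Hedged sketch of the ovn-ipsec container flow described above. Paths,
# option names, the chroot-to-host trick and the "first start" detection
# are illustrative assumptions, not the PR's literal script.
set -euo pipefail

HOST_ROOT=/proc/1/root                              # assumes hostPID / host rootfs access
SYSCONFIG="${HOST_ROOT}/etc/sysconfig/openvswitch"

host_systemctl() { chroot "${HOST_ROOT}" systemctl "$@"; }

run_in_container_ovs_monitor_ipsec() {
  # Placeholder for the pre-existing container command line (not reproduced here).
  echo "falling back to running ovs-monitor-ipsec inside the container"
}

# Transition phase: the ipsec OS extension may not have delivered the
# openvswitch3.5-ipsec package yet, so the host unit may be missing.
if ! host_systemctl cat openvswitch-ipsec.service >/dev/null 2>&1; then
  run_in_container_ovs_monitor_ipsec
  exit 0
fi

if grep -q '^OVS_IPSEC_IKE_DAEMON=' "${SYSCONFIG}" 2>/dev/null; then
  # Pod restart case: the host was configured earlier, so the service must
  # already be running; otherwise exit from the container with an error.
  host_systemctl is-active --quiet openvswitch-ipsec.service || {
    echo "openvswitch-ipsec is not running on the host" >&2
    exit 1
  }
else
  # First start: persist the monitor's configuration for the host service,
  # then enable and start it.
  echo 'OVS_IPSEC_IKE_DAEMON=libreswan' >> "${SYSCONFIG}"   # illustrative option name
  host_systemctl enable --now openvswitch-ipsec.service
fi

# Keep the container alive and surface the host-side monitor log on stdout.
exec tail -F "${HOST_ROOT}/var/log/openvswitch/ovs-monitor-ipsec.log"
```

The point of the sketch is the ordering rather than the exact commands: configure once, hand the long-running process over to the host service, and afterwards keep the container only for health checking and log streaming.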

/assign @igsilya

@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-aws-ovn-ipsec-upgrade openshift/machine-config-operator#4854 openshift/os#1718 openshift/machine-config-operator#4878 openshift/ovn-kubernetes#2472

2 similar comments
@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-aws-ovn-ipsec-upgrade openshift/machine-config-operator#4854 openshift/os#1718 openshift/machine-config-operator#4878 openshift/ovn-kubernetes#2472

@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-aws-ovn-ipsec-upgrade openshift/machine-config-operator#4854 openshift/os#1718 openshift/machine-config-operator#4878 openshift/ovn-kubernetes#2472

@pperiyasamy
Member Author

The testwith run doesn't pick up the openshift/os#1718 changes; the ipsec OS extension still does not install the openvswitch-ipsec package, so we must get openshift/os#1718 landed first before testing the CNO changes.

@igsilya

igsilya commented Mar 11, 2025

@pperiyasamy I agree, we should get OVS 3.5 into rhcos / ovn-k / microshift first. We can install openvswitch3.5-ipsec at the same time; it should not be a problem since the service stays disabled until CNO activates it. We need OVS 3.5 either way for other purposes (RHEL 10 support, for example). Once we have OVS 3.5 and the openvswitch-ipsec service we can more easily test CNO and other changes.
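For reference, a quick way to check that state on a node would be something like the following (a small sketch; the package and service names are the ones used in this thread, and the expected outputs are assumptions about the pre-activation state):

```bash
# Verify that the package from the OS extension is installed and that the
# host service stays disabled/inactive until CNO explicitly activates it.
rpm -q openvswitch3.5-ipsec libreswan
systemctl is-enabled openvswitch-ipsec.service   # expected: disabled
systemctl is-active openvswitch-ipsec.service    # expected: inactive
```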

@pperiyasamy pperiyasamy changed the title Consume openvswitch-ipsec systemd service for OVN IPsec deployment SDN-5330: Consume openvswitch-ipsec systemd service for OVN IPsec deployment Mar 11, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 11, 2025
@openshift-ci-robot
Contributor

openshift-ci-robot commented Mar 11, 2025

@pperiyasamy: This pull request references SDN-5330 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.

In response to this:

The ovn-ipsec-host daemonset pod currently spins up the ovs-monitor-ipsec process to configure IPsec connections with the peer nodes. In a node/service restart scenario this means the IPsec connections to existing nodes are established only some time after kubelet has started, and workloads scheduled on the node in the meantime hit traffic drops because the IPsec connections between the nodes are not yet available. This makes the IPsec jobs in CI very unstable and the monitor jobs consistently fail during IPsec upgrades.

The FDP story (https://issues.redhat.com/browse/FDP-1051) adds an openvswitch-ipsec systemd service (which runs ovs-monitor-ipsec) with the required configurable parameters; it is available with OVS 3.5. So this commit does the following.

  1. Stop spawning ovs-monitor-ipsec as a foreground process in the ovn-ipsec container. Instead, set up the required IPsec configuration parameters in the /etc/sysconfig/openvswitch file, then enable and start the openvswitch-ipsec service on the host. This is done when the ovn-ipsec-host pod comes up for the first time; on pod restarts, the container only checks that the openvswitch-ipsec service is running on the host, otherwise it exits with an error. There will be an update via the MCO PR OCPBUGS-52280, SDN-5330: Add ipsec connect wait service machine-config-operator#4854 to make sure the openvswitch-ipsec service is started before kubelet.

  2. Keep the ovn-ipsec container running and redirect /var/log/openvswitch/ovs-monitor-ipsec.log to the ovn-ipsec container's stdout.

  3. There is no need for the ovn-ipsec-cleanup container anymore with the openvswitch-ipsec service, as the service handles the OVN IPsec state appropriately.

Depends on: openshift/os#1718.

/assign @igsilya

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@pperiyasamy pperiyasamy force-pushed the use-openvswitch-ipsec-systemd-service branch from 51fb402 to 70f121e on March 17, 2025 12:22
@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-aws-ovn-ipsec-upgrade openshift/machine-config-operator#4854 openshift/os#1718 openshift/machine-config-operator#4878 openshift/ovn-kubernetes#2472

@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-aws-ovn-ipsec-upgrade openshift/os#1718 openshift/machine-config-operator#4878 openshift/ovn-kubernetes#2472

@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-aws-ovn-ipsec-upgrade openshift/ovn-kubernetes#2472 openshift/machine-config-operator#4878

@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-aws-ovn-ipsec-serial openshift/ovn-kubernetes#2472 openshift/machine-config-operator#4878

@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-aws-ovn-ipsec-upgrade openshift/ovn-kubernetes#2472 openshift/machine-config-operator#4878

@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-aws-ovn-ipsec-serial openshift/ovn-kubernetes#2472 openshift/machine-config-operator#4878

@pperiyasamy pperiyasamy force-pushed the use-openvswitch-ipsec-systemd-service branch from 45b106a to 555d31c on April 10, 2025 08:52
@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-aws-ovn-ipsec-upgrade openshift/ovn-kubernetes#2472 openshift/machine-config-operator#4878

@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-aws-ovn-ipsec-serial openshift/ovn-kubernetes#2472 openshift/machine-config-operator#4878

@pperiyasamy
Member Author

/assign @anuragthehatter @huiran0826

@pperiyasamy pperiyasamy force-pushed the use-openvswitch-ipsec-systemd-service branch from 555d31c to 5b0839a on April 14, 2025 10:04
@pperiyasamy pperiyasamy force-pushed the use-openvswitch-ipsec-systemd-service branch 3 times, most recently from 01be1b3 to ac1108b on April 15, 2025 08:11
@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-aws-ovn-ipsec-upgrade openshift/ovn-kubernetes#2472 openshift/machine-config-operator#4878

@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-aws-ovn-ipsec-serial openshift/ovn-kubernetes#2472 openshift/machine-config-operator#4878

@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-aws-ovn-ipsec-upgrade openshift/machine-config-operator#4878

@pperiyasamy
Member Author

/assign @jcaamano

@openshift-ci-robot
Contributor

@pperiyasamy: This pull request references SDN-5330 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.


@pperiyasamy pperiyasamy changed the title SDN-5330: Consume openvswitch-ipsec systemd service for OVN IPsec deployment CORENET-5972: Consume openvswitch-ipsec systemd service for OVN IPsec deployment Apr 30, 2025
@openshift-ci-robot
Contributor

openshift-ci-robot commented Apr 30, 2025

@pperiyasamy: This pull request references CORENET-5972 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.


@pperiyasamy pperiyasamy force-pushed the use-openvswitch-ipsec-systemd-service branch from ac1108b to fd6e0e3 on May 7, 2025 07:22
Contributor

openshift-ci bot commented May 7, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pperiyasamy
Once this PR has been reviewed and has the lgtm label, please ask for approval from jcaamano. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-aws-ovn-ipsec-upgrade openshift/machine-config-operator#4878

The ovn-ipsec-host daemonset pod currently spins up the ovs-monitor-ipsec process
to configure IPsec connections with the peer nodes. In a node/service restart
scenario this means the IPsec connections to existing nodes are established only
some time after kubelet has started, and workloads scheduled on the node in the
meantime hit traffic drops because the IPsec connections between the nodes are
not yet available. This makes the IPsec jobs in CI very unstable and the monitor
jobs consistently fail during IPsec upgrades.

The FDP story (https://issues.redhat.com/browse/FDP-1051) adds an openvswitch-ipsec
systemd service (which runs ovs-monitor-ipsec) with the required configurable
parameters; it is available with OVS 3.5. So this commit does the following.

1. Stop spawning ovs-monitor-ipsec as a foreground process in the ovn-ipsec
container. Instead, set up the required IPsec configuration parameters in the
/etc/sysconfig/openvswitch file, then enable and start the openvswitch-ipsec
service on the host. This is done when the ovn-ipsec-host pod comes up for the
first time; on pod restarts, the container only checks that the openvswitch-ipsec
service is running on the host, otherwise it exits from the container with an
error.

2. Enable and start the openvswitch-ipsec systemd service from the IPsec machine
configs when IPsec is already configured for east-west traffic.

3. Keep the ovn-ipsec container running and redirect /var/log/openvswitch/ovs-monitor-ipsec.log
to the ovn-ipsec container's stdout.

4. There is no need to do IPsec state and policy cleanup in the
ovn-ipsec-cleanup container when OVN IPsec is handled via the
openvswitch-ipsec systemd service.

5. During an OCP upgrade, the new ipsec OS extension takes a while to deploy
the openvswitch3.5-ipsec package, so the ovn-ipsec-host daemonset may be
rendered before the package is available; that scenario is handled by running
ovs-monitor-ipsec in the container. This commit therefore also covers the
transition phase while the process moves from container to host.

6. The ovn-keys init container configures OVS with the IPsec certificate paths.
Because the ovs-monitor-ipsec process now runs on the host, the container
stores the certificates under the same host directory path and configures OVS
with those paths.

Signed-off-by: Periyasamy Palanisamy <[email protected]>
Any change to the IPsec machine configs causes the machine configs to be
rolled out twice and the nodes to be rebooted twice; because of this, the
apiserver pod's containers exit an excessive number of times. This commit
therefore removes the changes from the IPsec machine configs and moves them
into the MCO wait-for-ipsec-connect.service.

Signed-off-by: Periyasamy Palanisamy <[email protected]>
@pperiyasamy pperiyasamy force-pushed the use-openvswitch-ipsec-systemd-service branch from fd6e0e3 to 4303e9b on May 8, 2025 08:45
@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-aws-ovn-ipsec-upgrade openshift/machine-config-operator#4878

Contributor

openshift-ci bot commented May 8, 2025

@pperiyasamy: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/4.19-upgrade-from-stable-4.18-e2e-azure-ovn-upgrade fd6e0e3 link false /test 4.19-upgrade-from-stable-4.18-e2e-azure-ovn-upgrade
ci/prow/4.19-upgrade-from-stable-4.18-e2e-aws-ovn-upgrade fd6e0e3 link false /test 4.19-upgrade-from-stable-4.18-e2e-aws-ovn-upgrade
ci/prow/4.19-upgrade-from-stable-4.18-e2e-gcp-ovn-upgrade fd6e0e3 link false /test 4.19-upgrade-from-stable-4.18-e2e-gcp-ovn-upgrade
ci/prow/security 4303e9b link false /test security
ci/prow/e2e-azure-ovn 4303e9b link false /test e2e-azure-ovn
ci/prow/e2e-aws-ovn-upgrade 4303e9b link true /test e2e-aws-ovn-upgrade
ci/prow/okd-scos-e2e-aws-ovn 4303e9b link false /test okd-scos-e2e-aws-ovn
ci/prow/4.20-upgrade-from-stable-4.19-e2e-azure-ovn-upgrade 4303e9b link false /test 4.20-upgrade-from-stable-4.19-e2e-azure-ovn-upgrade
ci/prow/e2e-gcp-ovn-techpreview 4303e9b link true /test e2e-gcp-ovn-techpreview
ci/prow/e2e-vsphere-ovn-dualstack-primaryv6 4303e9b link false /test e2e-vsphere-ovn-dualstack-primaryv6
ci/prow/e2e-aws-ovn-serial 4303e9b link false /test e2e-aws-ovn-serial
ci/prow/e2e-metal-ipi-ovn-ipv6-ipsec 4303e9b link true /test e2e-metal-ipi-ovn-ipv6-ipsec
ci/prow/e2e-metal-ipi-ovn-ipv6 4303e9b link true /test e2e-metal-ipi-ovn-ipv6
ci/prow/e2e-network-mtu-migration-ovn-ipv6 4303e9b link false /test e2e-network-mtu-migration-ovn-ipv6
ci/prow/e2e-gcp-ovn-upgrade 4303e9b link true /test e2e-gcp-ovn-upgrade
ci/prow/e2e-aws-hypershift-ovn-kubevirt 4303e9b link false /test e2e-aws-hypershift-ovn-kubevirt
ci/prow/4.20-upgrade-from-stable-4.19-e2e-aws-ovn-upgrade 4303e9b link false /test 4.20-upgrade-from-stable-4.19-e2e-aws-ovn-upgrade
ci/prow/hypershift-e2e-aks 4303e9b link true /test hypershift-e2e-aks

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@pperiyasamy
Member Author

In the latest CI runs the openvswitch3.5-ipsec package is not included/installed in the ipsec OS extension; we need to check why that is happening now.
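One way to confirm that from an affected node might be, for example (a sketch; it assumes the package is expected to come in through the ipsec extension):

```bash
# Check whether the ipsec extension made it into the deployed image and
# whether any openvswitch IPsec package actually got installed.
rpm-ostree status
rpm -qa | grep -i 'openvswitch.*ipsec'
```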

@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-aws-ovn-ipsec-serial openshift/machine-config-operator#4878

@pperiyasamy
Member Author

/testwith openshift/cluster-network-operator/master/e2e-aws-ovn-ipsec-upgrade openshift/machine-config-operator#4878

Labels
jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.
6 participants