Skip to content

This series is a re-rewrite of the image generator code and related manifests #375

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Feb 28, 2024

Conversation

bpradipt
Copy link
Contributor

@bpradipt bpradipt commented Feb 14, 2024

Closes https://issues.redhat.com/browse/KATA-2633

Current flow : Create podvm image -> create mcp, kata rpm etc -> create kata runtimeclass -> create kata-remote runtimeclass
New flow: create mcp, kata rpm etc -> create kata runtimeclass -> create podvm image -> create kata-remote runtimeclass

The image generation status events are posted as Kubernetes events for monitoring

How to check for events during image generation when creating kataconfig?

oc get events -n openshift-sandboxed-containers-operator --field-selector involvedObject.name=osc-podvm-image-creation

How to check for events during image deletion when deleting kataconfig?

oc get events -n openshift-sandboxed-containers-operator --field-selector involvedObject.name=osc-podvm-image-deletion

@bpradipt bpradipt marked this pull request as draft February 14, 2024 13:16
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 14, 2024
@openshift-ci openshift-ci bot requested review from littlejawa and snir911 February 14, 2024 13:20
@bpradipt bpradipt marked this pull request as ready for review February 14, 2024 13:34
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 14, 2024
@bpradipt
Copy link
Contributor Author

Catalog for testing: quay.io/bpradipt/openshift-sandboxed-containers-operator-catalog:v1.5.2

@openshift-ci openshift-ci bot requested a review from pmores February 14, 2024 13:35
@bpradipt
Copy link
Contributor Author

@vvoronko @snir911

@bpradipt
Copy link
Contributor Author

oc get events -n openshift-sandboxed-containers-operator --field-selector involvedObject.name=osc-podvm-image-creation 
LAST SEEN   TYPE     REASON                 OBJECT                         MESSAGE
153m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
155m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
153m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
155m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
155m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
155m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
155m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
155m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
155m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
155m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
148m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
155m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
155m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
155m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
155m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
155m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
155m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
148m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
155m        Normal   PodVMImageJobRunning   job/osc-podvm-image-creation   PodVM image job (osc-podvm-image-creation) is running
155m        Normal   SuccessfulCreate       job/osc-podvm-image-creation   Created pod: osc-podvm-image-creation-q9d76
141m        Normal   Completed              job/osc-podvm-image-creation   Job completed

@bpradipt bpradipt force-pushed the podvm-builder branch 3 times, most recently from 6fd6c68 to da4a26e Compare February 14, 2024 18:26
Copy link
Contributor

@snir911 snir911 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @bpradipt ! added some comments, do you have a build for this?
If @abhbaner have some cycles for pre-merge test it, it would be perfect (as he found the corner cases in the previous implementation)

@bpradipt
Copy link
Contributor Author

@snir911 catalog for testing - quay.io/bpradipt/openshift-sandboxed-containers-operator-catalog:v1.5.2

Copy link
Contributor

@snir911 snir911 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall, assuming pre-merge testing was ok, thanks!

@bpradipt bpradipt force-pushed the podvm-builder branch 4 times, most recently from 22672be to f71d64f Compare February 16, 2024 17:25
@vvoronko
Copy link

new AMI generation flow was pre-merge tested

@bpradipt bpradipt force-pushed the podvm-builder branch 3 times, most recently from 29f36d8 to f95c5c1 Compare February 21, 2024 17:23
@vvoronko
Copy link

I appreciate Pradipta's effort to produce quality code, after few iterations we managed to:

  • keep AMI ID at single place
  • keep creation and deletion pods (I don't think really, that single pod demands much OCP resources and always could be deleted manually by admin)
  • AMI job status provided via kataconfig status is visible in Console
    Overall IMHO this improves customer experience

status:
conditions:

  • lastTransitionTime: "2024-02-22T06:51:26Z"
    message: Creating Pod VM Image
    reason: PodVMImageJobRunning
    status: "True"
    type: InProgress
    kataNodes:
    installed:
    • vvoron415azr22-5hft5-worker-eastus1-8szq5
      nodeCount: 1
      readyNodeCount: 1
      runtimeClasses:
  • kata
    waitingForMcoToStart: false
    Screenshot from 2024-02-22 08-53-28

1. Fix reading of required attributes from secret and configmap
2. Refactor code for better readibility and modularity
3. Add event notification to indicate the status of the job
4. Improve job status evaluation
5. Convert image generator instance to singleton

Signed-off-by: Pradipta Banerjee <[email protected]>
The event "reason" code is used as the cache key.
The default time for suppressing the event having the same reason code
is 2 min.

Signed-off-by: Pradipta Banerjee <[email protected]>
Move the podvm image creation post kata installation. Otherwise
the kata installation remains stuck with no indication on what is
happening

Signed-off-by: Pradipta Banerjee <[email protected]>
@bpradipt
Copy link
Contributor Author

@gkurz @pmores @littlejawa @jensfr if you have time, it will be great to have your comments on this PR, especially on the following:

  1. New kataconfig status to indicate the pod vm image job creation/deletion
  2. I took the liberty to fix some staticcheck warnings and include those as separate commits in the same PR.

The latest code is pre-merge tested by @vvoronko Thanks a lot for his help.
Thanks to Snir for initial round of reviews and feedback.

@vvoronko
Copy link

@jensfr @bpradipt @snir911 Does this PR affect the default AMI image in downstream?

There are cloud provider specific handler scripts for image creation and
deletion.
Also better error handling and documented scripts

Signed-off-by: Pradipta Banerjee <[email protected]>
The controller-manager was gettign OOMkilled with
latest changes. Hence bumping up the memory limit

Signed-off-by: Pradipta Banerjee <[email protected]>
Consolidate all podvm image related constant in a single place

Signed-off-by: Pradipta Banerjee <[email protected]>
This can be used in openshift controller for updating
KataConfig CRD Status

Signed-off-by: Pradipta Banerjee <[email protected]>
Add status update to KataConfig as well in addition
to Kubernetes events

Signed-off-by: Pradipta Banerjee <[email protected]>
Modify error strings to not use caps

Signed-off-by: Pradipta Banerjee <[email protected]>
Adjust the return values to ensure error is the last return value.
Also avoids go-staticcheck warnings

Signed-off-by: Pradipta Banerjee <[email protected]>
getConditionReason and kataOcExists methods are no longer used.
However let's keep it for now but ignore it as part of staticcheck
warnings

Signed-off-by: Pradipta Banerjee <[email protected]>
@bpradipt
Copy link
Contributor Author

@jensfr @bpradipt @snir911 Does this PR affect the default AMI image in downstream?

If you meant whether this PR changes the AMI image that gets created using the existing code, then answer is yes. Since this will use latest pod vm components (like agent, protocol-forwarder etc)
Other than that there are no changes

Copy link
Contributor

@littlejawa littlejawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have the environment to test it with, but I trust @vvoronko on that.
LGTM from the code standpoint.

Thanks @bpradipt

@bpradipt
Copy link
Contributor Author

Thanks @littlejawa @snir911 for acks
Thanks @vvoronko for pre-merge testing.

@bpradipt
Copy link
Contributor Author

/override ci/prow/check

1 similar comment
@gkurz
Copy link
Member

gkurz commented Feb 28, 2024

/override ci/prow/check

Copy link

openshift-ci bot commented Feb 28, 2024

@bpradipt: Overrode contexts on behalf of bpradipt: ci/prow/check

In response to this:

/override ci/prow/check

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Copy link

openshift-ci bot commented Feb 28, 2024

@gkurz: Overrode contexts on behalf of gkurz: ci/prow/check

In response to this:

/override ci/prow/check

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Copy link

openshift-ci bot commented Feb 28, 2024

@bpradipt: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/sandboxed-containers-operator-e2e fe0fdc0 link false /test sandboxed-containers-operator-e2e

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@bpradipt bpradipt merged commit a96f806 into openshift:devel Feb 28, 2024
@bpradipt bpradipt deleted the podvm-builder branch February 28, 2024 14:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants