Failing "odo link" tests are blocking CI system #4301
Comments
What's interesting is that this test hasn't failed remotely as often on the integration tests as it has failed in periodic e2e tests. Before I make any change in odo code w.r.t. this issue, I'd prefer we have more information about the cause of such erratic behaviour. @prietyc123 @mohammedzee1000 can you folks shed any light on why we're seeing this mostly in periodic tests only? |
This is really weird behaviour and, more interestingly, we are hitting it on PSI as well: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_odo/4063/pull-ci-openshift-odo-master-v4.6-integration-e2e/1344158957064687616#1:build-log.txt%3A1371 . It is kind of a blocker for POC PR #4063.
I think if we want to get more info on this then the only way would be adding more debug logs. /high-priority |
Can we have access to the system that's hosting the tests when we hit this failure? That would be much more helpful for me to troubleshoot the problem. cc @mohammedzee1000 (for PSI). |
The above explanation (in the issue description) is based on initial observation. After logging into a cluster where we are seeing this issue repeatedly as a part of PR #4063, I observed that the test spec is filled with a lot of checks. https://github.com/openshift/odo/blob/e6be2586f5824bade91599f24686def26b53f1ee/tests/integration/operatorhub/cmd_service_test.go#L410 The check where things are failing is trying to test an edge case which could be put in a separate spec of its own. https://github.com/openshift/odo/blob/e6be2586f5824bade91599f24686def26b53f1ee/tests/integration/operatorhub/cmd_service_test.go#L448-L453 |
@mohammedzee1000 @prietyc123 I'm going to open a PR that separates the edge case check mentioned in #4301 (comment) into a separate spec. I think it should help fix the issue. |
@prietyc123 @mohammedzee1000 PTAL #4338 (comment). I won't be surprised if the problem we were tracking in this issue is still a troublemaker for #4063 and other places (like periodic jobs.) |
Also hitting the issue more on CI. Recently observed https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_odo/4317/pull-ci-openshift-odo-master-v4.6-integration-e2e/1349188058104205312#1:build-log.txt%3A1091 |
We are hitting this issue on PSI CI jobs, both for PRs and for periodic jobs. |
This issue is happening because odo fails to find a pod that belongs to the component. The exact place where it fails is https://github.com/openshift/odo/blob/ede98da442cdfdd8691a9b7ba4b3421f5507bb33/pkg/devfile/adapters/kubernetes/component/adapter.go#L118 |
When I looked into the system, the pod had failed to come up:
$ kubectl get pods
NAME                      READY   STATUS                       RESTARTS   AGE
example-95glrr7px7        1/1     Running                      0          129m
example-c4n2564mcj        1/1     Running                      0          129m
example-f42jp26w7m        1/1     Running                      0          129m
ydpohr-69b6b8cfb5-gl4km   0/1     CreateContainerConfigError   0          129m
The reason for failure from Events:
|
My understanding right now is as follows.
One thing I'm thinking to try out is to add some time out after the |
FWIW, it did help to add some sleep. I've opened #4428 which should hopefully fix this! 🤞 |
@dharmit I can still observe the same failure 🙁 on periodic jobs.
Note: It's not OCP version specific, as I can also observe it on 4.7 |
@prietyc123 thanks for the info! I expect this to get fixed with #4554 wherein we stop looking for an assumed secret name. The assumption was valid till SBO changed the nomenclature for Secret. |
Closed via #4554 |
/kind bug
/area linking
odo doesn't fail if, at the time of running odo link <cr-name>/<cr-instance-name>, a Secret fails to get created on the cluster. Creation of this Secret is a task performed by Service Binding Operator. But if odo doesn't fail when such a Secret doesn't get created, it can cause user confusion as well as CI issues, as we're seeing in our environment.
Originally posted by @dharmit in #3256 (comment)