[Feature:Builds] forcePull should affect pulling builder images ForcePull test case execution docker #17596

0xmichalis · 2017-12-05T10:23:40Z

/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/test/extended/builds/forcepull.go:98
Expected error:
    <*util.ExitError | 0xc421317230>: {
        Cmd: "oc adm --config=/tmp/cluster-admin.kubeconfig --namespace=extended-test-forcepull-l9lm8-zs2kp policy remove-cluster-role-from-user system:build-strategy-custom extended-test-forcepull-l9lm8-zs2kp-user",
        StdErr: "Error from server (Conflict): Operation cannot be fulfilled on clusterrolebindings.authorization.openshift.io \"system:build-strategy-custom\": the object has been modified; please apply your changes to the latest version and try again",
        ExitError: {
            ProcessState: {
                pid: 18479,
                status: 256,
                rusage: {
                    Utime: {Sec: 0, Usec: 280068},
                    Stime: {Sec: 0, Usec: 33854},
                    Maxrss: 53748,
                    Ixrss: 0,
                    Idrss: 0,
                    Isrss: 0,
                    Minflt: 14580,
                    Majflt: 0,
                    Nswap: 0,
                    Inblock: 0,
                    Oublock: 0,
                    Msgsnd: 0,
                    Msgrcv: 0,
                    Nsignals: 0,
                    Nvcsw: 1483,
                    Nivcsw: 37,
                },
            },
            Stderr: nil,
        },
    }
    exit status 1
not to have occurred
/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/test/extended/builds/forcepull.go:90

https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/17534/test_pull_request_origin_extended_conformance_gce/12371/

/sig developer-experience
/kind test-flake
/assign bparees

The text was updated successfully, but these errors were encountered:

bparees · 2017-12-05T15:05:32Z

@jim-minter this seems likely to be a race condition since we're modifying the system wide cluster role to perform this test. We need to find a different way to do it.

bparees · 2017-12-05T15:07:23Z

actually I guess we're just modifying the user. A simple retry might be sufficient here, though I also kinda wonder if we should even be exposed to this or if RBAC should be handling it for us. @enj ?

we're running

oc adm --config=/tmp/cluster-admin.kubeconfig --namespace=extended-test-forcepull-l9lm8-zs2kp policy remove-cluster-role-from-user system:build-strategy-custom extended-test-forcepull-l9lm8-zs2kp-user"

and getting

"Error from server (Conflict): Operation cannot be fulfilled on clusterrolebindings.authorization.openshift.io \"system:build-strategy-custom\": the object has been modified; please apply your changes to the latest version and try again"

bparees · 2017-12-05T15:08:49Z

(Is the race here that our call has to update the system:build-strategy-custom object to add our user? If so, should oc adm policy be handling that via an apply instead?)

bparees · 2017-12-05T15:08:55Z

@deads2k ^^

enj · 2017-12-07T15:11:00Z

@adelton @mrogers950 @simo5 any thoughts on adding some generic conflict handling to the policy commands? They already handle races with regards to creation of objects.

simo5 · 2017-12-07T16:11:28Z

@enj I have no objection if this is not going to cause unexpected (fail open) results

tnozicka · 2017-12-21T12:18:22Z

https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/17827/test_pull_request_origin_extended_conformance_gce/13358/

deads2k · 2018-01-08T13:26:42Z

(Is the race here that our call has to update the system:build-strategy-custom object to add our user? If so, should oc adm policy be handling that via an apply instead?)

I think we should try writing a patch first. If a patch won't work (I suspect compound merge keys don't work), then I'm ok retrying on a remove, but not on an add. For adds, we should start using the oc create clusterrolebinding

deads2k · 2018-01-08T13:27:37Z

Oh, this is in a test? Don't change the command then. Go and update the test to create its own clusterrolebinding and delete its own clusterrolebinding. That's more in keeping with upstream and where we're going.

bparees · 2018-01-08T14:51:05Z

@deads2k it's in a test, but this behavior hurts everyone, why wouldn't we improve the behavior?

Go and update the test to create its own clusterrolebinding and delete its own clusterrolebinding. That's more in keeping with upstream and where we're going.

The test is running this command:

oc adm --config=/tmp/cluster-admin.kubeconfig --namespace=extended-test-forcepull-l9lm8-zs2kp policy remove-cluster-role-from-user system:build-strategy-custom extended-test-forcepull-l9lm8-zs2kp-user"

Switching the test to not use the commands we expect administrators to use, instead of fixing those commands so they can work in a parallel environment, seems user hostile to me.

bparees · 2018-01-09T03:40:20Z

then I'm ok retrying on a remove, but not on an add.

why does it matter? it's all race conditions. Whether you retry, or the admin happened to run the add slightly after someone else did the remove, either way the result is the same. If you want guarantees about who has what access in a policy, you need to either have only a single person making changes, or the changes need to be coordinated/ordered.

As soon as you have 2 different people running policy update commands independently(which is how you get the conflict in the first place), you've got no guarantee of the outcome whether or not the command retries conflicts itself or not.

Expecting a conflict error to catch the mistake you were about to make is a hail mary.

Automatic merge from submit-queue. avoid conflict updating custom build role fixes #17596 (I don't really consider this a proper fix, but apparently oc adm isn't going to get the proper fix so this is as close as we're getting).

openshift-ci-robot assigned bparees Dec 5, 2017

openshift-ci-robot added sig/developer-experience kind/test-flake Categorizes issue or PR as related to test flakes. labels Dec 5, 2017

0xmichalis mentioned this issue Dec 5, 2017

Exclude local tests from conformance #17534

Merged

bparees assigned jim-minter and unassigned bparees Dec 5, 2017

bparees added the priority/P1 label Dec 5, 2017

bparees mentioned this issue Dec 15, 2017

[Feature:Builds] forcePull should affect pulling builder images ForcePull test case execution docker [Suite:openshift/conformance/parallel] #17829

Closed

bparees assigned enj and unassigned jim-minter Dec 15, 2017

bparees mentioned this issue Jan 2, 2018

[Feature:Prometheus][Conformance] Prometheus when installed to the cluster should start and expose a secured proxy and unsecured metrics [Suite:openshift/conformance/parallel] 1m51s #17901

Closed

bparees mentioned this issue Jan 8, 2018

retry cluster policy updates in test logic #18024

Closed

bparees mentioned this issue Jan 10, 2018

avoid conflict updating custom build role #18054

Merged

openshift-merge-robot closed this as completed in #18054 Jan 12, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Feature:Builds] forcePull should affect pulling builder images ForcePull test case execution docker #17596

[Feature:Builds] forcePull should affect pulling builder images ForcePull test case execution docker #17596

0xmichalis commented Dec 5, 2017

bparees commented Dec 5, 2017

Uh oh!

bparees commented Dec 5, 2017

Uh oh!

bparees commented Dec 5, 2017

Uh oh!

bparees commented Dec 5, 2017

Uh oh!

enj commented Dec 7, 2017

Uh oh!

simo5 commented Dec 7, 2017

Uh oh!

tnozicka commented Dec 21, 2017

Uh oh!

deads2k commented Jan 8, 2018

Uh oh!

deads2k commented Jan 8, 2018

Uh oh!

bparees commented Jan 8, 2018

Uh oh!

bparees commented Jan 9, 2018

Uh oh!

[Feature:Builds] forcePull should affect pulling builder images ForcePull test case execution docker #17596

[Feature:Builds] forcePull should affect pulling builder images ForcePull test case execution docker #17596

Comments

0xmichalis commented Dec 5, 2017

bparees commented Dec 5, 2017

Uh oh!

bparees commented Dec 5, 2017

Uh oh!

bparees commented Dec 5, 2017

Uh oh!

bparees commented Dec 5, 2017

Uh oh!

enj commented Dec 7, 2017

Uh oh!

simo5 commented Dec 7, 2017

Uh oh!

tnozicka commented Dec 21, 2017

Uh oh!

deads2k commented Jan 8, 2018

Uh oh!

deads2k commented Jan 8, 2018

Uh oh!

bparees commented Jan 8, 2018

Uh oh!

bparees commented Jan 9, 2018

Uh oh!