FailedMount and FailedSync events related to builder and deployer tokens #14383

Closed
jwforres opened this issue May 26, 2017 · 17 comments

Labels: component/kubernetes, kind/bug, priority/P2

@jwforres (Member)

We are frequently seeing errors related to mounting secrets as volumes, most commonly for the builder-token and deployer-token. Possibly this only happens on Docker for Mac. Reproduced by myself and @spadgett so far.

LASTSEEN   FIRSTSEEN   COUNT     NAME            KIND                    SUBOBJECT                     TYPE      REASON              SOURCE                        MESSAGE
1h         1h          1         test-2-build    Pod                                                   Warning   FailedMount         kubelet, localhost            Unable to mount volumes for pod "test-2-build_myproject(3dc52cca-4242-11e7-8550-ce8c6a9659b1)": timeout expired waiting for volumes to attach/mount for pod "myproject"/"test-2-build". list of unattached/unmounted volumes=[docker-socket builder-dockercfg-bsf9q-push builder-token-xpkt3]
1h         1h          1         test-2-build    Pod                                                   Warning   FailedSync          kubelet, localhost            Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "myproject"/"test-2-build". list of unattached/unmounted volumes=[docker-socket builder-dockercfg-bsf9q-push builder-token-xpkt3]
1h         1h          1         test-2-deploy   Pod                                                   Warning   FailedMount         kubelet, localhost            Unable to mount volumes for pod "test-2-deploy_myproject(7c6989e5-4242-11e7-8550-ce8c6a9659b1)": timeout expired waiting for volumes to attach/mount for pod "myproject"/"test-2-deploy". list of unattached/unmounted volumes=[deployer-token-900cs]
1h         1h          1         test-2-deploy   Pod                                                   Warning   FailedSync          kubelet, localhost            Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "myproject"/"test-2-deploy". list of unattached/unmounted volumes=[deployer-token-900cs]
@jwforres (Member, Author)

cc @smarterclayton

@smarterclayton (Contributor)

@derekwaynecarr @childsb

@pweil- added the component/kubernetes, kind/bug, and priority/P1 labels on May 30, 2017
@derekwaynecarr (Member)

@smarterclayton this looks similar to what @sjenning and I were debugging earlier this week.

@derekwaynecarr (Member)

We think the problem was introduced in the kube 1.6.1 bump; we still need to isolate the specific commit ID.

@sjenning (Contributor) commented Jul 5, 2017

b282780 is known good and 359f62c is known bad. The commits in between don't build :-/

$ git log --oneline --no-merges v3.6.0-alpha.1..359f62ccfb vendor/k8s.io/kubernetes/
359f62ccfb (HEAD) UPSTREAM: 44970: CRI: Fix StopContainer timeout
dc20aea9e9 UPSTREAM: 44730: Check for terminating Pod prior to launching successor in StatefulSet
6de438291c UPSTREAM: <drop>: disable apiserver loopback loop in generic context
566b41bffa UPSTREAM: 00000: make AsVersionedObjects default cleanly
3c064ffa17 UPSTREAM: <carry>: openapi test, patch in updated package name
e5fa6896b8 UPSTREAM: 44221: validateClusterInfo: use clientcmdapi.NewCluster()
91525e786e UPSTREAM: 44570: Explicit namespace from kubeconfig should override in-cluster config
62d77ec05c UPSTREAM: <carry>: add OpenShift resources to garbage collector ignore list
e51a861e52 UPSTREAM: 44859: e2e: handle nil ReplicaSet in checkDeploymentRevision
6b12679660 UPSTREAM: 44861: NotRegisteredErr for known kinds not registered in target GV
3adf4d3895 UPSTREAM: <drop>: Run make-update – vendor/ changes
e8d67d3ea0 bump(*): Kubernetes 1.6.1
b282780a04 UPSTREAM: 43375: Set permission for volume subPaths

@sjenning (Contributor) commented Jul 5, 2017

What is weird is that I can't reproduce this on kube 1.6.0, 1.6.1, 1.6.6, or 1.7.

@sjenning (Contributor) commented Jul 6, 2017

I found this BZ, which was supposed to fix this.

https://bugzilla.redhat.com/show_bug.cgi?id=1444096#c4

It even passed QE.

https://bugzilla.redhat.com/show_bug.cgi?id=1444096#c18

Supposed to be fixed by #14144

Either 1) it didn't, 2) there has been a regression since, or 3) this is a slightly different but related issue.

@eparis @wongma7

@sjenning (Contributor) commented Jul 6, 2017

At origin commit 22888b2:

busybox pod:

apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  containers:
  - name: busybox
    image: busybox
    command:
    - sleep
    - "10"
  terminationGracePeriodSeconds: 0
  restartPolicy: Never

Additional information from the node log:

I0705 21:11:16.624834    6139 reconciler.go:242] VerifyControllerAttachedVolume operation started for volume "kubernetes.io/secret/648bdd9a-61f0-11e7-8394-7085c20cf2ab-default-token-1qdbk" (spec.Name: "default-token-1qdbk") pod "648bdd9a-61f0-11e7-8394-7085c20cf2ab" (UID: "648bdd9a-61f0-11e7-8394-7085c20cf2ab")
I0705 21:11:16.777596    6139 operation_generator.go:609] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/648bdd9a-61f0-11e7-8394-7085c20cf2ab-default-token-1qdbk" (spec.Name: "default-token-1qdbk") pod "648bdd9a-61f0-11e7-8394-7085c20cf2ab" (UID: "648bdd9a-61f0-11e7-8394-7085c20cf2ab").
I0705 21:11:16.825685    6139 kuberuntime_manager.go:458] Container {Name:busybox Image:busybox Command:[sleep 10] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[]} VolumeMounts:[{Name:default-token-1qdbk ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath:}] LivenessProbe:nil ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:Always SecurityContext:&SecurityContext{Capabilities:&Capabilities{Add:[],Drop:[MKNOD SYS_CHROOT],},Privileged:*false,SELinuxOptions:&SELinuxOptions{User:,Role:,Type:,Level:s0:c1,c0,},RunAsUser:nil,RunAsNonRoot:nil,ReadOnlyRootFilesystem:nil,} Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
I0705 21:11:21.143315    6139 operation_generator.go:609] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/648bdd9a-61f0-11e7-8394-7085c20cf2ab-default-token-1qdbk" (spec.Name: "default-token-1qdbk") pod "648bdd9a-61f0-11e7-8394-7085c20cf2ab" (UID: "648bdd9a-61f0-11e7-8394-7085c20cf2ab").
I0705 21:11:22.148872    6139 operation_generator.go:609] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/648bdd9a-61f0-11e7-8394-7085c20cf2ab-default-token-1qdbk" (spec.Name: "default-token-1qdbk") pod "648bdd9a-61f0-11e7-8394-7085c20cf2ab" (UID: "648bdd9a-61f0-11e7-8394-7085c20cf2ab").

10s later, when the pod terminates:

I0705 21:11:31.171556    6139 operation_generator.go:609] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/648bdd9a-61f0-11e7-8394-7085c20cf2ab-default-token-1qdbk" (spec.Name: "default-token-1qdbk") pod "648bdd9a-61f0-11e7-8394-7085c20cf2ab" (UID: "648bdd9a-61f0-11e7-8394-7085c20cf2ab").
I0705 21:11:31.266705    6139 reconciler.go:201] UnmountVolume operation started for volume "kubernetes.io/secret/648bdd9a-61f0-11e7-8394-7085c20cf2ab-default-token-1qdbk" (spec.Name: "default-token-1qdbk") from pod "648bdd9a-61f0-11e7-8394-7085c20cf2ab" (UID: "648bdd9a-61f0-11e7-8394-7085c20cf2ab").
I0705 21:11:31.305820    6139 operation_generator.go:684] UnmountVolume.TearDown succeeded for volume "kubernetes.io/secret/648bdd9a-61f0-11e7-8394-7085c20cf2ab-default-token-1qdbk" (OuterVolumeSpecName: "default-token-1qdbk") pod "648bdd9a-61f0-11e7-8394-7085c20cf2ab" (UID: "648bdd9a-61f0-11e7-8394-7085c20cf2ab"). InnerVolumeSpecName "default-token-1qdbk". PluginName "kubernetes.io/secret", VolumeGidValue ""
I0705 21:11:31.367046    6139 reconciler.go:363] Detached volume "kubernetes.io/secret/648bdd9a-61f0-11e7-8394-7085c20cf2ab-default-token-1qdbk" (spec.Name: "default-token-1qdbk") devicePath: ""
W0705 21:11:32.168613    6139 pod_container_deletor.go:77] Container "7b001d050e618d2d2f186b327e143e1bba7baf6d1f9e6131cd9ff53ff433c73f" not found in pod's containers

2m later:

E0705 21:13:31.144859    6139 kubelet.go:1554] Unable to mount volumes for pod "busybox_default(648bdd9a-61f0-11e7-8394-7085c20cf2ab)": timeout expired waiting for volumes to attach/mount for pod "default"/"busybox". list of unattached/unmounted volumes=[default-token-1qdbk]; skipping pod
E0705 21:13:31.145105    6139 pod_workers.go:182] Error syncing pod 648bdd9a-61f0-11e7-8394-7085c20cf2ab ("busybox_default(648bdd9a-61f0-11e7-8394-7085c20cf2ab)"), skipping: timeout expired waiting for volumes to attach/mount for pod "default"/"busybox". list of unattached/unmounted volumes=[default-token-1qdbk]

The following bad events occur:

2017-07-05 21:13:31 -0500 CDT   2017-07-05 21:13:31 -0500 CDT   1         busybox   Pod                 Warning   FailedMount   kubelet, cerebellum.local.variantweb.net   Unable to mount volumes for pod "busybox_default(648bdd9a-61f0-11e7-8394-7085c20cf2ab)": timeout expired waiting for volumes to attach/mount for pod "default"/"busybox". list of unattached/unmounted volumes=[default-token-1qdbk]
2017-07-05 21:13:31 -0500 CDT   2017-07-05 21:13:31 -0500 CDT   1         busybox   Pod                 Warning   FailedSync   kubelet, cerebellum.local.variantweb.net   Error syncing pod

@wongma7 commented Jul 6, 2017

This is slightly different. My patch was for the case where, after a kubelet restart, the events would get spammed because the kubelet "forgot" what name to use for a volume. Since there are only one or two events here, it's something else.

@sjenning (Contributor) commented Jul 6, 2017

@wongma7 thanks for clarifying

@sjenning (Contributor) commented Jul 10, 2017

I captured the stray pod worker goroutine that times out after podAttachAndMountTimeout and produces the undesired events. This is a trace from after the test pod terminates but before the timeout expires.

goroutine 44251 [chan receive]:
github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/wait.WaitFor(0xc4261b0300, 0xc43059f640, 0xc420060120, 0x4c47c40, 0xc4261b0300)
	/home/sjennings/projects/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:270 +0x5d
github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/wait.pollInternal(0xc4261b0300, 0xc43059f640, 0xc4261b0300, 0xc43059f640)
	/home/sjennings/projects/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:187 +0x41
github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/wait.Poll(0x11e1a300, 0x1bf08eb000, 0xc43059f640, 0xc42ef59220, 0x1)
	/home/sjennings/projects/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:183 +0x4d
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/volumemanager.(*volumeManager).WaitForAttachAndMount(0xc433da9420, 0xc42598b200, 0x0, 0x0)
	/home/sjennings/projects/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/volumemanager/volume_manager.go:357 +0x20f
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet.(*Kubelet).syncPod(0xc43156f800, 0x0, 0xc42598b200, 0x0, 0xc428032ee0, 0x0, 0xc42616bf20, 0xc4306fb4a0)
	/home/sjennings/projects/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:1552 +0xd2f
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet.(*Kubelet).(github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet.syncPod)-fm(0x0, 0xc42598b200, 0x0, 0xc428032ee0, 0x0, 0xefbf440, 0xc428032ee0)
	/home/sjennings/projects/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:760 +0x51
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet.(*podWorkers).managePodLoop.func1(0xc427953f08, 0xc433da9490, 0xc427953ef0, 0x1, 0x0)
	/home/sjennings/projects/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/pod_workers.go:173 +0x146
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet.(*podWorkers).managePodLoop(0xc433da9490, 0xc4325ca6c0)
	/home/sjennings/projects/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/pod_workers.go:176 +0x17d
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet.(*podWorkers).UpdatePod.func1(0xc433da9490, 0xc4325ca6c0)
	/home/sjennings/projects/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/pod_workers.go:220 +0x6c
created by github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet.(*podWorkers).UpdatePod
	/home/sjennings/projects/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/pod_workers.go:221 +0x350
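
To make the timing concrete, here is a minimal, self-contained Go sketch of the mechanism (a toy model, not the vendored kubelet code): the stray pod worker blocks in a WaitForAttachAndMount-style poll, but the secret volume it is waiting on was already torn down when the pod terminated, so the poll can only expire. The interval and timeout are shortened from the real podAttachAndMountTimeout (roughly two minutes) so the example finishes quickly.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitForMount is a toy stand-in for volumeManager.WaitForAttachAndMount: it polls
// a condition until it holds or the timeout expires.
func waitForMount(mounted func() bool, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		if mounted() {
			return nil
		}
		time.Sleep(interval)
	}
	return errors.New("timeout expired waiting for volumes to attach/mount")
}

func main() {
	// By the time the stray pod worker reaches the wait, the pod has terminated and
	// the reconciler has already unmounted the secret volume, so the condition can
	// never become true and the worker reports the error seen in the events above.
	mounted := func() bool { return false }

	if err := waitForMount(mounted, 20*time.Millisecond, 200*time.Millisecond); err != nil {
		fmt.Println("FailedMount:", err)
	}
}
```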

@sjenning (Contributor) commented Jul 10, 2017

Added some logging:

E0710 11:14:48.294323   10862 kubelet_pods.go:728] SETH podIsTerminated status.Phase: Running
E0710 11:14:48.711388   10862 kubelet.go:1903] SETH dispatchWork syncType: sync               <-- pod terminated 
E0710 11:14:48.711417   10862 kubelet_pods.go:728] SETH podIsTerminated status.Phase: Running <-- check in dispatchWork()
E0710 11:14:48.711427   10862 kubelet.go:1914] SETH kl.podWorkers.UpdatePod
E0710 11:14:48.711547   10862 kubelet_pods.go:728] SETH podIsTerminated status.Phase: Running <-- check in syncPod()
E0710 11:14:48.711639   10862 kubelet_pods.go:728] SETH podIsTerminated status.Phase: Succeeded
E0710 11:14:48.711761   10862 kubelet.go:1552] SETH syncPod @ kl.volumeManager.WaitForAttachAndMount() <-- waiting in syncPod()
E0710 11:14:50.292323   10862 kubelet_pods.go:728] SETH podIsTerminated status.Phase: Succeeded

The pod shows as Running in the podIsTerminated() calls from dispatchWork() and syncPod().
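
For reference, the phase check these helpers reduce to (setting aside the deletion-timestamp branch) only treats Succeeded or Failed as terminated, which is why the worker is still dispatched while the cached status says Running. A minimal sketch with illustrative local types rather than the real API package:

```go
package main

import "fmt"

// PodPhase mirrors the Kubernetes pod phase strings for illustration only.
type PodPhase string

const (
	PodRunning   PodPhase = "Running"
	PodSucceeded PodPhase = "Succeeded"
	PodFailed    PodPhase = "Failed"
)

// isTerminatedPhase captures the core check that, as I read it, both
// volumehelper.IsPodTerminated and kubelet.podIsTerminated reduce to when the pod
// has no deletion timestamp: only Succeeded or Failed count as terminated.
func isTerminatedPhase(phase PodPhase) bool {
	return phase == PodSucceeded || phase == PodFailed
}

func main() {
	fmt.Println(isTerminatedPhase(PodRunning))   // false: the worker proceeds to wait on volumes
	fmt.Println(isTerminatedPhase(PodSucceeded)) // true: later checks see the pod as terminated
}
```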

@sjenning (Contributor) commented Jul 10, 2017

Mistake in my last comment: findAndAddActivePods() calls volumehelper.IsPodTerminated(), not kubelet.podIsTerminated(). They do the same thing. I amended the previous comment to avoid confusion.

@sjenning (Contributor)

Found the difference between kube and origin:

I0710 16:29:10.136351   30372 kubelet.go:1848] SyncLoop (PLEG): "busybox_default(c71acf05-65b6-11e7-84de-7085c20cf2ab)", event: &pleg.PodLifecycleEvent{ID:"c71acf05-65b6-11e7-84de-7085c20cf2ab", Type:"ContainerDied", Data:"c6c9c5db3709186875c813deb09aa03fc02a78f89d794a6f6b3a3c0e17b3102c"}
I0710 16:29:10.136503   30372 kubelet_pods.go:1169] Generating status for "busybox_default(c71acf05-65b6-11e7-84de-7085c20cf2ab)"
I0710 16:29:10.136805   30372 volume_manager.go:345] Waiting for volumes to attach and mount for pod "busybox_default(c71acf05-65b6-11e7-84de-7085c20cf2ab)"
I0710 16:29:10.140055   30372 wrap.go:42] GET /api/v1/namespaces/default/pods/busybox: (2.812685ms) 200 [[openshift/v1.6.1+5115d708d7 (linux/amd64) kubernetes/fff65cf] 10.42.10.23:59568]
I0710 16:29:10.145230   30372 wrap.go:42] PUT /api/v1/namespaces/default/pods/busybox/status: (3.972476ms) 200 [[openshift/v1.6.1+5115d708d7 (linux/amd64) kubernetes/fff65cf] 10.42.10.23:59568]
I0710 16:29:10.145704   30372 status_manager.go:444] Status for pod "busybox_default(c71acf05-65b6-11e7-84de-7085c20cf2ab)" updated successfully: {status:{Phase:Succeeded Conditions:[{Type:Initialized Status:True LastProbeTime:{Time:{sec:0 nsec:0 loc:<nil>}} LastTransitionTime:{Time:{sec:63635318935 nsec:0 loc:0xefc0440}} Reason:PodCompleted Message:} {Type:Ready Status:False LastProbeTime:{Time:{sec:0 nsec:0 loc:<nil>}} LastTransitionTime:{Time:{sec:63635318950 nsec:0 loc:0xefc0440}} Reason:PodCompleted Message:} {Type:PodScheduled Status:True LastProbeTime:{Time:{sec:0 nsec:0 loc:<nil>}} LastTransitionTime:{Time:{sec:63635318935 nsec:0 loc:0xefc0440}} Reason: Message:}] Message: Reason: HostIP:10.42.10.23 PodIP:172.17.0.2 StartTime:0xc434215340 InitContainerStatuses:[] ContainerStatuses:[{Name:busybox State:{Waiting:<nil> Running:<nil> Terminated:0xc42bf72ee0} LastTerminationState:{Waiting:<nil> Running:<nil> Terminated:<nil>} Ready:false RestartCount:0 Image:docker.io/busybox:latest ImageID:docker-pullable://docker.io/busybox@sha256:be3c11fdba7cfe299214e46edc642e09514dbb9bbefcd0d3836c05a1e0cd0642 ContainerID:docker://c6c9c5db3709186875c813deb09aa03fc02a78f89d794a6f6b3a3c0e17b3102c}] QOSClass:BestEffort} version:3 podName:busybox podNamespace:default}
I0710 16:29:10.150088   30372 secret.go:186] Setting up volume default-token-jw0zp for pod c71acf05-65b6-11e7-84de-7085c20cf2ab at /home/sjennings/projects/go/src/github.com/openshift/origin/openshift.local.volumes/pods/c71acf05-65b6-11e7-84de-7085c20cf2ab/volumes/kubernetes.io~secret/default-token-jw0zp
I0710 16:29:10.153911   30372 wrap.go:42] GET /api/v1/namespaces/default/secrets/default-token-jw0zp: (3.093498ms) 200 [[openshift/v1.6.1+5115d708d7 (linux/amd64) kubernetes/fff65cf] 10.42.10.23:59568]
I0710 16:29:10.155250   30372 secret.go:217] Received secret default/default-token-jw0zp containing (4) pieces of data, 4109 total bytes
I0710 16:29:10.155732   30372 operation_generator.go:609] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/c71acf05-65b6-11e7-84de-7085c20cf2ab-default-token-jw0zp" (spec.Name: "default-token-jw0zp") pod "c71acf05-65b6-11e7-84de-7085c20cf2ab" (UID: "c71acf05-65b6-11e7-84de-7085c20cf2ab").
I0710 16:29:10.250346   30372 reconciler.go:201] UnmountVolume operation started for volume "kubernetes.io/secret/c71acf05-65b6-11e7-84de-7085c20cf2ab-default-token-jw0zp" (spec.Name: "default-token-jw0zp") from pod "c71acf05-65b6-11e7-84de-7085c20cf2ab" (UID: "c71acf05-65b6-11e7-84de-7085c20cf2ab").
I0710 16:29:10.250512   30372 util.go:367] Tearing down volume default-token-jw0zp for pod c71acf05-65b6-11e7-84de-7085c20cf2ab at /home/sjennings/projects/go/src/github.com/openshift/origin/openshift.local.volumes/pods/c71acf05-65b6-11e7-84de-7085c20cf2ab/volumes/kubernetes.io~secret/default-token-jw0zp
I0710 16:29:10.302562   30372 operation_generator.go:684] UnmountVolume.TearDown succeeded for volume "kubernetes.io/secret/c71acf05-65b6-11e7-84de-7085c20cf2ab-default-token-jw0zp" (OuterVolumeSpecName: "default-token-jw0zp") pod "c71acf05-65b6-11e7-84de-7085c20cf2ab" (UID: "c71acf05-65b6-11e7-84de-7085c20cf2ab"). InnerVolumeSpecName "default-token-jw0zp". PluginName "kubernetes.io/secret", VolumeGidValue ""
I0710 16:29:10.350655   30372 reconciler.go:363] Detached volume "kubernetes.io/secret/c71acf05-65b6-11e7-84de-7085c20cf2ab-default-token-jw0zp" (spec.Name: "default-token-jw0zp") devicePath: ""

Origin unmounts the volumes (last 4 lines) for the terminated pod while kube does not, despite pkg/kubelet/volumemanager being identical between the two.

I still don't have an answer for why one unmounts and the other doesn't. I think unmounting is the expected behavior after kubernetes/kubernetes#37228.

@sjenning (Contributor) commented Jul 26, 2017

This is coarse and very likely has side effects, but it does prevent the events from occurring:

diff --git a/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go b/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go
index e1b7512ee0..b83603821f 100644
--- a/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go
+++ b/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go
@@ -1473,6 +1473,11 @@ func (kl *Kubelet) syncPod(o syncPodOptions) error {
        // Update status in the status manager
        kl.statusManager.SetPodStatus(pod, apiPodStatus)
 
+       // Exit syncPod if pod is terminated
+       if kl.podIsTerminated(pod) {
+               return nil
+       }
+
        // Kill pod if it should not be running
        if !runnable.Admit || pod.DeletionTimestamp != nil || apiPodStatus.Phase == v1.PodFailed {
                var syncErr error

It does seem like much of the rest of syncPod() past this point should not be done for terminated pods.
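
For comparison, the upstream change referenced below ("skip WaitForAttachAndMount for terminated pods in syncPod") is understood to take a narrower approach: only the blocking volume wait is skipped for already-terminated pods, while the rest of syncPod() still runs. Here is a hedged, self-contained sketch of that shape, using toy stand-ins rather than the real kubelet types:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Toy stand-ins for the kubelet pieces involved; names are illustrative only.
type pod struct {
	name       string
	terminated bool
}

type volumeManager struct{}

// waitForAttachAndMount models the blocking call that can only time out once the
// pod's volumes have been torn down at termination.
func (volumeManager) waitForAttachAndMount(p *pod) error {
	if p.terminated {
		time.Sleep(10 * time.Millisecond) // stands in for the ~2m timeout
		return errors.New("timeout expired waiting for volumes to attach/mount")
	}
	return nil
}

// syncPod sketches the narrower guard: skip the volume wait when the pod is already
// terminated, so the remaining work in syncPod still runs and no FailedMount or
// FailedSync events are emitted.
func syncPod(vm volumeManager, p *pod) error {
	if !p.terminated {
		if err := vm.waitForAttachAndMount(p); err != nil {
			fmt.Printf("would emit FailedMount event for %s: %v\n", p.name, err)
			return err
		}
	}
	// ...remainder of syncPod...
	return nil
}

func main() {
	vm := volumeManager{}
	fmt.Println(syncPod(vm, &pod{name: "busybox", terminated: true})) // <nil>: no events
}
```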

@sjenning (Contributor) commented Jul 26, 2017

So everything about upstream kube not being affected by this was wrong. I was just running hack/local-up-cluster.sh, which defaults to --keep-terminated-pod-volumes=true, and that was masking the issue.

Upstream issue opened.

k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this issue Aug 1, 2017
Automatic merge from submit-queue (batch tested with PRs 49284, 49555, 47639, 49526, 49724)

skip WaitForAttachAndMount for terminated pods in syncPod

Fixes #49663

I tried to tread lightly with a small localized change because this needs to be picked to 1.7 and 1.6 as well.

I suspect this has been an issue since we started unmounting volumes on pod termination #37228

xref openshift/origin#14383

@derekwaynecarr @eparis @smarterclayton @saad-ali @jwforres 

/release-note-none
@eparis (Member) commented Aug 1, 2017

It's spam; dropping to P2. (Seth is going to open a BZ to track in 3.6.)

openshift-merge-robot added a commit that referenced this issue Aug 2, 2017
Automatic merge from submit-queue

UPSTREAM: 49724: skip WaitForAttachAndMount for terminated pods in syncPod

Fixes #14383

xref kubernetes/kubernetes#49724

@derekwaynecarr @eparis @smarterclayton @jwforres