sentry: skipped adding service rules for serviceEvent: Added, Error: Failed to find netid for namespace: 6t92g in vnid map #13265

Closed
smarterclayton opened this issue Mar 6, 2017 · 7 comments
Assignees
Labels
component/networking · kind/bug · lifecycle/rotten · priority/P2

Comments

@smarterclayton
Contributor

smarterclayton commented Mar 6, 2017

Got a burst of event failures from devpreview over one day, then it looked like it resolved itself.

*errors.errorString: event processing failed: skipped adding service rules for serviceEvent: Added, Error: Failed to find netid for namespace: 6t92g in vnid map
  File "/builddir/build/BUILD/atomic-openshift-git-0.46178be/_output/local/go/src/github.com/openshift/origin/pkg/cmd/util/serviceability/sentry.go", line 61, in CaptureError
  File "/builddir/build/BUILD/atomic-openshift-git-0.46178be/_output/local/go/src/github.com/openshift/origin/pkg/cmd/util/serviceability/panic.go", line 29, in CaptureError-fm
  File "/builddir/build/BUILD/atomic-openshift-git-0.46178be/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/util/runtime/runtime.go", line 95, in HandleError
  File "/builddir/build/BUILD/atomic-openshift-git-0.46178be/_output/local/go/src/github.com/openshift/origin/pkg/sdn/plugin/eventqueue.go", line 109, in func1
  File "/builddir/build/BUILD/atomic-openshift-git-0.46178be/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/client/cache/delta_fifo.go", line 420, in Pop
  File "/builddir/build/BUILD/atomic-openshift-git-0.46178be/_output/local/go/src/github.com/openshift/origin/pkg/sdn/plugin/eventqueue.go", line 114, in Pop
  File "/builddir/build/BUILD/atomic-openshift-git-0.46178be/_output/local/go/src/github.com/openshift/origin/pkg/sdn/plugin/common.go", line 114, in runEventQueueForResource
  File "/builddir/build/BUILD/atomic-openshift-git-0.46178be/_output/local/go/src/github.com/openshift/origin/pkg/sdn/plugin/common.go", line 144, in RunEventQueue
  File "/builddir/build/BUILD/atomic-openshift-git-0.46178be/_output/local/go/src/github.com/openshift/origin/pkg/sdn/plugin/vnids_node.go", line 280, in watchServices
  File "/builddir/build/BUILD/atomic-openshift-git-0.46178be/_output/local/go/src/github.com/openshift/origin/pkg/sdn/plugin/vnids_node.go", line 147, in watchServices)-fm
  File "/builddir/build/BUILD/atomic-openshift-git-0.46178be/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/util/wait/wait.go", line 88, in func1
  File "/builddir/build/BUILD/atomic-openshift-git-0.46178be/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/util/wait/wait.go", line 89, in JitterUntil
  File "/builddir/build/BUILD/atomic-openshift-git-0.46178be/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/util/wait/wait.go", line 49, in Until
  File "/builddir/build/BUILD/atomic-openshift-git-0.46178be/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/util/wait/wait.go", line 41, in Forever
  File "/usr/lib/golang/src/runtime/asm_amd64.s", line 2086, in goexit
@smarterclayton
Contributor Author

@openshift/networking

@pweil- added the kind/bug and priority/P2 labels on Mar 7, 2017
@pravisankar

This could happen in a few places in the OpenShift SDN plugin: watchNetNamespaces, watchServices, watchNetworkPolicies, and watchEgressNetworkPolicies.
The series of events that can lead to this issue:
(a) On the master, a new namespace was created.
(b) On the master, a netID is being assigned to the namespace asynchronously but has not been persisted yet, or the master is temporarily down.
(c) On the node, one of the asynchronous watch routines tries to fetch the netID for the namespace created in (a). While (b) has not finished, fetching the netID fails. We alleviate this problem by retrying with exponential backoff (see the sketch below).
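A minimal sketch of the retry behavior described in (c); the function name, lookup signature, and timings are illustrative assumptions rather than the plugin's actual code:

```go
// Sketch of the retry-with-exponential-backoff behavior described in (c).
// Names, timings, and the lookup signature are illustrative only.
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitAndGetVNID retries lookup() with exponential backoff so the node can
// ride out the window where the master has assigned a netID to the new
// namespace but the node has not seen it yet.
func waitAndGetVNID(lookup func(string) (uint32, error), namespace string) (uint32, error) {
	delay := 100 * time.Millisecond // illustrative initial delay
	for attempt := 0; attempt < 5; attempt++ {
		if id, err := lookup(namespace); err == nil {
			return id, nil
		}
		time.Sleep(delay)
		delay *= 2 // back off exponentially between retries
	}
	// Give up: this is the condition that surfaces as the Sentry event above.
	return 0, fmt.Errorf("failed to find netid for namespace: %s in vnid map", namespace)
}

func main() {
	// Simulate a namespace whose netID never shows up on the node.
	lookup := func(ns string) (uint32, error) { return 0, errors.New("not found") }
	if _, err := waitAndGetVNID(lookup, "6t92g"); err != nil {
		fmt.Println(err)
	}
}
```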

@openshift-sentry

https://sentry.io/red-hat/openshift-3-stg/issues/251960899/

Is this covered by our existing fix? Just want to verify

@dcbw
Contributor

dcbw commented Jun 8, 2017

Just a note here: if we see this again, make sure to check PLEG relist() times. If the node is loaded or docker is struggling (e.g. rhbz#1451902), that can delay some eventqueue/watches while allowing others to proceed, leading to Pod/Service/etc. ADD events arriving before we get the NetNamespace ADD event.

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot added the lifecycle/stale label on Feb 12, 2018
@openshift-bot
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Mar 15, 2018
@pravisankar

This should be fixed by #17243
