Skip to content
This repository was archived by the owner on Jul 25, 2019. It is now read-only.

Add a mutex to registry.podsByIP #292

Merged
merged 1 commit into from
Apr 13, 2016

Conversation

danwinship
Copy link
Contributor

The pod-monitoring and endpoint-filtering threads might both run at the same time so we need a mutex here. Fixes #291.

As for the actual race condition, if a pod which is an endpoint is added, and OnEndpointsUpdate() ends up running before trackPod(), then the pod won't be allowed as an endpoint and will be ignored until the
next time OnEndpointsUpdate() runs, when it will get picked up. So isolation is guaranteed but correct functioning of the service is not...

But can that actually happen? It seems unlikely unless there's really unfair thread scheduling... (The more common case is that trackPod/unTrackPod and OnEndpointsUpdates are running at the same time for unrelated reasons.) Anyway, we don't like this code and want it to go away and be replaced by something else anyway...

@openshift/networking PTAL

@knobunc
Copy link
Contributor

knobunc commented Apr 12, 2016

LGTM

switch eventType {
case watch.Added, watch.Modified:
registry.trackPod(pod)
case watch.Deleted:
registry.unTrackPod(pod)
}
registry.podTrackingLock.Unlock()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently this is not an issue but there could be potential issues later on with this approach:

  • Since we are taking lock at a wider scope than needed, if a method (trackPod/unTrackPod...) panics for some reason holding the lock, all dependent methods will be blocked
  • We need to sprinkle these lock/unlock whenever we access podsByIP

Instead I prefer adding these 3 methods:
GetPodInfo() {
get lock
fetch from podsByIP
release lock
}
SetPodInfo(podInfo) {
get lock
add/set pod info
release lock
}
UnSetPodInfo(podInfo) {
get lock
remove from podsByIP
release lock
}
Use these methods where ever required : in endpointsUpdate, trackPod and untrackPod

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead I prefer adding these 3 methods:
...
Use these methods where ever required : in endpointsUpdate, trackPod and untrackPod

The lock needs to be held throughout trackPod() and untrackPod() though or else you might test one value and then delete another, etc.

Re-pushed with proper go-ish use of defer to unlock the mutexes, and changed trackPod() to delete stale data itself rather than calling out to unTrackPod() to do it. (unTrackPod() does an additional check which is dropped now, but that check should be irrelevant; if the UID is correct, then we want to remove the pod, regardless of whether the IP matches or not [which it should, but...])

The pod-monitoring and endpoint-filtering threads might both run at
the same time so we need a mutex here.

As for the actual race condition, if a pod which is an endpoint is
added, and OnEndpointsUpdate() ends up running before trackPod(), then
the pod won't be allowed as an endpoint and will be ignored until the
next time OnEndpointsUpdate() runs, when it will get picked up. So
isolation is guaranteed but correct functioning of the service is
not...
@pravisankar
Copy link

Looks good

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants