Add a mutex to registry.podsByIP #292

danwinship · 2016-04-12T17:43:46Z

The pod-monitoring and endpoint-filtering threads might both run at the same time so we need a mutex here. Fixes #291.

As for the actual race condition, if a pod which is an endpoint is added, and OnEndpointsUpdate() ends up running before trackPod(), then the pod won't be allowed as an endpoint and will be ignored until the next time OnEndpointsUpdate() runs, when it will get picked up. So isolation is guaranteed but correct functioning of the service is not...

But can that actually happen? It seems unlikely unless there's really unfair thread scheduling... (The more common case is that trackPod()/unTrackPod() and OnEndpointsUpdate() are running at the same time for unrelated reasons.) Anyway, we don't like this code and want it to go away and be replaced by something else...
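The ordering problem described above can be sketched in a few lines of Go. This is only an illustration, not the plugin's code: the `registry` type here is a simplified stand-in, `filterEndpoints` is a hypothetical name for the check that OnEndpointsUpdate() performs, and the map holds plain strings rather than real pod objects. The two calls are made sequentially to show the effect of the ordering, not to reproduce a data race.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a simplified, hypothetical stand-in for the plugin's registry.
type registry struct {
	podTrackingLock sync.Mutex
	podsByIP        map[string]string // pod IP -> pod UID (simplified)
}

func newRegistry() *registry {
	return &registry{podsByIP: make(map[string]string)}
}

// trackPod records a pod's IP. In the PR this runs on the pod-monitoring side.
func (r *registry) trackPod(ip, uid string) {
	r.podTrackingLock.Lock()
	defer r.podTrackingLock.Unlock()
	r.podsByIP[ip] = uid
}

// filterEndpoints keeps only the IPs that belong to a tracked pod.
// Hypothetical name for the check done on the endpoint-filtering side.
func (r *registry) filterEndpoints(ips []string) []string {
	r.podTrackingLock.Lock()
	defer r.podTrackingLock.Unlock()
	allowed := []string{}
	for _, ip := range ips {
		if _, ok := r.podsByIP[ip]; ok {
			allowed = append(allowed, ip)
		}
	}
	return allowed
}

func main() {
	r := newRegistry()
	endpoints := []string{"10.1.0.5"}

	// The endpoint update arrives before the pod is tracked: endpoint rejected.
	fmt.Println(len(r.filterEndpoints(endpoints))) // 0

	// Once the pod is tracked, the next endpoint update picks it up.
	r.trackPod("10.1.0.5", "uid-1")
	fmt.Println(len(r.filterEndpoints(endpoints))) // 1
}
```

The mutex makes each map access safe, but it does not change which call happens first, which is why the endpoint is only accepted on the next update.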

@openshift/networking PTAL

knobunc · 2016-04-12T17:58:34Z

LGTM

pravisankar · 2016-04-12T19:11:20Z

plugins/osdn/registry.go

 		switch eventType {
 		case watch.Added, watch.Modified:
 			registry.trackPod(pod)
 		case watch.Deleted:
 			registry.unTrackPod(pod)
 		}
+		registry.podTrackingLock.Unlock()


Currently this is not an issue, but this approach could cause problems later:

Since we are taking the lock at a wider scope than needed, if a method (trackPod/unTrackPod...) panics for some reason while holding the lock, all dependent methods will be blocked.

We need to sprinkle these lock/unlock calls wherever we access podsByIP.

Instead I prefer adding these 3 methods:
GetPodInfo() {
    get lock
    fetch from podsByIP
    release lock
}
SetPodInfo(podInfo) {
    get lock
    add/set pod info
    release lock
}
UnSetPodInfo(podInfo) {
    get lock
    remove from podsByIP
    release lock
}
Use these methods wherever required: in endpointsUpdate, trackPod and untrackPod
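For reference, the three accessors proposed above might look roughly like this in Go. This is a sketch under assumptions: the `podInfo` type, its fields, and the `ip` parameter on `GetPodInfo` are invented for illustration, and only the names `podTrackingLock` and `podsByIP` come from the PR itself.

```go
package main

import (
	"fmt"
	"sync"
)

// podInfo is a hypothetical record; the plugin's real type may differ.
type podInfo struct {
	IP  string
	UID string
}

type registry struct {
	podTrackingLock sync.Mutex
	podsByIP        map[string]podInfo
}

// GetPodInfo fetches a pod by IP while holding the lock.
func (r *registry) GetPodInfo(ip string) (podInfo, bool) {
	r.podTrackingLock.Lock()
	defer r.podTrackingLock.Unlock()
	info, ok := r.podsByIP[ip]
	return info, ok
}

// SetPodInfo adds or replaces the entry for info.IP while holding the lock.
func (r *registry) SetPodInfo(info podInfo) {
	r.podTrackingLock.Lock()
	defer r.podTrackingLock.Unlock()
	r.podsByIP[info.IP] = info
}

// UnSetPodInfo removes the entry for info.IP while holding the lock.
func (r *registry) UnSetPodInfo(info podInfo) {
	r.podTrackingLock.Lock()
	defer r.podTrackingLock.Unlock()
	delete(r.podsByIP, info.IP)
}

func main() {
	r := &registry{podsByIP: make(map[string]podInfo)}
	r.SetPodInfo(podInfo{IP: "10.1.0.5", UID: "uid-1"})

	info, ok := r.GetPodInfo("10.1.0.5")
	fmt.Println(info.UID, ok) // uid-1 true

	r.UnSetPodInfo(info)
	_, ok = r.GetPodInfo("10.1.0.5")
	fmt.Println(ok) // false
}
```

Each accessor protects a single map operation. A caller that reads and then writes based on what it read is still making two separately locked calls.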

danwinship · 2016-04-12

Instead I prefer adding these 3 methods:
...
Use these methods wherever required: in endpointsUpdate, trackPod and untrackPod

The lock needs to be held throughout trackPod() and unTrackPod(), though, or else you might test one value and then delete another, etc.
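To make the "test one value and then delete another" hazard concrete, here is a small Go sketch. It is illustrative only: `get`, `set`, and `remove` are hypothetical per-call-locked helpers, the map stores bare UID strings, and the problematic interleaving is written out sequentially so the result is deterministic.

```go
package main

import (
	"fmt"
	"sync"
)

type registry struct {
	podTrackingLock sync.Mutex
	podsByIP        map[string]string // pod IP -> pod UID (simplified)
}

// get, set and remove each hold the lock for one map operation only.
func (r *registry) get(ip string) (string, bool) {
	r.podTrackingLock.Lock()
	defer r.podTrackingLock.Unlock()
	uid, ok := r.podsByIP[ip]
	return uid, ok
}

func (r *registry) set(ip, uid string) {
	r.podTrackingLock.Lock()
	defer r.podTrackingLock.Unlock()
	r.podsByIP[ip] = uid
}

func (r *registry) remove(ip string) {
	r.podTrackingLock.Lock()
	defer r.podTrackingLock.Unlock()
	delete(r.podsByIP, ip)
}

// unTrackPod holds the lock across both the check and the delete,
// so nothing can change the entry between the two steps.
func (r *registry) unTrackPod(ip, uid string) bool {
	r.podTrackingLock.Lock()
	defer r.podTrackingLock.Unlock()
	if cur, ok := r.podsByIP[ip]; ok && cur == uid {
		delete(r.podsByIP, ip)
		return true
	}
	return false
}

func main() {
	r := &registry{podsByIP: map[string]string{"10.1.0.5": "uid-old"}}

	// Per-call locking, with another writer slipping in between the steps.
	cur, _ := r.get("10.1.0.5")  // deleter tests the entry: sees uid-old
	r.set("10.1.0.5", "uid-new") // a new pod reuses the IP in the meantime
	if cur == "uid-old" {
		r.remove("10.1.0.5") // deleter removes uid-new's entry by mistake
	}
	_, ok := r.get("10.1.0.5")
	fmt.Println(ok) // false: the new pod's entry was lost

	// Holding the lock across check and delete leaves the new entry alone.
	r.set("10.1.0.5", "uid-new")
	fmt.Println(r.unTrackPod("10.1.0.5", "uid-old")) // false
	_, ok = r.get("10.1.0.5")
	fmt.Println(ok) // true
}
```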

Re-pushed with proper go-ish use of defer to unlock the mutexes, and changed trackPod() to delete stale data itself rather than calling out to unTrackPod() to do it. (unTrackPod() does an additional check which is dropped now, but that check should be irrelevant; if the UID is correct, then we want to remove the pod, regardless of whether the IP matches or not [which it should, but...])
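A rough sketch of what that shape could look like follows. The actual change is in the pushed commit and may differ; the simplified `map[string]string` and the function signatures here are assumptions made for illustration. One relevant fact about Go: `sync.Mutex` is not reentrant, so a trackPod() that already holds the lock cannot call an unTrackPod() that takes the same lock without deadlocking, which is one reason to do the stale-entry cleanup inline.

```go
package main

import (
	"fmt"
	"sync"
)

type registry struct {
	podTrackingLock sync.Mutex
	podsByIP        map[string]string // pod IP -> pod UID (simplified)
}

// trackPod removes any stale entry for the same UID under a different IP,
// then records the current IP. The deferred Unlock runs on every return
// path, including when the function panics.
func (r *registry) trackPod(ip, uid string) {
	r.podTrackingLock.Lock()
	defer r.podTrackingLock.Unlock()

	for oldIP, oldUID := range r.podsByIP {
		if oldUID == uid && oldIP != ip {
			delete(r.podsByIP, oldIP) // deleting during range is safe in Go
		}
	}
	r.podsByIP[ip] = uid
}

// unTrackPod removes every entry with the given UID, whatever its IP.
func (r *registry) unTrackPod(uid string) {
	r.podTrackingLock.Lock()
	defer r.podTrackingLock.Unlock()

	for ip, curUID := range r.podsByIP {
		if curUID == uid {
			delete(r.podsByIP, ip)
		}
	}
}

func main() {
	r := &registry{podsByIP: make(map[string]string)}

	r.trackPod("10.1.0.5", "uid-1")
	r.trackPod("10.1.0.9", "uid-1") // same pod, new IP: old entry is dropped
	fmt.Println(len(r.podsByIP))    // 1

	r.unTrackPod("uid-1")
	fmt.Println(len(r.podsByIP)) // 0
}
```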

pravisankar · 2016-04-13T18:28:14Z

Looks good

danwinship force-pushed the endpoints-race branch from 3ee3624 to 809ec29 April 12, 2016 20:55

danwinship mentioned this pull request Apr 12, 2016

bump(github.com/openshift/openshift-sdn): 9f1f60258fcef6f0ef647a75a87 openshift/origin#8468

Merged

pravisankar merged commit c756bc4 into openshift:master Apr 13, 2016

danwinship deleted the endpoints-race branch April 13, 2016 21:23

pravisankar mentioned this pull request Apr 22, 2016

bump(github.com/openshift/openshift-sdn) ba3087afd66cce7c7d918af10ad openshift/origin#8614

Merged
