
Policy never evicting pods, despite finding fits #1627


Open

jekawo opened this issue Feb 7, 2025 · 4 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

jekawo commented Feb 7, 2025

What version of descheduler are you using?

descheduler version:
0.32.1
Helm chart:

Does this issue reproduce with the latest release?
on latest

Which descheduler CLI options are you using?

Please provide a copy of your descheduler policy config file

kind: Deployment

# labels that'll be applied to all resources
commonLabels: { "Descheduler" }

# Required when running as a Deployment
deschedulingInterval: 15m

cmdOptions:
  v: 5

evictionFailureEventNotification : true

# deschedulerPolicy contains the policies the descheduler will execute.
# To use policies stored in an existing configMap use:
# NOTE: The name of the cm should comply to {{ template "descheduler.fullname" . }}
deschedulerPolicy:
  nodeSelector: "agentpool=nodepool1"
  profiles:
    - name: default
      pluginConfig:
        - name: DefaultEvictor
          nodeFit: true
        - name: RemovePodsViolatingNodeAffinity
          args:
            nodeAffinityType:
              - requiredDuringSchedulingIgnoredDuringExecution
              - preferredDuringSchedulingIgnoredDuringExecution
            namespaces:
              include:
                - staging
                #- production
        - name: RemovePodsViolatingNodeTaints
          args:
            namespaces:
              include:
                - staging
                #- production
        - name: RemovePodsViolatingInterPodAntiAffinity
          args:
            namespaces:
              include:
                - development
                - staging
                #- production
      plugins:
        deschedule:
          enabled:
            - RemovePodsViolatingNodeTaints
            - RemovePodsViolatingNodeAffinity
            - RemovePodsViolatingInterPodAntiAffinity

What k8s version are you using (kubectl version)?

kubectl version Output
$ kubectl version
Client Version: v1.30.5
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.31.3

What did you do?


Deployed descheduler with the above config, expecting pods to be evicted from nodepool1 if they would fit better in staging at the current point in time.

The logs acknowledge that the staging nodes fit, so the pods should be evicted.

However, the descheduler continuously reports 0 evictions and 0 attempts, and it does not report an eviction error either.

I added the pod disruption budget rules to the cluster role manually to get the current version working.
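
For reference, such a rule would look roughly like this (a sketch; the ClusterRole itself comes from the Helm chart, and the exact verbs may differ from what was actually added):

rules:
  # existing chart-generated rules stay as-is; this extra rule grants read access to PDBs
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["get", "list", "watch"]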

What did you expect to see?

Pods evicted if they could fit on tolerated nodes.

What did you see instead?

0 pods evicted ever.

@jekawo jekawo added the kind/bug Categorizes issue or PR as related to a bug. label Feb 7, 2025
@AlexGurtoff

We are experiencing the same issue. We have the following ConfigMap:

apiVersion: "descheduler/v1alpha2"
kind: "DeschedulerPolicy"
profiles:
- name: default-profile
  pluginConfig:
  - args:
      nodeAffinityType:
      - preferredDuringSchedulingIgnoredDuringExecution
    name: RemovePodsViolatingNodeAffinity
  plugins:
    deschedule:
      enabled:
      - RemovePodsViolatingNodeAffinity

Also, we have a pod deployed onto a node with the following nodeAffinity:

  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
              - key: cloud.google.com/gke-spot
                operator: Exists

And there is a second node which is "better" (it has the cloud.google.com/gke-spot label), but the descheduler does nothing:

I0220 12:29:51.668857 1 descheduler.go:173] Setting up the pod evictor
I0220 12:29:51.668964 1 node_affinity.go:81] "Executing for nodeAffinityType" nodeAffinity="preferredDuringSchedulingIgnoredDuringExecution"
...
I0220 12:29:51.680954 1 node_affinity.go:121] "Processing node" node="some_node_name"
...
I0220 12:29:51.687967 1 profile.go:317] "Total number of pods evicted" extension point="Deschedule" evictedPods=0
I0220 12:29:51.688006 1 descheduler.go:179] "Number of evicted pods" totalEvicted=0

@AlexGurtoff

I found the issue. I needed to pass an additional parameter: evictLocalStoragePods.

    profiles:
    - name: default
      pluginConfig:
      - args:
          evictLocalStoragePods: true
        name: DefaultEvictor

With this change, it started working.

Additionally, increasing the verbosity level can be helpful for debugging:

        cmdOptions:
          v: 5
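
For comparison, folding the same args into the Helm values from the original issue would look roughly like this (a sketch; it assumes nodeFit is also meant to sit under args:, like the other plugins' settings):

deschedulerPolicy:
  profiles:
    - name: default
      pluginConfig:
        - name: DefaultEvictor
          args:
            nodeFit: true
            evictLocalStoragePods: true
        # ...the RemovePodsViolating* plugin entries stay unchanged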


jekawo commented Mar 24, 2025

> I found the issue. I needed to pass an additional parameter: evictLocalStoragePods.
>
>     profiles:
>     - name: default
>       pluginConfig:
>       - args:
>           evictLocalStoragePods: true
>         name: DefaultEvictor
>
> With this change, it started working.
>
> Additionally, increasing the verbosity level can be helpful for debugging:
>
>         cmdOptions:
>           v: 5

I am already using the highest verbosity, and your fix did not work for us.

