
Karpenter and its ConsolidateAfter config do not work optimally for empty nodes #2254


Open
Adityashar opened this issue May 26, 2025 · 3 comments
Labels
kind/bug: Categorizes issue or PR as related to a bug.
priority/awaiting-more-evidence: Lowest priority. Possibly useful, but not yet enough support to actually get it done.
triage/needs-information: Indicates an issue needs more information in order to work on it.

Comments

@Adityashar

Adityashar commented May 26, 2025

Description

Observed Behavior:

When we perform a large, sudden deployment, Karpenter tends to over-provision nodes, creating more NodeClaims than are actually required. For example, if we deploy a simple app with 50 pods, Karpenter will provision two to three times the number of nodes actually needed, e.g. 20-25 new nodes, while the original 50-pod deployment gets scheduled on only 10 of them.

We also have a requirement to use the consolidateAfter setting (say, 25 minutes) in the same NodePool. Combined with the over-provisioning behaviour above, we are now wasting a lot of resources and cloud cost: 10-15 completely empty nodes remain in our Kubernetes clusters until the 25-minute window elapses and Karpenter finally decides to terminate them.
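To make the configuration concrete, here is a minimal sketch of a NodePool with this kind of consolidateAfter setting, assuming the karpenter.sh/v1 API and the AWS EC2NodeClass; the names and values are illustrative, not the reporter's actual config:

```yaml
# Illustrative NodePool (assumed karpenter.sh/v1 API, AWS EC2NodeClass).
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws   # assumes the AWS provider
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    # consolidateAfter applies to empty nodes as well, so a completely
    # empty node is kept for 25 minutes before it becomes a
    # consolidation candidate.
    consolidateAfter: 25m
```

This is the behaviour the issue describes: the same timer that protects busy nodes from disruption also keeps the empty, over-provisioned nodes alive for the full window.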

Expected Behavior:

  1. Empty nodes should be excluded from the consolidateAfter window. We need this setting to prevent nodes with newly created pods from being consolidated or disrupted, but empty nodes, especially the duplicate NodeClaims, should not have to wait until the end of this period.
  2. Duplicate NodeClaims should not be created in the first place. This is a secondary concern right now, since their lifetime is mostly 2-3 minutes; the first point is the real cost burner for us.

Reproduction Steps (Please include YAML):

I can add these steps if needed, please let me know.

Versions:

  • Chart Version: v1.09
  • Kubernetes Version (kubectl version): v1.30

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments; they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@Adityashar added the kind/bug label on May 26, 2025
@k8s-ci-robot added the needs-triage and needs-priority labels on May 26, 2025
@jonathan-innis
Member

If we deploy a simple app with 50 pods, Karpenter will provision two to three times the number of nodes actually needed

This bit is concerning to me and seems like the bigger issue, rather than changing the semantics of consolidateAfter. Can you share a Deployment and your NodePool/NodeClass config where you have seen this over-provisioning behavior?

One reason you might see this kind of behavior, just off the top of my head, is if you are using preferences in your pod spec. Karpenter will try to meet these preferences as best it can, but kube-scheduler may not see all of the available options to completely satisfy your preferences the way Karpenter did, and will schedule the pods onto a smaller set of nodes sooner. When the other nodes join the cluster, all of the pods are already bound, so there is no need for them and they are left empty.
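For illustration, here is a hypothetical Deployment using a preferred (soft) pod anti-affinity of the kind the comment refers to; the app name, labels, image, and resource requests are assumptions, not taken from the reporter's setup:

```yaml
# Hypothetical example of "preferences" in a pod spec: a soft anti-affinity
# that Karpenter tries to satisfy fully when provisioning capacity, while
# kube-scheduler may still pack the pods onto fewer nodes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: simple-app            # assumed name, for illustration only
spec:
  replicas: 50
  selector:
    matchLabels:
      app: simple-app
  template:
    metadata:
      labels:
        app: simple-app
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: simple-app
                topologyKey: kubernetes.io/hostname
      containers:
        - name: app
          image: nginx        # placeholder image
          resources:
            requests:
              cpu: "1"
              memory: 1Gi
```

With a soft constraint like this, Karpenter may launch capacity as if each pod needed its own node to honor the preference, while kube-scheduler is free to bind several pods to whichever nodes become ready first, leaving the later-arriving nodes empty.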

@jonathan-innis
Member

/triage needs-information

@k8s-ci-robot added the triage/needs-information label and removed the needs-triage label on May 27, 2025
@jonathan-innis
Member

/priority awaiting-more-evidence

@k8s-ci-robot added the priority/awaiting-more-evidence label and removed the needs-priority label on May 27, 2025