Add more context to "removing consolidatable status condition" message #2238

Open
grosser opened this issue May 18, 2025 · 6 comments
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. triage/accepted Indicates an issue or PR is ready to be actively worked on.


grosser commented May 18, 2025

Description

What problem are you trying to solve?

We want to track down why a certain node was not considered for consolidation,
for example "pod xyz was not disruptable".

For example, we have a 16xl node that is running almost empty, but Karpenter only logs

"marking consolidatable"
"removing consolidatable status condition"
... looping forever

with no indication of what is going on.
The actual cause was a pod with a bad PDB sitting on the node.

How important is this feature to you?

Medium: it would allow us to track down why nodes are wasting so many resources.

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@grosser grosser added the kind/feature Categorizes issue or PR as related to a new feature. label May 18, 2025
@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority labels May 18, 2025
@jonathan-innis
Member

We remove this status condition from the NodeClaim when the lastPodEventTime stored in its status is within the consolidateAfter time window. That timestamp is updated whenever a pod is added to or removed from the node. I think it might be tough for us to directly map that pod event time back to the pod that actually caused us to remove Consolidatable.

Maybe we just store this data in memory and report it on a best-effort basis? That might work.

@jonathan-innis
Member

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 27, 2025
@jonathan-innis
Member

/priority important-longterm

@k8s-ci-robot k8s-ci-robot added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-priority labels May 27, 2025
@jonathan-innis
Member

/help

@k8s-ci-robot
Contributor

@jonathan-innis:
This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label May 27, 2025
@jonathan-innis
Member

/size s


3 participants