generated from kubernetes/kubernetes-template-project
Issues: kubernetes-sigs/inference-perf
Issues list
- Add shared prefixes to random data generator for benchmarking prefix caching (labels: kind/feature, triage/accepted)
- Update docs to make sure they cover the usage instructions well (labels: kind/documentation)
- Unit tests for the core functionality (labels: kind/cleanup)
- Publish container image for v0.1.0 release (labels: kind/cleanup)
- Report Generation for multi-stage runs (labels: kind/feature)
- Add a logger with log levels (labels: good first issue, kind/cleanup) #48, opened Mar 31, 2025 by Bslabe123
- Calculate Accurate Prompt Tokens for Chat Completions in vLLM Client #45, opened Mar 31, 2025 by vivekk16
- [Testing] Add test cases for validating inference perf load generation #40, opened Mar 26, 2025 by SachinVarghese
- [Feature] Collect Latency metrics - compute request time percentiles #30, opened Mar 4, 2025 by SachinVarghese
- [Feature] Add a model server client for TGI (labels: kind/feature) #24, opened Feb 20, 2025 by SachinVarghese
- Consolidate perf testing tools (labels: lifecycle/stale) #23, opened Feb 19, 2025 by kfswain
- Add Kubernetes Orchestration Library for Model Server Deployment and Benchmarking (labels: kind/feature, lifecycle/stale, priority/important-soon) #22, opened Feb 13, 2025 by wangchen615
- [Feature] Add a model server client for Triton using TensorRT-LLM (labels: lifecycle/stale) #18, opened Feb 3, 2025 by achandrasekar
- [Feature] Add a client to get model server metrics (labels: lifecycle/stale) #17, opened Feb 3, 2025 by achandrasekar

Label legend:
- kind/feature: Categorizes issue or PR as related to a new feature.
- kind/documentation: Categorizes issue or PR as related to documentation.
- kind/cleanup: Categorizes issue or PR as related to cleaning up code, process, or technical debt.
- good first issue: Denotes an issue ready for a new contributor, according to the "help wanted" guidelines.
- triage/accepted: Indicates an issue or PR is ready to be actively worked on.
- lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
- priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.