
Feature Request: Expose Model Internal Reasoning States (logits, attention, token-level trace) #5360


Open
tandenghui opened this issue May 13, 2025 · 0 comments
Labels: enhancement, roadmap

Comments

@tandenghui

Hi LocalAI team 👋,

I'd like to request a feature that would greatly enhance the interpretability and debuggability of LocalAI models: the ability to expose the model's internal reasoning process during text generation.

Problem:
Currently, LocalAI only returns the final generated output (tokens or text), which limits insight into how the model arrived at its response.

There is no way to access intermediate model states such as:

- logprobs or top-k token scores at each decoding step
- attention weights per layer/head
- hidden states / intermediate token embeddings
- any kind of token-level reasoning trace

This makes it hard to:

- debug model behavior
- understand model uncertainty
- build explainable AI systems (e.g. chain-of-thought visualization, step-by-step validation)
- evaluate how model biases or hallucinations might arise
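For concreteness, here is a minimal client-side sketch of what the first item (per-token logprobs / top-k scores) could look like, assuming LocalAI honored the `logprobs` / `top_logprobs` fields of the OpenAI-compatible chat completions API; whether and how LocalAI supports this is exactly what this issue is asking about. The endpoint URL and model name are placeholders.

```python
# Hypothetical sketch (not confirmed LocalAI behavior): reading per-token
# logprobs from a locally running LocalAI server, assuming it honored the
# OpenAI-compatible `logprobs` / `top_logprobs` request fields.
from openai import OpenAI

# Placeholder endpoint and model name for a local LocalAI instance.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="your-local-model",  # placeholder
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    logprobs=True,      # ask for per-token log-probabilities
    top_logprobs=5,     # and the top-5 alternative tokens per decoding step
)

choice = resp.choices[0]
if choice.logprobs is None:
    print("Backend did not return logprobs.")
else:
    # One entry per generated token: the chosen token, its logprob, and the
    # top-k alternatives considered at that decoding step.
    for tok in choice.logprobs.content:
        alts = ", ".join(f"{t.token!r}:{t.logprob:.2f}" for t in tok.top_logprobs)
        print(f"{tok.token!r} ({tok.logprob:.2f})  top-5: {alts}")
```

Attention weights and hidden states have no equivalent in the OpenAI API surface, so exposing them would presumably require LocalAI-specific request/response fields or a dedicated debug endpoint.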

@tandenghui added the enhancement label May 13, 2025
@mudler added the roadmap label May 13, 2025