
After PPO training, the model's answers are exactly the same as before training? #8105

Unanswered
yaya159456 asked this question in Q&A

Replies: 0 comments

Category
Q&A
Labels
bug (Something isn't working), pending (This problem is yet to be addressed)
1 participant
Converted from issue

This discussion was converted from issue #8104 on May 19, 2025 11:59.