
Update tokenizer.model_max_length if necessary (Fix #6415) #6632


Merged · 1 commit · Jan 14, 2025

Conversation

@xiaosu-zhu (Contributor) commented on Jan 14, 2025

What does this PR do?

Fixes #6415

It modifies the logic of loader.py -> load_tokenizer(...) so that tokenizer.model_max_length is updated whenever model_args.model_max_length differs from the tokenizer's current value.
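For context, a minimal sketch of the kind of change described (not the exact patch; the rest of the real load_tokenizer body and its argument handling are omitted, and the attribute names follow the PR text):

```python
from transformers import AutoTokenizer

def load_tokenizer(model_args):
    # Sketch of load_tokenizer(...) in loader.py; only the part relevant
    # to this PR is shown.
    tokenizer = AutoTokenizer.from_pretrained(model_args.model_name_or_path)

    # If the user configures a different model_max_length, propagate it to the
    # tokenizer so that exported tokenizer configs carry the updated limit.
    if model_args.model_max_length is not None and tokenizer.model_max_length != model_args.model_max_length:
        tokenizer.model_max_length = model_args.model_max_length

    return tokenizer
```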


@hiyouga (Owner) commented on Jan 14, 2025

Hi Xiaosu, thanks for your attention to the tokenizer handling in LLaMA Factory. We believed this property affects neither training nor inference, so we preferred not to modify it in our framework.

@xiaosu-zhu (Contributor, Author) commented

Thank you for your comments! You are right that there are no issues for training and/or inference, except in one place:

vLLM inference reads model_max_length as a hard token limit for generation.

If we create a vLLM engine with LLM(max_model_len=...) set larger than model_max_length, it raises an error, and we need to set VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 to bypass it.
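To illustrate, a minimal sketch of the failure mode and the workaround (the model path and the 4096/8192 lengths are hypothetical, and the exact error text depends on the vLLM version):

```python
import os
from vllm import LLM

# Without this PR: the exported model/tokenizer still reports the old
# model_max_length (e.g. 4096), so asking vLLM for a longer context fails.
# LLM(model="path/to/exported/model", max_model_len=8192)
#   -> ValueError complaining that max_model_len exceeds the derived limit.

# Workaround: explicitly allow the longer length before building the engine.
os.environ["VLLM_ALLOW_LONG_MAX_MODEL_LEN"] = "1"
llm = LLM(model="path/to/exported/model", max_model_len=8192)
```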

Therefore, I think this PR reduces confusion for further use cases.

@hiyouga (Owner) left a comment


LGTM

@xiaosu-zhu (Contributor, Author) commented

Thank you for your quick response ❤️

Labels: solved (This problem has been already solved)
Projects: None yet
Linked issue: Tokenizer does not derive the newer config (#6415)
2 participants