forked from hiyouga/LLaMA-Factory
Upstream branch main (revision 31335575) #31
Closed
* yay
* uv with ray temporary commit
* remove ray-specific code for now
* cleanup

* refactor data
* rename file

* sync from upstream
* update
* update
* fix

Co-authored-by: hiyouga <[email protected]>
* [Update] loader.py: evaluate will run separate evaluations on each dataset. `If you pass a dictionary with names of datasets as keys and datasets as values, evaluate will run separate evaluations on each dataset. This can be useful to monitor how training affects other datasets or simply to get a more fine-grained evaluation.` Seq2SeqTrainer supports eval_dataset as a Dict.
* fix format
* fix
* fix

Co-authored-by: hiyouga <[email protected]>
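The per-dataset evaluation described above can be illustrated with a minimal sketch in plain Python (no transformers dependency; the `evaluate` helper, dataset names, and the stand-in "loss" are all illustrative, not the actual Trainer code). The key idea is dict dispatch: when `eval_dataset` is a dict, one evaluation pass runs per entry and each metric key is prefixed with the dataset's name.

```python
# Minimal sketch of dict-dispatch evaluation: one separate evaluation pass
# per named dataset, with metric keys prefixed by the dataset name.

def evaluate(eval_dataset, metric_key_prefix="eval"):
    """If eval_dataset is a dict, run a separate evaluation per entry."""
    if isinstance(eval_dataset, dict):
        metrics = {}
        for name, dataset in eval_dataset.items():
            metrics.update(
                evaluate(dataset, metric_key_prefix=f"{metric_key_prefix}_{name}")
            )
        return metrics
    # Stand-in for a real evaluation loop: "loss" = mean of the samples.
    loss = sum(eval_dataset) / len(eval_dataset)
    return {f"{metric_key_prefix}_loss": loss}

metrics = evaluate({"alpaca": [1.0, 3.0], "sharegpt": [2.0]})
# metrics == {"eval_alpaca_loss": 2.0, "eval_sharegpt_loss": 2.0}
```

With a single (non-dict) dataset the prefix stays `eval_`, so existing metric names are unchanged; only dict inputs gain the per-dataset infix.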
* Correctly pass gen_kwarg to eval during model runs
* fix
* fix

Co-authored-by: hiyouga <[email protected]>
* add liger kernel to qwen2_5 vl
* fix patch
* fix patch

* fix lora regex
* fix

* support transformers 449
* fix mm plugin

* fix template
* fix bug in messages processing

* add qwen25vl awq models
* add moonlight

* fix mllama
* fix test

* add seed args
* add seed args
* update seed
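The "fix lora regex" commit above concerns how LoRA target modules are matched against model module names. A minimal sketch of regex-based target matching, assuming a hypothetical helper and illustrative module paths (this is not LLaMA-Factory's actual code): using a full match rather than a substring search avoids accidentally selecting modules whose names merely contain the pattern.

```python
import re

def match_target_modules(module_names, pattern):
    """Return the module paths that the target pattern matches in full."""
    regex = re.compile(pattern)
    return [name for name in module_names if regex.fullmatch(name)]

names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.mlp.gate_proj",
]
# Matches only the attention projections, not the MLP gate.
print(match_target_modules(names, r".*self_attn\.(q|k|v)_proj"))
# -> ['model.layers.0.self_attn.q_proj', 'model.layers.0.self_attn.k_proj']
```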
* Update base npu container image version: the Python version required by Hugging Face Transformers is >= Python 3.10
* Fix bug: the arg type of INSTALL_DEEPSPEED should be a string now
* Update Ascend CANN, CANN-Kernel, and the corresponding torch and torch-npu versions
* Upgrading torch-npu needs the package versions torch==2.1.0 and torch-npu==2.4.0.post2
* webui add swanlab link
* change callback name
* update

Co-authored-by: hiyouga <[email protected]>
* add bailing template
* add bailing template
* add bailing template

Co-authored-by: [email protected] <[email protected]>
* Update pairwise.py: [data] repair multimodal model DPO training
* Update pairwise.py: [data] repair multimodal model DPO training using deepcopy
* Update pairwise.py
* Update mm_plugin.py
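The deepcopy fix above addresses a classic aliasing bug in pairwise (chosen/rejected) data construction: building both examples from the same message list without copying lets an in-place edit to one leak into the other. A minimal sketch of the bug class and the fix, with illustrative data rather than the actual pairwise.py code:

```python
from copy import deepcopy

prompt = [{"role": "user", "content": "<image>describe this"}]

# Buggy: list concatenation makes new lists, but both still alias the
# same prompt dict, so an in-place edit to one side corrupts the other.
chosen_buggy = prompt + [{"role": "assistant", "content": "a cat"}]
rejected_buggy = prompt + [{"role": "assistant", "content": "a dog"}]
chosen_buggy[0]["content"] = "processed"   # also mutates rejected_buggy[0]

# Fixed: deep-copy the shared prefix before any in-place processing.
prompt = [{"role": "user", "content": "<image>describe this"}]
chosen = deepcopy(prompt) + [{"role": "assistant", "content": "a cat"}]
rejected = deepcopy(prompt) + [{"role": "assistant", "content": "a dog"}]
chosen[0]["content"] = "processed"         # rejected is untouched
```

This matters especially for multimodal DPO, where image placeholders in the prompt are rewritten in place during preprocessing.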
Integrating latest changes from hiyouga/LLaMA-Factory branch main
3133557 [script] fix vllm version (hiyouga#7193)
abb23f7 [webui] support escape html (hiyouga#7190)
d739fdd [deps] upgrade vllm (hiyouga#7183)
be66df1 [data] fix mm template (hiyouga#7181)
64a6fb9 [model] add QwQ 32b (hiyouga#7179)
8ad0325 [trainer] fix swanlab callback (hiyouga#7176)
b4b89b4 [trainer] update config (hiyouga#7174)
dff4130 [data] fix qwen2audio plugin (hiyouga#7166)
0c403ea [assets] update wechat (hiyouga#7161)
bc298c6 [data] use bicubic resampler (hiyouga#7143)
17ba2d5 [webui] fix webui (hiyouga#7142)
049ddf4 [data] bailing template (hiyouga#7117)
1036311 [inference] fix hf_engine (hiyouga#7120)
d1863bb [assets] update wechat (hiyouga#7106)
891c487 [webui] display swanlab exp link (hiyouga#7089)
acc52e0 [npu] update cann base image and torch 2.4 (hiyouga#7061)
96fd510 [misc] fix project toml (hiyouga#7067)
e8266fe [script] add seed args (hiyouga#7058)
19861d5 [model] add paligemma2-mix series (hiyouga#7060)
76314e6 [data] fix mllama (hiyouga#7053)
ec1a1bc [model] add models (hiyouga#7054)
fe6dd92 [assets] update readme (hiyouga#7051)
1481af5 [assets] update wechat (hiyouga#7019)
cde479e [data] fix MiniCPMV plugin (hiyouga#6998)
302ecb0 [webui] update css (hiyouga#6985)
2591a3f [data] add r1 distill dataset (hiyouga#6983)
b00b290 [version] support transformers 449 (hiyouga#6982)
cc8c7e7 [misc] fix script (hiyouga#6977)
3da2cc2 [data] update vlm args (hiyouga#6976)
7faecc0 [data] add min resolution option (hiyouga#6975)
bdb581c [data] fix predict dataset (hiyouga#6972)
ad0c6c8 [assets] update wechat (hiyouga#6963)
2faf8ae [data] fix minicpmo template (hiyouga#6946)
6edd499 [ray] specify ray storage path (hiyouga#6920)
1ada3ae [misc] fix lora regex (hiyouga#6944)
c31c63b [misc] fix grad ckpt (hiyouga#6931)
797043d [model] add liger kernel to qwen2_5 vl (hiyouga#6930)
11eac71 [trainer] fix gen_kwarg to eval during training (hiyouga#5451)
1e35967 [data] evaluate on each dataset (hiyouga#5522)
4c7bfeb [data] improve error handling (hiyouga#6128)
8956c93 [misc] update readme (hiyouga#6918)
499ea45 [misc] update readme (hiyouga#6917)
617c8ab [breaking change] refactor data pipeline (hiyouga#6901)
f8a2061 [misc] support for launching LLaMA-Factory with `uv run` (hiyouga#6907)
ee5fe21 [example] fix path to ray example (hiyouga#6906)
e34c3c0 [misc] fix grad ckpt func (hiyouga#6916)