
Upstream branch main (revision 31335575) #31


Closed
wants to merge 46 commits from upstream-to-pr/rev-31335575

Conversation

apolo-developer

Integrating latest changes from hiyouga/LLaMA-Factory branch main

3133557 [script] fix vllm version (hiyouga#7193)
abb23f7 [webui] support escape html (hiyouga#7190)
d739fdd [deps] upgrade vllm (hiyouga#7183)
be66df1 [data] fix mm template (hiyouga#7181)
64a6fb9 [model] add QwQ 32b (hiyouga#7179)
8ad0325 [trainer] fix swanlab callback (hiyouga#7176)
b4b89b4 [trainer] update config (hiyouga#7174)
dff4130 [data] fix qwen2audio plugin (hiyouga#7166)
0c403ea [assets] update wechat (hiyouga#7161)
bc298c6 [data] use bicubic resampler (hiyouga#7143)
17ba2d5 [webui] fix webui (hiyouga#7142)
049ddf4 [data] bailing template (hiyouga#7117)
1036311 [inference] fix hf_engine (hiyouga#7120)
d1863bb [assets] update wechat (hiyouga#7106)
891c487 [webui] display swanlab exp link (hiyouga#7089)
acc52e0 [npu] update cann base image and torch 2.4 (hiyouga#7061)
96fd510 [misc] fix project toml (hiyouga#7067)
e8266fe [script] add seed args (hiyouga#7058)
19861d5 [model] add paligemma2-mix series (hiyouga#7060)
76314e6 [data] fix mllama (hiyouga#7053)
ec1a1bc [model] add models (hiyouga#7054)
fe6dd92 [assets] update readme (hiyouga#7051)
1481af5 [assets] update wechat (hiyouga#7019)
cde479e [data] fix MiniCPMV plugin (hiyouga#6998)
302ecb0 [webui] update css (hiyouga#6985)
2591a3f [data] add r1 distill dataset (hiyouga#6983)
b00b290 [version] support transformers 449 (hiyouga#6982)
cc8c7e7 [misc] fix script (hiyouga#6977)
3da2cc2 [data] update vlm args (hiyouga#6976)
7faecc0 [data] add min resolution option (hiyouga#6975)
bdb581c [data] fix predict dataset (hiyouga#6972)
ad0c6c8 [assets] update wechat (hiyouga#6963)
2faf8ae [data] fix minicpmo template (hiyouga#6946)
6edd499 [ray] specify ray storage path (hiyouga#6920)
1ada3ae [misc] fix lora regex (hiyouga#6944)
c31c63b [misc] fix grad ckpt (hiyouga#6931)
797043d [model] add liger kernel to qwen2_5 vl (hiyouga#6930)
11eac71 [trainer] fix gen_kwarg to eval during training (hiyouga#5451)
1e35967 [data] evaluate on each dataset (hiyouga#5522)
4c7bfeb [data] improve error handling (hiyouga#6128)
8956c93 [misc] update readme (hiyouga#6918)
499ea45 [misc] update readme (hiyouga#6917)
617c8ab [breaking change] refactor data pipeline (hiyouga#6901)
f8a2061 [misc] support for launching LLaMA-Factory with uv run (hiyouga#6907)
ee5fe21 [example] fix path to ray example (hiyouga#6906)
e34c3c0 [misc] fix grad ckpt func (hiyouga#6916)

hiyouga and others added 30 commits February 13, 2025 00:17
* yay

* uv with ray temporary commit

* remove ray specific code for now

* cleanup
* sync from upstream

* update

* update

* fix

---------

Co-authored-by: hiyouga <[email protected]>
* [Update] loader.py: evaluate will run separate evaluations on each dataset (see the sketch after this commit message).

`If you pass a dictionary with names of datasets as keys and datasets as values, evaluate will run separate evaluations on each dataset. This can be useful to monitor how training affects other datasets or simply to get a more fine-grained evaluation.`

Seq2SeqTrainer supports eval_dataset as a Dict.

* fix format

* fix

* fix

---------

Co-authored-by: hiyouga <[email protected]>
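For context on the per-dataset evaluation described in the commit above, here is a minimal, hedged sketch using the Hugging Face `Seq2SeqTrainer` directly; the model, dataset names, and toy examples are illustrative assumptions, not code from this PR.

```python
# Minimal sketch (illustrative model/data, not from this PR): when eval_dataset
# is a dict, Trainer/Seq2SeqTrainer runs one evaluation pass per entry and
# prefixes each metric with the dict key, e.g. eval_math_loss vs. eval_code_loss.
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def encode(src: str, tgt: str) -> dict:
    # Toy preprocessing: one source/target pair -> model inputs with labels.
    enc = dict(tokenizer(src, truncation=True, max_length=32))
    enc["labels"] = tokenizer(tgt, truncation=True, max_length=32)["input_ids"]
    return enc

# Two tiny, hypothetical evaluation sets kept under separate names.
eval_sets = {
    "math": Dataset.from_list([encode("translate: two plus two", "four")] * 8),
    "code": Dataset.from_list([encode("translate: print hi", "print('hi')")] * 8),
}

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="out", per_device_eval_batch_size=4),
    eval_dataset=eval_sets,  # dict -> a separate evaluation per dataset
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
)

metrics = trainer.evaluate()  # contains eval_math_* and eval_code_* entries
print(metrics)
```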
* Correctly pass gen_kwargs to eval during training runs (see the sketch after this commit message)

* fix

* fix

---------

Co-authored-by: hiyouga <[email protected]>
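As a hedged illustration of what passing gen_kwargs to evaluation means in practice, the sketch below forwards generation settings through `Seq2SeqTrainer.evaluate()`; the model and values are illustrative assumptions, not taken from this PR.

```python
# Minimal sketch (illustrative model/data, not from this PR): with
# predict_with_generate=True, Seq2SeqTrainer.evaluate() forwards **gen_kwargs
# (e.g. max_new_tokens, num_beams) to model.generate() during evaluation.
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def encode(src: str, tgt: str) -> dict:
    enc = dict(tokenizer(src, truncation=True, max_length=32))
    enc["labels"] = tokenizer(tgt, truncation=True, max_length=32)["input_ids"]
    return enc

eval_data = Dataset.from_list(
    [encode("translate English to German: hello", "hallo")] * 8
)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="out",
        per_device_eval_batch_size=4,
        predict_with_generate=True,  # evaluation decodes with model.generate()
    ),
    eval_dataset=eval_data,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
)

# Generation kwargs are passed straight through to generate() at eval time.
metrics = trainer.evaluate(max_new_tokens=16, num_beams=4)
print(metrics)
```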
* add liger kernel to qwen2_5 vl

* fix patch

* fix patch
* support transformers 449

* fix mm plugin
* fix template

* fix bug in messages processing
* add qwen25vl awq models

* add moonlight
* fix mllama

* fix test
* add seed args

* add seed args

* update seed
leo-pony and others added 16 commits February 25, 2025 23:32
* Update base npu container image version: the Python version required for Hugging Face Transformers is >= 3.10

* Fix the bug: the arg type of INSTALL_DEEPSPEED should be a string now.

* Update Ascend CANN, CANN-Kernel, and the corresponding torch and torch-npu versions

* Upgrade the versions of the packages torch-npu needs: torch==2.1.0 and torch-npu==2.4.0.post2
* webui add swanlab link

* change callback name

* update

---------

Co-authored-by: hiyouga <[email protected]>
* add bailing template

* add bailing template

* add bailing template

---------

Co-authored-by: [email protected] <[email protected]>
* Update pairwise.py

[data] Repair multimodal model DPO training

* Update pairwise.py

[data] Repair multimodal model DPO training using deepcopy (see the sketch after this commit message)

* Update pairwise.py

* Update mm_plugin.py
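To illustrate why `deepcopy` matters for the pairwise (DPO) data fix described above, here is a simplified, hypothetical sketch; the helper names and the placeholder-expansion step stand in for LLaMA-Factory's actual `pairwise.py` and `mm_plugin.py` logic and are not taken from this PR.

```python
# Illustrative sketch, not the actual LLaMA-Factory pairwise.py code: the
# chosen and rejected examples start from the same prompt messages, so any
# in-place edit made while processing one branch (e.g. expanding an <image>
# placeholder into patch tokens) would leak into the other branch unless each
# branch works on its own deep copy.
from copy import deepcopy

def expand_image_placeholder(messages: list[dict], num_patches: int) -> None:
    # Hypothetical in-place step standing in for a multimodal plugin.
    for message in messages:
        message["content"] = message["content"].replace(
            "<image>", "<image>" * num_patches
        )

def build_pairwise_example(prompt, chosen, rejected, num_patches=4):
    # Each branch gets its own copy of the shared prompt; without deepcopy the
    # second expansion would run on already-expanded placeholders.
    chosen_messages = deepcopy(prompt) + [{"role": "assistant", "content": chosen}]
    rejected_messages = deepcopy(prompt) + [{"role": "assistant", "content": rejected}]
    expand_image_placeholder(chosen_messages, num_patches)
    expand_image_placeholder(rejected_messages, num_patches)
    return chosen_messages, rejected_messages

prompt = [{"role": "user", "content": "<image> Describe the picture."}]
chosen_msgs, rejected_msgs = build_pairwise_example(prompt, "A cat.", "A dog.")
assert prompt[0]["content"].count("<image>") == 1  # the shared prompt is untouched
```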
@apolo-developer apolo-developer requested review from a team and taddeusb90 and removed request for a team March 6, 2025 12:04
@apolo-developer apolo-developer deleted the upstream-to-pr/rev-31335575 branch March 7, 2025 12:04