Commit f5c6e80

hiyouga and Salmon-f42 authored and committed
[assets] update video (hiyouga#7287)

This squashed commit also includes:

- [assets] update wechat (hiyouga#7288)
- [dataset] fix ultrachat_200k dataset (hiyouga#7259): the `HuggingFaceH4/ultrachat_200k` dataset doesn't contain the default "train" split; the correct split is "train_sft".
- [data] gemma3 plugin pan and scan (hiyouga#7294): add pan-and-scan support, add a test case, fix the test.
- [inference] support sglang backend (hiyouga#7278): mimic the SGLang offline Engine, add more tests and args, fix `sample_params`, fix stream chat, change SGLang from engine mode to server mode, use SGLang built-in utilities, fix docs, fix the sglang engine, add a readme. Co-authored-by: Jin Pan, hiyouga.
- [model] support hunyuan 7b (hiyouga#7317): support the tencent-hunyuan model, with follow-up fixes.
- [assets] update videos (hiyouga#7340): update README.md and README_zh.md.
- [data] fix template (hiyouga#7349)
- [misc] set dev version (hiyouga#7351)
- [assets] update wechat (hiyouga#7361)
- [version] fix minicpmo (hiyouga#7378)
- [3rdparty] fix redundant process group destroy for ray (hiyouga#7395). Co-authored-by: hoshi-hiyouga.
- [misc] fix sglang deps (hiyouga#7432): add a transformers version requirement for sglang, and add the `srt` extra required to run sglang; other options are `srt_hip`, `srt_xpu`, `srt_npu`, `srt_hpu` and `srt_cpu` for different compute architectures.
- [deps] upgrade vllm to 0.8 (hiyouga#7436)
- [deps] upgrade transformers to 4.50.0 (hiyouga#7437): upgrade transformers, fix the hf cache, fix the dpo trainer.
- [scripts] support computing scores on vllm's predictions (hiyouga#7419): enable manual BLEU/ROUGE evaluation by adding `scripts/eval_bleu_rouge.py`, add library checks, use the datasets library's multiprocessing to speed up processing, switch to `fire.Fire` and delete the `sys.argv` handling. Co-authored-by: SnowFox4004, hoshi-hiyouga.
- [misc] fix license (hiyouga#7440)
- [misc] fix ci (hiyouga#7441): fix and improve ci.
- [docker] upgrade to torch 2.6 (hiyouga#7442)
- [trainer] fix vlm loss for transformers 4.49 (hiyouga#7448)
- [assets] fix gemma3 readme (hiyouga#7449)
- [assets] update wechat (hiyouga#7455)
- [misc] enable liger kernel for gemma3 (hiyouga#7462)
- [misc] enable liger kernel for gemma3 text and paligemma (1, 2 and 2-mix) (hiyouga#7466)
- [misc] update liger-kernel's monkey patch (hiyouga#7453): update liger_kernel.py and setup.py.
- [model] fix lora on quantized models (hiyouga#7456)
- [model] add qwen2vl 32b & upgrade peft to 0.15 (hiyouga#7469)
- [trainer] fix wsd scheduler (hiyouga#7304): warmup_stable_decay now supports setting the numbers of stable and decay steps according to the warmup_ratio. Co-authored-by: hoshi-hiyouga.
- [3rdparty] support swanlab lark notification (hiyouga#7481)
- [data] fix pixtral plugin (hiyouga#7505): preserve `image_sizes`, add comments.
- [assets] update wechat (hiyouga#7523)
- [deps] pin pydantic to 2.10.6 (hiyouga#7546)
- [model] add Qwen2.5-Omni model (hiyouga#7537): preserve image_sizes, add the plugin, support audio-text2text and image/video-text2text LoRA, add docs, add a merge script, add license.
- [data] fix qwen2.5 omni collator (hiyouga#7553)
- [trainer] new kto mismatch pair creation strategy (hiyouga#7509)
- [data] shard the dataset to allow multiprocessing when streaming is enabled (hiyouga#7530): shard the dataset when streaming, and allow users to leave `dataset_shards` unset for backward compatibility.
- [webui] fix launch with proxy (hiyouga#7332)
- [data] specify position_ids in PackedSupervisedDatasetProcessor for neat_packing (hiyouga#7318): use position_ids for neat_packing with fa2, then revert the fa2 changes.
- [model] fix use_cache patching for gemma3 multimodal (hiyouga#7500)
- [model] fix kv cache (hiyouga#7564)
- [infer] vllm video/audio inference (hiyouga#7566)
- [trainer] fix batch processing in PPO trainer (hiyouga#7576)
- [data] fix qwen2.5 omni plugin (hiyouga#7573): align keys with qwen2vl, adjust scripts.
- [data] fix qwen2.5 omni plugin (hiyouga#7578): fix the fps calculation. Co-authored-by: hoshi-hiyouga.
- [assets] update wechat (hiyouga#7594)
- [model] add llama4 (hiyouga#7611)
- [assets] update readme (hiyouga#7612)
- [misc] fix packing and eval plot (hiyouga#7623)
- [sglang] support transformers 4.51.0 (hiyouga#7639)
- [trainer] fix key error (hiyouga#7635)
- [data] fix bugs of `use_audio_in_video` in Qwen2.5 Omni (hiyouga#7638): cache `_mm_inputs` (later removed), support `use_audio_in_video`, fix data, update mllm_video_audio_demo.json.
- [assets] update readme (hiyouga#7644)
- [assets] update readme (hiyouga#7654)
- [data] add coig-p dataset (hiyouga#7657)
- [misc] fix cuda warning on intel GPU (hiyouga#7655)
- [bugfix] enable_gemma_liger_kernel (hiyouga#7660): the `enable_liger_kernel` path for the Gemma model series was never executed because of the existing `if` statement; changed the line to an `elif` so that `apply_liger_kernel` runs properly. Resolves hiyouga#7628.
- [ray] allow specifying ray.init kwargs, i.e. runtime_env (hiyouga#7647). Co-authored-by: hoshi-hiyouga.
- [data] support specifying a dataset in cloud storage (hiyouga#7567): add support for loading datasets from s3/gcs, with comments in the readme.
- [assets] update wechat (hiyouga#7674)
- [deps] fix uv conflicts (hiyouga#7686): fix hiyouga#7678; update setup.py, tests.yml, publish.yml and the Makefile.
- [model] add GLM-4-0414 (hiyouga#7695)
- [deps] upgrade transformers (hiyouga#7704)
- [misc] upgrade cli (hiyouga#7714)
- [misc] fix env vars (hiyouga#7715)
- [model] support Kimi_VL thinking/instruct (hiyouga#7719): add kimi_vl, patch the config, check the version. Co-authored-by: hoshi-hiyouga.
- [assets] update model readme (hiyouga#7724)
- [docker] patch docker-rocm (hiyouga#7725): fix a typo, fix the /bin/sh conditional syntax, add build args to docker-compose, and change the shell to /bin/bash, which is required for the "==" string comparison.
- [deps] upgrade vllm (hiyouga#7728)
- [api] fix chat messages (hiyouga#7732)
- [assets] wechat (hiyouga#7740)
- [infer] support vllm-ascend (hiyouga#7739)
- [misc] improve entrypoint (hiyouga#7345): purely a cleanup of the entrypoint code, since it had too many if/else branches. Co-authored-by: hoshi-hiyouga.
- [model] support intern-VL 2.5-3 series (hiyouga#7258): add internvl and rebase, fix internvl2&3, fix video_inputs and lint, add constants, pass ci.
- [infer] set env for vllm ascend (hiyouga#7745)
- [breaking] bump transformers to 4.45.0 & improve ci (hiyouga#7746)
- [trainer] fix pt loss (hiyouga#7748): fix the pt loss, make it robust, add a test.
- [assets] update wechat (hiyouga#7792)
- [misc] fix bug in constant (hiyouga#7765). Co-authored-by: Sachin Beldona.
- [model] fix gemma3 export (hiyouga#7786). Co-authored-by: hoshi-hiyouga.
- [misc] fix new tokens adding (hiyouga#7253). Co-authored-by: hoshi-hiyouga.
- [data] fix wrong position ids with packed attention masks (hiyouga#7754). Co-authored-by: hoshi-hiyouga.
- [parser] support omegaconf (hiyouga#7793)
- [trainer] add Muon optimizer (hiyouga#7749). Co-authored-by: hoshi-hiyouga.
- [example] add bash usage (hiyouga#7794)
- [data] improve mmplugin (hiyouga#7795)
- [trainer] support early stop (hiyouga#7797)
- [misc] update internvl constants (hiyouga#7801)
- [model] add arch check for InternVL (hiyouga#7803)
- [assets] update model readme (hiyouga#7804)
- [data] fix internvl plugin (hiyouga#7817)
- [model] fix moe zero3 (hiyouga#7826)
- Merge commit from fork
- [model] fix vit gradient checkpointing (hiyouga#7830)
- [assets] update wechat (hiyouga#7840)
- [ray] add storage filesystem to ray config (hiyouga#7854)
- fix attn patch for kimivl (hiyouga#7867)
- [data] fix minicpmo vllm infer (hiyouga#7870)
- [trainer] make projector trainable in freeze training (hiyouga#7872). Co-authored-by: hoshi-hiyouga.
- [data] fix qwen2 omni plugin (hiyouga#7875)
- [model] fix dsv3 leaf node (hiyouga#7879)
- [data] fix qwen2.5 omni template (hiyouga#7883)
- [model] add qwen3 (hiyouga#7885): support lora sft for dsv3, update code, update the eval yaml, rebase and sync with the major branch, update the baseline.
1 parent 2833dde commit f5c6e80

File tree

161 files changed: +3890 −1693 lines changed


.env.local

Lines changed: 2 additions & 0 deletions
```diff
@@ -16,6 +16,8 @@ USE_MODELSCOPE_HUB=
 USE_OPENMIND_HUB=
 USE_RAY=
 RECORD_VRAM=
+OPTIM_TORCH=
+NPU_JIT_COMPILE=
 # torchrun
 FORCE_TORCHRUN=
 MASTER_ADDR=
```

.github/ISSUE_TEMPLATE/1-bug-report.yml

Lines changed: 1 addition & 1 deletion
```diff
@@ -12,7 +12,7 @@ body:
     attributes:
       value: |
         Please do not create issues that are not related to framework bugs under this category, use **[Discussions](https://github.com/hiyouga/LLaMA-Factory/discussions/categories/q-a)** instead.
-        请勿在此分类下创建和框架 bug 无关的 issues,请使用 **[讨论区](https://github.com/hiyouga/LLaMA-Factory/discussions/categories/q-a)**。
+        请勿在此分类下创建和框架 bug 无关的 issues,训练问题求助请使用 **[讨论区](https://github.com/hiyouga/LLaMA-Factory/discussions/categories/q-a)**。

   - type: checkboxes
     id: reminder
```

.github/workflows/publish.yml

Lines changed: 1 addition & 6 deletions
```diff
@@ -28,14 +28,9 @@ jobs:
         with:
           python-version: "3.9"

-      - name: Install dependencies
-        run: |
-          python -m pip install --upgrade pip
-          python -m pip install build
-
       - name: Build package
         run: |
-          python -m build
+          make build

       - name: Publish package
         uses: pypa/gh-action-pypi-publish@release/v1
```

.github/workflows/tests.yml

Lines changed: 35 additions & 3 deletions
```diff
@@ -22,7 +22,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version:
+        python:
           - "3.9"
           - "3.10"
           - "3.11"
@@ -31,9 +31,22 @@ jobs:
           - "ubuntu-latest"
           - "windows-latest"
           - "macos-13"
+        transformers:
+          - null
+        include: # test backward compatibility
+          - python: "3.9"
+            os: "ubuntu-latest"
+            transformers: "4.45.0"
+          - python: "3.9"
+            os: "ubuntu-latest"
+            transformers: "4.49.0"

     runs-on: ${{ matrix.os }}

+    concurrency:
+      group: ${{ github.workflow }}-${{ github.ref }}-${{ matrix.os }}-${{ matrix.python }}-${{ matrix.transformers }}
+      cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
+
     env:
       HF_TOKEN: ${{ secrets.HF_TOKEN }}
       OS_NAME: ${{ matrix.os }}
@@ -45,15 +58,27 @@ jobs:
       - name: Set up Python
         uses: actions/setup-python@v5
         with:
-          python-version: ${{ matrix.python-version }}
+          python-version: ${{ matrix.python }}
           cache: "pip"
-          cache-dependency-path: "setup.py"
+          cache-dependency-path: "**/requirements*.txt"

       - name: Install dependencies
         run: |
          python -m pip install --upgrade pip
          python -m pip install ".[torch,dev]"

+      - name: Install transformers
+        if: ${{ matrix.transformers }}
+        run: |
+          python -m pip install "transformers==${{ matrix.transformers }}"
+
+      - name: Cache files
+        id: hf-hub-cache
+        uses: actions/cache@v4
+        with:
+          path: ${{ runner.temp }}/huggingface
+          key: huggingface-${{ matrix.os }}-${{ matrix.python }}-${{ matrix.transformers }}-${{ hashFiles('tests/version.txt') }}
+
       - name: Check quality
         run: |
           make style && make quality
@@ -62,6 +87,13 @@ jobs:
         run: |
           make license

+      - name: Check build
+        run: |
+          make build
+
       - name: Test with pytest
         run: |
           make test
+        env:
+          HF_HOME: ${{ runner.temp }}/huggingface
+          HF_HUB_OFFLINE: "${{ steps.hf-hub-cache.outputs.cache-hit == 'true' && '1' || '0' }}"
```

.gitignore

Lines changed: 3 additions & 1 deletion
```diff
@@ -185,4 +185,6 @@ test.py
 gen_ans_wo_think.py
 *.code-workspace
 *.xlsx
-templates
+templates
+*.code-workspace
+*.xlsx
```

Makefile

Lines changed: 1 addition & 1 deletion
```diff
@@ -3,7 +3,7 @@
 check_dirs := scripts src tests setup.py

 build:
-	pip install build && python -m build
+	pip3 install build && python3 -m build

 commit:
 	pre-commit install
```

README.md

Lines changed: 54 additions & 25 deletions
Large diffs are not rendered by default.

README_zh.md

Lines changed: 55 additions & 25 deletions
Large diffs are not rendered by default.

assets/wechat.jpg

2.31 KB

assets/wechat_npu.jpg

533 Bytes

data/README.md

Lines changed: 5 additions & 4 deletions
````diff
@@ -4,9 +4,10 @@ Currently we support datasets in **alpaca** and **sharegpt** format.

 ```json
 "dataset_name": {
-  "hf_hub_url": "the name of the dataset repository on the Hugging Face hub. (if specified, ignore script_url and file_name)",
-  "ms_hub_url": "the name of the dataset repository on the Model Scope hub. (if specified, ignore script_url and file_name)",
-  "script_url": "the name of the directory containing a dataset loading script. (if specified, ignore file_name)",
+  "hf_hub_url": "the name of the dataset repository on the Hugging Face hub. (if specified, ignore script_url, file_name and cloud_file_name)",
+  "ms_hub_url": "the name of the dataset repository on the Model Scope hub. (if specified, ignore script_url, file_name and cloud_file_name)",
+  "script_url": "the name of the directory containing a dataset loading script. (if specified, ignore file_name and cloud_file_name)",
+  "cloud_file_name": "the name of the dataset file in s3/gcs cloud storage. (if specified, ignore file_name)",
   "file_name": "the name of the dataset folder or dataset file in this directory. (required if above are not specified)",
   "formatting": "the format of the dataset. (optional, default: alpaca, can be chosen from {alpaca, sharegpt})",
   "ranking": "whether the dataset is a preference dataset or not. (default: False)",
@@ -85,7 +86,7 @@ Regarding the above dataset, the *dataset description* in `dataset_info.json` sh

 ### Pre-training Dataset

-- [Example dataset](c4_demo.json)
+- [Example dataset](c4_demo.jsonl)

 In pre-training, only the `text` column will be used for model learning.
````
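Given the new `cloud_file_name` field documented in the diff above, a `dataset_info.json` entry that loads from cloud storage might look like the following sketch; the dataset name, bucket and path are hypothetical, and whether the value is a bare file name or a full s3/gcs URI should be checked against the repository's loader:

```json
"alpaca_s3_demo": {
  "cloud_file_name": "s3://my-bucket/data/alpaca_demo.json",
  "formatting": "alpaca"
}
```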

data/README_zh.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -85,7 +85,7 @@

 ### 预训练数据集

-- [样例数据集](c4_demo.json)
+- [样例数据集](c4_demo.jsonl)

 在预训练时,只有 `text` 列中的内容会用于模型学习。
```

0 commit comments