Support fastsafetensors to load model #10667

Open
wants to merge 2 commits into base: develop

@zeroRains (Contributor) commented May 28, 2025

Before submitting

  • Lint code. If there are lint issues, please format the code first.
# Install and register `pre-commit` in the project folder
pip install pre-commit && pre-commit install

# Process previous code files separately
pre-commit run --files XXXX.py
  • Add test cases to the tests folder. If there are codecov issues, please add test cases first.

PR types

New features

PR changes

Others

Description

Adds support for loading models with fastsafetensors.

Installing fastsafetensors:

git clone -b paddle https://github.com/zeroRains/fastsafetensors.git
cd fastsafetensors
make install

A PR has been submitted to the upstream repository; afterwards it can be installed directly with pip install:

foundation-model-stack/fastsafetensors#16

GPU: RTX A6000

Test command:

cd llm

# Use the model_name variable set here in place of $model_name in the command below
model_name="Qwen/Qwen2.5-7B-Instruct"
# model_name="meta-llama/Meta-Llama-3-8B-Instruct"

time python ./predict/predictor.py --model_name_or_path $model_name --dtype float16 --mode dynamic --decode_strategy greedy_search --inference_model 1 --block_attn 1 --append_attn 1
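
The `time` prefix above reports wall-clock time for the whole run. The PR does not say how the separate file-loading time was measured; one possible way to split the two is sketched below. The functions `hypothetical_load_weights` and `hypothetical_generate` are stand-ins invented for illustration and are not part of PaddleNLP or fastsafetensors.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the elapsed wall-clock seconds of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def hypothetical_load_weights():
    # Stand-in for weight loading (assumption: not a real PaddleNLP call).
    time.sleep(0.05)
    return {"weight": [0.0] * 4}


def hypothetical_generate(weights):
    # Stand-in for the inference step (assumption: not a real PaddleNLP call).
    time.sleep(0.01)
    return len(weights)


timings = {}
with timed("end2end", timings):
    with timed("load", timings):
        weights = hypothetical_load_weights()
    hypothetical_generate(weights)

for label, seconds in timings.items():
    print(f"{label}: {seconds:.3f} s")
```

Nesting the two timers keeps the load time a strict part of the end-to-end time, so the two figures can be compared directly.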

Test results:

The changes in this PR include a not_use_gds variable: setting it to true disables GDS, and setting it to false enables GDS.

| method | node | dtype | model | GDS | load files time (s) | end-to-end execution time (s) | notes |
| --- | --- | --- | --- | --- | --- | --- | --- |
| base | 1 | float16 | meta-llama/Meta-Llama-3-8B-Instruct | × | 11 | 46.648 | |
| fastsafetensors | 1 | float16 | meta-llama/Meta-Llama-3-8B-Instruct | ✓ | 80 | 115.363 | |
| fastsafetensors | 1 | float16 | meta-llama/Meta-Llama-3-8B-Instruct | × | 2 | 38.081 | |
| base | 1 | float16 | Qwen/Qwen2.5-7B-Instruct | × | 75 | 94.809 | |
| fastsafetensors | 1 | float16 | Qwen/Qwen2.5-7B-Instruct | ✓ | 54 | 76.694 | |
| fastsafetensors | 1 | float16 | Qwen/Qwen2.5-7B-Instruct | × | 2 | 15.197 | |
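
To make the comparison easier to read, the recorded timings can be turned into speedup ratios relative to the base loader for each model. The snippet below is a small sketch that uses only the numbers from the test results above. Rows not marked × in the GDS column are treated as GDS-enabled runs, which is an assumption about the record rather than something the PR states.

```python
# (model, method, gds_enabled, load_seconds, end2end_seconds), copied from the test results.
# gds_enabled=True for rows not marked "×" is an assumption.
RECORDS = [
    ("meta-llama/Meta-Llama-3-8B-Instruct", "base", False, 11, 46.648),
    ("meta-llama/Meta-Llama-3-8B-Instruct", "fastsafetensors", True, 80, 115.363),
    ("meta-llama/Meta-Llama-3-8B-Instruct", "fastsafetensors", False, 2, 38.081),
    ("Qwen/Qwen2.5-7B-Instruct", "base", False, 75, 94.809),
    ("Qwen/Qwen2.5-7B-Instruct", "fastsafetensors", True, 54, 76.694),
    ("Qwen/Qwen2.5-7B-Instruct", "fastsafetensors", False, 2, 15.197),
]


def speedups(records):
    """Return {(model, gds_enabled): (load_speedup, end2end_speedup)} versus the base row."""
    baselines = {
        model: (load, end2end)
        for model, method, _, load, end2end in records
        if method == "base"
    }
    result = {}
    for model, method, gds_enabled, load, end2end in records:
        if method == "base":
            continue
        base_load, base_end2end = baselines[model]
        # A ratio above 1 means fastsafetensors was faster than the base loader.
        result[(model, gds_enabled)] = (base_load / load, base_end2end / end2end)
    return result


SPEEDUPS = speedups(RECORDS)
for (model, gds_enabled), (load_x, end2end_x) in SPEEDUPS.items():
    gds = "GDS on" if gds_enabled else "GDS off"
    print(f"{model} ({gds}): load {load_x:.2f}x, end-to-end {end2end_x:.2f}x")
```

A ratio below 1 marks a run that was slower than the base loader, so regressions show up as clearly as gains.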

paddle-bot bot commented May 28, 2025

Thanks for your contribution!

@yuanlehome self-assigned this May 28, 2025
3 participants