Support fastsafetensors to load model #10667

zeroRains · 2025-05-28T04:23:00Z

Before submitting

Lint code. If there are lint issues, please format the code first.

# Install and register `pre-commit` in the project folder
pip install pre-commit && pre-commit install

# Process previous code files separately
pre-commit run --files XXXX.py

Add test cases to the tests folder. If there are codecov issues, please add test cases first.

PR types

New features

PR changes

Others

Description

Support loading models with fastsafetensors.

Installing fastsafetensors:

git clone -b paddle https://github.com/zeroRains/fastsafetensors.git
cd fastsafetensors
make install

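A PR has been submitted to the upstream repository; once it is merged, fastsafetensors can be installed directly with pip install: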

foundation-model-stack/fastsafetensors#16

GPU: RTX A6000

Test command:

cd llm

# Set model_name here; it is substituted for $model_name in the command below
model_name="Qwen/Qwen2.5-7B-Instruct"
# model_name="meta-llama/Meta-Llama-3-8B-Instruct"

time python ./predict/predictor.py --model_name_or_path $model_name --dtype float16 --mode dynamic --decode_strategy greedy_search --inference_model 1 --block_attn 1 --append_attn 1
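
The `time` prefix above reports wall-clock time for the whole run. As a minimal sketch, the same end-to-end measurement can be taken from Python; the helper name `time_command` is hypothetical and not part of this PR, and the command in the example is only a stand-in for the predictor invocation.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd, check=False)
    elapsed = time.perf_counter() - start
    return completed.returncode, elapsed


if __name__ == "__main__":
    # Stand-in command; replace it with the predictor invocation to time a real run.
    code, seconds = time_command([sys.executable, "-c", "print('ok')"])
    print(f"exit code {code}, {seconds:.3f} s")
```

Repeating each configuration a few times in this way would also show how much of the difference comes from file-system caching rather than from the loader itself.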

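Test results:

The code changed in this PR contains a variable named not_use_gds. Setting it to true disables GDS (GPUDirect Storage); setting it to false enables GDS.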

method	node	dtype	model	GDS	file load time (s)	end-to-end execution time (s)
base	1	float16	meta-llama/Meta-Llama-3-8B-Instruct	×	11	46.648
fastsafetensors	1	float16	meta-llama/Meta-Llama-3-8B-Instruct	√	80	115.363
fastsafetensors	1	float16	meta-llama/Meta-Llama-3-8B-Instruct	×	2	38.081
base	1	float16	Qwen/Qwen2.5-7B-Instruct	×	75	94.809
fastsafetensors	1	float16	Qwen/Qwen2.5-7B-Instruct	√	54	76.694
fastsafetensors	1	float16	Qwen/Qwen2.5-7B-Instruct	×	2	15.197
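
To make the comparison easier to read, the end-to-end times in the table can be converted into speedups over the baseline. The sketch below copies the values from the table; the dictionary layout and the `speedup` helper are illustrative and not part of the PR.

```python
# End-to-end times in seconds, copied from the table above.
END_TO_END = {
    "meta-llama/Meta-Llama-3-8B-Instruct": {
        "base": 46.648,
        "fastsafetensors (GDS)": 115.363,
        "fastsafetensors (no GDS)": 38.081,
    },
    "Qwen/Qwen2.5-7B-Instruct": {
        "base": 94.809,
        "fastsafetensors (GDS)": 76.694,
        "fastsafetensors (no GDS)": 15.197,
    },
}


def speedup(baseline_s, candidate_s):
    """Return how many times faster the candidate is than the baseline."""
    return baseline_s / candidate_s


for model, runs in END_TO_END.items():
    for method, seconds in runs.items():
        if method == "base":
            continue
        print(f"{model} | {method}: {speedup(runs['base'], seconds):.2f}x")
```

A value below 1 means the run was slower than the baseline, which is the case for the GDS run on meta-llama/Meta-Llama-3-8B-Instruct in this table.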

paddle-bot · 2025-05-28T04:23:05Z

Thanks for your contribution!

support fastsafetensors

d188e15

paddle-bot bot added the contributor label May 28, 2025

paddle-bot bot assigned KB-Ding May 28, 2025

yuanlehome self-assigned this May 28, 2025

fix the bug with load hunging

4008df6
