
Paligemma2-3b-pt-448 could train but failed to chat #7116


Closed · 1 task done · hanggun opened this issue Feb 28, 2025 · 1 comment · Fixed by #7120
Labels: solved (This problem has been already solved)


hanggun commented Feb 28, 2025

Reminder

  • I have read the above rules and searched the existing issues.

System Info

  • llamafactory version: 0.9.2.dev0
  • Platform: Linux-5.15.0-86-generic-x86_64-with-glibc2.35
  • Python version: 3.12.3
  • PyTorch version: 2.6.0+cu124 (GPU)
  • Transformers version: 4.49.0
  • Datasets version: 3.2.0
  • Accelerate version: 1.2.1
  • PEFT version: 0.12.0
  • TRL version: 0.9.6
  • GPU type: NVIDIA A100-PCIE-40GB
  • GPU number: 8
  • GPU memory: 39.39GB

Reproduction

Please help me. I used the settings below, but PaliGemma2 cannot chat.

[Image: screenshot of the training settings]

The error says:

Traceback (most recent call last):
  File "/home/ps/data/LLaMA-Factory/venv/lib/python3.12/site-packages/gradio/queueing.py", line 625, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ps/data/LLaMA-Factory/venv/lib/python3.12/site-packages/gradio/route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ps/data/LLaMA-Factory/venv/lib/python3.12/site-packages/gradio/blocks.py", line 2042, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ps/data/LLaMA-Factory/venv/lib/python3.12/site-packages/gradio/blocks.py", line 1601, in call_function
    prediction = await utils.async_iteration(iterator)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ps/data/LLaMA-Factory/venv/lib/python3.12/site-packages/gradio/utils.py", line 728, in async_iteration
    return await anext(iterator)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/ps/data/LLaMA-Factory/venv/lib/python3.12/site-packages/gradio/utils.py", line 722, in __anext__
    return await anyio.to_thread.run_sync(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ps/data/LLaMA-Factory/venv/lib/python3.12/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ps/data/LLaMA-Factory/venv/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 2461, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/home/ps/data/LLaMA-Factory/venv/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 962, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ps/data/LLaMA-Factory/venv/lib/python3.12/site-packages/gradio/utils.py", line 705, in run_sync_iterator_async
    return next(iterator)
           ^^^^^^^^^^^^^^
  File "/home/ps/data/LLaMA-Factory/venv/lib/python3.12/site-packages/gradio/utils.py", line 866, in gen_wrapper
    response = next(iterator)
               ^^^^^^^^^^^^^^
  File "/home/ps/data/LLaMA-Factory/src/llamafactory/webui/chatter.py", line 188, in stream
    for new_text in self.stream_chat(
  File "/home/ps/data/LLaMA-Factory/src/llamafactory/chat/chat_model.py", line 112, in stream_chat
    yield task.result()
          ^^^^^^^^^^^^^
  File "/home/ps/anaconda3/envs/py3.12.3/lib/python3.12/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/home/ps/anaconda3/envs/py3.12.3/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/home/ps/data/LLaMA-Factory/src/llamafactory/chat/chat_model.py", line 129, in astream_chat
    async for new_token in self.engine.stream_chat(
  File "/home/ps/data/LLaMA-Factory/src/llamafactory/chat/hf_engine.py", line 390, in stream_chat
    stream = self._stream_chat(*input_args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ps/data/LLaMA-Factory/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/ps/data/LLaMA-Factory/src/llamafactory/chat/hf_engine.py", line 275, in _stream_chat
    gen_kwargs, _ = HuggingfaceEngine._process_args(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ps/data/LLaMA-Factory/src/llamafactory/chat/hf_engine.py", line 184, in _process_args
    value = torch.stack([torch.stack(v) for v in value])
                         ^^^^^^^^^^^^^^
TypeError: expected Tensor as element 0 in argument 0, but got int

Others

No response

hanggun added the bug (Something isn't working) and pending (This problem is yet to be addressed) labels on Feb 28, 2025

hanggun commented Feb 28, 2025

I found that the input is not a tensor:

pixel_values
tensor([[[[1., 1., 1.,  ..., 1., 1., 1.],
          [1., 1., 1.,  ..., 1., 1., 1.],
          [1., 1., 1.,  ..., 1., 1., 1.],
          ...,
          [1., 1., 1.,  ..., 1., 1., 1.],
          [1., 1., 1.,  ..., 1., 1., 1.],
          [1., 1., 1.,  ..., 1., 1., 1.]],

         [[1., 1., 1.,  ..., 1., 1., 1.],
          [1., 1., 1.,  ..., 1., 1., 1.],
          [1., 1., 1.,  ..., 1., 1., 1.],
          ...,
          [1., 1., 1.,  ..., 1., 1., 1.],
          [1., 1., 1.,  ..., 1., 1., 1.],
          [1., 1., 1.,  ..., 1., 1., 1.]],

         [[1., 1., 1.,  ..., 1., 1., 1.],
          [1., 1., 1.,  ..., 1., 1., 1.],
          [1., 1., 1.,  ..., 1., 1., 1.],
          ...,
          [1., 1., 1.,  ..., 1., 1., 1.],
          [1., 1., 1.,  ..., 1., 1., 1.],
          [1., 1., 1.,  ..., 1., 1., 1.]]]])
token_type_ids
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]

Line 186 of hf_engine.py reads as follows:

if isinstance(value, list) and all(isinstance(v, torch.Tensor) for v in value):  # for pixtral inputs
    value = torch.stack(value)  # assume they have same sizes
elif isinstance(value, list) and all(isinstance(v, list) for v in value):  # for minicpmv inputs
    value = torch.stack([torch.stack(v) for v in value])
elif not isinstance(value, torch.Tensor):
    value = torch.tensor(value)
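
For context, a minimal sketch (with made-up values) of why this branch fails for PaliGemma2: token_type_ids arrives as a plain nested list of ints, and torch.stack only accepts sequences of tensors, while torch.tensor handles nested int lists directly:

import torch

token_type_ids = [[0, 0, 0, 1, 1]]  # hypothetical values, same kind of data as above
try:
    torch.stack([torch.stack(v) for v in token_type_ids])  # the minicpmv branch
except TypeError as e:
    print(e)  # expected Tensor as element 0 in argument 0, but got int

print(torch.tensor(token_type_ids))  # tensor([[0, 0, 0, 1, 1]])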

So I changed it to the following:

if isinstance(value, list) and all(isinstance(v, torch.Tensor) for v in value):  # for pixtral inputs
    value = torch.stack(value)  # assume they have same sizes
elif not isinstance(value, torch.Tensor):
    value = torch.tensor(value)
elif isinstance(value, list) and all(isinstance(v, list) for v in value):  # for minicpmv inputs
    value = torch.stack([torch.stack(v) for v in value])

After this change it runs, but the result is as follows:

[Image: screenshot of the chat output]

Is my approach correct?
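
One caveat with the reordering above: since any list also fails isinstance(value, torch.Tensor), the relocated elif now converts every remaining list via torch.tensor(value), so the minicpmv branch below it can never run. A sketch of a check that keeps both paths by inspecting the inner elements (an assumption about what a proper fix would need, not the actual patch from #7120):

if isinstance(value, list) and all(isinstance(v, torch.Tensor) for v in value):  # for pixtral inputs
    value = torch.stack(value)  # assume they have same sizes
elif isinstance(value, list) and all(
    isinstance(v, list) and len(v) > 0 and isinstance(v[0], torch.Tensor) for v in value
):  # for minicpmv inputs: lists of lists of tensors
    value = torch.stack([torch.stack(v) for v in value])
elif not isinstance(value, torch.Tensor):
    value = torch.tensor(value)  # plain (nested) int lists such as token_type_ids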

hiyouga mentioned this issue on Feb 28, 2025
hiyouga added the solved (This problem has been already solved) label and removed the bug and pending labels on Feb 28, 2025