
RuntimeError: generator raised StopIteration #8168


Open
1 task done
cs-mshah opened this issue May 27, 2025 · 0 comments
Labels: enhancement (New feature or request), pending (This problem is yet to be addressed)

Comments

@cs-mshah

Reminder

  • I have read the above rules and searched the existing issues.

System Info

[2025-05-27 04:27:19,216] [INFO] [real_accelerator.py:239:get_accelerator] Setting ds_accelerator to cuda (auto detect)
INFO 05-27 04:27:25 [importing.py:53] Triton module has been replaced with a placeholder.
INFO 05-27 04:27:25 [__init__.py:239] Automatically detected platform cuda.

- `llamafactory` version: 0.9.3.dev0
- Platform: Linux-5.10.0-34-cloud-amd64-x86_64-with-glibc2.35
- Python version: 3.11.11
- PyTorch version: 2.6.0+cu124 (GPU)
- Transformers version: 4.51.3
- Datasets version: 3.6.0
- Accelerate version: 1.7.0
- PEFT version: 0.15.2
- TRL version: 0.9.6
- GPU type: NVIDIA A100-SXM4-40GB
- GPU number: 8
- GPU memory: 39.38GB
- DeepSpeed version: 0.16.8
- Bitsandbytes version: 0.45.5
- vLLM version: 0.8.5.post1

Reproduction

While fine-tuning Qwen2.5-VL-7B with SFT on video data, I hit the following error during dataset preprocessing:

[rank0]: multiprocess.pool.RemoteTraceback:                                                                                                                                                   
[rank0]: """                                                                                                                                                                                  
[rank0]: Traceback (most recent call last):                                                                                                                                                   
[rank0]:   File "/opt/conda/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 3475, in iter_outputs                                                                               
[rank0]:     yield i, apply_function(example, i, offset=offset)                                                                                                                               
[rank0]:              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                               
[rank0]:   File "/opt/conda/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 3398, in apply_function                                                                             
[rank0]:     processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)                                                                                                             
[rank0]:                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                             
[rank0]:   File "/app/src/llamafactory/data/processor/supervised.py", line 99, in preprocess_dataset                                                                                          
[rank0]:     input_ids, labels = self._encode_data_example(                                                                                                                                   
[rank0]:                         ^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                   
[rank0]:   File "/app/src/llamafactory/data/processor/supervised.py", line 43, in _encode_data_example                                                                                        
[rank0]:     messages = self.template.mm_plugin.process_messages(prompt + response, images, videos, audios, self.processor)                                                                   
[rank0]:                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                   
[rank0]:   File "/app/src/llamafactory/data/mm_plugin.py", line 1454, in process_messages                                                                                                     
[rank0]:     mm_inputs = self._get_mm_inputs(images, videos, audios, processor)                                                                                                               
[rank0]:                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                               
[rank0]:   File "/app/src/llamafactory/data/mm_plugin.py", line 1423, in _get_mm_inputs                                                                                                       
[rank0]:     video_data = self._regularize_videos(                                                                                                                                            
[rank0]:                  ^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                            
[rank0]:   File "/app/src/llamafactory/data/mm_plugin.py", line 1384, in _regularize_videos                                                                                                   
[rank0]:     video_stream = next(stream for stream in container.streams if stream.type == "video")                                                                                            
[rank0]:                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                            
[rank0]: StopIteration

Although I ran some ffmpeg checks on my videos beforehand, the missing-stream problem wasn't caught, so a few corrupt videos remained in the training JSONL.
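For context, the failing line in `mm_plugin.py` calls `next()` on a generator expression with no default, so a container without a video stream raises `StopIteration`, which Python 3.7+ (PEP 479) converts to `RuntimeError: generator raised StopIteration` when it escapes through a generator frame. A minimal sketch of the pattern and a safer variant (`FakeStream` is a stand-in for PyAV's stream objects, which expose a `.type` attribute):

```python
# Stand-in for PyAV's stream objects, which carry a .type of "video",
# "audio", etc. Used here only to reproduce the failure without real files.
class FakeStream:
    def __init__(self, type_):
        self.type = type_

# A corrupt/odd container: an audio stream only, no video stream.
streams = [FakeStream("audio")]

# Current pattern -- raises StopIteration when no video stream exists,
# which PEP 479 turns into "RuntimeError: generator raised StopIteration"
# once it escapes into a generator frame:
#   video_stream = next(s for s in streams if s.type == "video")

# Safer pattern: give next() a default and handle the missing stream.
video_stream = next((s for s in streams if s.type == "video"), None)
if video_stream is None:
    print("no video stream; skipping sample")
```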

Feature request: can preprocessing directly skip videos/media that fail to decode, instead of raising an error that halts the whole training run? When processing large datasets this would save a lot of time that is otherwise spent tracking down and removing media that fails for unknown reasons.
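Until something like that lands, one workaround is to pre-filter the training JSONL and drop records whose videos cannot be opened. This is a hedged sketch: the `videos` key and record layout are assumptions based on typical sharegpt-style data files, and the probe uses PyAV (the decoder the traceback goes through) if it is installed, otherwise it treats every file as invalid; adapt both to your schema.

```python
import json

def is_valid_video(path):
    """Return True iff the file opens and exposes at least one video stream.

    Assumes PyAV is installed; any failure (missing library, unreadable
    file, no video stream) marks the file invalid rather than raising.
    """
    try:
        import av
        with av.open(path) as container:
            return any(s.type == "video" for s in container.streams)
    except Exception:
        return False

def filter_jsonl(in_path, out_path, video_key="videos"):
    """Copy records whose videos all validate; return (kept, dropped).

    Records without a `videos` key are kept unchanged.
    """
    kept = dropped = 0
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            record = json.loads(line)
            if all(is_valid_video(v) for v in record.get(video_key, [])):
                fout.write(line)
                kept += 1
            else:
                dropped += 1
    return kept, dropped
```

Running this once over the dataset is much cheaper than discovering corrupt files one crash at a time inside a multiprocess `map`.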

Others

No response

@cs-mshah cs-mshah added bug Something isn't working pending This problem is yet to be addressed labels May 27, 2025
@hiyouga hiyouga added enhancement New feature or request and removed bug Something isn't working labels May 27, 2025