System Info: llamafactory version 0.9.2.dev0
Is it possible to add new network layers on top of an existing pretrained model and redefine the loss function for training?

I want to add a new network layer on top of Qwen-VL: its input is the last hidden state, and its output is a future state I want to predict. I have already defined my new model in `transformers`, in models/qwen2_5_vl/modeling_qwen2_5_vl.py:
```python
class MyQwen2_5_VLModel(Qwen2_5_VLModel):
    def __init__(self, config):
        super(MyQwen2_5_VLModel, self).__init__(config)
        # extra head that maps the last hidden state to a flattened 3x256x256 image
        self.image_predictor = nn.Sequential(
            nn.Linear(3584, 1024),
            nn.ReLU(),
            nn.Linear(1024, 2048),
            nn.ReLU(),
            nn.Linear(2048, 3 * 256 * 256),
            nn.Sigmoid(),
        )
        # self.loss = nn.CrossEntropyLoss(ignore_index=-100)

    def forward(
        self,
        input_ids: torch.LongTensor = None,
        attention_mask: Optional[torch.Tensor] = None,
        position_ids: Optional[torch.LongTensor] = None,
        past_key_values: Optional[List[torch.FloatTensor]] = None,
        inputs_embeds: Optional[torch.FloatTensor] = None,
        use_cache: Optional[bool] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        return_dict: Optional[bool] = None,
        pixel_values: Optional[torch.Tensor] = None,
        pixel_values_videos: Optional[torch.FloatTensor] = None,
        image_grid_thw: Optional[torch.LongTensor] = None,
        video_grid_thw: Optional[torch.LongTensor] = None,
        rope_deltas: Optional[torch.LongTensor] = None,
        cache_position: Optional[torch.LongTensor] = None,
        second_per_grid_ts: Optional[torch.Tensor] = None,
    ) -> Union[Tuple, Qwen2_5_VLModelOutputWithPast]:
        return_dict = return_dict if return_dict is not None else self.config.use_return_dict
        output = super().forward(
            input_ids,
            attention_mask,
            position_ids,
            past_key_values,
            inputs_embeds,
            use_cache,
            output_attentions,
            output_hidden_states,
            return_dict,
            pixel_values,
            pixel_values_videos,
            image_grid_thw,
            video_grid_thw,
            rope_deltas,
            cache_position,
            second_per_grid_ts,
        )
        assert len(output.last_hidden_state.shape) == 3
        # take the last token's hidden state as the input to the new head
        last_token_last_hidden_state = output.last_hidden_state[:, -1, :]
        prediction_image = self.image_predictor(last_token_last_hidden_state)
        return MyQwen2_5_VLModelOutputWithPast(
            last_hidden_state=output.last_hidden_state,
            past_key_values=output.past_key_values,
            hidden_states=output.hidden_states,
            attentions=output.attentions,
            rope_deltas=output.rope_deltas,
            prediction_image=prediction_image,
        )


@dataclass  # ModelOutput subclasses must be dataclasses
class MyQwen2_5_VLModelOutputWithPast(ModelOutput):
    last_hidden_state: torch.FloatTensor = None
    past_key_values: Optional[List[torch.FloatTensor]] = None
    hidden_states: Optional[Tuple[torch.FloatTensor]] = None
    attentions: Optional[Tuple[torch.FloatTensor]] = None
    rope_deltas: Optional[torch.LongTensor] = None
    prediction_image: Optional[Tuple[torch.FloatTensor]] = None


__all__ = ["Qwen2_5_VLForConditionalGeneration", "Qwen2_5_VLModel", "Qwen2_5_VLPreTrainedModel", "Qwen2_5_VLTextModel", "MyQwen2_5_VLModel"]
```
How do I make training use this model instead of `Qwen2_5_VLModel`?

For the custom loss function, I think I can follow #8084 and #3843.

I hope you can clarify this. Thanks.
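For reference, a minimal sketch of what a custom loss on top of `prediction_image` could look like with a Hugging Face `Trainer`-style `compute_loss` override. The `target_image` field, the MSE choice, and the trainer subclass name are illustrative assumptions, not LLaMA-Factory's actual API:

```python
import torch.nn.functional as F
from transformers import Trainer

class ImagePredictionTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        # "target_image" is assumed to be injected into the batch by a custom collator,
        # shaped (batch, 3, 256, 256) to match the 3 * 256 * 256 predictor output.
        target_image = inputs.pop("target_image")
        outputs = model(**inputs)
        # prediction_image comes from MyQwen2_5_VLModel's extra head
        loss = F.mse_loss(outputs.prediction_image, target_image.flatten(1))
        return (loss, outputs) if return_outputs else loss
```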
You can add a hack for your custom model in this piece of code: if `model_type == xxx`, call `load_your_custom_model()`.
LLaMA-Factory/src/llamafactory/model/loader.py
Lines 143 to 169 in a4048b7
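The lines referenced above are where LLaMA-Factory instantiates the model class. A minimal sketch of the kind of branch being suggested (the `model_type` check, the import path, and the helper name are illustrative assumptions, not the actual loader.py code):

```python
from transformers import AutoConfig, AutoModelForVision2Seq

def load_custom_model(model_name_or_path: str, **init_kwargs):
    config = AutoConfig.from_pretrained(model_name_or_path)
    if config.model_type == "qwen2_5_vl":
        # hack: swap in the custom subclass defined in modeling_qwen2_5_vl.py
        from transformers.models.qwen2_5_vl.modeling_qwen2_5_vl import MyQwen2_5_VLModel
        return MyQwen2_5_VLModel.from_pretrained(model_name_or_path, config=config, **init_kwargs)
    # otherwise fall back to the default auto-class path
    return AutoModelForVision2Seq.from_pretrained(model_name_or_path, config=config, **init_kwargs)
```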