Performance decreases severely after fine-tuning the Qwen2.5-Omni model with LoRA #8146
Comments
Maybe overfitting?
@Kuangdd01 Not sure; the accuracy on the training set decreased too.
Can you provide the training script? @humble-gambler
Sure, the training script is below. I also tried a very small learning rate like 1.0e-7. The loss curve is relatively volatile and does not drop as fast, but the fine-tuned model still performs worse than the original one, just not as severely. It seems the fine-tuning simply doesn't work, which is a little weird.

[training script screenshot]
Emmm, the lr is fine.
Q1: The prediction format is simple; I add the prompt "You should output the emotion label by using the following format: [emotion label]". The predictions therefore don't contain abnormal tokens; the model simply outputs the label, such as [sadness] or [anger]. So it produces the right format, but sometimes the wrong label. (BTW, can I output the prediction tokens during the fine-tuning stage when using LLaMA-Factory? What should I set?)
Q2: Yes, it contains video data.
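For reference, here is a minimal sketch of how predictions in this bracketed format could be scored offline. The regex and the label set are assumptions based on the examples above, not the actual evaluation code:

```python
import re

# Hypothetical label set for illustration; the real set comes from the dataset.
LABELS = {"sadness", "anger", "joy", "fear", "disgust", "surprise", "neutral"}

def parse_label(prediction: str) -> str | None:
    """Extract the first bracketed label, e.g. '[sadness]' -> 'sadness'."""
    match = re.search(r"\[([^\]]+)\]", prediction)
    if match is None:
        return None
    label = match.group(1).strip().lower()
    return label if label in LABELS else None

def accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions whose parsed label matches the reference label."""
    hits = sum(parse_label(p) == r.lower() for p, r in zip(predictions, references))
    return hits / max(len(references), 1)

print(accuracy(["[sadness]", "[anger]", "happy"], ["sadness", "joy", "anger"]))  # ≈ 0.33
```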
You can save several LoRA adapters, then do prediction after training.
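As a rough sketch of that workflow, the snippet below loads a saved adapter checkpoint on top of the base model with PEFT and generates a label. The paths are placeholders, and whether the Omni checkpoint can be loaded through `AutoModelForCausalLM` is an assumption; full Qwen2.5-Omni inference normally goes through its dedicated multimodal class and processor, so treat this as illustrating the adapter-loading step only:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "path/to/qwen2.5-omni-base"                # placeholder base checkpoint
ADAPTER_DIR = "saves/qwen2.5-omni/lora/checkpoint-500"  # hypothetical saved adapter

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, ADAPTER_DIR)    # attach the LoRA weights
model.eval()

prompt = (
    "You should output the emotion label by using the following format: [emotion label]"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Saving an adapter every few hundred steps and running a check like this on a handful of training samples makes it easier to see at which checkpoint the accuracy starts to degrade.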
I don't think the extra prompt is the key reason, and the inputs are not different either; the classifier I built myself works normally on them. It's weird too 😂 I am not sure whether some view operations in the Omni model code cause a node in the computation graph to block the gradient from flowing back, making LLaMA-Factory fine-tuning unable to work properly. But if no gradients flowed back at all, the performance should be the same as the original model.
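One way to test the blocked-gradient hypothesis is to inspect the LoRA parameters right after a backward pass. A generic PyTorch sketch (the `lora_` name filter assumes PEFT's usual `lora_A`/`lora_B` parameter naming):

```python
import torch
from torch import nn

def report_lora_grads(model: nn.Module) -> None:
    """Print gradient norms of LoRA parameters after loss.backward() has run.

    If every LoRA tensor shows requires_grad=False, grad=None, or a zero norm,
    the adapters are not actually being trained and the computation graph is
    likely broken somewhere upstream.
    """
    for name, param in model.named_parameters():
        if "lora_" not in name:
            continue
        if not param.requires_grad:
            print(f"{name}: requires_grad=False (frozen?)")
        elif param.grad is None:
            print(f"{name}: grad is None (no gradient reached this tensor)")
        else:
            print(f"{name}: grad norm = {param.grad.norm().item():.3e}")

# Usage inside a training step (sketch):
#   loss = model(**batch).loss
#   loss.backward()
#   report_lora_grads(model)
```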
Thanks for reporting this. I think something went wrong.
TBH, I didn't encounter this issue in my case: I also got a normal gradient curve during training, but I didn't check the gradient matrices mentioned above. My training config can be found here: #7767 (comment)
Does your model perform normally after training? Because your loss curve looks similar to the one above.
Yes, it performs normally as expected after training, with improvements on metrics such as BERTScore, BLEU, and ROUGE.
@Luffy-ZY-Wang @Kuangdd01 Thanks for your help. Maybe I got something wrong; I will keep trying.
Reminder
System Info
I tried to use the Omni model for emotion recognition. The fine-tuning dataset is relatively simple: the label is used directly as the assistant's response for autoregressive training. During fine-tuning, the training-set loss quickly dropped to 0, but when I evaluated on the training set again, the classification accuracy was very low, so this is not overfitting. On the test set, the prediction scores dropped from 0.5+ to 0.2+, and after fine-tuning many labels that were originally predicted correctly are now predicted incorrectly.
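For context, a single sample in this kind of setup might look roughly like the entry below, written as a Python dict in LLaMA-Factory's sharegpt-style multimodal layout; the video path, prompt wording, and label are made up for illustration and are not the actual data:

```python
# Illustrative only: one emotion-recognition sample where the bare label is the
# entire assistant response used for autoregressive training. Paths and field
# values are hypothetical.
sample = {
    "messages": [
        {
            "role": "user",
            "content": (
                "<video>Identify the speaker's emotion. You should output the "
                "emotion label by using the following format: [emotion label]"
            ),
        },
        {"role": "assistant", "content": "[sadness]"},
    ],
    "videos": ["data/clips/clip_00001.mp4"],
}
```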
Reproduction
A sample from the training set after tokenization is below:
and the loss curve:
Others
Thanks for helping.