Bug: internvl fine-tuning with quantization (QLoRA): AttributeError: 'Linear4bit' object has no attribute 'state' #1724
Labels: bug (Something isn't working)
Comments
I tried it, and 4-bit indeed fails to run: even if the check is skipped at model initialization, the subsequent run still breaks. The model seems incompatible with 4-bit quantization, while 8-bit works. Could you train with 8-bit quantization instead?
Thanks, 8-bit works for training internvl-chat-v1_5. The other model I need to train is internvl2-8b; it also loads in 8-bit, but another bug appears during training. Training cmd:
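For illustration, loading the model in 8-bit at the transformers level (the checkpoint ID and dtype here are assumptions, not the elided command from this thread) might look like:

```python
import torch
from transformers import AutoModel, BitsAndBytesConfig

# Hedged sketch of an 8-bit load; mirrors typical InternVL usage.
model = AutoModel.from_pretrained(
    "OpenGVLab/InternVL2-8B",  # assumed checkpoint from this thread
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.bfloat16,  # assumption
    trust_remote_code=True,
)
```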
Found the cause of the 8-bit bug (`assert qkv.dtype in [torch.float16, torch.bfloat16]`) and opened a Hugging Face PR: https://huggingface.co/OpenGVLab/InternVL2-8B/discussions/13
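For context, that assertion is flash attention requiring half-precision inputs. A hypothetical sketch of the kind of cast such a fix applies (not the exact patch in the linked PR) might look like:

```python
import torch

def cast_qkv_for_flash_attn(qkv: torch.Tensor) -> torch.Tensor:
    # Flash-attention kernels only accept fp16/bf16; a dequantized 8-bit
    # layer upstream can hand back float32 activations, tripping the assert.
    if qkv.dtype not in (torch.float16, torch.bfloat16):
        qkv = qkv.to(torch.bfloat16)  # assumption: bf16 training setup
    return qkv
```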
Describe the bug
What the bug is and how to reproduce it, ideally with screenshots
The goal is to fine-tune internvl2-8b; internvl-chat-v1_5 is used below since it is a provided example.
When fine-tuning internvl with the cmd below with quantization enabled, the bug below appears (the same fine-tuning command works without any quantization parameters). The same bug occurs when fine-tuning internvl2-8b.
The error comes from https://github.com/modelscope/ms-swift/blob/main/swift/llm/utils/model.py#L4265, where model.language_model.output is (output): Linear4bit(in_features=6144, out_features=92553, bias=False).
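To see the incompatibility in isolation, here is a minimal sketch (assuming bitsandbytes is installed; the layer sizes are arbitrary) showing that only the 8-bit layer exposes the `.state` attribute the failing line relies on:

```python
import bitsandbytes as bnb

# Linear8bitLt carries its quantization bookkeeping in a MatmulLtState
# object at `.state`; Linear4bit attaches its metadata to the weight
# instead, so any code path that touches `.state` breaks under 4-bit.
lin8 = bnb.nn.Linear8bitLt(16, 16, bias=False)
lin4 = bnb.nn.Linear4bit(16, 16, bias=False)

print(hasattr(lin8, "state"))  # True
print(hasattr(lin4, "state"))  # False -> lin4.state raises AttributeError
```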
Your hardware and system info
Write your system info here, such as CUDA version, OS, GPU model, and torch version
Additional context
Add any other context about the problem here