GRPO reinforcement fine-tuning of a GPTQ-quantized model fails: AttributeError: 'GPTQLoraLinear' object has no attribute 'get_delta_weight' #3949

Open
wangxiajun68 opened this issue Apr 21, 2025 · 0 comments


Describe the bug

Train:   0%|▏                                                                                       | 1/695 [00:21<4:04:20, 21.12s/it]
[rank0]: Traceback (most recent call last):
[rank0]:   File "/mnt/wangxj/ms-swift/swift/cli/rlhf.py", line 5, in <module>
[rank0]:     rlhf_main()
[rank0]:   File "/mnt/wangxj/ms-swift/swift/llm/train/rlhf.py", line 98, in rlhf_main
[rank0]:     return SwiftRLHF(args).main()
[rank0]:   File "/mnt/wangxj/ms-swift/swift/llm/base.py", line 47, in main
[rank0]:     result = self.run()
[rank0]:   File "/mnt/wangxj/ms-swift/swift/llm/train/sft.py", line 144, in run
[rank0]:     return self.train(trainer)
[rank0]:   File "/mnt/wangxj/ms-swift/swift/llm/train/sft.py", line 204, in train
[rank0]:     trainer.train(trainer.args.resume_from_checkpoint)
[rank0]:   File "/mnt/wangxj/ms-swift/swift/trainers/mixin.py", line 294, in train
[rank0]:     res = super().train(*args, **kwargs)
[rank0]:   File "/home/.conda/envs/swift/lib/python3.10/site-packages/transformers/trainer.py", line 2245, in train
[rank0]:     return inner_training_loop(
[rank0]:   File "/home/.conda/envs/swift/lib/python3.10/site-packages/transformers/trainer.py", line 2560, in _inner_training_loop
[rank0]:     tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
[rank0]:   File "/mnt/wangxj/ms-swift/swift/trainers/rlhf_trainer/grpo_trainer.py", line 1142, in training_step
[rank0]:     return super().training_step(model, inputs, num_items_in_batch)
[rank0]:   File "/home/.conda/envs/swift/lib/python3.10/site-packages/transformers/trainer.py", line 3730, in training_step
[rank0]:     inputs = self._prepare_inputs(inputs)
[rank0]:   File "/home/.conda/envs/swift/lib/python3.10/site-packages/trl/extras/profiling.py", line 87, in wrapper
[rank0]:     return func(self, *args, **kwargs)
[rank0]:   File "/home/.conda/envs/swift/lib/python3.10/site-packages/trl/trainer/grpo_trainer.py", line 647, in _prepare_inputs
[rank0]:     inputs = self._generate_and_score_completions(inputs)
[rank0]:   File "/mnt/wangxj/ms-swift/swift/trainers/rlhf_trainer/grpo_trainer.py", line 818, in _generate_and_score_completions
[rank0]:     inputs = self._generate_completions(inputs)
[rank0]:   File "/mnt/wangxj/ms-swift/swift/trainers/rlhf_trainer/grpo_trainer.py", line 788, in _generate_completions
[rank0]:     inputs, outputs = self._fast_infer(inputs)
[rank0]:   File "/mnt/wangxj/ms-swift/swift/trainers/rlhf_trainer/grpo_trainer.py", line 731, in _fast_infer
[rank0]:     self._move_model_to_vllm_lmdeploy()
[rank0]:   File "/home//.conda/envs/swift/lib/python3.10/site-packages/trl/extras/profiling.py", line 87, in wrapper
[rank0]:     return func(self, *args, **kwargs)
[rank0]:   File "/mnt/wangxj/ms-swift/swift/trainers/rlhf_trainer/grpo_trainer.py", line 524, in _move_model_to_vllm_lmdeploy
[rank0]:     with patch_lora_merge(unwrapped_model, parameter_group):
[rank0]:   File "/home/.conda/envs/swift/lib/python3.10/contextlib.py", line 135, in __enter__
[rank0]:     return next(self.gen)
[rank0]:   File "/mnt/wangxj/ms-swift/swift/trainers/rlhf_trainer/utils.py", line 78, in patch_lora_merge
[rank0]:     module.get_delta_weight_origin = module.get_delta_weight
[rank0]:   File "/.conda/envs/swift/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1928, in __getattr__
[rank0]:     raise AttributeError(
[rank0]: AttributeError: 'GPTQLoraLinear' object has no attribute 'get_delta_weight'
Train:   0%|▏                                                                                       | 1/695 [00:21<4:08:35, 21.49s/it]
[rank0]:[W421 09:18:05.270067812 ProcessGroupNCCL.cpp:1496] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
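
The traceback suggests that `patch_lora_merge` reads `module.get_delta_weight` on every LoRA layer, and that the `GPTQLoraLinear` layer in this environment does not define that method. The following is a minimal sketch of that failure mode and of one possible guard, not the actual ms-swift or peft code: the two classes are hypothetical stand-ins written in plain Python, and `patch_lora_merge_guarded` is an illustrative name. Whether skipping such layers gives correct weights for vLLM is a separate question that this sketch does not answer.

```python
from contextlib import contextmanager


class LoraLinear:
    """Hypothetical stand-in for a LoRA layer that exposes get_delta_weight."""

    def get_delta_weight(self, adapter):
        return f"delta({adapter})"


class GPTQLoraLinear:
    """Hypothetical stand-in for a quantized LoRA layer without get_delta_weight."""


@contextmanager
def patch_lora_merge_guarded(modules):
    """Save get_delta_weight on layers that have it; skip layers that do not."""
    patched = []
    for module in modules:
        # An unguarded `module.get_delta_weight` raises AttributeError here
        # for layers such as the GPTQLoraLinear stand-in above.
        if not hasattr(module, "get_delta_weight"):
            continue
        module.get_delta_weight_origin = module.get_delta_weight
        patched.append(module)
    try:
        yield patched
    finally:
        # Remove the temporary attribute again when the context exits.
        for module in patched:
            del module.get_delta_weight_origin


if __name__ == "__main__":
    layers = [LoraLinear(), GPTQLoraLinear()]
    with patch_lora_merge_guarded(layers) as patched:
        print([type(m).__name__ for m in patched])
```

With the guard, only the layer that defines `get_delta_weight` is patched, and the quantized stand-in is left untouched instead of raising.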

Your hardware and system info
CUDA 12.2
ms_swift 3.4.0 dev0
peft 0.15.2
auto_gptq 0.7.1
vllm 0.8.3

Additional context

CUDA_VISIBLE_DEVICES=0 \
swift rlhf \
    --rlhf_type grpo \
    --model qwen2.5-3b-gptq-int4 \
    --model_type qwen2_5 \
    --reward_funcs toolbench \
    --train_type lora \
    --use_vllm true \
    --vllm_device auto \
    --vllm_gpu_memory_utilization 0.5 \
    --vllm_max_num_seqs 20 \
    --vllm_max_model_len 512 \
    --lora_rank 2 \
    --lora_alpha 4 \
    --target_modules all-linear \
    --torch_dtype bfloat16 \
    --dataset XXXX \
    --max_completion_length 512 \
    --num_train_epochs 1 \
    --per_device_train_batch_size 2 \
    --per_device_eval_batch_size 2 \
    --learning_rate 1e-5 \
    --gradient_accumulation_steps 8 \
    --eval_steps 100 \
    --save_steps 100 \
    --save_total_limit 2 \
    --logging_steps 5 \
    --max_length 1024 \
    --output_dir XXXX \
    --warmup_ratio 0.05 \
    --dataloader_num_workers 1 \
    --dataset_num_proc 1 \
    --num_generations 2 \
    --temperature 0.9 \
    --deepspeed zero2 \
    --offload_optimizer true \
    --offload_model true \
    --beta 0