meet error when quantizing Qwen2.5vl-72B with multi-gpus #3867

sys-reasoner · 2025-04-14T04:10:41Z

I am using 8 × 80 GB A100 GPUs to quantize my post-trained Qwen2.5-VL-72B with AutoAWQ, which I installed from source beforehand. Immediately after the model loads, it throws a distributed error. How can I solve this problem?

Here is my shell command:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
MAX_PIXELS=2097152 \
VIDEO_MAX_PIXELS=50176 \
FPS_MAX_FRAMES=12 \
swift export \
    --model Qwen2.5vl72B_post_train \
    --dataset quant.json \
    --quant_n_samples 256 \
    --quant_batch_size -1 \
    --max_length 8192 \
    --quant_method awq \
    --quant_bits 4 \
    --output_dir 72B_awq

Here are the related library versions:
autoawq: 0.2.8 (built from source)
transformers: 4.49.0
torch: 2.5.1

Jintao-Huang · 2025-04-14T07:11:38Z

Try upgrading the transformers.

sys-reasoner · 2025-04-14T12:00:10Z

> Try upgrading the transformers.

After upgrading transformers to 4.51, quantization works. But when I deploy the quantized model with vLLM, it fails with this error:
ValueError: The input size is not aligned with the quantized weight shape. This can be caused by too large tensor parallel size.
I have tried tp=2, 4, and 8; all of them hit this error.

sys-reasoner · 2025-04-14T12:10:57Z

In config.json, I found that these quantization hyperparameters were added:
"quantization_config": {
  "bits": 4,
  "group_size": 128,
  "modules_to_not_convert": [
    "visual"
  ],
  "quant_method": "awq",
  "version": "gemm",
  "zero_point": true
},

sys-reasoner · 2025-04-16T06:36:18Z

@Jintao-Huang The issue is fixed. When using vLLM tensor parallelism, the intermediate size of an AWQ-quantized model must be divisible by (group_size * tp). So the default group size (128) should be changed to a value that satisfies this constraint for the chosen tp.
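For anyone who hits the same error, the divisibility rule above can be checked before quantizing. This is only a rough sketch: the helper names and the candidate group sizes are made up for illustration, and the intermediate size of 29568 is an assumption, so read the real value from your own model's config.json. It also does not check whether your quantization tool accepts a given group size.

```python
def awq_tp_compatible(intermediate_size: int, group_size: int, tp: int) -> bool:
    """Return True if intermediate_size is divisible by group_size * tp."""
    return intermediate_size % (group_size * tp) == 0


def compatible_group_sizes(intermediate_size: int, tp: int,
                           candidates=(128, 64, 32)) -> list:
    """Filter hypothetical candidate group sizes down to those satisfying the rule."""
    return [g for g in candidates
            if awq_tp_compatible(intermediate_size, g, tp)]


# Assumed value for illustration only; take the real one from config.json.
INTERMEDIATE_SIZE = 29568

if __name__ == "__main__":
    for tp in (2, 4, 8):
        sizes = compatible_group_sizes(INTERMEDIATE_SIZE, tp)
        print(f"tp={tp}: compatible group sizes among candidates: {sizes}")
```

An empty list for a given tp means none of the listed candidates fits, in which case a smaller tp or a different candidate set is needed.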

sys-reasoner closed this as completed Apr 16, 2025
