How exactly does max_pixels take effect? #3721

Open
goodstudent9 opened this issue Mar 30, 2025 · 6 comments
Comments

@goodstudent9

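Hello maintainers,
When I run SFT and GRPO training with Qwen2.5-VL, I need to specify max_pixels for each image.
Following the documentation, I set it on the command line:
MAX_PIXELS=40000 \
CUDA_VISIBLE_DEVICES=2,3,6,7 \
NPROC_PER_NODE=4 \
swift rlhf \
However, in the checkpoint I get, the max_pixels value in the preprocessor config is unchanged; it is still the base model's config. This confuses me: does specifying MAX_PIXELS this way actually have any effect?

Also, when I validate by loading my model and running eval with Hugging Face transformers code, the images are not actually resized according to the max_pixels value given on the command line. So I am writing to ask: is this a bug?

Best regards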

@Jintao-Huang
Collaborator

MAX_PIXELS overrides the default value defined here, and it does take effect. See: https://github.com/QwenLM/Qwen2.5-VL/blob/main/qwen-vl-utils/src/qwen_vl_utils/vision_process.py#L24
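To make the effect of this default concrete, here is a minimal sketch of the kind of resize logic the linked file applies, assuming the limit is read from the `MAX_PIXELS` environment variable with a fallback default. The names `resolve_max_pixels` and `sketch_smart_resize` are hypothetical, the default constants are assumptions, and this is a simplified re-implementation for illustration, not the code of `qwen_vl_utils` or ms-swift.

```python
import math
import os

IMAGE_FACTOR = 28                      # each side is snapped to a multiple of this
DEFAULT_MIN_PIXELS = 4 * 28 * 28       # assumed lower bound on the pixel count
DEFAULT_MAX_PIXELS = 16384 * 28 * 28   # assumed default upper bound


def resolve_max_pixels() -> int:
    """Hypothetical helper: use MAX_PIXELS from the environment, else the default."""
    return int(os.environ.get("MAX_PIXELS", DEFAULT_MAX_PIXELS))


def sketch_smart_resize(height, width, factor=IMAGE_FACTOR,
                        min_pixels=DEFAULT_MIN_PIXELS, max_pixels=None):
    """Return (height, width) snapped to `factor` and kept within the pixel bounds."""
    if max_pixels is None:
        max_pixels = resolve_max_pixels()
    # Start by rounding each side to the nearest multiple of `factor`.
    h_bar = max(factor, round(height / factor) * factor)
    w_bar = max(factor, round(width / factor) * factor)
    if h_bar * w_bar > max_pixels:
        # Too large: shrink both sides by the same ratio, rounding down.
        beta = math.sqrt((height * width) / max_pixels)
        h_bar = math.floor(height / beta / factor) * factor
        w_bar = math.floor(width / beta / factor) * factor
    elif h_bar * w_bar < min_pixels:
        # Too small: enlarge both sides by the same ratio, rounding up.
        beta = math.sqrt(min_pixels / (height * width))
        h_bar = math.ceil(height * beta / factor) * factor
        w_bar = math.ceil(width * beta / factor) * factor
    return h_bar, w_bar
```

Under these assumptions, a 1000x1000 image with `max_pixels=40000` comes out as 196x196 (38,416 pixels), so the limit bounds the resized image rather than setting an exact size.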

@Jintao-Huang
Collaborator

For how to change max_pixels in transformers code, see the qwen2_5_vl example code on Hugging Face/ModelScope.
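The example code referred to here passes the pixel limits when the processor is created. A minimal sketch along those lines follows, assuming `transformers` is installed and the checkpoint directory contains the processor files. The helpers `pixel_budget` and `load_processor` are hypothetical names introduced for illustration. Because the saved preprocessor config may still carry the base model's values, the limit is passed explicitly at load time.

```python
PATCH_PIXELS = 28 * 28  # pixels per visual token, as used in the model card examples


def pixel_budget(num_tokens: int) -> int:
    """Convert a visual-token budget into a pixel budget."""
    return num_tokens * PATCH_PIXELS


def load_processor(checkpoint_dir: str, max_pixels: int,
                   min_pixels: int = pixel_budget(4)):
    """Load a processor with explicit pixel limits instead of the saved defaults."""
    # Imported lazily so pixel_budget() works without transformers installed.
    from transformers import AutoProcessor

    return AutoProcessor.from_pretrained(
        checkpoint_dir, min_pixels=min_pixels, max_pixels=max_pixels
    )
```

For evaluation one would then call something like `load_processor("path/to/checkpoint", max_pixels=40000)`, where the path is a placeholder, so that eval uses the same limit as training.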

@Jintao-Huang
Collaborator

@yizheyfd

It seems the max_pixels setting also affects how images are preprocessed; there is a scaled_image operation here: https://github.com/modelscope/ms-swift/blob/main/swift/llm/template/base.py#L198
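As a rough illustration of what such a rescale step could compute, here is a sketch that shrinks an image to fit a pixel budget while preserving its aspect ratio. The function name `sketch_rescaled_size` is hypothetical and the logic is an assumption about the general technique, not a copy of the ms-swift code at the linked line.

```python
import math


def sketch_rescaled_size(width, height, max_pixels):
    """Return (width, height) scaled down to fit max_pixels, keeping the aspect ratio."""
    if max_pixels is None or width * height <= max_pixels:
        return width, height  # already within budget, leave unchanged
    ratio = width / height
    # Solve new_width * new_height = max_pixels with new_width = ratio * new_height.
    new_height = math.sqrt(max_pixels / ratio)
    new_width = new_height * ratio
    return int(new_width), int(new_height)
```

With `max_pixels=40000`, a 2000x1000 image would come out as 282x141 under this sketch. A step like this runs before any model-specific resizing, which is one reason the two settings can behave differently.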

@Jintao-Huang
Collaborator

#3729

--max_pixels is a general parameter that applies to all models; the MAX_PIXELS environment variable applies only to qwen2vl and qwen2_5vl.

@Jintao-Huang
Collaborator

See the command-line parameters documentation.

3 participants