Support for Qwen2-Audio and Qwen2.5-Omni #4088

Closed
zcs-hlt opened this issue May 6, 2025 · 2 comments
zcs-hlt commented May 6, 2025

We ran into problems when trying to reproduce the Qwen2.5-Omni results on MMAU. Our environment is as follows:

1. ms_swift == 3.4.0.dev0
2. transformers == 4.52.0.dev0

The inference script is the same as the official example (examples/train/multimodal/omni/infer.sh), but the accuracy (ACC) on MMAU-test-mini is much lower than the result reported in the paper, so we tried to verify our evaluation. We ran Qwen2-Audio through the same evaluation script and found that its score was about the same as Omni's, and likewise far below the result reported in the Omni paper. However, when we previously ran the same evaluation script on Qwen2-Audio, the ACC was close to the result reported in the Omni paper. The only difference in that earlier environment was the ms_swift version:

1. ms_swift == 3.3.0.dev0
2. transformers == 4.52.0.dev0

Could you share any details on how to reproduce the Qwen2.5-Omni and Qwen2-Audio results on MMAU-test-mini reported in the Omni paper with ms_swift == 3.4.0.dev0 or later? (We also tested both ms_swift versions above with different transformers versions, which rules out the transformers version as the cause.)
We also raised a similar issue with the Omni team. Their only advice was to set thinker_do_sample=False. We set temperature=0 in the inference script to achieve this, but the results did not change.
Looking forward to your reply. Thank you very much.
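
Since the two environments above differ only in the ms_swift version, it may help to log the installed versions alongside each evaluation run. The following is a minimal sketch, not part of ms_swift: the helper name `report_versions` is hypothetical, and it assumes the packages are installed under the distribution names `ms-swift` and `transformers`.

```python
from importlib.metadata import PackageNotFoundError, version


def report_versions(packages=("ms-swift", "transformers")):
    """Return a {distribution name: version string} mapping.

    The default distribution names are assumptions; adjust them if the
    packages were installed under different names.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Record the absence explicitly instead of raising.
            found[name] = "not installed"
    return found


if __name__ == "__main__":
    for name, ver in report_versions().items():
        print(f"{name} == {ver}")
```

Saving this output next to each ACC number would make it easier to confirm which ms_swift build produced which score.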

@Jintao-Huang
Collaborator

@zcs-hlt zcs-hlt closed this as completed May 8, 2025