hl475 (Contributor) commented Oct 17, 2025

Purpose

Add lm-eval configs for Qwen3-235B-A22B-Thinking-2507-FP8 and Qwen3-8B. Not ready yet; see the CI run: https://buildkite.com/vllm/ci/builds/35444/steps/canvas

Test Plan

pytest -s -v .buildkite/lm-eval-harness/test_lm_eval_correctness.py \
   --config-list-file .buildkite/lm-eval-harness/configs/models-large-h100.txt \
   --tp-size 4

pytest -s -v test_lm_eval_correctness.py \
    --config-list-file=configs/models-large-h100.txt \
    --tp-size=8
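
The two commands above pass a config list file and a tensor-parallel size to the test. As a rough sketch of what that involves, the Python below reads such a list file and assembles the same argv. It assumes the list file holds one config filename per line, relative to the list file's directory, with `#` marking comments; that format and the helper names `read_config_list` and `build_pytest_command` are illustrative assumptions, not taken from this PR or from the test itself.

```python
from pathlib import Path


def read_config_list(list_file: Path) -> list[Path]:
    """Return the config paths named in a list file.

    Assumption (not confirmed by the PR): one config filename per line,
    relative to the list file's directory; blank lines and lines starting
    with '#' are skipped.
    """
    base = list_file.parent
    configs = []
    for raw in list_file.read_text(encoding="utf-8").splitlines():
        name = raw.strip()
        if not name or name.startswith("#"):
            continue
        configs.append(base / name)
    return configs


def build_pytest_command(test_file: str, list_file: str, tp_size: int) -> list[str]:
    """Assemble the argv for the invocation shown in the test plan."""
    return [
        "pytest", "-s", "-v", test_file,
        "--config-list-file", list_file,
        "--tp-size", str(tp_size),
    ]
```

Passing the result of `build_pytest_command` to `subprocess.run` would reproduce the first command in the test plan, given the same paths and `tp_size=4`.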

Test Result

LM Eval Small Models

6 passed, 3 warnings in 1229.77s (0:20:29)

LM Eval Large Models (H200)

1 passed, 102 warnings in 834.74s (0:13:54)

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

mergify bot added the ci/build and qwen labels on Oct 17, 2025
hl475 force-pushed the Qwen3-235B-A22B-Thinking-2507-FP8 branch from ca0144d to ed452ff on October 17, 2025 17:38
hl475 force-pushed the Qwen3-235B-A22B-Thinking-2507-FP8 branch 4 times, most recently from 8720e75 to 303c0c5, on October 17, 2025 23:51
hl475 changed the title from "[CI/Build][WIP] Add eval config for Qwen3-235B-A22B-Thinking-2507-FP8" to "[CI/Build]Add eval config for Qwen3-235B-A22B-Thinking-2507-FP8 and Qwen3-8B" on Oct 18, 2025
hl475 force-pushed the Qwen3-235B-A22B-Thinking-2507-FP8 branch from 418d612 to bb9a649 on October 18, 2025 05:41
Labels

ci/build, qwen (Related to Qwen models)

2 participants