
Conversation

Contributor

@R3hankhan123 R3hankhan123 commented Sep 1, 2025

Purpose

Fix outlines-core for s390x

Test Plan

Inference works as expected; tested locally.

Test Result

(APIServer pid=1) INFO 09-01 09:12:19 [launcher.py:44] Route: /v1/score, Methods: POST
(APIServer pid=1) INFO 09-01 09:12:19 [launcher.py:44] Route: /v1/audio/transcriptions, Methods: POST
(APIServer pid=1) INFO 09-01 09:12:19 [launcher.py:44] Route: /v1/audio/translations, Methods: POST
(APIServer pid=1) INFO 09-01 09:12:19 [launcher.py:44] Route: /rerank, Methods: POST
(APIServer pid=1) INFO 09-01 09:12:19 [launcher.py:44] Route: /v1/rerank, Methods: POST
(APIServer pid=1) INFO 09-01 09:12:19 [launcher.py:44] Route: /v2/rerank, Methods: POST
(APIServer pid=1) INFO 09-01 09:12:19 [launcher.py:44] Route: /scale_elastic_ep, Methods: POST
(APIServer pid=1) INFO 09-01 09:12:19 [launcher.py:44] Route: /is_scaling_elastic_ep, Methods: POST
(APIServer pid=1) INFO 09-01 09:12:19 [launcher.py:44] Route: /invocations, Methods: POST
(APIServer pid=1) INFO 09-01 09:12:19 [launcher.py:44] Route: /metrics, Methods: GET
(APIServer pid=1) INFO:     Started server process [1]
(APIServer pid=1) INFO:     Waiting for application startup.
(APIServer pid=1) INFO:     Application startup complete.
(EngineCore_0 pid=21) WARNING 09-01 09:14:07 [cudagraph_dispatcher.py:102] cudagraph dispatching keys are not initialized. No cudagraph will be used.
(APIServer pid=1) INFO 09-01 09:14:09 [loggers.py:123] Engine 000: Avg prompt throughput: 0.4 tokens/s, Avg generation throughput: 4.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 0.0%
(APIServer pid=1) INFO:     172.17.0.1:55114 - "POST /v1/completions HTTP/1.1" 200 OK
(APIServer pid=1) INFO 09-01 09:14:19 [loggers.py:123] Engine 000: Avg prompt throughput: 0.4 tokens/s, Avg generation throughput: 5.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 0.0%
(APIServer pid=1) INFO:     172.17.0.1:36300 - "POST /v1/completions HTTP/1.1" 200 OK
(APIServer pid=1) INFO 09-01 09:14:29 [loggers.py:123] Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.2 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 0.0%
(APIServer pid=1) INFO 09-01 09:14:39 [loggers.py:123] Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 0.0%
[root@b314lp81 ~]# curl http://localhost:8000/v1/completions   -H "Content-Type: application/json"   -d '{
    "model": "gpt2",
    "prompt": "Once upon a time",
    "max_tokens": 50
  }' | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   741  100   660  100    81    242     29  0:00:02  0:00:02 --:--:--   272
{
  "id": "cmpl-ab9c7302a9054f4ba037cce9033b55bd",
  "object": "text_completion",
  "created": 1756718057,
  "model": "gpt2",
  "choices": [
    {
      "index": 0,
      "text": ", there being only a single ray of light in this world, there will be four sides, and the last one will be divided into three sets (standards) by Fire and Light. If any records of his blade enter the valley below the others",
      "logprobs": null,
      "finish_reason": "length",
      "stop_reason": null,
      "token_ids": null,
      "prompt_logprobs": null,
      "prompt_token_ids": null
    }
  ],
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "prompt_tokens": 4,
    "total_tokens": 54,
    "completion_tokens": 50,
    "prompt_tokens_details": null
  },
  "kv_transfer_params": null
}


@mergify mergify bot added the ci/build label Sep 1, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request aims to fix outlines-core for the s390x architecture by building it from source with necessary patches. The overall approach is sound, but the implementation uses sed commands with hardcoded line numbers and string replacements to patch dependencies in the Dockerfile. This is a fragile practice that can easily break with upstream changes, making the build process difficult to maintain. My review includes suggestions to replace these sed commands with more robust patch files to improve the long-term stability of the build.

Comment on lines +219 to +221

Severity: high

Using sed with hardcoded line numbers to patch source files is highly fragile. If the upstream file aws-lc/crypto/pem/pem_lib.c is modified, these line numbers may become incorrect, which could cause the build to fail or apply the patch to the wrong lines. A more robust and maintainable approach is to use a patch file.

I recommend creating a patch file (e.g., aws-lc-s390x.patch) with the required changes and applying it using git apply or patch. This would make the build process more resilient to upstream changes.
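As a sketch of the reviewer's suggestion (file names and contents below are illustrative stand-ins, not taken from the actual PR or from aws-lc source): a unified-diff patch applied with `patch` carries context lines, so it fails loudly when the upstream file drifts instead of silently editing the wrong lines the way a line-numbered `sed` would.

```shell
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the upstream aws-lc/crypto/pem/pem_lib.c; contents are illustrative.
cat > pem_lib.c <<'EOF'
int pem_check(void) {
    return 0;
}
EOF

# The unified diff pins the change to its surrounding context, so `patch`
# refuses to apply it if the neighboring code has changed upstream.
cat > aws-lc-s390x.patch <<'EOF'
--- pem_lib.c
+++ pem_lib.c
@@ -1,3 +1,3 @@
 int pem_check(void) {
-    return 0;
+    return 1;
 }
EOF

patch -p0 < aws-lc-s390x.patch
grep -q 'return 1;' pem_lib.c && echo "patch applied"
```

In a Dockerfile this would replace the `sed` lines with a `COPY` of the patch file plus a single `RUN patch -p0 < aws-lc-s390x.patch` (or `git apply`) step.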

Comment on lines +237 to +239

Severity: high

Using sed and echo to modify Cargo.toml is fragile and can lead to build failures.

  • The sed command on line 237 is sensitive to the exact formatting of the version string.
  • The echo commands on lines 238-239 could create a duplicate [patch.crates-io] section if one already exists, resulting in an invalid TOML file.

A more robust solution is to use a patch file for these modifications. This ensures the changes are applied atomically and are not dependent on the exact formatting of the original file.
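To illustrate the duplicate-section failure mode (the crate name and path below are hypothetical placeholders, not the actual dependencies patched in this PR), a guarded append keeps the edit idempotent even if `[patch.crates-io]` already exists:

```shell
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for outlines-core's Cargo.toml; contents are illustrative.
cat > Cargo.toml <<'EOF'
[package]
name = "outlines-core"
version = "0.1.0"
EOF

add_patch_section() {
    # Append [patch.crates-io] only if it is absent, so reruns stay
    # idempotent and an existing section is never duplicated
    # (a duplicate table is invalid TOML and breaks the build).
    if ! grep -q '^\[patch\.crates-io\]' Cargo.toml; then
        cat >> Cargo.toml <<'EOF'

[patch.crates-io]
some-crate = { path = "../some-crate-fork" }
EOF
    fi
}

add_patch_section
add_patch_section   # second call is a no-op

grep -c '^\[patch\.crates-io\]' Cargo.toml   # prints 1
```

A patch file, as the reviewer suggests, gives the same guarantee with less scripting; the guard above is the minimal fix if the inline-edit approach is kept.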

@R3hankhan123
Contributor Author

#22725 (comment)
@youkaichao @JaheimLee This is the PR to fix the outlines core issue


github-actions bot commented Sep 1, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small, essential subset of tests to catch errors quickly.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

Contributor

@dilipgb dilipgb left a comment


LGTM

@JaheimLee

@youkaichao Hi, could you merge this PR?

@dilipgb
Contributor

dilipgb commented Sep 5, 2025

@JaheimLee @youkaichao Can you please help add the ready label so all CI jobs start running?

@JaheimLee

@JaheimLee @youkaichao Can you please help add the ready label so all CI jobs start running?

I don't have access rights. Maybe you need to add other reviewers who have bandwidth.

@simon-mo simon-mo merged commit e10fef0 into vllm-project:main Sep 8, 2025
12 checks passed
@R3hankhan123 R3hankhan123 deleted the s390x-fix branch September 9, 2025 05:05
eicherseiji pushed a commit to eicherseiji/vllm that referenced this pull request Sep 9, 2025
skyloevil pushed a commit to skyloevil/vllm that referenced this pull request Sep 13, 2025
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025


4 participants