Skip to content

fix: Fix logic in OnnxToOvNetworkBindings for stateful models #719

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: ovep-develop
Choose a base branch
from

Conversation

RyanMetcalfeInt8
Copy link

@RyanMetcalfeInt8 RyanMetcalfeInt8 commented Jun 24, 2025

This resolves the issue on ovep-develop where stateful models encounter a runtime error when using EPCtx-wrapped models with ORT GenAI.

The issue that I encountered when using ORT GenAI with EPCtx IR's was as follows:

Both the .onnx and the encapsulated IR have a 'beam_idx' tensor defined, so matched_names is set to true (line 67).

The old logic was:

if (!matched_names && session_context.enable_causallm &&
            std::any_of(special_io_names_.begin(), special_io_names_.end(),
                        [&onnx_name](const std::string& name) { return onnx_name.find(name) != std::string::npos; })) {
          // This case also requires dynamic shape inference, so we'll mark the bindings as dynamic.
          has_dynamic_io_ = true;
          continue;
}

So immediately, it skips over this because matched_named==true. For beam_idx, it's something that ORT GenAI doesn't set, and hence we hit a runtime error during BasicBacked::Infer.

I reworked the logic here to reflect the fact that both of these conditions should be true to hit this continue (skip io binding):

  1. session_context.enable_causallm is true (i.e. stateful flow is enabled)
  2. Either there was a name mismatch, or the tensor name is part of the special io names list.

And so, accomplished this with a nested if:

if (session_context.enable_causallm) {
  if (!matched_names ||
      std::any_of(special_io_names_.begin(), special_io_names_.end(),
                  [&onnx_name](const std::string& name) { return onnx_name.find(name) != std::string::npos; })) {
    // This case also requires dynamic shape inference, so we'll mark the bindings as dynamic.
    has_dynamic_io_ = true;
    continue;
  }
}

As that seemed more readable to me than if ( session_context.enable_causallm && ( [.. the other conditions ...]) ).

@MayureshV1 MayureshV1 requested a review from Copilot June 25, 2025 05:02
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes the conditional logic in OnnxToOvNetworkBindings to correctly skip IO bindings for stateful (causal LM) models when encountering name mismatches or special IO names, preventing runtime errors.

  • Restructured the enable_causallm check into a nested if so both stateful mode and name conditions are evaluated together.
  • Clarified comments around handling of newly introduced tensors like beam_idx.
Comments suppressed due to low confidence (1)

onnxruntime/core/providers/openvino/backends/basic_backend.h:75

  • Add or update unit tests to cover the new stateful branch where session_context.enable_causallm is true and inputs have unmatched or special IO names, ensuring dynamic shape inference is validated.
        if (session_context.enable_causallm) {

Copy link

@gblong1 gblong1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Tested on LNL with Phi3.5, Qwen2.5, and Deepseek

@ankitm3k ankitm3k self-requested a review June 25, 2025 05:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants