[Inference API] Align Get/Update Inference APIs with index API pattern #124179
Motivation
Existing Inference APIs (GET/POST _inference/<task_type>/<inference_id>) only support single-ID operations, which creates friction for bulk management scenarios such as credential rotation. This PR aligns their behavior with the standard Index APIs by adding wildcard (*) and comma-separated list support.
Key Changes
• Wildcard patterns (prod_*) and comma-separated IDs (model_a,model_b)
• endpoints[] array wrapping multiple ModelConfigurations
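A minimal Python sketch of how an ID expression of this kind could be expanded against the set of known inference IDs. This is not the PR's implementation: the function name resolve_inference_ids is hypothetical, and silently skipping unknown exact IDs is an assumption (the real API may reject them instead).

```python
import re


def _wildcard_to_regex(pattern):
    # Treat "*" as "any run of characters"; every other character is literal.
    return re.compile(".*".join(re.escape(piece) for piece in pattern.split("*")))


def resolve_inference_ids(expression, known_ids):
    """Expand a comma-separated expression with * wildcards into matching IDs.

    Hypothetical helper: unknown exact IDs are skipped (an assumption),
    and each ID appears once, in first-match order.
    """
    resolved = []
    for part in expression.split(","):
        part = part.strip()
        if not part:
            continue
        if "*" in part:
            regex = _wildcard_to_regex(part)
            matches = [i for i in known_ids if regex.fullmatch(i)]
        else:
            matches = [part] if part in known_ids else []
        for match in matches:
            if match not in resolved:
                resolved.append(match)
    return resolved


known = ["prod_a", "prod_b", "model_a", "model_b", "staging_x"]
print(resolve_inference_ids("prod_*", known))
print(resolve_inference_ids("model_a,model_b", known))
```

Escaping everything except * keeps characters such as ? or [ literal, so only the wildcard named in the description has special meaning here.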
Usage Examples
Standardized Response Format:
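As a rough illustration of the endpoints[] wrapper, the sketch below assembles a response for several matched configurations. The helper build_response and the field names inside each configuration (inference_id, task_type, service) are assumptions made for the example, not confirmed by this PR.

```python
import json


def build_response(model_configurations):
    # Always wrap results in an "endpoints" array, even for a single match,
    # so callers can handle one shape regardless of how many IDs resolved.
    return {"endpoints": list(model_configurations)}


# Hypothetical configurations; field names are illustrative only.
configs = [
    {"inference_id": "model_a", "task_type": "text_embedding", "service": "example"},
    {"inference_id": "model_b", "task_type": "text_embedding", "service": "example"},
]

print(json.dumps(build_response(configs), indent=2))
```

Keeping a single match inside the array as well means clients do not need a separate code path for the single-ID case.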
Testing
• Verified cluster rolling upgrades