[Inference API] Align Get/Update Inference APIs with index API pattern #124179

Rassyan · 2025-03-06T04:11:23Z

Motivation

The existing Inference APIs (GET/POST _inference/<task_type>/<inference_id>) accept only a single inference ID per request, which makes bulk management tasks such as credential rotation cumbersome. This PR aligns their behavior with the standard index APIs by adding support for wildcards (*) and comma-separated ID lists.

Key Changes

Support wildcards (prod_*) and comma-separated IDs (model_a,model_b) in the inference ID
Batch operations across all matched endpoints
Backward compatibility maintained for single-ID usage
Update API now returns an endpoints[] array wrapping multiple ModelConfigurations
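
The description does not show how an ID expression is resolved, so the following is only a minimal sketch of the matching semantics listed above. It assumes that * matches any run of characters and that commas separate independent patterns. The function name resolve_inference_ids and the sample IDs are hypothetical and are not taken from the Elasticsearch code base.

```python
import re


def resolve_inference_ids(expression, known_ids):
    """Expand a comma-separated expression with '*' wildcards into matching IDs.

    Hypothetical helper: illustrates the matching rules only, not the PR's code.
    """
    matched = []
    for part in expression.split(","):
        part = part.strip()
        if not part:
            continue
        # Escape each literal piece, then let '*' match any run of characters.
        pattern = ".*".join(re.escape(piece) for piece in part.split("*"))
        regex = re.compile(pattern)
        for candidate in known_ids:
            if regex.fullmatch(candidate) and candidate not in matched:
                matched.append(candidate)
    return matched


known = ["prod_a", "prod_b", "staging_model", "dev_model"]
print(resolve_inference_ids("prod_*,staging_model", known))
print(resolve_inference_ids("dev_model", known))  # a plain single ID still works
```

How unmatched IDs should be reported (an empty result versus an error) is not stated in the description; this sketch simply returns an empty list.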

Usage Examples

# Batch GET with wildcard and explicit IDs
GET _inference/text_embedding/prod_*,staging_model

# Wildcard credential update
POST _inference/completion/*openai*  
{
  "service_settings": {
    "api_key": "${NEW_KEY}", 
    "rate_limit": { "requests_per_minute": 1000 }
  }
}
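
For scripted credential rotation, the request above could be assembled along these lines. This is a sketch under assumptions: the helper name build_update_request is hypothetical, the method and path simply mirror the example above, and the key is read from an environment variable named NEW_KEY in place of the ${NEW_KEY} placeholder. Nothing is sent over the network.

```python
import json
import os


def build_update_request(task_type, id_expression, api_key, requests_per_minute):
    """Return (method, path, body) for a wildcard update, mirroring the example.

    Hypothetical helper: it only builds the request, it does not send it.
    """
    path = f"_inference/{task_type}/{id_expression}"
    body = {
        "service_settings": {
            "api_key": api_key,
            "rate_limit": {"requests_per_minute": requests_per_minute},
        }
    }
    return "POST", path, json.dumps(body)


# Fall back to a visible placeholder when NEW_KEY is not set.
new_key = os.environ.get("NEW_KEY", "<new-key>")
method, path, body = build_update_request("completion", "*openai*", new_key, 1000)
print(method, path)
print(body)
```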

Standardized Response Format:

{
  "endpoints": [
    {
      "inference_id": "openai-completion",
      "task_type": "completion",
      "service": "openai",
      "service_settings": {
        "model_id": "gpt-3.5-turbo",
        "rate_limit": { "requests_per_minute": 1000 }
      }
    },
    ...  // Additional endpoints
  ]
}
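
A client consuming this response might read the endpoints array as sketched below. The function name summarize_endpoints is hypothetical, and the sample payload is the single endpoint shown above with the elided entries left out. Treating a missing endpoints key as an empty list is an assumption, not documented behavior.

```python
import json

# Sample payload copied from the response format above (elided entries omitted).
SAMPLE_RESPONSE = """
{
  "endpoints": [
    {
      "inference_id": "openai-completion",
      "task_type": "completion",
      "service": "openai",
      "service_settings": {
        "model_id": "gpt-3.5-turbo",
        "rate_limit": { "requests_per_minute": 1000 }
      }
    }
  ]
}
"""


def summarize_endpoints(response_text):
    """Return (inference_id, task_type, service) for each endpoint in the response."""
    payload = json.loads(response_text)
    return [
        (endpoint["inference_id"], endpoint["task_type"], endpoint["service"])
        for endpoint in payload.get("endpoints", [])
    ]


for inference_id, task_type, service in summarize_endpoints(SAMPLE_RESPONSE):
    print(f"{inference_id}: {task_type} via {service}")
```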

Testing

• Verified cluster rolling upgrades

elasticsearchmachine · 2025-03-14T10:07:58Z

Pinging @elastic/ml-core (Team:ML)

Rassyan · 2025-04-21T03:36:50Z

Hi @maxhniebergall, I noticed you contributed most of the related code. Would you mind taking a quick look at this PR when you have a moment? Any feedback would be greatly appreciated!

elasticsearchmachine added the needs:triage, v9.1.0, and external-contributor labels Mar 6, 2025

Rassyan added 2 commits March 6, 2025 18:00

[Inference API] Align Get/Update APIs with index API by supporting wildcard(*) and comma-separated inference ids

692b305

Adding changelog for PR elastic#124179

b993e3c

Rassyan force-pushed the allow_wildcard_inference_names branch from aee5be3 to b993e3c March 6, 2025 10:15

github-actions bot deployed to docs-preview March 6, 2025 10:15

javanna added the :ml label and removed the needs:triage label Mar 14, 2025

elasticsearchmachine added the Team:ML label Mar 14, 2025

jonathan-buttner added the Feature:GenAI label Mar 14, 2025

jonathan-buttner self-assigned this May 6, 2025
