
[Inference API] Align Get/Update Inference APIs with index API pattern #124179


Open. Rassyan wants to merge 2 commits into main.

Rassyan
Contributor

@Rassyan Rassyan commented Mar 6, 2025

Motivation

The existing Inference APIs (GET/POST _inference/<task_type>/<inference_id>) operate on a single inference ID per request, which creates friction for bulk management tasks such as credential rotation. This PR aligns their behavior with the standard Index APIs by adding support for wildcards (*) and comma-separated ID lists.

Key Changes

  1. Support wildcards (prod_*) and comma-separated IDs (model_a,model_b)
  2. Implement batch operations for the matched IDs
  3. Maintain backward compatibility for single-ID usage
  4. The Update API now returns an endpoints[] array wrapping multiple ModelConfigurations
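
The ID matching described in the list above can be sketched roughly as follows. This is an illustration in Python, not the PR's Java implementation: `resolve_inference_ids` and the sample IDs are hypothetical, and the sketch assumes that `*` is the only wildcard character and that an expression matching nothing simply yields no IDs.

```python
import re


def resolve_inference_ids(expression, known_ids):
    """Expand a comma-separated expression with `*` wildcards into concrete IDs.

    Hypothetical helper: assumes `*` matches any run of characters and that
    results keep the order of `known_ids` without duplicates.
    """
    patterns = [part.strip() for part in expression.split(",") if part.strip()]
    regexes = [
        # Escape each literal piece, then join the pieces with ".*" so that
        # only "*" acts as a wildcard.
        re.compile(".*".join(re.escape(piece) for piece in pattern.split("*")))
        for pattern in patterns
    ]
    return [
        inference_id
        for inference_id in known_ids
        if any(regex.fullmatch(inference_id) for regex in regexes)
    ]


known = ["prod_embed", "prod_rerank", "staging_model", "dev_model"]
print(resolve_inference_ids("prod_*,staging_model", known))
print(resolve_inference_ids("dev_model", known))
```

A plain single ID goes through the same path and matches only itself, which is one way the single-ID behavior could stay backward compatible.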

Usage Examples

# Batch GET with wildcard and explicit IDs
GET _inference/text_embedding/prod_*,staging_model

# Wildcard credential update
POST _inference/completion/*openai*  
{
  "service_settings": {
    "api_key": "${NEW_KEY}", 
    "rate_limit": { "requests_per_minute": 1000 }
  }
}
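
For scripting a rotation like the one above, the request could be assembled as in this sketch. The host `http://localhost:9200`, the helper name `build_update_request`, and reading the key from a `NEW_KEY` environment variable are assumptions for illustration. The request is only constructed here, not sent, and authentication is omitted.

```python
import json
import os
import urllib.request
from urllib.parse import quote


def build_update_request(task_type, id_expression, api_key, base_url="http://localhost:9200"):
    """Build (but do not send) a wildcard update request. Hypothetical helper."""
    # Keep "*" and "," unescaped so the server sees the expression as written.
    path = f"/_inference/{quote(task_type, safe='')}/{quote(id_expression, safe='*,')}"
    body = {
        "service_settings": {
            "api_key": api_key,
            "rate_limit": {"requests_per_minute": 1000},
        }
    }
    return urllib.request.Request(
        base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_update_request(
    "completion", "*openai*", os.environ.get("NEW_KEY", "placeholder-key")
)
print(request.get_method(), request.full_url)
```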

Standardized Response Format:

{
  "endpoints": [
    {
      "inference_id": "openai-completion",
      "task_type": "completion",
      "service": "openai",
      "service_settings": {
        "model_id": "gpt-3.5-turbo",
        "rate_limit": { "requests_per_minute": 1000 }
      }
    },
    ...  // Additional endpoints
  ]
}
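
A client consuming this format might pull out the affected endpoints as sketched below. The sample payload mirrors the example above with the elided entries left out, and `summarize_endpoints` is a hypothetical helper, not part of any Elasticsearch client.

```python
import json


def summarize_endpoints(response_text):
    """Return (inference_id, service) pairs from an `endpoints` array response."""
    payload = json.loads(response_text)
    return [
        (endpoint["inference_id"], endpoint["service"])
        for endpoint in payload.get("endpoints", [])
    ]


sample = """
{
  "endpoints": [
    {
      "inference_id": "openai-completion",
      "task_type": "completion",
      "service": "openai",
      "service_settings": {
        "model_id": "gpt-3.5-turbo",
        "rate_limit": { "requests_per_minute": 1000 }
      }
    }
  ]
}
"""
print(summarize_endpoints(sample))
```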

Testing

• Verified cluster rolling upgrades

@elasticsearchmachine added the needs:triage, v9.1.0, and external-contributor labels Mar 6, 2025
@Rassyan force-pushed the allow_wildcard_inference_names branch from aee5be3 to b993e3c March 6, 2025 10:15
@javanna added the :ml label and removed the needs:triage label Mar 14, 2025
@elasticsearchmachine added the Team:ML label Mar 14, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@jonathan-buttner added the Feature:GenAI label Mar 14, 2025
@Rassyan
Contributor Author

Rassyan commented Apr 21, 2025

Hi @maxhniebergall, I noticed you contributed most of the related code. Would you mind taking a quick look at this PR when you have a moment? Any feedback would be greatly appreciated!

@jonathan-buttner self-assigned this May 6, 2025
Labels
external-contributor, Feature:GenAI, :ml, Team:ML, v9.1.0

4 participants