Configurable Inference Timeout #129880

Samiul-TheSoccerFan · 2025-06-23T20:54:05Z

This PR introduces a user-configurable inference timeout setting and uses it as the timeout for inference calls. The timeout is currently hardcoded to 10s; the goal is to make it configurable.
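As a rough illustration of the intended behavior (not the PR's actual Java implementation), the sketch below resolves a timeout from an index settings dictionary and falls back to the 10s default when nothing is configured. The function names and the small unit table are assumptions made for this example; the setting path mirrors the `index.semantic_text.inference_timeout` request used later in this description.

```python
import re

# The value the description says is currently hardcoded (10s), in milliseconds.
DEFAULT_INFERENCE_TIMEOUT_MS = 10_000

# A subset of Elasticsearch-style time units, expressed in milliseconds.
_UNIT_MILLIS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}


def parse_time_value(text):
    """Parse strings such as '1s' or '500ms' into whole milliseconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h|d)", text.strip())
    if match is None:
        raise ValueError(f"unsupported time value: {text!r}")
    return int(match.group(1)) * _UNIT_MILLIS[match.group(2)]


def resolve_inference_timeout_ms(index_settings):
    """Return the configured timeout, or the default if the setting is absent.

    Hypothetical helper: it only walks a nested dict shaped like the
    PUT _settings body shown in this description.
    """
    raw = (
        index_settings.get("index", {})
        .get("semantic_text", {})
        .get("inference_timeout")
    )
    if raw is None:
        return DEFAULT_INFERENCE_TIMEOUT_MS
    return parse_time_value(raw)


if __name__ == "__main__":
    configured = {"index": {"semantic_text": {"inference_timeout": "1s"}}}
    print(resolve_inference_timeout_ms({}))          # falls back to the default
    print(resolve_inference_timeout_ms(configured))  # uses the configured value
```

Keeping the fallback in one place means callers never need to know whether the value was configured or defaulted.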

Setup

PUT _inference/sparse_embedding/my-elser-model
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "task_settings": {}
}

PUT my-semantic-index-5
{
  "mappings": {
    "properties": {
      "writer": {
        "type": "semantic_text",
        "inference_id": "my-elser-model"
      },
      "reader": {
        "type": "semantic_text"
      }
    }
  }
}

PUT my-semantic-index-6
{
  "mappings": {
    "properties": {
      "writer": {
        "type": "semantic_text",
        "inference_id": "my-elser-model"
      },
      "reader": {
        "type": "semantic_text"
      }
    }
  }
}

POST my-semantic-index-5/_doc/1
{
  "writer": "Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-5/_doc/2
{
  "writer": "Another Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-5/_doc/3
{
  "writer": "Another Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-6/_doc/1
{
  "writer": "Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-6/_doc/2
{
  "writer": "Another Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

POST my-semantic-index-6/_doc/3
{
  "writer": "Another Little Red Riding Hood",
  "reader": ["inference test", "another inference test"]
}

GET the default settings:

GET /my-semantic-index-5/_settings

GET /my-semantic-index-5/_settings?include_defaults=true

GET /my-semantic-index-6/_settings

GET /my-semantic-index-6/_settings?include_defaults=true

Update the inference timeout value:

PUT /my-semantic-index-6/_settings
{
  "index": {
    "semantic_text": {
      "inference_timeout": "1s"
    }
  }
}

GET the updated settings:

GET /my-semantic-index-5/_settings

GET /my-semantic-index-5/_settings?include_defaults=true

GET /my-semantic-index-6/_settings

GET /my-semantic-index-6/_settings?include_defaults=true
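To compare the responses before and after the update, it can help to check whether a value comes from the explicit `settings` section or from the `defaults` section that `include_defaults=true` adds. The sketch below is a minimal helper for that comparison. The sample response bodies are hypothetical (in particular, it is an assumption that the default would be reported as `10s` under `defaults`), and `effective_setting` is a name introduced only for this example.

```python
def effective_setting(response, index_name, path):
    """Look up a nested setting, preferring explicit settings over defaults.

    Returns (value, section), where section is 'settings' or 'defaults',
    or (None, None) if the setting is not present in either section.
    """
    entry = response[index_name]
    for section in ("settings", "defaults"):
        node = entry.get(section, {})
        for key in path:
            if not isinstance(node, dict) or key not in node:
                node = None
                break
            node = node[key]
        if node is not None:
            return node, section
    return None, None


# Hypothetical response bodies, shaped like GET <index>/_settings?include_defaults=true.
before_update = {
    "my-semantic-index-6": {
        "settings": {"index": {"number_of_shards": "1"}},
        "defaults": {"index": {"semantic_text": {"inference_timeout": "10s"}}},
    }
}
after_update = {
    "my-semantic-index-6": {
        "settings": {"index": {"semantic_text": {"inference_timeout": "1s"}}},
        "defaults": {"index": {}},
    }
}

PATH = ["index", "semantic_text", "inference_timeout"]

if __name__ == "__main__":
    print(effective_setting(before_update, "my-semantic-index-6", PATH))
    print(effective_setting(after_update, "my-semantic-index-6", PATH))
```

Under these assumptions, `my-semantic-index-5` would keep reporting the default, while `my-semantic-index-6` would report the explicitly set value.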


Mikep86

Nice start! This initial commit gives us some useful info about what the scope of the solution should be:

The inference timeout is also hard-coded for sparse_vector (see here) and knn (see here) queries. Let's expand the scope to have one setting that controls the inference timeout for those queries as well as for the semantic query.
The machinations you had to go through to get the setting value for an individual index indicate that we should make this a cluster setting instead.
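The suggestion above could be sketched roughly as follows: one cluster-level value that the semantic, sparse_vector, and knn query paths all read, instead of a per-index lookup. Everything named here is hypothetical, including the `ClusterSettings` class, the setting key, and `build_inference_request`; the real change would live in Elasticsearch's Java code and may look quite different.

```python
DEFAULT_TIMEOUT_MS = 10_000  # matches the currently hardcoded 10s

# Hypothetical key, chosen only for this example; not a real Elasticsearch setting name.
TIMEOUT_KEY = "example.query_inference_timeout_ms"


class ClusterSettings:
    """Tiny stand-in for a cluster-wide settings store."""

    def __init__(self):
        self._values = {}

    def update(self, key, value):
        self._values[key] = value

    def get(self, key, default):
        return self._values.get(key, default)


def inference_timeout_ms(cluster_settings):
    """Single lookup shared by every query type."""
    return cluster_settings.get(TIMEOUT_KEY, DEFAULT_TIMEOUT_MS)


def build_inference_request(query_type, inference_id, text, cluster_settings):
    """Build a request description; no index name is needed to find the timeout."""
    return {
        "query_type": query_type,
        "inference_id": inference_id,
        "input": text,
        "timeout_ms": inference_timeout_ms(cluster_settings),
    }


if __name__ == "__main__":
    settings = ClusterSettings()
    settings.update(TIMEOUT_KEY, 1_000)
    for query_type in ("semantic", "sparse_vector", "knn"):
        print(build_inference_request(query_type, "my-elser-model", "inference test", settings))
```

Because the lookup takes no index argument, a query that spans several indices does not have to reconcile different per-index values.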

Adding settings for query time inference config and applying during semantic query build

56cef20

elasticsearchmachine added the v9.1.0 label Jun 23, 2025

Samiul-TheSoccerFan added >enhancement v8.19.0 v9.1.0 and removed v9.1.0 labels Jun 23, 2025

Mikep86 reviewed Jun 25, 2025

elasticsearchmachine added v9.2.0 and removed v9.1.0 labels Jun 26, 2025
