
[ML] Adding timeout to request for creating inference endpoint #126805


Conversation

@jonathan-buttner (Contributor) commented Apr 14, 2025

This PR also adds a timeout query parameter to the PUT request so users can specify a timeout longer than the default of 30 seconds. The 30-second default was used previously because it matches the ack timeout for the master node.

The timeout applies only to the start-deployment request and to the inference request issued during validation. The model download request ignores the timeout because it waits for the model to finish downloading before responding on the listener.
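For example, a request like the following allows up to five minutes for the deployment to start (the 5m value is illustrative; any duration accepted by Elasticsearch's time-value parsing works), using the same request body as the Testing example below:

PUT http://localhost:9200/_inference/sparse_embedding/elser2?timeout=5m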

Testing

PUT http://localhost:9200/_inference/sparse_embedding/elser2?timeout=2nanos
{
    "service": "elasticsearch",
    "service_settings": {
        "model_id": ".elser_model_2",
        "num_threads": 1,
        "adaptive_allocations": {
            "enabled": true,
            "min_number_of_allocations": 1,
            "max_number_of_allocations": 4
        }
    }
}

Because 2 nanoseconds is far too short for the model deployment to start, this request should fail with a response like the following:

{
    "error": {
        "root_cause": [
            {
                "type": "status_exception",
                "reason": "Timed out after [2nanos] waiting for model deployment to start. Use the trained model stats API to track the state of the deployment."
            }
        ],
        "type": "status_exception",
        "reason": "Timed out after [2nanos] waiting for model deployment to start. Use the trained model stats API to track the state of the deployment."
    },
    "status": 408
}
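As the error message suggests, the state of the deployment can then be tracked with the trained model stats API, for example:

GET http://localhost:9200/_ml/trained_models/.elser_model_2/_stats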

@jonathan-buttner added the v8.18.1, v8.19.0, v9.0.1, >bug, :ml (Machine learning), auto-backport (Automatically create backport pull requests when merged), and Team:ML (Meta label for the ML team) labels on Apr 14, 2025
@elasticsearchmachine (Collaborator)

Hi @jonathan-buttner, I've created a changelog YAML for you.

@davidkyle (Member) left a comment

LGTM

@jonathan-buttner jonathan-buttner marked this pull request as ready for review May 6, 2025 18:10
@elasticsearchmachine (Collaborator)

Pinging @elastic/ml-core (Team:ML)

@jonathan-buttner jonathan-buttner merged commit 4c507e2 into elastic:main May 6, 2025
17 checks passed
@elasticsearchmachine (Collaborator)

💔 Backport failed

Branch 8.19: commit could not be cherry-picked due to conflicts.

You can backport manually with sqren/backport by running: backport --upstream elastic/elasticsearch --pr 126805

@jonathan-buttner (Contributor, Author)

💚 All backports created successfully

Branch 8.19: backport created successfully.

Questions? Please refer to the Backport tool documentation.

jonathan-buttner added a commit to jonathan-buttner/elasticsearch that referenced this pull request May 6, 2025
[ML] Fixing bug with TransportPutModelAction listener and adding timeout to request (elastic#126805)

* Fixing bug with listener and adding timeout

* Update docs/changelog/126805.yaml

* Fixing tests

* Fixing writeTo

(cherry picked from commit 4c507e2)

# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java
@jonathan-buttner jonathan-buttner changed the title [ML] Fixing bug with TransportPutModelAction listener and adding timeout to request [ML] Adding timeout to request for creating inference endpoint May 6, 2025
Labels
auto-backport (Automatically create backport pull requests when merged), backport pending, >bug, :ml (Machine learning), Team:ML (Meta label for the ML team), v8.19.0, v9.1.0