Async search expiration to be independent from external signals #126833

Open
javanna opened this issue Apr 15, 2025 · 1 comment
Labels
>bug · priority:high · :Search Foundations/Search · Team:Search Foundations

Comments

@javanna
Member

javanna commented Apr 15, 2025

Submit async search allows users to provide a keep_alive. Get async search and get async status also allow users to update the keep_alive while retrieving incremental results or the status of a specific async search.

When an async search expires, its running tasks should be cancelled, to stop doing useless work and release resources for other, non-expired searches. The current cancellation mechanism is based on listener callbacks that check for cancellation (and cancel the task when needed) whenever new shard results arrive at the coordinating node or a partial reduction happens. This is not ideal: if all shards take a long time to respond to the coordinating node, no callback fires in the meantime, so the search is likely to expire without the coordinating node cancelling it promptly.

There is also a discrepancy in that get async search includes a check for cancellation as well, while get async status does not.

In a cross-cluster scenario this gets worse: when minimizing roundtrips, each cluster responds only once, with its full results, so there aren't enough callbacks to leverage for checking for cancellation and cancelling expired tasks.

We should redesign the cancellation mechanism for async search so that it does not depend on external signals: when a new async search starts, submit a runnable that cancels it at its expiration. When a keep_alive gets extended, cancel the previously scheduled runnable and schedule a new one. This will result in cancellation that is independent of the progress made by the search, as well as of which API the user calls to poll for status.
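
To make the proposal concrete, here is a minimal sketch of the scheduling idea in plain Java. It is not Elasticsearch code: `ExpirationCanceller`, `extendKeepAlive` and `onSearchCompleted` are hypothetical names, the `cancelTask` runnable stands in for whatever would actually cancel the search task, and a `ScheduledExecutorService` is assumed in place of the node's own scheduling facilities.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: cancels a search at expiration, regardless of shard progress or API polling.
final class ExpirationCanceller {
    private final ScheduledExecutorService scheduler;
    private final Runnable cancelTask;   // assumption: cancels the underlying search task
    private ScheduledFuture<?> pending;  // currently scheduled expiration
    private long generation;             // identifies the latest scheduled expiration
    private boolean done;                // true once expired or completed

    ExpirationCanceller(ScheduledExecutorService scheduler, Runnable cancelTask, long keepAliveMillis) {
        this.scheduler = scheduler;
        this.cancelTask = cancelTask;
        schedule(keepAliveMillis);
    }

    private synchronized void schedule(long keepAliveMillis) {
        final long gen = ++generation;
        pending = scheduler.schedule(() -> expire(gen), keepAliveMillis, TimeUnit.MILLISECONDS);
    }

    private void expire(long gen) {
        synchronized (this) {
            // Ignore stale timers that lost a race with a keep_alive extension.
            if (done || gen != generation) {
                return;
            }
            done = true;
        }
        cancelTask.run();
    }

    // Called when a keep_alive is updated: drop the old timer and schedule a new one.
    // Returns false if the search already expired or completed.
    synchronized boolean extendKeepAlive(long keepAliveMillis) {
        if (done) {
            return false;
        }
        pending.cancel(false);
        schedule(keepAliveMillis);
        return true;
    }

    // Called when the search finishes on its own: no cancellation is needed any more.
    synchronized void onSearchCompleted() {
        done = true;
        pending.cancel(false);
    }
}
```

In this sketch, both the get async search and the get async status code paths would call `extendKeepAlive` whenever a request carries a new keep_alive, so expiration behaves the same whichever API the user polls.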

@javanna added the :Search Foundations/Search, >bug and priority:high labels Apr 15, 2025
@elasticsearchmachine added the Team:Search Foundations label Apr 15, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

Projects
None yet
Development

No branches or pull requests

2 participants