Better handling of node shutdown in SearchScrollAsyncAction #111751

Open
piergm opened this issue Aug 9, 2024 · 4 comments
Labels
>bug priority:high A label for assessing bug priority to be used by ES engineers :Search Foundations/Search Catch all for Search Foundations Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch

Comments

@piergm
Member

piergm commented Aug 9, 2024

SearchScrollAsyncAction seems to handle node shutdown poorly.
During a node shutdown we performed multiple scroll search requests (path: /_search/scroll) and got back 500 responses with the following exception:

Failed to execute phase [query], all shards failed; shardFailures {java.lang.IllegalStateException: node [XYZ] is not available

Node XYZ received SIGTERM 30 seconds before the first 500 response and completed its shutdown process 6 seconds before that first 500 response.
I suspect that we could improve on how we handle node shutdown in SearchScrollAsyncAction.
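
For anyone trying to observe this from the client side, here is a minimal sketch, not a fix for SearchScrollAsyncAction itself. It builds a `POST /_search/scroll` request with the standard `scroll` and `scroll_id` body fields and checks a 500 response body for the "node [...] is not available" message quoted above. The class name, method names, base URL, and the idea of matching on the message text are all assumptions made for illustration; the request is only constructed, never sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Elasticsearch or its clients.
final class ScrollShutdownSketch {

    // Matches the shard failure message quoted in this issue.
    private static final Pattern NODE_UNAVAILABLE =
        Pattern.compile("node \\[([^\\]]+)\\] is not available");

    // Builds (but does not send) a scroll continuation request.
    // Assumes scrollId and keepAlive contain no characters that need JSON escaping.
    static HttpRequest scrollRequest(String baseUrl, String scrollId, String keepAlive) {
        String body = "{\"scroll\":\"" + keepAlive + "\",\"scroll_id\":\"" + scrollId + "\"}";
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_search/scroll"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    // Returns the node id from the failure message when the response is a 500
    // that mentions an unavailable node; returns null otherwise.
    static String unavailableNode(int status, String responseBody) {
        if (status != 500 || responseBody == null) {
            return null;
        }
        Matcher m = NODE_UNAVAILABLE.matcher(responseBody);
        return m.find() ? m.group(1) : null;
    }
}
```

A caller could use a non-null result from `unavailableNode` to separate these shutdown-related 500s from other failures when counting them.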

@piergm piergm added >bug medium-risk An open issue or test failure that is a medium risk to future releases Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch :Search Foundations/Search Catch all for Search Foundations labels Aug 9, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@jimczi
Contributor

jimczi commented Aug 9, 2024

Note that PIT (Point in Time) would behave similarly, so it would be beneficial to extend the scope to include all persistent readers.

@javanna javanna added priority:high A label for assessing bug priority to be used by ES engineers and removed medium-risk An open issue or test failure that is a medium risk to future releases labels Sep 4, 2024
@smalyshev
Contributor

I just got a bunch of those as suppressed REST errors. Did we change something to make this happen more frequently, or is it just a coincidence?

@craigtaverner
Contributor

Saw more suppressed REST errors today.

Projects
None yet
Development

No branches or pull requests

6 participants