Skip to content

Online prewarming service interface docs and usage in SearchService #126561

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Apr 11, 2025

Conversation

andreidan
Copy link
Contributor

This adds the interface for search online prewarming with a default NOOP implementation.
This also hooks the interface in the SearchService after we fork the query phase to the search thread pool.

This adds the interface for search online prewarming with a default NOOP
implementation. This also hooks the interface in the SearchService after
we fork the query phase to the search thread pool.
@andreidan andreidan added >non-issue :Search Foundations/Search Catch all for Search Foundations v9.1.0 labels Apr 9, 2025
@andreidan andreidan requested a review from a team as a code owner April 9, 2025 20:03
@andreidan andreidan requested a review from cbuescher April 9, 2025 20:03
@elasticsearchmachine elasticsearchmachine added the Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch label Apr 9, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I took a first look and only left a few questions for my understanding.

* should be skipped and indicate this decision via this
* flag.
*/
void prewarm(IndexShard indexShard, boolean skipPrewarming);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious to understand why we are passing a skip flag here instead of letting the caller to an if-block around the prewarm() call?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it'd encourage callers to think about skipping under some circumstance that pertains to the caller (_search is concerned with what happens in the search threadpool, _query might derive other heuristics).

I'm happy to remove it and wrap the calls in if/else if it's currently confusing ?

WDYT ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I would expect a flag like this only if it modifies the behavior of execution (i.e. orderWhisky(boolean withIce)) but not when it is intended to skip the purpose of the method call itself (i.e. orderWhisky(boolean damnINeedToDrive). I get the "encouragement to think" part but I would prefer to see that really nothing happens under certain circumstances on the caller site already. We have this being a no-op in stateful on top of this still.

// we successfully submitted the async task to the search pool so let's prewarm the shard
onlinePrewarmingService.prewarm(
shard,
executor instanceof ThreadPoolExecutor tpe
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably did some digging around why to do this type check here. Since its a skip condition, maybe worth explaining somewhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, I see we skip conditioning on the queue size which seems to be a specific thing for tpe. I was probably put of by this being an instanceof check instead of a simple cast. I mostly wondered if we cannot be sure that executor here is always a TPE or maybe we have other cases in tests etc...

onlinePrewarmingService.prewarm(
readerContext.indexShard(),
executor instanceof ThreadPoolExecutor tpe
&& ((tpe.getMaximumPoolSize() * prewarmingMaxPoolFactorThreshold) < tpe.getQueue().size())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since the skip condition seems to be the same everywhere, maybe a good candidate for a short helper function? That would also maybe be a good spot for additional short docs on the condition.

// the system has a chance to catch up and prewarming doesn't take over the network bandwidth
public static final Setting<Integer> PREWARMING_THRESHOLD_THREADPOOL_SIZE_FACTOR_POOL_SIZE = Setting.intSetting(
"search.online_prewarming_threshold_poolsize_factor",
10,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason for this default that would be worth putting in the docs? Or is this something we are expecting to experiment with?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ we'll see what we end up with once we experiment and monitor a bit

@andreidan andreidan requested a review from cbuescher April 11, 2025 09:42
Copy link
Contributor

@drempapis drempapis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you Andrei!

Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM

@andreidan andreidan merged commit fa09255 into elastic:main Apr 11, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>non-issue :Search Foundations/Search Catch all for Search Foundations Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants