Skip to content

Make can_match code a little easier to reuse #126588

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

original-brownbear
Copy link
Member

@original-brownbear original-brownbear commented Apr 10, 2025

This is a step on the short way to removing can_match from the normal query_then_fetch execution path now that we have batched excution.

Step 1 to refactoring this with reuse in a per-datanode fashion for batched execution.
Some obvious cleanup essentially making this a utility, removing one weird indirection and reducing the use of the actual instance of CanMatchPreFilterSearchPhase (this also results in a real performance gain from moving work for sorting shards etc. off of the transport_workers and closer to where its result is used).

This should by relatively trivial to review and allows for a simple follow up that extracts the ability to run an individual round in isolation as well as running the coordinator rewrite phase separately.

Step 1 to refactoring this with reuse in a per-datanode fashion
for batched execution.
Some obvious cleanup essentially making this a utility, removing
one weird indirection and reducing the use of the actual instance
of `CanMatchPreFilterSearchPhase` (this also results in a real
performance gain from moving work for sorting shards etc. off of
the transport_workers and closer to where its result is used).

This should by relatively trivial to review and allows for a simple follow
up that extracts the ability to run an individual round in isolation
as well as running the coordinator rewrite phase separately.
@original-brownbear original-brownbear added >non-issue :Search Foundations/Search Catch all for Search Foundations labels Apr 10, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@elasticsearchmachine elasticsearchmachine added Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch v9.1.0 labels Apr 10, 2025
@drempapis
Copy link
Contributor

This is a step on the short way to removing can_match from the normal query_then_fetch execution path now that we have batched excution.

@original-brownbear, Could you please provide some context on why batched execution deprecates the need for can_match? I'll go forward with the review, but I'd also like to understand the motivation for such a change.

@original-brownbear
Copy link
Member Author

Sure thing :)

What can_match does is: send one request per node, asking which shards might have results for a query and which shards definitely don't have any results. Also, for some sort situations, it provides an ordering of shards (e.g. by timestamp) for the query phase (e.g. if ordering by timestamp, start with the shards with the most recent timestamps and run until you have enough results).
Also, can_match rules out some shards without even sending a request to a node from information in the cluster state.

Now even in the absence of batched execution, you could still start query a node right away once you know which shards on that node are interesting. The current situation where we wait for all nodes before starting the query phase is introducing unnecessary latency already. Taking it further though, batched execution where we send a single request per node for the query phase, simply allows us to run the can_match logic at the beginning of handling that request within the same network roundtrip. The ordering of shards per node still works and the filtering of what to touch does as well.

But in order to do that we need to extract the per node part from this code to make it reusable.
Also, the can_match logic that rules out some shards without a trip to the data node makes sense still (because it might allow us to rule out nodes altogether :)) so we want to extract that also.
=> my thinking was, lets just rewrite the current can_match code into a chain/tree of futures produced by static methods and then just use those two pieces of the logic (the coordinating node and the per-node piece) directly in the query phase transport handler :) The first step of making the logic a little more procedural here I found to provide a good starting point for that refactoring (and really it gets us like 1/3 of the way there already pretty much IMO).

Hope that helps + thanks for taking a look!

Copy link
Contributor

@drempapis drempapis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! The code update is straightforward and easy to follow. The refactoring is clear, and the clarification provided helped me understand the intent behind the changes. Thank you for the detailed explanation and for making the code more maintainable. I am looking forward for the second part of this work.

@original-brownbear
Copy link
Member Author

Thanks Dimitris!

@original-brownbear original-brownbear merged commit d39f472 into elastic:main Apr 25, 2025
17 checks passed
@original-brownbear original-brownbear deleted the reusable-can-match branch April 25, 2025 14:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>non-issue :Search Foundations/Search Catch all for Search Foundations Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants