-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Cancel expired async search task when a remote returns its results #126583
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cancel expired async search task when a remote returns its results #126583
Conversation
A while ago we enabled using ccs_minimize_roundtrips in async search. This makes it possible for users of async search to send a single search request per remote cluster, and minimize the impact of network latency. With non minimized roundtrips, we have pretty recurring cancellation checks: as part of the execution, we detect that a task expired whenever each shard comes back with its results. In a scenario where the coord node does not hold data, or only remote data is targeted by an async search, we have much less chance of detecting cancellation if roundtrips are minimized. The local coordinator would do nothing other than waiting for the minimized results from each remote cluster. One scenario where we can check for cancellation is when each cluster comes back with its full set of results. This commit adds such check, plus some testing for async search cancellation with minimized roundtrips.
@@ -519,6 +519,7 @@ public void onFinalReduce(List<SearchShard> shards, TotalHits totalHits, Interna | |||
*/ | |||
@Override | |||
public void onClusterResponseMinimizeRoundtrips(String clusterAlias, SearchResponse clusterResponse) { | |||
checkCancellation(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the proposed change, all the rest in this PR is testing :)
Hi @javanna, I've created a changelog YAML for you. |
Pinging @elastic/es-search-foundations (Team:Search Foundations) |
latch = remoteQueryLatch.get(); | ||
} else if (searchContext.indexShard().shardId().getIndexName().equals("local")) { | ||
startedLocalLatch.get().countDown(); | ||
latch = localQueryLatch.get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By introducing a distinction between local and remote here, I can block local or remote query phase selectively.
assertNotNull(e.getCause()); | ||
Throwable t = ExceptionsHelper.unwrap(e, ResourceNotFoundException.class); | ||
assertNotNull("after deletion, getAsyncSearch should throw an Exception with ResourceNotFoundException in the causal chain", t); | ||
expectThrows(ResourceNotFoundException.class, () -> getAsyncSearch(response.getId())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
switching from get
to actionGet
simplifies some of the expect throws that I added, plus this one that we already had.
I opened #126833 for a longer term fix of this issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Nice job rewriting the test framework for this!
💔 Backport failed
You can use sqren/backport to manually backport by running |
…lastic#126583) A while ago we enabled using ccs_minimize_roundtrips in async search. This makes it possible for users of async search to send a single search request per remote cluster, and minimize the impact of network latency. With non minimized roundtrips, we have pretty recurring cancellation checks: as part of the execution, we detect that a task expired whenever each shard comes back with its results. In a scenario where the coord node does not hold data, or only remote data is targeted by an async search, we have much less chance of detecting cancellation if roundtrips are minimized. The local coordinator would do nothing other than waiting for the minimized results from each remote cluster. One scenario where we can check for cancellation is when each cluster comes back with its full set of results. This commit adds such check, plus some testing for async search cancellation with minimized roundtrips.
…lastic#126583) A while ago we enabled using ccs_minimize_roundtrips in async search. This makes it possible for users of async search to send a single search request per remote cluster, and minimize the impact of network latency. With non minimized roundtrips, we have pretty recurring cancellation checks: as part of the execution, we detect that a task expired whenever each shard comes back with its results. In a scenario where the coord node does not hold data, or only remote data is targeted by an async search, we have much less chance of detecting cancellation if roundtrips are minimized. The local coordinator would do nothing other than waiting for the minimized results from each remote cluster. One scenario where we can check for cancellation is when each cluster comes back with its full set of results. This commit adds such check, plus some testing for async search cancellation with minimized roundtrips.
…126583) (#126915) A while ago we enabled using ccs_minimize_roundtrips in async search. This makes it possible for users of async search to send a single search request per remote cluster, and minimize the impact of network latency. With non minimized roundtrips, we have pretty recurring cancellation checks: as part of the execution, we detect that a task expired whenever each shard comes back with its results. In a scenario where the coord node does not hold data, or only remote data is targeted by an async search, we have much less chance of detecting cancellation if roundtrips are minimized. The local coordinator would do nothing other than waiting for the minimized results from each remote cluster. One scenario where we can check for cancellation is when each cluster comes back with its full set of results. This commit adds such check, plus some testing for async search cancellation with minimized roundtrips.
…lastic#126583) A while ago we enabled using ccs_minimize_roundtrips in async search. This makes it possible for users of async search to send a single search request per remote cluster, and minimize the impact of network latency. With non minimized roundtrips, we have pretty recurring cancellation checks: as part of the execution, we detect that a task expired whenever each shard comes back with its results. In a scenario where the coord node does not hold data, or only remote data is targeted by an async search, we have much less chance of detecting cancellation if roundtrips are minimized. The local coordinator would do nothing other than waiting for the minimized results from each remote cluster. One scenario where we can check for cancellation is when each cluster comes back with its full set of results. This commit adds such check, plus some testing for async search cancellation with minimized roundtrips.
…126583) (#126931) A while ago we enabled using ccs_minimize_roundtrips in async search. This makes it possible for users of async search to send a single search request per remote cluster, and minimize the impact of network latency. With non minimized roundtrips, we have pretty recurring cancellation checks: as part of the execution, we detect that a task expired whenever each shard comes back with its results. In a scenario where the coord node does not hold data, or only remote data is targeted by an async search, we have much less chance of detecting cancellation if roundtrips are minimized. The local coordinator would do nothing other than waiting for the minimized results from each remote cluster. One scenario where we can check for cancellation is when each cluster comes back with its full set of results. This commit adds such check, plus some testing for async search cancellation with minimized roundtrips.
…lts (#126583) (#126917) * Cancel expired async search task when a remote returns its results (#126583) A while ago we enabled using ccs_minimize_roundtrips in async search. This makes it possible for users of async search to send a single search request per remote cluster, and minimize the impact of network latency. With non minimized roundtrips, we have pretty recurring cancellation checks: as part of the execution, we detect that a task expired whenever each shard comes back with its results. In a scenario where the coord node does not hold data, or only remote data is targeted by an async search, we have much less chance of detecting cancellation if roundtrips are minimized. The local coordinator would do nothing other than waiting for the minimized results from each remote cluster. One scenario where we can check for cancellation is when each cluster comes back with its full set of results. This commit adds such check, plus some testing for async search cancellation with minimized roundtrips. * compile error
…ults (#126583) (#126916) * Cancel expired async search task when a remote returns its results (#126583) A while ago we enabled using ccs_minimize_roundtrips in async search. This makes it possible for users of async search to send a single search request per remote cluster, and minimize the impact of network latency. With non minimized roundtrips, we have pretty recurring cancellation checks: as part of the execution, we detect that a task expired whenever each shard comes back with its results. In a scenario where the coord node does not hold data, or only remote data is targeted by an async search, we have much less chance of detecting cancellation if roundtrips are minimized. The local coordinator would do nothing other than waiting for the minimized results from each remote cluster. One scenario where we can check for cancellation is when each cluster comes back with its full set of results. This commit adds such check, plus some testing for async search cancellation with minimized roundtrips. * compile error
A while ago we enabled using ccs_minimize_roundtrips in async search. This makes it possible for users of async search to send a single search request per remote cluster, and minimize the impact of network latency.
With non minimized roundtrips, we have pretty recurring cancellation checks: as part of the execution, we detect that a task expired whenever each shard comes back with its results.
In a scenario where the coord node does not hold data, or only remote data is targeted by an async search, we have much less chance of detecting cancellation if roundtrips are minimized. The local coordinator would do nothing other than waiting for the minimized results from each remote cluster. One scenario where we can check for cancellation is when each cluster comes back with its full set of results. This commit adds such check, plus some testing for async search cancellation with minimized roundtrips.