Skip to content

Support partial results in CCS in ES|QL #122708

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Feb 20, 2025
Merged

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented Feb 16, 2025

A follow-up to #121942 that adds support for partial results in CCS in ES|QL.

Relates #121942

@dnhatn dnhatn force-pushed the partial-results-ccs branch 4 times, most recently from 2c5e548 to ba177e9 Compare February 17, 2025 07:41
@dnhatn dnhatn changed the title WIP Support partial results in CCS in ES|QL Feb 17, 2025
@dnhatn dnhatn force-pushed the partial-results-ccs branch 2 times, most recently from 60e6cb6 to 5420891 Compare February 17, 2025 16:34
@elasticsearchmachine
Copy link
Collaborator

Hi @dnhatn, I've created a changelog YAML for you.

@dnhatn dnhatn mentioned this pull request Feb 17, 2025
10 tasks
@dnhatn dnhatn marked this pull request as ready for review February 17, 2025 20:22
@dnhatn dnhatn requested review from smalyshev and quux00 February 17, 2025 20:22
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Feb 17, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@dnhatn dnhatn requested a review from idegtiarenko February 17, 2025 20:22
Copy link
Contributor

@quux00 quux00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, Nhat. Really nice work. Left nits and questions.

@dnhatn dnhatn requested a review from quux00 February 18, 2025 18:01
executionInfo.isSkipUnavailable(clusterAlias) == false,
() -> {},
failFast,
pagesFetched::incrementAndGet,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we're only using pagesFetched as a boolean flag, and in fact only every use onPageFetched to essentially trigger this flag. Is deeper usage planned in the future, or if not maybe it can be simplified somehow?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a boolean is good enough. I pushed 9283823. Thanks!

@dnhatn dnhatn requested a review from smalyshev February 18, 2025 20:09
@@ -211,7 +214,8 @@ public void execute(
computeListener.acquireCompute().delegateFailure((l, profiles) -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, here I am not 100% sure of the intended logic - for a remote, it looks like if configuration.allowPartialResults() is true and any result is delivered, any error is converted to partial. Is that also the case for local results? If yes, I don't see what checks it in the code. If no, then it looks like local and remote behaves differently with regard to partial failure logic?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the search_shards can fail when a node shuts down, and I've pushed d3be47f to handle this case. We also need to handle partial results for the final pipeline, but that's for a follow-up.

@dnhatn dnhatn requested a review from smalyshev February 18, 2025 22:35
@dnhatn dnhatn added the auto-backport Automatically create backport pull requests when merged label Feb 19, 2025
assertThat(id, is(in(allIds)));
}
if (request.includeCCSMetadata()) {
EsqlExecutionInfo.Cluster localInfo = resp.getExecutionInfo().getCluster(LOCAL_CLUSTER);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably should be a common method, but we can refactor it in a later patch.

assertThat(localInfo.getStatus(), equalTo(EsqlExecutionInfo.Cluster.Status.SUCCESSFUL));

EsqlExecutionInfo.Cluster remoteInfo = resp.getExecutionInfo().getCluster(REMOTE_CLUSTER_1);
assertThat(remoteInfo.getStatus(), equalTo(EsqlExecutionInfo.Cluster.Status.PARTIAL));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: if we fail to open an exchange, we couldn't have received any data from the remote. Doesn't it mean it should be SKIPPED, not PARTIAL?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was deciding between FAILED and PARTIAL and chose PARTIAL for simplicity. We can revisit it later when writing the documentation for this.

}
}

public void testFailToStartRequestOnRemoteCluster() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to supercede testCloseSkipUnavailable in CrossClusterCancellationIT so we may not need that one anymore.

@dnhatn
Copy link
Member Author

dnhatn commented Feb 20, 2025

@smalyshev @quux00 Thanks for reviewing.

@dnhatn dnhatn merged commit 091ea9a into elastic:main Feb 20, 2025
16 of 17 checks passed
@dnhatn dnhatn deleted the partial-results-ccs branch February 20, 2025 15:27
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.x

elasticsearchmachine pushed a commit that referenced this pull request Feb 20, 2025
A follow-up to #121942 that adds support for partial results in CCS in ES|QL.

Relates #121942
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.19.0 v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants