Remove empty results before merging #126770
Conversation
We addressed the empty top docs issue with elastic#126385 specifically for scenarios where empty top docs don't go over the wire. Yet they may be serialized from the data node back to the coordinating node, in which case they will no longer be equal to Lucene#EMPTY_TOP_DOCS. This commit expands the existing filtering of empty top docs to also cover those that went through serialization. Closes elastic#126742
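For context, here is a minimal sketch of why an identity-based filter such as `td != Lucene.EMPTY_TOP_DOCS` stops matching once an empty result has been round-tripped over the wire. This is not code from this PR; the constant is a stand-in for `org.elasticsearch.common.lucene.Lucene#EMPTY_TOP_DOCS`.

```java
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.TotalHits;

// Illustrative only: a stand-in for Lucene#EMPTY_TOP_DOCS.
final class EmptyTopDocsIdentity {
    static final TopDocs EMPTY_TOP_DOCS =
        new TopDocs(new TotalHits(0, TotalHits.Relation.EQUAL_TO), new ScoreDoc[0]);

    public static void main(String[] args) {
        TopDocs local = EMPTY_TOP_DOCS;
        // roughly what deserialization produces: an equal-looking but distinct instance
        TopDocs overTheWire = new TopDocs(local.totalHits, local.scoreDocs);

        System.out.println(local == EMPTY_TOP_DOCS);       // true: caught by an identity filter
        System.out.println(overTheWire == EMPTY_TOP_DOCS); // false: slips past the filter
    }
}
```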
Pinging @elastic/es-search-foundations (Team:Search Foundations)
Hi @javanna, I've created a changelog YAML for you.
LGTM, thanks!
@@ -150,11 +150,12 @@ static TopDocs mergeTopDocs(List<TopDocs> results, int topN, int from) {
return topDocs;
} else if (topDocs instanceof TopFieldGroups firstTopDocs) {
final Sort sort = new Sort(firstTopDocs.fields);
final TopFieldGroups[] shardTopDocs = results.stream().filter(td -> td != Lucene.EMPTY_TOP_DOCS).toArray(TopFieldGroups[]::new);
assert results.stream().noneMatch(topDoc -> topDoc == Lucene.EMPTY_TOP_DOCS);
final TopFieldGroups[] shardTopDocs = results.toArray(TopFieldGroups[]::new);
@original-brownbear can you keep me honest here? The filtering broke the field collapsing tests, so I removed it entirely here (see TopFieldGroups#merge). We basically can't deal with empty top docs in there, so we can only assert that that scenario does not arise. Is that the case in practice?
I'm not entirely sure we can do that. Can't we deal with this case by simply returning `null` here like we did before, if the resulting `shardTopDocs` after filtering are an empty array? (I think there are cases where this can legitimately be empty, looking at org.elasticsearch.lucene.grouping.SinglePassGroupingCollector#getTopGroups for example?)
All of that said, I'm really starting to think I simply did a bit of a bad job here upstream. If we simply never pass the empty top docs into this method (like we used to without data node side batching), then we would return `null` out of the box and behavior would be unchanged. But for a quick fix, returning `null` should do the trick?
I tried the null handling one more time, and it seems to work this time. I had to add a bunch of odd null checks, but at least we can now distinguish between proper empty top docs and our own placeholder.
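A rough sketch of the distinction described here, assuming `null` is used purely as the coordinator-side placeholder; class and method names are illustrative, not the PR's actual code:

```java
import java.util.List;
import java.util.Objects;
import org.apache.lucene.search.TopDocs;

// Illustrative sketch: `null` means "nothing to merge yet", while a real TopDocs
// with zero hits coming from a shard still participates in the merge as usual.
final class PlaceholderAwareMerge {
    static TopDocs merge(List<TopDocs> partialResults, int topN) {
        List<TopDocs> topDocsList = partialResults.stream().filter(Objects::nonNull).toList();
        if (topDocsList.isEmpty()) {
            return null; // only placeholders were seen: keep propagating null
        }
        // an empty-but-real TopDocs is merged like any other shard result
        return TopDocs.merge(topN, topDocsList.toArray(new TopDocs[0]));
    }
}
```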
@@ -722,7 +721,7 @@ private static final class QueryPerNodeState {

private static final QueryPhaseResultConsumer.MergeResult EMPTY_PARTIAL_MERGE_RESULT = new QueryPhaseResultConsumer.MergeResult(
List.of(),
Lucene.EMPTY_TOP_DOCS,
null,
I am restoring using null, which is what we did before batched execution.
@@ -358,7 +364,7 @@ private MergeResult partialReduce(
}
}
// we have to merge here in the same way we collect on a shard
newTopDocs = topDocsList == null ? Lucene.EMPTY_TOP_DOCS : mergeTopDocs(topDocsList, topNSize, 0);
newTopDocs = topDocsList == null ? null : mergeTopDocs(topDocsList, topNSize, 0);
I am restoring using null, which is what we did before batched execution.
@@ -140,24 +141,26 @@ static SortedTopDocs sortDocs(
}

static TopDocs mergeTopDocs(List<TopDocs> results, int topN, int from) {
if (results.isEmpty()) {
List<TopDocs> topDocsList = results.stream().filter(Objects::nonNull).toList();
I am still unclear how `null` worked before without the filtering. Stuff breaks pretty badly if you let `null` go through here, but I don't understand how that did not happen before we introduced batched execution, given that we did allow `null`. I guess we were never calling mergeTopDocs against null results, while we do so now.
Exactly, not letting the `null`s get here was kinda my point initially. I think the way this PR does it now should be fine, let me take a look :)
We never added `null` to this list because we didn't even queue empty query shard responses for merging. This sort of changed now because suggest+profiling aren't (yet) batched, since we have no existing partial merge logic for those.
We can clean this up in a follow-up I had planned which leaves this logic on the transport thread (discussed in the previous PR on this); for now I think doing the filtering down here is fine :)
Sounds good, thanks. Hopefully this fixes the test failures once and for all at least.
LGTM unless the BwC tests complain :) thanks!
@@ -384,7 +384,9 @@ public static void writeTotalHits(StreamOutput out, TotalHits totalHits) throws
* by shard for sorting purposes.
*/
public static void writeTopDocsIncludingShardIndex(StreamOutput out, TopDocs topDocs) throws IOException {
if (topDocs instanceof TopFieldGroups topFieldGroups) {
if (topDocs == null) {
I wonder if we should add BwC for this? I guess there's no need if we merge to 8.19 right away, since those tests didn't break in the first place and therefore we know that this branch is never taken :)
I was assuming that we are good because we are behind a feature flag. There has been no rollout with the feature enabled, hence no backwards compatibility guarantees?
We test against the 8.19 snapshot (and, it being a snapshot, it has the flag flipped), and it seems like we did indeed fail the run against it?
Oh boy, I see. It seems like a bit of an artificial problem around testing that will go away once we backport the fix to 8.x, but I guess it may end up breaking the world in between. I wonder if it makes sense to disable batching temporarily before merging this on both main and 8.19. Otherwise we need a transport version indeed, which we cannot remove later?
I think you can, by the same logic, drop the other batched execution version constant once you're done backporting. That's actually fewer PRs, I think? You just add the constant here, then backport, then adjust the version usage in main and remove the existing batched exec version constant in one PR.
-> 3 steps vs. 2 turn-it-off PRs + 1 merge and then two more for flipping the feature on again.
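For readers following the backport discussion, the usual Elasticsearch pattern for gating a wire-format change like the null marker on a transport version looks roughly like the sketch below. The constant `NULLABLE_TOP_DOCS` and the `writeTopDocs` helper are hypothetical stand-ins, not what this PR actually adds.

```java
import java.io.IOException;
import org.apache.lucene.search.TopDocs;
import org.elasticsearch.TransportVersion;
import org.elasticsearch.common.io.stream.StreamOutput;

// Illustrative only: the version constant and helper are placeholders.
final class NullAwareTopDocsWriter {
    static final TransportVersion NULLABLE_TOP_DOCS = TransportVersion.current(); // placeholder value

    static void write(StreamOutput out, TopDocs topDocs) throws IOException {
        if (out.getTransportVersion().onOrAfter(NULLABLE_TOP_DOCS)) {
            out.writeBoolean(topDocs != null); // newer nodes understand an explicit "no results" marker
            if (topDocs != null) {
                writeTopDocs(out, topDocs);
            }
        } else {
            writeTopDocs(out, topDocs); // older nodes expect a non-null value
        }
    }

    private static void writeTopDocs(StreamOutput out, TopDocs topDocs) throws IOException {
        // stand-in for the existing TopDocs serialization
    }
}
```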
Will do so, thanks.
mergedTopDocs = TopDocs.merge(sort, from, topN, shardTopDocs);
} else {
final TopDocs[] shardTopDocs = results.toArray(new TopDocs[numShards]);
final TopDocs[] shardTopDocs = topDocsList.toArray(new TopDocs[numShards]);
Nit: here and in the other array creation spots, if we change the line anyway, let's make it a clean and faster zero-size (`new Type[0]`) initialization for all of them, like we had it before at some point? :)
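To spell out the nit, a small sketch of the zero-sized-array form of `toArray`; the class and method names are illustrative only:

```java
import java.util.List;
import org.apache.lucene.search.TopDocs;

final class ToArrayNit {
    static TopDocs[] toShardTopDocs(List<TopDocs> topDocsList) {
        // zero-sized array: the list allocates a correctly sized array itself,
        // which is the cleaner idiom (and typically no slower on modern JVMs)
        return topDocsList.toArray(new TopDocs[0]);
        // rather than: topDocsList.toArray(new TopDocs[numShards]);
    }
}
```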
@original-brownbear it seems like I have made it, finally. I am going to merge this.
💔 Backport failed
You can use sqren/backport to manually backport by running
This commit forward ports the transport version added with elastic#126770 in 8.x, and adjusts the corresponding version conditionals.