Skip to content

Using competitive iterators in Filters agg is broken #126955

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
benwtrent opened this issue Apr 16, 2025 · 9 comments
Open

Using competitive iterators in Filters agg is broken #126955

benwtrent opened this issue Apr 16, 2025 · 9 comments
Labels
:Analytics/Aggregations Aggregations >bug :Search/Search Search-related issues that do not fall into other categories Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:Search Meta label for search team

Comments

@benwtrent
Copy link
Member

Elasticsearch Version

9.1.0

Installed Plugins

No response

Java Version

bundled

OS Version

any

Problem Description

When using a query & a filters aggregation, we may use a competitive iterator, with the Lucene 10.2 upgrade, this is now broken. This results in an indeterminable number of counts. The exact reasoning is still unknown

Steps to Reproduce

This has been replicated in a unit test and will be committed shortly.

Generally, the idea is you have terms that you are filtering on and doing a filters agg while also doing a term query.

Logs (if relevant)

No response

@benwtrent benwtrent added :Analytics/Aggregations Aggregations :Search/Search Search-related issues that do not fall into other categories >bug blocker stateful Marking issues only relevant for stateful releases labels Apr 16, 2025
@elasticsearchmachine elasticsearchmachine added Team:Search Meta label for search team Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Apr 16, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@benwtrent
Copy link
Member Author

I THINK it has to do with the "intoBitSet" stuff. If it does regular iteration, everything works well, Once it does intoBitSet the numbers get all "weird" and the iterators skip huge sections of validly matching docs.

@benwtrent
Copy link
Member Author

OK, I think I see the issue, I am not sure how to fix.

The DenseConjunctionBulkScorer#scoreWindowUsingBitSet utilizes the competitive iterator we return. It will drain the iterator into the bit set

    for(upTo = 1; upTo < iterators.size() && this.windowMatches.cardinality() >= threshold; ++upTo) {
      DocIdSetIterator other = (DocIdSetIterator)iterators.get(upTo);
      if (other.docID() < windowBase) {
        other.advance(windowBase);
      }

      other.intoBitSet(windowMax, this.clauseWindowMatches, windowBase);
      this.windowMatches.and(this.clauseWindowMatches);
      this.clauseWindowMatches.clear();
    }

, doing an and and then calling

collector.collect(new BitSetDocIdStream(this.windowMatches, windowBase));

So, the docId stream we then get in the leaf collector is the set of docIds where the iterators overlap. However, we need to know which doc actually matches which doc (e.g. topList). So, I don't know how to handle that.

It seems we need to handle this bitset iteration correctly (or bypass it somehow).

Class in question: https://github.com/apache/lucene/blob/branch_10_2/lucene/core/src/java/org/apache/lucene/search/DenseConjunctionBulkScorer.java

@ChrisHegarty
Copy link
Contributor

With the way that DenseConjunctionBulkScorer now eagerly drains chunks of iterators into a bitSet, we can no longer observe the advancement of the iterators as we collect. The observable side-effect of the advancement of the iterators is not really something that was ever guaranteed, but clearly worked prior to Lucene 10.2. I don't see any obvious or clear way to restructure things that will allow us to use a competitive iterator here.

@benwtrent
Copy link
Member Author

related: apache/lucene#14517

@jdcryans
Copy link

FYSA we also have #126939 tracking the stack trace that you shared in that lucene issue, @benwtrent

@jpountz
Copy link
Contributor

jpountz commented Apr 18, 2025

I can think of two fixes:

  1. Disabling competitive iterators on this aggregation by returning null. It was implemented mostly because we could, and it was nice to have good performance in the corner case when the union of filters still matches a small share of the dataset.
  2. Pulling different DocIdSetIterator instances for the competitive iterator and to compute the aggregation.

benwtrent added a commit that referenced this issue Apr 23, 2025
I forgot to bypass the competitive iteration for the single bucket case when previously

closes: #127262

related: #126956
related: #126955
@benwtrent benwtrent removed blocker stateful Marking issues only relevant for stateful releases labels May 1, 2025
@benwtrent
Copy link
Member Author

Removing the blocker tag. We have disabled competitive iterators in filters agg. While we want to be able to recover performance, it is currently no longer a blocking bug as behavior is correct now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/Aggregations Aggregations >bug :Search/Search Search-related issues that do not fall into other categories Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:Search Meta label for search team
Projects
None yet
Development

No branches or pull requests

5 participants