Optimize filters aggregation with a single filter #99202

jpountz · 2023-09-05T16:49:43Z

Description

Follow-up of #98360: when there is a single filter, the collector could save the overhead of the priority queue (both in collect() and competitiveIterator()), which would likely result in a speedup.


elasticsearchmachine · 2023-09-05T16:50:06Z

Pinging @elastic/es-analytics-geo (Team:Analytics)

When FiltersAggregator has a single filter, there is no benefit in using a DisiPriorityQueue, as the heap will only contain values from a single iterator. In such a case, it's preferable to use the filtering approximation iterator directly as the competitive iterator. Fixes elastic#99202
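To make the point above concrete, here is a minimal sketch in plain Java. It does not use Lucene: `SortedDocs`, `advanceTo`, `advanceUnion` and `drain` are hypothetical stand-ins invented for this illustration, and `java.util.PriorityQueue` stands in for `DisiPriorityQueue`. It only shows that, with one filter, the heap-based path returns exactly the same doc IDs as calling the single iterator directly, while paying for a poll/add on every advance.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.IntUnaryOperator;

public class SingleFilterSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical stand-in for a filter's doc-ID iterator (not Lucene's API). */
    static final class SortedDocs {
        private final int[] docs;
        private int idx = -1;

        SortedDocs(int... docs) { this.docs = docs; }

        int docID() {
            return idx < 0 ? -1 : idx >= docs.length ? NO_MORE_DOCS : docs[idx];
        }

        /** Moves to the first doc >= target and returns it, or NO_MORE_DOCS. */
        int advanceTo(int target) {
            while (idx < docs.length && (idx < 0 || docs[idx] < target)) {
                idx++;
            }
            return docID();
        }
    }

    /** Heap-based union: every lagging iterator is polled, advanced and re-added. */
    static int advanceUnion(PriorityQueue<SortedDocs> queue, int target) {
        while (!queue.isEmpty() && queue.peek().docID() < target) {
            SortedDocs top = queue.poll();
            top.advanceTo(target);
            queue.add(top);
        }
        return queue.isEmpty() ? NO_MORE_DOCS : queue.peek().docID();
    }

    /** Repeatedly advances past the last returned doc and records each result. */
    static List<Integer> drain(IntUnaryOperator advance) {
        List<Integer> out = new ArrayList<>();
        for (int doc = advance.applyAsInt(0); doc != NO_MORE_DOCS; doc = advance.applyAsInt(doc + 1)) {
            out.add(doc);
        }
        return out;
    }

    public static void main(String[] args) {
        // Path 1: a heap holding a single iterator.
        PriorityQueue<SortedDocs> queue =
            new PriorityQueue<>(Comparator.comparingInt(SortedDocs::docID));
        queue.add(new SortedDocs(3, 8, 21));
        List<Integer> viaQueue = drain(target -> advanceUnion(queue, target));

        // Path 2: the same iterator used directly, with no heap maintenance.
        SortedDocs single = new SortedDocs(3, 8, 21);
        List<Integer> direct = drain(single::advanceTo);

        System.out.println(viaQueue.equals(direct)); // both are [3, 8, 21]
    }
}
```

With several filters the heap is what keeps the smallest current doc ID on top; with one filter that ordering is trivial, so the direct path avoids the bookkeeping.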

* Use a competitive iterator in FiltersAggregator. The iterator is used to combine filtering with querying in leaf collection. Its benefit is that ranges with docs that are filtered out by all filters are skipped from doc collection. The competitive iterator is restricted to FiltersAggregator, not used in FilterByFilterAggregator that's already optimized. It only applies to top-level filter aggregations with no "other" bucket defined; the latter leads to collecting all docs so there's no point in skipping doc ranges. Fixes #97544
* Fix function name.
* Advance iterator on two-phase mismatch.
* Restore docid tracking.
* Fix failing tests.
* Fix failing test.
* Fix more tests.
* Update docs/changelog/98360.yaml
* More test fixes.
* Update docs/changelog/98360.yaml
* Skip checking useCompetitiveIterator in collect
* Find approximate matches in CompetitiveIterator
* Use DisiPriorityQueue to simplify FiltersAggregator
* Skip competitive iterator when all docs match.
* Check for empty priority queue.
* Skip DisiPriorityQueue on single filter agg. When FiltersAggregator has a single filter, there is no benefit in using a DisiPriorityQueue, as the heap will only contain values from a single iterator. In such a case, it's preferable to use the filtering approximation iterator directly as the competitive iterator. Fixes #99202
* Update docs/changelog/99215.yaml
* Use FilterMatchingDisiWrapper in leaf collectors.
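As a rough illustration of why the competitive iterator only pays off without an "other" bucket, the toy model below walks a segment of `maxDoc` documents under a match-all query. Everything here is an assumption made for the sketch: `visit`, `firstAtOrAfter` and the array-based filter are hypothetical and unrelated to the actual FiltersAggregator code. Without an "other" bucket the loop jumps straight to the next filter match; with one, every doc has to be visited anyway.

```java
import java.util.ArrayList;
import java.util.List;

public class CompetitiveSkipSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Returns the first doc >= target in a sorted array, or NO_MORE_DOCS. */
    static int firstAtOrAfter(int[] sortedDocs, int target) {
        for (int doc : sortedDocs) {
            if (doc >= target) {
                return doc;
            }
        }
        return NO_MORE_DOCS;
    }

    /**
     * Returns the doc IDs a collector would visit in [0, maxDoc).
     * filterMatches holds the sorted docs matched by the (single) filter.
     */
    static List<Integer> visit(int maxDoc, int[] filterMatches, boolean hasOtherBucket) {
        List<Integer> visited = new ArrayList<>();
        int doc = 0;
        while (doc < maxDoc) {
            if (!hasOtherBucket) {
                // Competitive skip: jump over ranges that no filter matches.
                doc = firstAtOrAfter(filterMatches, doc);
                if (doc >= maxDoc) {
                    break;
                }
            }
            visited.add(doc);
            doc++;
        }
        return visited;
    }

    public static void main(String[] args) {
        int[] matches = {2, 7};
        System.out.println(visit(10, matches, false)); // [2, 7]
        System.out.println(visit(5, matches, true));   // [0, 1, 2, 3, 4]
    }
}
```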

jpountz added >enhancement :Analytics/Aggregations labels Sep 5, 2023

jpountz mentioned this issue Sep 5, 2023

Use a competitive iterator in FiltersAggregator #98360

Merged

elasticsearchmachine added the Team:Analytics label Sep 5, 2023

kkrik-es mentioned this issue Sep 6, 2023

Skip DisiPriorityQueue on single filter agg #99215

Merged

kkrik-es closed this as completed in #99215 Sep 6, 2023
