ESQL: Specialize aggs AddInput for each block type #127582

ivancea · 2025-04-30T17:32:25Z

Context

While benchmarking the "skipping unused groups on aggs" WIP improvement, we found this profile:

Note the big itable access taking 30% of the time.

It came from the multiple IntBlock method calls we do when passing the groups to the aggregators. For example:

There are multiple groups.<method>() there, within the positions loop, and once for each multi-value too.

This PR

As a general improvement, and as a step towards the original improvement too, I'm replacing that itable access in that loop with a pattern matching switch on each IntBlock sub-type, and a custom call/logic for each. This change fully removed the itable access in the original feature profiling.

The bad part, is that there will be some repeated code. The good part, is that most of it is auto-generated, and it should improve performance. On the worst cases, it should be identical.

Microbenchmarks

AggregatorBenchmark ran on longs, with sum, count and count_distinct as different cases, and with all different filters cases as they are relevant too.

Left side: main (51959da)
Right side: This branch (ac35524)

Benchmark                    (blockType)        (filter)  (grouping)   (op)  Mode  Cnt   Score   Error  -> Score  Error Units
AggregatorBenchmark.run     vector_longs            none       longs  count  avgt    4   8.027   1.518     7.644  0.364 ns/op
AggregatorBenchmark.run     vector_longs   constant_true       longs  count  avgt    4  16.670   1.058    11.261  1.829 ns/op
AggregatorBenchmark.run     vector_longs        all_true       longs  count  avgt    4  27.077   3.382    14.829  2.199 ns/op
AggregatorBenchmark.run     vector_longs       half_true       longs  count  avgt    4  26.575   1.860    13.400  0.679 ns/op
AggregatorBenchmark.run     vector_longs  constant_false       longs  count  avgt    4  23.083   5.042     5.823  1.269 ns/op
AggregatorBenchmark.run  half_null_longs            none       longs  count  avgt    4  20.862   1.751    20.433  2.633 ns/op
AggregatorBenchmark.run  half_null_longs   constant_true       longs  count  avgt    4  48.153   5.325    19.425  0.896 ns/op
AggregatorBenchmark.run  half_null_longs        all_true       longs  count  avgt    4  54.481  14.401    27.827  3.408 ns/op
AggregatorBenchmark.run  half_null_longs       half_true       longs  count  avgt    4  56.866   1.582    30.780  0.310 ns/op
AggregatorBenchmark.run  half_null_longs  constant_false       longs  count  avgt    4  43.879   4.710    10.911  1.149 ns/op

Notes:

With no filters, there's no change, as this only affects the IntBlocks created by the HashBlocks, and only the filtered aggs create nulls there now
Filtered aggs, however, improved quite a lot, with times reduced by an average of 50% in those benchmarks

(cherry picked from commit a5855c1)

(cherry picked from commit 5176663)

(cherry picked from commit fb670bd)

elasticsearchmachine · 2025-04-30T17:32:49Z

Hi @ivancea, I've created a changelog YAML for you.

ivancea · 2025-04-30T17:35:58Z

.../compute/src/main/java/org/elasticsearch/compute/aggregation/GroupingAggregatorFunction.java

+        default void add(int positionOffset, IntBlock groupIds) {
+            switch (groupIds) {
+                case ConstantNullBlock ignored:
+                    // No-op
+                    break;
+                case IntVectorBlock b:
+                    add(positionOffset, b.asVector());
+                    break;
+                case IntArrayBlock b:
+                    add(positionOffset, b);
+                    break;
+                case IntBigArrayBlock b:
+                    add(positionOffset, b);
+                    break;
+            }
+        }


This class and this method specifically are the core of the PR.

For this method, I would have liked to make it final, as to avoid overrides disabling the enhancement. But it's an interface, and making it a class is probably (maybe?) not worth it (Insert more question marks here)

As for the implementation, luckily, I could simply ignore the ConstantNullBlock impl, and delegate the IntVectorBlock one to the add(IntVector) existing method. So it looks better than I expected

You could make it static maybe, but that'd involve changing a lot of call sites. I'd probably change it for a follow up you can call "100% mechanical"

elasticsearchmachine · 2025-04-30T17:39:46Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

nik9000

Looks good to me. I know we have coverage with the testAllFiltered, testNoneFiltered, testSomeFiltered. Could you verify that we have coverage with the big array blocks? I imagine a world where you typo something in that branch and we don't catch it until we do some funky transform that lands us in the BigArray branch. Maybe doctor testSomeFiltered to land us in that branch.

One thing I was curious about is - is it worth having an optimal path for where there's just nulls and no multivalued inputs? That's going to be fairly common. I don't think it's a thing for now, but it might be a thing for later.

nik9000 · 2025-05-01T12:45:36Z

.../compute/src/main/java/org/elasticsearch/compute/aggregation/GroupingAggregatorFunction.java

+        default void add(int positionOffset, IntBlock groupIds) {
+            switch (groupIds) {
+                case ConstantNullBlock ignored:
+                    // No-op
+                    break;
+                case IntVectorBlock b:
+                    add(positionOffset, b.asVector());
+                    break;
+                case IntArrayBlock b:
+                    add(positionOffset, b);
+                    break;
+                case IntBigArrayBlock b:
+                    add(positionOffset, b);
+                    break;
+            }
+        }


You could make it static maybe, but that'd involve changing a lot of call sites. I'd probably change it for a follow up you can call "100% mechanical"

nik9000 · 2025-05-01T12:47:30Z

...n/esql/compute/src/main/java/org/elasticsearch/compute/operator/HashAggregationOperator.java

-                        }
-                        end();
+                public void add(int positionOffset, IntArrayBlock groupIds) {
+                    startAggEndHash();


nik9000 · 2025-05-01T12:54:41Z

I'm a bit surprised we see any performance difference on constant_true and constant_false. We should probably detect that case and short circuit out. I thought I did that. But maybe I didn't.

ivancea · 2025-05-05T10:44:23Z

I'm a bit surprised we see any performance difference on constant_true and constant_false. We should probably detect that case and short circuit out.

@nik9000 Shortcircuit at which stage?

When the Filtered agg does its keepMask() on the groups (https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/aggregation/FilteredGroupingAggregatorFunction.java#L61), when constant_false, it creates a ConstantNullBlock. As it's short-circuited in the AddInput in this PR, it's nearly instant as no aggs are called at all.
However, a constant_true needs to call the aggs as if there was no filter. So it takes more time. It looks ok to me at least

nik9000 · 2025-05-05T17:47:51Z

Ok. Got it! It's fast - but we're still grouping which is why it's not fully instant.

ivancea added 4 commits April 30, 2025 14:52

Specialize block parameters on AddInput

ba2f5ea

(cherry picked from commit a5855c1)

Call the specific add() methods for eacj block type

ff01cde

(cherry picked from commit 5176663)

Implement custom add in HashAggregationOperator

fc54ebb

(cherry picked from commit fb670bd)

Migrated everything to the new add() calls

ac35524

ivancea added >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL v9.1.0 labels Apr 30, 2025

ivancea requested a review from nik9000 April 30, 2025 17:32

Update docs/changelog/127582.yaml

78cb4e0

github-actions bot deployed to docs-preview April 30, 2025 17:33 View deployment

ivancea commented Apr 30, 2025

View reviewed changes

Spotless format

4dd94b7

github-actions bot had a problem deploying to docs-preview April 30, 2025 17:38 Failure

Remove unused ClassName for IntVectorBlock

2aac546

ivancea marked this pull request as ready for review April 30, 2025 17:39

Merge branch 'main' into esql-aggs-hashblock-int-blocks-specialization

c266504

github-actions bot deployed to docs-preview April 30, 2025 17:39 View deployment

ivancea changed the title ~~ESQL: Specialize ags AddInput for each block type~~ ESQL: Specialize aggs AddInput for each block type Apr 30, 2025

Fixed tests

d95a81b

github-actions bot deployed to docs-preview April 30, 2025 18:05 View deployment

nik9000 approved these changes May 1, 2025

View reviewed changes

Randomize groupIds block types to check most AddInput cases

32588b5

github-actions bot deployed to docs-preview May 6, 2025 16:18 View deployment

Minor fix and added some docs

89a8bcf

github-actions bot deployed to docs-preview May 6, 2025 16:25 View deployment

Renamed BlockHashWrapper

2c95ced

github-actions bot deployed to docs-preview May 6, 2025 16:28 View deployment

Merge branch 'main' into esql-aggs-hashblock-int-blocks-specialization

3635e70

github-actions bot deployed to docs-preview May 6, 2025 16:42 View deployment

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ESQL: Specialize aggs AddInput for each block type #127582

ESQL: Specialize aggs AddInput for each block type #127582

ivancea commented Apr 30, 2025 •

edited

Loading

elasticsearchmachine commented Apr 30, 2025

ivancea Apr 30, 2025

nik9000 May 1, 2025

elasticsearchmachine commented Apr 30, 2025

nik9000 left a comment

nik9000 May 1, 2025

nik9000 May 1, 2025

nik9000 commented May 1, 2025

ivancea commented May 5, 2025

nik9000 commented May 5, 2025

ESQL: Specialize aggs AddInput for each block type #127582

Are you sure you want to change the base?

ESQL: Specialize aggs AddInput for each block type #127582

Conversation

ivancea commented Apr 30, 2025 • edited Loading

Context

This PR

Microbenchmarks

elasticsearchmachine commented Apr 30, 2025

ivancea Apr 30, 2025

Choose a reason for hiding this comment

nik9000 May 1, 2025

Choose a reason for hiding this comment

elasticsearchmachine commented Apr 30, 2025

nik9000 left a comment

Choose a reason for hiding this comment

nik9000 May 1, 2025

Choose a reason for hiding this comment

nik9000 May 1, 2025

Choose a reason for hiding this comment

nik9000 commented May 1, 2025

ivancea commented May 5, 2025

nik9000 commented May 5, 2025

ivancea commented Apr 30, 2025 •

edited

Loading