-
Notifications
You must be signed in to change notification settings - Fork 25.2k
ESQL: Specialize aggs AddInput for each block type #127582
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
ESQL: Specialize aggs AddInput for each block type #127582
Conversation
Hi @ivancea, I've created a changelog YAML for you. |
default void add(int positionOffset, IntBlock groupIds) { | ||
switch (groupIds) { | ||
case ConstantNullBlock ignored: | ||
// No-op | ||
break; | ||
case IntVectorBlock b: | ||
add(positionOffset, b.asVector()); | ||
break; | ||
case IntArrayBlock b: | ||
add(positionOffset, b); | ||
break; | ||
case IntBigArrayBlock b: | ||
add(positionOffset, b); | ||
break; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class and this method specifically are the core of the PR.
For this method, I would have liked to make it final
, as to avoid overrides disabling the enhancement. But it's an interface, and making it a class is probably (maybe?) not worth it (Insert more question marks here)
As for the implementation, luckily, I could simply ignore the ConstantNullBlock
impl, and delegate the IntVectorBlock
one to the add(IntVector)
existing method. So it looks better than I expected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could make it static
maybe, but that'd involve changing a lot of call sites. I'd probably change it for a follow up you can call "100% mechanical"
Pinging @elastic/es-analytical-engine (Team:Analytics) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. I know we have coverage with the testAllFiltered
, testNoneFiltered
, testSomeFiltered
. Could you verify that we have coverage with the big array blocks? I imagine a world where you typo something in that branch and we don't catch it until we do some funky transform that lands us in the BigArray branch. Maybe doctor testSomeFiltered
to land us in that branch.
One thing I was curious about is - is it worth having an optimal path for where there's just null
s and no multivalued inputs? That's going to be fairly common. I don't think it's a thing for now, but it might be a thing for later.
default void add(int positionOffset, IntBlock groupIds) { | ||
switch (groupIds) { | ||
case ConstantNullBlock ignored: | ||
// No-op | ||
break; | ||
case IntVectorBlock b: | ||
add(positionOffset, b.asVector()); | ||
break; | ||
case IntArrayBlock b: | ||
add(positionOffset, b); | ||
break; | ||
case IntBigArrayBlock b: | ||
add(positionOffset, b); | ||
break; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could make it static
maybe, but that'd involve changing a lot of call sites. I'd probably change it for a follow up you can call "100% mechanical"
} | ||
end(); | ||
public void add(int positionOffset, IntArrayBlock groupIds) { | ||
startAggEndHash(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
I'm a bit surprised we see any performance difference on |
@nik9000 Shortcircuit at which stage? When the Filtered agg does its |
Ok. Got it! It's fast - but we're still grouping which is why it's not fully instant. |
Context
While benchmarking the "skipping unused groups on aggs" WIP improvement, we found this profile:

Note the big
itable
access taking 30% of the time.It came from the multiple

IntBlock
method calls we do when passing the groups to the aggregators. For example:There are multiple
groups.<method>()
there, within the positions loop, and once for each multi-value too.This PR
As a general improvement, and as a step towards the original improvement too, I'm replacing that
itable
access in that loop with a pattern matching switch on eachIntBlock
sub-type, and a custom call/logic for each. This change fully removed theitable
access in the original feature profiling.The bad part, is that there will be some repeated code. The good part, is that most of it is auto-generated, and it should improve performance. On the worst cases, it should be identical.
Microbenchmarks
AggregatorBenchmark
ran on longs, with sum, count and count_distinct as different cases, and with all differentfilters
cases as they are relevant too.Notes: