-
Notifications
You must be signed in to change notification settings - Fork 25.2k
ESQL: Split grouping functions based on their EVAL-ability #126597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
8b91fbc
a35e7bc
fe92dca
8d0e6fa
1162f17
07777f8
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -22,17 +22,13 @@ | |
|
||
import static org.elasticsearch.xpack.esql.common.Failure.fail; | ||
|
||
public abstract class GroupingFunction extends Function implements EvaluatorMapper, PostAnalysisPlanVerificationAware { | ||
public abstract sealed class GroupingFunction extends Function implements PostAnalysisPlanVerificationAware permits | ||
GroupingFunction.NonEvaluatableGroupingFunction, GroupingFunction.EvaluatableGroupingFunction { | ||
|
||
protected GroupingFunction(Source source, List<Expression> fields) { | ||
super(source, fields); | ||
} | ||
|
||
@Override | ||
public Object fold(FoldContext ctx) { | ||
return EvaluatorMapper.super.fold(source(), ctx); | ||
} | ||
|
||
@Override | ||
public BiConsumer<LogicalPlan, Failures> postAnalysisPlanVerification() { | ||
return (p, failures) -> { | ||
|
@@ -45,4 +41,29 @@ public BiConsumer<LogicalPlan, Failures> postAnalysisPlanVerification() { | |
}; | ||
} | ||
|
||
/** | ||
* This is a class of grouping functions that cannot be evaluated outside the context of an aggregation. | ||
* They will have their evaluation implemented part of an aggregation, which may keep state for their execution, making them "stateful" | ||
* grouping functions. | ||
*/ | ||
public abstract static non-sealed class NonEvaluatableGroupingFunction extends GroupingFunction { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we add some javadocs here? Maybe mentioning that "it's (currently?) used for stateful grouping functions, which make them effectively non-evaluatable before the agg, so its value can't be used inside the aggs". Even if it has some high level examples, I think it will be useful for whoever has to reuse them in the future There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, added some docs. |
||
protected NonEvaluatableGroupingFunction(Source source, List<Expression> fields) { | ||
super(source, fields); | ||
} | ||
} | ||
|
||
/** | ||
* This is a class of grouping functions that can be evaluated independently within an EVAL operator, independent of the aggregation | ||
* they're used by. | ||
*/ | ||
public abstract static non-sealed class EvaluatableGroupingFunction extends GroupingFunction implements EvaluatorMapper { | ||
protected EvaluatableGroupingFunction(Source source, List<Expression> fields) { | ||
super(source, fields); | ||
} | ||
|
||
@Override | ||
public Object fold(FoldContext ctx) { | ||
return EvaluatorMapper.super.fold(source(), ctx); | ||
} | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,7 +16,7 @@ | |
import org.elasticsearch.xpack.esql.core.tree.Source; | ||
import org.elasticsearch.xpack.esql.core.util.Holder; | ||
import org.elasticsearch.xpack.esql.expression.function.aggregate.AggregateFunction; | ||
import org.elasticsearch.xpack.esql.expression.function.grouping.Categorize; | ||
import org.elasticsearch.xpack.esql.expression.function.grouping.GroupingFunction; | ||
import org.elasticsearch.xpack.esql.plan.logical.Aggregate; | ||
import org.elasticsearch.xpack.esql.plan.logical.Eval; | ||
import org.elasticsearch.xpack.esql.plan.logical.LogicalPlan; | ||
|
@@ -50,20 +50,20 @@ public ReplaceAggregateAggExpressionWithEval() { | |
|
||
@Override | ||
protected LogicalPlan rule(Aggregate aggregate) { | ||
// build alias map | ||
// an alias map for evaluatable grouping functions | ||
AttributeMap.Builder<Expression> aliasesBuilder = AttributeMap.builder(); | ||
aggregate.forEachExpressionUp(Alias.class, a -> aliasesBuilder.put(a.toAttribute(), a.child())); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This iteration can be combined with the one below, so much more as they do complementary things on each iteration. |
||
var aliases = aliasesBuilder.build(); | ||
|
||
// Build Categorize grouping functions map. | ||
// Functions like BUCKET() shouldn't reach this point, | ||
// as they are moved to an early EVAL by ReplaceAggregateNestedExpressionWithEval | ||
Map<Categorize, Attribute> groupingAttributes = new HashMap<>(); | ||
// a function map for non-evaluatable grouping functions | ||
Map<GroupingFunction.NonEvaluatableGroupingFunction, Attribute> nonEvalGroupingAttributes = new HashMap<>( | ||
aggregate.groupings().size() | ||
); | ||
aggregate.forEachExpressionUp(Alias.class, a -> { | ||
if (a.child() instanceof Categorize groupingFunction) { | ||
groupingAttributes.put(groupingFunction, a.toAttribute()); | ||
if (a.child() instanceof GroupingFunction.NonEvaluatableGroupingFunction groupingFunction) { | ||
nonEvalGroupingAttributes.put(groupingFunction, a.toAttribute()); | ||
} else { | ||
aliasesBuilder.put(a.toAttribute(), a.child()); | ||
} | ||
}); | ||
var aliases = aliasesBuilder.build(); | ||
|
||
// break down each aggregate into AggregateFunction and/or grouping key | ||
// preserve the projection at the end | ||
|
@@ -123,8 +123,11 @@ protected LogicalPlan rule(Aggregate aggregate) { | |
return alias.toAttribute(); | ||
}); | ||
|
||
// replace grouping functions with their references | ||
aggExpression = aggExpression.transformUp(Categorize.class, groupingAttributes::get); | ||
// replace non-evaluatable grouping functions with their references | ||
aggExpression = aggExpression.transformUp( | ||
GroupingFunction.NonEvaluatableGroupingFunction.class, | ||
nonEvalGroupingAttributes::get | ||
); | ||
|
||
Alias alias = as.replaceChild(aggExpression); | ||
newEvals.add(alias); | ||
|
@@ -152,7 +155,7 @@ protected LogicalPlan rule(Aggregate aggregate) { | |
return plan; | ||
} | ||
|
||
static String syntheticName(Expression expression, Expression af, int counter) { | ||
private static String syntheticName(Expression expression, Expression af, int counter) { | ||
return TemporaryNameUtils.temporaryName(expression, af, counter); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Should this add a
final foldable()=false
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, but not sure. We can't fold these functions through EvaluatorMapper / with EVAL's operator, but we might maybe with another agg operator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. With the "evaluated outside of an aggregation" docs you added, it's also clear what this means specifically.
I still feel like a foldable()=true could lead to some rule doing something nasty, as "foldable()" doesn't specify "inside an eval" in any way, it's just foldable or not. But I can't think of a grouping causing trouble here, so maybe when it appears, we can do whatever we must
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. The
foldable()
isfalse
by default so one would have to actively switch it totrue
, which will indeed become immediately problematic.