Speedup equals #126394

idegtiarenko · 2025-04-07T11:39:51Z

We spend ~7-8% of time computing equals during tree traversal.
If my understanding is correct, we mostly return the same sub-tree/attributes. However, our equals checks are structured in a way that reaching the this == obj check requires multiple super.equals calls. Even if that check returns true, we still compare fields in sub-classes (this might get quite expensive with string equality checks).

This PR restructures equals to always check reference equality first, before calling super.
I believe this should become cheaper. We might also consider adding an equalsByFields method that skips the reference and class checks, and calling it from the sub-classes to avoid duplicating those checks in every equals.

(Screenshot omitted: equals calls are highlighted in pink.)
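To make the description concrete, here is a minimal sketch of the two shapes of equals being discussed. The class names (BaseSketch, SuperFirstAttr, IdentityFirstAttr) and the fieldComparisons counter are hypothetical and exist only for illustration; they are not the classes touched by this PR.

```java
import java.util.Objects;

// Hypothetical stand-in for a base class in the hierarchy.
class BaseSketch {
    final String name;

    BaseSketch(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null || getClass() != obj.getClass()) {
            return false;
        }
        return Objects.equals(name, ((BaseSketch) obj).name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name);
    }
}

// Old shape: super.equals is called first, so identical objects still reach the field comparison.
class SuperFirstAttr extends BaseSketch {
    static int fieldComparisons = 0; // instrumentation for this demo only
    final String nullability;

    SuperFirstAttr(String name, String nullability) {
        super(name);
        this.nullability = nullability;
    }

    @Override
    public boolean equals(Object obj) {
        if (super.equals(obj)) {
            fieldComparisons++;
            return Objects.equals(nullability, ((SuperFirstAttr) obj).nullability);
        }
        return false;
    }

    @Override
    public int hashCode() {
        return Objects.hash(super.hashCode(), nullability);
    }
}

// New shape: reference equality is checked before anything else.
class IdentityFirstAttr extends BaseSketch {
    static int fieldComparisons = 0; // instrumentation for this demo only
    final String nullability;

    IdentityFirstAttr(String name, String nullability) {
        super(name);
        this.nullability = nullability;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true; // short circuit: no super call, no field comparison
        }
        if (super.equals(obj) == false) {
            return false;
        }
        fieldComparisons++;
        return Objects.equals(nullability, ((IdentityFirstAttr) obj).nullability);
    }

    @Override
    public int hashCode() {
        return Objects.hash(super.hashCode(), nullability);
    }
}

class EqualsOrderDemo {
    public static void main(String[] args) {
        SuperFirstAttr a = new SuperFirstAttr("x", "TRUE");
        IdentityFirstAttr b = new IdentityFirstAttr("x", "TRUE");
        a.equals(a);
        b.equals(b);
        System.out.println("super-first field comparisons: " + SuperFirstAttr.fieldComparisons);
        System.out.println("identity-first field comparisons: " + IdentityFirstAttr.fieldComparisons);
    }
}
```

In this sketch, comparing an object with itself costs one field comparison in the super-first variant and none in the identity-first variant; with deeper hierarchies the super-first cost would grow with each level.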

elasticsearchmachine · 2025-04-07T11:40:15Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

alex-spies

I think this change can make sense.

But it increases complexity, adding risk to very fundamental methods of important classes; we had bugs from wrong equals implementations before, and they can be tricky to find.

Therefore, before merging it, I'd like to:

Know how much faster this makes us in some queries, and ideally why.
Add javadoc/comments so that future us know why the code is the way it is.
Check if we can factor duplicated code into helper methods without losing (a lot of) the performance gains.

alex-spies · 2025-04-08T12:14:15Z

...k/plugin/esql-core/src/main/java/org/elasticsearch/xpack/esql/core/expression/Attribute.java

-        if (super.equals(obj)) {
-            Attribute other = (Attribute) obj;
-            return Objects.equals(nullability, other.nullability);
+        if (this == obj) {


I don't fully understand this change.

The checks that were added to Attribute.equals are already performed in super.equals, and in the same order. Now we're performing these checks twice, and this equals method has become confusing to read.

How does this contribute to a speedup? Is there lower cost from megamorphic calls with this change? (If needed, this can be confirmed by running a small microbench with the async profiler and having it print the machine code for hot parts of the code.) How much faster do we get with this change? I.e., is it worth it to make our equals methods more complicated? Unless the speedup is noticeable, making equals methods complex is risky and we had bugs in those before.

If the megamorphic calls are causing slowdowns and this makes it noticeably better, I think we should:

a) Add javadoc to NamedExpression.java warning everyone that they shouldn't just call super.equals first because of the performance cost of megamorphic calls, and that they should rather perform all cheap equality checks before that, and

b) add a method NamedExpression.equalsByFields and call it from here to avoid redundancy, like you suggested.

The latter is also important to make the intent of this code clear - currently, it appears like the code is redundant due to technical debt. (Which is a sane assumption, given that we inherited this code initially from the ql package and many redundancies actually just haven't been cleaned up, yet.)

nik9000 · 2025-04-08

"Is there lower cost from megamorphic calls with this change?"

I believe the rule for whether you pay the megamorphic cost is "just": did you invokevirtual on a method with more than two overrides loaded by the JVM? I'll bet it's more complex than that, but that's a simple rule you can hold in your head.

    UnsupportedAttribute a, b; a.equals(b); NO - concrete subclass
    FieldAttribute a, b; a.equals(b); NO - only one subclass - it's pretty fast
    Attribute a, b; a.equals(b); YES - many subclasses

For what it's worth, I think we might have more luck with the kind of lame way of writing equals:

    Node {
        final boolean equals(Object other) {
            if (this == other) { return true; }
            if (false == getClass().equals(other.getClass())) { return false; }
            return nodeEquals(other) && children.equals(other.children);
        }
        abstract boolean nodeEquals(Object other);
    }
    Expression {
        @Override
        final boolean nodeEquals(Object other) {
            return whatever && expressionEquals(other);
        }
        abstract boolean expressionEquals(Object other);
    }

This would force the == and getClass paths super early. No dynamic invocations to get to them because the root method is final. OTOH, you'd end up with a chain of virtual invocations. On the other other hand it's easier to maintain the list of comparisons here.

On the other other other other other other hand - we could get a lot of benefit from doing the == comparison in the transform methods and skipping .equals if they are ==.

I dunno. It'd take some fiddling.
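A compilable version of the "lame" layout above might look roughly like the following. This is only a sketch under assumptions: SketchNode, SketchExpression, SketchLiteral, the dataType field, and the nodeHash/expressionHash hooks are hypothetical names invented here, and a null check is added that the pseudocode does not have.

```java
import java.util.List;
import java.util.Objects;

// Hypothetical root of the tree: equals is final, so the cheap checks always run first.
abstract class SketchNode {
    final List<SketchNode> children;

    SketchNode(List<SketchNode> children) {
        this.children = List.copyOf(children);
    }

    @Override
    public final boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (other == null || getClass() != other.getClass()) {
            return false;
        }
        SketchNode that = (SketchNode) other;
        return nodeEquals(that) && children.equals(that.children);
    }

    @Override
    public final int hashCode() {
        return Objects.hash(nodeHash(), children);
    }

    // Subclasses compare only their own fields; identity and class were already checked.
    protected abstract boolean nodeEquals(SketchNode other);

    protected abstract int nodeHash();
}

// Hypothetical intermediate level: adds its own field, then delegates one level further down.
abstract class SketchExpression extends SketchNode {
    final String dataType;

    SketchExpression(String dataType, List<SketchNode> children) {
        super(children);
        this.dataType = dataType;
    }

    @Override
    protected final boolean nodeEquals(SketchNode other) {
        SketchExpression that = (SketchExpression) other; // safe: classes were compared in equals
        return dataType.equals(that.dataType) && expressionEquals(that);
    }

    @Override
    protected final int nodeHash() {
        return Objects.hash(dataType, expressionHash());
    }

    protected abstract boolean expressionEquals(SketchExpression other);

    protected abstract int expressionHash();
}

// Hypothetical leaf class.
final class SketchLiteral extends SketchExpression {
    final Object value;

    SketchLiteral(String dataType, Object value) {
        super(dataType, List.of());
        this.value = value;
    }

    @Override
    protected boolean expressionEquals(SketchExpression other) {
        return Objects.equals(value, ((SketchLiteral) other).value);
    }

    @Override
    protected int expressionHash() {
        return Objects.hashCode(value);
    }
}
```

The trade-off mentioned above is visible here: the identity and class checks run exactly once at the root, but each level adds one more virtual call (nodeEquals, then expressionEquals) before the leaf fields are compared.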

idegtiarenko · 2025-04-08

Discussed this offline, short summary:

"Is there lower cost from megamorphic calls with this change?"

Correct, I expect this part to become a bit cheaper.

The other part that contributes to the cost of equals is that even if this == obj is true and checked in the parent equals, with the current structure we still compare child object fields. This might get expensive when comparing strings or other non-primitive types.

alex-spies · 2025-04-08

Discussed offline with @idegtiarenko : it's potentially not even about megamorphic calls, but about the fact that calling super.equals(...) doesn't currently short circuit and immediately return true if the objects are identical.

However, identical objects is something we encounter all the time when optimizer rules try to make changes that don't apply to the current node, e.g. here.
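The earlier suggestion in this thread, doing the == comparison in the transform methods and skipping .equals when the objects are identical, could be sketched as below. This is not the actual traversal code of the project; CountingLeaf, TransformSketch, and transformChildren are hypothetical names, and the equalsCalls counter exists only to show when equals is reached.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical node type whose equals counts how often it is invoked.
final class CountingLeaf {
    static int equalsCalls = 0; // instrumentation for this demo only
    final String name;

    CountingLeaf(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        equalsCalls++;
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        return name.equals(((CountingLeaf) o).name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }
}

final class TransformSketch {
    // Applies the rule to every child and returns the original list instance when nothing changed.
    static List<CountingLeaf> transformChildren(List<CountingLeaf> children, UnaryOperator<CountingLeaf> rule) {
        List<CountingLeaf> result = null;
        for (int i = 0; i < children.size(); i++) {
            CountingLeaf child = children.get(i);
            CountingLeaf next = rule.apply(child);
            // Identity check first: equals is only reached when the rule returned a different object.
            boolean changed = next != child && next.equals(child) == false;
            if (changed && result == null) {
                result = new ArrayList<>(children.subList(0, i)); // copy the unchanged prefix lazily
            }
            if (result != null) {
                result.add(changed ? next : child);
            }
        }
        return result == null ? children : result;
    }
}
```

With a rule that does not apply to any child, this sketch never calls equals and hands back the same list, which is the common case described above for optimizer rules.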

alex-spies · 2025-04-08T12:18:34Z

.../esql-core/src/main/java/org/elasticsearch/xpack/esql/core/expression/MetadataAttribute.java

+        if (this == o) {
+            return true;
+        }
+        if (o == null || getClass() != o.getClass() || super.equals(o) == false) {


Could we encapsulate this repeated part (it's the same for all inheritors of NamedExpression) in a (static?) method, or would that kill any gains that we got from inlining this?

alex-spies · 2025-04-08T12:22:35Z

...sql/src/main/java/org/elasticsearch/xpack/esql/expression/function/UnsupportedAttribute.java

-        }
-        return false;
+    public int hashCode() {
+        return Objects.hash(super.hashCode(), message, hasCustomMessage);


Why did we flip the order of message and hasCustomMessage?

alex-spies

As discussed offline, the gist of this change makes a lot of sense, namely the short-circuiting of equals in case of identical objects; I think this improvement is obvious enough that it doesn't require a micro bench to demonstrate the effect, as long as we can comment this well, and refactor this a little to cut down on redundancy.

alex-spies · 2025-04-08T14:07:52Z

...k/plugin/esql-core/src/main/java/org/elasticsearch/xpack/esql/core/expression/Attribute.java

-        if (super.equals(obj)) {
-            Attribute other = (Attribute) obj;
-            return Objects.equals(nullability, other.nullability);
+        if (this == obj) {


Discussed offline with @idegtiarenko : it's potentially not even about megamorphic calls, but about the fact that calling super.equals(...) doesn't currently short circuit and immediately return true if the objects are identical.

However, identical objects is something we encounter all the time when optimizer rules try to make changes that don't apply to the current node, e.g. here.

alex-spies

This looks good now and makes an obvious perf improvement without sacrificing maintainability IMO. I like it, thanks @idegtiarenko !

There's a minor change in UnresolvedAttribute.equals that should be fixed before merging, but otherwise this LGTM!

alex-spies · 2025-04-08T14:29:52Z

...in/esql-core/src/main/java/org/elasticsearch/xpack/esql/core/expression/NamedExpression.java

@@ -61,15 +61,25 @@ public int hashCode() {
        return Objects.hash(super.hashCode(), name, synthetic);
    }

+    /**
+     * Polymorphic equality is a pain.


Maybe let's mention that we consider performance, and especially short circuiting in case of object equality, because our tree traversal methods usually try to return the same object when there is no change needed.

alex-spies · 2025-04-08T14:30:05Z

...in/esql-core/src/main/java/org/elasticsearch/xpack/esql/core/expression/NamedExpression.java

+    /**
+     * Polymorphic equality is a pain.
+     * This equals shortcuts `this == o` and type checks.
+     * Here equals is final to ensure we are not duplication those checks.


Suggested change

-     * Here equals is final to ensure we are not duplication those checks.
+     * Here equals is final to ensure we are not duplicating those checks.

...sql-core/src/main/java/org/elasticsearch/xpack/esql/core/expression/UnresolvedAttribute.java

...sql/src/main/java/org/elasticsearch/xpack/esql/expression/function/UnsupportedAttribute.java

(cherry picked from commit e95397c)

Speedup equals

9c008f1

idegtiarenko added >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL v9.1.0 labels Apr 7, 2025

idegtiarenko requested review from costin and alex-spies April 7, 2025 11:39

idegtiarenko added 2 commits April 8, 2025 08:12

Merge branch 'main' into speedup_equals

c1f5832

Merge branch 'main' into speedup_equals

1320be5

alex-spies requested changes Apr 8, 2025

alex-spies reviewed Apr 8, 2025

upd

6a4b18c

idegtiarenko requested review from alex-spies and nik9000 April 8, 2025 14:12

alex-spies approved these changes Apr 8, 2025

idegtiarenko added 2 commits April 9, 2025 08:30

fix comments

39838b8

Merge branch 'main' into speedup_equals

5e5bc16

idegtiarenko merged commit e95397c into elastic:main Apr 9, 2025
17 checks passed

idegtiarenko deleted the speedup_equals branch April 9, 2025 07:36

idegtiarenko added a commit to idegtiarenko/elasticsearch that referenced this pull request Jun 5, 2025

Speedup equals (elastic#126394)

95c9d71

(cherry picked from commit e95397c)

idegtiarenko added a commit to idegtiarenko/elasticsearch that referenced this pull request Jun 5, 2025

Speedup equals (elastic#126394)

630d000

(cherry picked from commit e95397c)

idegtiarenko mentioned this pull request Jun 5, 2025

[8.19] Speedup equals #128949

Merged

idegtiarenko added the v6.8.19 label Jun 5, 2025

elasticsearchmachine pushed a commit that referenced this pull request Jun 5, 2025

Speedup equals (#126394) (#128949)

d7fb0e2

(cherry picked from commit e95397c)

idegtiarenko added v8.19.0 and removed v6.8.19 labels Jun 5, 2025
