Skip to content

ESQL: Don't push down filters on the right hand side of an inlinejoin #127383

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

astefan
Copy link
Contributor

@astefan astefan commented Apr 25, 2025

Pushing down filters on the left hand side of an inlinestats commands could apparently be beneficial, but the source of the right hand side of the said inlinestats is the same as the left hand side one and this will inherently use the same filters, which is wrong.

For ex, FROM employees | INLINESTATS max_salary = MAX(salary) by languages | KEEP emp_no, languages, max_salary | WHERE emp_no == 10003 would consider emp_no == 10003 as eligible to be used like WHERE emp_no == 10003 | STATS MAX(salary) by languages which changes drastically the outcome of the stats command.

@elasticsearchmachine
Copy link
Collaborator

Hi @astefan, I've created a changelog YAML for you.

@astefan astefan marked this pull request as ready for review April 29, 2025 08:46
@astefan astefan requested a review from alex-spies April 29, 2025 08:46
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Apr 29, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@astefan
Copy link
Contributor Author

astefan commented Apr 29, 2025

@elasticmachine run elasticsearch-ci/part-3

1 similar comment
@astefan
Copy link
Contributor Author

astefan commented Apr 29, 2025

@elasticmachine run elasticsearch-ci/part-3

@astefan astefan added auto-backport Automatically create backport pull requests when merged v9.0.2 and removed v9.0.1 labels Apr 30, 2025
Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is simple enough, just disable the faulty optimization and be done with it. Thank you @astefan !

I left some thoughts on how we could re-enable this optimization. I believe this hinges on the StubRelation approach, which I think leads to issues that need to be thought about and already had prompted me to open #124752 .

Comment on lines +89 to +91
// TODO: could we do better here about pushing down filters for inlinestats?
// See also https://github.com/elastic/elasticsearch/issues/127497
// Push down past INLINESTATS if the condition is on the groupings
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we can do better, but the stub relation approach of InlineJoin is getting in the way.

The StubRelation approach also hinges on the subtle idea that the same name ids occuring on the left and right hand side of the join is okay, but I have doubts about that.

Since we're not physically re-using the left hand side output to compute the stats on the right, but we instead run 2 separate (but dependent) queries, I believe it may be simpler to just "copy" the left hand side in place of the stub; this would allow for independent optimization more easily and be more honest, although I haven't thought through all the implications of what I just said.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, StubRelation needs a rethinking especially related to Mapper. Thanks @alex-spies

@astefan astefan merged commit 4bb4d4b into elastic:main May 5, 2025
17 checks passed
@astefan astefan deleted the skip_filters_pushdown_on_inlinestats_right_side branch May 5, 2025 11:50
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
9.0 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 127383

astefan added a commit to astefan/elasticsearch that referenced this pull request May 5, 2025
…elastic#127383)

* Don't push down filters on the right hand side of an inlinejoin.

(cherry picked from commit 4bb4d4b)
elasticsearchmachine pushed a commit that referenced this pull request May 5, 2025
…#127383) (#127712)

* Don't push down filters on the right hand side of an inlinejoin.

(cherry picked from commit 4bb4d4b)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.0.2 v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants