Skip to content

ESQL: Prevent search functions work with a non-STANDARD index #130638

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jul 4, 2025

Conversation

bpintea
Copy link
Contributor

@bpintea bpintea commented Jul 4, 2025

This introduces verifications to prevent search functions work on fields introduced by a LOOKUP JOIN righthand-side index.

This should be a temporary fix until we can either push these filters down also on the righthand-side of a JOIN or have these functions execute within the engine.

Closes #130561
Closes #129778

This introduces verifications to prevent search functions work on fields
introduced by a LOOKUP JOIN righthand-side index.
@bpintea bpintea added >bug auto-backport Automatically create backport pull requests when merged :Analytics/ES|QL AKA ESQL v9.2.0 v9.1.1 v8.19.1 labels Jul 4, 2025
@bpintea bpintea changed the title ESQL: Prevent search functions work with a LOOKUP index ESQL: Prevent search functions work with a non-STANDARD index Jul 4, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @bpintea, I've created a changelog YAML for you.

@elasticsearchmachine
Copy link
Collaborator

Hi @bpintea, I've updated the changelog YAML for you.

@bpintea bpintea marked this pull request as ready for review July 4, 2025 15:02
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jul 4, 2025
@carlosdelest
Copy link
Member

have these functions execute within the engine.

@bpintea search functions can be executed within the compute engine. Is there something specific we should do on the LOOKUP JOIN side for the engine to use them?

Copy link
Member

@carlosdelest carlosdelest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM from the prevention side - Would like to know more about how we can enable these functions in LOOKUP JOIN 🙂

@bpintea
Copy link
Contributor Author

bpintea commented Jul 4, 2025

Is there something specific we should do on the LOOKUP JOIN side for the engine to use them?

@carlosdelest, it looks like it'll require dedicated work: from the quick look I had, the ShardContext that the evaluators use refer the lefthand-side of the JOIN index only. So the search functions queries end up as MatchNoDocsQuery. And that's why they do execute "fine", but never match.

PS. Since (some of?) these can be evaluated internally, should we enable the EVAL-uation on them? Or having them run on a computed field (ex. | EVAL foo = CONCAT(...) | WHERE MATCH(foo, ...))?

@carlosdelest
Copy link
Member

the ShardContext that the evaluators use refer the lefthand-side of the JOIN index only. So the search functions queries end up as MatchNoDocsQuery. And that's why they do execute "fine", but never match.

Got it, thanks! 👍

PS. Since (some of?) these can be evaluated internally, should we enable the EVAL-uation on them?

We could. There are a number of restrictions that we're looking to lift.

Or having them run on a computed field (ex. | EVAL foo = CONCAT(...) | WHERE MATCH(foo, ...))?

That's trickier - full text functions still depend on Lucene, even when being evaluated on the compute engine, so they depend on the field being indexed.

We could implement them on a computed field in case we created a Lucene in-memory index - something doable, but it hasn't been prioritized yet.

@bpintea
Copy link
Contributor Author

bpintea commented Jul 4, 2025

That's trickier - full text functions still depend on Lucene, even when being evaluated on the compute engine, so they depend on the field being indexed.

Right, right. I think EVAL'ing might be handier, this the computed-value idea can wait (also for broader consensus for such a need).

Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @bpintea ! Only minor comments from me, please consider them at your own discretion.

plan.forEachDown(p -> {
if (p instanceof EsRelation esRelation && esRelation.indexMode() != IndexMode.STANDARD) {
// Check if this EsRelation supplies the field
if (esRelation.output().stream().anyMatch(attr -> attr.id().equals(fieldAttribute.id()))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be more efficient to do esRelation.outputSet().contains(fieldAttribute)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would, ofc, thx!

};
}

public static void fieldVerifier(LogicalPlan plan, FullTextFunction function, Supplier<Expression> fieldSupplier, Failures failures) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: a more natural placement would maybe be inside FullTextFunction rather than Match, as other non-inheritors of Match also reach for this method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, was a lost intention, thanks.

};
}

public static void fieldVerifier(LogicalPlan plan, FullTextFunction function, Supplier<Expression> fieldSupplier, Failures failures) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: why is fieldSupplier a supplier, rather than using a simple Expression? Seems like we hand it () -> field every time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It no longer needs to, refactored.

@bpintea
Copy link
Contributor Author

bpintea commented Jul 4, 2025

Thanks Carlos, Alex!

@bpintea bpintea added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Jul 4, 2025
@elasticsearchmachine elasticsearchmachine merged commit 5c4d17d into elastic:main Jul 4, 2025
32 checks passed
@bpintea bpintea deleted the fix/ftf_lj branch July 4, 2025 18:45
bpintea added a commit to bpintea/elasticsearch that referenced this pull request Jul 4, 2025
…c#130638)

This introduces verifications to prevent search functions work on fields
introduced by a LOOKUP JOIN righthand-side index.

This should be a temporary fix until we can either push these filters
down also on the righthand-side of a JOIN or have these functions
execute within the engine.

Closes elastic#130561 Closes elastic#129778
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
9.1
8.19 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 130638

elasticsearchmachine pushed a commit that referenced this pull request Jul 7, 2025
…130638) (#130646)

* ESQL: Prevent search functions work with a non-STANDARD index (#130638)

This introduces verifications to prevent search functions work on fields
introduced by a LOOKUP JOIN righthand-side index.

This should be a temporary fix until we can either push these filters
down also on the righthand-side of a JOIN or have these functions
execute within the engine.

Closes #130561 Closes #129778

* adapt test to 9.1
elasticsearchmachine pushed a commit that referenced this pull request Jul 7, 2025
…130638) (#130704)

* ESQL: Prevent search functions work with a non-STANDARD index (#130638)

This introduces verifications to prevent search functions work on fields
introduced by a LOOKUP JOIN righthand-side index.

This should be a temporary fix until we can either push these filters
down also on the righthand-side of a JOIN or have these functions
execute within the engine.

Closes #130561 Closes #129778

(cherry picked from commit 5c4d17d)

* 8.19 test addaptation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) backport pending >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.19.1 v9.1.1 v9.2.0
Projects
None yet
4 participants