ESQL - Remove restrictions for disjunctions in full text functions #118544


carlosdelest
Member

Full text functions are currently limited: they cannot be used as part of disjunctions, because there is no reliable way to determine on the coordinator node whether an expression can be pushed down to Lucene.

To lift that restriction, we can ensure that whenever a full text function is used as part of a disjunction, all the elements in the disjunction are full text functions, so we know for sure that they can be pushed to Lucene.

This PR adds that check, along with tests for it.
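
For illustration, here is a minimal sketch of the rule described above. It is not the code from this PR: the `Expr` types, `containsFullText`, `onlyFullText`, and `isValidDisjunction` are hypothetical stand-ins for the real ES|QL expression classes, and the sketch assumes that AND, OR, and NOT are allowed as glue between full text functions inside a disjunction.

```java
// Hypothetical, simplified expression tree; these are NOT the real ES|QL classes.
final class DisjunctionCheckSketch {

    interface Expr {}
    record FullText(String text) implements Expr {}      // e.g. match(...) or the ":" operator
    record Other(String text) implements Expr {}         // any predicate that is not a full text function
    record And(Expr left, Expr right) implements Expr {}
    record Or(Expr left, Expr right) implements Expr {}
    record Not(Expr child) implements Expr {}

    // True if at least one node in the tree is a full text function.
    static boolean containsFullText(Expr e) {
        if (e instanceof FullText) return true;
        if (e instanceof And a) return containsFullText(a.left()) || containsFullText(a.right());
        if (e instanceof Or o) return containsFullText(o.left()) || containsFullText(o.right());
        if (e instanceof Not n) return containsFullText(n.child());
        return false;
    }

    // True if every leaf is a full text function (assumption: AND/OR/NOT may combine them).
    static boolean onlyFullText(Expr e) {
        if (e instanceof FullText) return true;
        if (e instanceof And a) return onlyFullText(a.left()) && onlyFullText(a.right());
        if (e instanceof Or o) return onlyFullText(o.left()) && onlyFullText(o.right());
        if (e instanceof Not n) return onlyFullText(n.child());
        return false;
    }

    // A disjunction is acceptable if it has no full text function at all,
    // or if it consists only of full text functions.
    static boolean isValidDisjunction(Or or) {
        return containsFullText(or) == false || onlyFullText(or);
    }

    public static void main(String[] args) {
        Expr match = new FullText("match(title, \"ring\")");
        Expr other = new Other("starts_with(title, \"The\")");
        System.out.println(isValidDisjunction(new Or(match, match)));
        System.out.println(isValidDisjunction(new Or(match, other)));
    }
}
```

Under these assumptions, a disjunction that mixes a full text function with any other predicate is rejected, while disjunctions made up entirely of full text functions, or containing none, are accepted.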

@carlosdelest carlosdelest added >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v8.18.0 auto-backport Automatically create backport pull requests when merged v9.0.0 and removed v9.0.0 labels Dec 12, 2024
@carlosdelest carlosdelest marked this pull request as ready for review December 12, 2024 11:24
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine elasticsearchmachine removed the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Dec 12, 2024
@elasticsearchmachine
Collaborator

Hi @carlosdelest, I've created a changelog YAML for you.

…junction-restrictions' into enhancement/esql-match-disjunction-restrictions
@@ -102,6 +102,64 @@ book_no:keyword | title:text
7140 |The Lord of the Rings Poster Collection: Six Paintings by Alan Lee (No. 1)
;


matchWithDisjunction
required_capability: match_function
Contributor

We might need to add a new esql capability if the bwc tests fail 🤔

Member Author

@carlosdelest carlosdelest Dec 12, 2024


I was too optimistic...

image

Contributor

I love this ! 🤟

Contributor

@ChrisHegarty ChrisHegarty left a comment


This is a great improvement. I left a few small comments.

Contributor

@tteofili tteofili left a comment


LGTM, great work Carlos!

@carlosdelest carlosdelest marked this pull request as ready for review December 16, 2024 06:45
Contributor

@ChrisHegarty ChrisHegarty left a comment


LGTM.

Just a final thing to consider (which could be just my misunderstanding, I didn't double check!). Are we ok when the filter has a top-level conjunction, followed by a disjunction within the right-side clause? e.g.

| WHERE content:"fox" AND ( match(foo, "xx") OR to_upper(content) == "FOX")

@carlosdelest
Member Author

Are we ok when the filter has a top-level conjunction, followed by a disjunction within the right-side clause? e.g.
| WHERE content:"fox" AND ( match(foo, "xx") OR to_upper(content) == "FOX")

@ChrisHegarty, that case is reported as an error and is checked by this test.
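
To make the nested case concrete, here is a hedged sketch of how a check could walk the whole condition and flag a mixed disjunction even when it sits under a top-level AND. The `Node` record, the `Kind` enum, and `invalidDisjunctions` are hypothetical names invented for this example; they are not the classes or the test referenced above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: find mixed disjunctions anywhere in a condition tree.
final class NestedDisjunctionSketch {

    enum Kind { FULL_TEXT, OTHER, AND, OR }

    record Node(Kind kind, String text, List<Node> children) {
        static Node leaf(Kind kind, String text) {
            return new Node(kind, text, List.of());
        }
        static Node and(Node l, Node r) {
            return new Node(Kind.AND, "(" + l.text() + " AND " + r.text() + ")", List.of(l, r));
        }
        static Node or(Node l, Node r) {
            return new Node(Kind.OR, "(" + l.text() + " OR " + r.text() + ")", List.of(l, r));
        }
    }

    // True if the node or any descendant is a full text function.
    static boolean anyFullText(Node n) {
        return n.kind() == Kind.FULL_TEXT || n.children().stream().anyMatch(NestedDisjunctionSketch::anyFullText);
    }

    // True if every leaf below (or at) this node is a full text function.
    static boolean onlyFullText(Node n) {
        if (n.children().isEmpty()) {
            return n.kind() == Kind.FULL_TEXT;
        }
        return n.children().stream().allMatch(NestedDisjunctionSketch::onlyFullText);
    }

    // Walk the tree top-down and collect the text of every OR that mixes
    // full text functions with other predicates.
    static List<String> invalidDisjunctions(Node root) {
        List<String> failures = new ArrayList<>();
        collect(root, failures);
        return failures;
    }

    private static void collect(Node n, List<String> failures) {
        if (n.kind() == Kind.OR && anyFullText(n) && onlyFullText(n) == false) {
            failures.add(n.text());
        }
        for (Node child : n.children()) {
            collect(child, failures);
        }
    }

    public static void main(String[] args) {
        Node condition = Node.and(
            Node.leaf(Kind.FULL_TEXT, "content:\"fox\""),
            Node.or(
                Node.leaf(Kind.FULL_TEXT, "match(foo, \"xx\")"),
                Node.leaf(Kind.OTHER, "to_upper(content) == \"FOX\"")
            )
        );
        System.out.println(invalidDisjunctions(condition));
    }
}
```

In this sketch the top-level AND is fine on its own; only the OR on its right-hand side is reported, because it combines a full text function with a non-full-text comparison.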

// Exit early if we already have failures
return;
}
Expression left = or.left();
Member

@fang-xing-esql fang-xing-esql Dec 16, 2024


Is my understanding right that the goal of checkFullTextSearchDisjunctions is this: if a FullTextFunction exists under an Or, all of the children of the Or have to be FullTextFunctions? If so, can the algorithm be simplified? Instead of checking onlyFullTextFunctionsInExpression, can we check nonFullTextFunctionExists on either side of the Or?

Member Author

Sorry, I'm not following - can you please elaborate with an example?

Member

I'm wondering whether we can simplify the logic here. Would something like this work? Do we need to check both sides separately?

            boolean hasFullText = or.anyMatch(FullTextFunction.class::isInstance);
            boolean hasOnlyFullText = onlyFullTextFunctionsInExpression(or);
            if (hasFullText) {
                if (hasOnlyFullText) {
                    // succeed
                } else {
                    // fail
                }
            } else {
                // succeed
            }

Member Author

Thanks for explaining, I see your point. I was trying to get the sub-expression that is at fault, so we can add that to the failure message. But maybe that's not super helpful and we can just error out with:

Invalid condition [first_name:"Anna" or starts_with(first_name, "Anne")]. Full text functions can be used in an OR condition, but only if just full text functions are used in the OR condition

instead of the previous:

Invalid condition [first_name:"Anna" or starts_with(first_name, "Anne")]. [:] operator can be used in an OR condition, but only if just full text functions are used in the OR condition

I have simplified this in 8901591. Let me know what you think; I can keep it or revert to the previous version.
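
As a rough illustration of the simplified approach discussed in this thread, the sketch below turns the two boolean checks into a single failure message in the generic format quoted above. The class name `DisjunctionFailureSketch` and the `check` method are hypothetical; the actual change in 8901591 may be structured differently.

```java
import java.util.Optional;

// Hypothetical sketch of the simplified decision and its generic failure message.
final class DisjunctionFailureSketch {

    // conditionText: source text of the OR condition
    // hasFullText: the OR contains at least one full text function
    // hasOnlyFullText: the OR contains nothing but full text functions
    static Optional<String> check(String conditionText, boolean hasFullText, boolean hasOnlyFullText) {
        if (hasFullText && hasOnlyFullText == false) {
            return Optional.of(
                "Invalid condition [" + conditionText + "]. Full text functions can be used in an OR condition, "
                    + "but only if just full text functions are used in the OR condition"
            );
        }
        // Either no full text function is involved, or the OR is made up of them entirely.
        return Optional.empty();
    }

    public static void main(String[] args) {
        check("first_name:\"Anna\" or starts_with(first_name, \"Anne\")", true, false)
            .ifPresent(System.out::println);
    }
}
```

The trade-off mentioned in the thread is visible here: the message names the whole OR condition rather than the specific sub-expression at fault, which keeps the check to one pass over the disjunction.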

Member

@fang-xing-esql fang-xing-esql left a comment


Thank you @carlosdelest, LGTM

@carlosdelest
Member Author

@elasticmachine run elasticsearch-ci/part-4

@carlosdelest carlosdelest merged commit 3f1fed0 into elastic:main Dec 18, 2024
16 checks passed
@elasticsearchmachine
Collaborator

💔 Backport failed

Status Branch Result
8.x Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 118544

@carlosdelest
Member Author

💚 All backports created successfully

Status Branch Result
8.x

Questions ?

Please refer to the Backport tool documentation

carlosdelest added a commit to carlosdelest/elasticsearch that referenced this pull request Dec 18, 2024
…lastic#118544)

(cherry picked from commit 3f1fed0)

# Conflicts:
#	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LocalPhysicalPlanOptimizerTests.java
elasticsearchmachine pushed a commit that referenced this pull request Dec 18, 2024
…118544) (#118918)

(cherry picked from commit 3f1fed0)

# Conflicts:
#	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LocalPhysicalPlanOptimizerTests.java
rjernst pushed a commit to rjernst/elasticsearch that referenced this pull request Dec 18, 2024
navarone-feekery pushed a commit to navarone-feekery/elasticsearch that referenced this pull request Dec 26, 2024
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged backport pending >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.18.0 v9.0.0

6 participants