-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Add multi_match function #121525 #125062
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add multi_match function #121525 #125062
Conversation
TODO: add tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good! 💯
Things we need to check:
- QueryBuilder serialization / deserialization
- Testing:
- CSV tests, and include scoring test in
scoring.csv
- Add this function to the
ScoringIT
tests so scoring is checked - The usual suspects:
VerifierTests
,LocalPhysicalPlanOptimizerTests
,MultiMatchTests
(new),LogicalPlanOptimizerTests
)
- CSV tests, and include scoring test in
...sql/src/main/java/org/elasticsearch/xpack/esql/expression/function/EsqlFunctionRegistry.java
Outdated
Show resolved
Hide resolved
...c/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/FullTextWritables.java
Outdated
Show resolved
Hide resolved
...esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/MultiMatch.java
Outdated
Show resolved
Hide resolved
...esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/MultiMatch.java
Outdated
Show resolved
Hide resolved
...esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/MultiMatch.java
Show resolved
Hide resolved
...esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/MultiMatch.java
Outdated
Show resolved
Hide resolved
…expression/function/fulltext/MultiMatch.java Co-authored-by: Carlos Delgado <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A question on the handling of options, and some final testing that I'd like us to have. The rest LGTM 🙌
...ql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LocalPhysicalPlanOptimizerTests.java
Show resolved
Hide resolved
@@ -1656,6 +1657,33 @@ public void testQStrOptionsPushDown() { | |||
assertThat(expectedQStrQuery.toString(), is(planStr.get())); | |||
} | |||
|
|||
public void testMultiMatchOptionsPushDown() { | |||
String query = """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't see validation that the query retrieves options for zero_terms_query
or boost
- they should be easy to add and we get complete coverage that all options work as expected
x-pack/plugin/esql/qa/testFixtures/src/main/resources/multi-match-function.csv-spec
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/qa/testFixtures/src/main/resources/multi-match-function.csv-spec
Outdated
Show resolved
Hide resolved
required_capability: metadata_score | ||
|
||
from books metadata _score | ||
| where multi_match("Hobbit", description, title, {"type": "most_fields"}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work on combining the scores via most_fields
and best_fields
👏
x-pack/plugin/esql/qa/testFixtures/src/main/resources/multi-match-function.csv-spec
Show resolved
Hide resolved
...ugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/Match.java
Outdated
Show resolved
Hide resolved
...esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/MultiMatch.java
Show resolved
Hide resolved
} | ||
|
||
private MultiMatch(Source source, Expression query, List<Expression> fields, Expression options, QueryBuilder queryBuilder) { | ||
super(source, query, initChildren(query, fields, options), queryBuilder); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this would be better handled by just having a List fieldsAndOptions, instead of separating the two. This seems to be making us check for options in a variety of places and do some gymnastics for the constructor.
We could keep a single List and then check for the last element in the options() field to return that or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ioanatia any opinion on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason why I wanted to keep them separate was because of the way we use the @interfaces
like @MapParam
from the constructor arguments to generate the docs.
If all we had was a list of children, that might mean changing the way the docs are generated for functions?
I am not sure that would be less awkward - neither way seems ideal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docs generation comment is spot on. Let's keep it this way and iterate if needed.
…tch-function.csv-spec Co-authored-by: Carlos Delgado <[email protected]>
…optimizer/LocalPhysicalPlanOptimizerTests.java Co-authored-by: Carlos Delgado <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for iterating on this, @svilen-mihaylov-elastic !
I'd like to have the additional tests for phrase
and phrase-prefix
included, but this LGTM. Great work! 💯
Is there anything you need on top of this? 937963b |
@@ -1,6 +1,5 @@ | |||
% This is generated by ESQL's AbstractFunctionTestCase. Do no edit it. See ../README.md for how to regenerate it. | |||
|
|||
### DATE TRUNC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks like an unintended change - let's revert it
@@ -7378,6 +7379,24 @@ public void testFunctionNamedParamsAsFunctionArgument() { | |||
assertEquals(DataType.DOUBLE, ee.dataType()); | |||
} | |||
|
|||
public void testFunctionNamedParamsAsFunctionArgument1() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as a side note I don't see a difference between this test and the one we add to AnalyzerTests
.
I guess we just follow an existing pattern - but as a follow up, we should discuss if both are needed.
Implement multi_match function for ESQL. Its currently available on snapshot builds pending refinement of the syntax.
I notice that the examples/multi_match.md file contains an example that does not match the example in the csv-spec file. Presumably the csv-spec was changed later, without re-generating the examples. The consequence is that someone else, later, will regenerate examples and get a surprise multi-match change to their PR. |
tracked in #121525