Skip to content

Add multi_match function #121525 #125062

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

svilen-mihaylov-elastic
Copy link
Contributor

@svilen-mihaylov-elastic svilen-mihaylov-elastic commented Mar 17, 2025

tracked in #121525

@svilen-mihaylov-elastic
Copy link
Contributor Author

TODO: add tests

Copy link
Member

@carlosdelest carlosdelest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good! 💯

Things we need to check:

  • QueryBuilder serialization / deserialization
  • Testing:
    • CSV tests, and include scoring test in scoring.csv
    • Add this function to the ScoringIT tests so scoring is checked
    • The usual suspects: VerifierTests, LocalPhysicalPlanOptimizerTests, MultiMatchTests (new), LogicalPlanOptimizerTests)

Copy link
Member

@carlosdelest carlosdelest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A question on the handling of options, and some final testing that I'd like us to have. The rest LGTM 🙌

@@ -1656,6 +1657,33 @@ public void testQStrOptionsPushDown() {
assertThat(expectedQStrQuery.toString(), is(planStr.get()));
}

public void testMultiMatchOptionsPushDown() {
String query = """
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't see validation that the query retrieves options for zero_terms_query or boost - they should be easy to add and we get complete coverage that all options work as expected

required_capability: metadata_score

from books metadata _score
| where multi_match("Hobbit", description, title, {"type": "most_fields"})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work on combining the scores via most_fields and best_fields 👏

}

private MultiMatch(Source source, Expression query, List<Expression> fields, Expression options, QueryBuilder queryBuilder) {
super(source, query, initChildren(query, fields, options), queryBuilder);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this would be better handled by just having a List fieldsAndOptions, instead of separating the two. This seems to be making us check for options in a variety of places and do some gymnastics for the constructor.

We could keep a single List and then check for the last element in the options() field to return that or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ioanatia any opinion on this?

Copy link
Contributor

@ioanatia ioanatia Apr 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why I wanted to keep them separate was because of the way we use the @interfaces like @MapParam from the constructor arguments to generate the docs.
If all we had was a list of children, that might mean changing the way the docs are generated for functions?
I am not sure that would be less awkward - neither way seems ideal.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs generation comment is spot on. Let's keep it this way and iterate if needed.

Copy link
Member

@carlosdelest carlosdelest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for iterating on this, @svilen-mihaylov-elastic !

I'd like to have the additional tests for phrase and phrase-prefix included, but this LGTM. Great work! 💯

@svilen-mihaylov-elastic
Copy link
Contributor Author

Thank you for iterating on this, @svilen-mihaylov-elastic !

I'd like to have the additional tests for phrase and phrase-prefix included, but this LGTM. Great work! 💯

Is there anything you need on top of this? 937963b

@@ -1,6 +1,5 @@
% This is generated by ESQL's AbstractFunctionTestCase. Do no edit it. See ../README.md for how to regenerate it.

### DATE TRUNC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like an unintended change - let's revert it

@@ -7378,6 +7379,24 @@ public void testFunctionNamedParamsAsFunctionArgument() {
assertEquals(DataType.DOUBLE, ee.dataType());
}

public void testFunctionNamedParamsAsFunctionArgument1() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as a side note I don't see a difference between this test and the one we add to AnalyzerTests.
I guess we just follow an existing pattern - but as a follow up, we should discuss if both are needed.

@svilen-mihaylov-elastic svilen-mihaylov-elastic merged commit 02f9af7 into elastic:main Apr 15, 2025
16 of 17 checks passed
afoucret pushed a commit to afoucret/elasticsearch that referenced this pull request Apr 16, 2025
Implement multi_match function for ESQL. Its currently available on snapshot builds pending refinement of the syntax.
@craigtaverner
Copy link
Contributor

I notice that the examples/multi_match.md file contains an example that does not match the example in the csv-spec file. Presumably the csv-spec was changed later, without re-generating the examples. The consequence is that someone else, later, will regenerate examples and get a surprise multi-match change to their PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>feature :Search Relevance/Search Catch all for Search Relevance Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants