-
Notifications
You must be signed in to change notification settings - Fork 25.3k
Fix behavior for _index LIKE for ESQL #130019
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
a726c97
to
7dce974
Compare
@nik9000 This is the draft PR |
...r/multi-clusters/src/javaRestTest/java/org/elasticsearch/xpack/esql/ccq/MultiClustersIT.java
Outdated
Show resolved
Hide resolved
This reverts commit 7484410.
Hi @julian-elastic, I've updated the changelog YAML for you. |
server/src/main/java/org/elasticsearch/index/query/WildcardQueryBuilder.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/query/WildcardQueryBuilder.java
Outdated
Show resolved
Hide resolved
...ck/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/capabilities/TranslationAware.java
Show resolved
Hide resolved
@@ -148,7 +148,7 @@ public Query asQuery(LucenePushdownPredicates pushdownPredicates, TranslatorHand | |||
// TODO: Get the real FoldContext here | |||
var wildcardQuery = QueryParser.escape(BytesRefs.toString(prefix.fold(FoldContext.small()))) + "*"; | |||
|
|||
return new WildcardQuery(source(), fieldName, wildcardQuery); | |||
return new WildcardQuery(source(), fieldName, wildcardQuery, false, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be false, true
? So we force a string match in case folks are running START_WITH
against _index
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for ENDS_WITH
I suppose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made the change you requested to false, true. Then I tried adding UTs to cover the ENDS_WITH/BEGINS_WITH on index. It seems like ENDS_WITH/BEGINS_WITH is broken on an index, similar to LIKE before the change, as they all used WildcardQuery. It is fixed in some cases with my change, broken with others. If we want comprehensive fix, I can work on it, but it will delay the LIKE PR and it is getting pretty big already.
Do we want to stick with the current broken behavior and handle ENDS_WITH/BEGINS_WITH in a separate PR? I want to get this one checked in first if possible.
To make matters worse QueryParser.escape() in the line right above is the wrong escape sequence for our PR. If you call it on "remote-index" it will escape the dash and change the wildcardQuery to "remote\-index" which is wrong I think and will produce no data. QueryParser.escape() does regular expression escaping, not wildcard escaping. It seems the correct escaping is in WildcardFieldMapper.escapeWildcardSyntax(). But changing it here, might break more cases/cause different data if we CCS to an older server that uses the old WildcardQuery workflow with the new escaping. So it is not trivial to change the false to true. I think keeping it as false should preserve the old wrong behavior, so no regressions and buys us time to investigate properly and handle all the cases. I think we need to discuss...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We talked about this and will fork it into an open issue.
...plugin/esql/src/main/java/org/elasticsearch/xpack/esql/io/stream/ExpressionQueryBuilder.java
Outdated
Show resolved
Hide resolved
🔍 Preview links for changed docs |
537d56c
to
8540550
Compare
@@ -148,7 +148,7 @@ public Query asQuery(LucenePushdownPredicates pushdownPredicates, TranslatorHand | |||
// TODO: Get the real FoldContext here | |||
var wildcardQuery = QueryParser.escape(BytesRefs.toString(prefix.fold(FoldContext.small()))) + "*"; | |||
|
|||
return new WildcardQuery(source(), fieldName, wildcardQuery); | |||
return new WildcardQuery(source(), fieldName, wildcardQuery, false, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We talked about this and will fork it into an open issue.
Let's get this in. Once it's landed I can explain the backport process. It's even more |
12107af
to
8540550
Compare
Fixes
_index LIKE <pattern>
to always have normal text matching semantics.Implement a generic ExpressionQuery and ExpressionQueryBuilder that can be serialized to the data node. Then the ExpressionQueryBuilder can build an Automaton using TranslationAware.asLuceneQuery() and execute it in Lucine.
Introduces a breaking change for LIKE on _index fields. The old like behavior is not correct and does not have normal like semantics from ESQL. Customers upgrading from old build to new build might see a regression, where the data changes due to the like filters on clustering produces different results, but the new results are correct.
Behavior for ESQL
New CCS to New => New behavior everywhere
Old CCS to New => Old behavior everywhere (the isForESQL flag is not passed in from old)
New CCS to Old => New behavior for new, old behavior for old (the isForESQL cannot be passed, old does not know about it).
Old CCS to Old => Old behavior everywhere
Closes #129511