Skip to content

ES|QL: fix join masking eval #126614

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

luigidellaquila
Copy link
Contributor

Fixes: #126419

What happened here is the following:

in a query like

from languag* 
| eval type = null 
| rename language_name as message 
| lookup join message_types_lookup on message 
| rename type as message 
| lookup join message_types_lookup on message 
| keep `language.name`
  • type is initially set to null
  • the first LOOKUP overwrites it with a keyword
  • the second LOOKUP uses it as a join key

At pre-analysis time, we try to minimize the number of attributes we require from field_caps, and we try to infer these attributes from the query itself.
From this list of attributes, we remove the aliases created by EVAL (type in this case, assuming it's a constant null value).
What we did not take into consideration is that JOIN could overwrite that attribute (potentially with a different type), so we did lookup index resolution only asking for message.
As long as we didn't have a KEEP at the end, the query worked fine, just because there is a shortcut that prevents that optimization in case we need all the attributes.
Adding a KEEP at the end, hence removing type from the result, we got a mapping for the lookup index with only one attribute (message) and we completely missed the fact that type actually had to be overwritten.

The fix consists in not removing aliases from the attributes list when there is a subsequent JOIN, since we don't know if it will overwrite them yet.

@luigidellaquila luigidellaquila added >bug auto-backport Automatically create backport pull requests when merged :Analytics/ES|QL AKA ESQL v8.18.1 v8.19.0 v9.0.1 labels Apr 10, 2025
@elasticsearchmachine elasticsearchmachine added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.1.0 labels Apr 10, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Hi @luigidellaquila, I've created a changelog YAML for you.

Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't have time to think deeply about this, but at first glance, the fix looks like it should work.

@astefan
Copy link
Contributor

astefan commented Apr 11, 2025

I'll have a look as well.

| where x > 1
| keep emp_no, language_name
| limit 1""",
Set.of("emp_no", "emp_no.*", "languages", "languages.*", "language_name", "language_name.*", "x", "y", "x.*", "y.*")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And x appears here as well because x could come from enrich languages_policy on y?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly, we don't know the structure of the enrich index yet, after the ENRICH x could become something else (eg. a keyword) and the WHERE could be invalid

joinMaskingEval
required_capability: join_lookup_v12
required_capability: fix_join_masking_eval
from languag*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, add this test and others (be creative) to IndexResolverFieldNamesTests.

* eg. EVAL, GROK, DISSECT can override an alias, but we know it in advance, ie. we don't need to resolve indices to know.
*/
private static boolean couldOverrideAliases(LogicalPlan p) {
return (p instanceof Aggregate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you listing all of these that cannot define an additional "hidden" Attribute instead of checking those that can?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid that it breaks when we add new commands that behave like JOIN.
As a follow-up, IMHO we should define an interface for commands that allow this optimization (similar to SortAgnostic) and replace this long list with a single instanceof

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we add a command that doesn't behave like JOIN, then it could break the other way around. Imo, it makes much more sense to list those commands that are "special".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we add a command that doesn't behave like JOIN, then it could break the other way around

This is interesting, I'm not sure I have completely clear the implications of this aspect.

If I got it right, we use these field names to limit the scope of the field_caps; if we add a field that does not exist in any of the involved indices, will field_caps fail?

My understanding is that it will work fine, just ignoring the additional fields.

If it wasn't the case, then we would be in a catch-22: we couldn't know which fields to send before validating the query, and we couldn't validate the query before sending the fields to field_caps.

The fact that everything works fine makes me think that it's safe, but maybe I'm missing something.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about completion command (I see it implements GeneratingPlan), is this a command that can add a "hidden" attribute that can be overiden?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Completion is safe (the target attribute is defined at parsing time), but I have to double-check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I got it right, we use these field names to limit the scope of the field_caps; if we add a field that does not exist in any of the involved indices, will field_caps fail?
My understanding is that it will work fine, just ignoring the additional fields.

Yep, I see your point. Adding more fields is safer. Added less fields is problematic. Better more fields than less.

parsed.forEachDown(p -> {// go over each plan top-down
if (couldOverrideAliases(p)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would do this check closer to the actual logic that uses it, I find the code easier to reason about (there is a lot of code between this check here and where this check is actually useful, code that doesn't use the result of canRemoveAliases).

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some minor comments, LGTM otherwise.

public void testEnrichMaskingEvalOn() {
assertFieldNames("""
from employees
| eval langague_name = null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you really want to use here an alias name that will not be recalled further in the query? langague_name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the intention was to let the first ENRICH overwrite it, but there is a typo in the query indeed, the EVAL was supposed to be

| eval languages = length(languages)

(I need a number rather than a string).
Let me fix it.

| eval langague_name = null
| enrich languages_policy on languages
| rename language_name as languages
| eval languages = length(language_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here length(language_name) is not actually valid because language_name has been renamed. Was this intended like this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly this one, see my comment above


public void testEnrichAndJoinMaskingEvalWh() {
assertFieldNames("""
from employees
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same strange "typos" here as well.

* eg. EVAL, GROK, DISSECT can override an alias, but we know it in advance, ie. we don't need to resolve indices to know.
*/
private static boolean couldOverrideAliases(LogicalPlan p) {
return (p instanceof Aggregate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I got it right, we use these field names to limit the scope of the field_caps; if we add a field that does not exist in any of the involved indices, will field_caps fail?
My understanding is that it will work fine, just ignoring the additional fields.

Yep, I see your point. Adding more fields is safer. Added less fields is problematic. Better more fields than less.

// If there are joins/enriches in the middle, these could override some of these fields.
// We don't know at this stage, so we have to keep all of them.
if (canRemoveAliases[0]) {
// If there are joins, enriches etc. in the middle, these could override some of these fields.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I suggest a slightly different wording here, since this code in EsqlSession is tricky enough in what removes and what keeps and comments DO help a lot:

"
If the current node in the tree is of type JOIN (lookup join, inlinestats) or ENRICH or other type of command that we may add in the future which can override already defined Aliases with EVAL (for example "from test | eval ip = 123 | enrich ips_policy ON hostname | rename ip AS my_ip" and ips_policy enriches the results with the same name ip field), these aliases should be kept in the list of fields.
"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 it looks more clear, I'm changing it. Thanks!

LOADING_NON_INDEXED_IP_FIELDS,

/**
* During resolution (pre-analysis) we have to consider that joins can override EVALuated values
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* During resolution (pre-analysis) we have to consider that joins can override EVALuated values
* During resolution (pre-analysis) we have to consider that joins or enriches can override EVALuated values

AttributeSet planRefs = p.references();
Set<String> fieldNames = planRefs.names();
p.forEachExpressionDown(Alias.class, alias -> {
// do not remove the UnresolvedAttribute that has the same name as its alias, ie "rename id = id"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// do not remove the UnresolvedAttribute that has the same name as its alias, ie "rename id = id"
// do not remove the UnresolvedAttribute that has the same name as its alias, ie "rename id AS id"

@luigidellaquila luigidellaquila enabled auto-merge (squash) April 15, 2025 15:42
@luigidellaquila luigidellaquila merged commit de42ba3 into elastic:main Apr 15, 2025
17 checks passed
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
8.18 Commit could not be cherrypicked due to conflicts
8.x Commit could not be cherrypicked due to conflicts
9.0 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 126614

luigidellaquila added a commit to luigidellaquila/elasticsearch that referenced this pull request Apr 15, 2025
luigidellaquila added a commit to luigidellaquila/elasticsearch that referenced this pull request Apr 15, 2025
afoucret pushed a commit to afoucret/elasticsearch that referenced this pull request Apr 16, 2025
luigidellaquila added a commit to luigidellaquila/elasticsearch that referenced this pull request Apr 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged backport pending >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.18.1 v8.19.0 v9.0.1 v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ES|QL: JOIN left field is incompatible with right field of type
4 participants