-
Notifications
You must be signed in to change notification settings - Fork 25.2k
ESQL: incorrect planning of remote enrich #118531
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Pinging @elastic/es-analytical-engine (Team:Analytics) |
Excellent find @bpintea ! This can be reproduced without CCQ by forcing the I played around and it seems that otherwise, our planner really tries to execute Local reproducer:
|
In this case,
|
It looks like the bug happens when mapping from logical to physical plan:
Note the However, it is actually incorrect on the logical level: the I think this means we can solve this in the following ways:
|
Indeed, I also think the
Well, the sorting should happen on the id that comes from the source (well, from the EVAL on top of it in your example). So in that sense, it's correct. But by the way ENRICH is planned, it shadows this attribute with its own. Ideally we'd plan such that this ENRICH attribute is just dropped -- I think the fix should go in this direction (so maybe along your 2nd point above). Edit:
Even if the query asks to KEEP it, we could still load it from the FROM-source "branch" (as it should be the same in both source and enrich policy/index). But maybe still not the right solution, after all. I guess we'd have to do some renaming dancing and let both attributes through the exchanges. |
I'd prefer we attempt to relax the dependency check to allow duplicate names in the physical plans, and make the physical plan's The headache which is shadowing is required to consider in logical plans' If we could stop caring about shadowing in physical plans, that would bring more simplicity into this part of the code base. |
This disables verifying the plans generated for remote ENRICHing. It also re-enables corresponding failing test. Related: elastic#118531 Fixes elastic#118307. (cherry picked from commit e7a4436)
) * ESQL: Enable physical plan verification (#118114) This enables the physical plan verification. For it, a couple of changes needed to be applied/corrected: * AggregateMapper creates attributes with unique names; * AggregateExec's verification needs not consider ordinal attribute(s); * LookupJoinExec needs to merge attributes of same name at output, "winning" the right child; * ExchangeExec does no input referencing, since it only outputs all synthetic attributes, "sourced" from remote exchanges; * FieldExtractExec doesn't reference the attributes it "produces". * ESQL: Disable remote enrich verification (#118534) This disables verifying the plans generated for remote ENRICHing. It also re-enables corresponding failing test. Related: #118531 Fixes #118307. (cherry picked from commit e7a4436)
…#118534) (elastic#118302) * ESQL: Enable physical plan verification (elastic#118114) This enables the physical plan verification. For it, a couple of changes needed to be applied/corrected: * AggregateMapper creates attributes with unique names; * AggregateExec's verification needs not consider ordinal attribute(s); * LookupJoinExec needs to merge attributes of same name at output, "winning" the right child; * ExchangeExec does no input referencing, since it only outputs all synthetic attributes, "sourced" from remote exchanges; * FieldExtractExec doesn't reference the attributes it "produces". * ESQL: Disable remote enrich verification (elastic#118534) This disables verifying the plans generated for remote ENRICHing. It also re-enables corresponding failing test. Related: elastic#118531 Fixes elastic#118307. (cherry picked from commit e7a4436)
…#118534) (elastic#118302) * ESQL: Enable physical plan verification (elastic#118114) This enables the physical plan verification. For it, a couple of changes needed to be applied/corrected: * AggregateMapper creates attributes with unique names; * AggregateExec's verification needs not consider ordinal attribute(s); * LookupJoinExec needs to merge attributes of same name at output, "winning" the right child; * ExchangeExec does no input referencing, since it only outputs all synthetic attributes, "sourced" from remote exchanges; * FieldExtractExec doesn't reference the attributes it "produces". * ESQL: Disable remote enrich verification (elastic#118534) This disables verifying the plans generated for remote ENRICHing. It also re-enables corresponding failing test. Related: elastic#118531 Fixes elastic#118307. (cherry picked from commit e7a4436)
Description
A remote enrich query using a policy that exports/enriches with a field already present in the query is planned slightly incorrectly -- it works, but fails verification.
Example: policy
hosts
matches onip
and has as enrich_fieldsip
andos
:This produces the optimised physical plan:
Note that in the fragment,
Project
outputsip{r}#3
(the node is produced byProjectAwayColumns
based on the TopN belowEnrich
), butEnrich
below it outputsip{r}#20
(since it also enriches with its ownip
field). So the verification fails later when remapping the fragment plan on ProjectExec, since its inputs don't provide anip{r}#3
. (If we KEEPip
too, the verification would also fail due to attributes with duplicate name.)Normally we would drop the
ip
after TopN, but the Enrich remote planning pushes it to the remote cluster and theip
is still needed for the coordinator TopN.Related: #118307.
The text was updated successfully, but these errors were encountered: