-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Fix: optimize projections for unnest logical plan. #16632
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
2049964
to
f5f0172
Compare
Thanks @bert-beyondloops -- sorry for the delay in review I started the CI tests The changes in this PR make sense to me -- the only thing I think it needs is some test to ensure we don't regress Can you please add some tests, ideally in sqllogicteset format? Here are the instructions: https://github.com/apache/datafusion/tree/main/datafusion/sqllogictest Ideally you should be able to extend one of the existing test files in https://github.com/apache/datafusion/tree/main/datafusion/sqllogictest/test_files Perhaps in https://github.com/apache/datafusion/blob/main/datafusion/sqllogictest/test_files/unnest.slt Specifically, I think an explain plan test that shows the projection being pushed down would be about all we need |
Hi @alamb, Thanks for the review. There are already some tests verifying the explain plans for the unnest plan. The scheduled run will currently fail. |
Indeed the exising tests definitely seem to cover it -- you can see the reduction in projections
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @bert-beyondloops -- this looks good to me
05)--------CoalesceBatchesExec: target_batch_size=8192 | ||
06)----------FilterExec: column1@0 = 2 | ||
06)----------FilterExec: column1@0 = 2, projection=[column2@1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
those look like better plans to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed.
I'm still looking into the panicking issue.
0a0e45c
to
072dc05
Compare
I moved the logic around the unnest column detection into a new Unnest::try_new method. The projection pushdown optimization code has been improved/fixed All tests are green now. |
@@ -3995,6 +3995,211 @@ impl PartialOrd for Unnest { | |||
} | |||
} | |||
|
|||
impl Unnest { | |||
pub fn try_new( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree this looks much nicer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good -- thank you @bert-beyondloops
options: into_required!(unnest.options)?, | ||
})) | ||
|
||
LogicalPlanBuilder::from(input) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a nice improvement too
072dc05
to
87b512c
Compare
branch has been rebased. |
commits are squashed on merge so no need to do it on the branch Pushing commits rather than rebasing also helps make it easier to see what has changed since the last review, in my opinion, though I know there are other opinions on this matter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @bert-beyondloops
Which issue does this PR close?
Rationale for this change
See issue
What changes are included in this PR?
Are these changes tested?
This PR serves as a starting point for discussion in the community.
Are there any user-facing changes?
No