feat: introduce VacuumMode::Full for cleaning up orphaned files #3368
d56badf to 1cc4289
Pull request was converted to draft
1cc4289 to c734263
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff            @@
##             main    #3368     +/-   ##
=========================================
  Coverage   72.01%   72.01%
=========================================
  Files         148      148
  Lines       46082    46137     +55
  Branches    46082    46137     +55
=========================================
+ Hits        33184    33225     +41
- Misses      10791    10794      +3
- Partials     2107     2118     +11
View full report in Codecov by Sentry.
crates/core/src/operations/vacuum.rs
Outdated
/// The `lite` mode will only remove files which are referenced in the `_delta_log` associagted
/// with `remove` action
The Databricks docs on this say something different, and I still can't decipher what they truly mean. Is there a spec for this?
/// The `lite` mode will only remove files which are referenced in the `_delta_log` associagted
/// with `remove` action
#[default]
Lite,
Let's also expose the mode to python 😊
Would you be up for adding this exposure after we merge? 😈
Sure!
c734263 to bcd4960
bcd4960 to a868406
👍
crates/core/src/operations/vacuum.rs
Outdated
/// Type of Vacuum operation to perform
#[derive(Debug, Default, Clone, PartialEq)]
pub enum VacuumMode {
/// The `lite` mode will only remove files which are referenced in the `_delta_log` associagted
Obviously not a big deal, but "associagted" -> "associated"
b5a9b0e to 9d2ffd8
This allows an optional but not-on-by-default mode of removing untracked files in the delta table directory. Delta/Spark supports a "lite" and "full" mode for [vacuum]. This change is intentionally not making "full" the default as it is for Delta/Spark since that may have unintended consequences for our users who have become accustomed to "lite" being the default.

Fixes delta-io#2349

[vacuum]: https://docs.delta.io/latest/delta-utility.html#remove-files-no-longer-referenced-by-a-delta-table

Signed-off-by: R. Tyler Croy <[email protected]>
9d2ffd8 to 010fb04
This allows an optional but not-on-by-default mode of removing untracked
files in the delta table directory. Delta/Spark supports a "lite" and
"full" mode for vacuum. This change is intentionally not making "full"
the default as it is for Delta/Spark since that may have unintended
consequences for our users who have become accustomed to "lite" being
the default.
Fixes #2349
Signed-off-by: R. Tyler Croy [email protected]
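
To make the difference between the two modes concrete, here is a minimal, std-only sketch of how deletion candidates could be selected. It is not the code from this PR: the `VacuumMode` enum below only mirrors the snippet quoted in the review, and `files_to_delete` and its parameters (`on_disk`, `active`, `removed`) are hypothetical names introduced for illustration. The sketch assumes `on_disk` lists data files only (nothing under `_delta_log`) and ignores the retention period entirely.

```rust
use std::collections::BTreeSet;

/// Hypothetical mirror of the enum quoted in the review; the real definition
/// lives in crates/core/src/operations/vacuum.rs and may differ.
#[derive(Debug, Default, Clone, PartialEq)]
pub enum VacuumMode {
    /// Only remove files that the log references through a `remove` action.
    #[default]
    Lite,
    /// Also remove files in the table directory that the log does not track.
    Full,
}

/// Hypothetical helper (not part of delta-rs): choose which files a vacuum
/// would delete. Assumes `on_disk` holds data files only and that retention
/// checks happen elsewhere.
pub fn files_to_delete(
    mode: &VacuumMode,
    on_disk: &BTreeSet<String>,
    active: &BTreeSet<String>,
    removed: &BTreeSet<String>,
) -> BTreeSet<String> {
    on_disk
        .iter()
        .filter(|path| match mode {
            // Lite: the file must have a `remove` action in the log.
            VacuumMode::Lite => removed.contains(*path),
            // Full: anything not referenced by the current table state,
            // which covers removed files and untracked (orphaned) files.
            VacuumMode::Full => !active.contains(*path),
        })
        .cloned()
        .collect()
}
```

With a table directory containing `a.parquet` (active), `b.parquet` (removed in the log), and `orphan.parquet` (never referenced), this sketch selects only `b.parquet` in `Lite` mode and both `b.parquet` and `orphan.parquet` in `Full` mode. Because `Lite` carries `#[default]`, `VacuumMode::default()` keeps the behavior users are already accustomed to.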