-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Fix critical np.timedelta64
encoding bugs
#10469
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Spencer!
One other strategic question -- do we need this separate logic for decoding into an integer dtype that exactly matches the precision of the datetime64 units? Lines 1456 to 1487 in 4a581f4
I am wondering because it seems like this logic is already covered in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick feedback @shoyer!
I am wondering because it seems like this logic is already covered in
encode_cf_timedelta()
, and adding this explicit check makes means thattimedelta64[ns]
data cannot be saved directly to netCDF3 files unless the units are converted first.timedelta64[ns]
worked viaencode_cf_timedelta
as long as nanosecond precision isn't really needed, similar to howdatetime64[ns]
is OK. This was convenient for users of "human" time units.
I agree this is annoying, and it seems like we should be able to relax it some. I'll think about it some more and see what I can do.
np.timedelta64
encoding bugsnp.timedelta64
encoding bugs
e189b66
to
624625b
Compare
Always write a dtype attribute to disk regardless of how the timedeltas were decoded.
624625b
to
bdda733
Compare
I made the similar proposal for Intervals here: #8005 (comment) |
Thanks @dcherian—Stephan's suggestion of always writing a xarray/xarray/coding/variables.py Lines 582 to 594 in 55dc766
|
Thank Spencer, this looks great. I can confirm this fixes a few Google related projects that were using timedelta64 serialization. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @spencerkclark! I'm always hoping that it will be the last time we have to touch these lines of code. Great! 🎉
Thanks @shoyer and @kmuehlbauer for the quick reviews, and confirming that it works in downstream projects. I think this is in a much better state (not least because it now actually works as advertised...). I will go ahead and merge. |
Guys this was merged with failing tests, and the failures appear to have somehow been introduced in this commit of this PR. Please see zarr-developers/zarr-python#3199, ad let us know if you have any idea why changes to encoding would affect zarr like this... 😕 |
Sorry about that! I wrongfully assumed those failures were unrelated, since they passed earlier in the development of this PR. I can reproduce locally, but I'm still a bit puzzled. It seems related to me parametrizing this test with a Ah as I'm writing this it looks like you've sorted it out (#10492) — thanks! |
This PR fixes the critical
np.timedelta64
encoding bugs introduced in #10101. We now always encodenp.timedelta64
values with adtype
attribute corresponding to the in-memorydtype
, and use the same encoding path that we did previously, which by default selects the coarsest units that support integer serialization. This enables storing"timedelta64[ns]"
values in netCDF3 format, which was not supported by the "literal" encoding approach implemented in #10101.For consistency with the previous
units
-based decoding approach, this update also now enables controlling the decoded resolution ofdtype
-decoded values via thetime_unit
parameter of theCFTimedeltaCoder
class. By default thetime_unit
parameter is nowNone
. If thetime_unit
isNone
thedtype
attribute determines thedtype
the data is decoded to; if thetime_unit
is notNone
it takes precedence.whats-new.rst
cc: @shoyer @kmuehlbauer; sorry again for the trouble.