How to maintain the same format and datatype when storing and loading data using to_netcdf() and open_dataset() #6478
Unanswered
oswald1234
asked this question in
Q&A
Replies: 1 comment 3 replies
-
Thanks for raising this! This would be a serious problem if you can demonstrate a consistent pattern of this behavior. That said, this is something xarray tests explicitly check for, and I can't reproduce the problem using a simple example:

In [4]: cube = xr.Dataset(
   ...:     {
   ...:         'a': (('y', 'x'), np.random.randint(0, 100, size=(12, 10), dtype='uint8')),
   ...:         'b': (('x', ), np.arange(0, 50000, 5000, dtype='uint16')),
   ...:     },
   ...:     coords={
   ...:         'x': np.arange(10),
   ...:         'y': np.arange(12),
   ...:         'lat': (('y', 'x'), (np.arange(12).reshape(-1, 1) - 0.2 * np.arange(10))),
   ...:     },
   ...: )
In [5]: cube
Out[5]:
<xarray.Dataset>
Dimensions:  (y: 12, x: 10)
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9
  * y        (y) int64 0 1 2 3 4 5 6 7 8 9 10 11
    lat      (y, x) float64 0.0 -0.2 -0.4 -0.6 -0.8 ... 10.0 9.8 9.6 9.4 9.2
Data variables:
    a        (y, x) uint8 95 86 28 57 66 75 61 41 44 ... 17 59 99 67 86 9 15 78
    b        (x) uint16 0 5000 10000 15000 20000 25000 30000 35000 40000 45000
In [6]: cube.to_netcdf(path='local-test.nc', format='NETCDF4', mode='w')
In [7]: xr.open_dataset('local-test.nc').load()
Out[7]:
<xarray.Dataset>
Dimensions:  (y: 12, x: 10)
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9
  * y        (y) int64 0 1 2 3 4 5 6 7 8 9 10 11
    lat      (y, x) float64 0.0 -0.2 -0.4 -0.6 -0.8 ... 10.0 9.8 9.6 9.4 9.2
Data variables:
    a        (y, x) uint8 95 86 28 57 66 75 61 41 44 ... 17 59 99 67 86 9 15 78
    b        (x) uint16 0 5000 10000 15000 20000 25000 30000 35000 40000 45000

A lot of times, when people run into something like this, there is a step in their workflow which is changing the data type just before they write to disk. Can you see if you're able to create a very simple Minimal, Complete, Verifiable Example like the above? Make sure you're using the latest version of xarray and include the output of |
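One way to narrow down where the type changes is to compare dtypes right before writing and right after reading. The following is a minimal sketch, not part of xarray: `dtype_changes` and `roundtrip_dtype_changes` are hypothetical helper names, and the round trip assumes a NETCDF4-capable backend (such as netCDF4 or h5netcdf) is installed.

```python
import numpy as np
import xarray as xr


def dtype_changes(before: xr.Dataset, after: xr.Dataset) -> dict:
    """Return {name: (dtype_before, dtype_after)} for every variable that differs.

    A variable missing from `after` is reported with dtype_after = None.
    """
    changes = {}
    for name, var in before.variables.items():
        if name not in after.variables:
            changes[name] = (str(var.dtype), None)
        elif after.variables[name].dtype != var.dtype:
            changes[name] = (str(var.dtype), str(after.variables[name].dtype))
    return changes


def roundtrip_dtype_changes(ds: xr.Dataset, path: str) -> dict:
    """Write `ds` to `path`, reopen it, and report any dtype differences."""
    # Same call pattern as in the example above.
    ds.to_netcdf(path, format="NETCDF4", mode="w")
    # Reading dtypes does not require loading the data into memory.
    with xr.open_dataset(path) as reloaded:
        return dtype_changes(ds, reloaded)
```

Calling `roundtrip_dtype_changes(cube, 'local-test.nc')` on the dataset above should return an empty dict if the round trip preserves every dtype. A non-empty result on your own data would point at the variables worth including in the minimal example.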
-
When I store my dataset, the datatypes for the variables are uint8 and uint16, but when I open it again, the datatypes are float32, and some coordinates are represented as variables.
How do I prevent this?
cube.to_netcdf(path=os.path.join(savedir, filename), format='NETCDF4', mode='w')
xr.open_dataset(os.path.join(savedir, filename))
Should I specify something in the dataset before saving or reopening the file?
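One possible cause, offered as a hypothesis rather than a diagnosis of this dataset: if a variable carries CF packing metadata such as `_FillValue`, `scale_factor`, or `add_offset` in its `attrs` or `encoding`, `open_dataset` by default (`mask_and_scale=True`) masks and scales it on read, which turns small integer types into floats. The sketch below lists such variables and reopens a file without that decoding. `packing_metadata` and `open_raw` are hypothetical helper names, not xarray functions.

```python
import numpy as np
import xarray as xr

# CF attributes that trigger masking/scaling when a file is decoded on read.
PACKING_KEYS = ("_FillValue", "missing_value", "scale_factor", "add_offset")


def packing_metadata(ds: xr.Dataset) -> dict:
    """Return {variable name: [packing keys found in attrs or encoding]}."""
    found = {}
    for name, var in ds.variables.items():
        keys = [k for k in PACKING_KEYS if k in var.attrs or k in var.encoding]
        if keys:
            found[name] = keys
    return found


def open_raw(path: str) -> xr.Dataset:
    """Open a file without fill-value masking or scale/offset decoding."""
    # With mask_and_scale=False the stored integer values are returned as-is.
    return xr.open_dataset(path, mask_and_scale=False)
```

If `packing_metadata(cube)` is non-empty before saving, comparing `open_raw(path)` with the default `xr.open_dataset(path)` would show whether decoding explains the float32 result. For coordinates that come back as data variables, `Dataset.set_coords` can promote them again after loading, although that treats the symptom rather than the cause.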