What is your issue?
Hi! Is there a way to specify the Unicode normalization form (e.g., NFD or NFC) when using to_netcdf()? I have variable names with Unicode characters and want to ensure consistent normalization.
Here's a simple example of the issue:
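import xarray as xr

# Create a dataset with a Unicode variable name and save it
original_name = "a\u0304"  # "ā" written in NFD: "a" + combining macron
ds = xr.Dataset({original_name: ([], 1)})
ds.to_netcdf("test.nc")  # to_netcdf() returns None here, so keep ds separate

# Load the file back and compare the variable name
ds_loaded = xr.open_dataset("test.nc")
loaded_name = list(ds_loaded.variables.keys())[0]
original_name == loaded_name  # Returns False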
Doing some digging, it seems that when writing to NetCDF, all names get normalized to NFC. However, I'd like them normalized to NFD instead, since that matches the characters I can compose in IPython, vim, VS Code, etc.
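For reference, here is a minimal sketch of the mismatch and of a possible workaround on the reading side. It only uses the standard library's unicodedata.normalize; the helper name normalized_names is hypothetical and not part of xarray, and the sketch assumes the names on disk really are in NFC.

```python
import unicodedata


def normalized_names(names, form="NFD"):
    """Map each name to its normalized form, skipping names already in that form.

    Hypothetical helper for illustration; not an xarray API.
    """
    mapping = {}
    for name in names:
        target = unicodedata.normalize(form, name)
        if target != name:
            mapping[name] = target
    return mapping


nfc_name = "\u0101"   # "ā" as one precomposed code point (NFC)
nfd_name = "a\u0304"  # "a" followed by a combining macron (NFD)

# The two strings render identically but compare unequal
names_differ = nfc_name != nfd_name

# Normalizing both sides to the same form makes them comparable
same_after_nfc = unicodedata.normalize("NFC", nfd_name) == nfc_name

# Build an old-name -> new-name mapping for names that are not yet in NFD
rename_map = normalized_names([nfc_name, "temperature"])
```

If this holds up, passing such a mapping to ds_loaded.rename(...) after open_dataset() might restore NFD names, though that only works around the issue on read and does not change what is written to the file.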
Thanks!