Regular Grid fixes #39

Open
wants to merge 3 commits into
base: main

Conversation

Collaborator

@Karinon Karinon commented Jun 27, 2025

@wachsylon came up with a few examples of regular grids which did not work with the current implementation for various reasons. This PR aims to fix these:

  • NaN values in the data are now ignored, so the data renders properly. Example: https://eerie.cloud.dkrz.de/datasets/icon-esm-er.hist-1950.v20240618.ocean.gr025.2d_monthly_mean/stac

  • GlobeRegular can now handle data whose _ARRAY_DIMENSIONS lacks the lat and lon information, so that the shape object only carries a time and a value dimension. The coordinates are now read directly from the lat and lon variables.

  • GlobeRegular can now also handle data with reversed latitudes, i.e. latitudes running from 90 to -90 instead of from -90 to 90.
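A minimal sketch of how the NaN handling and the latitude-order fix listed above could look. The PR diff is not shown in this conversation, so the helper names `finiteRange`, `isDescending` and `flipRows` are hypothetical, and the sketch assumes a row-major (lat, lon) field stored in a `Float64Array`.

```typescript
// Hypothetical helper: value range of a field, skipping NaN entries.
// If every entry is NaN, min stays Infinity and max stays -Infinity.
function finiteRange(data: ArrayLike<number>): { min: number; max: number } {
  let min = Infinity;
  let max = -Infinity;
  for (let i = 0; i < data.length; i++) {
    const v = data[i];
    if (Number.isNaN(v)) continue; // ignore missing values
    if (v < min) min = v;
    if (v > max) max = v;
  }
  return { min, max };
}

// Hypothetical helper: true when latitudes run from north to south (90 to -90).
function isDescending(lats: ArrayLike<number>): boolean {
  return lats.length > 1 && lats[0] > lats[lats.length - 1];
}

// Hypothetical helper: reverse the row order of a row-major (lat, lon) field,
// so that a north-to-south field becomes south-to-north.
function flipRows(data: Float64Array, nLat: number, nLon: number): Float64Array {
  const out = new Float64Array(data.length);
  for (let row = 0; row < nLat; row++) {
    const src = row * nLon;
    const dst = (nLat - 1 - row) * nLon;
    out.set(data.subarray(src, src + nLon), dst);
  }
  return out;
}
```

Whether the renderer flips the data or instead adapts the texture coordinates is a design choice that this sketch does not settle.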

The last two fixes can both be seen in the following example. However, it still does not work because of getGridType, which only recognizes a regular grid by (datavar.attrs._ARRAY_DIMENSIONS as unknown[]).length >= 3, and that check fails for this data. Any suggestions? Right now you can only see it in action by hard-coding REGULAR_GRID in getGridType.

https://eerie.cloud.dkrz.de/datasets/icon-esm-er.hist-1950.v20240618.atmos.gr025.2d_daily_mean/stac

@Karinon Karinon requested a review from d70-t June 27, 2025 09:36

cloudflare-workers-and-pages bot commented Jun 27, 2025

Deploying gridlook with Cloudflare Pages

Latest commit: 139d141
Status: ✅  Deploy successful!
Preview URL: https://49fd06df.gridlook.pages.dev
Branch Preview URL: https://regular-fix.gridlook.pages.dev

Owner

@d70-t d70-t left a comment

I think the changes for NaN and the ordering are fine (although I'm wondering if we shouldn't have such a fix for longitude as well, just in case?).

I'm not sure I get the second point, what do you mean by "shape has only time and value information"? If the dataset doesn't have distinct lat and lon dimensions, it is not a regular grid, because having latitude and longitude dimensions (not necessarily by the names lat and lon) is exactly the property that makes a grid regular instead of irregular.
But I'm probably misunderstanding things here, as I have trouble finding my thoughts in the code...
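The criterion described in this comment (a grid is regular only if latitude and longitude each have their own distinct dimension and the data variable uses both) can be sketched as follows. This is an illustration, not gridlook's actual getGridType; the function name `looksRegular` and its parameters are assumptions.

```typescript
// Hypothetical check: does a data variable sit on a regular lat/lon grid?
// dataDims: dimension names of the data variable
// latDims / lonDims: dimension names of the latitude / longitude coordinate variables
function looksRegular(
  dataDims: string[],
  latDims: string[],
  lonDims: string[]
): boolean {
  // Each coordinate must be one-dimensional ...
  if (latDims.length !== 1 || lonDims.length !== 1) return false;
  const latDim = latDims[0];
  const lonDim = lonDims[0];
  // ... along its own dimension (not a shared one such as "value") ...
  if (latDim === lonDim) return false;
  // ... and the data variable must be defined along both of them.
  return dataDims.includes(latDim) && dataDims.includes(lonDim);
}

// Coordinates with their own dimensions: regular.
const regular = looksRegular(["time", "lat", "lon"], ["lat"], ["lon"]);
// lat and lon both defined along a shared "value" dimension: not regular.
const unstructured = looksRegular(["time", "value"], ["value"], ["value"]);
```

The check deliberately works on dimension names only, so it does not depend on the coordinates being called lat and lon.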

@Karinon
Collaborator Author

Karinon commented Jun 27, 2025

I'm not sure I get the second point, what do you mean by "shape has only time and value information"?

The issue was this code:

  // Get all data and all dimensions for a specific time step
  const rawData = await zarr.get(datavar, [
    currentTimeIndexSliderValue,
    ...Array(datavar.shape.length - 1).fill(null),
  ]);
  ....
  const textures = await getRegularData(
    rawData.data as Float64Array,
    rawData.shape[0],
    rawData.shape[1]
  );

This code leaves me with a rawData object whose shape holds all remaining dimensions, and it assumes that shape[0] is lat and shape[1] is lon. For starters, this doesn't work if the data has more dimensions, e.g. the following dataset has a silly-looking height dimension with only one value, which sits at shape[0]:

https://eerie.cloud.dkrz.de/datasets/icon-esm-er.hist-1950.v20240618.atmos.gr025.2d_daily_mean/stac
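One way to avoid the positional assumption described above is to build the selection from dimension names instead. The following is only a sketch: `buildSelection` is a hypothetical helper, the default names "time", "lat" and "lon" are assumptions, and the sizes in the usage example are illustrative. The resulting array has the same form as the selection passed to zarr.get in the snippet above (an index to pick one entry, null to keep a whole axis).

```typescript
type Selection = (number | null)[];

// Hypothetical helper: choose one time step, keep the full lat/lon axes,
// and drop singleton axes such as a one-element "height".
function buildSelection(
  dimNames: string[],
  shape: number[],
  timeIndex: number,
  latName = "lat",
  lonName = "lon"
): Selection {
  return dimNames.map((name, i) => {
    if (name === "time") return timeIndex; // pick a single time step
    if (name === latName || name === lonName) return null; // keep whole axis
    if (shape[i] === 1) return 0; // singleton axis: take its only entry
    // Anything else would need an explicit choice by the user.
    throw new Error(`cannot reduce dimension "${name}" of size ${shape[i]}`);
  });
}

// Illustrative example with a singleton height dimension.
const selection = buildSelection(
  ["time", "height", "lat", "lon"],
  [10, 1, 721, 1440],
  3
);
```

With such a selection, the sliced array would only keep the lat and lon axes, so shape[0] and shape[1] would mean what the rendering code expects.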

And then there is this dataset (sorry, I wanted to post this one in the initial message and mixed them up):

https://eerie.cloud.dkrz.de/datasets/ifs-amip-tco1279.hist.v20240901.atmos.gr025.2D_monthly/stac

It has two dimensions, time and value, so shape[0] is already the data and shape[1] does not even exist. What works, however, is asking for lats and lons explicitly, even when they are missing in the shape. I am not familiar enough with Zarr to tell you why there are explicit variables which I can query but nothing in the shape, sorry.

I just committed another condition for regular grids in getGridType, which now works with the examples shown.

@Karinon
Collaborator Author

Karinon commented Jun 27, 2025

(although I'm wondering if we shouldn't have such a fix for longitude as well, just in case?).

You are very likely right. It could also be a problem when the data goes from -180 to 180 instead of 0 to 360. I will create issues for this and we can fix it later.
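For reference, mapping longitudes from the -180 to 180 convention onto 0 to 360 is a small modulo operation. This is a sketch only; `normalizeLon` is a hypothetical name, and a real fix would also have to reorder the data columns to match the new longitude order.

```typescript
// Hypothetical helper: map any longitude in degrees to the range [0, 360).
function normalizeLon(lon: number): number {
  // The double modulo handles negative inputs, since % keeps the sign in JS/TS.
  return ((lon % 360) + 360) % 360;
}

const wrapped = [-180, -0.25, 0, 179.75].map(normalizeLon);
// wrapped is [180, 359.75, 0, 179.75]
```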

@d70-t
Owner

d70-t commented Jul 2, 2025

And then there is this dataset (sorry, I wanted to post this one in the initial message and mixed them up):

https://eerie.cloud.dkrz.de/datasets/ifs-amip-tco1279.hist.v20240901.atmos.gr025.2D_monthly/stac

It has two dimensions, time and value, so shape[0] is already the data and shape[1] does not even exist. What works, however, is asking for lats and lons explicitly, even when they are missing in the shape. I am not familiar enough with Zarr to tell you why there are explicit variables which I can query but nothing in the shape, sorry.

Ok, I see what the issue is. The grid-file specifies a lat/lon grid, i.e.:

In [2]: xr.open_dataset("https://swift.dkrz.de/v1/dkrz_7fa6baba-db43-4d12-a295-8e3ebb1a01ed/grids/gr025_descending.zarr", engine="zarr")
Out[2]: 
<xarray.Dataset> Size: 17kB
Dimensions:  (lat: 721, lon: 1440)
Coordinates:
  * lat      (lat) float64 6kB 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
  * lon      (lon) float64 12kB 0.0 0.25 0.5 0.75 ... 359.0 359.2 359.5 359.8

but the data-file only has the value dimension:

In [3]: xr.open_dataset("https://eerie.cloud.dkrz.de/datasets/ifs-amip-tco1279.hist.v20240901.atmos.gr025.2D_monthly/zarr", engine="zarr")
Out[3]: 
<xarray.Dataset> Size: 118GB
Dimensions:       (time: 528, value: 1038240, level: 4)
Coordinates:
    lat           (value) float64 8MB ...
  * level         (level) int64 32B 1 2 3 4
    lon           (value) float64 8MB ...
  * time          (time) datetime64[ns] 4kB 1980-01-15T12:00:00 ... 2023-12-1...

I'd consider this as a broken setup, because the grid doesn't match the data (and probably shouldn't even be there, as the data already contains lat and lon). So I guess we should reject this form of grid+data combination.
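Rejecting such a combination could be based on comparing the grid's dimensions with those of the data. The sketch below is an assumption about how that might look, not gridlook code; `missingGridDims` is a hypothetical helper, and the sizes are the ones from the two xarray outputs above.

```typescript
type DimSizes = Record<string, number>;

// Hypothetical helper: names of grid dimensions that the data does not have
// with the same size. An empty result means grid and data are compatible.
function missingGridDims(gridDims: DimSizes, dataDims: DimSizes): string[] {
  return Object.keys(gridDims).filter(
    (name) => dataDims[name] !== gridDims[name]
  );
}

const gridDims: DimSizes = { lat: 721, lon: 1440 };
const dataDims: DimSizes = { time: 528, value: 1038240, level: 4 };

// The data has neither a lat nor a lon dimension, so both are reported.
const problems = missingGridDims(gridDims, dataDims);
```

Note that 721 * 1440 = 1038240, so the value axis has exactly as many entries as the grid has points. That suggests a flattened lat/lon grid, but the dimension names still do not match, which is why this check flags the pair.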

Also, when loading the dataset directly with gridlook (using auto-detection), i.e.: https://gridlook.pages.dev/#https://eerie.cloud.dkrz.de/datasets/ifs-amip-tco1279.hist.v20240901.atmos.gr025.2D_monthly/zarr , gridlook (correctly) detects this dataset as unstructured and displays it correctly.
