apply_ufunc with dask="parallelized" and "allowed" #315

Open
@dougiesquire

Many xskillscore methods call xarray's apply_ufunc with dask="parallelized" to support dask arrays. When the wrapped function natively supports dask arrays, dask="allowed" is the preferred option. See here for details.

This issue lists all current methods within xskillscore that use apply_ufunc and summarises, for each method, how much work is involved in enabling dask="allowed".
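For context, the switch being discussed can be sketched as follows. This is a minimal illustration, not xskillscore's actual code: `_mse` and `mse` are hypothetical stand-ins for a wrapped metric that only uses operations dask arrays also implement, and the array shapes and chunking are made up.

```python
import dask.array as dsa
import numpy as np
import xarray as xr


def _mse(a, b, axis):
    # Hypothetical wrapped function: plain arithmetic and a reduction,
    # both of which work on numpy and dask arrays alike.
    return ((a - b) ** 2).mean(axis=axis)


def mse(a, b, dim, dask="allowed"):
    # apply_ufunc moves the core dimension to the last axis, hence axis=-1.
    return xr.apply_ufunc(
        _mse,
        a,
        b,
        input_core_dims=[[dim], [dim]],
        kwargs={"axis": -1},
        dask=dask,
        output_dtypes=[float],  # only consulted when dask="parallelized"
    )


rng = np.random.default_rng(0)
# Chunk along "x" only, so the core dimension "time" stays in a single chunk.
a = xr.DataArray(rng.random((4, 6)), dims=("x", "time")).chunk({"x": 2})
b = xr.DataArray(rng.random((4, 6)), dims=("x", "time")).chunk({"x": 2})

lazy_allowed = mse(a, b, "time", dask="allowed")
lazy_parallelized = mse(a, b, "time", dask="parallelized")
```

Both calls should stay lazy and agree once computed; with dask="allowed" the dask arrays are passed straight into `_mse` instead of being handed over block by block.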

xskillscore.contingency

  • gerrity_score : already dask="allowed"

xskillscore.deterministic

  • linslope : all numpy functions used can ingest and return dask arrays so can simply switch dask="parallelized" for dask="allowed"
  • pearson_r : all numpy functions used can ingest and return dask arrays so can simply switch dask="parallelized" for dask="allowed"
  • pearson_r_p_value : requires slight refactor to how masked nans are reset here since dask doesn't seem to be able to take the length of an empty array
  • effective_sample_size : all numpy functions used can ingest and return dask arrays so can simply switch dask="parallelized" for dask="allowed"
  • pearson_r_eff_p_value : requires slight refactor to how masked nans are reset here since dask doesn't seem to be able to take the length of an empty array
  • spearman_r : need to wrap bottleneck.nanrankdata with dask.array.map_blocks or equivalent, although I don't know if this is any better than using dask="parallelized"
  • spearman_r_p_value : as above
  • spearman_r_eff_p_value : as above
  • r2 : all numpy functions used can ingest and return dask arrays so can simply switch dask="parallelized" for dask="allowed"
  • me : all numpy functions used can ingest and return dask arrays so can simply switch dask="parallelized" for dask="allowed"
  • rmse : all numpy functions used can ingest and return dask arrays so can simply switch dask="parallelized" for dask="allowed"
  • mse : all numpy functions used can ingest and return dask arrays so can simply switch dask="parallelized" for dask="allowed"
  • mae : all numpy functions used can ingest and return dask arrays so can simply switch dask="parallelized" for dask="allowed"
  • median_absolute_error : all numpy functions used can ingest and return dask arrays so can simply switch dask="parallelized" for dask="allowed"
  • mape : all numpy functions used can ingest and return dask arrays so can simply switch dask="parallelized" for dask="allowed"
  • smape : all numpy functions used can ingest and return dask arrays so can simply switch dask="parallelized" for dask="allowed"
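For the p-value methods above, one possible shape for the refactor is to reset NaNs with an elementwise select rather than by collecting NaN indices and checking their length. The sketch below assumes that is roughly what the existing code does; `reset_nans` is a hypothetical helper name and the inputs are made up.

```python
import dask.array as dsa
import numpy as np


def reset_nans(result, r):
    # Hypothetical helper: wherever the correlation r is NaN, put NaN back
    # into result. np.where is elementwise, so no len() of an index array
    # is needed, and it accepts numpy and dask arrays alike.
    return np.where(np.isnan(r), np.nan, result)


r = dsa.from_array(np.array([0.5, np.nan, -0.2]), chunks=2)
p = dsa.from_array(np.array([0.1, 0.0, 0.3]), chunks=2)

p_masked = reset_nans(p, r)  # expected to remain a lazy dask array
```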

xskillscore.probabilistic

  • crps_gaussian : this is a wrapper on properscoring which expects numpy arrays (and actually triggers compute with dask="allowed"). Getting dask="allowed" working properly would require a full refactor of properscoring
  • crps_quadrature : this is a wrapper on properscoring which expects numpy arrays (and actually triggers compute with dask="allowed"). Getting dask="allowed" working properly would require a full refactor of properscoring
  • crps_ensemble : this is a wrapper on properscoring which expects numpy arrays (and actually triggers compute with dask="allowed"). Getting dask="allowed" working properly would require a full refactor of properscoring
  • brier_score : this is a wrapper on properscoring which expects numpy arrays (and actually triggers compute with dask="allowed"). Getting dask="allowed" working properly would require a full refactor of properscoring
  • threshold_brier_score : this is a wrapper on properscoring which expects numpy arrays (and actually triggers compute with dask="allowed"). Getting dask="allowed" working properly would require a full refactor of properscoring
  • rank_histogram : need to wrap bottleneck.nanrankdata with dask.array.map_blocks or equivalent, although I don't know if this is any better than using dask="parallelized"
  • reliability : already dask="allowed"
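The map_blocks wrapping mentioned for spearman_r and rank_histogram might look something like the sketch below. `nanrank_last_axis` is a hypothetical helper, and it assumes ranking is always done along the last axis, which must sit in a single chunk for block-wise ranking to be correct.

```python
import bottleneck as bn
import dask.array as dsa
import numpy as np


def nanrank_last_axis(x):
    # Hypothetical helper: rank along the last axis, ignoring NaNs.
    if isinstance(x, dsa.Array):
        # Each block must see the full ranked axis, so collapse it to one chunk.
        x = x.rechunk({x.ndim - 1: -1})
        return x.map_blocks(
            bn.nanrankdata,
            axis=-1,
            dtype=float,
            meta=np.array((), dtype=float),
        )
    return bn.nanrankdata(x, axis=-1)


data = np.array([[3.0, 1.0, np.nan, 2.0], [1.0, 2.0, 3.0, 4.0]])
lazy_ranks = nanrank_last_axis(dsa.from_array(data, chunks=(1, 2)))
eager_ranks = nanrank_last_axis(data)
```

Whether this buys anything over dask="parallelized" is unclear, as noted above, since both end up applying the numpy-only function block by block.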

xskillscore.resampling

  • resample_iterations_idx : use dask moveaxis when dask array. This would be easily handled with duck array ops - see below.
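A duck-array style dispatch for moveaxis could be as small as the following. This is only a sketch; the `moveaxis` wrapper is a hypothetical helper and not an existing xskillscore or xarray function.

```python
import dask.array as dsa
import numpy as np


def moveaxis(x, source, destination):
    # Hypothetical duck-array op: pick the dask implementation for dask
    # arrays and fall back to numpy for everything else.
    if isinstance(x, dsa.Array):
        return dsa.moveaxis(x, source, destination)
    return np.moveaxis(x, source, destination)


lazy = dsa.ones((2, 3, 4), chunks=(1, 3, 4))
moved_lazy = moveaxis(lazy, 0, -1)
moved_eager = moveaxis(np.ones((2, 3, 4)), 0, -1)
```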
