Skip to content

Result of halfwidth_ci_test may not correct. #413

Open
@Tianjie-Wu

Description

@Tianjie-Wu

In document of xskillscore, result of halfwidth_ci_test is described as follow: difference in scores (score(f2) - score(f1))

However, I discovered that this is only partially right.
For example, when the metric is RMSE, this method returns the difference score(f2) - score(f1), but when the metric is ME, this function returns the difference score(f1) - score(f2).

Code Sample, a copy-pastable example if possible

length = 100
obs_1d = xarray.DataArray(
       numpy.random.rand(length),
       coords=[
           numpy.arange(length),
       ],
       dims=["time"],
       name='var'
   )
fct_1d = obs_1d.copy()
fct_1d.values = numpy.random.rand(length)

# create a worse forecast with high but different to perfect correlation and choose RMSE as the distance metric
fct_1d_worse = fct_1d.copy()
step = 3
fct_1d_worse[::step] = fct_1d[::step].values + 0.3

# half-with of the confidence interval at level alpha is larger than the RMSE differences,
# therefore not significant
metric = "rmse"
alpha = 0.05
significantly_different, diff, hwci = xskillscore.halfwidth_ci_test(
    fct_1d, fct_1d_worse, obs_1d, metric, time_dim="time", dim=[], alpha=alpha
)
print("RMSE DIFF:",diff.values)
print("RMSE HWCI:",hwci.values)
print(
    f"RMSEs significantly different at level {alpha} : {bool(significantly_different)}"
)

metric = "me"
alpha = 0.05
significantly_different, diff, hwci = xskillscore.halfwidth_ci_test(
    fct_1d, fct_1d_worse, obs_1d, metric, time_dim="time", dim=[], alpha=alpha
)
print("ME DIFF:",diff.values)
print("ME HWCI:",hwci.values)
print(
    f"MEs significantly different at level {alpha} : {bool(significantly_different)}"
)

print ( "RMSE 2,1: ",xskillscore.rmse(fct_1d_worse,obs_1d, dim=[]).values.mean(),xskillscore.rmse(fct_1d,obs_1d, dim=[]).values.mean() )
print ( "RMSE DIFF 2-1:", (xskillscore.rmse(fct_1d_worse,obs_1d, dim=[]).values-xskillscore.rmse(fct_1d,obs_1d, dim=[]).values).mean() )
print ( "ME 2,1: ",(fct_1d_worse-obs_1d).mean().values,(fct_1d-obs_1d).mean().values )
print ( "ME DIFF 2-1: ",((fct_1d_worse-obs_1d).values-(fct_1d-obs_1d).values).mean() )

print ("xskillscore version:",xskillscore.__version__)

the result is listed as follow:

RMSE DIFF: 0.002145380792967111
RMSE HWCI: 0.03084942672029623
RMSEs significantly different at level 0.05 : False
ME DIFF: -0.10199999999999998
ME HWCI: 0.027772820244038265
MEs significantly different at level 0.05 : True
RMSE 2,1:  0.35575854107274524 0.3536131602797781
RMSE DIFF 2-1: 0.0021453807929670793
ME 2,1:  0.02998835594288012 -0.07201164405711986
ME DIFF 2-1:  0.102
xskillscore version: 0.0.24

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions