Commit e433d20

jzwinck authored and larsmans committed
FIX use float64 in metrics.r2_score() to prevent overflow
Without this, if the input arrays are of type np.float32, their sums may be computed with a large accumulated error, resulting in the wrong score for very long arrays (millions of elements). The "1 - numerator / denominator" calculation at the very end produces a float64 anyway, so the returned type does not change, only the accuracy. Fixes scikit-learn#2158.
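The effect described above can be illustrated with a small sketch. It uses a deliberately naive sequential float32 accumulator written as a Python loop, which is an assumption made for illustration and not necessarily what any given NumPy version does internally (newer releases use a more accurate summation scheme for many cases, so the gap seen with a plain `.sum()` may be smaller). The array contents are hypothetical.

```python
import numpy as np

# Hypothetical data: one large float32 value followed by 1000 ones.
x = np.ones(1001, dtype=np.float32)
x[0] = 2.0 ** 24  # 16777216; above this, float32 cannot represent steps of 1

# Naive sequential accumulation kept in float32 (illustrative only).
acc32 = np.float32(0.0)
for v in x:
    acc32 = np.float32(acc32 + v)

# Same data, accumulated in float64 as the patch requests via dtype=.
acc64 = x.sum(dtype=np.float64)

print(float(acc32))  # 16777216.0: every added 1.0 was rounded away
print(float(acc64))  # 16778216.0: the exact sum
```

Each float32 element is still stored in single precision; only the running total is held in float64, which is enough to keep the small contributions from being rounded away.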
1 parent 22dbecc commit e433d20

1 file changed, +2 -2 lines changed

sklearn/metrics/metrics.py

Lines changed: 2 additions & 2 deletions
@@ -2392,8 +2392,8 @@ def r2_score(y_true, y_pred):
     if len(y_true) == 1:
         raise ValueError("r2_score can only be computed given more than one"
                          " sample.")
-    numerator = ((y_true - y_pred) ** 2).sum()
-    denominator = ((y_true - y_true.mean(axis=0)) ** 2).sum()
+    numerator = ((y_true - y_pred) ** 2).sum(dtype=np.float64)
+    denominator = ((y_true - y_true.mean(axis=0)) ** 2).sum(dtype=np.float64)
 
     if denominator == 0.0:
         if numerator == 0.0:
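For readers who want to try the patched lines in isolation, here is a minimal standalone sketch. The function name `r2_sketch` and the sample arrays are hypothetical, and this is not the full library function: the zero-denominator branch is cut off in the diff above, so it is left out and the sketch assumes `y_true` is not constant.

```python
import numpy as np


def r2_sketch(y_true, y_pred):
    """Simplified, hypothetical stand-in for the patched lines of r2_score."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if len(y_true) == 1:
        raise ValueError("r2_score can only be computed given more than one"
                         " sample.")
    # Accumulate both sums in float64 even when the inputs are float32.
    numerator = ((y_true - y_pred) ** 2).sum(dtype=np.float64)
    denominator = ((y_true - y_true.mean(axis=0)) ** 2).sum(dtype=np.float64)
    # The zero-denominator handling is truncated in the diff, so it is not
    # reproduced here; a constant y_true is outside this sketch's scope.
    return 1 - numerator / denominator


# Hypothetical float32 inputs.
y_true = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32)
y_pred = np.array([1.1, 1.9, 3.2, 3.8], dtype=np.float32)
score = r2_sketch(y_true, y_pred)
print(score)  # close to 0.98
```

Note that the element-wise subtraction and squaring still run in float32; only the reduction is widened, which matches the scope of the two changed lines.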
