
Conversation

@jphdotam commented May 2, 2019

Signed-off-by: James P Howard [email protected]

Fixes: Adds a label-wise accuracy option to the Accuracy metric.

Description: Accuracy() metrics constructed with is_multilabel=True can now be passed labelwise=True. When set, the metric returns a tensor with one accuracy per label instead of a single value. For example:

evaluator = create_supervised_evaluator(
    model,
    metrics={'loss': Loss(loss),
             'accuracy': Accuracy(output_transform=thresholded_output_transform, is_multilabel=True),
             'precision': Precision(output_transform=thresholded_output_transform, is_multilabel=True, average=True),
             'label_acc': Accuracy(output_transform=thresholded_output_transform, is_multilabel=True, labelwise=True)},
    device=device)

@trainer.on(Events.EPOCH_COMPLETED)
def log_training_results(engine):
    evaluator.run(train_loader)
    metrics = evaluator.state.metrics
    acc, loss, precision, label_acc = metrics['accuracy'], metrics['loss'], metrics['precision'], metrics['label_acc']
    print(f"\rEnd of epoch {engine.state.epoch:03d}")
    print(f"TRAINING Accuracy: {acc:.3f} | Loss: {loss:.3f} | Precision: {precision:.3f} | Label-wise accuracy: {label_acc}")
    writer.add_scalar("training/accuracy", acc, engine.state.epoch)

@trainer.on(Events.EPOCH_COMPLETED)
def log_validation_results(engine):
    evaluator.run(test_loader)
    metrics = evaluator.state.metrics
    acc, loss, precision, label_acc = metrics['accuracy'], metrics['loss'], metrics['precision'], metrics['label_acc']
    print(f"TESTING  Accuracy: {acc:.3f} | Loss: {loss:.3f} | Precision: {precision:.3f} | Label-wise accuracy: {label_acc}\n")
    writer.add_scalar("testing/loss", loss, engine.state.epoch)
    writer.add_scalar("testing/accuracy", acc, engine.state.epoch)

trainer.run(train_loader, max_epochs=30)

Yields:

End of epoch 001
TRAINING Accuracy: 0.753 | Loss: 0.334 | Precision: 0.212 | Label-wise accuracy: tensor([0.8662, 0.8662], device='cuda:0')
TESTING  Accuracy: 0.725 | Loss: 0.341 | Precision: 0.221 | Label-wise accuracy: tensor([0.8302, 0.8755], device='cuda:0')

End of epoch 002
TRAINING Accuracy: 0.748 | Loss: 0.363 | Precision: 0.160 | Label-wise accuracy: tensor([0.8134, 0.9022], device='cuda:0')
TESTING  Accuracy: 0.672 | Loss: 0.537 | Precision: 0.087 | Label-wise accuracy: tensor([0.7057, 0.8792], device='cuda:0') 

Checklist:

  • New tests are added (if a new feature is added)
  • New doc strings: description and/or example code are in RST format
  • Documentation is updated (if required)

@anmolsjoshi (Contributor) commented May 2, 2019

@jphdotam thanks for the PR! Could you add a few tests?

Have a look here; we test accuracy against scikit-learn's implementation.

Let us know if you get stuck or have any questions!

PS: It seems that Travis CI failed due to flake8 errors. See here
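For anyone following along, here is a minimal sketch of the kind of test that compares Accuracy against scikit-learn for the binary case (illustrative only; the actual test file is the one linked above):

import pytest
import torch
from sklearn.metrics import accuracy_score
from ignite.metrics import Accuracy

def test_binary_accuracy_vs_sklearn():
    acc = Accuracy()
    # Already-thresholded 0/1 predictions and targets for 100 samples
    y_pred = torch.randint(0, 2, size=(100,))
    y = torch.randint(0, 2, size=(100,))
    acc.update((y_pred, y))
    assert acc.compute() == pytest.approx(accuracy_score(y.numpy(), y_pred.numpy()))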

@jphdotam (Author) commented May 2, 2019

Thanks @anmolsjoshi - I've written some tests and hopefully fixed the flake8 errors.

Unfortunately there is no scikit-learn equivalent of label-wise accuracy, so I have written an analogous reference implementation in NumPy.

Commit: …sification & text8 clean-up.

Signed-off-by: James P Howard <[email protected]>
@vfdev-5 (Collaborator) commented May 2, 2019

@jphdotam thanks for the PR! To merge it, I think we need to discuss the API. I'm not a fan of introducing another flag. Maybe we can opt for something like the arguments of torch's nn.CrossEntropyLoss: the deprecated reduce flag and the newer reduction argument, which takes string values.
Can we generalize this PR to cover two issues: #513 and #467?
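For reference, the nn.CrossEntropyLoss pattern being alluded to looks like this (a real PyTorch API, shown only to illustrate the string-valued-argument style versus a proliferation of boolean flags):

import torch
import torch.nn as nn

logits = torch.randn(4, 3)             # (batch_size, num_classes)
targets = torch.tensor([0, 2, 1, 1])   # class indices

# `reduction` takes string values ("mean", "sum", "none") instead of boolean flags.
loss_mean = nn.CrossEntropyLoss(reduction="mean")(logits, targets)  # single scalar
loss_none = nn.CrossEntropyLoss(reduction="none")(logits, targets)  # one loss per sample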

@jphdotam could you please provide a very simple example of manually computing such a label-wise accuracy score? For example, if I have y_true = [(1, 1, 0), (0, 0, 0), (1, 1, 1)] and y_pred = [(1, 0, 1), (0, 0, 1), (0, 1, 1)], what is the score and how is it computed in detail?

@jphdotam (Author) commented May 2, 2019

Hi @vfdev-5.

Your example shows a batch size of 3 for a binary classifier with 3 labels.
The label-wise accuracy is essentially an accuracy for each position within the tuples.
If I expand your example to a batch size of 4 (just to make samples versus labels clearer), this is essentially how it works:

y_true = np.array([(1, 1, 0), (0, 0, 0), (1, 1, 1), (0, 1, 0)])
y_pred = np.array([(1, 0, 1), (0, 0, 1), (0, 1, 1), (0, 1, 0)])
correct = y_true == y_pred
correct
Out[38]: 
array([[ True, False, False],
       [ True,  True, False],
       [False,  True,  True],
       [ True,  True,  True]])
np.mean(correct, axis=0)
Out[39]: array([0.75, 0.75, 0.5 ])

So there is 75% accuracy for the first label, 75% for the second, and 50% for the third.

It's very useful if one wishes to see which label in a multi-label classifier is compromising the overall accuracy.

Re: merging, would you rather I instead create a new metric, separate from Accuracy, called MultilabelAccuracy or something? And submit it to either ignite.metrics or ignite.contrib.metrics?
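For concreteness, here is a minimal sketch of what such a standalone metric could look like, built on ignite's Metric base class (illustrative only; the class name and internals are placeholders, not the code posted in #513):

from ignite.exceptions import NotComputableError
from ignite.metrics import Metric

class LabelwiseAccuracy(Metric):
    """Per-label accuracy for multilabel (batch_size, num_labels) tensors of 0s and 1s."""

    def reset(self):
        self._num_correct = None   # per-label correct counts
        self._num_examples = 0

    def update(self, output):
        y_pred, y = output         # both of shape (batch_size, num_labels), values in {0, 1}
        correct = (y_pred == y).sum(dim=0).double()
        self._num_correct = correct if self._num_correct is None else self._num_correct + correct
        self._num_examples += y.shape[0]

    def compute(self):
        if self._num_examples == 0:
            raise NotComputableError("LabelwiseAccuracy must have at least one example before it can be computed.")
        return self._num_correct / self._num_examples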

@vfdev-5 (Collaborator) commented May 2, 2019

@jphdotam thanks for the explanation. Now it is clear that we are talking about the same computation method.

Re: merging, would you rather I instead create a new metric, separate from Accuracy, called MultilabelAccuracy or something?

Previously, we had BinaryAccuracy and CategoricalAccuracy, which we merged into a single class. Then we added multilabel support, the same as in sklearn. IMO we should keep a single class.

Let me think about the new API and I'll comment here. If you have other ideas on the API, we can discuss them.

@jphdotam (Author) commented May 3, 2019

OK, great. In the meantime I will just use it as a new class, as I've posted in #513, since that's probably easier until we decide.

@anmolsjoshi (Contributor) commented May 13, 2019

@jphdotam thanks for providing the code. In discussion with @vfdev-5, we were thinking the following:

  • Add a labelwise parameter, so the constructor would be Accuracy(is_multilabel=True, labelwise=True) (already handled).
  • Add a check that labelwise can only be True in multilabel cases, maybe with a warning rather than raising an error (already handled).
  • The same would need to be applied to Precision and Recall, as these metrics are closely related in the way they are written (see the NumPy sketch after this list).
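As a reference for the Precision/Recall point, the label-wise versions can be computed column-wise in NumPy, in the same spirit as the label-wise accuracy example earlier in this thread (a sketch only, not ignite's implementation):

import numpy as np

y_true = np.array([(1, 1, 0), (0, 0, 0), (1, 1, 1), (0, 1, 0)])
y_pred = np.array([(1, 0, 1), (0, 0, 1), (0, 1, 1), (0, 1, 0)])

tp = ((y_pred == 1) & (y_true == 1)).sum(axis=0)   # true positives per label
predicted_pos = (y_pred == 1).sum(axis=0)          # predicted positives per label
actual_pos = (y_true == 1).sum(axis=0)             # actual positives per label

precision_per_label = tp / np.maximum(predicted_pos, 1)  # guard against division by zero
recall_per_label = tp / np.maximum(actual_pos, 1)
print(precision_per_label)  # array([1.        , 1.        , 0.33333333])
print(recall_per_label)     # array([0.5       , 0.66666667, 1.        ])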

Would you be interested in continuing this PR?

We might be working towards a minor release for now, so we shouldn't make major API changes. For the next major release (0.3.0), we could introduce a new_multilabel_arg with options None (binary/multiclass), multilabel (a single accuracy value), and labelwise (one accuracy per label).

What are your thoughts?

@anmolsjoshi self-requested a review May 13, 2019 05:05
@Oktai15 commented Nov 26, 2019

Description: Accuracy() metrics constructed with is_multilabel=True can now be passed labelwise=True.

@jphdotam why do you want to add this feature only for the multilabel case? It could be useful for the multiclass case too, couldn't it?
