
Conversation

@dnola dnola commented Feb 5, 2020

Follow up to discussion in #493

Separated out chicham's ComposeOps, TransformPipeline, and DALIIterator classes and built a single-GPU Ignite+DALI example .ipynb.

Let me know what you think!

Google Colab friendly version of the updated notebook here:
https://colab.research.google.com/drive/1F_7DihE8YUzirvWV8xn1aMe0EMAP9iB6

Collaborator

@vfdev-5 vfdev-5 left a comment


@dnola looks OK. We need to fix the flake8 errors and apply the other modifications I proposed.
What is your training rig: a 4xV100 DGX to train "dogs vs cats"? :)

from ignite.contrib.handlers import ProgressBar
from ignite.metrics import Accuracy, Loss, RunningAverage

def create_custom_supervised_trainer(model, optimizer, loss_fn, metrics={}, device=None, prepare_batch=None):
Collaborator

We can reuse create_supervised_trainer in the implementation of create_custom_supervised_trainer instead of duplicating _update.

Author

Not sure I follow. I did have to make a small change to _update in order to substitute in a custom prepare_batch (that if statement); the default implementation always uses _prepare_batch, right?

I'd be happy to find a more elegant solution though, as that one feels like a monkey patch.
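
For reference, the change amounts to something like this (a rough sketch of the copied _update with the if statement; metrics attachment omitted, and _prepare_batch is ignite's default):

    from ignite.engine import Engine, _prepare_batch

    def create_custom_supervised_trainer(model, optimizer, loss_fn, device=None, prepare_batch=None):
        def _update(engine, batch):
            model.train()
            optimizer.zero_grad()
            # The "if statement": use the custom prepare_batch when one is
            # given, otherwise fall back to ignite's default _prepare_batch
            if prepare_batch is not None:
                x, y = prepare_batch(batch, device=device)
            else:
                x, y = _prepare_batch(batch, device=device)
            y_pred = model(x)
            loss = loss_fn(y_pred, y)
            loss.backward()
            optimizer.step()
            return loss.item(), y_pred, y

        return Engine(_update)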

Collaborator

@vfdev-5 vfdev-5 Feb 7, 2020

The idea here is to do something like this:

from ignite.engine import create_supervised_trainer, _prepare_batch


def create_custom_supervised_trainer(model, optimizer, loss_fn, metrics=None,
                                     device=None, non_blocking=False,
                                     prepare_batch=_prepare_batch):
    """
    We need to make some changes to the default trainer so we can use
    running metrics and consume Tensors from DALI.
    """
    trainer = create_supervised_trainer(
        model, optimizer, loss_fn,
        device=device, non_blocking=non_blocking,
        prepare_batch=prepare_batch,
        # Return (loss, y_pred, y) so attached metrics can reuse the batch
        # outputs instead of recomputing a forward pass
        output_transform=lambda x, y, y_pred, loss: (loss.item(), y_pred, y),
    )

    def _metrics_transform(output):
        return output[1], output[2]

    for name, metric in (metrics or {}).items():
        metric._output_transform = _metrics_transform
        metric.attach(trainer, name)

    return trainer

And we can even avoid writing _metrics_transform if the trainer's output_transform returns a dictionary {'y_pred': y_pred, 'y': y, …}, as described here: https://pytorch.org/ignite/metrics.html
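
For example, something along these lines (a sketch of the pattern from the linked metrics docs; Accuracy is just an illustrative metric):

    from ignite.metrics import Accuracy

    trainer = create_supervised_trainer(
        model, optimizer, loss_fn,
        device=device,
        output_transform=lambda x, y, y_pred, loss: {"y_pred": y_pred, "y": y, "loss": loss.item()},
    )

    # No private _output_transform patching: each metric unpacks the dict itself
    accuracy = Accuracy(output_transform=lambda out: (out["y_pred"], out["y"]))
    accuracy.attach(trainer, "accuracy")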

Author

@dnola dnola commented Feb 7, 2020

Awesome, thanks for your comments!

Yes, I will for sure work on making my code prettier.

And hey, I take cat vs dog classification very seriously. So seriously that I needed a DGX Station to do it! They are 4xV100 NVLink desktop rigs, handy for parallelization, even for tasks that really don't need it, haha.

They also work as desk heaters in a pinch: whenever you get cold, you can just train some models!

Collaborator

@vfdev-5 vfdev-5 commented Feb 7, 2020

They also work as desk heaters in a pinch: whenever you get cold, you can just train some models!

@dnola how about benchmarking ImageNet training with and without DALI, using the scripts from ignite's reproducible trainings?
