Conversation

@zippeurfou
Contributor

Fixes #

Description:
This adds exponential annealing to the contrib handlers. The goal is to mimic fast.ai's learning rate scheduling abilities: its LRFinder uses annealing_exp to find the optimal learning rate, and this PR makes that possible.
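For reference, a minimal sketch of the annealing_exp curve this is based on (the function name and parametrisation here are illustrative, not the final API):

import math

def annealing_exp(start_value, end_value, fraction):
    # Exponentially interpolate between start_value and end_value,
    # where fraction goes from 0.0 to 1.0 over the cycle.
    # Mirrors fast.ai's annealing_exp; start_value must be non-zero.
    return start_value * (end_value / start_value) ** fraction

# Example: sweep the learning rate from 1e-5 to 1.0 over 100 steps,
# as an LR finder would do.
values = [annealing_exp(1e-5, 1.0, i / 99) for i in range(100)]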

Check list:

  • New tests are added (if a new feature is modified) => Not done, but I could not find any existing tests for cosine annealing either
  • New doc strings: text and/or example code are in RST format => TODO; since this is a new suggestion, I'd like to wait before doing the work
  • Documentation is updated (if required)

@zippeurfou
Contributor Author

The error from Travis seems unrelated to this change.

@vfdev-5
Collaborator

vfdev-5 commented Feb 21, 2019

@zippeurfou can we provide a generic annealing that works with any scheduler?

For example, I have a basic multi-step scheduler which I would like to replicate periodically with value/length scaling...
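For concreteness, a sketch of the kind of base schedule I mean (using PyTorch's MultiStepLR purely as an illustration; the wrapper that replays it is the part that does not exist yet):

import torch
from torch.optim.lr_scheduler import MultiStepLR

# A toy parameter/optimizer, just to instantiate the scheduler
tensor = torch.zeros([1], requires_grad=True)
optimizer = torch.optim.SGD([tensor], lr=0.1)

# Drops the lr by 10x at epochs 10 and 20; the missing piece is a wrapper that
# replays this pattern every N epochs while scaling the values and/or the length.
scheduler = MultiStepLR(optimizer, milestones=[10, 20], gamma=0.1)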

@zippeurfou
Contributor Author

Maybe something like a lambda annealing?

@vfdev-5
Collaborator

vfdev-5 commented Feb 21, 2019

Sorry, I do not know what a lambda annealing is. If you are speaking about the implementation, we could probably do it like:

def create_lr_scheduler_with_warmup(lr_scheduler, warmup_start_value, warmup_end_value, warmup_duration,

@zippeurfou
Contributor Author

I am sorry, I am not sure I understand the question/problem.
Are you asking whether the exponential annealing can be used with the concat scheduler?

@vfdev-5
Collaborator

vfdev-5 commented Feb 21, 2019

Actually, yes, you are right. I was speaking about a sort of lambda annealing, of which exp annealing is a special case.

@zippeurfou
Contributor Author

Sadly, looking at the formulas, I don't think we can really make something generic.
We could potentially make a function called Annealing that takes a parameter accepting "cos" or "exp", and build on top of it when we want to add more annealing modes.
Looking at fast.ai, the other one they have is poly.
However, I am unsure this would really be a "clean" approach.
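Roughly what I have in mind, as a sketch only (the names and the poly exponent are illustrative, not a settled API):

import math

def annealing(start_value, end_value, fraction, mode="cos", power=2):
    # Single entry point dispatching on the annealing shape, similar to
    # fast.ai's annealing_cos / annealing_exp / annealing_poly.
    # `fraction` goes from 0.0 to 1.0 over the cycle.
    if mode == "cos":
        return end_value + (start_value - end_value) * (1 + math.cos(math.pi * fraction)) / 2
    if mode == "exp":
        return start_value * (end_value / start_value) ** fraction
    if mode == "poly":
        return end_value + (start_value - end_value) * (1 - fraction) ** power
    raise ValueError("mode must be one of 'cos', 'exp', 'poly'")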

@vfdev-5
Collaborator

vfdev-5 commented Feb 22, 2019

@zippeurfou I tried to prototype such a class quickly, and it's true that it can be complex to support all types of schedulers...

The idea is something like this:

from ignite.contrib.handlers.param_scheduler import ParamScheduler


class LambdaAnnealing(ParamScheduler):
    # Wraps another ParamScheduler and restarts it every `cycle_size` events,
    # optionally growing the cycle length by `cycle_mult` after each cycle.

    def __init__(self, scheduler, cycle_size, cycle_mult=1.0, save_history=False):
        # The wrapped scheduler already holds the optimizer, so pass a dummy one here
        super(LambdaAnnealing, self).__init__(optimizer={}, param_name="", save_history=save_history)
        self.scheduler = scheduler
        self.optimizer_param_groups = self.scheduler.optimizer_param_groups
        self.param_name = self.scheduler.param_name

        self.cycle_size = cycle_size
        self.cycle_mult = cycle_mult
        self.cycle = 0

    def get_param(self):
        if self.event_index > 0 and (self.event_index % self.cycle_size) == 0:
            # Restart the wrapped scheduler by rewinding its event counter
            # !!! THIS WON'T WORK FOR EVERY SCHEDULER !!!
            self.scheduler.event_index = 0
            self.cycle_size *= self.cycle_mult
            self.cycle += 1

        value = self.scheduler.get_param()
        # Maybe we could scale the value here (e.g. decay it each cycle)
        return value

@zippeurfou
Contributor Author

How can I help with this one?
I wonder if, after this one, I should add an lr_finder to contrib. Thoughts?

@vfdev-5
Collaborator

vfdev-5 commented Feb 26, 2019

How can I help with this one?

@zippeurfou you can go ahead with the prototype class LambdaAnnealing I provided in the previous message. However, this will be tricky.
Here is some code to check the implementation:

import torch
import matplotlib.pyplot as plt

from ignite.contrib.handlers.param_scheduler import PiecewiseLinear
from ignite.engine import Engine, Events

# Dummy parameter/optimizer, just to drive the schedulers
tensor = torch.zeros([1], requires_grad=True)
optimizer = torch.optim.SGD([tensor], lr=0.0)

scheduler = PiecewiseLinear(optimizer, "lr",
                            milestones_values=[(0, 0.0), (10, 0.4), (29, 0.0)])

la_scheduler = LambdaAnnealing(scheduler, cycle_size=25)

lrs = []

def save_lr(engine):
    lrs.append(optimizer.param_groups[0]['lr'])

trainer = Engine(lambda engine, batch: None)
trainer.add_event_handler(Events.ITERATION_STARTED, la_scheduler)
trainer.add_event_handler(Events.ITERATION_COMPLETED, save_lr)
trainer.run([0] * 10, max_epochs=5)

plt.title("Lambda annealing")
plt.plot(lrs, label="learning rate")
plt.xlabel("events")
plt.ylabel("values")
plt.legend()
plt.show()

After that, we also need to write the tests.
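For instance, one test could assert that, with cycle_mult=1.0, the schedule repeats identically every cycle. A rough sketch of the intent (it assumes a finished LambdaAnnealing, not the prototype above, which does not yet restart every scheduler correctly):

def test_lambda_annealing_repeats_cycle():
    import torch
    from ignite.engine import Engine, Events
    from ignite.contrib.handlers.param_scheduler import PiecewiseLinear

    tensor = torch.zeros([1], requires_grad=True)
    optimizer = torch.optim.SGD([tensor], lr=0.0)

    scheduler = PiecewiseLinear(optimizer, "lr",
                                milestones_values=[(0, 0.0), (10, 0.4), (29, 0.0)])
    la_scheduler = LambdaAnnealing(scheduler, cycle_size=25)

    lrs = []
    trainer = Engine(lambda engine, batch: None)
    trainer.add_event_handler(Events.ITERATION_STARTED, la_scheduler)
    trainer.add_event_handler(Events.ITERATION_COMPLETED,
                              lambda engine: lrs.append(optimizer.param_groups[0]['lr']))
    trainer.run([0] * 10, max_epochs=5)

    # 50 iterations = two full cycles of 25; with cycle_mult=1.0 the second
    # cycle should reproduce the first one exactly.
    assert lrs[:25] == lrs[25:]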

I wonder if after this one I should do a lr_finder as a contrib. thought?

I have no idea how useful this lr_finder is; like everybody, I have read about it at fast.ai but never tested it explicitly. If you have more experience with it, we can follow your advice.
