As we grow and add more and more backbones and tasks, our model testing is quickly growing beyond what our infrastructure can handle today. I think this is going to be a general pain point for scaling, and it may be worth investing in some smarter solutions here.
One option would be to only run our "large" tests when we update the model code in question. We could do this for our accelerator testing with something like the following (pseudocode):
```bash
# Run large tests for everything outside of the models directory.
pytest keras_nlp/ --ignore=keras_nlp/models --run_large
# Run large tests only for model directories that differ from master.
for dir in keras_nlp/models/*/; do
  if ! git diff --quiet master HEAD -- "$dir"; then
    pytest "$dir" --run_large
  fi
done
```
This could be a relatively lightweight way to avoid the fundamental scaling problem we are facing. We would also need some way to manually invoke a "test everything" run for specific PRs we are worried about (for example, a change to `TransformerDecoder`).
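One possible shape for that escape hatch, as a rough sketch only: check for an explicit marker before choosing the selective path. The `[run all large]` commit message marker and the `run_selective_large_tests.sh` script name below are hypothetical; keying off a PR label in the CI config would work just as well.

```bash
# Hypothetical escape hatch: if the latest commit message contains a
# "[run all large]" marker, run the full large test suite instead of
# the selective per-directory run sketched above.
if git log -1 --pretty=%B | grep -qF "[run all large]"; then
  pytest keras_nlp/ --run_large
else
  ./run_selective_large_tests.sh  # selective run from above (script name hypothetical)
fi
```

Whatever the trigger ends up being, the important part is that the full run stays one explicit, documented action away rather than something we have to hand-edit CI config to get.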