We have added a few more models (deberta, f_net, and albert) that are meant for classification and could do with GLUE evaluation. Let's run some tests on them and make sure their performance is in the right ballpark!