PFI doesn't work with uncalibrated binary classifiers #4517

Closed
yaeldekel opened this issue Dec 3, 2019 · 1 comment · Fixed by #4587

Comments

@yaeldekel

Some binary classification trainers return a calibrated model automatically (for example, FastTree and LogisticRegression), but others, such as FastForest, do not. Passing such an uncalibrated model to PFI throws an exception saying that the probability column was not found.

using Microsoft.ML;
using Microsoft.ML.Data;

var ml = new MLContext();
// FastForest does not attach a calibrator, so the model has no probability output.
var ff = ml.BinaryClassification.Trainers.FastForest();
var data = ml.Data.LoadFromTextFile(@"breast-cancer.txt",
    new[] { new TextLoader.Column("Label", DataKind.Boolean, 0),
            new TextLoader.Column("Features", DataKind.Single, 1, 9) });
var model = ff.Fit(data);
// Throws: the probability column was not found.
var pfi = ml.BinaryClassification.PermutationFeatureImportance(model, data);

There are actually two issues here: The first is what I described above, and the second is that there is no workaround for this problem. I tried adding a calibrator manually:

var ff = ml.BinaryClassification.Trainers.FastForest();
var ffmodel = ff.Fit(data);
// Train a Platt calibrator on the scored output of the FastForest model.
var calibrator = ml.BinaryClassification.Calibrators.Platt();
var calibratormodel = calibrator.Fit(ffmodel.Transform(data));
// Runs PFI on the calibrator alone, whose input is the score column.
var pfi = ml.BinaryClassification.PermutationFeatureImportance(calibratormodel, ffmodel.Transform(data));

I could not train these two models as a single pipeline because the resulting model is of type TransformerChain, which I cannot pass to PFI. However, the code above doesn't work either: even though calibratormodel is indeed an ISingleFeaturePredictionTransformer, its features column is the score column of the output of ffmodel, so PFI works on the score rather than on the original features and doesn't do the right thing. As far as I can tell, there is no way to pass PFI a model whose calibrator was trained separately. It might be worth opening a separate issue for this, not sure.
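To illustrate why a probability column should not be strictly necessary here, below is a minimal conceptual sketch of permutation feature importance computed from raw scores only. It is plain C# and does not use ML.NET: ToyScore, Accuracy, and Importance are hypothetical names invented for this illustration, and the toy scorer is an assumption standing in for an uncalibrated model. It is not how ML.NET implements PFI.

```csharp
using System;
using System.Linq;

// Conceptual sketch only: this is not ML.NET code. ToyScore is a hypothetical
// stand-in for an uncalibrated model that emits a raw score and no probability.
public static class PfiSketch
{
    // Hypothetical scorer: depends on feature 0 only and ignores feature 1.
    public static float ToyScore(float[] features) => features[0] - 0.5f;

    // Accuracy needs only the sign of the raw score, so no calibrator is required.
    public static double Accuracy(float[][] rows, bool[] labels) =>
        rows.Select((r, i) => (ToyScore(r) > 0) == labels[i] ? 1.0 : 0.0).Average();

    // Importance of one feature = baseline accuracy minus the accuracy obtained
    // after shuffling that feature's values across rows.
    public static double Importance(float[][] rows, bool[] labels, int feature, int seed)
    {
        var rng = new Random(seed);
        var column = rows.Select(r => r[feature]).ToArray();

        // Fisher-Yates shuffle of the selected feature column.
        for (int i = column.Length - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1);
            (column[i], column[j]) = (column[j], column[i]);
        }

        // Copy each row and substitute the shuffled value for the chosen feature.
        var permuted = rows.Select((r, i) =>
        {
            var copy = (float[])r.Clone();
            copy[feature] = column[i];
            return copy;
        }).ToArray();

        return Accuracy(rows, labels) - Accuracy(permuted, labels);
    }
}
```

In this sketch, a feature the scorer ignores gets an importance of exactly zero, and only metrics that are defined on scores (such as accuracy) are used. Metrics that need probabilities, such as log-loss, would still require a calibrator.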

@oluatte

oluatte commented Dec 11, 2019

+1

Just ran into this.

antoniovs1029 pushed a commit that referenced this issue Jan 8, 2020
This change adds support for running PFI on binary classification models that do not contain a calibrator. Fixes #4517.
@ghost ghost locked as resolved and limited conversation to collaborators Mar 19, 2022