Some binary classification estimators return a calibrated model automatically (for example, FastTree and LogisticRegression), but others, such as FastForest, do not. Passing such an uncalibrated model to PFI throws an exception saying that the probability column was not found.
var ml = new MLContext();
var ff = ml.BinaryClassification.Trainers.FastForest();
var data = ml.Data.LoadFromTextFile(@"breast-cancer.txt",
    new[] { new TextLoader.Column("Label", DataKind.Boolean, 0),
            new TextLoader.Column("Features", DataKind.Single, 1, 9) });
var model = ff.Fit(data);
var pfi = ml.BinaryClassification.PermutationFeatureImportance(model, data);
There are actually two issues here. The first is the one described above; the second is that there is no workaround for it. I tried adding a calibrator manually:
var ff = ml.BinaryClassification.Trainers.FastForest();
var ffmodel = ff.Fit(data);
var calibrator = ml.BinaryClassification.Calibrators.Platt();
var calibratormodel = calibrator.Fit(ffmodel.Transform(data));
var pfi = ml.BinaryClassification.PermutationFeatureImportance(calibratormodel, ffmodel.Transform(data));
I could not train these two models as a pipeline because the resulting model is of type TransformerChain, which cannot be passed to PFI. However, the code above doesn't work either: even though calibratormodel is indeed an ISingleFeaturePredictionTransformer, its features column is the score column of ffmodel's output, so PFI permutes the score rather than the original features and doesn't do the right thing. As far as I can tell, there is no way to pass PFI a model whose calibrator was trained separately. It might be worth opening a separate issue for this, not sure.
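To illustrate why handing the calibrator alone to PFI gives the wrong answer, here is a minimal, self-contained sketch that does not use ML.NET at all. Everything in it is a hypothetical stand-in: Score plays the role of the uncalibrated model, Calibrate plays the role of a Platt-style calibrator, and PermuteSlot uses a fixed rotation instead of a random shuffle so the run is reproducible. It is not how ML.NET implements PFI, only a toy version of the idea.

```csharp
using System;
using System.Linq;

static class PfiColumnSketch
{
    // Hypothetical uncalibrated model: raw score from two features (feature 1 has zero weight).
    static double Score(double[] x) => 2.0 * x[0] + 0.0 * x[1];

    // Hypothetical Platt-style calibrator: its only input is the score.
    static double Calibrate(double score) => 1.0 / (1.0 + Math.Exp(-score));

    // Accuracy of the calibrated predictions at a 0.5 probability threshold.
    static double Accuracy(double[] scores, bool[] labels) =>
        scores.Select((s, i) => (Calibrate(s) >= 0.5) == labels[i] ? 1.0 : 0.0).Average();

    // Rotate one slot of the input rows by one position (a fixed stand-in for a random permutation).
    static double[][] PermuteSlot(double[][] rows, int slot) =>
        rows.Select((row, i) =>
        {
            var copy = (double[])row.Clone();
            copy[slot] = rows[(i + 1) % rows.Length][slot];
            return copy;
        }).ToArray();

    static void Main()
    {
        double[][] features =
        {
            new[] { 1.0, 5.0 }, new[] { -1.0, 5.0 },
            new[] { 2.0, -3.0 }, new[] { -2.0, -3.0 },
        };
        bool[] labels = { true, false, true, false };

        double baseline = Accuracy(features.Select(x => Score(x)).ToArray(), labels);

        // Intended behavior: permute each ORIGINAL feature, then re-score and re-calibrate.
        for (int slot = 0; slot < 2; slot++)
        {
            var permuted = PermuteSlot(features, slot);
            double acc = Accuracy(permuted.Select(x => Score(x)).ToArray(), labels);
            Console.WriteLine($"feature {slot}: accuracy drop {baseline - acc}");
        }

        // Calibrator treated as the model: its single "feature" is the score column,
        // so permutation yields one number about the score and nothing per original feature.
        double[][] scoreColumn = features.Select(x => new[] { Score(x) }).ToArray();
        var permutedScores = PermuteSlot(scoreColumn, 0);
        double accScore = Accuracy(permutedScores.Select(r => r[0]).ToArray(), labels);
        Console.WriteLine($"score column: accuracy drop {baseline - accScore}");
    }
}
```

In this toy setup, permuting the original features yields one importance value per feature (with no drop for the zero-weight feature), while permuting the calibrator's input yields a single value for the score. That matches the behavior described above when calibratormodel is passed to PFI on its own.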