Using PFI with AutoML, possible? #3972


Closed
famschopman opened this issue Jul 8, 2019 · 6 comments
Assignees
Labels
P2 Priority of the issue for triage purpose: Needs to be fixed at some point.

Comments

@famschopman
Playing with AutoML and so far having much fun with it.

I have a trained model and am now trying to retrieve the feature weights. None of the objects returned exposes the LastTransformer object that I need.

Code snippet:

var mlContext = new MLContext();
var _appPath = AppDomain.CurrentDomain.BaseDirectory;
var _dataPath = Path.Combine(_appPath, "Datasets", "dataset.csv");
var _modelPath = Path.Combine(_appPath, "Datasets", "TrainedModels");

ColumnInferenceResults columnInference = mlContext.Auto().InferColumns(_dataPath, LabelColumnName, groupColumns: false);
ColumnInformation columnInformation = columnInference.ColumnInformation;

TextLoader textLoader = mlContext.Data.CreateTextLoader(columnInference.TextLoaderOptions);
IDataView data = textLoader.Load(_dataPath);

DataOperationsCatalog.TrainTestData dataSplit = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);
IDataView trainData = dataSplit.TrainSet;
IDataView testData = dataSplit.TestSet;

var cts = new CancellationTokenSource();
var experimentSettings = CreateExperimentSettings(mlContext, cts);

var progressHandler = new BinaryExperimentProgressHandler();

ExperimentResult<BinaryClassificationMetrics> experimentResult = mlContext.Auto()
    .CreateBinaryClassificationExperiment(experimentSettings)
    .Execute(trainData, labelColumnName: "Attrition", progressHandler: progressHandler);

RunDetail<BinaryClassificationMetrics> bestRun = experimentResult.BestRun;
ITransformer trainedModel = bestRun.Model;
var predictions = trainedModel.Transform(testData);
var metrics = mlContext.BinaryClassification.EvaluateNonCalibrated(data: predictions, labelColumnName: "Attrition", scoreColumnName: "Score");

mlContext.Model.Save(trainedModel, trainData.Schema, _modelPath);

Then I want to get the PFI information and I get stuck. There appears to be no way to get the LastTransformer object from trainedModel.

var transformedData = trainedModel.Transform(trainData);
var linearPredictor = trainedModel.LastTransformer; // does not compile: ITransformer does not expose LastTransformer

var permutationMetrics = mlContext.BinaryClassification.PermutationFeatureImportance(
    linearPredictor, transformedData, permutationCount: 30);

Hope someone can help me with some guidance.

@jedsmallwood

I'm interested in a solution to this also. It seems like a good way to reduce the number of features if you can identify which features are important.

@justinormont
Contributor

@daholste: Do you think this simply needs to be cast into the right type which has .LastTransformer as a property?

Possibly related comic: https://blog.toggl.com/build-horse-programming/

@daholste
Contributor

First and foremost, I love that comic, @justinormont

+1, the C# segment of the comic feels apropos. If you inspect the model in the debugger GUI, you should be able to navigate to the last transformer. Through casting C# objects as you see them in the debugger, you could write lines of C# code that correspond to that navigation in the GUI.

Of course, this is terribly hacky. Off-hand, I'm not aware of an officially supported / less hacky way to do this. It could be a great area of focus for future development

@jedsmallwood

The following cast lets me access the LastTransformer; however, I cannot use it for PFI until I provide a better type for the predictor. Debugging, I can see it is of type Microsoft.ML.Data.RegressionPredictionTransformer&lt;Microsoft.ML.IPredictorProducing&gt;, but I am unable to cast to that because Microsoft.ML.IPredictorProducing is not visible, so it seems like we're still stuck.

// Setup code similar to famschopman's
RegressionExperiment experiment = mlContext.Auto().CreateRegressionExperiment(experimentSettings);

var experimentResults = experiment.Execute(split.TrainSet, split.TestSet);
var predictor = ((TransformerChain<ITransformer>)experimentResults.BestRun.Model).LastTransformer;

// This will not compile:
var permutationMetrics = mlContext.Regression.PermutationFeatureImportance(predictor, transformedData, permutationCount: 30);

The following compile error is produced.

The type arguments for method 'PermutationFeatureImportanceExtensions.PermutationFeatureImportance<TModel>(RegressionCatalog, ISingleFeaturePredictionTransformer<TModel>, IDataView, string, bool, int?, int)' cannot be inferred from the usage. Try specifying the type arguments explicitly.

@eerhardt
Member

See my analysis on #3976 as well. These two issues feel like they are the same thing.

@gvashishtha added the P2 Priority of the issue for triage purpose: Needs to be fixed at some point. label on Jan 9, 2020
@antoniovs1029
Member

antoniovs1029 commented Jun 4, 2020

The only thing needed to make this build and run was to add the (TransformerChain<ITransformer>) cast to BestRun.Model (as recommended in #3972 (comment)), and then add another cast to (ISingleFeaturePredictionTransformer<object>) for the linear predictor. That is enough to let you run PFI:

RunDetail<BinaryClassificationMetrics> bestRun = experimentResult.BestRun;
TransformerChain<ITransformer> trainedModel = (TransformerChain<ITransformer>)bestRun.Model;
var predictions = trainedModel.Transform(testData);

var linearPredictor = (ISingleFeaturePredictionTransformer<object>)trainedModel.LastTransformer;

var permutationMetrics = mlContext.BinaryClassification.PermutationFeatureImportance(
    linearPredictor, predictions, permutationCount: 30);
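Once PFI runs, the result for the binary overload is an array of BinaryClassificationMetricsStatistics, one entry per feature slot. A minimal sketch of inspecting it, assuming the "Features" column carries slot names (typical for AutoML pipelines, but not guaranteed), might look like this:

```csharp
// Sketch: report the mean change in AUC for each permuted feature slot.
// Assumes "Features" has SlotNames annotations; otherwise featureNames will be empty.
VBuffer<ReadOnlyMemory<char>> slotNames = default;
predictions.Schema["Features"].GetSlotNames(ref slotNames);
var featureNames = slotNames.DenseValues().Select(n => n.ToString()).ToArray();

for (int i = 0; i < permutationMetrics.Length; i++)
{
    var auc = permutationMetrics[i].AreaUnderRocCurve;
    Console.WriteLine($"{featureNames[i],-30} AUC delta mean: {auc.Mean:F4} (stderr {auc.StandardError:F4})");
}
```

Features whose permutation produces the largest drop in AUC are the ones the model relies on most, which also addresses the feature-reduction use case mentioned above.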

PS: There was a bug (#4517) when running PFI particularly with binary classification models, so even after getting this running, if AutoML had returned a non-calibrated binary model, running PFI would have thrown an exception. That bug was fixed in #4587, which shipped in ML.NET 1.5.0-preview2 and 1.5.0.

> See my analysis on #3976 as well. These two issues feel like they are the same thing.

The problem described there was fixed in #4262 and #4292. Still, that problem wasn't actually causing this one; the solution above would have worked even before those fixes. The problem you refer to is being unable to cast a model loaded from disk to its actual type (e.g. BinaryPredictionTransformer<ParameterMixingCalibratedModelParameters<IPredictorProducing<float>, ICalibrator>>). After that fix, users can cast to the actual type, but they could always cast to (ISingleFeaturePredictionTransformer<object>), which is more appropriate when using AutoML.NET, since users won't know in advance the actual type of the model an experiment returns. So it was always possible to use PFI with AutoML via the (ISingleFeaturePredictionTransformer<object>) cast described above.
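The same cast applies to a model reloaded from disk, which is the scenario #3976 covered. A sketch, reusing the hypothetical _modelPath and trainData from the snippets above:

```csharp
// Sketch: load a saved AutoML model and extract its predictor for PFI
// without knowing the predictor's concrete type.
var mlContext = new MLContext();
ITransformer loadedModel = mlContext.Model.Load(_modelPath, out DataViewSchema inputSchema);

// AutoML models are TransformerChains; the last transformer is the trained predictor.
var chain = (TransformerChain<ITransformer>)loadedModel;
var predictor = (ISingleFeaturePredictionTransformer<object>)chain.LastTransformer;

var transformed = loadedModel.Transform(trainData);
var pfi = mlContext.BinaryClassification.PermutationFeatureImportance(
    predictor, transformed, permutationCount: 30);
```

Note that PFI must run on data already transformed by the full pipeline, since the predictor alone expects the featurized "Features" column as input.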
