Use PFI with Binary Prediction Transformer and CalibratedModelParametersBase loaded from disk #4292
antoniovs1029 added a commit that referenced this issue on Nov 12, 2019:
* Changes in PredictionTransformer.cs and Calibrator.cs to fix the problem of the create methods not being called, to make CMP load its internal calibrator and predictor first so as to assign the correct parameter types and runtimes, and added a PredictionTransformerLoadTypeAttribute so that the binary prediction transformer knows what type to assign when loading a CMP as its internal model.
* Added a working sample for using PFI with BPT and CMPB while loading a model from disk. This is based entirely on the original sample.
* Added file CalibratedModelParametersTests.cs with tests that the CMPs modified in this PR are now being correctly loaded from disk.
* Changed a couple of tests in LbfgsTests.cs that failed because they used casts that now return 'null'.
In my last accepted pull request (#4262) I addressed issue #3976 and was able to provide working samples and tests for using PFI with models loaded from disk, except for the case of the Binary Prediction Transformer. I am opening this issue about that specific problem.
Problem
In the sample using PFI with binary classification, the last transformer of the model (i.e. the linearPredictor) is of type BinaryPredictionTransformer<CalibratedModelParametersBase<LinearBinaryModelParameters, PlattCalibrator>>. The problem is that when that model is saved and then loaded from disk, casting the last transformer to the original type returns null. A null linearPredictor is unusable with PFI.
In version 1.3 of ML.NET the last transformer of the loaded model would actually be of type BinaryPredictionTransformer<IPredictorProducing<float>>. With the changes I made in my last PR (which will be available in version 1.4.0 preview 2), the loaded model's last transformer is of type BinaryPredictionTransformer<ParameterMixingCalibratedModelParameters<IPredictorProducing<float>, ICalibrator>>, which is a step forward in solving the problem but is not yet enough. As stated, in both cases a cast to the original type returns null. In general, a user can be expected to attempt that cast in order to use PFI, and the cast will fail.
This problem would be solved if the loaded model actually had a lastTransformer of the original type, or something castable to it.
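The null result follows from how C# treats generic classes: two closed types built from the same generic class with different type arguments are unrelated, even when the arguments themselves are related. The sketch below reproduces that behaviour with small stand-in classes. All of the types in it are hypothetical simplifications that only borrow the ML.NET names to mirror the generic shapes discussed above; none of it is ML.NET code.

```csharp
using System;

// Hypothetical stand-ins that only mirror the generic shape of the ML.NET types.
public interface IPredictorProducing<out TResult> { }
public interface ICalibrator { }
public sealed class LinearBinaryModelParameters : IPredictorProducing<float> { }
public sealed class PlattCalibrator : ICalibrator { }

public abstract class CalibratedModelParametersBase<TSubModel, TCalibrator>
    where TSubModel : class
    where TCalibrator : class
{ }

public sealed class ParameterMixingCalibratedModelParameters<TSubModel, TCalibrator>
    : CalibratedModelParametersBase<TSubModel, TCalibrator>
    where TSubModel : class
    where TCalibrator : class
{ }

public sealed class BinaryPredictionTransformer<TModel> where TModel : class
{
    public BinaryPredictionTransformer(TModel model) { Model = model; }
    public TModel Model { get; }
}

public static class CastDemo
{
    // Mimics what loading returns: the type arguments are the interfaces,
    // not the concrete submodel and calibrator types used at training time.
    public static object LoadedLastTransformer()
    {
        var model = new ParameterMixingCalibratedModelParameters<IPredictorProducing<float>, ICalibrator>();
        return new BinaryPredictionTransformer<
            ParameterMixingCalibratedModelParameters<IPredictorProducing<float>, ICalibrator>>(model);
    }

    // The cast a user would naturally try, using the training-time type.
    public static bool CastToOriginalTypeIsNull()
    {
        var typed = LoadedLastTransformer() as BinaryPredictionTransformer<
            CalibratedModelParametersBase<LinearBinaryModelParameters, PlattCalibrator>>;
        return typed == null;
    }

    public static void Main()
    {
        // Generic classes are invariant, so the cast yields null.
        Console.WriteLine(CastToOriginalTypeIsNull()); // True
    }
}
```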
Workaround
Based on this comment made by @yaeldekel, I've just made this working sample of using PFI with a binary prediction transformer loaded from disk. It is pretty much the same as the original sample, only that it works with a model loaded from disk.
The key to the workaround is that the user should cast the lastTransformer not to a binary prediction transformer but rather to an ISingleFeaturePredictionTransformer<object>, and then do a series of casts to get whatever other object s/he may want from inside the lastTransformer. The sample I've just provided works pretty much in this way.
Notice that this workaround worked even in ML.NET 1.3, and also works with the changes that I introduced in 1.4.0 preview 2.
Notice that a similar workaround might help a user who tries to use PFI with any kind of prediction transformer loaded from disk. This would be useful if the user, for whatever reason, cannot extract the linearPredictor by casting to the same type used in the original model.
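A rough sketch of why the interface cast can succeed where the class cast fails is given below. It assumes that the interface is covariant in its model type (declared with `out`), which is what lets a transformer over any reference-type model convert to ISingleFeaturePredictionTransformer<object>. The types are hypothetical stand-ins that reuse the ML.NET names; in particular, the stand-in interfaces are public here, which may not match the accessibility of the real ones, and this is not the code of the sample mentioned above.

```csharp
using System;

// Hypothetical stand-ins that only mirror the generic shape of the ML.NET types.
public interface IPredictorProducing<out TResult> { }
public interface ICalibrator { }
public sealed class LinearBinaryModelParameters : IPredictorProducing<float> { }
public sealed class PlattCalibrator : ICalibrator { }

public abstract class CalibratedModelParametersBase<TSubModel, TCalibrator>
    where TSubModel : class
    where TCalibrator : class
{
    protected CalibratedModelParametersBase(TSubModel subModel, TCalibrator calibrator)
    {
        SubModel = subModel;
        Calibrator = calibrator;
    }
    public TSubModel SubModel { get; }
    public TCalibrator Calibrator { get; }
}

public sealed class ParameterMixingCalibratedModelParameters<TSubModel, TCalibrator>
    : CalibratedModelParametersBase<TSubModel, TCalibrator>
    where TSubModel : class
    where TCalibrator : class
{
    public ParameterMixingCalibratedModelParameters(TSubModel subModel, TCalibrator calibrator)
        : base(subModel, calibrator) { }
}

// Assumed covariant in TModel: this is what allows the cast to <object>.
public interface ISingleFeaturePredictionTransformer<out TModel> where TModel : class
{
    TModel Model { get; }
}

public sealed class BinaryPredictionTransformer<TModel> : ISingleFeaturePredictionTransformer<TModel>
    where TModel : class
{
    public BinaryPredictionTransformer(TModel model) { Model = model; }
    public TModel Model { get; }
}

public static class WorkaroundDemo
{
    // Mimics a last transformer whose type arguments were loaded as interfaces.
    public static object LoadedLastTransformer()
    {
        var model = new ParameterMixingCalibratedModelParameters<IPredictorProducing<float>, ICalibrator>(
            new LinearBinaryModelParameters(), new PlattCalibrator());
        return new BinaryPredictionTransformer<
            ParameterMixingCalibratedModelParameters<IPredictorProducing<float>, ICalibrator>>(model);
    }

    public static LinearBinaryModelParameters ExtractSubModel(object lastTransformer)
    {
        // Step 1: cast to the covariant interface rather than the concrete transformer type.
        var predictor = lastTransformer as ISingleFeaturePredictionTransformer<object>;
        if (predictor == null) return null;

        // Step 2: cast the inner model to the base type with the type arguments it was loaded with.
        var calibrated = predictor.Model
            as CalibratedModelParametersBase<IPredictorProducing<float>, ICalibrator>;

        // Step 3: cast the submodel down to the concrete type the user actually wants.
        return calibrated?.SubModel as LinearBinaryModelParameters;
    }

    public static void Main()
    {
        Console.WriteLine(ExtractSubModel(LoadedLastTransformer()) != null); // True
    }
}
```

The object obtained in step 1 is the one that would be handed to PFI; the later casts are only needed to inspect what is inside it.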
Cause of the Problem
There are 3 main points related to the cause of this problem, all of which pertain to the Calibrator.cs file and aren't related to the binary prediction transformer itself:
1) For ParameterMixingCalibratedModelParameters<>, the Create method isn't called. I discovered this while debugging; what actually happens is that, during loading, the CreateInstanceCore method first looks for a constructor, and so it calls the constructor of ParameterMixingCalibratedModelParameters<> instead of the Create method.
2) When loading a ParameterMixingCalibratedModelParameters<> model, a ParameterMixingCalibratedModelParameters<IPredictorProducing<float>, ICalibrator> is always loaded, no matter what the actual submodel and calibrator are. This doesn't change by fixing point 1). This point is similar to the original problem found in the prediction transformers, which I fixed in my last pull request; a similar approach would fix this point, that is, first loading the submodel and calibrator and then creating a generic type at runtime with the correct parameter types.
3) The original model is created as a ParameterMixingCalibratedModelParameters<LinearBinaryModelParameters, PlattCalibrator> (which I will now refer to as "PMCMP") but is returned as a CalibratedModelParametersBase<LinearBinaryModelParameters, PlattCalibrator> (let's call it "CMPB"). This is what makes the last transformer of the model a BinaryPredictionTransformer<CMPB>, whereas the internal model of the last transformer is actually a PMCMP. When it is saved to disk, it is saved as a PMCMP (i.e. using a LoaderSignature of "PMixCaliPredExec"), so when loading occurs, the constructor of PMCMP is called but the result isn't cast to a CMPB. This is different from the problem fixed in my last pull request: there, if a regression prediction transformer was saved, we expected to load a regression prediction transformer, whereas here, if a BPT is saved, we want to load a PMCMP with the correct type parameters but create a BPT whose CMPB also has the correct type parameters.
Trying to solve the problem
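The approach suggested for point 2), loading the submodel and calibrator first and then closing the generic type at runtime, could be sketched roughly as follows. This is only an illustration of the reflection technique under stated assumptions: CreateCalibratedModel is a hypothetical helper, the classes are simplified stand-ins that reuse the ML.NET names, and the actual changes in Calibrator.cs may look quite different.

```csharp
using System;

// Hypothetical stand-ins that only mirror the generic shape of the ML.NET types.
public interface IPredictorProducing<out TResult> { }
public interface ICalibrator { }
public sealed class LinearBinaryModelParameters : IPredictorProducing<float> { }
public sealed class PlattCalibrator : ICalibrator { }

public sealed class ParameterMixingCalibratedModelParameters<TSubModel, TCalibrator>
    where TSubModel : class
    where TCalibrator : class
{
    public ParameterMixingCalibratedModelParameters(TSubModel subModel, TCalibrator calibrator)
    {
        SubModel = subModel;
        Calibrator = calibrator;
    }
    public TSubModel SubModel { get; }
    public TCalibrator Calibrator { get; }
}

public static class RuntimeGenericDemo
{
    // Hypothetical loader step: the submodel and calibrator are assumed to be
    // deserialized already, so their concrete runtime types are known.
    public static object CreateCalibratedModel(object subModel, object calibrator)
    {
        // Close the open generic type over the concrete runtime types
        // instead of over the IPredictorProducing<float> and ICalibrator interfaces.
        Type closed = typeof(ParameterMixingCalibratedModelParameters<,>)
            .MakeGenericType(subModel.GetType(), calibrator.GetType());

        // Invoke the public two-argument constructor of the closed type.
        return Activator.CreateInstance(closed, subModel, calibrator);
    }

    public static void Main()
    {
        object loaded = CreateCalibratedModel(new LinearBinaryModelParameters(), new PlattCalibrator());

        // The loaded object now carries the concrete type arguments.
        Console.WriteLine(loaded is
            ParameterMixingCalibratedModelParameters<LinearBinaryModelParameters, PlattCalibrator>); // True
    }
}
```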
So far I've been able to solve problems 1) and 2) described above, but after trying out different approaches I haven't been able to solve problem 3). To solve those problems I've changed different things in the Calibrator.cs file. My attempt to solve this problem can be found in PR #4306.
With those changes (along with the ones from my last PR), the loaded model's last transformer becomes a BinaryPredictionTransformer<ParameterMixingCalibratedModelParameters<LinearBinaryModelParameters, PlattCalibrator>>. Notice that even here a cast to BPT<CMPB> would be null, so it doesn't solve the problem. Also notice that since PMCMP is an internal class, the user wouldn't be able to cast the last transformer to BPT<PMCMP> either, since s/he wouldn't have access to that class.
Further problems
Here I've explained the specific case of loading a BPT<CMPB>, with the specific problems that arise in the CMPB and PMCMP classes, because that is what is used in the sample of PFI with BPT and in the tests of PFI with BPT. It is possible that the problems described here are also present in other classes (for example, the other classes in Calibrator.cs), but they might not become a problem unless the user tries to access the last transformer of a model loaded from disk. In such a case the described workaround might help.