Added the assembly name of the custom transform to the model file #4989

Merged
merged 7 commits into from
Apr 1, 2020
Changes from 1 commit
Added the assembly name of the custom transform to the model file
harishsk committed Apr 1, 2020
commit 2a2b6557b901fcfde44f2a80890a8093fff51b3c
5 changes: 4 additions & 1 deletion src/Microsoft.ML.Transforms/CustomMappingTransformer.cs
@@ -4,6 +4,7 @@

using System;
using System.Linq;
+ using System.Reflection;
using Microsoft.ML.Data;
using Microsoft.ML.Internal.Utilities;
using Microsoft.ML.Runtime;
@@ -22,6 +23,7 @@ public sealed class CustomMappingTransformer<TSrc, TDst> : ITransformer
private readonly IHost _host;
private readonly Action<TSrc, TDst> _mapAction;
private readonly string _contractName;
+ private readonly string _contractAssembly;

internal InternalSchemaDefinition AddedSchema { get; }
internal SchemaDefinition InputSchemaDefinition { get; }
@@ -58,6 +60,7 @@ internal CustomMappingTransformer(IHostEnvironment env, Action<TSrc, TDst> mapAc
: InternalSchemaDefinition.Create(typeof(TDst), outputSchemaDefinition);

_contractName = contractName;
+ _contractAssembly = _mapAction.Method.DeclaringType.Assembly.FullName;

@antoniovs1029 (Member), Apr 1, 2020:

Is there any case where the loaded model would need to register an assembly name different from the "FullName" retrieved here? #Resolved

@antoniovs1029 (Member), Apr 1, 2020:

Or is there any case where accessing that member of _mapAction would throw? #Resolved

Contributor Author:

The call to Method can throw a MemberAccessException. That would be up to the caller to fix in their code, and the exception would help with that.

In reply to: 401790783

Contributor Author:

I think we should enforce the idea that the same transform used in training must also be used in prediction. If the two differ, they are not the same pipeline and not the same model.

In reply to: 401431899
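
The expression discussed in this thread can be tried in isolation. The following is a minimal sketch, not the PR's code: `MappingDemo`, `Map`, and `GetContractAssembly` are hypothetical names introduced only for illustration. It shows what `Method.DeclaringType.Assembly.FullName` yields for an ordinary static method, and adds a null guard because `DeclaringType` is null for delegates created over a `DynamicMethod`.

```csharp
using System;
using System.Reflection;

public static class MappingDemo
{
    // Hypothetical mapping action, standing in for a user's custom transform.
    private static void Map(int[] src, int[] dst) => dst[0] = src[0] * 2;

    // Returns the full name of the assembly that declares the delegate's target
    // method, or null when the method has no declaring type (e.g. a DynamicMethod).
    public static string GetContractAssembly(Delegate mapAction)
    {
        // Delegate.Method is documented to throw MemberAccessException
        // when the caller lacks access to the underlying method.
        MethodInfo method = mapAction.Method;
        return method.DeclaringType?.Assembly.FullName;
    }

    public static void Main()
    {
        Action<int[], int[]> mapAction = Map;
        // Prints the full name of the assembly that contains MappingDemo.
        Console.WriteLine(GetContractAssembly(mapAction));
    }
}
```

For a lambda, the declaring type is a compiler-generated class, but it still lives in the assembly where the lambda was written, so the recorded assembly name is the same.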

AddedSchema = outSchema;
}

@@ -67,7 +70,7 @@ internal void SaveModel(ModelSaveContext ctx)
{
if (_contractName == null)
throw _host.Except("Empty contract name for a transform: the transform cannot be saved");
- LambdaTransform.SaveCustomTransformer(_host, ctx, _contractName);
+ LambdaTransform.SaveCustomTransformer(_host, ctx, _contractName, _contractAssembly);
}

/// <summary>
21 changes: 17 additions & 4 deletions src/Microsoft.ML.Transforms/LambdaTransform.cs
@@ -3,7 +3,9 @@
// See the LICENSE file in the project root for more information.

using System;
+ using System.Diagnostics.Contracts;

@antoniovs1029 (Member), Apr 1, 2020:

Just curious, why do we need System.Diagnostics.Contracts in here? #Resolved

Contributor Author:

Fixed.

In reply to: 401444214

using System.IO;
+ using System.Reflection;
using System.Text;
using Microsoft.ML;
using Microsoft.ML.Data;
@@ -40,14 +42,17 @@ private static VersionInfo GetVersionInfo()
{
return new VersionInfo(
modelSignature: "CUSTOMXF",
- verWrittenCur: 0x00010001,
- verReadableCur: 0x00010001,
+ //verWrittenCur: 0x00010001, // Initial
+ verWrittenCur: 0x00010002, // Added name of assembly in which the contractName is present
+ verReadableCur: 0x00010002,
verWeCanReadBack: 0x00010001,
loaderSignature: LoaderSignature,
loaderAssemblyName: typeof(LambdaTransform).Assembly.FullName);
}

- internal static void SaveCustomTransformer(IExceptionContext ectx, ModelSaveContext ctx, string contractName)
+ private const uint VerAssemblyNameSaved = 0x00010002;
+
+ internal static void SaveCustomTransformer(IExceptionContext ectx, ModelSaveContext ctx, string contractName, string contractAssembly)
{
ectx.CheckValue(ctx, nameof(ctx));
ectx.CheckValue(contractName, nameof(contractName));
@@ -56,16 +61,24 @@ internal static void SaveCustomTransformer(IExceptionContext ectx, ModelSaveCont
ctx.SetVersionInfo(GetVersionInfo());

ctx.SaveString(contractName);
+ ctx.SaveString(contractAssembly);
}

// Factory for SignatureLoadModel.
private static ITransformer Create(IHostEnvironment env, ModelLoadContext ctx)
{
Contracts.CheckValue(env, nameof(env));
env.CheckValue(ctx, nameof(ctx));
- ctx.CheckAtModel(GetVersionInfo());
+ var versionInfo = GetVersionInfo();
+ ctx.CheckAtModel(versionInfo);

var contractName = ctx.LoadString();
+ if (ctx.Header.ModelVerWritten >= VerAssemblyNameSaved)
+ {
+ var contractAssembly = ctx.LoadString();
+ Assembly assembly = Assembly.Load(contractAssembly);
+ env.ComponentCatalog.RegisterAssembly(assembly);
+ }

object factoryObject = env.ComponentCatalog.GetExtensionValue(env, typeof(CustomMappingFactoryAttributeAttribute), contractName);
if (!(factoryObject is ICustomMappingFactory mappingFactory))
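
The loader change above reads the extra string only when the model was written at version 0x00010002 or later, so older models stay loadable. A minimal sketch of that version-gating pattern follows. It is an illustration under stated assumptions, not ML.NET's implementation: it uses `BinaryWriter`/`BinaryReader` over a `MemoryStream` as a stand-in for `ModelSaveContext`/`ModelLoadContext`, and `VersionedFormatDemo`, `Save`, and `Load` are hypothetical names.

```csharp
using System;
using System.IO;

public static class VersionedFormatDemo
{
    private const uint VerInitial = 0x00010001;
    private const uint VerAssemblyNameSaved = 0x00010002;

    // Writes the version, the contract name, and (for newer versions only) the assembly name.
    public static byte[] Save(uint version, string contractName, string contractAssembly)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new BinaryWriter(stream))
            {
                writer.Write(version);
                writer.Write(contractName);
                if (version >= VerAssemblyNameSaved)
                    writer.Write(contractAssembly);
            }
            // MemoryStream.ToArray still works after the stream has been closed.
            return stream.ToArray();
        }
    }

    // Reads the payload back; the assembly name is read only if the writer stored it.
    public static (string Name, string Assembly) Load(byte[] payload)
    {
        using (var reader = new BinaryReader(new MemoryStream(payload)))
        {
            uint version = reader.ReadUInt32();
            string contractName = reader.ReadString();
            string contractAssembly = version >= VerAssemblyNameSaved ? reader.ReadString() : null;
            return (contractName, contractAssembly);
        }
    }

    public static void Main()
    {
        var oldModel = Load(Save(VerInitial, "MyContract", null));
        var newModel = Load(Save(VerAssemblyNameSaved, "MyContract", "MyAssembly"));
        Console.WriteLine(oldModel.Assembly == null); // prints "True"
        Console.WriteLine(newModel.Assembly);         // prints "MyAssembly"
    }
}
```

Gating the read on the written version, rather than on the reader's current version, is what lets a newer reader consume both layouts.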
@@ -184,6 +184,10 @@ public void RegisterTypeWithAttribute()
var tribeTransformed = model.Transform(tribeDataView);
var tribeEnumerable = ML.Data.CreateEnumerable<SuperAlienHero>(tribeTransformed, false).ToList();

+ // save and reload the model
+ ML.Model.Save(model, tribeDataView.Schema, "customTransform.zip");
+ var modelSaved = ML.Model.Load("customTransform.zip", out var tribeDataViewSaved);
+
// Make sure the pipeline output is correct.
Assert.Equal(tribeEnumerable[0].Name, "Super " + tribe[0].Name);
Assert.Equal(tribeEnumerable[0].Merged.Age, tribe[0].One.Age + tribe[0].Two.Age);
@@ -192,7 +196,7 @@ public void RegisterTypeWithAttribute()
Assert.Equal(tribeEnumerable[0].Merged.HandCount, tribe[0].One.HandCount + tribe[0].Two.HandCount);

// Build prediction engine from the trained pipeline.
- var engine = ML.Model.CreatePredictionEngine<AlienHero, SuperAlienHero>(model);
+ var engine = ML.Model.CreatePredictionEngine<AlienHero, SuperAlienHero>(modelSaved);

@antoniovs1029 (Member), Apr 1, 2020:

It might be good to run this test both on the model before saving it and on the model loaded from disk. Ideally both models behave the same, but if a bug is introduced that appears in only one of the two cases, having both tests would catch it.

The InlineDataAttribute could come in handy here to avoid code duplication, as in the PermutationFeatureImportanceTests:

[Theory]
[InlineData(true)]
[InlineData(false)]
public void TestPfiRegressionOnDenseFeatures(bool saveModel)
{
    var data = GetDenseDataset();
    var model = ML.Regression.Trainers.OnlineGradientDescent().Fit(data);
    ImmutableArray<RegressionMetricsStatistics> pfi;
    if (saveModel)
    {
        var modelAndSchemaPath = GetOutputPath("TestPfiRegressionOnDenseFeatures.zip");
        ML.Model.Save(model, data.Schema, modelAndSchemaPath);
        var loadedModel = ML.Model.Load(modelAndSchemaPath, out var schema);
        var castedModel = loadedModel as RegressionPredictionTransformer<LinearRegressionModelParameters>;
        pfi = ML.Regression.PermutationFeatureImportance(castedModel, data);
    }
    else
    {
        pfi = ML.Regression.PermutationFeatureImportance(model, data);
    }
#Resolved

Contributor Author:

Fixed.

In reply to: 401428872

var alien = new AlienHero("TEN.LM", 1, 2, 3, 4, 5, 6, 7, 8);
var superAlien = engine.Predict(alien);
