-
Notifications
You must be signed in to change notification settings - Fork 1.9k
StopWordsRemovingEstimator export to Onnx #5279
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
{ | ||
var opType = "Squeeze"; | ||
var squeezeOutput = ctx.AddIntermediateVariable(null, "SqueezeOutput", true); | ||
var node = ctx.CreateNode(opType, srcVariableName, squeezeOutput, ctx.GetNodeName(opType), ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to avoid skipping the shape and type addition? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I skipped the shape is because the shape of the tokenized word vector is not known prior to inference time. The number of words tokenized may be any number. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other text transformers that also infer shape:
var opType = "Squeeze"; | ||
var squeezeOutput = ctx.AddIntermediateVariable(null, "SqueezeOutput", true); | ||
var node = ctx.CreateNode(opType, srcVariableName, squeezeOutput, ctx.GetNodeName(opType), ""); | ||
node.AddAttribute("axes", new long[] { 0 }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not in this PR, but in a different PR it maybe worth considering changing the default domain for CreateNode to be "ai.onnx" and not "ai.onnx.ml". The latter has very few ops and we mostly use operators from "ai.onnx" and it makes sense to retain that as the default.
Can you check whether the number of elements returned is the same? |
Codecov Report
@@ Coverage Diff @@
## master #5279 +/- ##
=======================================
Coverage 73.68% 73.68%
=======================================
Files 1022 1022
Lines 190366 190348 -18
Branches 20474 20472 -2
=======================================
+ Hits 140265 140267 +2
+ Misses 44568 44548 -20
Partials 5533 5533
|
I think this already done as part of the onnx testing framework: |
Can you reshape the returned vector in SaveAsOnnx to be compatible with ML.NET? In reply to: 653235748 [](ancestors = 653235748) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: The decision was address the issue mentioned above in another PR.