-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Expression estimator/transformer #4548
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #4548 +/- ##
==========================================
+ Coverage 75.12% 75.48% +0.35%
==========================================
Files 913 934 +21
Lines 160855 167656 +6801
Branches 17303 18129 +826
==========================================
+ Hits 120848 126550 +5702
- Misses 35165 36067 +902
- Partials 4842 5039 +197
|
|
||
// A pipeline that applies various expressions to the input columns. | ||
var pipeline = mlContext.Transforms.Expression("Expr1", "(x,y)=>log(y)+x", "Float", "FloatVector") | ||
.Append(mlContext.Transforms.Expression("Expr2", "(b,s,i)=>b ? len(s) : i", "Boolean", "StringVector", "Int")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When first reading this I thought (incorrectly) we had to note the data types for inputs/outputs. Can we use non-datatypes (eg. HasSiblings
) or longer names (eg. BooleanExampleColumn
)?
.Append(mlContext.Transforms.Expression("Expr2", "(b,s,i)=>b ? len(s) : i", "Boolean", "StringVector", "Int")) | |
.Append(mlContext.Transforms.Expression("Expr2", "(b,s,i)=>b ? len(s) : i", "HasSiblings", "PreferredGreetings", "Age")) | |
``` #Resolved |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing
#resolved #Resolved
if (a == 0) | ||
{ | ||
if (b == 0) | ||
return 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems inline with other implementations.
Had to look up: https://en.wikipedia.org/wiki/Zero_to_the_power_of_zero#IEEE_floating-point_standard #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is documented in the md file (line 57). Do you think it needs to be made clearer?
In reply to: 360956933 [](ancestors = 360956933)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, looks good to me.
Mainly was noting so others don't have to look up the normal output for 0^0
. #Resolved
} | ||
catch (OverflowException) | ||
{ | ||
throw Contracts.Except("Overflow"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would we be better with saturation arithmetic instead of throwing on overflow? Might be more of a PM-ish question of: is it better for a ML package to throw on semi-bad input or should we just do our best? I'm worried that this will throw in production when a model receives an unexpectedly large input (ML data is dirty).
throw Contracts.Except("Overflow"); | |
res = (neg ? Int64.MinValue : Int64.MaxValue); |
// If someone is peeking at ich, they should have peeked everything up to ich. | ||
Contracts.Assert(0 < dich && dich <= _ichLim - _ichNext + 1); | ||
|
||
int ich = dich + _ichNext - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are our variables names in German?
- "ich" => "I"
- "dich" => "you"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing to fix. Just an interesting note.
From, | ||
All, | ||
Where, | ||
Convolve, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the Net# tokens being kept for internal compatibility? Otherwise we could remove the non-used tokens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved in follow-up PR: https://github.com/dotnet/machinelearning/pull/4614/files#r362319272
#Resolved
Assert.Equal(7, transformed.Schema["Expr6"].Type.GetValueCount()); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the riveting 406 page PR! Truly epic.
[MethodImpl(MethodImplOptions.AggressiveInlining)] | ||
public static R4 Exp(R4 a) | ||
{ | ||
return (R4)Math.Exp(a); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How difficult would it be to output an ONNX graph instead of (or in addition to) building up .NET?
Background:
The AutoML team is looking to support basic math functionality like: LN(x)
, x-y
, and EXP(x)
. They could use the expression transform, though they need ONNX support. I'm hoping to avoid creating three new transforms for these operations by using the expression transform. This will avoid an endlessly growling list of tiny, single function, math transforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue tracked: #4615
|
||
## Operators | ||
|
||
The operators of the expression language are listed in the following table, in precendence order. Unless otherwise noted, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should mention how the expression handles vector columns. Hopefully we still operate on them :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#Resolved
|
||
| **Name** | **Meaning** | **Comments** | | ||
| --- | --- | --- | | ||
| bool | convert to BL | The operand must be text or boolean. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BL [](start = 22, length = 2)
we don't have these types in our public API I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved in follow-up PR: https://github.com/dotnet/machinelearning/pull/4614/files#r362319131
#Resolved
/// <param name="outputColumnName">Name of the column resulting from the transformation of <paramref name="inputColumnNames"/>. | ||
/// This column's data type will be the same as that of the input column.</param> | ||
/// <param name="expression">The expression to apply to <paramref name="inputColumnNames"/> to create the column <paramref name="outputColumnName"/>.</param> | ||
/// <param name="inputColumnNames">The names of the input columns.</param> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// The names of the input columns. [](start = 7, length = 75)
please add link to the sample
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved in follow-up PR: https://github.com/dotnet/machinelearning/pull/4614/files#r362319272
#Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixes #4015 .