Changed Binarizer node to be cast to the type of the predicted label … #4818

Merged
merged 3 commits into from
Feb 11, 2020
Conversation

harishsk
Contributor

@harishsk harishsk commented Feb 8, 2020

…column's data type

In BinaryClassifierScorer's SaveAsOnnxCore function, we always cast the output of the Binarizer to a bool. In some cases, however, BinaryClassifierScorer can output a key value (uint), and then the output should be cast to a uint instead. This fix makes the cast depend on the data type of the predicted label column.
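
As a rough illustration of the fix described above, the sketch below picks a Cast target from the predicted label's raw type. It is not the ML.NET code: CastTargetSketch and ChooseCastTarget are hypothetical names, and the integer constants are the BOOL and UINT32 values of ONNX's TensorProto.DataType enum.

```csharp
using System;

// Hypothetical helper for illustration only; not part of ML.NET.
public static class CastTargetSketch
{
    // Values taken from the ONNX TensorProto.DataType enum.
    public const int OnnxBool = 9;
    public const int OnnxUInt32 = 12;

    // Chooses the ONNX element type that the Binarizer output should be cast to,
    // based on the raw type of the predicted label column.
    public static int ChooseCastTarget(Type predictedLabelRawType)
    {
        if (predictedLabelRawType == typeof(bool))
            return OnnxBool;      // plain boolean predicted label
        if (predictedLabelRawType == typeof(uint))
            return OnnxUInt32;    // key-typed predicted label stored as uint
        throw new NotSupportedException(
            "Unexpected predicted label type: " + predictedLabelRawType);
    }
}
```

Throwing on an unexpected type, rather than silently falling back to bool, is one possible design choice here; it surfaces a mismatch at export time instead of producing a model with the wrong output type.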

@harishsk harishsk requested a review from Lynx1820 February 8, 2020 09:10
@harishsk harishsk requested a review from a team as a code owner February 8, 2020 09:10
@harishsk harishsk requested a review from ganik February 8, 2020 09:10
var t = InternalDataKindExtensions.ToInternalDataKind(DataKind.Boolean).ToType();
node.AddAttribute("to", t);
var predictedLabelCol = OutputSchema.GetColumnOrNull(outColumnNames[0]);
node.AddAttribute("to", predictedLabelCol.HasValue ? predictedLabelCol.Value.Type.RawType : typeof(bool));

@yaeldekel yaeldekel Feb 10, 2020

predictedLabelCol.HasValue

Doesn't it always have a value?
Or, if it doesn't - should we be adding the ONNX node at all? #Resolved

Contributor Author

I have added an Assert to capture that and fixed the next line.


In reply to: 376929923

@yaeldekel
yaeldekel commented Feb 10, 2020

         */

I know this isn't related to the change in this PR, but is this correct? The binary classifier scorer has a _threshold field, and in the constructor you can specify which column (score/probability) you want to apply the threshold to.

While I could not find a way in the public API to change which column is used (by default it is actually the score column, even if the probability column exists), there is a public API to change the threshold value: ChangeModelThreshold #Resolved


Refers to: src/Microsoft.ML.Data/Scorers/BinaryClassifierScorer.cs:203 in fc1925c.

@ganik
Member

ganik commented Feb 10, 2020

         */

Yep, this is not correct. Predicted label has to be based off threshold.
ONNX code should mimic what GetPredictedLabelCore(..) does on Line 274


In reply to: 584023200


Refers to: src/Microsoft.ML.Data/Scorers/BinaryClassifierScorer.cs:203 in fc1925c.
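
To make the threshold point above concrete, here is a minimal sketch of a threshold-based predicted label. ThresholdSketch and PredictLabel are hypothetical names, and the strict greater-than comparison is an assumption about how the scorer decides; the actual logic lives in GetPredictedLabelCore.

```csharp
using System;

// Hypothetical helper for illustration only; not part of ML.NET.
public static class ThresholdSketch
{
    // Assumption: the label is positive when the chosen column (score or
    // probability) is strictly greater than the model's threshold.
    public static bool PredictLabel(float score, float threshold)
        => score > threshold;
}
```

With a score of 0.3, the label is true under a threshold of 0 and false under a threshold of 0.5, so an exported graph that hard-codes one threshold would disagree with a model whose threshold was changed.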

@harishsk
Contributor Author

         */

You are right! I have fixed it in the next commit.


In reply to: 584286522


Refers to: src/Microsoft.ML.Data/Scorers/BinaryClassifierScorer.cs:203 in fc1925c.

@harishsk harishsk merged commit dc4e5f8 into dotnet:master Feb 11, 2020
@harishsk harishsk deleted the binarizerBug branch April 21, 2020 23:59
@ghost ghost locked as resolved and limited conversation to collaborators Mar 19, 2022