Fairlearn #6539

LittleLittleCloud · 2023-01-13T20:23:31Z

We are excited to review your PR.

So we can do the best job, please check:

There's a descriptive title that will make sense to other developers some time from now.
There are associated issues. All PRs should have issue(s) associated, unless the change is trivial and self-evident, such as fixing a typo. You can use the format Fixes #nnnn in your description to cause GitHub to automatically close the issue(s) when your PR is merged.
Your change description explains what the change does, why you chose your approach, and anything else that reviewers should know.
You have included any necessary tests in the same PR.

Might be helpful to resolve #1912

Integreate Fairness .NET APIs #6512

src/Microsoft.ML.Fairlearn/AutoML/AutoMLExperimentExtension.cs

src/Microsoft.ML.Fairlearn/MLContextExtension.cs

src/Microsoft.ML.Fairlearn/metrics/FairlearnMetricCatalog.cs

michaelgsharp · 2023-05-04T23:18:25Z

src/Microsoft.ML.Fairlearn/metrics/FairlearnMetricCatalog.cs

+            // get all the columns of the schema
+            DataViewSchema columns = _eval.Schema;
+
+            // TODO: is converting IDataview to DataFrame the best practice?


It's not the best practice, especially when input files (depending on the trainer) can be streamed in and can be larger than the memory on the machine. It's probably fine to start out this way, but it will need to change later.
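To make the streaming concern concrete, here is a minimal sketch of a per-group metric computed in a single pass with running totals, so that the whole dataset never has to be held in memory. It is an illustration only, not code from this PR: the class name `StreamingGroupMetrics`, the method name, and the `(Group, Label, Score)` row shape are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class StreamingGroupMetrics
{
    // Computes mean squared error per sensitive-feature group in one pass.
    // Only a running (sum, count) pair is kept per group, so memory use grows
    // with the number of groups rather than the number of rows.
    // The row shape (Group, Label, Score) is an assumption made for this sketch.
    public static Dictionary<string, double> MeanSquaredErrorByGroup(
        IEnumerable<(string Group, float Label, float Score)> rows)
    {
        var totals = new Dictionary<string, (double SumSquaredError, long Count)>();

        foreach (var (group, label, score) in rows)
        {
            double error = label - score;

            // When the group has not been seen yet, 'current' is the default (0, 0).
            totals.TryGetValue(group, out var current);
            totals[group] = (current.SumSquaredError + error * error, current.Count + 1);
        }

        var result = new Dictionary<string, double>();
        foreach (var pair in totals)
        {
            result[pair.Key] = pair.Value.SumSquaredError / pair.Value.Count;
        }

        return result;
    }
}
```

In ML.NET, a lazily evaluated row sequence of this kind could plausibly be obtained with `mlContext.Data.CreateEnumerable<TRow>(dataView, reuseRowObject: true)` and a row class whose properties match the column names. Whether that fits the metric catalog in this PR is an open question.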

src/Microsoft.ML.Fairlearn/metrics/FairlearnMetricCatalog.cs

michaelgsharp · 2023-05-04T23:19:33Z

src/Microsoft.ML.Fairlearn/reductions/Utilities.cs

+using Microsoft.ML.SearchSpace;
+using Microsoft.ML.SearchSpace.Option;
+
+namespace Microsoft.ML.Fairlearn.reductions


Did you mean for the namespace to be `reductions` with a lowercase r? It would be better capitalized.

@LittleLittleCloud looks like it's still a lowercase r. Is there a reason for it to be that way?

Nope, there's no reason for it; it was a mistake. I thought I fixed it, but maybe not. Let me check.

src/Microsoft.ML.Fairlearn/reductions/UtilityParity.cs

test/Microsoft.ML.Fairlearn.Tests/MetricTest.cs

test/Microsoft.ML.Fairlearn.Tests/UtilityTest.cs

michaelgsharp

Mostly minor formatting changes, but the namespace needs to be addressed.

…inelearning into fairlearn

codecov · 2023-05-10T23:19:56Z

Codecov Report

Merging #6539 (dda3785) into main (247f3a0) will increase coverage by 0.19%.
The diff coverage is 82.51%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #6539      +/-   ##
==========================================
+ Coverage   68.59%   68.78%   +0.19%     
==========================================
  Files        1201     1215      +14     
  Lines      250296   251523    +1227     
  Branches    26094    26227     +133     
==========================================
+ Hits       171681   173006    +1325     
+ Misses      71802    71705      -97     
+ Partials     6813     6812       -1

| Flag | Coverage Δ | |
| --- | --- | --- |
| Debug | `68.78% <82.51%> (+0.19%)` | ⬆️ |
| production | `63.30% <80.75%> (+0.22%)` | ⬆️ |
| test | `88.87% <87.01%> (+0.02%)` | ⬆️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ | |
| --- | --- | --- |
| src/Microsoft.ML.Fairlearn/AutoML/TunerFactory.cs | `0.00% <0.00%> (ø)` | |
| ...Microsoft.ML.AutoML.Tests/AutoMLExperimentTests.cs | `83.93% <0.00%> (-4.93%)` | ⬇️ |
| src/Microsoft.Data.Analysis/DataFrameRow.cs | `66.66% <50.00%> (-3.71%)` | ⬇️ |
| ...t.ML.Fairlearn/AutoML/AutoMLExperimentExtension.cs | `66.66% <66.66%> (ø)` | |
| src/Microsoft.ML.Fairlearn/Reductions/Moment.cs | `68.18% <68.18%> (ø)` | |
| ...Microsoft.ML.Fairlearn/Reductions/UtilityParity.cs | `79.71% <79.71%> (ø)` | |
| ...oft.ML.Fairlearn/Metrics/FairlearnMetricCatalog.cs | `88.02% <88.02%> (ø)` | |
| src/Microsoft.ML.Fairlearn/FairlearnCatalog.cs | `100.00% <100.00%> (ø)` | |
| src/Microsoft.ML.Fairlearn/MLContextExtension.cs | `100.00% <100.00%> (ø)` | |
| ...t.ML.Fairlearn/Reductions/GridSearchTrialRunner.cs | `100.00% <100.00%> (ø)` | |
... and 4 more

... and 44 files with indirect coverage changes

Jordi Ramos and others added 30 commits June 13, 2022 14:07

same as last commit message

96ac1ea

using xiaoyun's .rows.gropy() implementation method for the ByGroup(…

88eb579

…) function

adding the regression catalog

6dad117

added the fairlearn extension

f76e866

Fixed the dataframe column adding bug; added unit testing

61956ab

Added `inPlace:true` to the append to fix the bug that no columns are added; added unit testing to test the basic functionality of Metric.Regression.ByGroup()

Added a Metric Test; Added difference between groups in regression

56ea872

Added more metrics to the regression

bac073d

Added MSE and MAE (MeanAbsoluteError), and included tests for MSE and RMS

RMS and MSE support

16aedd8

RMS and MSE are now fully supported by Fairlearn.Metric.Regression across all three functions

Moment and UtilityParity

b626f67

created the moment class and utilityParity class. Initial commit

Demographic Parity initial unit tests passed

707cfdc

Passed the initial unit tests for Demographic Parity. Every class is made public which needs to be changed in the future. Utility Parity still needs to be changed for other parities to work

update

99a0f4c

Merge pull request dotnet#2 from LittleLittleCloud/u/xiaoyun/addBinar…

9681062

…yClassificationSearchSpaceGenerator update

Added the rest of the binary metrics at byGroup()

77db51c

Fixed a typo in the comment

cc383e7

Started on GridSearch, finished implementing the first approach to ca…

e0d4769

…lculating the signed weights

Turned the moment class to an abstract class

048eaa8

There is still a lot of development needed to be done with the moment class

Added signed weights, fixed ordering in Gamma

e2e0352

Added assembly reference for the AutoML to gain access to different m…

be1919d

…odels

fix sensitive feature column name bug, change default value for grid …

c690290

…search getting sensitive feature column names directly from a getter function, adjusted default value for the option to a random value

Created a trial runner for grid search

426385f

Built the first prototype for the gridSearchTrialRunner. We have enabled a separate training set for training and a testing set for validating the result, which is a different approach from the original implementation.

Added AutoMLExperimentExtension to enable AutoML methods on gridsearch

a9f07a9

Added row item lookup by columnname

168a808

Added a new feature so that users can look up row item by the name of the column instead of the raw row index

Revert "Added AutoMLExperimentExtension to enable AutoML methods on g…

4514678

…ridsearch" This reverts commit a9f07a9.

made serviceCollection internal to be accessed by AutoML extension in…

328f718

… fairlearn In fairlearn AutoML, we have to add in a singleton to the serviceCollection called moment, which will be later extracted to calculate the fairness parity.

Updated IMonitor for fairlearn

07e2cb0

Added an if statement to output fairlearn metric if using fairlearn

added FairnessTrialResulst for fairnessMetric

b857dc1

Revert "made serviceCollection internal to be accessed by AutoML exte…

debd1e6

…nsion in fairlearn" This reverts commit 328f718.

Added an extension for AutoML experiment

373463e

the experiment is able to add a moment to its serviceCollection which is later used to calculate fairlearn parity.

Added a Fairlearn AutoMLTuner

edf913f

Created a tuner so that we can go through the search space through the gridsearch algorithm

updated Project reference

c633360

LittleLittleCloud and others added 6 commits April 20, 2023 16:24

Merge branch 'main' into fairlearn

a528634

disable fairlearn test in automl as it's too costly

e5a9d21

Merge branch 'main' into fairlearn

88efd00

fix build issue

5c4b49a

fix test

c4e5f05

Update GridSearchTest.cs

5eb713a
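
Several commits above refer to Demographic Parity. For readers unfamiliar with the term, the sketch below shows the usual definition of the demographic parity difference: the gap between the largest and smallest per-group selection rates, where a group's selection rate is the fraction of its rows that received a positive prediction. This is an illustration under stated assumptions, not the implementation in this PR: `DemographicParitySketch`, its method names, and the `(Group, Predicted)` row shape are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DemographicParitySketch
{
    // Selection rate: the fraction of rows in each group with a positive prediction.
    // The row shape (Group, Predicted) is an assumption made for this sketch.
    public static Dictionary<string, double> SelectionRateByGroup(
        IEnumerable<(string Group, bool Predicted)> rows)
    {
        return rows
            .GroupBy(r => r.Group)
            .ToDictionary(g => g.Key, g => g.Average(r => r.Predicted ? 1.0 : 0.0));
    }

    // Demographic parity difference: largest selection rate minus smallest.
    // A value of 0 means every group is selected at the same rate.
    public static double DemographicParityDifference(
        IEnumerable<(string Group, bool Predicted)> rows)
    {
        var rates = SelectionRateByGroup(rows);
        if (rates.Count == 0)
        {
            throw new ArgumentException("At least one row is required.", nameof(rows));
        }

        return rates.Values.Max() - rates.Values.Min();
    }
}
```

For example, if group "a" has a selection rate of 0.5 and group "b" has 0.25, the difference is 0.25.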